
Commit 1b4c319

huydhn authored and facebook-github-bot committed
Allow higher fp16 tolerance for phlippe_resnet on CUDA 12.8 (#154109)
Summary: After pytorch/pytorch#154004, one of the models, `phlippe_resnet`, needs a higher tolerance for fp16 on CUDA 12.8. I can reproduce it locally with:

```
python benchmarks/dynamo/torchbench.py --accuracy --timing --explain --print-compilation-time --inductor --device cuda --training --amp --only phlippe_resnet
E0522 02:47:12.392000 2130213 site-packages/torch/_dynamo/utils.py:2949] RMSE (res-fp64): 0.00144, (ref-fp64): 0.00036 and shape=torch.Size([]). res.dtype: torch.float32, multiplier: 3.000000, tol: 0.001000, use_larger_multiplier_for_smaller_tensor: 0
```

I'm not sure what exactly happens behind the scenes, but this should help fix the CI failure. Also remove some leftover expected accuracy results for CUDA 12.4, which we are not using anymore on CI for benchmark jobs.

X-link: pytorch/pytorch#154109
Approved by: https://github.com/Skylion007, https://github.com/malfet

Reviewed By: yangw-dev

Differential Revision: D75251772

fbshipit-source-id: 1cbf629f60f84bb5d2a51ad884370edbb923c388
1 parent 316981e commit 1b4c319
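The error line in the repro comes from Dynamo's accuracy checker, which measures the RMSE of both the compiled and the eager outputs against an fp64 reference and compares them under a per-model tolerance. The sketch below is only an illustration of that kind of check, assuming a simple "multiplier times reference error, floored at tol" rule; the function names and the exact pass/fail condition are assumptions for illustration, not the actual code in torch/_dynamo/utils.py.

```python
import torch

def rmse(a: torch.Tensor, b: torch.Tensor) -> float:
    # Root-mean-square error, computed in fp64 to avoid extra rounding error.
    return torch.sqrt(torch.mean((a.double() - b.double()) ** 2)).item()

def roughly_passes(res, ref, fp64_ref, tol=1e-3, multiplier=3.0):
    # Illustrative rule only: the compiled output `res` passes if its RMSE
    # against the fp64 reference stays within `multiplier` times the eager
    # output's RMSE, floored at `tol`. The real check is in torch/_dynamo/utils.py.
    res_err = rmse(res, fp64_ref)
    ref_err = rmse(ref, fp64_ref)
    return res_err <= max(multiplier * ref_err, tol)
```

With the values from the log (res-fp64 RMSE 0.00144, ref-fp64 RMSE 0.00036, multiplier 3, tol 0.001), such a check fails by a small margin, and listing `phlippe_resnet` under the higher fp16 tolerance bucket raises the tolerance for this model so the comparison can pass.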

1 file changed: +1 -0


userbenchmark/dynamo/dynamobench/torchbench.yaml

Lines changed: 1 addition & 0 deletions
```
@@ -48,6 +48,7 @@ tolerance:
     - doctr_reco_predictor
     - drq
     - hf_Whisper
+    - phlippe_resnet
 
   higher_bf16:
     - doctr_reco_predictor
```
