Commit 942979e

desertfire authored and facebook-github-bot committed
Update how peak memory is measured (#150534)
Summary: In the dashboard measurement script, AOTI needs to run Eager first to register the output pytree, so the peak memory compression ratio on the dashboard is always close to 1. Update the AOTI run to use an extra warmup run, so that the peak memory compression ratio measures memory at run time instead of at compile time.

X-link: pytorch/pytorch#150534
Approved by: https://github.com/yushangdi
Reviewed By: clee2000
Differential Revision: D72395560
fbshipit-source-id: f37e493d851ea665f88972effbd225e8250a022f
1 parent: e5c9164 · commit: 942979e

File tree

1 file changed: +4, -0 lines


userbenchmark/dynamo/dynamobench/common.py

Lines changed: 4 additions & 0 deletions
@@ -3735,6 +3735,10 @@ def run(runner, args, original_dir=None):
         # AOTInductor doesn't support control flow yet
         runner.skip_models.update(runner.skip_models_due_to_control_flow)
         runner.skip_models.update(runner.skip_models_due_to_export_not_supported)
+
+        # For AOTI, we only measure the memory compression ratio at the run time
+        # instead of the compile time, so use a warmup run to trigger AOTI compilation.
+        args.use_warm_peak_memory = True
     elif args.backend == "torchao":
         assert "cuda" in args.devices, "Quantization requires CUDA device."
         assert args.bfloat16, "Quantization requires dtype bfloat16."
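The idea behind `use_warm_peak_memory` is that a warmup run absorbs one-time allocations (here, AOTI compilation and pytree registration), after which the peak-memory counter is reset so the reported peak reflects only steady-state execution. A minimal sketch of that measurement pattern, using `tracemalloc` as a stand-in for the benchmark's CUDA memory counters (the workload and helper names below are hypothetical, not the dashboard script's actual code):

```python
import tracemalloc

def run_once():
    # Hypothetical workload: allocate a transient buffer, then release it.
    buf = [0] * 100_000
    return sum(buf)

def measure_warm_peak(fn):
    """Return peak memory (bytes) of fn, excluding first-run setup costs."""
    tracemalloc.start()
    fn()                       # warmup run: absorbs one-time "compile-time" allocations
    tracemalloc.reset_peak()   # discard the warmup's peak
    fn()                       # measured run: peak now reflects run time only
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

peak_bytes = measure_warm_peak(run_once)
```

Without the warmup run and reset, the peak would include compilation-time allocations, which is why the dashboard's compression ratio sat near 1 before this change.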
