Commit 4733379

Parameter count in summary table (#773)
* draft parameter count
* added disclaimer
* added missing param counts
1 parent 1c1c619 commit 4733379

1 file changed, +11 -9 lines changed

README.md

Lines changed: 11 additions & 9 deletions
@@ -45,15 +45,17 @@ Some these benchmarks are rather slow or take a long time to run on the referenc
 # MLPerf Training v4.1 (Submission Deadline Oct 11, 2024)
 *Framework here is given for the reference implementation. Submitters are free to use their own frameworks to run the benchmark.
 
-| model | reference implementation | framework | dataset
-| ---- | ---- | ---- | ---- |
-| RetinaNet | [vision/object detection](https://github.com/mlcommons/training/tree/master/single_stage_detector) | pytorch | OpenImages
-| Stable Diffusionv2 | [image generation](https://github.com/mlcommons/training/tree/master/stable_diffusion) | pytorch | LAION-400M-filtered
-| BERT-large | [language/nlp](https://github.com/mlcommons/training/tree/master/language_model/tensorflow/bert) | tensorflow | Wikipedia 2020/01/01
-| GPT3 | [language/llm](https://github.com/mlcommons/training/tree/master/large_language_model) | paxml,megatron-lm | C4
-| LLama2 70B-LoRA | [language/LLM fine-tuning](https://github.com/mlcommons/training/tree/master/llama2_70b_lora) | pytorch | SCROLLS GovReport
-| DLRMv2 | [recommendation](https://github.com/mlcommons/training/tree/master/recommendation_v2/torchrec_dlrm) | torchrec | Criteo 3.5TB multi-hot
-| RGAT | [GNN](https://github.com/mlcommons/training/tree/master/graph_neural_network) | pytorch | IGBFull
+| model | reference implementation | framework | dataset | model parameter count
+| ---- | ---- | ---- | ---- | ----
+| RetinaNet | [vision/object detection](https://github.com/mlcommons/training/tree/master/single_stage_detector) | pytorch | OpenImages | 37M
+| Stable Diffusionv2 | [image generation](https://github.com/mlcommons/training/tree/master/stable_diffusion) | pytorch | LAION-400M-filtered | 865M
+| BERT-large | [language/nlp](https://github.com/mlcommons/training/tree/master/language_model/tensorflow/bert) | tensorflow | Wikipedia 2020/01/01 | 340M
+| GPT3 | [language/llm](https://github.com/mlcommons/training/tree/master/large_language_model) | paxml,megatron-lm | C4 | 175B
+| LLama2 70B-LoRA | [language/LLM fine-tuning](https://github.com/mlcommons/training/tree/master/llama2_70b_lora) | pytorch | SCROLLS GovReport | 70B
+| DLRMv2 | [recommendation](https://github.com/mlcommons/training/tree/master/recommendation_v2/torchrec_dlrm) | torchrec | Criteo 3.5TB multi-hot | 167M
+| RGAT | [GNN](https://github.com/mlcommons/training/tree/master/graph_neural_network) | pytorch | IGBFull | 25M
+
+*Note model parameter count is not the same as active parameter that are being trained in the benchmark.
 
 # MLPerf Training v4.0 (Submission Deadline May 10, 2024)
 *Framework here is given for the reference implementation. Submitters are free to use their own frameworks to run the benchmark.
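
The note added in this commit distinguishes a model's total parameter count from the parameters actually trained. One case where the two can differ widely is LoRA fine-tuning, where the base weights are frozen and only small low-rank adapters are updated. The sketch below illustrates the arithmetic only; the layer shapes, layer count, and rank are hypothetical and are not taken from the benchmark or its reference implementation.

```python
# Illustrative sketch: total vs. trainable parameter counts under LoRA.
# All layer shapes and the rank used here are hypothetical values chosen
# for the example; they do not describe any MLPerf reference model.

def linear_params(d_in: int, d_out: int) -> int:
    """Weights in a bias-free dense layer mapping d_in -> d_out."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in a LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def summarize(layers, rank):
    """Return (total, trainable), assuming base weights are frozen
    and only the adapter weights receive gradient updates."""
    frozen = sum(linear_params(i, o) for i, o in layers)
    trainable = sum(lora_params(i, o, rank) for i, o in layers)
    return frozen + trainable, trainable


if __name__ == "__main__":
    layers = [(4096, 4096)] * 4  # hypothetical: four square projections
    total, trainable = summarize(layers, rank=16)
    print(f"total={total:,} trainable={trainable:,} "
          f"({100 * trainable / total:.2f}% of total)")
```

In this toy setup the trainable share is under one percent of the total, which is the kind of gap the note warns about when reading the "model parameter count" column.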