Pull requests: NVIDIA/Megatron-LM

Pull requests list

Fix: training arguments print format
#1552 opened Apr 24, 2025 by vicoooo26
Fp8 LM-head
#1551 opened Apr 22, 2025 by dhia680
fix some bug
#1544 opened Apr 17, 2025 by Thaurun
lora offload
#1540 opened Apr 15, 2025 by sanandaraj5597
Lora offload
#1539 opened Apr 15, 2025 by sanandaraj5597
Swiglu fusion
#1538 opened Apr 15, 2025 by rachitgarg91
Add fused swiglu for MLP
#1536 opened Apr 15, 2025 by michal2409
Fix AttributeError in MultiTokenPredictionLayer
#1529 opened Apr 12, 2025 by shenyunhang
Fix typo on distrib_optimizer.py
#1505 opened Mar 26, 2025 by wplf
fix: MultiLatentAttention cp_comm_type
#1499 opened Mar 24, 2025 by RandMist
Fix llama_mistral loader by using args.true_vocab_size
#1491 opened Mar 20, 2025 by zhuzilin
vscode/cursor devcontainer
#1483 opened Mar 14, 2025 by yzhang123
Set hashlib.md5 usedforsecurity=False, #1471
#1472 opened Mar 12, 2025 by jsta
Draft: Youngeun/a2a hiding
#1460 opened Mar 10, 2025 by lhb8125