LoRA fine-tuning of qwen2.5-vl-7b on an RTX 5090 reports CUDA out of memory #7798

Closed
1 task done
linssonSUSUSU opened this issue Apr 22, 2025 · 1 comment
Labels
duplicate This issue or pull request already exists

Comments

@linssonSUSUSU

Reminder

  • I have read the above rules and searched the existing issues.

System Info

(llamafactory) ubuntu@little:~/bigdata/LLaMA-Factory$ llamafactory-cli env

  • llamafactory version: 0.9.3.dev0
  • Platform: Linux-5.15.0-134-generic-x86_64-with-glibc2.35
  • Python version: 3.10.0
  • PyTorch version: 2.8.0.dev20250407+cu128 (GPU)
  • Transformers version: 4.49.0
  • Datasets version: 3.2.0
  • Accelerate version: 1.2.1
  • PEFT version: 0.15.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA GeForce RTX 5090
  • GPU number: 1
  • GPU memory: 31.36GB
  • Bitsandbytes version: 0.45.5
  • Git commit: 903db09

Reproduction

```text
When fine-tuning on AutoDL with a vGPU-32G, the largest usable batch_size was 3, but now on the 5090 the run reports out of GPU memory even after lowering the batch size.

   attn_output = F.scaled_dot_product_attention(q, k, v, attention_mask, dropout_p=0.0)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.11 GiB. GPU 0 has a total capacity of 31.36 GiB of which 5.76 GiB is free. Including non-PyTorch memory, this process has 25.58 GiB memory in use. Of the allocated memory 24.73 GiB is allocated by PyTorch, and 260.07 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```

Others

_No response_
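
The traceback above already points at one mitigation: setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. Below is a minimal sketch (not part of the original report) of how that variable can be set before PyTorch initializes CUDA, together with a quick check of free memory on GPU 0 via torch.cuda.mem_get_info; whether this alone is enough to avoid the OOM here is not established by this issue.

```python
# Hypothetical snippet to run at the top of a training driver script.
# The allocator option is read when PyTorch's CUDA caching allocator starts up,
# so set it before importing torch (or export it in the shell before launching
# llamafactory-cli train).
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
# Shell equivalent: export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

import torch

free_bytes, total_bytes = torch.cuda.mem_get_info(0)  # (free, total) in bytes for GPU 0
print(f"GPU 0: {free_bytes / 1024**3:.2f} GiB free of {total_bytes / 1024**3:.2f} GiB total")
```

If fragmentation is not the cause, the usual general fallbacks are a smaller per-device batch size combined with gradient accumulation, a shorter cutoff length, or quantized (QLoRA) training.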
linssonSUSUSU added the bug (Something isn't working) and pending (This problem is yet to be addressed) labels on Apr 22, 2025
@hiyouga
Owner

hiyouga commented Apr 22, 2025

see FAQs

hiyouga closed this as completed on Apr 22, 2025
hiyouga reopened this on Apr 22, 2025
hiyouga closed this as completed on Apr 22, 2025
hiyouga added the duplicate (This issue or pull request already exists) label and removed the bug (Something isn't working) and pending (This problem is yet to be addressed) labels on Apr 22, 2025