LoRA fine-tuning of qwen2.5-vl-7b on an RTX 5090 reports CUDA out of memory #7798

Closed
1 task done
linssonSUSUSU opened this issue Apr 22, 2025 · 1 comment
Labels
duplicate This issue or pull request already exists

Comments

@linssonSUSUSU

Reminder

  • I have read the above rules and searched the existing issues.

System Info

(llamafactory) ubuntu@little:~/bigdata/LLaMA-Factory$ llamafactory-cli env

  • llamafactory version: 0.9.3.dev0
  • Platform: Linux-5.15.0-134-generic-x86_64-with-glibc2.35
  • Python version: 3.10.0
  • PyTorch version: 2.8.0.dev20250407+cu128 (GPU)
  • Transformers version: 4.49.0
  • Datasets version: 3.2.0
  • Accelerate version: 1.2.1
  • PEFT version: 0.15.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA GeForce RTX 5090
  • GPU number: 1
  • GPU memory: 31.36GB
  • Bitsandbytes version: 0.45.5
  • Git commit: 903db09

Reproduction

```text
When fine-tuning on AutoDL with a vGPU-32G, the largest usable batch_size was 3, but now on the 5090 the run reports out of GPU memory even after lowering the batch size.

   attn_output = F.scaled_dot_product_attention(q, k, v, attention_mask, dropout_p=0.0)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.11 GiB. GPU 0 has a total capacity of 31.36 GiB of which 5.76 GiB is free. Including non-PyTorch memory, this process has 25.58 GiB memory in use. Of the allocated memory 24.73 GiB is allocated by PyTorch, and 260.07 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
```

Others

_No response_
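
The traceback above already points at one mitigation: setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. Below is a minimal sketch (not part of the original report) of how that variable can be set before PyTorch initializes CUDA, together with a quick check of free memory on GPU 0 via torch.cuda.mem_get_info; whether this alone is enough to avoid the OOM here is not established by this issue.

```python
# Hypothetical snippet to run at the top of a training driver script.
# The allocator option is read when PyTorch's CUDA caching allocator starts up,
# so set it before importing torch (or export it in the shell before launching
# llamafactory-cli train).
import os

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
# Shell equivalent: export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

import torch

free_bytes, total_bytes = torch.cuda.mem_get_info(0)  # (free, total) in bytes for GPU 0
print(f"GPU 0: {free_bytes / 1024**3:.2f} GiB free of {total_bytes / 1024**3:.2f} GiB total")
```

If fragmentation is not the cause, the usual general fallbacks are a smaller per-device batch size combined with gradient accumulation, a shorter cutoff length, or quantized (QLoRA) training.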
linssonSUSUSU added the bug (Something isn't working) and pending (This problem is yet to be addressed) labels on Apr 22, 2025
@hiyouga
Owner

hiyouga commented Apr 22, 2025

see FAQs

hiyouga closed this as completed on Apr 22, 2025
hiyouga reopened this on Apr 22, 2025
hiyouga closed this as completed on Apr 22, 2025
hiyouga added the duplicate (This issue or pull request already exists) label and removed the bug (Something isn't working) and pending (This problem is yet to be addressed) labels on Apr 22, 2025