[Question]: Multi-node deployment support #10727

Open
suzewei opened this issue Jun 12, 2025 · 3 comments
Labels
question Further information is requested

Comments

suzewei commented Jun 12, 2025

Please ask your question

Does PaddleNLP currently support multi-node deployment, and does it let users define their own DP/PP/TP partitioning?
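
For context on what a custom DP/PP/TP split involves: in the common 3D-parallel setup, the data-parallel, pipeline-parallel, and tensor-parallel degrees multiply to the total number of processes. The sketch below only enumerates the splits that satisfy this constraint for a given process count. The function name `parallel_layouts` is hypothetical, the value 16 is illustrative, and nothing here reflects how PaddleNLP itself is configured.

```python
def parallel_layouts(world_size):
    """Return every (dp, pp, tp) triple whose product equals world_size."""
    layouts = []
    for dp in range(1, world_size + 1):
        if world_size % dp:
            continue  # dp must divide the total process count
        rest = world_size // dp
        for pp in range(1, rest + 1):
            if rest % pp == 0:
                # Whatever remains after dp and pp is the tp degree.
                layouts.append((dp, pp, rest // pp))
    return layouts


if __name__ == "__main__":
    # Illustrative process count; substitute the real number of cards.
    for dp, pp, tp in parallel_layouts(16):
        print(f"dp={dp:2d} pp={pp:2d} tp={tp:2d}")
```

Which of these splits is actually usable depends on the model and the framework, so this is only a way to list the candidates.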

suzewei added the question label Jun 12, 2025

gongel (Member) commented Jun 16, 2025

Training is supported. Which model are you using, and what exactly do you need?

suzewei (Author) commented Jun 16, 2025

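> Training is supported. Which model are you using, and what exactly do you need?

I want to run the full DeepSeek-V3 model on two P800 machines (16 cards in total). The launch commands are: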

  • python -m paddle.distributed.launch --devices=0,1,2,3,4,5,6,7 --master=192.168.0.16:8090 --nnodes 2 --nproc_per_node 8 --rank 0 deepseek_V3.py

  • python -m paddle.distributed.launch --devices=0,1,2,3,4,5,6,7 --master=192.168.0.16:8090 --nnodes 2 --nproc_per_node 8 --rank 1 deepseek_V3.py

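After launching, however, it still fails with a device out-of-memory error, and I don't know why. Two machines with 16 cards should have enough room, since each card has 98 GB of memory.

Code of deepseek_V3.py:

    from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM

    # Load the tokenizer and the model weights in float16.
    tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3")
    model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-V3", dtype="float16")

    # Tokenize the prompt into Paddle tensors and generate up to 128 new tokens.
    input_features = tokenizer("你好!请自我介绍一下。", return_tensors="pd")
    outputs = model.generate(**input_features, max_new_tokens=128)
    print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))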
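
A rough back-of-the-envelope check on the memory figures in this comment may help frame the question. It assumes DeepSeek-V3 has about 671 billion parameters in total (an assumption, not stated in the thread), 2 bytes per parameter for float16, and "98G" read as 98 GB per card. It counts weights only and ignores the KV cache, activations, and framework overhead.

```python
# Back-of-the-envelope estimate; every figure here is approximate.
PARAMS = 671e9          # assumed total parameter count of DeepSeek-V3
BYTES_PER_PARAM = 2     # float16, matching dtype="float16" in the script
CARDS = 16              # 2 machines x 8 cards
MEM_PER_CARD_GB = 98    # per-card memory quoted in the comment

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9     # size of one full copy
total_gb = CARDS * MEM_PER_CARD_GB              # aggregate device memory
per_card_if_sharded_gb = weights_gb / CARDS     # perfectly even split

print(f"one full float16 copy : {weights_gb:.0f} GB")
print(f"aggregate memory      : {total_gb} GB")
print(f"per card, evenly split: {per_card_if_sharded_gb:.1f} GB")
print(f"full copy fits on one card: {weights_gb <= MEM_PER_CARD_GB}")
```

Under these assumptions the weights fit in aggregate only if they are split across all 16 cards, and even then little headroom remains per card. If each launched process instead loads a full copy, as the script might do since it passes no partitioning arguments, a single card would run out of memory regardless of how many cards exist in total. Whether that is what happens here is not confirmed in the thread.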

suzewei (Author) commented Jun 17, 2025

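> Training is supported. Which model are you using, and what exactly do you need?

Could you please check whether there is a problem with the launch commands? Or is it simply that two P800 machines with 16 cards in total cannot run the full DeepSeek-V3?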
