Does the current PaddleNLP support multi-node deployment, and does it let users customize the DP/PP/TP partitioning?
Training is supported. Which model and use case do you have in mind?
I want to run the full DeepSeek-V3 on two P800 machines (16 cards total). Launch commands:
python -m paddle.distributed.launch --devices=0,1,2,3,4,5,6,7 --master=192.168.0.16:8090 --nnodes 2 --nproc_per_node 8 --rank 0 deepseek_V3.py
python -m paddle.distributed.launch --devices=0,1,2,3,4,5,6,7 --master=192.168.0.16:8090 --nnodes 2 --nproc_per_node 8 --rank 1 deepseek_V3.py
But after launching, it still hits an out-of-memory error, even though the 16 cards across the two machines should have enough memory: each card has 98 GB.
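A rough back-of-the-envelope check (assuming DeepSeek-V3's published total of roughly 671B parameters) shows why this overflows: the script below loads the model with plain `from_pretrained` and no parallelism configuration, so every process tries to materialize the full fp16 weights rather than a shard of them.

```python
# Rough memory estimate for fp16 DeepSeek-V3 on 16 x 98 GB cards.
# 671e9 parameters is an assumption taken from the published model card.
params = 671e9
bytes_per_param = 2  # float16

weights_gb = params * bytes_per_param / 1e9
per_card_gb = weights_gb / 16  # best case: perfectly sharded over 16 cards

print(round(weights_gb))   # → 1342 GB of weights in total
print(round(per_card_gb))  # → 84 GB per card even with ideal 16-way sharding
```

So even with ideal 16-way sharding, the weights alone take about 84 GB of each 98 GB card, leaving little room for activations and KV cache; and without any sharding configured, a single card would need the full ~1.3 TB, which explains the immediate OOM.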
deepseek_V3.py code:
from paddlenlp.transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3")
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-V3", dtype="float16")
input_features = tokenizer("你好!请自我介绍一下。", return_tensors="pd")
outputs = model.generate(**input_features, max_new_tokens=128)
print(tokenizer.batch_decode(outputs[0], skip_special_tokens=True))
Could you check whether the launch command is wrong, or whether two P800 machines with 16 cards simply cannot run the full DeepSeek-V3?
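For reference, a hybrid-parallel load in PaddleNLP usually initializes a fleet strategy and passes the tensor-parallel degree and rank into `from_pretrained`. The sketch below is an untested assumption for this setup (the `mp_degree`/`pp_degree` split of 8×2 is illustrative, not a recommendation), based on Paddle's `fleet` hybrid-parallel API and PaddleNLP's `tensor_parallel_degree`/`tensor_parallel_rank` loading arguments; exact argument support may vary by PaddleNLP version.

```python
# Hypothetical sketch: shard DeepSeek-V3 across 16 cards instead of
# loading the full model on every process. Degrees are assumptions.
import paddle.distributed.fleet as fleet
from paddlenlp.transformers import AutoModelForCausalLM

strategy = fleet.DistributedStrategy()
strategy.hybrid_configs = {"dp_degree": 1, "mp_degree": 8, "pp_degree": 2}
fleet.init(is_collective=True, strategy=strategy)
hcg = fleet.get_hybrid_communicate_group()

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-V3",
    dtype="float16",
    tensor_parallel_degree=hcg.get_model_parallel_world_size(),
    tensor_parallel_rank=hcg.get_model_parallel_rank(),
)
```

This would still be launched with the same `paddle.distributed.launch` commands shown above, one per node.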