AutoModel can't load Qwen/Qwen2.5-Omni-7B #37794

Open
liwenju0 opened this issue Apr 25, 2025 · 3 comments

@liwenju0

System Info

from transformers import AutoModel
model = AutoModel.from_pretrained("Qwen/Qwen2.5-Omni-7B", torch_dtype="auto", trust_remote_code=True)

This raises the following error:

[Screenshot of the error traceback]

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import AutoModel
model = AutoModel.from_pretrained("Qwen/Qwen2.5-Omni-7B", torch_dtype="auto", trust_remote_code=True)

Expected behavior

The model loads successfully.

@jiangyukunok
Contributor

@liwenju0
My understanding is that AutoModel is designed for simple single-modality models, while Qwen2.5-Omni is multi-modal, which can make it hard for AutoModel to figure out how to assemble the encoders/decoders.

You can work around this by using the model-specific APIs instead: https://github.com/huggingface/transformers/blob/main/docs/source/en/model_doc/qwen2_5_omni.md
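
A minimal sketch of that workaround, assuming a transformers version that ships the Qwen2.5-Omni classes referenced in the linked doc (Qwen2_5OmniForConditionalGeneration and Qwen2_5OmniProcessor; exact class names may vary between releases):

from transformers import Qwen2_5OmniForConditionalGeneration, Qwen2_5OmniProcessor

# Load the full generative model directly instead of going through AutoModel.
model = Qwen2_5OmniForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-Omni-7B",
    torch_dtype="auto",
    device_map="auto",
)
processor = Qwen2_5OmniProcessor.from_pretrained("Qwen/Qwen2.5-Omni-7B")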

@zucchini-nlp
Member

@liwenju0 for multimodal models we don't have a base model in most cases, because the models are usually a composition of an LM and encoders. Thus the straightforward way was to compose only a generative model. I am currently adding a base model for multimodal models in #37033, but it might not be usable with official checkpoints.

If you want to obtain the last hidden state from Qwen-Omni, you can still load the ConditionalGeneration model and get the model outputs with out.hidden_states[-1]
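
A sketch of that suggestion, assuming the loaded ConditionalGeneration model accepts the standard output_hidden_states flag; the text-only processor call below is a hypothetical example:

import torch
from transformers import Qwen2_5OmniForConditionalGeneration, Qwen2_5OmniProcessor

model = Qwen2_5OmniForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-Omni-7B", torch_dtype="auto", device_map="auto"
)
processor = Qwen2_5OmniProcessor.from_pretrained("Qwen/Qwen2.5-Omni-7B")

inputs = processor(text="Hello, world!", return_tensors="pt").to(model.device)
with torch.no_grad():
    # Request hidden states and read the last layer, as suggested above.
    out = model(**inputs, output_hidden_states=True, return_dict=True)
last_hidden_state = out.hidden_states[-1]  # shape: (batch, seq_len, hidden_size)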
