
Model.from_pretrained breaks when using SinusoidalEmbedding #37671


Open
ZhiyuanChen opened this issue Apr 22, 2025 · 0 comments

System Info

  • transformers version: 4.51.0
  • Platform: macOS-15.3.1-arm64-arm-64bit
  • Python version: 3.12.9
  • Huggingface_hub version: 0.30.2
  • Safetensors version: 0.5.3
  • Accelerate version: 1.6.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.6.0 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed

Who can help?

SinusoidalEmbedding does not need a state_dict, and since there are known bugs around loading/saving its state (#31387), a workaround is to override the related methods:

    def state_dict(self, destination=None, prefix="", keep_vars=False):
        return {}

    def load_state_dict(self, state_dict, strict=True):
        return

    def _load_from_state_dict(
        self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs
    ):
        return

This used to work, but a recent update broke it (reverting to transformers 4.50 works fine).
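For context, the same effect (keeping a fixed sinusoidal table out of the state dict) can usually be achieved with a non-persistent buffer instead of overriding the state-dict methods. A minimal sketch, not taken from multimolecule, with an illustrative class of the same name:

```python
import math

import torch
from torch import nn


class SinusoidalEmbedding(nn.Module):
    """Fixed sinusoidal position embedding whose table never enters state_dict."""

    def __init__(self, num_positions: int, embedding_dim: int):
        super().__init__()
        position = torch.arange(num_positions).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, embedding_dim, 2) * (-math.log(10000.0) / embedding_dim)
        )
        weight = torch.zeros(num_positions, embedding_dim)
        weight[:, 0::2] = torch.sin(position * div_term)
        weight[:, 1::2] = torch.cos(position * div_term)
        # persistent=False excludes the buffer from state_dict entirely,
        # so nothing needs to be saved, loaded, or overridden.
        self.register_buffer("weight", weight, persistent=False)

    def forward(self, position_ids: torch.Tensor) -> torch.Tensor:
        return self.weight[position_ids]


emb = SinusoidalEmbedding(512, 64)
assert len(emb.state_dict()) == 0  # nothing to serialize
```

Whether this interacts correctly with the meta-device initialization path in recent transformers versions is exactly what this issue is about.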

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

pip install multimolecule

Failure:

from transformers import AutoConfig, AutoModel, AutoTokenizer, pipeline
from multimolecule.models import ErnieRnaForSecondaryStructurePrediction as Model

model = Model.from_pretrained("multimolecule/ernierna-ss")

model.to("cuda")

NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
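The error itself is PyTorch's meta-device behavior rather than anything specific to multimolecule: parameters left on the meta device have no storage, so `.to()` has nothing to copy. A standalone sketch of the same failure mode:

```python
import torch
from torch import nn

# Allocate a module on the meta device: parameter shapes exist, data does not.
layer = nn.Linear(4, 4, device="meta")

try:
    layer.to("cpu")  # raises: cannot copy out of a meta tensor, it has no data
except NotImplementedError:
    pass

# to_empty() allocates uninitialized storage on the target device instead;
# the values must then be filled in, e.g. by loading a state dict.
layer = layer.to_empty(device="cpu")
assert layer.weight.device.type == "cpu"
```

This suggests the overridden `_load_from_state_dict` leaves the module's tensors on the meta device during `from_pretrained`, so the subsequent `model.to("cuda")` fails.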

Works:

from transformers import AutoConfig, AutoModel, AutoTokenizer, pipeline
from multimolecule.models import ErnieRnaForSecondaryStructurePrediction as Model

model = Model(AutoConfig.from_pretrained("multimolecule/ernierna-ss"))

model.to("cuda")

Expected behavior

No error.
