Model.from_pretrained breaks when using SinusoidalEmbedding #37671

Closed
@ZhiyuanChen

Description

System Info

  • transformers version: 4.51.0
  • Platform: macOS-15.3.1-arm64-arm-64bit
  • Python version: 3.12.9
  • Huggingface_hub version: 0.30.2
  • Safetensors version: 0.5.3
  • Accelerate version: 1.6.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.6.0 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed

Who can help?

SinusoidalEmbedding does not need a state_dict, and because there are bugs related to saving and loading its state (#31387), a workaround is to override the related methods:

    def state_dict(self, destination=None, prefix="", keep_vars=False):
        return {}

    def load_state_dict(self, *args, state_dict, strict=True):
        return

    def _load_from_state_dict(
        self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs
    ):
        return

This used to work, but a recent update broke it (going back to transformers 4.50 works fine).
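For context, here is a minimal sketch of how overrides like these behave in plain PyTorch. The `SinusoidalEmbedding` and `TinyModel` classes below are hypothetical stand-ins written for illustration, not the multimolecule implementation, and the sketch covers only the `state_dict` and `_load_from_state_dict` overrides. It shows that the parent model's checkpoint omits the embedding table and that a strict load still reports no missing keys.

```python
import math

import torch
from torch import nn


class SinusoidalEmbedding(nn.Module):
    """Hypothetical stand-in: a fixed sinusoidal table rebuilt in __init__."""

    def __init__(self, num_positions: int, dim: int):
        super().__init__()
        # Assumes an even `dim` so sine and cosine columns interleave cleanly.
        position = torch.arange(num_positions, dtype=torch.float32).unsqueeze(1)
        inv_freq = torch.exp(
            torch.arange(0, dim, 2, dtype=torch.float32) * (-math.log(10000.0) / dim)
        )
        table = torch.zeros(num_positions, dim)
        table[:, 0::2] = torch.sin(position * inv_freq)
        table[:, 1::2] = torch.cos(position * inv_freq)
        self.register_buffer("weight", table)

    def state_dict(self, destination=None, prefix="", keep_vars=False):
        # Contribute nothing when a parent module is saved.
        return {}

    def _load_from_state_dict(
        self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs
    ):
        # Ignore checkpoint entries and report no missing keys.
        return


class TinyModel(nn.Module):
    """Hypothetical parent model that owns the embedding and one real layer."""

    def __init__(self):
        super().__init__()
        self.positions = SinusoidalEmbedding(16, 8)
        self.proj = nn.Linear(8, 2)


model = TinyModel()
saved_keys = sorted(model.state_dict().keys())

# A strict load into a fresh model succeeds even though the table is absent.
result = TinyModel().load_state_dict(model.state_dict(), strict=True)
```

Because the table is recomputed in `__init__`, a freshly constructed model already holds the right values, which is why skipping it during save and load is harmless in this setup.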

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

pip install multimolecule

Failure:

from transformers import AutoConfig, AutoModel, AutoTokenizer, pipeline
from multimolecule.models import ErnieRnaForSecondaryStructurePrediction as Model

model = Model.from_pretrained("multimolecule/ernierna-ss")

model.to("cuda")

NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
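The same error can be reproduced in plain PyTorch, which may help narrow down which tensors are affected. The sketch below is an illustration under stated assumptions, not a description of what transformers does internally: `FixedTable` and `meta_tensor_names` are hypothetical names, and the module is deliberately built on the meta device to show what happens when a no-op loading hook never replaces a meta buffer with real data.

```python
import torch
from torch import nn


class FixedTable(nn.Module):
    """Hypothetical stand-in for a module whose loading hook is a no-op."""

    def __init__(self, rows: int, cols: int):
        super().__init__()
        self.register_buffer("weight", torch.zeros(rows, cols))

    def _load_from_state_dict(self, *args, **kwargs):
        # Ignore whatever the checkpoint contains.
        return


def meta_tensor_names(module: nn.Module) -> list[str]:
    """Return names of parameters and buffers still on the meta device."""
    named = list(module.named_parameters()) + list(module.named_buffers())
    return [name for name, tensor in named if tensor.is_meta]


# Build on the meta device: shapes exist, data does not.
with torch.device("meta"):
    module = FixedTable(4, 8)

# The no-op hook means loading never gives the buffer real data.
module.load_state_dict({"weight": torch.ones(4, 8)})
stuck = meta_tensor_names(module)

try:
    module.to("cpu")
    error_message = None
except NotImplementedError as exc:
    error_message = str(exc)
```

Calling `meta_tensor_names(model)` on the model returned by `from_pretrained` would show whether any tensors are still on the meta device before `model.to(...)` is attempted.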

Works:

from transformers import AutoConfig, AutoModel, AutoTokenizer, pipeline
from multimolecule.models import ErnieRnaForSecondaryStructurePrediction as Model

model = Model(AutoConfig.from_pretrained("multimolecule/ernierna-ss"))

model.to("cuda")

Expected behavior

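No error: `model.to("cuda")` should succeed on a model loaded with `from_pretrained`, as it does on a model built directly from the config.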
