QLoRA fine-tuning of Qwen2.5-Omni fails with ValueError: Processor was not found, please check and update your processor config. #7778


Open
1 task done
leidaoyu opened this issue Apr 20, 2025 · 4 comments
Labels
bug Something isn't working pending This problem is yet to be addressed

Comments

@leidaoyu

leidaoyu commented Apr 20, 2025

Reminder

  • I have read the above rules and searched the existing issues.

System Info

llamafactory version: 0.9.3.dev0
Python version: 3.11.11
PyTorch version: 2.5.1+cu124
Transformers version: 4.52.0.dev0

Reproduction

"""
Traceback (most recent call last):
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\multiprocess\pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\utils\py_utils.py", line 680, in _write_generator_to_queue
for i, result in enumerate(func(**kwargs)):
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\arrow_dataset.py", line 3516, in _map_single
for i, batch in iter_outputs(shard_iterable):
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\arrow_dataset.py", line 3466, in iter_outputs
yield i, apply_function(example, i, offset=offset)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\arrow_dataset.py", line 3389, in apply_function
processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\data\processor\supervised.py", line 99, in preprocess_dataset
input_ids, labels = self._encode_data_example(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\data\processor\supervised.py", line 43, in _encode_data_example
messages = self.template.mm_plugin.process_messages(prompt + response, images, videos, audios, self.processor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\data\mm_plugin.py", line 1584, in process_messages
self._validate_input(processor, images, videos, audios)
raise ValueError("Processor was not found, please check and update your processor config.")
ValueError: Processor was not found, please check and update your processor config.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\76425\.conda\envs\torch\Scripts\llamafactory-cli.exe\__main__.py", line 7, in <module>
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\cli.py", line 115, in main
COMMAND_MAP[command]()
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\train\tuner.py", line 107, in run_exp
_training_function(config={"args": args, "callbacks": callbacks})
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\train\tuner.py", line 69, in _training_function
run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\train\sft\workflow.py", line 51, in run_sft
dataset_module = get_dataset(template, model_args, data_args, training_args, stage="sft", **tokenizer_module)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\data\loader.py", line 310, in get_dataset
dataset = _get_preprocessed_dataset(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\Desktop\LLaMA-Factory\src\llamafactory\data\loader.py", line 256, in _get_preprocessed_dataset
dataset = dataset.map(
^^^^^^^^^^^^
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\arrow_dataset.py", line 557, in wrapper
out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\arrow_dataset.py", line 3166, in map
for rank, done, content in iflatmap_unordered(
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\utils\py_utils.py", line 720, in iflatmap_unordered
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\datasets\utils\py_utils.py", line 720, in <listcomp>
[async_result.get(timeout=0.05) for async_result in async_results]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\76425\.conda\envs\torch\Lib\site-packages\multiprocess\pool.py", line 774, in get
raise self._value
ValueError: Processor was not found, please check and update your processor config.

Others

Following the transformers instructions, I ran pip install git+https://github.com/huggingface/[email protected] but the error persists.

@leidaoyu leidaoyu added bug Something isn't working pending This problem is yet to be addressed labels Apr 20, 2025
@Kuangdd01
Collaborator

Try this for now: pip install git+https://github.com/Kuangdd01/transformers.git@qwen25omni
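Before and after switching builds, it can help to check whether the installed transformers package exposes a processor class for this model at all. The following is a minimal sketch, not something from this thread: the helper `module_exposes` is hypothetical, and the class name `Qwen2_5OmniProcessor` is an assumption about how the Qwen2.5-Omni integration is named.

```python
import importlib


def module_exposes(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and exposes `attr`.

    Hypothetical helper for illustration; it only reports presence,
    it does not load any model or processor.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    try:
        getattr(module, attr)
    except Exception:
        # Missing attribute, or a lazy import inside the package failed.
        return False
    return True


# Usage (class name is an assumption, see the note above):
#   module_exposes("transformers", "Qwen2_5OmniProcessor")
# A False result would suggest the installed build lacks Qwen2.5-Omni support.
```

If the check returns False, the processor probably cannot be loaded from this install, which would be consistent with the "Processor was not found" message.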

@leidaoyu
Author

> Try this for now: pip install git+https://github.com/Kuangdd01/transformers.git@qwen25omni

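Thank you very much! That fixed the error, but now a new one appears. Could you take a look at what is causing it?
My training set JSON format is:
{
  "id": "example_1",
  "conversations": [
    {
      "role": "system",
      "content": "You are an assistant that understands Chinese and English speech. Please transcribe the input audio into text."
    },
    {
      "role": "user",
      "content": "<|AUDIO|audio_path<|AUDIO|>"
    },
    {
      "role": "assistant",
      "content": "ASR result"
    }
  ]
}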

Image

@Kuangdd01
Collaborator

Organize your data following the format of data/mllm_audio_demo.json.

@leidaoyu
Author

> Organize your data following the format of data/mllm_audio_demo.json.

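I converted the data to the following format, but it still fails:
{
  "messages": [
    {
      "role": "system",
      "content": "You are an assistant that understands Chinese and English speech. Please transcribe the input audio into text."
    },
    {
      "role": "user",
      "content": ""
    },
    {
      "role": "assistant",
      "content": "asr_result"
    }
  ],
  "audios": [
    "audio.wav"
  ]
}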
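The error screenshot is not reproduced here, so the actual cause is unknown. One thing worth checking in a sample shaped like the one above is whether every file listed in "audios" has a matching placeholder in the message text, since the user message as shown has empty content. The sketch below assumes the placeholder is the literal string `<audio>` (my understanding of the demo file's convention, not confirmed in this thread); `check_audio_sample` is a hypothetical helper.

```python
# Assumption: audio inputs are marked in message text with this literal tag.
AUDIO_PLACEHOLDER = "<audio>"


def check_audio_sample(sample: dict) -> list[str]:
    """Return a list of human-readable problems found in one sample."""
    problems = []
    messages = sample.get("messages", [])
    audios = sample.get("audios", [])

    # Count placeholders across all messages and compare with the file list.
    n_tags = sum(m.get("content", "").count(AUDIO_PLACEHOLDER) for m in messages)
    if n_tags != len(audios):
        problems.append(
            f"{n_tags} {AUDIO_PLACEHOLDER} placeholder(s) but {len(audios)} audio file(s)"
        )

    # An empty user turn gives the model nothing to attach the audio to.
    for m in messages:
        if m.get("role") == "user" and not m.get("content", "").strip():
            problems.append("user message has empty content")
    return problems


posted = {
    "messages": [
        {"role": "user", "content": ""},
        {"role": "assistant", "content": "asr_result"},
    ],
    "audios": ["audio.wav"],
}
with_tag = {
    "messages": [
        {"role": "user", "content": "<audio>"},
        {"role": "assistant", "content": "asr_result"},
    ],
    "audios": ["audio.wav"],
}

for name, sample in [("posted", posted), ("with_tag", with_tag)]:
    print(name, check_audio_sample(sample))
```

Running a check like this over the whole training file before launching llamafactory-cli can surface mismatched samples early, without waiting for dataset preprocessing to fail.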

Image


2 participants