
Vision Preprocessor Not Initialized for LLaVA in Triton Workflow #737


Open
oschleic opened this issue Apr 18, 2025 · 0 comments


While following the multimodal workflow guide for Triton Server, I encountered an assertion error:

AssertionError: Vision preprocessor for preparing images before encoding is None

Relevant Code

Upon investigation, I noticed that VisionPreProcessor is only initialized for mllama, llava_onevision, and qwen2_vl:
Code Reference

However, 'llava' is included in an earlier assertion confirming it as a supported model type. This mismatch causes a failure when running inference.
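The mismatch can be sketched as follows. This is a hypothetical reconstruction of the control flow described above; the names (`SUPPORTED_MODEL_TYPES`, `init_vision_preprocessor`) are assumptions mirroring the pattern, not copied from the actual tensorrtllm_backend source:

```python
# Assumed set of supported model types, per the earlier assertion.
SUPPORTED_MODEL_TYPES = {"llava", "mllama", "llava_onevision", "qwen2_vl"}


class VisionPreProcessor:
    """Placeholder standing in for the real preprocessing class."""

    def __init__(self, model_type):
        self.model_type = model_type


def init_vision_preprocessor(model_type):
    # 'llava' passes this support check...
    assert model_type in SUPPORTED_MODEL_TYPES, f"unsupported: {model_type}"
    vision_preprocessor = None
    # ...but is missing from the initialization branch, so for 'llava'
    # the preprocessor stays None and a later check of the form
    # "assert vision_preprocessor is not None" raises the reported error.
    if model_type in ("mllama", "llava_onevision", "qwen2_vl"):
        vision_preprocessor = VisionPreProcessor(model_type)
    return vision_preprocessor
```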

Proposed Fix:
I recommend adding a llava_process method to VisionPreProcessor, ensuring LLaVA models correctly initialize preprocessing when needed:
VisionPreProcessor class
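A minimal sketch of what such a method might look like. The class shape, method signature, and output key here are assumptions based on the description above, not the actual VisionPreProcessor interface:

```python
class VisionPreProcessor:
    """Minimal stand-in; the real class and its *_process methods live in
    the tensorrtllm_backend preprocessing model (names here are assumed)."""

    def __init__(self, model_type, image_processor):
        self.model_type = model_type
        # image_processor: callable mapping decoded images -> pixel tensors,
        # e.g. a HuggingFace AutoProcessor in the real workflow.
        self.image_processor = image_processor

    def llava_process(self, images):
        # Convert decoded images into the tensors the vision encoder
        # expects, mirroring the existing per-model process methods.
        pixel_values = self.image_processor(images)
        return {"PIXEL_VALUES": pixel_values}


# Usage with a toy image processor standing in for the real one:
pre = VisionPreProcessor("llava", lambda imgs: [[0.0, 0.0, 0.0] for _ in imgs])
out = pre.llava_process(["image_a", "image_b"])
```

Initializing `VisionPreProcessor` for `'llava'` in the same branch as the other model types would then make the existing not-None assertion pass.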

Questions for Maintainers:

  • Was LLaVA deliberately excluded from the vision preprocessing logic?
  • Would extending VisionPreProcessor in this way be the best approach?
  • Are there other dependencies or configurations I should check before implementing this change?

Please advise on whether this approach aligns with your intended workflow. Thanks!
