
Vision Preprocessor Not Initialized for LLaVA in Triton Workflow #737


Open
oschleic opened this issue Apr 18, 2025 · 0 comments


While following the multimodal workflow guide for Triton Server, I encountered an assertion error:

AssertionError: Vision preprocessor for preparing images before encoding is None

Relevant Code

Upon investigation, I noticed that VisionPreProcessor is only initialized for mllama, llava_onevision, and qwen2_vl:
Code Reference

However, 'llava' is included in an earlier assertion confirming it as a supported model type. This mismatch causes a failure when running inference.
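The mismatch can be sketched as follows. This is a hypothetical reconstruction of the control flow described above; the names (`SUPPORTED_MODEL_TYPES`, `init_vision_preprocessor`) are assumptions mirroring the pattern, not copied from the actual tensorrtllm_backend source:

```python
# Assumed set of supported model types, per the earlier assertion.
SUPPORTED_MODEL_TYPES = {"llava", "mllama", "llava_onevision", "qwen2_vl"}


class VisionPreProcessor:
    """Placeholder standing in for the real preprocessing class."""

    def __init__(self, model_type):
        self.model_type = model_type


def init_vision_preprocessor(model_type):
    # 'llava' passes this support check...
    assert model_type in SUPPORTED_MODEL_TYPES, f"unsupported: {model_type}"
    vision_preprocessor = None
    # ...but is missing from the initialization branch, so for 'llava'
    # the preprocessor stays None and a later check of the form
    # "assert vision_preprocessor is not None" raises the reported error.
    if model_type in ("mllama", "llava_onevision", "qwen2_vl"):
        vision_preprocessor = VisionPreProcessor(model_type)
    return vision_preprocessor
```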

Proposed Fix:
I recommend adding a llava_process method to VisionPreProcessor, ensuring LLaVA models correctly initialize preprocessing when needed:
VisionPreProcessor class
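A minimal sketch of what such a method might look like. The class shape, method signature, and output key here are assumptions based on the description above, not the actual VisionPreProcessor interface:

```python
class VisionPreProcessor:
    """Minimal stand-in; the real class and its *_process methods live in
    the tensorrtllm_backend preprocessing model (names here are assumed)."""

    def __init__(self, model_type, image_processor):
        self.model_type = model_type
        # image_processor: callable mapping decoded images -> pixel tensors,
        # e.g. a HuggingFace AutoProcessor in the real workflow.
        self.image_processor = image_processor

    def llava_process(self, images):
        # Convert decoded images into the tensors the vision encoder
        # expects, mirroring the existing per-model process methods.
        pixel_values = self.image_processor(images)
        return {"PIXEL_VALUES": pixel_values}


# Usage with a toy image processor standing in for the real one:
pre = VisionPreProcessor("llava", lambda imgs: [[0.0, 0.0, 0.0] for _ in imgs])
out = pre.llava_process(["image_a", "image_b"])
```

Initializing `VisionPreProcessor` for `'llava'` in the same branch as the other model types would then make the existing not-None assertion pass.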

Questions for Maintainers:

  • Was LLaVA deliberately excluded from the vision preprocessing logic?
  • Would extending VisionPreProcessor in this way be the best approach?
  • Are there other dependencies or configurations I should check before implementing this change?

Please advise on whether this approach aligns with your intended workflow. Thanks!
