LLama 3.2-vision without second stage OCR #95

giuliofrey · 2025-01-18T12:28:21Z

Hi,

Thank you for providing this service. I see that LLaMA 3.2-vision is already quite capable and might not need a second stage OCR. To my knowledge, the Python wrapper currently does not allow for this.

Am I doing something wrong?

choinek · 2025-01-18T17:32:38Z

Maintainer

Hi!

The service is not performing the second stage of OCR.

We have two phases (or stages as you referred to):

- Parameter: "strategy" (initial phase) – This phase converts the document into a unified format that can be processed by AI.
  - Currently, we only have OCR strategies for PDFs and images. However, we are working on Docling support, which will enable us to handle almost all types of text documents (e.g., Word files) via our API – see [feat] Add docling support #54.
- Parameter: "model" (final phase) – This phase uses AI to structure the document into its final format.
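
To make the two phases concrete, here is a minimal Python sketch. Only the `strategy` and `model` parameter names come from the description above; the function names (`fake_ocr`, `fake_llm`, `process_document`), the registry keys, and the stub bodies are hypothetical and are not the service's actual code.

```python
from typing import Callable, Dict


def fake_ocr(file_bytes: bytes) -> str:
    """Hypothetical stand-in for an OCR strategy: document bytes -> plain text."""
    return file_bytes.decode("utf-8", errors="replace")


def fake_llm(text: str) -> str:
    """Hypothetical stand-in for the AI model that structures the text."""
    return "# Document\n\n" + text.strip()


# Hypothetical registries mapping parameter values to callables.
STRATEGIES: Dict[str, Callable[[bytes], str]] = {"example_ocr": fake_ocr}
MODELS: Dict[str, Callable[[str], str]] = {"example_llm": fake_llm}


def process_document(file_bytes: bytes, strategy: str, model: str) -> str:
    # Phase 1 ("strategy"): convert the document into a unified text format.
    unified_text = STRATEGIES[strategy](file_bytes)
    # Phase 2 ("model"): let the AI structure that text into its final format.
    return MODELS[model](unified_text)


result = process_document(b"  Invoice 42  ", strategy="example_ocr", model="example_llm")
```

The point of the sketch is that the two parameters select independent steps: the first only produces text, and the second only restructures it.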

In the near future, we will create documentation to provide an architectural visualization of the entire process. :)

But please clarify what you mean by "Python wrapper currently does not allow for this."
I'm not sure what you're referring to. Are you encountering any errors? If so, could you share more details?

pkarw · 2025-01-21T15:04:20Z

Maintainer

I think you mean that you can't set up a dynamic prompt for Ollama-based models. If you could, you could potentially do the extraction and remodeling within a single step, right?
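
As a rough illustration of what such a single step might look like, the sketch below builds the JSON body for one Ollama `/api/generate` call that sends both the image and a caller-supplied prompt to a vision model. The helper name `build_single_step_request` and the example prompt are assumptions made for illustration; this is not how the project currently works, and the request is only constructed here, not sent.

```python
import base64
import json


def build_single_step_request(image_bytes: bytes, prompt: str,
                              model: str = "llama3.2-vision") -> str:
    """Hypothetical helper: one request that both reads the image and structures it."""
    payload = {
        "model": model,
        "prompt": prompt,  # the dynamic, user-supplied prompt
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single response object instead of a stream
    }
    return json.dumps(payload)


body = build_single_step_request(
    b"fake-image-bytes",
    "Read this document and return its content as structured Markdown.",
)
```

The resulting `body` could then be POSTed to a running Ollama server, which would make the vision model responsible for both extraction and structuring.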

1 reply

giuliofrey Feb 6, 2025
Author

Yes exactly!
