
About points that can be optimized in parse PDF to Markdown #137

Open
pengjunfeng11 opened this issue Feb 3, 2025 · 1 comment
@pengjunfeng11

In the `_prepare_messages` method in modellitellm.py, I found that the project uses the Markdown produced by the previous LLM call as a few-shot example for the next image conversion.

This could cause problems: if one conversion in a 200-page PDF introduces a formatting deviation, that deviation can propagate through the few-shot example and compound across the remaining pages, similar to 0.9^200 ≈ 0.000000000705508.
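As a quick illustration of the compounding intuition (this is not a model of the project's actual error behavior, just the geometric-decay arithmetic the 0.9^200 figure refers to):

```python
# If each page conversion stays format-faithful with probability 0.9,
# independently, the chance that ALL pages in a long PDF stay clean
# decays geometrically with the page count.
per_page_fidelity = 0.9
pages = 200

prob_all_pages_clean = per_page_fidelity ** pages
print(f"{prob_all_pages_clean:.15f}")  # ≈ 0.000000000705508
```

Even a modest per-page deviation rate makes an end-to-end-clean 200-page document vanishingly unlikely if deviations are allowed to propagate.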

Would it be possible to add an intermediate step, perhaps using a multimodal LLM, to perform self-consistency checks on each page's output before it is reused?
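One cheap variant of such a check, sketched below, doesn't even need a second LLM call: validate each page's Markdown structurally, and reset (rather than propagate) the few-shot example when a page fails. All names here (`check_markdown`, `convert_with_check`, `convert_page`) are hypothetical, not the project's actual API:

```python
FENCE = "`" * 3  # a Markdown code-fence delimiter

def check_markdown(md: str) -> bool:
    """Cheap structural sanity checks on one page of Markdown."""
    if not md.strip():
        return False
    # Code fences must come in open/close pairs.
    if md.count(FENCE) % 2 != 0:
        return False
    # Table rows should have a consistent column count (crude check).
    rows = [ln for ln in md.splitlines() if ln.strip().startswith("|")]
    if rows:
        widths = {ln.count("|") for ln in rows}
        if len(widths) > 1:
            return False
    return True

def convert_with_check(pages, convert_page, max_retries=2):
    """Convert pages one by one, but only reuse a page's Markdown as
    the next few-shot example if it passes the structural check, so a
    single bad page does not contaminate every page after it."""
    prior_example = None
    results = []
    for page in pages:
        for _attempt in range(max_retries + 1):
            md = convert_page(page, prior_example)
            if check_markdown(md):
                prior_example = md  # safe to reuse as the few-shot example
                break
            prior_example = None  # reset: do not propagate a bad example
        results.append(md)  # keep the last attempt even if it failed
    return results
```

A multimodal self-consistency check would slot into the same place as `check_markdown`, comparing the generated Markdown back against the page image instead of (or in addition to) these structural heuristics.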

@pengjunfeng11 (Author)

I found that this feature is optional but enabled by default. Nonetheless, this improvement still seems worthwhile; I would like to hear the author's thoughts on it.
