In the `_prepare_messages` method in `modellitellm.py`, I found that the project uses the markdown produced for the previous page as a few-shot example for the LLM's conversion of the next image.
This could cause problems: if a formatting deviation occurs during any one conversion in a 200-page PDF, that deviation can be carried forward and compound over the remaining pages, much like 0.9^200 ≈ 7.06e-10.
Do you think it would be possible to add some intermediate steps, perhaps using a multimodal LLM, to perform self-consistency checks on the output?
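To make the idea concrete, here is a minimal sketch of what such a check could look like. All names here (`convert_page`, `looks_consistent`, `convert_document`) are hypothetical stand-ins, not the project's actual API; the point is only the control flow: if a page's output fails the consistency check, fall back to a known-good example instead of chaining the bad output forward.

```python
def convert_page(page: str, example: str) -> str:
    """Stand-in for the LLM call that converts one page image to markdown,
    conditioned on the previous page's output as a few-shot example.
    In this toy model, any deviation in the example propagates forward."""
    return example + page


def looks_consistent(markdown: str) -> bool:
    """Hypothetical self-consistency check, e.g. a second multimodal LLM
    pass or a structural lint. Here: reject output containing a marker."""
    return "DRIFT" not in markdown


def convert_document(pages: list[str], golden_example: str) -> list[str]:
    """Convert pages in order, but only chain an output as the next
    few-shot example if it passes the consistency check."""
    results = []
    example = golden_example
    for page in pages:
        out = convert_page(page, example)
        if looks_consistent(out):
            example = out  # safe to use as the next few-shot example
        else:
            example = golden_example  # reset instead of compounding drift
        results.append(out)
    return results


# The compounding risk the issue describes: a 0.9 per-page fidelity over
# 200 chained pages leaves almost nothing intact.
print(0.9 ** 200)  # → ~7.06e-10
```

With a gate like this, a single bad page costs one page rather than poisoning every page after it, which removes the 0.9^n compounding behavior.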
I found that this feature is optional, but it is enabled by default. Even so, the improvement still seems worthwhile; I'd be curious what the author thinks.