Description
We need to continue unified structure learning on the DocOwl1.5-stage1 model with some private data, followed by LoRA fine-tuning for the Document Parsing task. Our setup:

1. Following the original paper, we modified the parameters in the `finetune-docowl.sh` script, setting `tune_vision2text=True`, `freeze_vision_model=False`, and `freeze_base_model=True` in order to perform unified structure learning (see the first sketch after this list for how we read these flags). After this stage, the model was able to perform inference normally.
2. We then fine-tuned the model on the Document Parsing task using the `finetune-docowl_lora.sh` script, aiming to further improve its performance. During this fine-tuning, the loss decreased as expected; however, after applying the LoRA weights, the model's inference output became incoherent (see the second sketch for how we load the adapter).
3. By contrast, applying LoRA fine-tuning directly to the original DocOwl1.5-stage1 model did achieve the desired results.
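For reference, this is how we understand the three stage-1 flags to map onto trainable parameter groups. This is a rough sketch only; the attribute names (`vision2text`, `vision_model`, `language_model`) are illustrative assumptions, not the repo's exact identifiers:

```python
# Rough mapping of the stage-1 flags onto which parameters receive gradients.
# Attribute names below are illustrative assumptions, not the repo's exact ones.
def apply_freeze_flags(model,
                       tune_vision2text=True,
                       freeze_vision_model=False,
                       freeze_base_model=True):
    for p in model.vision2text.parameters():     # vision-to-text module (H-Reducer)
        p.requires_grad = tune_vision2text
    for p in model.vision_model.parameters():    # visual encoder
        p.requires_grad = not freeze_vision_model
    for p in model.language_model.parameters():  # LLM backbone
        p.requires_grad = not freeze_base_model
```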
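And this is roughly how we run inference after the LoRA stage; a minimal sketch assuming a standard PEFT-style adapter layout (the paths and the use of `AutoModelForCausalLM` are placeholders standing in for our actual DocOwl loading code):

```python
# Minimal sketch of loading the LoRA adapter on top of the continued
# stage-1 checkpoint for inference. Paths below are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "./checkpoints/docowl15-stage1-continued",  # checkpoint after unified structure learning
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, "./checkpoints/docowl15-lora-parsing")
model = model.merge_and_unload()  # merge adapter weights for plain inference
model.eval()
```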
We would appreciate any suggestions you might have regarding our experimental design to help us achieve the expected results.