About the objective function for model fine-tuning #25
Comments
Hi! It's a data-driven implementation.
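To make "data-driven" concrete, here is a minimal sketch of what that usually means for SFT: the tags are simply part of the target text, and the loss is ordinary next-token cross-entropy over the response tokens. This is not taken from the repository. The exact tag syntax (`<SUMMARY>...</SUMMARY>`), the helper names `format_target`, `build_labels`, and `sft_loss`, and the prompt masking with `-100` (the default `ignore_index` of PyTorch's cross-entropy loss) are assumptions for illustration.

```python
import math

IGNORE_INDEX = -100  # positions with this label do not contribute to the loss

# Stage names from the question; the <X>...</X> syntax is an assumption.
STAGES = ["SUMMARY", "CAPTION", "REASONING", "CONCLUSION"]


def format_target(stage_texts):
    """Wrap each stage's text in its tags, in the fixed stage order."""
    return "".join(f"<{s}>{stage_texts[s]}</{s}>" for s in STAGES)


def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response; mask the prompt so only the
    response (tags included) is supervised."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


def sft_loss(logits, labels):
    """Mean cross-entropy over unmasked positions.

    logits[t] is the list of vocabulary scores used to predict labels[t]
    (the usual one-position shift is assumed to be already applied).
    """
    total, count = 0.0, 0
    for scores, target in zip(logits, labels):
        if target == IGNORE_INDEX:
            continue
        m = max(scores)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[target]
        count += 1
    return total / count if count else 0.0
```

Nothing in this loss refers to the tags explicitly. Under these assumptions the model learns the format only because every training target contains the tags in the same order.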
Thank you for your prompt reply. I'm fine-tuning the original LLaVA model with LoRA on 10k samples (formatted and labeled in the CoT style from the paper), hoping to get the four-stage CoT output described in the paper. However, when I try this, the model's output is garbled (it contains Greek text, Spanish text, and some confusing coding). Can you help me determine the possible reasons for this? Have you also seen garbled output when training the model, and if so, how did you resolve it?
I haven't encountered similar outputs, but I think you can divide the process into 3 steps. First, fine-tune a LLaVA model on a normal dataset and make sure that works. Second, check the encoding method of LLaVA: if it interprets <X> as a special token, this will lead to errors. Finally, train again and see the results.
Thanks for your suggestion. I tried fine-tuning on a normal dataset and that experiment behaved normally, so I will go on to try the second point you mentioned. Thanks a lot for your help.
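The second step can be sketched as a round-trip check: encode each tag, decode it again, and flag any tag that comes back changed or that collides with a registered special token. The helper names `tag_strings` and `check_tags` and the `<X>` / `</X>` tag syntax are assumptions for illustration. The toy character-level tokenizer only keeps the example self-contained; with a Hugging Face tokenizer you would pass `tokenizer.encode` (with `add_special_tokens=False`), `tokenizer.decode`, and `tokenizer.all_special_tokens` instead.

```python
# Stage names from the thread; the <X> / </X> syntax is an assumption.
TAGS = ["SUMMARY", "CAPTION", "REASONING", "CONCLUSION"]


def tag_strings(names=TAGS):
    """Return the opening and closing tag for every stage name."""
    out = []
    for name in names:
        out += [f"<{name}>", f"</{name}>"]
    return out


def check_tags(encode, decode, special_tokens=()):
    """Return {tag: problem} for tags that collide with a special token
    or that do not survive an encode/decode round trip."""
    problems = {}
    for tag in tag_strings():
        if tag in special_tokens:
            problems[tag] = "registered as a special token"
            continue
        text = decode(encode(tag))
        if text.strip() != tag:
            problems[tag] = f"round trip gave {text!r}"
    return problems


# Toy character-level tokenizer, used only to keep the sketch runnable.
def toy_encode(text):
    return [ord(c) for c in text]


def toy_decode(ids):
    return "".join(chr(i) for i in ids)


if __name__ == "__main__":
    print(check_tags(toy_encode, toy_decode))
```

An empty result means every tag round-trips cleanly. Any entry in the result points at a tag worth inspecting before training again.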
Great work! I am curious about the objective function for model fine-tuning. The paper mentions a Supervised Fine-Tuning (SFT) approach for training, and my question is how to ensure that the model produces formatted output, i.e., that the model output contains the SUMMARY, CAPTION, REASONING, and CONCLUSION tags.
Does this require designing additional objective functions, or is it just a data-driven implementation?