
Fix llm hp optimization error #2576

Open · wants to merge 4 commits into base: release-1.9

Conversation

helenxie-bit (Contributor)
What this PR does / why we need it:
This PR fixes errors that occur when using the Katib LLM hyperparameter optimization API (which depends on the Trainer SDK v1.9.0) to run the example in the user guide.

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #2575

Checklist:

  • Docs included if any changes are user facing

Signed-off-by: helenxie-bit <[email protected]>

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jinchihe for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: helenxie-bit <[email protected]>
@helenxie-bit (Contributor, Author)

Please review when you have time, @andreyvelich @mahdikhashan. Thank you!

@helenxie-bit (Contributor, Author)

helenxie-bit commented Mar 29, 2025

The E2E test for the train API failed with the following error: `TypeError: Object of type LoraRuntimeConfig is not JSON serializable`. I'm working on fixing it.

Updated 2025-03-31:
I fixed the issue by updating the following line of code:

json.dumps(
    trainer_parameters.lora_config.__dict__, cls=utils.SetEncoder
),

to:

json.dumps(trainer_parameters.lora_config.to_dict(), cls=utils.SetEncoder),

This change follows the official documentation, which recommends using LoraConfig.to_dict() for serialization.
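The failure mode can be reproduced outside Katib. Below is a minimal sketch (with hypothetical stand-in classes, not the real peft `LoraConfig` or Katib's `utils.SetEncoder`) showing why `json.dumps` on `__dict__` fails when a config holds a nested config object, while a `to_dict()` method that recursively flattens nested objects succeeds:

```python
# Hypothetical stand-ins illustrating the serialization issue; the
# real classes are peft's LoraConfig/LoraRuntimeConfig and Katib's
# utils.SetEncoder.
import json
from dataclasses import dataclass, field


@dataclass
class RuntimeConfig:
    # Stands in for LoraRuntimeConfig: a nested object json can't encode.
    ephemeral_gpu_offload: bool = False


@dataclass
class Config:
    # Stands in for LoraConfig.
    r: int = 8
    target_modules: set = field(default_factory=lambda: {"q_proj", "v_proj"})
    runtime_config: RuntimeConfig = field(default_factory=RuntimeConfig)

    def to_dict(self):
        # Recursively convert nested config objects to plain dicts.
        d = dict(self.__dict__)
        d["runtime_config"] = dict(self.runtime_config.__dict__)
        return d


class SetEncoder(json.JSONEncoder):
    # Stands in for utils.SetEncoder: serializes sets as lists.
    def default(self, obj):
        if isinstance(obj, set):
            return list(obj)
        return super().default(obj)


cfg = Config()

# __dict__ leaves runtime_config as a RuntimeConfig instance, which
# neither json nor SetEncoder knows how to encode, so this raises.
try:
    json.dumps(cfg.__dict__, cls=SetEncoder)
except TypeError as e:
    print("TypeError:", e)

# to_dict() returns only JSON-friendly values, so this succeeds.
print(json.dumps(cfg.to_dict(), cls=SetEncoder, sort_keys=True))
```

The design point is that `SetEncoder` only handles sets; any other non-primitive attribute (like the nested runtime config) still needs to be flattened before encoding, which is exactly what `to_dict()` does.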

@mahdikhashan Can you help test whether this fixes your issue? I remember you ran into the same problem.

@helenxie-bit helenxie-bit changed the title fix llm hp optimization error [WIP] fix llm hp optimization error Mar 29, 2025
Signed-off-by: helenxie-bit <[email protected]>

Signed-off-by: helenxie-bit <[email protected]>
@helenxie-bit helenxie-bit changed the title [WIP] fix llm hp optimization error Fix llm hp optimization error Mar 31, 2025
@mahdikhashan (Member)

/assign
