
[linen] Linesearch (and lbfgs) support for TrainState #4471

Open · wants to merge 3 commits into main
Conversation


@emiresenov emiresenov commented Jan 6, 2025

What does this PR do?

This PR modifies the tx.update() call in TrainState.apply_gradients() to support the optional extra arguments value and value_fn used by optax's GradientTransformationExtraArgs.

These changes are in the same vein as the merged PR #4351, addressing #4144 for the TrainState class.
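
For context, a minimal usage sketch of what the change enables, assuming the revised apply_gradients() forwards value, grad (added in a later commit, see the discussion below), and value_fn to tx.update() when provided; until this PR lands, these extra keyword arguments are hypothetical:

import jax
import jax.numpy as jnp
import optax
from flax.training import train_state

def loss_fn(params):
  # Toy quadratic objective so the linesearch has something to evaluate.
  return jnp.sum(params['w'] ** 2)

params = {'w': jnp.ones(3)}
# optax.lbfgs() bundles scale_by_lbfgs with a zoom linesearch, whose update()
# requires the extra arguments value, grad, and value_fn.
state = train_state.TrainState.create(
  apply_fn=None, params=params, tx=optax.lbfgs()
)
value, grads = jax.value_and_grad(loss_fn)(state.params)
# value, grad, and value_fn are the optional keyword arguments this PR adds.
state = state.apply_gradients(grads=grads, value=value, grad=grads, value_fn=loss_fn)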



google-cla bot commented Jan 6, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@emiresenov changed the title from "Update train_state.py" to "[nnx] Linesearch (and lbfgs) support for TrainState" on Jan 6, 2025
Review thread on this hunk:

updates, new_opt_state = self.tx.update(
  grads_with_opt, self.opt_state, params_with_opt
)
if value is None or value_fn is None:
Collaborator:

I think if either of them is not None we should pass them.

Author:

Good idea. I'll update the PR

@emiresenov (Author) commented Jan 7, 2025:

I figure the same logic should hold for grad, since e.g. scale_by_polyak also uses GradientTransformationExtraArgs and only takes value as an additional keyword argument in .update().

I moved all three optional args into an update_kwargs dictionary, adding each to .update() only if it is not None. Additionally, my former commit should not have passed the **kwargs through to .update(), since TrainState uses those for something else; I mixed up the usage from the merged PR I was referring to.

See the changes in my latest commit and let me know if there are any issues.
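
For readers following along, a sketch of the pattern described above, inside apply_gradients() (illustrative of the approach, not necessarily the exact diff):

# Collect the optional extra arguments, forwarding each to tx.update()
# only when the caller actually supplied it.
update_kwargs = {}
if value is not None:
  update_kwargs['value'] = value
if grad is not None:
  update_kwargs['grad'] = grad
if value_fn is not None:
  update_kwargs['value_fn'] = value_fn
updates, new_opt_state = self.tx.update(
  grads_with_opt, self.opt_state, params_with_opt, **update_kwargs
)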

@cgarciae (Collaborator) commented Jan 8, 2025:

This version looks better. Can you fix the pre-commit issue? Run:

pip install pre-commit
pre-commit run --all-files

@cgarciae changed the title from "[nnx] Linesearch (and lbfgs) support for TrainState" to "[linen] Linesearch (and lbfgs) support for TrainState" on Jan 8, 2025
@cgarciae (Collaborator) commented Jan 8, 2025:

Tangent: curious if nnx.Optimizer works with scale_by_polyak? I've never used it but we recently added some support for Linesearch in Optimizer as well.

@emiresenov (Author):

> This version looks better. Can you fix the pre-commit issue? Run:
>
> pip install pre-commit
> pre-commit run --all-files

Fixed, see my latest commit!

@emiresenov (Author) commented Jan 8, 2025:

> Tangent: curious if nnx.Optimizer works with scale_by_polyak? I've never used it but we recently added some support for Linesearch in Optimizer as well.

I haven't used it either, but I see no reason why it wouldn't work, since the last commit of #4351 uses the same logic. I think the first commit of that PR (the one I mistakenly replicated in my first commit) suggested a solution that wouldn't work for the scale_by_polyak transform used by optax.polyak_sgd, since it didn't treat grad as optional when only value was passed, presumably because the author only had linesearch in mind.

However, this potential issue was fixed in the last commit of that PR, when the author moved everything into the added kwargs of .update(). With this change, each extra argument to the GradientTransformationExtraArgs update function is passed only if it is not None, which makes the solution more general and also covers scale_by_polyak, which only takes value as an additional keyword argument.
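
As a self-contained illustration of the scale_by_polyak case (raw optax, no TrainState involved), only value is passed as an extra argument:

import jax
import jax.numpy as jnp
import optax

def loss_fn(params):
  return jnp.sum(params ** 2)

params = jnp.ones(3)
tx = optax.polyak_sgd()  # built on scale_by_polyak; needs only `value` as extra
opt_state = tx.init(params)
value, grads = jax.value_and_grad(loss_fn)(params)
# No grad or value_fn here, so a kwargs-based apply_gradients would forward
# value alone, which is exactly what the None checks described above allow.
updates, opt_state = tx.update(grads, opt_state, params, value=value)
params = optax.apply_updates(params, updates)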
