
Conversation

@floriankozikowski
Contributor

Context of the PR

Closes #325

Contributions of the PR

Checks before merging PR

  • added documentation for any new feature
  • added unit tests
  • edited the what's new (if applicable)

Xw = X @ w[:-1] + w[-1]  # last entry of w holds the intercept
datafit_grad = datafit.gradient(X, y, Xw)
penalty_grad = penalty.gradient(w[:-1])
intercept_grad = datafit.intercept_update_step(y, Xw)  # see review comments below
@mathurinm
Collaborator

There may be an issue here, because intercept_update_step does not compute the gradient but a scaled version of it (it's multiplied by the stepsize).
The safest way would be to call raw_grad(y, Xw).sum(), which is equivalent to np.ones(n_samples) @ raw_grad(y, Xw), i.e. the gradient with respect to a feature full of ones.

Sounds good @Badr-MOUFAD?
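
A minimal sketch of that convention, reusing the names from the snippet above and assuming (as for skglm's smooth datafits) that raw_grad(y, Xw) returns the per-sample gradient of the datafit:

# the intercept behaves like a feature whose column is all ones, so its
# gradient is the plain sum of the per-sample raw gradient (no stepsize scaling)
raw = datafit.raw_grad(y, Xw)   # shape (n_samples,)
intercept_grad = raw.sum()      # same as np.ones(len(y)) @ raw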

@Badr-MOUFAD Badr-MOUFAD Jul 31, 2025
Collaborator

Good catch @mathurinm,

I have a small concern because it seems that in some parts of the code (the Logistic datafit) intercept_update_step accounts for the stepsize, whereas in other parts (Quadratic, Huber, Poisson, ...) intercept_update_step evaluates the gradient.

That being said, I agree that the safest option is raw_grad(y, Xw).sum().

@mathurinm
Collaborator

For Quadratic (and Huber I guess) it's because the stepsize is 1 (lc is $\|X_j\|^2 / n = 1$ for the intercept column of ones).
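
As a quick check, assuming the coordinatewise Lipschitz constant is $lc_j = \|X_j\|^2 / n$ and the intercept is handled as a feature column of ones:

$$lc_{\text{intercept}} = \frac{\|\mathbf{1}_n\|^2}{n} = \frac{n}{n} = 1,$$

so scaling the gradient by the stepsize $1 / lc$ changes nothing in that case.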

@Badr-MOUFAD Badr-MOUFAD Jul 31, 2025
Collaborator

You are right @mathurinm, thanks.

What should we do for the Poisson and Gamma datafits? They implement intercept_update_step with the stepsize convention, but since they are non-quadratic it doesn't actually make sense for them.

@floriankozikowski
Contributor Author

@Badr-MOUFAD I tried the refactor (also taking your comment into account, @mathurinm).
Let me know what you think. I didn't find a way to make the refactor any shorter.
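
For reference, a rough sketch (not the actual diff) of how the L-BFGS objective/gradient callback could assemble the intercept gradient with the raw_grad convention discussed above; the helper name obj_and_grad and the exact signatures are assumptions:

import numpy as np

def obj_and_grad(w, X, y, datafit, penalty):
    # last entry of w is the intercept
    Xw = X @ w[:-1] + w[-1]
    datafit_grad = datafit.gradient(X, y, Xw)
    penalty_grad = penalty.gradient(w[:-1])
    # intercept gradient = gradient along a column of ones, no stepsize scaling
    intercept_grad = datafit.raw_grad(y, Xw).sum()
    grad = np.concatenate([datafit_grad + penalty_grad, [intercept_grad]])
    value = datafit.value(y, w[:-1], Xw) + penalty.value(w[:-1])
    return value, grad

With scipy, such a callback can be passed as scipy.optimize.minimize(obj_and_grad, w_init, args=(X, y, datafit, penalty), method="L-BFGS-B", jac=True).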

@floriankozikowski
Contributor Author

Btw, looking at the initial issue #320 that initiated this PR: fitting the intercept reduced the speed gap with sklearn, but skglm is still slower.

--- Fitting Time Comparison ---
skglm (LBFGS): 8.0444 seconds
sklearn (L-BFGS): 2.4578 seconds

sklearn was 3.27x faster.

I guess this PR only focuses on the intercept, but maybe we should open a new issue to investigate other causes.

If you approve the refactor, I can delete the debug script (issue320) and we can merge.
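
The issue320 debug script itself isn't shown in the thread; below is only a hypothetical sketch of how such a side-by-side timing could be set up (the dataset, regularization values, and the skglm estimator wiring are assumptions, not the actual script):

import time
import numpy as np
from sklearn.linear_model import LogisticRegression
from skglm import GeneralizedLinearEstimator
from skglm.datafits import Logistic
from skglm.penalties import L2
from skglm.solvers import LBFGS

rng = np.random.default_rng(0)
X = rng.standard_normal((5_000, 200))
y = np.where(X @ rng.standard_normal(200) > 0, 1.0, -1.0)  # labels in {-1, 1}

t0 = time.perf_counter()
GeneralizedLinearEstimator(Logistic(), L2(alpha=1e-2), LBFGS()).fit(X, y)
print(f"skglm (LBFGS):    {time.perf_counter() - t0:.4f} seconds")

t0 = time.perf_counter()
LogisticRegression(solver="lbfgs").fit(X, y)  # regularization not matched exactly
print(f"sklearn (L-BFGS): {time.perf_counter() - t0:.4f} seconds")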

@Badr-MOUFAD
Collaborator

Thanks @floriankozikowski for the timing comparison.
Weird that sklearn is 3x faster, but I agree to tackle this in a separate PR.

Can you please add the intercept to the unit tests and update the what's new page?

@floriankozikowski
Contributor Author

@Badr-MOUFAD thanks for the feedback! It should be complete now and I'd say we can merge. I won't have access to my laptop for the next two weeks, so if anything comes up, I will look at it again in mid-August.

@mathurinm mathurinm merged commit 29d67fa into scikit-learn-contrib:main Aug 2, 2025
4 checks passed
Successfully merging this pull request may close these issues.

FEAT - Add fit_intercept support to LBFGS solver
