Description
This may seem like a very specific situation rather than a bug. When I train a LightGBM model (via the sklearn API) with an early-stopping callback and tune parameters via OptunaSearchCV, multiple models end up sharing the same early-stopping callback object. With OptunaSearchCV(n_jobs=1) this is not an issue: only one model is trained at a time, so the callback object is effectively reinitialized for each model. However, when n_jobs > 1, multiple models train concurrently and appear to update the same underlying callback object, which can cause incorrect early stopping.
I first observed this with OptunaSearchCV(n_jobs=2) (two Optuna trials, and hence two models, running at the same time): the best iterations of the two models would sometimes be identical, which should not happen when they use different hyperparameters. As seen in the logs in the example below, Optuna starts two trials concurrently each time (hence the two consecutive best-iteration logs), and the two trials sometimes report the same best iteration (this is especially prominent when each trial takes a long time to complete).
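The interleaving described above can be illustrated without LightGBM at all. The class below is a hypothetical stand-in (not LightGBM's actual implementation) for a stateful early-stopping callback: it caches the best score seen so far, and when two models share one instance, the second model's bookkeeping is corrupted by the first model's scores.

```python
# Minimal sketch of why sharing one stateful early-stopping callback
# between two concurrently trained models is unsafe. SharedEarlyStopping
# is a hypothetical stand-in, not LightGBM's real callback.

class SharedEarlyStopping:
    def __init__(self, rounds):
        self.rounds = rounds
        self.best_score = float("-inf")  # shared by every model that calls us!
        self.best_iter = -1

    def update(self, iteration, score):
        """Record a validation score; return True when training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.best_iter = iteration
        return iteration - self.best_iter >= self.rounds


cb = SharedEarlyStopping(rounds=1)
# Two trainings interleave on the SAME callback object:
cb.update(0, 0.90)            # model A, iteration 0
cb.update(0, 0.50)            # model B, iteration 0: ignored, A's score wins
stop_b = cb.update(1, 0.55)   # model B compares against A's best -> stops
```

Here model B is stopped based on model A's cached best score, mirroring how two concurrent trials could end up reporting the same best iteration.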
Additional Comments
As a workaround, I wrote a wrapper around lgbm.early_stopping that creates a new early-stopping callback for each new model and stores it in a dictionary (using env.model as the key), effectively giving each model its own callback. Would it be possible for OptunaSearchCV to create new callbacks for each model it trains, to accommodate parallel training of multiple models with callbacks?
Environment
- Optuna version: 3.6.1
- Optuna Integration version: 4.4.0
- Python version: 3.10.17
- OS: Linux-5.14.0-427.42.1.el9_4.x86_64-x86_64-with-glibc2.34
- LightGBM version or commit hash: 4.6.0
Error messages, stack traces, or logs
output:
[I 2025-06-30 12:26:31,838] A new study created in memory with name: no-name-3caa9c36-2a44-4493-93fd-f04edfd5c3ec
Training until validation scores don't improve for 100 rounds
Training until validation scores don't improve for 100 rounds
Early stopping, best iteration is:
[19]	valid_0's average_precision: 0.995816
Early stopping, best iteration is:
[19]	valid_0's average_precision: 0.995816
/users/ysu13/miniforge3/envs/drug_sensitivity_ml/lib/python3.10/site-packages/sklearn/utils/validation.py:2739: UserWarning: X does not have valid feature names, but LGBMClassifier was fitted with feature names
  warnings.warn(
/users/ysu13/miniforge3/envs/drug_sensitivity_ml/lib/python3.10/site-packages/sklearn/utils/validation.py:2739: UserWarning: X does not have valid feature names, but LGBMClassifier was fitted with feature names
  warnings.warn(
Training until validation scores don't improve for 100 rounds
Training until validation scores don't improve for 100 rounds
Early stopping, best iteration is:
[31]	valid_0's average_precision: 0.995391
Early stopping, best iteration is:
[31]	valid_0's average_precision: 0.995391
/users/ysu13/miniforge3/envs/drug_sensitivity_ml/lib/python3.10/site-packages/sklearn/utils/validation.py:2739: UserWarning: X does not have valid feature names, but LGBMClassifier was fitted with feature names
  warnings.warn(
/users/ysu13/miniforge3/envs/drug_sensitivity_ml/lib/python3.10/site-packages/sklearn/utils/validation.py:2739: UserWarning: X does not have valid feature names, but LGBMClassifier was fitted with feature names
  warnings.warn(
Training until validation scores don't improve for 100 rounds
Training until validation scores don't improve for 100 rounds
Early stopping, best iteration is:
[50]	valid_0's average_precision: 0.996493
Early stopping, best iteration is:
[50]	valid_0's average_precision: 0.996493

Steps to reproduce
Reproducible example
from lightgbm import LGBMClassifier
from optuna.integration import OptunaSearchCV
import optuna
import lightgbm as LGB
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn import datasets
X, y = datasets.make_classification(n_samples=1000, n_features=2000, n_classes=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=20)
lgbm_search_params= {
        "max_depth": optuna.distributions.IntDistribution(3, 20),
        "min_child_samples": optuna.distributions.IntDistribution(3, 30),
        "min_child_weight": optuna.distributions.IntDistribution(3, 20),
        "learning_rate": optuna.distributions.FloatDistribution(0.01, 0.2),
        "num_leaves": optuna.distributions.IntDistribution(10, 200),
        "subsample": optuna.distributions.FloatDistribution(0.5, 1.0)
}
lgbm = LGBMClassifier(n_jobs=8, is_unbalance=False, verbose=-1, n_estimators=500, metric=None, subsample_freq=1)
search_estimator_best = OptunaSearchCV(
            lgbm,
            param_distributions=lgbm_search_params,
            cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
            scoring='average_precision',
            n_jobs=2,
            n_trials=6,
            verbose=2
)
search_estimator_best.fit(X_train, y_train, **{'eval_set': [(X_test, y_test)],
                             'eval_metric': 'average_precision',
                             'callbacks': [LGB.early_stopping(stopping_rounds=100)]})

Additional context (optional)
I previously opened this issue over at LightGBM; see microsoft/LightGBM#6957.
@jameslamb