
Commit 3397c41

Merge pull request #52 from vecxoz/dev2
Global maintenance 2025 turn 3 pull
2 parents 1698f7b + bcbca76 commit 3397c41

21 files changed: +1414 -747 lines changed

.github/workflows/actions.yaml

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+name: Build
+
+on: [push]
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.9", "3.10", "3.11", "3.12"]
+
+    steps:
+      - name: Checkout the ${{ github.repository }} repository
+        uses: actions/checkout@v4
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[test]
+      - name: Test with Pytest
+        run: |
+          pytest --cov=vecstack --cov-report=term-missing tests
+      - name: Coveralls
+        uses: coverallsapp/github-action@v2

.travis.yml

Lines changed: 0 additions & 40 deletions
This file was deleted.

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,5 +1,17 @@
 # Changelog

+### v0.5.0 -- September 8, 2025 -- Maintenance release
+
+* Python 3.9+
+* Testing: pytest and pytest-cov
+* CI: GitHub Actions
+
+* Scikit-learn API:
+    * Fixed `_set_params` method which was not resetting individual estimators in the `estimators` collection
+
+* Functional API
+    * Fixed saving OOF arrays in file
+
 ### v0.4.0 -- August 12, 2019

 Since v0.4.0 vecstack provides official support for Python 3.5 and higher only,

LICENSE.txt

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 MIT License

 Vecstack. Python package for stacking (machine learning technique)
-Copyright (c) 2016-2019 Igor Ivanov
+Copyright (c) 2016-2025 Igor Ivanov


 Permission is hereby granted, free of charge, to any person obtaining a copy

README.md

Lines changed: 26 additions & 13 deletions
@@ -1,6 +1,6 @@
 [![PyPI version](https://img.shields.io/pypi/v/vecstack.svg?colorB=4cc61e)](https://pypi.python.org/pypi/vecstack)
 [![PyPI license](https://img.shields.io/pypi/l/vecstack.svg)](https://github.com/vecxoz/vecstack/blob/master/LICENSE.txt)
-[![Build Status](https://travis-ci.org/vecxoz/vecstack.svg?branch=master)](https://travis-ci.org/vecxoz/vecstack)
+[![Build status](https://github.com/vecxoz/vecstack/actions/workflows/actions.yaml/badge.svg?branch=master)](https://github.com/vecxoz/vecstack/actions)
 [![Coverage Status](https://coveralls.io/repos/github/vecxoz/vecstack/badge.svg?branch=master)](https://coveralls.io/github/vecxoz/vecstack?branch=master)
 [![PyPI pyversions](https://img.shields.io/pypi/pyversions/vecstack.svg)](https://pypi.python.org/pypi/vecstack/)

@@ -11,18 +11,18 @@ Convenient way to automate OOF computation, prediction and bagging using any num
 * [Functional API](https://github.com/vecxoz/vecstack#usage-functional-api):
     * Minimalistic. Get your stacked features in a single line
     * RAM-friendly. The lowest possible memory consumption
-    * Kaggle-ready. Stacked features and hyperparameters from each run can be [automatically saved](https://github.com/vecxoz/vecstack/blob/master/vecstack/core.py#L209) in files. No more mess at the end of the competition. [Log example](https://github.com/vecxoz/vecstack/blob/master/examples/03_log_example.txt)
+    * Kaggle-ready. Stacked features and hyperparameters from each run can be [automatically saved](https://github.com/vecxoz/vecstack/blob/master/vecstack/core.py#L210) in files. No more mess at the end of the competition. [Log example](https://github.com/vecxoz/vecstack/blob/master/examples/03_log_example.txt)
 * [Scikit-learn API](https://github.com/vecxoz/vecstack#usage-scikit-learn-api):
     * Standardized. Fully scikit-learn compatible transformer class exposing `fit` and `transform` methods
     * Pipeline-certified. Implement and deploy [multilevel stacking](https://github.com/vecxoz/vecstack/blob/master/examples/04_sklearn_api_regression_pipeline.ipynb) like it's no big deal using `sklearn.pipeline.Pipeline`
     * And of course `FeatureUnion` is also invited to the party
 * Overall specs:
     * Use any sklearn-like estimators
-    * Perform [classification and regression](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L83) tasks
-    * Predict [class labels or probabilities](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L119) in classification task
-    * Apply any [user-defined metric](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L124)
-    * Apply any [user-defined transformations](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L87) for target and prediction
-    * Python 3.5 and higher, [unofficial support for Python 2.7 and 3.4](https://github.com/vecxoz/vecstack/blob/master/PY2.md)
+    * Perform [classification and regression](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L85) tasks
+    * Predict [class labels or probabilities](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L121) in classification task
+    * Apply any [user-defined metric](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L126)
+    * Apply any [user-defined transformations](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L89) for target and prediction
+    * Python 3.9+, [unofficial support for Python 2.7 and 3.4](https://github.com/vecxoz/vecstack/blob/master/PY2.md)
     * Win, Linux, Mac
     * [MIT license](https://github.com/vecxoz/vecstack/blob/master/LICENSE.txt)
     * Depends on **numpy**, **scipy**, **scikit-learn>=0.18**
@@ -44,19 +44,19 @@ Convenient way to automate OOF computation, prediction and bagging using any num
         * [Regression + Multilevel stacking using Pipeline](https://github.com/vecxoz/vecstack/blob/master/examples/04_sklearn_api_regression_pipeline.ipynb)
 * Documentation:
     * [Functional API](https://github.com/vecxoz/vecstack/blob/master/vecstack/core.py#L133) or type ```>>> help(stacking)```
-    * [Scikit-learn API](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L64) or type ```>>> help(StackingTransformer)```
+    * [Scikit-learn API](https://github.com/vecxoz/vecstack/blob/master/vecstack/coresk.py#L66) or type ```>>> help(StackingTransformer)```

 # Installation

-***Note:*** Python 3.5 or higher is required. If you’re still using Python 2.7 or 3.4 see [installation details here](https://github.com/vecxoz/vecstack/blob/master/PY2.md)
+***Note:*** Python 3.9+ is officially supported and tested. If you’re still using Python 2.7 or 3.4 see [installation details here](https://github.com/vecxoz/vecstack/blob/master/PY2.md)

 * ***Classic 1st time installation (recommended):***
     * `pip install vecstack`
 * Install for current user only (if you have some troubles with write permission):
     * `pip install --user vecstack`
 * If your PATH doesn't work:
     * `/usr/bin/python -m pip install vecstack`
-    * `C:/Python36/python -m pip install vecstack`
+    * `C:/Python3/python -m pip install vecstack`
 * Upgrade vecstack and all dependencies:
     * `pip install --upgrade vecstack`
 * Upgrade vecstack WITHOUT upgrading dependencies:
@@ -137,6 +137,7 @@ S_test = stack.transform(X_test)
 28. [Can I use `(Randomized)GridSearchCV` to tune the whole stacking Pipeline?](https://github.com/vecxoz/vecstack#28-can-i-use-randomizedgridsearchcv-to-tune-the-whole-stacking-pipeline)
 29. [How to define custom metric, especially AUC?](https://github.com/vecxoz/vecstack#29-how-to-define-custom-metric-especially-auc)
 30. [Do folds (splits) have to be the same across estimators and stacking levels? How does `random_state` work?](https://github.com/vecxoz/vecstack#30-do-folds-splits-have-to-be-the-same-across-estimators-and-stacking-levels-how-does-random_state-work)
+31. [How does `vecstack.StackingTransformer` differ from `sklearn.ensemble.StackingClassifier`?](https://github.com/vecxoz/vecstack#31-how-does-vecstackstackingtransformer-differ-from-sklearnensemblestackingclassifier)

 ### 1. How can I report an issue? How can I ask a question about stacking or vecstack package?

@@ -167,7 +168,7 @@ Main idea is to use predictions as features.
 More specifically we predict train set (in CV-like fashion) and test set using some 1st level model(s), and then use these predictions as features for 2nd level model. You can find more details (concept, pictures, code) in [stacking tutorial](https://github.com/vecxoz/vecstack/blob/master/examples/00_stacking_concept_pictures_code.ipynb).
 Also make sure to check out:
 * [Ensemble Learning](https://en.wikipedia.org/wiki/Ensemble_learning) ([Stacking](https://en.wikipedia.org/wiki/Ensemble_learning#Stacking)) in Wikipedia
-* Classical [Kaggle Ensembling Guide](https://mlwave.com/kaggle-ensembling-guide/)
+* Classical [Kaggle Ensembling Guide](https://mlwave.com/kaggle-ensembling-guide/) or try [another link](https://web.archive.org/web/20210727094233/https://mlwave.com/kaggle-ensembling-guide/)
 * [Stacked Generalization](https://www.researchgate.net/publication/222467943_Stacked_Generalization) paper by David H. Wolpert

 ### 5. What about stacking name?
@@ -216,7 +217,7 @@ Speaking about inner stacking mechanics, you should remember that when you have
 ### 12. What is *blending*? How is it related to stacking?

 Basically it is the same thing. Both approaches use predictions as features.
-Often this terms are used interchangeably.
+Often these terms are used interchangeably.
 The difference is how we generate features (predictions) for the next level:
 * *stacking*: perform cross-validation procedure and predict each part of train set (OOF)
 * *blending*: predict fixed holdout set
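
The stacking/blending distinction in the README text above comes down to which training rows receive a prediction that can serve as a feature. The following is a minimal standard-library sketch, not vecstack code: the helper names (`kfold_indices`, `stacking_covered_rows`, `blending_covered_rows`) and the choice of contiguous folds and a trailing holdout are assumptions made for illustration only.

```python
# Illustrative sketch (not vecstack's implementation): which training rows
# get a prediction-feature under stacking (K-fold OOF) versus blending
# (a single fixed holdout). All helper names here are hypothetical.

def kfold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous validation folds."""
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def stacking_covered_rows(n_samples, n_folds):
    """Stacking: each fold is predicted once by a model fit on the other folds."""
    covered = []
    for fold in kfold_indices(n_samples, n_folds):
        covered.extend(fold)
    return sorted(covered)


def blending_covered_rows(n_samples, holdout_fraction):
    """Blending: only the fixed holdout part (here, the last rows) is predicted."""
    n_holdout = int(n_samples * holdout_fraction)
    return list(range(n_samples - n_holdout, n_samples))


stacking_rows = stacking_covered_rows(n_samples=10, n_folds=3)
blending_rows = blending_covered_rows(n_samples=10, holdout_fraction=0.3)
print(len(stacking_rows), len(blending_rows))
```

In this sketch stacking yields a feature for every training row, while blending yields features only for the holdout rows, so the next-level model trains on less data.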
@@ -387,10 +388,14 @@ def auc(y_true, y_pred):

 To ensure better result, folds (splits) have to be the same across all estimators and all stacking levels. It means that `random_state` has to be the same in every call to `stacking` function or `StackingTransformer`. This is default behavior of `stacking` function and `StackingTransformer` (by default `random_state=0`). If you want to try different folds (splits) try to set different `random_state` values.
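
The paragraph above says that reusing one `random_state` keeps the folds identical across calls. A rough way to see why is the sketch below, which uses Python's `random` module as a stand-in splitter. It is an assumption-laden illustration: `shuffled_folds` is a hypothetical helper, and it does not reproduce how vecstack or scikit-learn actually shuffle and split.

```python
import random

# Illustrative sketch: a seeded shuffle is deterministic, so two calls with
# the same seed produce the same folds. `shuffled_folds` is a hypothetical
# stand-in, not the splitter used by vecstack or scikit-learn.

def shuffled_folds(n_samples, n_folds, random_state):
    """Shuffle row indices with a seeded generator and deal them into folds."""
    indices = list(range(n_samples))
    random.Random(random_state).shuffle(indices)
    return [sorted(indices[i::n_folds]) for i in range(n_folds)]


# Same seed for two different estimators (or stacking levels): same folds.
folds_for_model_a = shuffled_folds(12, 4, random_state=0)
folds_for_model_b = shuffled_folds(12, 4, random_state=0)

# A different seed gives an independent shuffle, hence usually different folds.
folds_other_seed = shuffled_folds(12, 4, random_state=1)

print(folds_for_model_a == folds_for_model_b)
```

Because every estimator then sees the same train/validation partition, the OOF predictions of all estimators line up row by row with the same held-out data.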

+### 31. How does `vecstack.StackingTransformer` differ from `sklearn.ensemble.StackingClassifier`?
+
+It significantly differs. Please see a [detailed explanation](https://github.com/vecxoz/vecstack/issues/37).
+

 # Stacking concept

-1. We want to predict train set and test set with some 1st level model(s), and then use these predictions as features for 2nd level model(s).
+1. We want to predict train set and test set with some 1st level model(s), and then use these predictions as features for 2nd level model(s).
 2. Any model can be used as 1st level model or 2nd level model.
 3. To avoid overfitting (for train set) we use cross-validation technique and in each fold we predict out-of-fold (OOF) part of train set.
 4. The common practice is to use from 3 to 10 folds.
@@ -404,6 +409,7 @@ To ensure better result, folds (splits) have to be the same across all estimator
 8. We can repeat this cycle using other 1st level models to get more features for 2nd level model.
 9. You can also look at animation of [Variant A](https://github.com/vecxoz/vecstack#variant-a-animation) and [Variant B](https://github.com/vecxoz/vecstack#variant-b-animation).
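
The stacking concept steps listed above can be sketched in plain Python. This is a toy illustration under stated assumptions, not vecstack's code: `MeanRegressor` and `oof_features` are hypothetical names, folds are assigned by a simple stride, and the test-set feature is obtained by averaging the per-fold test predictions, which is one possible way to handle the test set and is assumed here rather than taken from the visible text.

```python
from statistics import mean

# Toy sketch of out-of-fold (OOF) feature generation. All names are
# hypothetical; this is not vecstack's implementation.

class MeanRegressor:
    """Toy 1st level model: always predicts the mean of the target it was fit on."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def oof_features(model_factory, X_train, y_train, X_test, n_folds=3):
    """Return (S_train, S_test): OOF predictions for train, averaged predictions for test."""
    n = len(X_train)
    S_train = [None] * n
    test_preds_per_fold = []
    for k in range(n_folds):
        val_idx = list(range(k, n, n_folds))  # out-of-fold part for this fold
        val_set = set(val_idx)
        fit_idx = [i for i in range(n) if i not in val_set]
        model = model_factory().fit(
            [X_train[i] for i in fit_idx], [y_train[i] for i in fit_idx]
        )
        # Each train row is predicted by a model that never saw it during fitting.
        for i, pred in zip(val_idx, model.predict([X_train[i] for i in val_idx])):
            S_train[i] = pred
        test_preds_per_fold.append(model.predict(X_test))
    # Assumed test handling: average the test predictions made in each fold.
    S_test = [mean(column) for column in zip(*test_preds_per_fold)]
    return S_train, S_test


X_train = [[0], [1], [2], [3], [4], [5]]
y_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X_test = [[6], [7]]
S_train, S_test = oof_features(MeanRegressor, X_train, y_train, X_test, n_folds=3)
print(S_train, S_test)
```

`S_train` and `S_test` then play the role of one new feature column for the 2nd level model; repeating the call with other 1st level models would add more columns.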

+
 # Variant A

 ![Fold 1 of 3](https://github.com/vecxoz/vecstack/raw/master/pic/dia1.png "Fold 1 of 3")
@@ -429,3 +435,10 @@ To ensure better result, folds (splits) have to be the same across all estimator
 # Variant B. Animation

 ![Variant B. Animation](https://github.com/vecxoz/vecstack/raw/master/pic/animation2.gif "Variant B. Animation")
+
+
+# References
+
+* [Ensemble Learning](https://en.wikipedia.org/wiki/Ensemble_learning) ([Stacking](https://en.wikipedia.org/wiki/Ensemble_learning#Stacking)) in Wikipedia
+* Classical [Kaggle Ensembling Guide](https://mlwave.com/kaggle-ensembling-guide/) or try [another link](https://web.archive.org/web/20210727094233/https://mlwave.com/kaggle-ensembling-guide/)
+* [Stacked Generalization](https://www.researchgate.net/publication/222467943_Stacked_Generalization) paper by David H. Wolpert
