[Suggestion] Purging and embargoing to deal with unintended data leaks in cross-validation. #1589

These approaches are often used in financial ML, but they can benefit a wide variety of ML tasks.

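In short: add a safety gap between the k folds, or between the train, validation and test splits. Purging drops training samples whose information overlaps with the test period, and embargoing additionally drops the training samples that immediately follow it.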
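To make the idea concrete, here is a minimal sketch of what such a splitter could look like. It is a simplification: the function name `purged_kfold_indices` and its `purge` and `embargo` parameters are made up for illustration, and the gaps are fixed sample counts, whereas the articles below derive the purged region from the time span each label covers. It assumes the samples are ordered in time and not shuffled.

```python
import numpy as np

def purged_kfold_indices(n_samples, n_splits=5, purge=0, embargo=0):
    """Yield (train_idx, test_idx) for contiguous, unshuffled folds.

    purge:   samples dropped from training right before the test fold
    embargo: samples dropped from training right after the test fold
    (hypothetical parameters, fixed-size gaps for illustration only)
    """
    indices = np.arange(n_samples)
    for test_idx in np.array_split(indices, n_splits):
        start, stop = test_idx[0], test_idx[-1] + 1
        keep = np.ones(n_samples, dtype=bool)
        # Mask out the test fold plus the safety gap on both sides.
        keep[max(0, start - purge):min(n_samples, stop + embargo)] = False
        yield indices[keep], test_idx

# Usage: 20 time-ordered samples, 4 folds, 2-sample purge, 3-sample embargo.
for train_idx, test_idx in purged_kfold_indices(20, n_splits=4, purge=2, embargo=3):
    print("train:", train_idx, "test:", test_idx)
```

Each training set simply skips the test fold and the gap around it, so nothing adjacent to the test boundary is used for fitting.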

These articles explain it in detail:

[https://medium.com/mlearning-ai/why-k-fold-cross-validation-is-failing-in-finance-65c895e83fdf](https://medium.com/mlearning-ai/why-k-fold-cross-validation-is-failing-in-finance-65c895e83fdf)

[https://blog.quantinsti.com/cross-validation-embargo-purging-combinatorial/](https://blog.quantinsti.com/cross-validation-embargo-purging-combinatorial/)

The Combinatorial Purged Cross-Validation mentioned there (it is explained a little better here: [https://towardsai.net/p/l/the-combinatorial-purged-cross-validation-method](https://towardsai.net/p/l/the-combinatorial-purged-cross-validation-method)) helps create more walk-forward paths that are purely out-of-sample, for increased statistical significance. It was proposed by Marcos Lopez de Prado in “Advances in Financial Machine Learning”.
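As a rough sketch of the combinatorial part only: the data is cut into N contiguous groups, and every combination of k groups is held out as the test set in turn. The helper names below are hypothetical, and purging and embargoing would still have to be applied at every train/test boundary of each split.

```python
from itertools import combinations
from math import comb

def combinatorial_test_groups(n_groups, n_test_groups):
    """Return every choice of n_test_groups group indices to hold out as the test set."""
    return list(combinations(range(n_groups), n_test_groups))

def n_backtest_paths(n_groups, n_test_groups):
    """Number of complete out-of-sample paths that the splits can be recombined into."""
    # Every split contributes n_test_groups tested groups, and each path
    # needs exactly one out-of-sample prediction per group.
    return comb(n_groups, n_test_groups) * n_test_groups // n_groups

splits = combinatorial_test_groups(6, 2)
print(len(splits), n_backtest_paths(6, 2))  # prints "15 5"
```

With 6 groups and 2 test groups per split this gives 15 train/test splits that recombine into 5 backtest paths, compared with the single path of a plain walk-forward evaluation.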
