A Python library for discovering causal networks from time series data using Optimal Causation Entropy (oCSE).
CausationEntropy implements state-of-the-art information-theoretic methods for causal discovery from multivariate time series. The library provides robust algorithms that can identify causal relationships while controlling for confounding variables and false discoveries.
Given time series data, CausationEntropy finds which variables cause changes in other variables by:
- Predictive Testing: Checking whether knowing variable X at time t helps predict variable Y at time t+1
- Information Theory: Using conditional mutual information to measure predictive relationships
- Statistical Control: Rigorous statistical testing to avoid false discoveries
- Multiple Methods: Supporting various information estimators and discovery algorithms
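As a conceptual illustration of the predictive test (a numpy-only sketch, not how the library estimates information): compare how well Y(t+1) is predicted with and without X(t), after accounting for Y's own history.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 2000
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(T - 1):
    # Y is driven by its own past and by X: here X "causes" Y by construction
    y[t + 1] = 0.5 * y[t] + 0.8 * x[t] + 0.1 * rng.normal()

def residual_var(features, target):
    """Variance of least-squares residuals of target regressed on features."""
    beta, *_ = np.linalg.lstsq(features, target, rcond=None)
    return np.var(target - features @ beta)

target = y[1:]
hist = np.column_stack([y[:-1], np.ones(T - 1)])            # Y's own history
hist_x = np.column_stack([y[:-1], x[:-1], np.ones(T - 1)])  # plus X(t)

# If X causes Y, adding X(t) should substantially reduce the prediction error
print(residual_var(hist, target), residual_var(hist_x, target))
```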
Install from PyPI:

```bash
pip install causationentropy
```

Or install from source:

```bash
git clone https://github.com/Center-For-Complex-Systems-Science/causationentropy.git
cd causationentropy
pip install -e .
```

Run the test suite with coverage:

```bash
python -m pytest causationentropy/tests/ --cov=causationentropy --cov-report=xml --cov-report=term-missing -v
```

See our Quick Start Colab notebook:
Get the relationships as a data frame:
```python
import pandas as pd

from causationentropy import discover_network
from causationentropy.graph import network_to_dataframe

# Load your time series data (variables as columns, time as rows)
data = pd.read_csv('data.csv')

# Discover causal network
network = discover_network(data, method='standard', max_lag=5)

df = network_to_dataframe(network)
df.head()
```

Plot the causal network:
```python
import pandas as pd

from causationentropy import discover_network
from causationentropy.core.plotting import plot_causal_network

# Load your time series data (variables as columns, time as rows)
data = pd.read_csv('data.csv')

# Discover causal network
network = discover_network(data, method='standard', max_lag=5)

fig, ax = plot_causal_network(network, save_path="network.png")
```

Note: This implementation runs in O(N² T log T) time, where N is the number of variables and T is the length of the time series, so it is computationally intensive without optimizations; doubling the number of variables roughly quadruples the runtime, and each additional lag increases it further. Optimizations leveraging singular value decomposition and KD-trees are planned for a later release, although they are not part of the original algorithm. Please be patient on large datasets.
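One pragmatic way to keep runtimes manageable on long series (a pandas-only sketch; whether downsampling is appropriate depends on how fast your system evolves relative to the sampling rate) is to shorten the series and the lag window before discovery:

```python
import pandas as pd

from causationentropy import discover_network

data = pd.read_csv('data.csv')

# Halve T by keeping every other sample (illustrative; only reasonable
# when the sampling rate is high relative to the system's dynamics)
downsampled = data.iloc[::2].reset_index(drop=True)

# A smaller max_lag also shrinks the set of candidate predictors
network = discover_network(downsampled, method='standard', max_lag=2)
```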
Discovery behavior can be tuned through several parameters:

```python
from causationentropy import discover_network

# Configure discovery parameters
network = discover_network(
    data,
    method='standard',       # 'standard', 'alternative', 'information_lasso', or 'lasso'
    information='gaussian',  # 'gaussian', 'knn', 'kde', 'geometric_knn', or 'poisson'
    max_lag=5,               # Maximum time lag to consider
    alpha_forward=0.05,      # Forward selection significance
    alpha_backward=0.05,     # Backward elimination significance
    n_shuffles=200           # Permutation test iterations
)
```

Generate synthetic data with a known ground-truth network to validate results (see the evaluation sketch after the feature list below):

```python
from causationentropy.datasets import synthetic
from causationentropy import discover_network

# Generate synthetic causal time series
data, true_network = synthetic.linear_stochastic_gaussian_process(
    n_variables=5,
    n_samples=1000,
    sparsity=0.3
)

# Discover network
discovered = discover_network(data)
```

Key features:

- Multiple Algorithms: Standard, alternative, information lasso, and lasso variants of oCSE
- Flexible Information Estimators: Gaussian, k-NN, KDE, geometric k-NN, and Poisson methods
- Statistical Rigor: Permutation-based significance testing with comprehensive test coverage
- Synthetic Data: Built-in generators for testing and validation
- Visualization: Network plotting and analysis tools
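As referenced in the synthetic-data example above, the ground-truth network makes validation straightforward. A minimal evaluation sketch, assuming the returned networks behave like networkx directed graphs (the plotting and DataFrame helpers suggest this, but it is an assumption):

```python
# Compare discovered edges against the known ground truth.
# Assumes `true_network` and `discovered` from the synthetic example
# are networkx DiGraphs (an assumption, not a documented guarantee).
true_edges = set(true_network.edges())
found_edges = set(discovered.edges())

true_positives = true_edges & found_edges
precision = len(true_positives) / len(found_edges) if found_edges else 0.0
recall = len(true_positives) / len(true_edges) if true_edges else 0.0
print(f"precision={precision:.2f} recall={recall:.2f}")
```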
The algorithm uses conditional mutual information to quantify causal relationships:

`I(X; Y | Z) = H(Y | Z) − H(Y | X, Z)`

This measures how much variable X tells us about variable Y beyond what we already know from the conditioning set Z.
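For intuition, here is a small sketch of estimating conditional mutual information under a Gaussian assumption, via the partial correlation of X and Y given Z (an illustration, not the library's internal 'gaussian' estimator):

```python
import numpy as np

def gaussian_cmi(x, y, z):
    """Estimate I(X; Y | Z) in nats, assuming jointly Gaussian variables.

    Uses the identity I(X; Y | Z) = -0.5 * log(1 - rho^2), where rho is
    the partial correlation of X and Y given Z.
    """
    # Regress Z out of both X and Y, then correlate the residuals
    Z = np.column_stack([z, np.ones(len(z))])  # include an intercept
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    rho = np.corrcoef(rx, ry)[0, 1]
    return -0.5 * np.log(1.0 - rho**2)

rng = np.random.default_rng(0)
z = rng.normal(size=2000)
x = z + rng.normal(size=2000)        # X depends on Z
y = 0.8 * x + rng.normal(size=2000)  # Y depends on X directly
print(gaussian_cmi(x, y, z))         # clearly positive: X informs Y beyond Z
```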
Causal Discovery Rule: Variable X causes Y if knowing X(t) significantly improves prediction of Y(t+1), even when controlling for all other relevant variables.
The algorithm implements a two-phase approach:
- Forward Selection: Iteratively adds predictors that maximize conditional mutual information
- Backward Elimination: Removes predictors that lose significance when conditioned on others
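A schematic of the two phases (illustrative pseudocode, not the library's implementation; `cmi` and `is_significant` stand in for the chosen estimator and the permutation test):

```python
def ocse_sketch(candidates, target, cmi, is_significant):
    """Illustrative two-phase selection, not the library's implementation.

    `cmi(x, target, cond)` estimates conditional mutual information;
    `is_significant(x, target, cond)` is a permutation-based test.
    """
    selected = []

    # Forward selection: greedily add the candidate with the largest
    # conditional mutual information given what is already selected
    remaining = list(candidates)
    while remaining:
        best = max(remaining, key=lambda x: cmi(x, target, selected))
        if not is_significant(best, target, selected):
            break  # no remaining candidate adds significant information
        selected.append(best)
        remaining.remove(best)

    # Backward elimination: drop predictors that lose significance
    # once conditioned on all the others
    for x in list(selected):
        others = [s for s in selected if s is not x]
        if not is_significant(x, target, others):
            selected.remove(x)

    return selected
```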
📚 Read the full documentation on ReadTheDocs
- API Reference: Complete function and class documentation
- User Guide: Detailed tutorials and examples
- Theory: Mathematical background and algorithms
- Examples: Check the `notebooks/` directory
- Research Papers: See the theory glossary in the documentation
Build documentation locally:
```bash
cd docs/
make html
# Open docs/_build/html/index.html
```

We welcome contributions! Please see CONTRIBUTING.md for guidelines.
If you use this library in your research, please cite:
```bibtex
@misc{slote2025causationentropy,
  author = {Slote, Kevin and Fish, Jeremie and Bollt, Erik},
  title  = {CausationEntropy: A Python Library for Causal Discovery},
  url    = {https://github.com/Center-For-Complex-Systems-Science/causationentropy},
  doi    = {10.5281/zenodo.17047565}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
This work builds upon fundamental research in information theory, causal inference, and time series analysis. Special thanks to the open-source scientific Python community.
Generative AI was used to help with docstrings, documentation, and unit tests.