Adhith14/machine-learning-model-analysis

Market Regime Detection for Financial Time Series

This project explores how market regimes (different market conditions) can be detected and used to improve financial time series prediction models. The goal is to combine traditional machine learning models with regime detection techniques to better understand market behaviour.


Project Overview

Financial markets often behave differently under various conditions such as high volatility, low volatility, or transitional periods. These are commonly referred to as market regimes.

In this project:

  • A baseline Random Forest model predicts future returns using lag and rolling statistical features.
  • Clustering methods are used to detect market regimes based on log returns and volatility.
  • The detected regime is added as an additional feature to test whether it improves prediction performance.

Methods Used

Feature Engineering

The model uses:

  • Lagged log returns
  • Rolling mean of returns
  • Rolling standard deviation (volatility)

These features help capture short-term market dynamics.
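
As an illustration of the three feature types listed above, here is a minimal sketch using pandas. The function name `build_features`, the lag count, the window length, and the synthetic price series are all assumptions made for this example and are not taken from the project notebook. The rolling statistics are computed on returns shifted by one day so that each row only uses information available before the target return.

```python
import numpy as np
import pandas as pd


def build_features(close: pd.Series, n_lags: int = 3, window: int = 5) -> pd.DataFrame:
    """Build lagged and rolling features from a closing-price series (hypothetical helper)."""
    log_ret = np.log(close / close.shift(1))
    feats = pd.DataFrame(index=close.index)

    # Lagged log returns: lag_k at time t is the return observed at t - k.
    for k in range(1, n_lags + 1):
        feats[f"lag_{k}"] = log_ret.shift(k)

    # Rolling statistics over past returns only (shifted by one step).
    past = log_ret.shift(1)
    feats["roll_mean"] = past.rolling(window).mean()
    feats["roll_std"] = past.rolling(window).std()

    # Target: the return at time t, which the features above precede.
    feats["target"] = log_ret
    return feats.dropna()


# Synthetic prices for demonstration only; not the ETF data used in the project.
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 250))))
features = build_features(prices)
print(features.head())
```

Shifting before rolling is a design choice that guards against look-ahead leakage; the notebook may handle alignment differently.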

Regime Detection

Two clustering approaches are explored:

  • K-Means Clustering
  • Gaussian Mixture Models (GMM)

The regimes represent different market states such as low, medium, and high volatility.
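
The two clustering approaches can be sketched as follows with scikit-learn. This is an illustrative example under stated assumptions: the data is synthetic (three blocks of returns with increasing volatility), the 20-day volatility window and the choice of three regimes are picked for the example, and `order_by_volatility` is a hypothetical helper that relabels clusters so that 0 means lowest and 2 means highest average volatility. Cluster ids from either method are otherwise arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic returns with three volatility levels (demonstration only).
rng = np.random.default_rng(42)
vols = np.repeat([0.005, 0.015, 0.04], 200)
log_ret = pd.Series(rng.normal(0.0, vols))

# Clustering inputs: log return and rolling volatility.
frame = pd.DataFrame({"log_ret": log_ret, "vol": log_ret.rolling(20).std()}).dropna()
X = StandardScaler().fit_transform(frame)
vol = frame["vol"].to_numpy()


def order_by_volatility(labels: np.ndarray, vol: np.ndarray, n_regimes: int) -> np.ndarray:
    """Relabel clusters so that regime ids increase with mean volatility."""
    means = np.array([vol[labels == k].mean() for k in range(n_regimes)])
    rank = np.argsort(np.argsort(means))  # rank[k] = position of cluster k by volatility
    return rank[labels]


# K-Means: hard assignment of each day to one regime.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
kmeans_regime = order_by_volatility(kmeans.labels_, vol, 3)

# GMM: hard labels via predict, plus soft membership probabilities.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0).fit(X)
gmm_regime = order_by_volatility(gmm.predict(X), vol, 3)
gmm_proba = gmm.predict_proba(X)  # columns follow the original component order

print(pd.Series(kmeans_regime).value_counts().sort_index())
print(pd.Series(gmm_regime).value_counts().sort_index())
```

One practical difference is that the GMM also yields per-day membership probabilities, which can indicate transitional periods where no single regime dominates.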

Dimensionality Reduction (Exploratory Analysis)

Principal Component Analysis (PCA) is used in the notebook to visualise regime clusters in a reduced feature space.
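
A PCA projection of this kind might look like the sketch below. It is not the notebook's code: the feature set, window length, colour map, and synthetic data are assumptions for illustration. The features are standardised, clustered with K-Means, projected onto the first two principal components, and plotted with one colour per regime.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic returns with three volatility levels (demonstration only).
rng = np.random.default_rng(7)
vols = np.repeat([0.005, 0.015, 0.04], 200)
log_ret = pd.Series(rng.normal(0.0, vols))
past = log_ret.shift(1)

feats = pd.DataFrame({
    "lag_1": log_ret.shift(1),
    "lag_2": log_ret.shift(2),
    "lag_3": log_ret.shift(3),
    "roll_mean": past.rolling(10).mean(),
    "roll_std": past.rolling(10).std(),
}).dropna()

# Standardise so that no single feature dominates the components.
X = StandardScaler().fit_transform(feats)
regimes = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Project the five features onto the first two principal components.
pca = PCA(n_components=2)
coords = pca.fit_transform(X)

fig, ax = plt.subplots(figsize=(6, 4))
scatter = ax.scatter(coords[:, 0], coords[:, 1], c=regimes, cmap="viridis", s=10)
ax.set_xlabel(f"PC1 ({pca.explained_variance_ratio_[0]:.0%} of variance)")
ax.set_ylabel(f"PC2 ({pca.explained_variance_ratio_[1]:.0%} of variance)")
ax.set_title("Regime clusters in PCA space (synthetic data)")
ax.legend(*scatter.legend_elements(), title="Regime")
```

Call `plt.show()` or `fig.savefig(...)` to view the figure. The explained-variance ratios on the axis labels show how much of the feature variation the two-dimensional view retains.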


Model

The prediction model is implemented using:

  • Random Forest Regressor
  • Regime labels added as an additional feature

The model predicts future log returns for ETF tickers.
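
A baseline and a regime-augmented Random Forest can be compared along the following lines. This is a sketch under several assumptions that are not confirmed by the source: the data is synthetic, the split is a chronological 80/20 split, the hyperparameters are arbitrary, and the regime model is fitted on the training window only so that test-period information does not leak into the regime labels.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic log returns (demonstration only; not the ETF data).
rng = np.random.default_rng(1)
log_ret = pd.Series(rng.normal(0, 0.01, 600))
past = log_ret.shift(1)

data = pd.DataFrame({
    "lag_1": log_ret.shift(1),
    "lag_2": log_ret.shift(2),
    "roll_mean": past.rolling(5).mean(),
    "roll_std": past.rolling(5).std(),
    "target": log_ret,
}).dropna()

# Chronological split: no shuffling for time series.
split = int(len(data) * 0.8)
train, test = data.iloc[:split].copy(), data.iloc[split:].copy()

# Fit the regime detector on training data only, then label both sets.
regime_cols = ["lag_1", "roll_std"]
regime_model = make_pipeline(
    StandardScaler(), KMeans(n_clusters=3, n_init=10, random_state=0)
)
regime_model.fit(train[regime_cols])
train["regime"] = regime_model.predict(train[regime_cols])
test["regime"] = regime_model.predict(test[regime_cols])

base_cols = ["lag_1", "lag_2", "roll_mean", "roll_std"]
aug_cols = base_cols + ["regime"]

baseline = RandomForestRegressor(n_estimators=100, random_state=0)
baseline.fit(train[base_cols], train["target"])
pred_base = baseline.predict(test[base_cols])

with_regime = RandomForestRegressor(n_estimators=100, random_state=0)
with_regime.fit(train[aug_cols], train["target"])
pred_regime = with_regime.predict(test[aug_cols])
```

Because the regime label is derived from features the forest already sees, any gain from adding it is expected to be modest; the label mainly offers the trees a ready-made discretisation of those inputs.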


Dataset

The dataset contains historical ETF price data for several sectors including:

  • XLK – Technology
  • XLF – Financials
  • XLE – Energy
  • XLP – Consumer Staples
  • XLV – Healthcare
  • XLI – Industrials

Evaluation Metrics

Model performance is evaluated using:

  • MAE (Mean Absolute Error)
  • RMSE (Root Mean Squared Error)
  • Directional Accuracy

These metrics help measure both prediction error and the ability to correctly predict price movement direction.
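
For reference, the three metrics can be written in a few lines of NumPy. The definition of directional accuracy used here (the share of observations where the predicted and actual returns have the same sign) is an assumption; the notebook may define it differently, for example in how it treats zero returns.

```python
import numpy as np


def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def directional_accuracy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Fraction of predictions whose sign matches the actual return."""
    return float(np.mean(np.sign(y_true) == np.sign(y_pred)))


# Toy values chosen by hand for illustration.
y_true = np.array([0.01, -0.02, 0.005, -0.01])
y_pred = np.array([0.02, -0.01, -0.005, -0.02])

print(mae(y_true, y_pred))                   # approximately 0.01
print(rmse(y_true, y_pred))                  # approximately 0.01
print(directional_accuracy(y_true, y_pred))  # 0.75 (3 of 4 signs match)
```

MAE and RMSE measure the size of the error, while directional accuracy ignores magnitude entirely, so a model can score well on one and poorly on the other.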


Results

The regime-based model performs similarly to the baseline model on the evaluation metrics. This suggests that the regime signals are useful for describing market behaviour but may not significantly improve short-horizon return predictions.

However, the clustering analysis reveals clear market states that can be visualised using PCA.


Tech Stack

  • Python
  • NumPy
  • Pandas
  • Scikit-learn
  • Matplotlib
