Stock market forecasting using Frozen Pretrained Transformers (FPT). This project explores leveraging pretrained Large Language Models (LLMs) as feature extractors for time series prediction on Indian stock market indices.
This project implements a novel approach to stock price forecasting by:
- Freezing most parameters of pretrained LLM backbones
- Fine-tuning only layer normalization and embedding layers
- Using patch-based input embeddings to convert time series data into a format suitable for transformer architectures
The approach is inspired by the observation that pretrained transformers learn general-purpose representations that can transfer to domains beyond natural language.
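The selective-freezing idea can be sketched as follows. This is a minimal illustration on a plain `nn.TransformerEncoder` stand-in, not the project's actual code; the substring heuristic (`norm`/`embed` in parameter names) is an assumption about how the scripts pick trainable parameters:

```python
import torch.nn as nn

def freeze_except_norm_and_embed(model: nn.Module) -> nn.Module:
    """Freeze all parameters except those whose names suggest
    layer-norm or embedding weights (a name-matching heuristic)."""
    for name, param in model.named_parameters():
        param.requires_grad = ("norm" in name.lower()) or ("embed" in name.lower())
    return model

# Stand-in for a pretrained backbone (the real scripts use a Hugging Face model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
freeze_except_norm_and_embed(encoder)

trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
total = sum(p.numel() for p in encoder.parameters())
print(f"trainable: {trainable:,} / {total:,}")
```

With this scheme only a small fraction of the backbone's parameters receive gradients, which is what makes fine-tuning cheap.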
```
┌─────────────────────────────────────────────────────────┐
│                    Input Time Series                    │
│                (seq_len = 60 time steps)                │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                 Instance Normalization                  │
│          (zero mean, unit variance per sample)          │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                     Patch Embedding                     │
│  (split into 6 patches of size 10, project to d_model)  │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                 + Positional Embedding                  │
│                (learnable, 6 positions)                 │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                   Frozen LLM Backbone                   │
│    ┌───────────────────────────────────────────────┐    │
│    │      Transformer Layers (weights frozen)      │    │
│    │          Layer Norms (fine-tuned) ✓           │    │
│    │           Embeddings (fine-tuned) ✓           │    │
│    └───────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                    Output Projection                    │
│             (flatten → linear → prediction)             │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│                  Reverse Normalization                  │
│                 (restore original scale)                │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
                     Forecast Output
```
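The front and back of this pipeline (instance normalization, patching, projection, and de-normalization) can be sketched in NumPy; the choice of `d_model = 64` and the weight initialization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, patch_size, d_model = 60, 10, 64

x = rng.normal(loc=100.0, scale=5.0, size=seq_len)  # one 60-day price window

# Instance normalization: zero mean, unit variance per sample
mean, std = x.mean(), x.std()
x_norm = (x - mean) / (std + 1e-8)

# Patching: 60 steps -> 6 non-overlapping patches of 10
patches = x_norm.reshape(seq_len // patch_size, patch_size)  # shape (6, 10)

# Linear projection of each patch to the backbone's hidden size
W = rng.normal(scale=0.02, size=(patch_size, d_model))
tokens = patches @ W                                         # shape (6, 64)

# After the backbone and output head produce y_norm_hat, the forecast is
# de-normalized back to the original price scale:
#   y_hat = y_norm_hat * std + mean
```

The resulting `(6, d_model)` token sequence is what the frozen backbone consumes in place of word embeddings.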
The following pretrained backbones are supported:

| Model | Parameters | Architecture | Reference |
|---|---|---|---|
| GPT-2 | 124M | Decoder-only, causal attention | Radford et al. (2019) |
| BERT | 110M | Encoder-only, bidirectional | Devlin et al. (2019) |
| XLNet | 110M | Permutation LM (AR + AE) | Yang et al. (2019) |
| ALBERT | 12M | Parameter sharing, factorized embeddings | Lan et al. (2020) |
| DistilBERT | 66M | Distilled BERT, fewer layers | Sanh et al. (2019) |
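Loading these backbones through the `transformers` library might look like the sketch below; the checkpoint identifiers are the standard Hugging Face names and are an assumption about what the scripts actually use:

```python
# Assumed Hugging Face checkpoint names for the five backbones
BACKBONES = {
    "GPT-2": "gpt2",
    "BERT": "bert-base-uncased",
    "XLNet": "xlnet-base-cased",
    "ALBERT": "albert-base-v2",
    "DistilBERT": "distilbert-base-uncased",
}

# Any of them can then be loaded uniformly (weights download on first use):
# from transformers import AutoModel
# backbone = AutoModel.from_pretrained(BACKBONES["GPT-2"])
```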
The project supports 5 NSE (National Stock Exchange of India) indices:
| Index | Data Available From | Description |
|---|---|---|
| NIFTY 50 | 1999 | Top 50 companies by market cap |
| NIFTY NEXT 50 | 1999 | Companies ranked 51-100 |
| NIFTY BANK | 2005 | Banking sector index |
| NIFTY FINANCIAL SERVICES | 2012 | Financial services sector |
| NIFTY MIDCAP SELECT | 2022 | Mid-cap companies |
Requirements:

- Python >= 3.14
- CUDA-capable GPU (optional, but recommended)
This project uses PEP 723 inline script metadata. The easiest way to run the scripts is with uv:
```bash
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Run any script directly (dependencies are auto-installed)
uv run model.py
uv run backbone_variation.py
uv run ticker_variation.py
```

If you prefer a traditional pip installation:
```bash
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate   # Linux/macOS
# .venv\Scripts\activate    # Windows

# Install dependencies
pip install numpy pandas scikit-learn torch transformers matplotlib requests sentencepiece
```

First, download historical stock data from NSE:
```bash
uv run fetch_data.py
```

This creates CSV files for all 5 indices in the current directory.
Then train the main model:

```bash
uv run model.py
```

This trains a GPT-2-based forecaster on NIFTY 50 data and outputs:
- Training/validation loss per epoch
- Test metrics (RMSE, MAE, MSE)
- Forecast visualization (`forecast_results.png`)
Compare all 5 LLM architectures on NIFTY 50:

```bash
uv run backbone_variation.py
```

Outputs:
- Per-model training logs
- Comparative metrics table
- Multi-panel visualization (`backbone_variation_results.png`)
Test GPT-2 across all 5 stock indices:

```bash
uv run ticker_variation.py
```

Outputs:
- Per-ticker training logs
- Comparative metrics table
- Multi-panel visualization (`ticker_variation_results.png`)
Key hyperparameters (defined in each script's `Config` class):

| Parameter | Default | Description |
|---|---|---|
| `SEQ_LEN` | 60 | Input sequence length (days) |
| `PRED_LEN` | 1 | Prediction horizon (days) |
| `PATCH_SIZE` | 10 | Size of each input patch |
| `BATCH_SIZE` | 32 | Training batch size |
| `EPOCHS` | 20 | Number of training epochs |
| `LEARNING_RATE` | 1e-4 | Adam optimizer learning rate |
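Based on these defaults, the `Config` class might look like the following sketch; the dataclass layout is an assumption, and only the names and values come from the table above:

```python
from dataclasses import dataclass

@dataclass
class Config:
    SEQ_LEN: int = 60            # input window length in trading days
    PRED_LEN: int = 1            # forecast horizon in days
    PATCH_SIZE: int = 10         # time steps per patch (60 / 10 = 6 patches)
    BATCH_SIZE: int = 32         # training batch size
    EPOCHS: int = 20             # number of training epochs
    LEARNING_RATE: float = 1e-4  # Adam learning rate
```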
Project structure:

```
llm-stock-forecaster/
├── LICENSE                  # MIT License
├── README.md                # This file
├── .gitignore               # Git ignore patterns
├── model.py                 # Main FPT model (GPT-2 on NIFTY 50)
├── backbone_variation.py    # Experiment: compare 5 LLM architectures
├── ticker_variation.py      # Experiment: GPT-2 on 5 stock indices
└── fetch_data.py            # Data fetcher for NSE indices
```
Results will vary based on the data fetched (market data changes over time). Example metrics format:
```
=========================================
BACKBONE VARIATION RESULTS (NIFTY 50)
=========================================
Model           Total Params   Trainable    RMSE      MAE
----------------------------------------------------------
GPT-2 (124M)    124,xxx,xxx    x,xxx,xxx    xx.xxxx   xx.xxxx
BERT (110M)     110,xxx,xxx    x,xxx,xxx    xx.xxxx   xx.xxxx
...
```
- Radford, A., et al. (2019). "Language Models are Unsupervised Multitask Learners"
- Devlin, J., et al. (2019). "BERT: Pre-training of Deep Bidirectional Transformers"
- Yang, Z., et al. (2019). "XLNet: Generalized Autoregressive Pretraining"
- Lan, Z., et al. (2020). "ALBERT: A Lite BERT for Self-supervised Learning"
- Sanh, V., et al. (2019). "DistilBERT, a distilled version of BERT"
- Zhou, T., et al. (2023). "One Fits All: Power General Time Series Analysis by Pretrained LM"
This project is licensed under the MIT License - see the LICENSE file for details.