
Transformer Time Series Interpretability Toolkit

This repository provides an end-to-end workflow for analysing Transformer-based time series classification (TSC) models through mechanistic interpretability methods. It contains ready-to-run notebooks, a modular training script and a collection of pre-trained models.

Author: Matiss Kalnare
Supervisor: Niki van Stein

Repository Structure

Notebooks/             - Interactive notebooks demonstrating the two analysis pipelines
  Patching.ipynb       - Activation patching/causal tracing walkthrough
  SAE.ipynb            - Sparse Autoencoder exploration
  IPYNB_to_PY/         - Python script versions of the notebooks

Utilities/             - Helper code
  TST_trainer.py       - Training/evaluation script and model definition
  utils.py             - Patching and plotting utilities

TST_models/            - Pre-trained models for several datasets
SAE_models/            - Example sparse autoencoder weights
Results/               - Example results (plots, patched predictions, ...)
requirements.txt       - Python package requirements

Installation

  1. Clone the repository and install the dependencies:
    git clone https://github.com/mathiisk/TSTpatching.git
    cd TSTpatching
    pip install -r requirements.txt

A GPU with CUDA is recommended, but the code also runs on CPU.

Quick Start

Pre-trained weights for common datasets are provided in TST_models. You can immediately run the notebooks to reproduce the experiments.

Open the activation patching notebook:

jupyter notebook Notebooks/Patching.ipynb

or the sparse autoencoder notebook:

jupyter notebook Notebooks/SAE.ipynb

Step through the cells to load a model, run the analysis, and display the plots. The notebooks assume the working directory is the repository root.
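For orientation, the snippet below is a minimal sketch of activation patching using plain PyTorch forward hooks. It is not the API of Utilities/utils.py; the model, the target layer handle, and the assumption that the clean and corrupt inputs share the same shape are all illustrative.

# Minimal activation-patching sketch (illustrative only; the notebook relies on
# the helpers in Utilities/utils.py, whose exact interface may differ).
import torch

def run_with_patch(model, clean_x, corrupt_x, layer_module):
    """Record layer_module's activation on the clean input, patch it into a
    run on the corrupt input, and return both sets of logits."""
    cache = {}

    def save_hook(module, inputs, output):
        cache["clean"] = output.detach()

    def patch_hook(module, inputs, output):
        return cache["clean"]  # returning a value overrides the layer's output

    # 1) Record the clean activation.
    handle = layer_module.register_forward_hook(save_hook)
    with torch.no_grad():
        clean_logits = model(clean_x)
    handle.remove()

    # 2) Re-run on the corrupt input with the clean activation patched in.
    handle = layer_module.register_forward_hook(patch_hook)
    with torch.no_grad():
        patched_logits = model(corrupt_x)
    handle.remove()

    return clean_logits, patched_logits

Comparing patched_logits with the model's ordinary prediction on the corrupt input shows how much of the clean behaviour that single layer's activation restores, which is the quantity the patching notebook visualises.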

Training a New Model

Utilities/TST_trainer.py can train a fresh Transformer on any dataset from timeseriesclassification.com.

python Utilities/TST_trainer.py --dataset DATASET_NAME --epochs NUM_EPOCHS --batch_size BATCH_SIZE
  • DATASET_NAME should match one of the names on the website, e.g. JapaneseVowels.
  • NUM_EPOCHS defaults to 100 if not provided.
  • BATCH_SIZE defaults to 32.

The resulting weights are stored as TST_<dataset>.pth under TST_models/.
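For example, to train on the JapaneseVowels dataset with the default settings spelled out explicitly:

python Utilities/TST_trainer.py --dataset JapaneseVowels --epochs 100 --batch_size 32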

Sparse Autoencoders

The notebook Notebooks/SAE.ipynb trains a sparse autoencoder (SAE) on intermediate activations of a trained Transformer in order to surface interpretable features that the model relies on. Pre-trained SAE weights are stored in SAE_models/ and can be loaded by the notebook.
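The sketch below shows the general recipe: reconstruct cached activation vectors through a narrow-to-wide encoder under an L1 sparsity penalty. It is illustrative only; the hidden width, penalty coefficient, and training loop in Notebooks/SAE.ipynb may differ.

# Minimal sparse-autoencoder sketch (illustrative; hyperparameters are assumptions).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # sparse feature activations
        return self.decoder(z), z

def train_sae(activations, d_hidden=512, l1_coef=1e-3, epochs=50, lr=1e-3):
    """Fit an SAE on a tensor of cached activations with shape (N, d_model)."""
    sae = SparseAutoencoder(activations.shape[1], d_hidden)
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    for _ in range(epochs):
        recon, z = sae(activations)
        loss = ((recon - activations) ** 2).mean() + l1_coef * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return sae

The L1 term pushes most feature activations to zero, so each remaining active feature tends to correspond to a more interpretable pattern in the activations.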

Output & Results

All figures and intermediate outputs generated by the notebooks are stored under Results/ by default. Separate folders exist for each dataset so you can keep experiments organised.

BSc Thesis Context

This code base accompanies a Bachelor thesis exploring whether interpretability techniques from NLP, namely activation patching and sparse autoencoders, can reveal causal mechanisms inside Transformer-based time series classifiers. The provided scripts and notebooks allow anyone to reproduce and extend the experiments.
