InterProt

This repo contains tools for interpreting protein language models using sparse autoencoders (SAEs). Our SAE visualizer is available at interprot.com and our SAE model weights are on HuggingFace. For more information, check out our preprint.

viz contains the frontend app for visualizing SAE features. interprot is a Python package for SAE training, evaluation, and interpretation.

Getting started with our SAEs

Check out this demo notebook for SAE inference with a custom input sequence.
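
For orientation, here is a minimal sketch of what SAE inference over per-residue embeddings looks like. It assumes a generic ReLU sparse autoencoder with random toy weights; the function names, shapes, and architecture are illustrative assumptions, not the API or architecture of the released models, so treat the demo notebook as the reference.

```python
import numpy as np


def sae_encode(x, W_enc, b_enc, b_dec):
    """Map embeddings (seq_len, d_model) to non-negative latents (seq_len, d_sae)."""
    return np.maximum((x - b_dec) @ W_enc + b_enc, 0.0)


def sae_decode(z, W_dec, b_dec):
    """Reconstruct embeddings (seq_len, d_model) from latents (seq_len, d_sae)."""
    return z @ W_dec + b_dec


rng = np.random.default_rng(0)
seq_len, d_model, d_sae = 12, 16, 64  # toy sizes, chosen only for illustration

# Stand-in for per-residue embeddings from a protein language model layer.
x = rng.normal(size=(seq_len, d_model))

# Random weights stand in for trained SAE parameters (hypothetical).
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
W_dec = W_enc.T.copy()
b_enc = np.zeros(d_sae)
b_dec = np.zeros(d_model)

latents = sae_encode(x, W_enc, b_enc, b_dec)
recon = sae_decode(latents, W_dec, b_dec)

# Index of the most strongly activated latent at each residue position.
top_latent_per_residue = latents.argmax(axis=1)
```

With real weights, each row of `latents` gives the SAE feature activations for one residue of the input sequence.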

The visualizer

The visualizer is a React app with some RunPod serverless functions that serve our SAEs.

Running the visualizer locally

cd viz
pnpm install
pnpm run dev

RunPod endpoints

The RunPod serverless functions live in their own repos:

SAE inference: https://github.com/liambai/sae-inference
SAE steering: https://github.com/liambai/sae-steering

Generating visualization files

The visualizer and several of our analysis scripts require visualization files, which summarize each SAE latent.

Generate the visualization files using interprot/make_viz_files/__main__.py
Compute family specificity using interprot/scripts/run_compute_family_specificity.py
Classify latents by activation pattern using interprot/scripts/run_viz_file_analysis.py. This script also computes additional statistics about the latents.

The input sequences to the visualization file generation script can be found here.
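
To give a sense of what "summarizing each SAE latent" can mean, the sketch below computes a few per-latent statistics from toy activations. The function name, the dictionary fields, and the input layout are hypothetical and do not reflect the actual visualization file format produced by the scripts above.

```python
import numpy as np


def summarize_latents(activations, top_k=3):
    """Summarize each latent given a dict of seq_id -> array (seq_len, d_sae).

    Hypothetical helper for illustration only.
    """
    seq_ids = list(activations)
    # Max activation of each latent within each sequence: (n_seqs, d_sae).
    per_seq_max = np.stack([activations[s].max(axis=0) for s in seq_ids])
    # All residue positions from all sequences stacked: (total_len, d_sae).
    all_positions = np.concatenate([activations[s] for s in seq_ids], axis=0)

    summaries = []
    for latent in range(per_seq_max.shape[1]):
        # Sequences ranked by how strongly they activate this latent.
        order = np.argsort(per_seq_max[:, latent])[::-1][:top_k]
        summaries.append(
            {
                "latent": latent,
                "max_activation": float(per_seq_max[:, latent].max()),
                "frac_active_positions": float((all_positions[:, latent] > 0).mean()),
                "top_sequences": [seq_ids[i] for i in order],
            }
        )
    return summaries


rng = np.random.default_rng(0)
# Toy sparse activations for five sequences of different lengths.
toy = {
    f"seq{i}": np.maximum(rng.normal(size=(10 + i, 8)) - 1.0, 0.0)
    for i in range(5)
}
summaries = summarize_latents(toy)
```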

Running and developing the Python package

Setting up pre-commit

pip install pre-commit
pre-commit install

Building and running the Docker container

docker compose build
docker compose run --rm interprot bash

Linear probes

We find linear probes over SAE latents to be a powerful tool for uncovering interpretable features. Here's a demo notebook on the subcellular localization classification task.
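
As a rough illustration of the idea, the sketch below fits a logistic-regression probe on synthetic "latent activations" and reads off the latent with the largest weight. The data, the `train_logistic_probe` helper, and the choice of plain gradient descent are assumptions made for a self-contained example; the demo notebook shows the actual workflow on real data.

```python
import numpy as np


def train_logistic_probe(X, y, lr=0.5, steps=500, l2=1e-3):
    """Fit a binary logistic-regression probe with full-batch gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        w -= lr * (X.T @ (p - y) / n + l2 * w)  # L2-regularized weight update
        b -= lr * np.mean(p - y)
    return w, b


rng = np.random.default_rng(0)
n_proteins, d_sae = 400, 32  # toy sizes

# Synthetic sparse, non-negative activations (one row per protein).
raw = np.maximum(rng.normal(size=(n_proteins, d_sae)), 0.0)

# By construction, the label depends only on one hypothetical latent.
informative_latent = 7
y = (raw[:, informative_latent] > 0.5).astype(float)

# Standardize features so probe weights are comparable across latents.
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)

w, b = train_logistic_probe(X, y)
accuracy = float(np.mean(((X @ w + b) > 0) == (y > 0.5)))

# The latent with the largest absolute weight is the first candidate to inspect.
top_latent = int(np.argmax(np.abs(w)))
```

Because the probe is linear, its weights point directly at individual latents, which can then be inspected in the visualizer.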
