Evo 2 is a state-of-the-art DNA language model for long-context modeling and design. Evo 2 models DNA sequences at single-nucleotide resolution with a context length of up to 1 million base pairs, using the StripedHyena 2 architecture. It was pretrained using Savanna and trained autoregressively on OpenGenome2, a dataset containing 8.8 trillion tokens from all domains of life.
We describe Evo 2 in the preprint: "Genome modeling and design across all domains of life with Evo 2".
This repo is for running Evo 2 locally for inference or generation, using our Vortex inference code. For training and finetuning, see the training and finetuning section. You can run Evo 2 without any installation using the Nvidia Hosted API. You can also self-host an instance using Nvidia NIM. See the Nvidia NIM section for more information.
Evo 2 is built on the Vortex inference repo. See the Vortex GitHub for more details and a Docker option.
Prerequisites
- Transformer Engine >= 2.0.0
- Flash Attention for optimized attention operations (strongly recommended)
System requirements
- [OS] Linux (official) or WSL2 (limited support)
- [GPU] Compute Capability 8.9+ (Ada/Hopper/Blackwell), because FP8 is required
- [Software]
  - CUDA: 12.1+ (12.8+ for Blackwell) with compatible NVIDIA drivers
  - cuDNN: 9.3+
  - Compiler: GCC 9+ or Clang 10+ with C++17 support
  - Python 3.12 required
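The compute capability requirement above can be checked before installing anything else. The following is a minimal sketch, not part of Evo 2: the helper names `supports_fp8` and `local_gpu_capabilities` are hypothetical, and the check only compares against the 8.9 threshold stated in the list. It assumes PyTorch is used to query the devices and returns an empty list if PyTorch or a GPU is missing.

```python
def supports_fp8(major: int, minor: int) -> bool:
    """Return True if this compute capability meets the 8.9+ requirement."""
    return (major, minor) >= (8, 9)


def local_gpu_capabilities():
    """Return (index, name, (major, minor)) for each visible CUDA device.

    Returns an empty list when PyTorch is not installed or no GPU is visible.
    """
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        (i, torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
        for i in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    for index, name, (major, minor) in local_gpu_capabilities():
        status = "OK" if supports_fp8(major, minor) else "below 8.9, FP8 unavailable"
        print(f"cuda:{index} {name}: compute capability {major}.{minor} ({status})")
```

A device reported as below 8.9 does not meet the stated FP8 requirement.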
See the Transformer Engine and Flash Attention GitHub repositories for more details and installation instructions. We recommend using conda to install Transformer Engine. Here is an example of how to install the prerequisites:
conda install -c nvidia cuda-nvcc cuda-cudart-dev
conda install -c conda-forge transformer-engine-torch=2.3.0
pip install flash-attn==2.8.0.post2 --no-build-isolation
To get started with Evo 2, install from pip or from GitHub after installing the prerequisites.
To install Evo 2:
pip install evo2
For the latest features or to contribute:
git clone https://github.com/arcinstitute/evo2
cd evo2
pip install -e .
To verify that the installation was correct:
python -m evo2.test.test_evo2_generation --model_name evo2_7b
Evo 2 can be run using Docker (shown below), Singularity, or Apptainer.
docker build -t evo2 .
docker run -it --rm --gpus '"device=0"' -v ./huggingface:/root/.cache/huggingface evo2 bash
Note: The volume mount (-v) preserves downloaded models between container runs and specifies where they are saved.
Once inside the container:
python -m evo2.test.test_evo2_generation --model_name evo2_7b
We provide the following model checkpoints, hosted on HuggingFace:
| Checkpoint Name | Description |
|---|---|
| evo2_40b | A model pretrained with 1 million context obtained through context extension of evo2_40b_base. |
| evo2_7b | A model pretrained with 1 million context obtained through context extension of evo2_7b_base. |
| evo2_40b_base | A model pretrained with 8192 context length. |
| evo2_7b_base | A model pretrained with 8192 context length. |
| evo2_1b_base | A smaller model pretrained with 8192 context length. |
To use Evo 2 40B, you will need multiple GPUs. Vortex automatically handles device placement, splitting the model across the available CUDA devices.
Note that the 7B checkpoints can be run without FP8, which avoids the compute capability requirement. To do so, modify the configs to turn off FP8. This mode is not officially supported because it introduces numerical differences.
Below are simple examples of how to download Evo 2 and use it locally in Python.
Evo 2 can be used to score the likelihoods across a DNA sequence.
import torch
from evo2 import Evo2
evo2_model = Evo2('evo2_7b')
sequence = 'ACGT'
input_ids = torch.tensor(
    evo2_model.tokenizer.tokenize(sequence),
    dtype=torch.int,
).unsqueeze(0).to('cuda:0')
outputs, _ = evo2_model(input_ids)
logits = outputs[0]
print('Logits: ', logits)
print('Shape (batch, length, vocab): ', logits.shape)
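To turn logits like these into per-position likelihoods, one common approach is to apply a log-softmax over the vocabulary axis and read off the entry for the token that actually follows. The sketch below shows only that arithmetic on a toy input, in plain Python so it runs without a GPU. It assumes the usual autoregressive convention (logits at position i score the token at position i + 1) and that token ids index the vocabulary axis. The function names are hypothetical and not part of the Evo 2 API.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one position's vocabulary logits."""
    peak = max(row)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_norm for x in row]


def next_token_log_probs(logits, token_ids):
    """Log-probability of each observed token given the tokens before it.

    logits:    one list of vocabulary scores per position (length L).
    token_ids: the L input token ids.
    Returns L - 1 values; entry i is log P(token_ids[i + 1] | token_ids[: i + 1]).
    The first token has no preceding context, so it gets no score.
    """
    return [
        log_softmax(logits[i])[token_ids[i + 1]]
        for i in range(len(token_ids) - 1)
    ]


# Toy input: 4 positions, vocabulary of 4, uniform logits (assumed values).
toy_logits = [[0.0, 0.0, 0.0, 0.0] for _ in range(4)]
toy_ids = [0, 1, 2, 3]
scores = next_token_log_probs(toy_logits, toy_ids)
print(scores)                      # three values, each equal to log(1/4)
print(sum(scores) / len(scores))   # mean log-likelihood per scored token
```

With the tensors from the example above, the same function could be fed `logits[0].float().cpu().tolist()` and `input_ids[0].tolist()`. For long sequences, a vectorized `torch.log_softmax` followed by `gather` does the same thing far faster.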
Evo 2 embeddings can be saved for use downstream. We find that intermediate embeddings work better than final embeddings; see our paper for details.
import torch
from evo2 import Evo2
evo2_model = Evo2('evo2_7b')
sequence = 'ACGT'
input_ids = torch.tensor(
    evo2_model.tokenizer.tokenize(sequence),
    dtype=torch.int,
).unsqueeze(0).to('cuda:0')
layer_name = 'blocks.28.mlp.l3'
outputs, embeddings = evo2_model(input_ids, return_embeddings=True, layer_names=[layer_name])
print('Embeddings shape: ', embeddings[layer_name].shape)
Evo 2 can generate DNA sequences based on prompts.
from evo2 import Evo2
evo2_model = Evo2('evo2_7b')
output = evo2_model.generate(prompt_seqs=["ACGT"], n_tokens=400, temperature=1.0, top_k=4)
print(output.sequences[0])
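The `temperature` and `top_k` arguments control how each next token is drawn. As a rough illustration of what these two knobs generally mean, here is a generic sketch of temperature plus top-k sampling over a toy logit vector. It is not Vortex's implementation. The function name `sample_top_k` and the toy scores are hypothetical.

```python
import math
import random


def sample_top_k(logits, temperature=1.0, top_k=4, rng=random):
    """Sample one token index from logits using temperature and top-k filtering."""
    # Keep only the top_k highest-scoring indices.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature rescales logits: below 1 sharpens, above 1 flattens the distribution.
    scaled = [logits[i] / temperature for i in ranked]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # Draw one index in proportion to the softmax weights.
    return rng.choices(ranked, weights=weights, k=1)[0]


toy_logits = [2.0, 0.5, -1.0, 0.1, 3.0]    # hypothetical scores for 5 tokens
print(sample_top_k(toy_logits, top_k=1))   # top_k=1 is greedy: always index 4
print(sample_top_k(toy_logits, top_k=2))   # either index 4 or index 0
```

Under this reading, `top_k=4` restricts each step to the four highest-scoring tokens, and `temperature=1.0` leaves the model's distribution over them unchanged.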
We provide example notebooks.
The BRCA1 notebook shows zero-shot BRCA1 variant effect prediction. This example includes a walkthrough of:
- Performing zero-shot BRCA1 variant effect predictions using Evo 2
- Reference vs alternative allele normalization
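The idea behind comparing reference and alternative alleles can be sketched as follows: build the variant window by substituting one base, score both windows with the model, and take the difference. This is a minimal sketch under assumptions, not the notebook's code. The helper names, the toy window, and the log-likelihood values are all hypothetical, and the notebook remains the reference for the actual procedure.

```python
def apply_snv(reference: str, position: int, ref_base: str, alt_base: str) -> str:
    """Return the reference window with one base substituted (0-based position)."""
    if reference[position] != ref_base:
        raise ValueError(
            f"expected {ref_base!r} at position {position}, found {reference[position]!r}"
        )
    return reference[:position] + alt_base + reference[position + 1:]


def delta_score(ref_log_likelihood: float, alt_log_likelihood: float) -> float:
    """Variant score; negative means the model finds the variant less likely."""
    return alt_log_likelihood - ref_log_likelihood


window = "ACGTACGT"                       # hypothetical reference window
variant = apply_snv(window, 3, "T", "A")
print(variant)                            # ACGAACGT
# Hypothetical mean log-likelihoods for the reference and variant windows:
print(delta_score(-1.20, -1.35) < 0)      # True: variant scored as less likely
```

Subtracting the reference score normalizes away how likely the surrounding window is, so variants at different positions can be compared on the same scale.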
The generation notebook shows DNA sequence completion with Evo 2. This example shows:
- DNA prompt based generation and 'DNA autocompletion'
- How to get and prompt using phylogenetic species tags for generation
Evo 2 is available on Nvidia NIM and hosted API.
The quickstart guides users through running Evo 2 on NVIDIA NIM using a Python or shell client after starting NIM. An example Python client script is shown below. You would interact with the Nvidia hosted API in the same way.
#!/usr/bin/env python3
import requests
import os
import json
from pathlib import Path
key = os.getenv("NVCF_RUN_KEY") or input("Paste the Run Key: ")
r = requests.post(
    url=os.getenv("URL", "https://health.api.nvidia.com/v1/biology/arc/evo2-40b/generate"),
    headers={"Authorization": f"Bearer {key}"},
    json={
        "sequence": "ACTGACTGACTGACTG",
        "num_tokens": 8,
        "top_k": 1,
        "enable_sampled_probs": True,
    },
)
if "application/json" in r.headers.get("Content-Type", ""):
    print(r, "Saving to output.json:\n", r.text[:200], "...")
    Path("output.json").write_text(r.text)
elif "application/zip" in r.headers.get("Content-Type", ""):
    print(r, "Saving large response to data.zip")
    Path("data.zip").write_bytes(r.content)
else:
    print(r, r.headers, r.content)
You can use Savanna or Nvidia BioNemo for embedding long sequences. Vortex can currently compute over very long sequences via teacher prompting; note, however, that a forward pass on long sequences may currently be slow.
The OpenGenome2 dataset used for pretraining Evo 2 is available on HuggingFace. Data is available either as raw FASTA files or as JSONL files that include preprocessing and data augmentation.
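For working with sequences in FASTA form, a small reader is often enough. The sketch below is a generic FASTA parser in plain Python. It makes no assumptions about OpenGenome2's file layout or compression, and the function name `read_fasta` and the toy records are hypothetical.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    # Emit the final record once the stream ends.
    if header is not None:
        yield header, "".join(chunks)


example = io.StringIO(">seq1 toy record\nACGT\nACGT\n>seq2\nGGCC\n")
records = list(read_fasta(example))
print(records)  # [('seq1 toy record', 'ACGTACGT'), ('seq2', 'GGCC')]
```

Each yielded sequence is a plain string, which is the same form as the `sequence` and `prompt_seqs` values used in the examples above.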
Evo 2 was trained using Savanna, an open source framework for training alternative architectures.
To train or finetune Evo 2, you can use Savanna or Nvidia BioNemo, which provides an Evo 2 finetuning tutorial.
If you find these models useful for your research, please cite the relevant paper:
@article {Brixi2025.02.18.638918,
author = {Brixi, Garyk and Durrant, Matthew G and Ku, Jerome and Poli, Michael and Brockman, Greg and Chang, Daniel and Gonzalez, Gabriel A and King, Samuel H and Li, David B and Merchant, Aditi T and Naghipourfar, Mohsen and Nguyen, Eric and Ricci-Tam, Chiara and Romero, David W and Sun, Gwanggyu and Taghibakshi, Ali and Vorontsov, Anton and Yang, Brandon and Deng, Myra and Gorton, Liv and Nguyen, Nam and Wang, Nicholas K and Adams, Etowah and Baccus, Stephen A and Dillmann, Steven and Ermon, Stefano and Guo, Daniel and Ilango, Rajesh and Janik, Ken and Lu, Amy X and Mehta, Reshma and Mofrad, Mohammad R.K. and Ng, Madelena Y and Pannu, Jaspreet and Re, Christopher and Schmok, Jonathan C and St. John, John and Sullivan, Jeremy and Zhu, Kevin and Zynda, Greg and Balsam, Daniel and Collison, Patrick and Costa, Anthony B. and Hernandez-Boussard, Tina and Ho, Eric and Liu, Ming-Yu and McGrath, Tom and Powell, Kimberly and Burke, Dave P. and Goodarzi, Hani and Hsu, Patrick D and Hie, Brian},
title = {Genome modeling and design across all domains of life with Evo 2},
elocation-id = {2025.02.18.638918},
year = {2025},
doi = {10.1101/2025.02.18.638918},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2025/02/21/2025.02.18.638918},
eprint = {https://www.biorxiv.org/content/early/2025/02/21/2025.02.18.638918.full.pdf},
journal = {bioRxiv}
}