Align data across multiple modalities using context-aware, sample-level embeddings.
mmcontext is built upon the excellent sentence-transformers framework maintained by Hugging Face. By leveraging its comprehensive documentation and extensive text-embedding capabilities, mmcontext lets you generate multi-modal embeddings efficiently without reinventing the wheel.
mmcontext is described in detail in our paper (citation to be added upon publication). If you use mmcontext in your research, please cite our work.
mmcontext provides tools for multi-modal embedding generation, with different workflows depending on your needs:
- Using Pre-trained Models for Inference - Load and use pre-trained mmcontext models
- Training New Models - Train custom mmcontext models on your datasets
- Reproducing Paper Results - Reproduce experiments from the paper
You can use pre-trained models (currently limited to RNA-seq) to embed your own data into the joint latent space of an mmcontext model. You can then query the embedded dataset with natural language, e.g. for annotation.
For inference, you only need to install the package:
pip install git+https://github.com/mengerj/mmcontext.git@main

Pre-trained mmcontext models are available on Hugging Face under the jo-mengr organization. Browse available models at: https://huggingface.co/jo-mengr
See the Using a Pre-trained Model for Inference tutorial for a complete guide on:
- Loading pre-trained models from Hugging Face
- Preparing your AnnData object so it is suitable for the model
- Generating embeddings for text and omics data
- Using models for downstream tasks
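For orientation, the snippet below sketches how natural-language queries can be embedded, assuming the published models load through the standard sentence-transformers interface. The model ID is a placeholder (browse https://huggingface.co/jo-mengr for actual names), and preparing omics (AnnData) inputs is covered in the tutorial.

```python
# Minimal inference sketch. The model ID below is a placeholder; pick an
# actual model from https://huggingface.co/jo-mengr. Preparing omics (AnnData)
# inputs is described in the inference tutorial.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jo-mengr/<model-name>")  # placeholder ID

# Encode natural-language queries into the joint latent space.
queries = ["CD8-positive cytotoxic T cell", "hepatocyte from liver tissue"]
query_embeddings = model.encode(queries)
print(query_embeddings.shape)  # (n_queries, embedding_dim)
```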
Train your own mmcontext models on custom datasets.
See the Training a New Model tutorial for:
- Step-by-step training guide
- Configuration and hyperparameters
- Best practices for multi-modal training
If you want to use the full pipeline that was used to process the training datasets, see adata-hf-datasets. That package handles the conversion of AnnData objects to Hugging Face datasets with the formatting required for multimodal training.
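For orientation, the snippet below sketches what contrastive training looks like in the sentence-transformers framework that mmcontext builds on. It is not the mmcontext training pipeline itself (use scripts/train.py and the provided configs for that), and the base model and dataset columns shown are hypothetical.

```python
# Generic sentence-transformers contrastive-training sketch, NOT the exact
# mmcontext pipeline. The base model and column names are placeholders.
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("<base-model>")  # placeholder

# Paired examples (text description, sample representation); in mmcontext such
# pairs are built from AnnData via the adata-hf-datasets package.
train_dataset = Dataset.from_dict(
    {
        "anchor": ["description of sample 1", "description of sample 2"],
        "positive": ["representation of sample 1", "representation of sample 2"],
    }
)

loss = MultipleNegativesRankingLoss(model)
trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```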
To reproduce results from the paper, clone the repository and install in editable mode:
git clone https://github.com/mengerj/mmcontext.git
cd mmcontext
# Create a virtual environment (however you like)
# e.g. with venv
python -m venv mmcontext
source mmcontext/bin/activate
# And install the package with pip
pip install -e .

To train a model as done for the paper, use the scripts/train.py script.
Try a small training run with:

python scripts/train.py --config-name example_conf

Inspect the config at example_conf.yaml. The configs used to train the models presented in the paper are basebert_numeric.yaml for all models using numeric initial representations, and basebert_text.yaml for the model using cell_sentences (text only). All datasets used in training are hosted publicly on the Hugging Face Hub (with references to Zenodo), so the training scripts can be launched without manually downloading any data.
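Because the training script is driven by Hydra-style configs (the --config-name flag above), individual values can also be overridden on the command line as key=value pairs. The key below is a placeholder; the actual keys are defined by the config files shipped with the repository.

```bash
# Hypothetical Hydra override; replace some.config.key with a real key from the configs.
python scripts/train.py --config-name example_conf some.config.key=new_value
```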
For HPC systems with CUDA support, the recommended approach is to use the scripts/run_training.slurm SLURM script to launch training jobs. Training also works on CPU or MPS devices if CUDA is not available.
The SLURM script allows you to override configuration values from the command line, which is useful when launching several jobs with different configurations.
sbatch scripts/run_training.slurm

Before training, it is recommended to authenticate with Hugging Face (after activating the virtual environment used for the mmcontext installation):
source .venv/bin/activate # or your venv activation command
hf auth login

If you have a Weights & Biases (wandb) account, you can also log in so your training is tracked. From the command line, run:
wandb login

This will prompt you to enter your wandb API key.
Once training is complete, the finished models will be automatically uploaded to the Hugging Face Hub with metadata and model cards.
Figure 1D of the paper investigates the latent space of one model in detail. This can be recreated with the evaluate_model.ipynb notebook.
For Figure 1E, we evaluate several models on multiple datasets using the scripts/embed_eval.py script, which runs the inference and evaluation pipelines in sequence.
The combined pipeline is configured using embed_eval_conf.yaml, which inherits from dataset and model configuration files that list the datasets and models to be evaluated. The configuration file contains additional parameters that are explained in the comments within the file itself.
The models and datasets evaluated in the paper are referenced in model_list_cxg_geo_all.yaml and dataset_list.yaml. These configs are imported in embed_eval_conf.yaml. To jointly embed data and evaluate with CellWhisperer, set run_cellwhisperer: true. It is highly recommended to use CUDA for CellWhisperer. The mmcontext models also run in reasonable time on MPS or CPU.
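As a rough orientation, the structure of embed_eval_conf.yaml follows the pattern sketched below; this outline is hypothetical, and the file in the repository is the source of truth for the actual keys.

```yaml
# Hypothetical outline of embed_eval_conf.yaml; consult the actual file for the full set of parameters.
defaults:
  - dataset_list            # datasets to evaluate
  - model_list_cxg_geo_all  # models to evaluate

run_cellwhisperer: true     # also run the CellWhisperer evaluation (CUDA strongly recommended)
```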
Run it locally with:

python scripts/embed_eval.py

For HPC systems, you can run the combined pipeline as array jobs using scripts/run_combined_cpu.slurm:
sbatch scripts/run_embed_eval_cpu.slurm

This allows you to process multiple model configurations in parallel across different array job tasks, by spreading the models across several config files and passing them as a list to the array job.
Figure 1E is created by collecting the metrics from this large evaluation run. The config collect_metrics_conf.yaml must point to the directory where the results for all datasets and models were stored. Then, to collect and plot the metrics, run:
python scripts/collect_metrics.py
python scripts/plot_metrics.py

The main model implementation is the MMContextEncoder, located in src/mmcontext/mmcontextencoder.py. This dual-tower encoder can process both text and omics data, enabling multi-modal embedding generation.
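Conceptually, the dual-tower design pairs a text tower with an omics tower, both projecting into a shared latent space. The sketch below illustrates that idea only; it is not the MMContextEncoder API.

```python
# Conceptual dual-tower sketch (illustration only, NOT the MMContextEncoder API):
# one tower handles text features, the other numeric omics representations,
# and both are projected into a shared latent space for comparison.
import torch
import torch.nn as nn

class DualTowerSketch(nn.Module):
    def __init__(self, text_dim: int, omics_dim: int, latent_dim: int):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, latent_dim)    # text tower head
        self.omics_proj = nn.Linear(omics_dim, latent_dim)  # omics tower head

    def forward(self, text_features: torch.Tensor, omics_features: torch.Tensor):
        text_latent = self.text_proj(text_features)
        omics_latent = self.omics_proj(omics_features)
        # Both modalities live in the same latent space, so cosine similarity
        # between them is meaningful (e.g. for natural-language querying).
        return text_latent, omics_latent
```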
This package is under active development. Contributions and suggestions are very welcome! Please open an issue to:
- Propose enhancements
- Report bugs
- Discuss potential improvements
- Ask questions or seek help
We encourage community contributions and are happy to help you get started.
If you find mmcontext useful for your research, please consider citing our paper (citation to be added upon publication):
@misc{mmcontext,
author = {Jonatan Menger},
title = {mmcontext: Multi-modal Contextual Embeddings},
year = {2025},
publisher = {GitHub},
journal = {GitHub Repository},
url = {https://github.com/mengerj/mmcontext}
}

Encountered a bug or need help? Please use the issue tracker. Any feedback is appreciated.
