OMol25 Explorer

An interactive 3D explorer for the OMol25 dataset. Supports querying by composition (e.g. C6H5*), charge, spin, and domain (e.g. electrolytes and biomolecules)!
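
As a rough illustration of what a wildcard composition query such as C6H5* could mean, the sketch below filters a list of formula strings with shell-style wildcards. This is an assumption made for illustration only: `matches_composition` and the sample formulas are hypothetical, and the explorer's actual matching rules may differ.

```python
from fnmatch import fnmatchcase


def matches_composition(formula: str, pattern: str) -> bool:
    """Case-sensitive shell-style match, e.g. 'C6H5*' matches 'C6H5Cl'.

    Hypothetical helper for illustration; not the explorer's own query logic.
    """
    return fnmatchcase(formula, pattern)


# Made-up sample formulas, only to show the filtering step.
formulas = ["C6H5Cl", "C6H5NO2", "C6H6", "H2O"]
hits = [f for f in formulas if matches_composition(f, "C6H5*")]
print(hits)
```

Under this reading, the pattern keeps every formula that starts with C6H5 and drops the rest.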

Setup

Due to the licensing rules of the OMol25 dataset, we do not host the dataset here; it must be downloaded separately from Hugging Face.

Install server dependencies:

uv venv
source .venv/bin/activate
uv pip install -r requirements.txt

Follow the instructions on Hugging Face (https://huggingface.co/facebook/OMol25) to download the OMol25 files to your OMOL_DATA_DIR, e.g. /path/to/omol25/data:

export OMOL_DATA_DIR=/path/to/omol25/data

Below is an example with the train_4M subset (good for testing, approximately 19 GB).

cd ${OMOL_DATA_DIR}

wget https://dl.fbaipublicfiles.com/opencatalystproject/data/OMol/250514/train_4M.tar.gz

tar -xvzf train_4M.tar.gz

cd -

You should now have a ${OMOL_DATA_DIR}/train_4M/ directory with .aselmdb files.
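
To confirm the extraction worked before starting the server, a small check like the one below can help. It is a minimal sketch: `list_aselmdb_files` is a hypothetical helper that is not part of this repository, and it searches the split directory recursively because the exact layout inside train_4M/ is assumed rather than documented here.

```python
from pathlib import Path


def list_aselmdb_files(data_dir: str, split: str = "train_4M") -> list[Path]:
    """Return the sorted .aselmdb files found under <data_dir>/<split>/.

    Hypothetical helper; searches recursively in case files are nested.
    """
    split_dir = Path(data_dir) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"Expected an extracted split at {split_dir}")
    return sorted(split_dir.rglob("*.aselmdb"))
```

For example, calling `list_aselmdb_files(os.environ["OMOL_DATA_DIR"])` (after `import os`) should return a non-empty list once the archive has been extracted.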

Usage

Start the server:

source .venv/bin/activate && streamlit run app.py -- --dataset-dir ${OMOL_DATA_DIR}

Then, choose a dataset split and execute a query.

The matching molecules are visualized with stmol. You can also view the distribution of various statistics across queries.

Building your own index?

Do you have your own dataset in the OMol25 format? Do you want to extend our explorer to support queries on arbitrary attributes? You can build your own index!

Specify the folder where you want to store your index files:

export OMOL_INDEX_DIR=/path/to/omol25/index/

Build your index. On the train_4M subset from OMol25, this takes around 30 minutes on a single CPU.

python -m src.index --dataset-dir ${OMOL_DATA_DIR} --index-dir ${OMOL_INDEX_DIR} --split train_4M

Start the server and specify your custom index directory:

streamlit run app.py -- --dataset-dir ${OMOL_DATA_DIR} --index-dir ${OMOL_INDEX_DIR}
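
If you prefer to launch the server from a Python script, the command above can be assembled programmatically. The sketch below uses only the flags shown in this README; `build_streamlit_command` is a hypothetical helper, not something this repository provides.

```python
import shlex
from typing import Optional


def build_streamlit_command(dataset_dir: str, index_dir: Optional[str] = None) -> list[str]:
    """Assemble the `streamlit run` argument list shown in this README."""
    # The bare "--" separates Streamlit's own options from the arguments
    # meant for app.py.
    command = ["streamlit", "run", "app.py", "--", "--dataset-dir", dataset_dir]
    if index_dir is not None:
        command += ["--index-dir", index_dir]
    return command


command = build_streamlit_command("/path/to/omol25/data", "/path/to/omol25/index/")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` from the repository root, with the virtual environment active, would then start the server.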

Acknowledgements

This Streamlit app was built with stmol and py3Dmol. Please cite the OMol25 dataset if you use this explorer. Finally, please feel free to cite this repository if you find our tool useful!

@software{omol-explorer,
  author = {Daigavane, Ameya and Smidt, Tess},
  title = {{OMol Explorer}},
  url = {https://github.com/atomicarchitects/OMolExplorer},
  version = {0.0.1},
  year = {2025}
}
