Please refer to the code for the official implementation.
The code has been tested in the following environment:
```bash
conda create -n tagmol python=3.8.17
conda activate tagmol
conda install pytorch=1.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
conda install pyg=2.2.0 -c pyg
conda install rdkit=2022.03.2 openbabel=3.1.1 tensorboard=2.13.0 pyyaml=6.0 easydict=1.9 python-lmdb=1.4.1 -c conda-forge

# For Vina Docking
pip install meeko==0.1.dev3 scipy pdb2pqr vina==1.2.2
python -m pip install git+https://github.com/Valdes-Tresanco-MS/AutoDockTools_py3
```

Alternatively, create the environment from the provided file:

```bash
conda env create -f environment.yml
```

IMPORTANT NOTE: You might have to do the following to append the root working directory to the Python path:

```bash
export PYTHONPATH=".":$PYTHONPATH
```

The resources can be found here. The data are inside the `data` directory, the backbone model is inside `pretrained_models`, and the guide checkpoints are inside `logs`.
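Before training or sampling, a quick sanity check of the environment and the downloaded resources can save time. The snippet below is only a minimal sketch, not part of the official setup; it assumes the repository root is the working directory and that the resources were extracted into the `data`, `pretrained_models`, and `logs` directories mentioned above.

```bash
# Check that the key dependencies resolve and CUDA is visible
# (assumption: the tagmol conda environment is active).
python -c "import torch, torch_geometric, rdkit; print(torch.__version__, 'CUDA:', torch.cuda.is_available())"

# Check that the downloaded resources are where the instructions expect them.
ls data pretrained_models logs
```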
Train the backbone diffusion model:

```bash
python scripts/train_diffusion.py configs/training.yml
```

Train the guide models (binding affinity, QED, and SA, respectively):

```bash
python scripts/train_dock_guide.py configs/training_dock_guide.yml
python scripts/train_dock_guide.py configs/training_dock_guide_qed.yml
python scripts/train_dock_guide.py configs/training_dock_guide_sa.yml
```

NOTE: The outputs are saved in `logs/` by default.
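Since training writes to `logs/` and TensorBoard is part of the environment above, progress can be monitored with a standard TensorBoard invocation. This is a sketch; the exact location of the event files under `logs/` depends on the run name.

```bash
# Point TensorBoard at the default output directory
# (assumption: training writes TensorBoard event files somewhere under logs/).
tensorboard --logdir logs
```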
To sample molecules with the backbone model for a single target:

```bash
python scripts/sample_diffusion.py configs/sampling.yml --data_id {i}  # Replace {i} with the index of the data; i should be between 0 and 99 for the test set.
```

We have a bash file that can run the inference for the entire test set in a loop:

```bash
bash scripts/batch_sample_diffusion.sh configs/sampling.yml backbone
```

The output will be stored in `experiments/backbone`. The variables `BATCH_SIZE`, `NODE_ALL`, `NODE_THIS`, and `START_IDX` can be modified in the script file, if required.
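If the provided batch script does not fit your setup, the same sweep can be reproduced with a plain loop over the test-set indices. This is only a simplified sketch of what the batch script does (serial execution, default settings), not the script itself.

```bash
# Run backbone sampling for all 100 test-set targets, one after another
# (a simplified stand-in for scripts/batch_sample_diffusion.sh).
for i in $(seq 0 99); do
    python scripts/sample_diffusion.py configs/sampling.yml --data_id "${i}"
done
```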
To sample molecules with guidance for a single target:

```bash
python scripts/sample_multi_guided_diffusion.py [path-to-config.yml] --data_id {i}  # Replace {i} with the index of the data; i should be between 0 and 99 for the test set.
```

To run inference on all 100 targets in the test set:

```bash
bash scripts/batch_sample_multi_guided_diffusion.sh [path-to-config.yml] [output-dir-name]
```

When run using the bash file, the outputs are stored in `experiments_multi/[output-dir-name]`. The available config files in `configs/noise_guide_multi` are listed below (a single-target usage example follows the list):
- Single-objective guidance
  - BA: `sampling_guided_ba_1.yml`
  - QED: `sampling_guided_qed_1.yml`
  - SA: `sampling_guided_sa_1.yml`
- Dual-objective guidance
  - QED + BA: `sampling_guided_qed_0.5_ba_0.5.yml`
  - SA + BA: `sampling_guided_sa_0.5_ba_0.5.yml`
  - QED + SA: `sampling_guided_qed_0.5_sa_0.5.yml`
- Multi-objective guidance (our main model)
  - QED + SA + BA: `sampling_guided_qed_0.33_sa_0.33_ba_0.34.yml`
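For instance, to generate molecules for a single test target with BA-only guidance, the single-target command above can be combined with the corresponding config. The choice of config and `data_id` here is purely illustrative; any config from the list works the same way.

```bash
# Single-objective (BA) guided sampling for test target 0
# (illustrative data_id; valid values are 0-99 for the test set).
python scripts/sample_multi_guided_diffusion.py configs/noise_guide_multi/sampling_guided_ba_1.yml --data_id 0
```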
For example, to run the multi-objective setting (i.e., our model):
```bash
bash scripts/batch_sample_multi_guided_diffusion.sh configs/noise_guide_multi/sampling_guided_qed_0.33_sa_0.33_ba_0.34.yml qed_0.33_sa_0.33_ba_0.34
```

To evaluate a trained guide model:

```bash
python scripts/eval_dock_guide.py --ckpt_path [path-to-checkpoint.pt]
```

To evaluate the generated molecules:

```bash
python scripts/evaluate_diffusion.py {OUTPUT_DIR} --docking_mode vina_score --protein_root data/test_set
```

The docking mode can be chosen from {qvina, vina_score, vina_dock, none}.
NOTE: It will take some time to prepare the pdbqt and pqr files when you run the evaluation code with the vina_score/vina_dock docking mode for the first time.
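As a concrete example, the multi-objective batch run above could be evaluated with full re-docking as sketched below. The output directory name follows the `experiments_multi/[output-dir-name]` convention described earlier, but the exact path on your system may differ.

```bash
# Evaluate the multi-objective run with full Vina re-docking
# (assumed output path; adjust to wherever your sampling results were written).
python scripts/evaluate_diffusion.py experiments_multi/qed_0.33_sa_0.33_ba_0.34 \
    --docking_mode vina_dock --protein_root data/test_set
```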
| Methods | Vina Score Avg. (↓) | Vina Score Med. (↓) | Vina Min Avg. (↓) | Vina Min Med. (↓) | Vina Dock Avg. (↓) | Vina Dock Med. (↓) | High Affinity Avg. (↑) | High Affinity Med. (↑) | QED Avg. (↑) | QED Med. (↑) | SA Avg. (↑) | SA Med. (↑) | Diversity Avg. (↑) | Diversity Med. (↑) | Hit Rate % (↑) |
|---------|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reference | -6.36 | -6.46 | -6.71 | -6.49 | -7.45 | -7.26 | - | - | 0.48 | 0.47 | 0.73 | 0.74 | - | - | 21 |
| liGAN | - | - | - | - | -6.33 | -6.20 | 21.1% | 11.1% | 0.39 | 0.39 | 0.59 | 0.57 | 0.66 | 0.67 | 13.2 |
| AR | -5.75 | -5.64 | -6.18 | -5.88 | -6.75 | -6.62 | 37.9% | 31.0% | 0.51 | 0.50 | 0.63 | 0.63 | 0.70 | 0.70 | 12.9 |
| Pocket2Mol | -5.14 | -4.70 | -6.42 | -5.82 | -7.15 | -6.79 | 48.4% | 51.0% | 0.56 | 0.57 | 0.74 | 0.75 | 0.69 | 0.71 | 24.3 |
| TargetDiff | -5.47 | -6.30 | -6.64 | -6.83 | -7.80 | -7.91 | 58.1% | 59.1% | 0.48 | 0.48 | 0.58 | 0.58 | 0.72 | 0.71 | 20.5 |
| DecompDiff | -4.85 | -6.03 | -6.76 | -7.09 | -8.48 | -8.50 | 64.8% | 78.6% | 0.44 | 0.41 | 0.59 | 0.59 | 0.63 | 0.62 | 24.9 |
| TAGMol | -7.02 | -7.77 | -7.95 | -8.07 | -8.59 | -8.69 | 69.8% | 76.4% | 0.55 | 0.56 | 0.56 | 0.56 | 0.69 | 0.70 | 27.7 |
Due to space constraints, we only share the `eval_results` folder generated by the evaluation script. It can be found at the same link as the other resources, inside the `results` directory.
```bibtex
@article{dorna2024tagmol,
  title={TAGMol: Target-Aware Gradient-guided Molecule Generation},
  author={Vineeth Dorna and D. Subhalingam and Keshav Kolluru and Shreshth Tuli and Mrityunjay Singh and Saurabh Singal and N. M. Anoop Krishnan and Sayan Ranu},
  journal={arXiv preprint arXiv:2406.01650},
  year={2024}
}
```
This codebase was built on top of TargetDiff.
