pdb2reaction is a Python CLI toolkit for turning PDB structures into enzymatic reaction pathways with machine-learning interatomic potentials (MLIPs). Each workflow step is also available as an individual subcommand (opt, scan, scan2d, path-search, tsopt, freq, irc, dft, energy-diagram, etc.) for fine-grained control.
A single command can generate a first-pass enzymatic reaction path:

    # bezA (GPP C6-methyltransferase): methyl transfer (SAM→GPP C6) + proton abstraction (E170)
    pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3'

    # Scan mode (single structure → staged bond scans → MEP)
    pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
        --scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
        '[("GPP 321 H11","GLU 186 OE2",0.90)]'

The full workflow (MEP search → TS optimization → IRC → thermochemistry → single-point DFT) can be run in one command:

    pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
        --tsopt --thermo --dft

Working examples are provided in the `examples/` directory, including complete `all` workflow scripts for both multi-structure MEP and scan-based pipelines. The example system is GPP C6-methyltransferase BezA (Tsutsumi et al., Angew. Chem. Int. Ed. 2022, 61, e202111217), which catalyzes a two-step reaction: (1) electrophilic methyl transfer from SAM to the C6 position of GPP via a C7 carbocation intermediate, and (2) proton abstraction from C6 by the catalytic base E170 to yield 6-methylgeranyl pyrophosphate (6MGPP).
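The `--scan-lists` values in the scan-mode command read like Python-style lists of `(atom, atom, value)` tuples, one list per scan stage. The sketch below assumes that reading and that the third entry is a target distance; it only inspects the strings and says nothing about how pdb2reaction parses them internally. `parse_scan_list` is a hypothetical helper name.

```python
import ast


def parse_scan_list(text):
    """Parse one scan stage into (atom_a, atom_b, target) tuples.

    Assumption: the stage is a Python literal list of 3-tuples.
    """
    stage = ast.literal_eval(text)
    return [(str(a), str(b), float(target)) for a, b, target in stage]


# The two stages copied from the scan-mode command.
stages = [
    '[("CS1 SAM 320","GPP 321 C7",1.60)]',
    '[("GPP 321 H11","GLU 186 OE2",0.90)]',
]

for number, text in enumerate(stages, start=1):
    for atom_a, atom_b, target in parse_scan_list(text):
        print(f"stage {number}: {atom_a} <-> {atom_b}, target {target:.2f}")
```

Checking the strings this way before launching a run can catch quoting mistakes, which are easy to make when nesting single and double quotes in a shell.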
Given (i) two or more PDB files (R → ... → P), or (ii) one PDB with --scan-lists, or (iii) one TS candidate with --tsopt, pdb2reaction automatically:
- extracts an active-site model around user-defined substrates to build a cluster model,
- explores minimum-energy paths (MEPs) with GSM or DMF,
- optionally optimizes transition states, runs vibrational analysis, IRC, and single-point DFT,
using machine-learning interatomic potentials (MLIPs).
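The three input modes above can be summarized as a small decision function. This is only an illustration of the rule as stated in the text, not the toolkit's own dispatch logic; `infer_mode` and its arguments are hypothetical, and combinations the text does not describe are deliberately left unhandled.

```python
def infer_mode(n_inputs, has_scan_lists=False, tsopt=False):
    """Map an input combination to one of the three documented modes."""
    if n_inputs >= 2 and not has_scan_lists:
        # (i) two or more structures: R -> ... -> P
        return "multi-structure MEP"
    if n_inputs == 1 and has_scan_lists:
        # (ii) one structure plus staged bond scans
        return "scan"
    if n_inputs == 1 and tsopt:
        # (iii) one TS candidate
        return "TS-only"
    raise ValueError("combination not covered by this sketch")


print(infer_mode(2, tsopt=True))                       # multi-structure MEP
print(infer_mode(1, has_scan_lists=True, tsopt=True))  # scan
print(infer_mode(1, tsopt=True))                       # TS-only
```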
| Tool | Use case | Repository |
|---|---|---|
| mlmm-toolkit | ML/MM (ONIOM) with full protein environment — automates MM parameter generation and ML region assignment from a single PDB input | https://github.com/t-0hmura/mlmm_toolkit |
| UMA–Pysisyphus Interface | YAML-input-based reaction mechanism analysis for small molecules | https://github.com/t-0hmura/uma_pysis |
Both pdb2reaction and mlmm-toolkit include a custom GPU-optimized pysisyphus fork for geometry optimization, TS search, and IRC. This bundled fork is not compatible with the upstream pysisyphus package; do not install them side by side.
Important (prerequisites):
- Input PDB files must already contain hydrogen atoms.
- When providing multiple PDBs, they must contain the same atoms in the same order (only coordinates may differ).
- Boolean CLI options accept both `--flag`/`--no-flag` and value style `--flag True/False` (`yes/no`, `1/0` are also accepted). Prefer toggle style in new scripts.
- The workflow also works for small-molecule systems. If you omit `--center/-c` and `--ligand-charge`, you can use `.xyz` or `.gjf` inputs as well.
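The first two prerequisites (hydrogens present, identical atoms in identical order) can be checked before a run. The sketch below is a standalone illustration, not part of pdb2reaction: it reads the fixed-column ATOM/HETATM fields of the PDB format and assumes the element columns (77-78) are filled in. All function names are hypothetical.

```python
def read_atoms(pdb_text):
    """Return one identity tuple per ATOM/HETATM record (coordinates ignored)."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            line = line.ljust(80)  # pad short lines so slicing is safe
            atoms.append((
                line[12:16].strip(),          # atom name, columns 13-16
                line[17:20].strip(),          # residue name, columns 18-20
                line[21],                     # chain ID, column 22
                line[22:26].strip(),          # residue number, columns 23-26
                line[76:78].strip().upper(),  # element, columns 77-78
            ))
    return atoms


def has_hydrogens(atoms):
    """True if any record carries element H (assumes element columns are set)."""
    return any(atom[4] == "H" for atom in atoms)


def same_atoms_same_order(atoms_a, atoms_b):
    """True if both structures list the same atoms in the same order."""
    return atoms_a == atoms_b


def pdb_line(serial, name, resname, resseq, xyz, element):
    """Format a HETATM record in fixed PDB columns (demo helper only)."""
    x, y, z = xyz
    return (f"HETATM{serial:5d} {name:<4s} {resname:>3s} A{resseq:4d}{'':4s}"
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}{'':10s}{element:>2s}")


# Two toy structures that differ only in coordinates.
reactant = "\n".join([
    pdb_line(1, "C1", "LIG", 1, (0.0, 0.0, 0.0), "C"),
    pdb_line(2, "H1", "LIG", 1, (1.09, 0.0, 0.0), "H"),
])
product = "\n".join([
    pdb_line(1, "C1", "LIG", 1, (0.1, 0.0, 0.0), "C"),
    pdb_line(2, "H1", "LIG", 1, (1.5, 0.0, 0.0), "H"),
])

atoms_r, atoms_p = read_atoms(reactant), read_atoms(product)
print("hydrogens present:", has_hydrogens(atoms_r))
print("same atoms, same order:", same_atoms_same_order(atoms_r, atoms_p))
```

If the element columns are blank, the hydrogen check in this sketch reports nothing useful; the `add-elem-info` subcommand listed further down exists to repair those columns.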
- Getting Started: Quick start and workflow overview
- Installation: Setup and dependency installation
- Examples: Working `all` workflow scripts (MEP and scan pipelines) for bezA
- YAML Reference: Configuration options
- JSON Output Reference: Machine-readable result.json schema
- Troubleshooting: Common errors, backend selection guide, VRAM requirements
- Full documentation: docs/index.md
This software is still under development. Please use it at your own risk.
pdb2reaction requires Linux with a CUDA-capable GPU.
- Python >= 3.11
- CUDA 12.x
    pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu129
    pip install pdb2reaction
    plotly_get_chrome -y
    huggingface-cli login

Alternatively, install into a dedicated conda environment:

    conda create -n pdb2reaction python=3.11 -y
    conda activate pdb2reaction
    conda install -c conda-forge cyipopt -y
    pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu129
    pip install pdb2reaction
    plotly_get_chrome -y

DFT dependencies are not installed by default. To use `pdb2reaction dft`, install the `[dft]` extra:

    pip install "pdb2reaction[dft]"

This installs PySCF, GPU4PySCF (x86_64 only), and related CUDA libraries. Note that DFT single-point calculations are practical only for systems up to ~500 atoms; larger systems will require prohibitive compute time and memory.
For detailed installation instructions, see Installation.
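The ~500-atom guideline for DFT single points can be checked before submitting a job. The sketch below simply counts ATOM/HETATM records in the first model of a PDB file; the constant mirrors the rough figure quoted above, and the function names are hypothetical rather than part of pdb2reaction.

```python
DFT_PRACTICAL_ATOM_LIMIT = 500  # rough guideline quoted in the text, not a hard limit


def count_atoms(pdb_text):
    """Count ATOM/HETATM records, stopping at the end of the first model."""
    count = 0
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break
        if line.startswith(("ATOM", "HETATM")):
            count += 1
    return count


def dft_looks_practical(pdb_text, limit=DFT_PRACTICAL_ATOM_LIMIT):
    """True if the structure is within the rough atom-count guideline."""
    return count_atoms(pdb_text) <= limit


# Toy input: 3 records in the first model, 1 more in a second model.
toy = "ATOM  a\nATOM  b\nHETATM c\nENDMDL\nATOM  d\n"
print(count_atoms(toy), dft_looks_practical(toy))
```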
| Potential | Repository | Install extra |
|---|---|---|
| UMA (default) | https://github.com/facebookresearch/fairchem | (included) |
| ORB | https://github.com/orbital-materials/orb-models | pip install "pdb2reaction[orb]" |
| MACE | https://github.com/ACEsuit/mace | See below |
| AIMNet2 | https://github.com/isayevlab/aimnetcentral | pip install "pdb2reaction[aimnet]" |
MACE installation: MACE requires `e3nn==0.4.4`, which conflicts with `fairchem-core` (UMA). To use MACE, first uninstall UMA's dependency, then install MACE:

    pip uninstall fairchem-core
    pip install mace-torch

UMA and MACE cannot coexist in the same environment. Use separate conda environments if you need both.
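To see whether an environment already mixes the two backends, the installed distributions can be queried with the standard library. This is a minimal sketch using `importlib.metadata`; the distribution names `fairchem-core` and `mace-torch` are taken from the pip commands above, and the helper names are hypothetical.

```python
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed in this environment
    return found


def uma_and_mace_both_present(found):
    """True if both conflicting distributions appear in the environment."""
    return "fairchem-core" in found and "mace-torch" in found


found = installed_versions(["fairchem-core", "mace-torch", "e3nn"])
print(found)
if uma_and_mace_both_present(found):
    print("fairchem-core and mace-torch are both installed; use separate environments")
```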
The examples below use GPP C6-methyltransferase BezA (Tsutsumi et al., Angew. Chem. Int. Ed. 2022, 61, e202111217), which follows a two-step mechanism: electrophilic methyl transfer from SAM to GPP C6 (via C7 carbocation), then proton abstraction by E170. Complete working scripts are in `examples/`.

Multi-structure MEP (two or more PDB files, R → P):

    pdb2reaction -i 1.R.pdb 3.P.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
        --tsopt --thermo --out-dir result_mep

Scan mode (one PDB with `--scan-lists`):

    pdb2reaction -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
        --scan-lists '[("CS1 SAM 320","GPP 321 C7",1.60)]' \
        '[("GPP 321 H11","GLU 186 OE2",0.90)]' \
        --tsopt --thermo --out-dir result_scan

TS-only mode (one TS candidate with `--tsopt`):

    pdb2reaction -i TS_candidate.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' \
        --tsopt

Each workflow step can also be run as an individual subcommand:

1. Extract active-site model (cluster model): `extract`

       pdb2reaction extract -i 1.R.pdb -c 'SAM,GPP,MG' -l 'SAM:1,GPP:-3' -r 6.0

2. Optimize geometry: `opt`

       pdb2reaction opt -i model.pdb -l 'SAM:1,GPP:-3'

3. MEP search: `path-opt`

       pdb2reaction path-opt -i R_model.pdb P_model.pdb -l 'SAM:1,GPP:-3'

4. TS optimization: `tsopt`

       pdb2reaction tsopt -i hei.pdb -l 'SAM:1,GPP:-3'

5. Frequency analysis: `freq`

       pdb2reaction freq -i ts_optimized.pdb -l 'SAM:1,GPP:-3'

6. IRC: `irc`

       pdb2reaction irc -i ts_optimized.pdb -l 'SAM:1,GPP:-3'

7. DFT single-point: `dft`

       pdb2reaction dft -i optimized.pdb -l 'SAM:1,GPP:-3'

| Subcommand | Role | Documentation |
|---|---|---|
| `all` | End-to-end: extraction → MEP → TS → IRC → freq → DFT | docs/all.md |

| Subcommand | Role | Documentation |
|---|---|---|
| `extract` | Extract active-site model (cluster model) | docs/extract.md |
| `fix-altloc` | Resolve alternate conformations in PDB files | docs/fix_altloc.md |
| `add-elem-info` | Add/repair PDB element columns (77–78) | docs/add_elem_info.md |

| Subcommand | Role | Documentation |
|---|---|---|
| `opt` | Geometry optimization (L-BFGS or RFO) | docs/opt.md |
| `tsopt` | TS optimization (Dimer or RS-I-RFO) | docs/tsopt.md |
| `path-opt` | MEP optimization via GSM or DMF | docs/path_opt.md |
| `path-search` | Recursive MEP search with refinement | docs/path_search.md |
| `scan` | 1D bond-length driven scan | docs/scan.md |
| `scan2d` | 2D distance grid scan | docs/scan2d.md |
| `scan3d` | 3D distance grid scan | docs/scan3d.md |

| Subcommand | Role | Documentation |
|---|---|---|
| `freq` | Vibrational frequency analysis + thermochemistry | docs/freq.md |
| `irc` | IRC calculation (EulerPC) | docs/irc.md |
| `dft` | Single-point DFT (GPU4PySCF / PySCF) | docs/dft.md |
| `bond-summary` | Compare structures and report bond changes | docs/bond-summary.md |

| Subcommand | Role | Documentation |
|---|---|---|
| `trj2fig` | Energy plot from XYZ trajectory | docs/trj2fig.md |
| `energy-diagram` | Energy diagram from numeric values | docs/energy_diagram.md |
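The step-by-step subcommands shown earlier can be scripted from Python when finer control than `all` is wanted. The sketch below only assembles the seven command lines, using the flags and file names that appear in this README, and prints them in dry-run mode. The file names are placeholders; the actual output names of each step may differ, so check each subcommand's documentation before chaining real runs. `build_steps` and `run_steps` are hypothetical helper names.

```python
import shlex
import subprocess

LIGAND_CHARGE = "SAM:1,GPP:-3"


def build_steps(ligand_charge=LIGAND_CHARGE):
    """Return the argv list for each of the seven documented steps."""
    lc = ["-l", ligand_charge]
    return [
        ["pdb2reaction", "extract", "-i", "1.R.pdb", "-c", "SAM,GPP,MG", *lc, "-r", "6.0"],
        ["pdb2reaction", "opt", "-i", "model.pdb", *lc],
        ["pdb2reaction", "path-opt", "-i", "R_model.pdb", "P_model.pdb", *lc],
        ["pdb2reaction", "tsopt", "-i", "hei.pdb", *lc],
        ["pdb2reaction", "freq", "-i", "ts_optimized.pdb", *lc],
        ["pdb2reaction", "irc", "-i", "ts_optimized.pdb", *lc],
        ["pdb2reaction", "dft", "-i", "optimized.pdb", *lc],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the chain at the first failing step
            subprocess.run(argv, check=True)


run_steps(build_steps())  # dry run: prints the commands without executing them
```

Passing argument lists to `subprocess.run` instead of a single shell string avoids the quoting pitfalls of values such as `'SAM:1,GPP:-3'`.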
Tip: In `tsopt`, `freq`, and `irc`, setting `--hessian-calc-mode Analytical` is strongly recommended when you have enough VRAM.
On HPC clusters or multi-GPU workstations, pdb2reaction can parallelize UMA inference across nodes. Set workers and workers_per_node to enable parallel inference; see docs/uma_pysis.md for details.
    pdb2reaction --help
    pdb2reaction <subcommand> --help
    pdb2reaction <subcommand> --help-advanced
    pdb2reaction all --help-advanced

    # Shorthand alias (equivalent to pdb2reaction)
    p2r --help

    # Equivalent module invocation
    python -m pdb2reaction --help

`pdb2reaction all --help` shows core options. Use `pdb2reaction all --help-advanced` for the full option list.

The same progressive-help pattern (`--help` for core options, `--help-advanced` for the full parser options) applies to scan, scan2d, scan3d, the calculation commands (opt, path-opt, path-search, tsopt, freq, irc, dft), add-elem-info, trj2fig, energy-diagram, extract, and fix-altloc.
If you encounter any issues, please open an issue at https://github.com/t-0hmura/pdb2reaction/issues.
A preprint describing pdb2reaction is in preparation. Until it is available, if you find this work helpful for your research, please cite the software itself:
    @software{ohmura2026pdb2reaction,
      author = {Ohmura, Takuto},
      title = {pdb2reaction},
      year = {2026},
      month = {3},
      version = {0.3.2},
      url = {https://github.com/t-0hmura/pdb2reaction},
      license = {GPL-3.0},
      doi = {10.5281/zenodo.19197878}
    }

- MACE and UMA cannot coexist in the same environment due to an `e3nn` version conflict. Use separate conda environments.
- DFT single-point (`pdb2reaction dft`) is practical up to ~500 atoms; larger systems may require fragmentation.
- ORB backend has a higher failure rate on multi-step reactions (SVD failures in path optimization).
- CPU-only execution is supported but 10-100x slower than GPU.
pdb2reaction is distributed under the GNU General Public License version 3 (GPL-3.0).