Improving RNA 3D structure prediction by adding template and secondary-structure guidance to open-source AlphaFold3 reproductions. Three model families are explored; each is modified to accept RNA structural guidance that the stock code does not support, then benchmarked against the unmodified baseline.
| Model | Approach | Status |
|---|---|---|
| OpenFold3 | RNA templates + synthetic SS-MSA (9 edited files) | included |
| Boltz-2 | RNA templates + SS contact constraints (2 edited files) | included |
| GraphaRNA | GNN + diffusion with N motif templates | planned |
- Structural guidance (RNA templates, synthetic secondary-structure MSAs) improved RNA 3D structure prediction over the baseline models on several benchmark RNAs, though not on every target.
- OpenFold3 modified with RNA templates and synthetic secondary-structure MSAs produced the strongest benchmark performance.
- Boltz-2 was extended to accept RNA template and secondary-structure guidance not available in the original implementation.
- Several benchmark structures showed large RMSD improvements over baseline predictions, e.g. 1E7K (8.98 Å to 0.84 Å) and 1P5P (22.45 Å to 3.39 Å) with OF3 combined.
- The study demonstrates that structural guidance can improve prediction accuracy across diverse RNA systems.
RMSD (Å), best per row in bold.
| RNA | Length | OF3 baseline | OF3 MSA v5 | OF3 combined | Boltz template |
|---|---|---|---|---|---|
| 1E7K | 17 | 8.98 | 0.94 | **0.84** | 9.88 |
| 1P5P | 77 | 22.45 | 3.42 | **3.39** | 6.72 |
| 1P5O | 77 | 22.57 | **3.81** | 3.82 | 6.13 |
| 1HMH | 34 | 11.39 | **6.87** | 9.58 | 15.98 |
| 1KH6 | 48 | 5.94 | **5.91** | 7.10 | 14.18 |
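Both guided OF3 variants reach sub-ångström accuracy on 1E7K (0.94 Å for MSA v5, 0.84 Å for combined). One of the two is the best method on every row of this subset, and they bring 1P5P and 1P5O from baseline failures of about 22 Å down to under 4 Å. The gains are not uniform: on 1KH6, MSA v5 is only marginally better than baseline (5.91 Å vs 5.94 Å) and the combined variant is worse (7.10 Å).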
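The per-row comparison can be reproduced with a few lines of Python. This is a minimal sketch that hard-codes the five rows of the table; the names `RMSD`, `best_method`, and `gain_over_baseline` are illustrative and are not part of the repository's scripts.

```python
# RMSD values (in Å) copied from the table above.
RMSD = {
    "1E7K": {"OF3 baseline": 8.98, "OF3 MSA v5": 0.94, "OF3 combined": 0.84, "Boltz template": 9.88},
    "1P5P": {"OF3 baseline": 22.45, "OF3 MSA v5": 3.42, "OF3 combined": 3.39, "Boltz template": 6.72},
    "1P5O": {"OF3 baseline": 22.57, "OF3 MSA v5": 3.81, "OF3 combined": 3.82, "Boltz template": 6.13},
    "1HMH": {"OF3 baseline": 11.39, "OF3 MSA v5": 6.87, "OF3 combined": 9.58, "Boltz template": 15.98},
    "1KH6": {"OF3 baseline": 5.94, "OF3 MSA v5": 5.91, "OF3 combined": 7.10, "Boltz template": 14.18},
}


def best_method(row: dict[str, float]) -> str:
    """Return the method with the lowest RMSD in one table row."""
    return min(row, key=row.get)


def gain_over_baseline(row: dict[str, float], method: str) -> float:
    """Positive values mean the method beats the OF3 baseline (in Å)."""
    return row["OF3 baseline"] - row[method]


for rna, row in RMSD.items():
    method = best_method(row)
    print(f"{rna}: best = {method} ({row[method]:.2f} Å), "
          f"gain over baseline = {gain_over_baseline(row, method):.2f} Å")
```

Negative gains are kept rather than clipped, so regressions such as OF3 combined on 1KH6 stay visible.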
All work builds on the upstream open-source projects and is shared as edits and patches on top of the official code: install the upstream model, then apply the changes from this repository.
models/
  openfold3/      RNA template + synthetic-MSA pipeline
    edited_files/  patches/  scripts/  example_17RA/  example_1Z30_msa/
    results/  docs/  README.md
  boltz2/         SS constraints + RNA templates
    edited_files/  patches/  scripts/  example_ss_constraints/
    example_template/  results/  README.md
figures/          flow + structure diagrams, result visualizations
Each model folder is self-contained: edited source files (with a README on how to apply them to a fresh install), the pipeline scripts, a complete worked example (input -> intermediates -> output), and benchmark results.
Stock AlphaFold3 reproductions support protein templates and protein MSAs, but not RNA equivalents. This work adds:
- OpenFold3: accepts RNA 3D templates (C1' distances, P-C4'-C1' frames) and an optional synthetic MSA generated from the RNA secondary structure, giving the model a coevolution-like signal without a real alignment.
- Boltz-2: accepts RNA template chains (frames from C4'-C1'-glycosidic N) and adds a generator that turns dot-bracket secondary structure into contact constraints. The generator works on stock Boltz-2 with no patch needed.
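Both kinds of secondary-structure guidance start from the same step: reading base pairs out of a dot-bracket string. The sketch below shows that step and one plausible way to turn the pairs into a coevolution-like synthetic alignment by swapping paired positions for other canonical pairs. It is an illustration under stated assumptions, not the repository's pipeline: `parse_dot_bracket`, `synthetic_msa`, `swap_prob`, and the choice of canonical pairs are all hypothetical, and only `(`, `)` and unpaired characters are handled (no pseudoknot brackets).

```python
import random

# Watson-Crick and G-U wobble pairs used for compensatory swaps (an assumption).
CANONICAL_PAIRS = [("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")]


def parse_dot_bracket(ss: str) -> list[tuple[int, int]]:
    """Return sorted 0-based (i, j) base pairs from a dot-bracket string."""
    stack: list[int] = []
    pairs: list[tuple[int, int]] = []
    for pos, ch in enumerate(ss):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        # every other character (e.g. '.') is treated as unpaired
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def synthetic_msa(seq: str, ss: str, n_seqs: int = 8,
                  swap_prob: float = 0.3, seed: int = 0) -> list[str]:
    """Return the query plus n_seqs variants with compensatory pair swaps.

    Paired columns always hold a canonical pair in the variants, so the
    alignment covaries at exactly the positions the secondary structure pairs.
    """
    if len(seq) != len(ss):
        raise ValueError("sequence and secondary structure differ in length")
    rng = random.Random(seed)
    pairs = parse_dot_bracket(ss)
    rows = [seq]
    for _ in range(n_seqs):
        row = list(seq)
        for i, j in pairs:
            if rng.random() < swap_prob:
                row[i], row[j] = rng.choice(CANONICAL_PAIRS)
        rows.append("".join(row))
    return rows
```

For contact constraints, the `(i, j)` pairs would typically be shifted to 1-based residue numbers before being written to the model input; the exact constraint schema is defined by the upstream model and is not reproduced here.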
Structures are evaluated by C1' Kabsch RMSD vs the native, best-of-N over diffusion samples and seeds.
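The evaluation metric can be sketched as follows, assuming the C1' coordinates of the prediction and the native have already been extracted as `(N, 3)` arrays in matching residue order. The function names `kabsch_rmsd` and `best_of_n_rmsd` are illustrative; this is a standard Kabsch superposition written with NumPy, not the repository's evaluation script.

```python
import numpy as np


def kabsch_rmsd(pred: np.ndarray, native: np.ndarray) -> float:
    """RMSD (same units as the input) after optimal rigid superposition."""
    P = np.asarray(pred, dtype=float)
    Q = np.asarray(native, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two (N, 3) arrays of equal shape")

    # Remove translation by centering both point sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Optimal rotation from the SVD of the covariance matrix.
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def best_of_n_rmsd(predictions: list[np.ndarray], native: np.ndarray) -> float:
    """Lowest Kabsch RMSD over all diffusion samples and seeds."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return min(kabsch_rmsd(p, native) for p in predictions)
```

Because best-of-N takes the minimum over samples, reported values depend on how many samples and seeds were drawn, so comparisons between methods are only fair when N is the same.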
Each model's full pipeline runs from a single entry command. See the model README for arguments and the paths to change.
OpenFold3 (template stage, then synthetic-MSA stage):
conda activate openfold3_rna_templ
bash models/openfold3/scripts/run_one_rna_of3_template.sh 1Z30
bash models/openfold3/scripts/run_ss_msa_pipeline.sh \
    1Z30 1Z30.seq 1Z30.2d <stage1_dir> 1Z30.pdb

Boltz-2 (pick an approach):
conda activate boltz2
# SS constraints (stock Boltz-2)
python models/boltz2/scripts/generate_boltz_yaml.py \
--rna 1A1T --seq 1A1T.seq --ss 1A1T.2d --out 1A1T.yaml
boltz predict 1A1T.yaml --diffusion_samples 5 --out_dir results_1A1T --override
# template (patched Boltz-2)
python models/boltz2/scripts/generate_template_yaml.py \
--rna 1A1T --pdb 1A1T.pdb --out 1A1T_template.yaml --cif_out 1A1T.cif
boltz predict 1A1T_template.yaml --out_dir results_1A1T_template --override

Across a 22-RNA benchmark, OpenFold3 with synthetic MSA was the strongest single
method overall, with large gains on several RNAs (e.g. 1P5P 22.45 -> 3.54 Å,
1E7K 8.98 -> 0.94 Å). Boltz-2 templates won on some RNAs (e.g. 3G8T 2.80 Å),
showing the two approaches are complementary rather than one strictly dominating.
Per-RNA tables and head-to-head comparisons are in each model's results/ folder.
Some RNAs did not improve (e.g. 1U9S with synthetic MSA); these are reported too.
- No proprietary weights or large datasets are included; install upstream models for those.
- Paths in scripts are placeholders or are passed as arguments; see each model README for the specific paths to set.