@daylight-00

Summary

This PR implements atom-wise RASA conditioning for ligands, enabling precise control over which atoms should be buried or exposed. This allows fine-grained control of ligand orientation during protein-ligand design.

Key Point: The model was already trained with atom-level RASA features (rf_diffusion/sasa.py computes SASA per atom). This PR only exposes that capability to users at inference time, and existing global RASA specifications keep working unchanged.

Changes

  1. Atom-wise specification: Expanded syntax rasa='global_rasa,ATOM1:value1,ATOM2:value2,...' to control individual atoms
  2. Metadata infrastructure: Pass ligand atom names through inference pipeline for atom name mapping
  3. Bug fix: one_hot_buckets() now clamps bucket indices to [0, n-1] and no longer has an off-by-one error in bucket assignment

Technical Details

  • Atom names must match PDB naming conventions
  • Values: 0.0 (buried) to 1.0 (exposed)
  • Backward compatible: global RASA values still work
  • Graceful fallback when metadata unavailable

Usage Examples

Control ligand orientation by specifying which atoms should be exposed:
4yhy_m3l.pdb

Inward Orientation

m3l_in.png

inference.conditions.relative_sasa_v2.rasa=\'0.0,N:1.0,CA:1.0,O:1.0,OXT:1.0\'

→ M3L backbone faces outward and the side chain points inward (backbone atoms are set to RASA 1.0; all other atoms use the global value 0.0)

Outward Orientation

m3l_out.png

inference.conditions.relative_sasa_v2.rasa=\'0.0,CM1:1.0,CM2:1.0,CM3:1.0,NZ:1.0\'

→ M3L side chain faces outward (the listed side chain atoms are set to RASA 1.0; all other atoms use the global value 0.0)
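
For scripted runs, the value part of the two overrides above could be assembled from a dictionary. This is only a convenience sketch: `build_rasa_spec` is a hypothetical helper and not part of this PR, and it only produces the string between the quotes.

```python
def build_rasa_spec(global_rasa, atom_values):
    """Build 'global,ATOM1:value1,...' (hypothetical helper, not in the PR)."""
    parts = [str(float(global_rasa))]
    for name, value in atom_values.items():
        # Atom names are passed through as-is; they must match the PDB names.
        parts.append(f"{name}:{float(value)}")
    return ",".join(parts)


# Value used in the "Inward Orientation" example (backbone atoms exposed).
inward_spec = build_rasa_spec(0.0, {"N": 1.0, "CA": 1.0, "O": 1.0, "OXT": 1.0})
# Value used in the "Outward Orientation" example (side chain atoms exposed).
outward_spec = build_rasa_spec(0.0, {"CM1": 1.0, "CM2": 1.0, "CM3": 1.0, "NZ": 1.0})
```

The resulting string still needs the same quoting as in the commands above when passed as inference.conditions.relative_sasa_v2.rasa.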


Why This Matters

  • Precise control: Target specific functional groups (e.g., expose polar atoms, bury hydrophobic)
  • Training-consistent: No training/inference mismatch concerns
  • Backward compatible: Existing code continues to work

Validated on the two M3L orientation cases shown above. Happy to address any feedback!

- Clamp bucket indices to valid range [0, n-1]
- Fix off-by-one error in bucket assignment
- Ensure values below 'low' go to first bucket
- Ensure values above 'high' go to last bucket
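
The clamping behaviour listed above can be illustrated with a small stand-alone sketch. It assumes equal-width buckets between `low` and `high`; the actual signature and tensor types of one_hot_buckets() are not shown in this PR text, so `one_hot_buckets_sketch` and its parameters are stand-ins in plain Python.

```python
import math


def bucket_index(value, low, high, n_buckets):
    """Index of the equal-width bucket for value, clamped to [0, n_buckets - 1]."""
    width = (high - low) / n_buckets
    raw = math.floor((value - low) / width)
    # Values below `low` land in the first bucket, values at or above
    # `high` land in the last one, so the index is always valid.
    return min(max(raw, 0), n_buckets - 1)


def one_hot_buckets_sketch(values, low, high, n_buckets):
    """One-hot encode each value as a list of n_buckets floats (assumed layout)."""
    rows = []
    for value in values:
        row = [0.0] * n_buckets
        row[bucket_index(value, low, high, n_buckets)] = 1.0
        rows.append(row)
    return rows
```

Without the clamp, a value exactly equal to `high` would map to index `n_buckets`, one past the last valid bucket.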

Enable ligand atom information to be accessible during feature computation:
- Store metadata (including ligand_atom_names) in Sampler.sample_init()
- Pass metadata to get_extra_tXd_inference() calls via kwargs
- Enables features like atom-wise RASA to map atom names to values

This infrastructure is required for per-atom feature specification,
where different atoms in a ligand can have different conditioning values.

Add parse_atomwise_rasa_config() to support per-atom RASA values:
- Parse configuration strings like '0.0,O7:0.8,C8:1.0,C9:1.0'
- Map ligand atom names to specific RASA values
- Fall back to global RASA when atom-specific values not provided
- Add validation and logging for atom matching
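
A minimal sketch of the parsing step, assuming the first comma-separated token is the global value and every later token has the form ATOM:value. The real parse_atomwise_rasa_config() may differ in signature, return type, and validation; `parse_atomwise_rasa_sketch` and the range check are assumptions based on the 0.0 to 1.0 range stated in this PR.

```python
def parse_atomwise_rasa_sketch(config):
    """Split 'global,ATOM:value,...' into (global_rasa, {atom_name: value})."""
    tokens = [t.strip() for t in str(config).split(",") if t.strip()]
    if not tokens:
        raise ValueError("empty RASA specification")

    # First token is the global RASA; float() raises ValueError if malformed.
    global_rasa = float(tokens[0])

    overrides = {}
    for token in tokens[1:]:
        name, sep, raw = token.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"expected ATOM:value, got {token!r}")
        overrides[name.strip()] = float(raw)

    # Assumed validation: all values must lie in [0.0, 1.0].
    for value in [global_rasa, *overrides.values()]:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"RASA value {value} is outside [0.0, 1.0]")
    return global_rasa, overrides
```

A plain global specification such as '0.3' parses to an empty override map, which is how backward compatibility falls out of this format.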

Update get_relative_sasa_inference() to:
- Use new parsing function
- Accept metadata via kwargs
- Print summary statistics of applied RASA values
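
How the parsed values might then be mapped onto ligand atoms can be sketched as follows. The `ligand_atom_names` key comes from this PR, but the shape of the metadata dict, the function name `assign_ligand_rasa_sketch`, and the return values are assumptions for illustration only.

```python
def assign_ligand_rasa_sketch(global_rasa, overrides, **kwargs):
    """Return (per-atom values, override names that matched no ligand atom)."""
    # Assumed layout: metadata is a dict carrying a list of atom names.
    metadata = kwargs.get("metadata") or {}
    atom_names = [name.strip() for name in metadata.get("ligand_atom_names", [])]

    # Atoms without an explicit override fall back to the global value.
    values = [overrides.get(name, global_rasa) for name in atom_names]
    unmatched = sorted(set(overrides) - set(atom_names))

    if values:
        mean = sum(values) / len(values)
        print(f"RASA on {len(values)} ligand atoms: "
              f"min={min(values)}, max={max(values)}, mean={mean:.2f}")
    if unmatched:
        print(f"No ligand atom matched: {unmatched}")
    return values, unmatched
```

When no metadata is passed, the per-atom list is empty and every override is reported as unmatched, so a caller can fall back to the global value alone.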

@daylight-00 changed the title from "feat: Add atom-wise RASA conditioning for ligands" to "feat: Add atom-wise RASA conditioning for ligand orientation design" on Oct 30, 2025
@daylight-00

Resolves #11

@rclune assigned rclune, r-krishna, dtischer and w-ahern and unassigned rclune on Nov 10, 2025