
Add optional ML-based solvation correction pathway #2798


Draft: wants to merge 3 commits into base: main


BonhyeokKoo

Summary

This PR introduces an optional ML-based solvation correction pathway into RMG’s thermochemistry pipeline. It provides an interface for estimating solvation effects using a pre-trained ML model, while preserving the existing LSER (Linear Solvation Energy Relationship) method as a fallback mechanism.

Motivation or Problem

Currently, RMG uses LSER for solvation corrections to thermo. This PR adds the scaffolding for an ML-based alternative. At this stage, the ML component is a dummy implementation that always outputs zero corrections. This is intentional: actual ML functionality will be introduced later, after the transition to Python 3.11 for better compatibility.

Description of Changes

Add MLSolvation class to rmgpy/data/solvation.py

  • Introduced a new MLSolvation class that mirrors the structure of the existing MLEstimator class.
  • Accepts the same kind of mlSolvation block in the input file, with options like use_ml_solvation=True and name='solvation'.
  • Unlike MLEstimator, which resides in rmgpy/ml/estimator.py, this new class is defined within rmgpy/data/solvation.py.
  • The class exposes a get_solvation_correction() method, analogous to the one on SolvationDatabase, which returns a SolvationCorrection object.
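
As a rough illustration of the dummy behavior described in this list, a minimal sketch could look like the following. It is not the code in this PR: the `SolvationCorrection` stand-in is a simplified placeholder so the snippet runs on its own, and the constructor arguments, the `model_dir` parameter, and the field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SolvationCorrection:
    """Simplified stand-in for RMG's SolvationCorrection (field names assumed)."""
    enthalpy: float = 0.0
    gibbs: float = 0.0
    entropy: float = 0.0


class MLSolvation:
    """Dummy ML solvation estimator: loads no model and predicts zero corrections."""

    def __init__(self, use_ml_solvation=True, name="solvation", model_dir=None):
        # model_dir is a hypothetical argument; nothing is read from disk here.
        self.use_ml_solvation = use_ml_solvation
        self.name = name
        self.model_dir = model_dir

    def get_solvation_correction(self, solute_data=None, solvent_data=None):
        # Placeholder prediction: every correction term is zero until a real
        # model is wired in.
        return SolvationCorrection(enthalpy=0.0, gibbs=0.0, entropy=0.0)
```

Returning a zero-valued correction object, rather than `None`, would let downstream code treat the dummy and a future real model identically.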

Add ml_solvation() function to rmgpy/rmg/input.py

  • Mirrors the implementation of the ml_estimator() block and allows ML solvation to be optionally configured via the input file.
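
For orientation, the handler and its input-file block might be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: `_RMGStub` stands in for the RMG object that `rmgpy/rmg/input.py` configures, and the validation and the stored dictionary are hypothetical. Only the option names `use_ml_solvation` and `name` come from the description above.

```python
class _RMGStub:
    """Hypothetical stand-in for the RMG object that input-file blocks configure."""


rmg = _RMGStub()


def ml_solvation(use_ml_solvation=True, name="solvation"):
    """Record the ML solvation settings on the RMG object (sketch only)."""
    if not isinstance(use_ml_solvation, bool):
        raise TypeError("use_ml_solvation must be a boolean")
    # Stored as a plain dict here; the real code may build an MLSolvation object.
    rmg.ml_solvation = {"use_ml_solvation": use_ml_solvation, "name": name}


# In an input file the block would be written under its camelCase name,
# assumed here to map directly onto the function above.
mlSolvation = ml_solvation
mlSolvation(use_ml_solvation=True, name="solvation")
```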

Modify rmgpy/thermo/thermoengine.py to support fallback logic

  • At the point where solvation corrections are applied to thermo, the code now tries to use the ML-based model via get_input("ml_solvation").
  • If this fails (e.g., due to configuration or import issues), the code gracefully falls back to the default LSER method.
  • This behavior is wrapped in a try-except block.
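
The fallback described in this list can be sketched roughly as follows. The names `apply_solvation_correction` and `lser_correction` are hypothetical, and `get_input` is passed in as a parameter so the snippet runs without RMG installed; the actual code in `thermoengine.py` may be organized differently.

```python
import logging


def lser_correction(species):
    """Hypothetical stand-in for the existing LSER-based correction."""
    return ("LSER", 0.0)


def apply_solvation_correction(species, get_input):
    """Try the ML pathway first; fall back to LSER if anything goes wrong."""
    try:
        ml_solvation = get_input("ml_solvation")
        return ("ML", ml_solvation.get_solvation_correction(species))
    except Exception as exc:  # broad on purpose: configuration or import problems
        logging.warning("ML solvation correction not used: %s", exc)
        return lser_correction(species)
```

One design consideration: catching a broad `Exception` keeps the fallback robust, but it can also mask genuine bugs in the ML pathway, so narrowing the exception types may be worth doing once the real model lands.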

Testing

  • Two minimal examples were tested:
    1. Without an mlSolvation block → expected warning:
      Warning: ML solvation correction not used: 'RMG' object has no attribute 'ml_solvation'
      
      The system then falls back to LSER as expected.
    2. With an mlSolvation block → dummy ML model successfully invoked:
      [NOTICE] Dummy ML model loaded from: /Users/bon/rmg/RMG-database/input/thermo/ml/solvation
      [NOTICE] Dummy ML model utilized
      

Reviewer Tips

  • Please check whether the following imports are appropriate and safe:
    • from rmgpy.data.solvation import SolvationCorrection in solvation.py
    • from rmgpy.rmg.input import get_input in thermoengine.py
  • Note that MLSolvation is currently a dummy implementation and does not load an actual model. It is intended to be upgraded in future commits after the Python 3.11 migration.

@JacksonBurns
Contributor

JacksonBurns commented May 29, 2025

hi @BonhyeokKoo welcome to RMG world! Please tag me in this PR as needed, and as a reviewer when the time comes.

Following up on some offline discussion, the following things need to happen to unblock this PR by adding support for Python 3.11:
