Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: expected only 1 added H per heavy atom, maybe deleted element had double bond to this heavy atom #267

Open
accibio opened this issue Dec 7, 2024 · 7 comments · May be fixed by #273

Comments

@accibio
Copy link

accibio commented Dec 7, 2024

Hi, I'm trying to prepare ligands for virtual screening. I have a multi-molecule SDF file of ligands prepared using RDKit which I need to split into multiple PDBQT files to serve as input for QuickVina2-GPU-2.1.

I tried running this:

mk_prepare_ligand.py -i compounds.sdf --rigid_macrocycles --multimol_outdir .

It runs for a while and then throws this error:

RuntimeError: expected only 1 added H per heavy atom, maybe deleted element had double bond to this heavy atom

I'm guessing there's something wrong with some compound which I'm unable to figure out. I have attached the multi-molecule SDF file for reference. Please help.
Thanks.
compounds.zip

@rwxayheee
Copy link
Contributor

Hi @accibio
It seems to be NSC36548. It seems to be a very distorted structure containing metal (Zn)

@diogomart
Copy link
Contributor

Not sure what is going on. It's possible that it's something on RDKit's end. Atom 24 in NSC36548 is a Sulfur with -1 formal charge and a dative bond to Zn (atom 25). After Zn is removed in RDKitMoleculeSetup.remove_elements, Sulfur 24 gets two Hs, but I expected it to get just one. The error was a sanity check that caught this unexpected double H addition. This problem does not happen with a minimal example based on CN(C)C(=S1)[S-]->[Zn+2]12S=C(S2)N(C)C but the S with a dative bond to Zn in this minimal example unexpectedly becomes a radical after Zn is removed and before Chem.AddHs is called.

@rwxayheee
Copy link
Contributor

rwxayheee commented Dec 9, 2024

Part of the bond order, formal charge, or number of hydrogens don't add up in the input structure (which might also explain why the generated structure is distorted). The formal total charge of the complex should be +2.

Our current charge assignment method only supports single bonds with metals. If we must go with this complex, the SDF needs four single bonds (like unpolarized iconic bonds). Each sulfur has a formal charge of +1 and Zn will be -2. After these corrections, this SDF input will pass the current charge assignment method:
NSC36548_single_bonds.sdf.txt
Output PDBQT:
NSC36548_single_bonds.pdbqt.txt

It seems like removal of the metal element (if it's the acceptor of dative bonds) means the donor atom now has two (a pair of) extra available elections. They are assumed unpaired and each needs to be fullfilled with a bond with an added H (?). I can do a little more work and see if we can support dative bonds in the charging method. But again, the input structure needs to have the correct total charges and chemical structure.

@accibio
Copy link
Author

accibio commented Dec 9, 2024

Thanks a lot for all the valuable and insightful information. I must admit that many of these details are quite overwhelming for me. I would rather prefer to remove molecules with such abnormalities than attempting to fix them. Is there a way to identify such molecules and remove them in RDKit or Meeko? That would save a lot of time. Removing all molecules with metal atoms is one option, but I might end up losing a lot of potentially important molecules.

@diogomart
Copy link
Contributor

We should put molecule preparation inside a try/except statement and exclude "bad" molecules automatically (#271). But that isn't implemented right now. Removing molecules with metals sounds reasonable given that their geometries (only looked at a couple) are severely distorted, so docking wouldn't produce meaningful results.

@rwxayheee
Copy link
Contributor

rwxayheee commented Dec 9, 2024

Hi @accibio
We just implemented the try/except in the command-line script as suggested by @diogomart. You could check out the updated version of mk_prepare_ligand.py from the develop branch of this repository. For your input SDF, you will get a table of summary after the execution:

Input molecules processed: 2467, skipped: 0
PDBQT files written: 2467
PDBQT files not written due to error: 50
Input molecules with errors: 50
Warning: 142 molecules have duplicated names
Some molecules encountered errors!

There are various possible reasons for the failed preparation, not limited to unphysical charge/valence state. In this input file specifically, quite a few ligands didn't pass because element As does not have an ad4 atom type.

Note that Meeko's current ligand preparation does not flag distorted geometry. Still, ligand geometry is important and if you have control over the source or the generation process of your compounds structure file, I recommend doing a detailed analysis of the compounds geometries in RDKit or other tools (not just counting the bad ones). A basic method would be to evaluate the bond lengths (just use some crude criteria don't have to be super accurate) of all registered bonds per molecule. This will allow you to quickly weed out the compounds that are highly distorted.

@diogomart
Copy link
Contributor

The feature is available in the develop branch, see instructions to install from source:
https://meeko.readthedocs.io/en/release/installation.html#from-source

@rwxayheee rwxayheee linked a pull request Dec 11, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants