chembl_gen_check is a Python library that uses lightweight MolBloom filters to rapidly check whether scaffolds, generic scaffolds, or ring systems occur in ChEMBL (or SureChEMBL) structures. The library can also indicate whether a compound has uncommon bonds according to the LACAN algorithm, or whether it triggers a structural alert. Taken together, these checks give a rapid assessment of whether ring systems and scaffolds are reasonable and whether atom and bond environments have precedent.
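To illustrate the kind of membership test a Bloom filter provides, here is a toy sketch using only the Python standard library. It is not MolBloom's implementation: the class name `ToyBloomFilter`, its parameters, and the small scaffold list are all hypothetical. The general property it shows is that an added item is always reported as present, while an item that was never added is usually, but not always, reported as absent.

```python
import hashlib


class ToyBloomFilter:
    """Toy Bloom filter for strings (illustrative only, not MolBloom)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # Present only if all derived bits are set; false positives are possible.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical reference set: scaffold SMILES assumed to be canonicalized already.
known_scaffolds = ToyBloomFilter()
for scaffold in ["c1ccccc1", "c1ccc(C2CC2)cc1", "c1ccncc1"]:
    known_scaffolds.add(scaffold)

print("c1ccccc1" in known_scaffolds)  # True: added items are always found
```

Because lookups only test bits, the filter stays small and fast however many structures were added. A `True` result should therefore be read as "very likely present" rather than as a guarantee.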
pip install chembl-gen-check
from chembl_gen_check import Checker
checker = Checker("chembl")
# Alternatively, check against SureChEMBL instead:
# checker = Checker("surechembl")
smiles = "CCN(CC)C(=O)C[C@H]1C[C@@H]1c1ccccc1"
checker.load_smiles(smiles)
# Murcko scaffold found in the loaded database (True/False)
checker.check_scaffold()
# Generic Murcko scaffold found in the loaded database (True/False)
checker.check_skeleton()
# All molecule ring systems found in the loaded database (True/False)
checker.check_ring_systems()
# Number of structural alerts using the ChEMBL set in RDKit (integer)
checker.check_structural_alerts()
# LACAN score > 0.5 using the loaded database (True/False)
checker.check_lacan() > 0.5

Code to extract ring systems adapted from: W Patrick Walters. useful_rdkit_utils
Code to calculate LACAN scores adapted from: Dehaen, W. LACAN. https://github.com/dehaenw/lacan/
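
As a usage sketch building on the calls shown above, the individual checks can be combined into a simple filter for a list of generated SMILES. The helper names (`run_checks`, `passes_all`, `filter_smiles`) are hypothetical and not part of the library. The pass criteria (all database lookups true, zero structural alerts, LACAN score above 0.5) are one possible choice, not a recommendation from the library. How `load_smiles` behaves on invalid input is not covered here.

```python
from typing import Dict, Iterable, List


def run_checks(checker, smiles: str) -> Dict[str, object]:
    """Run every check from the usage example on one SMILES string."""
    checker.load_smiles(smiles)
    return {
        "scaffold": checker.check_scaffold(),
        "skeleton": checker.check_skeleton(),
        "ring_systems": checker.check_ring_systems(),
        "alerts": checker.check_structural_alerts(),
        "lacan_ok": checker.check_lacan() > 0.5,
    }


def passes_all(report: Dict[str, object]) -> bool:
    """Assumed criteria: all lookups succeed, no alerts, LACAN above 0.5."""
    return bool(
        report["scaffold"]
        and report["skeleton"]
        and report["ring_systems"]
        and report["alerts"] == 0
        and report["lacan_ok"]
    )


def filter_smiles(checker, smiles_list: Iterable[str]) -> List[str]:
    """Keep only the SMILES strings whose report passes every criterion."""
    return [s for s in smiles_list if passes_all(run_checks(checker, s))]


# With the library installed, the checker would be created as shown above:
# from chembl_gen_check import Checker
# kept = filter_smiles(Checker("chembl"), ["CCN(CC)C(=O)C[C@H]1C[C@@H]1c1ccccc1"])
```

Keeping the per-molecule report separate from the pass/fail decision makes it easy to loosen a single criterion, for example by tolerating one structural alert, without changing how the checks are run.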