
rdkit update breaks entry validation #62

@mmzdouc


Problem description

With rdkit==2024.3.6, validation passes for all MITE entries, i.e. applying the reaction SMARTS to the substrate(s) yields the expected product(s) [substrate(s)>reactionSMARTS>product(s)]. With more recent versions of rdkit, validation fails for several MITE entries. The cause is unknown, but it presumably involves the handling of chirality.
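For readers unfamiliar with this kind of check, it can be sketched with RDKit roughly as follows. This is a minimal illustration, not the MITE project's actual validation code: the function name `validate_reaction` and the toy epoxidation SMARTS are assumptions made for the example, and it handles only single-substrate reactions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def validate_reaction(reaction_smarts, substrate_smiles, expected_smiles):
    """Apply a reaction SMARTS to one substrate and compare the products.

    Returns (ok, expected, generated), where the last two are sets of
    canonical SMILES and ok is True if every expected product was generated.
    """
    rxn = AllChem.ReactionFromSmarts(reaction_smarts)
    substrate = Chem.MolFromSmiles(substrate_smiles)
    if substrate is None:
        raise ValueError(f"Could not parse substrate: {substrate_smiles}")

    generated = set()
    for product_set in rxn.RunReactants((substrate,)):
        for product in product_set:
            try:
                # Products come back unsanitized; skip chemically invalid ones.
                Chem.SanitizeMol(product)
            except Exception:
                continue
            generated.add(Chem.MolToSmiles(product))

    # Canonicalize the expected products so string comparison is meaningful.
    expected = {Chem.CanonSmiles(smi) for smi in expected_smiles}
    return expected.issubset(generated), expected, generated


# Toy example (an assumption, not a MITE entry): epoxidation of ethene.
ok, expected, generated = validate_reaction(
    "[C:1]=[C:2]>>[C:1]1O[C:2]1", "C=C", ["C1OC1"]
)
print(ok, expected, generated)
```

Because the comparison is made on canonical isomeric SMILES, a single inverted stereocenter in a generated product is enough to make such a check fail.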

Example

reaction SMARTS: [#6:1]1(:[#6:36]:[#7:35;h1]:[#6:30]2:[#6:31]:[#6:32]:[#6:33]:[#6:34]:[#6:29]:1:2)-[#6:2]\\[#6@:3]1-[#6@:13]2-[#6@@:7]3(-[#6:27](=[#8:28])-[#6:26]=[#6:25]-[#6:24]-[#6:23]-[#6:21](-[#6:22])=[#6:20]-[#6@@:18](/[#6:19])-[#6:17]-[#6:16]=[#6:15]/[#6@:8]-3-[#6:9]=[#6:10](-[#6:14])-[#6@:11]/2/[#6:12])/[#6:5](=[#8:6])-[#7:4]-1>>[#6:1]1(:[#6:36]:[#7:35;h1]:[#6:30]2:[#6:31]:[#6:32]:[#6:33]:[#6:34]:[#6:29]:1:2)-[#6:2]\\[#6@:3]1-[#6@:13]2-[#6@@:7]3(-[#6:27](=[#8:28])-[#6:26]=[#6:25]-[#6:24]-[#6:23]-[#6:21](-[#6:22])=[#6:20]-[#6@@:18](/[#6:19])-[#6:17]-[#6:16]=[#6:15]/[#6@:8]-3-[#6@@:9]3\\[#8]-[#6@:10]-3(/[#6:14])-[#6@:11]/2/[#6:12])/[#6:5](=[#8:6])-[#7:4]-1

substrate: c1(C[C@@H]2NC(=O)[C@@]34[C@H](C=C([C@@H](C)C23)C)C=CC[C@H](C)C=C(C)CCC=CC4=O)c2c(cccc2)[nH]c1 |c:21,t:16,26|
expected product: c1(C[C@H]2C3[C@H](C)[C@]4(C)[C@@H](O4)[C@H]4[C@@]3(C(N2)=O)C(C=CCCC(C)=C[C@H](CC=C4)C)=O)c2c(cccc2)[nH]c1

Error with rdkit>2024.3.6

ValueError: Reaction did not lead to all expected products.
Expected products:
CC1=C[C@@H](C)CC=C[C@H]2[C@@H]3O[C@]3(C)[C@@H](C)C3[C@H](Cc4c[nH]c5ccccc45)NC(=O)[C@]32C(=O)C=CCC1
Generated products:
CC1=C[C@@H](C)CC=C[C@H]2[C@H]3O[C@]3(C)[C@@H](C)C3[C@H](Cc4c[nH]c5ccccc45)NC(=O)[C@]32C(=O)C=CCC1

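Comparing the expected and generated products, the only difference is the chirality of a single stereocenter: the epoxide carbon written as [C@@H]3O in the expected product appears as [C@H]3O in the generated product (@@ became @).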
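The difference is hard to spot by eye, so here is a small sketch that locates it programmatically. It uses only the standard library and a rough SMILES tokenizer; the helper names `tokenize` and `diff_tokens` are made up for this illustration. Position-by-position comparison is only meaningful because both strings list the atoms in the same order, in which case a differing `@`/`@@` tag indicates an inverted configuration at that atom.

```python
import re

# Rough SMILES tokenizer: bracket atoms, two-letter halogens,
# two-digit ring closures, then any single character.
_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into atom, bond, branch and ring-closure tokens."""
    return _TOKEN.findall(smiles)


def diff_tokens(expected, generated):
    """Return (index, expected_token, generated_token) for each mismatch."""
    exp, gen = tokenize(expected), tokenize(generated)
    if len(exp) != len(gen):
        raise ValueError("Token counts differ; the strings cannot be aligned.")
    return [(i, a, b) for i, (a, b) in enumerate(zip(exp, gen)) if a != b]


# The two product SMILES from the error message above.
expected = (
    "CC1=C[C@@H](C)CC=C[C@H]2[C@@H]3O[C@]3(C)[C@@H](C)C3"
    "[C@H](Cc4c[nH]c5ccccc45)NC(=O)[C@]32C(=O)C=CCC1"
)
generated = (
    "CC1=C[C@@H](C)CC=C[C@H]2[C@H]3O[C@]3(C)[C@@H](C)C3"
    "[C@H](Cc4c[nH]c5ccccc45)NC(=O)[C@]32C(=O)C=CCC1"
)

differences = diff_tokens(expected, generated)
print(differences)  # → [(15, '[C@@H]', '[C@H]')]
```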

This is possibly related to #60 and atropisomeric bonds.

Downstream problems and possible solutions

Currently, this issue is mitigated by pinning rdkit==2024.3.6. However, the pin restricts the MITE project to Python >=3.12.0,<3.13.0, which is not sustainable.

The proposed solution is to manually fix the entries/reaction SMARTS that fail validation with rdkit>2024.3.6.
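One way to organise the manual fix is to first collect every reaction that fails under the newer rdkit. The sketch below assumes a hypothetical data layout (a dict mapping an entry identifier to a list of reaction records) and a hypothetical `validate` callable that raises `ValueError` on a mismatch, as in the error shown above. Neither is taken from the MITE codebase.

```python
def find_failing_entries(entries, validate):
    """Run `validate` on every reaction and collect the failures.

    entries:  dict mapping an entry identifier to a list of reaction records
              (hypothetical layout, assumed for this sketch).
    validate: callable that raises ValueError when a reaction does not
              produce its expected products (hypothetical interface).
    Returns a dict mapping each failing identifier to a list of
    (reaction_index, error_message) tuples.
    """
    failures = {}
    for identifier, reactions in entries.items():
        for index, reaction in enumerate(reactions):
            try:
                validate(reaction)
            except ValueError as err:
                failures.setdefault(identifier, []).append((index, str(err)))
    return failures


# Usage with a stand-in validator; a real one would call RDKit.
def fake_validate(reaction):
    if reaction == "bad":
        raise ValueError("Reaction did not lead to all expected products.")


report = find_failing_entries(
    {"ENTRY_A": ["ok", "bad"], "ENTRY_B": ["ok"]}, fake_validate
)
print(report)
```

Running such a report once with the pinned rdkit and once with a newer release would give the list of entries whose reaction SMARTS need attention.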
