
Implicit and explicit H's

Open EBjerrum opened this issue 1 year ago • 5 comments

There's no way to correct or set explicit H's. This can disturb kekulization, e.g. in aromatic heterocycles.
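A minimal sketch of the kind of failure meant here, using pyrrole as an assumed example (the molecule is not from the original report). Without the hydrogen on the aromatic nitrogen, RDKit cannot kekulize the ring and `Chem.MolFromSmiles` returns `None`:

```python
from rdkit import Chem, RDLogger

# Silence RDKit's error log so the failed parse below stays quiet.
RDLogger.DisableLog("rdApp.*")

# Aromatic five-membered ring whose nitrogen carries no hydrogen:
# RDKit cannot assign alternating single/double bonds to it.
no_h = Chem.MolFromSmiles("c1ccnc1")

# The same ring with one explicit H on the nitrogen (pyrrole) parses fine.
with_h = Chem.MolFromSmiles("c1cc[nH]c1")

print("without H on N parsed:", no_h is not None)
print("with H on N parsed:", with_h is not None)
```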

EBjerrum • Jul 27 '24 17:07

I was working on some code to automatically fix the hydrogens and atom valence. Unfortunately, it was really unreliable and I never fully figured out how RDKit calculates the implicit valence.
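For reference, a small sketch of the three places RDKit can record a hydrogen on an atom: the explicit H count, the computed implicit H count, and real H atoms in the graph. Methylamine and the helper name `h_counts` are assumptions chosen for illustration, not code from the attempt described above:

```python
from rdkit import Chem

def h_counts(atom):
    """Return (explicit H count, implicit H count, H atoms bonded in the graph)."""
    graph_hs = sum(1 for nb in atom.GetNeighbors() if nb.GetAtomicNum() == 1)
    return atom.GetNumExplicitHs(), atom.GetNumImplicitHs(), graph_hs

# Unbracketed N: hydrogens are inferred from the valence model (implicit).
implicit_n = Chem.MolFromSmiles("CN").GetAtomWithIdx(1)

# Bracketed N: the H count is stored on the atom (explicit), nothing is inferred.
explicit_n = Chem.MolFromSmiles("C[NH2]").GetAtomWithIdx(1)

# After AddHs the hydrogens are real atoms (graph nodes) bonded to N.
with_hs = Chem.AddHs(Chem.MolFromSmiles("CN"))
graph_n = with_hs.GetAtomWithIdx(1)

for label, atom in [("CN", implicit_n), ("C[NH2]", explicit_n), ("AddHs(CN)", graph_n)]:
    print(label, h_counts(atom))
```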

jacktday • Dec 08 '24 11:12

There is a dirty hack where you export to SMILES and then reimport it. RDKit will then fix the hydrogens for you. I don’t really like it, but it works.
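Roughly, the round trip looks like the sketch below. `roundtrip_via_smiles` is a hypothetical helper name, and ethanol is just an assumed test molecule. The sketch also prints two things that do not survive the trip: the atom order and the 2D coordinates.

```python
from rdkit import Chem
from rdkit.Chem import rdDepictor

def roundtrip_via_smiles(mol):
    """Rebuild the molecule from its own SMILES so RDKit recomputes H counts."""
    return Chem.MolFromSmiles(Chem.MolToSmiles(mol))

# Ethanol written oxygen-first, with 2D coordinates as an editor would hold them.
mol = Chem.MolFromSmiles("OCC")
rdDepictor.Compute2DCoords(mol)

rebuilt = roundtrip_via_smiles(mol)

# Atom 0 before and after: canonical SMILES reorders the atoms.
print(mol.GetAtomWithIdx(0).GetSymbol(), rebuilt.GetAtomWithIdx(0).GetSymbol())
# Conformers before and after: coordinates are not part of SMILES.
print(mol.GetNumConformers(), rebuilt.GetNumConformers())
```

In an editor this means atom indices held by the drawing layer can end up pointing at different atoms after the round trip, and the layout has to be regenerated.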

jacktday • Dec 08 '24 12:12

I wouldn't like that hack either. The goal is to stay as true to the molecular graph as possible, so a round trip through SMILES, with the information loss that can occur, is not the way forward. The distinction between explicit H counts, implicit H counts, and actual H atoms on the graph is very confusing indeed, and is technical debt that would probably be designed differently if it were reimplemented today. What we need instead is a method to edit these counts via the interface so that the graph can be manipulated properly.
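At the RDKit level, such an edit could look something like the sketch below. `set_explicit_hs` is a hypothetical helper, not an existing rdeditor or RDKit function; it relies only on `Atom.SetNumExplicitHs`, `Atom.SetNoImplicit`, and `Chem.SanitizeMol`. The unsanitized pyrrole ring is an assumed test case.

```python
from rdkit import Chem

def set_explicit_hs(mol, atom_idx, num_hs):
    """Return a sanitized copy of mol with a fixed H count on one atom.

    Hypothetical helper: works on a copy and never goes through SMILES,
    so atom indices are preserved.
    """
    rw = Chem.RWMol(mol)
    atom = rw.GetAtomWithIdx(atom_idx)
    atom.SetNumExplicitHs(num_hs)  # store the H count on the atom itself
    atom.SetNoImplicit(True)       # stop RDKit from inferring additional Hs
    Chem.SanitizeMol(rw)           # raises if the result is still invalid
    return rw.GetMol()

# Unsanitized aromatic ring whose nitrogen (index 3) is missing its hydrogen.
broken = Chem.MolFromSmiles("c1ccnc1", sanitize=False)
repaired = set_explicit_hs(broken, atom_idx=3, num_hs=1)

print(Chem.MolToSmiles(repaired))
print(repaired.GetAtomWithIdx(3).GetSymbol(), repaired.GetAtomWithIdx(3).GetTotalNumHs())
```

A GUI action would still need to decide which atom and which count to set; the sketch only covers the graph manipulation itself.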

EBjerrum • Dec 09 '24 13:12

The more I look into solutions for this, the more attractive it becomes to write a new library. RDKit is some pretty nasty code.

There are also a lot of edge cases around aromaticity, which makes it really hard to determine how to fix the valence. You need to be aware of how each atom relates to the rings it sits in, and I don’t think that information is available in RDKit.
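For what it's worth, RDKit does expose basic ring membership through `Mol.GetRingInfo()` and `Atom.IsInRing()`; whether that is enough for fixing valences in the aromatic edge cases is a separate question. A minimal sketch, with indole as an assumed example:

```python
from rdkit import Chem

# Indole: a six-membered ring fused to a five-membered ring.
mol = Chem.MolFromSmiles("c1ccc2[nH]ccc2c1")
ring_info = mol.GetRingInfo()

# Size of each ring found by RDKit's ring perception.
ring_sizes = sorted(len(ring) for ring in ring_info.AtomRings())

# Atoms that belong to more than one ring, i.e. the fusion atoms.
fused_atoms = [
    atom.GetIdx()
    for atom in mol.GetAtoms()
    if ring_info.NumAtomRings(atom.GetIdx()) > 1
]

for atom in mol.GetAtoms():
    idx = atom.GetIdx()
    print(idx, atom.GetSymbol(), atom.IsInRing(),
          ring_info.NumAtomRings(idx), atom.GetIsAromatic())
```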

jacktday • Dec 09 '24 13:12

This is still a work in progress, but it starts to simplify templates. It doesn't actually fix the hydrogen issue yet.

https://github.com/jacktday/rdeditor/tree/custom_reactions_for_better_templates

jacktday • Dec 10 '24 09:12