Generating SMILES without Chirality Information
I'd like to express my gratitude for your outstanding work on this project. It has been incredibly helpful in my research.
However, I noticed that the generated SMILES strings do not include chirality information, which makes them somewhat inconsistent with the USPTO-50k test set.
Could you please provide some guidance on how to generate SMILES strings with the appropriate chirality information? I would like to evaluate the results with Chem.MolToSmiles(mol, isomericSmiles=True).
Again, thank you for your outstanding work~
Hi @Jesse-zjx, thanks for your message and sorry for the slow response. Unfortunately, I'm afraid this is not possible because we never used chirality information in this project.
Also, note that USPTO-50k has a considerable proportion of stereochemistry errors, which, for example, prevent models from ever reaching 100% accuracy on the dataset.
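Since the generated SMILES carry no chirality, one possible workaround is to compare predictions and references with stereochemistry stripped from both sides. The following is only a minimal sketch, assuming RDKit is installed; the helper names canonical_smiles and same_molecule are hypothetical and not part of this project, and the alanine strings are made-up inputs for illustration.

```python
from rdkit import Chem


def canonical_smiles(smiles, keep_stereo):
    """Return RDKit canonical SMILES, or None if the input cannot be parsed."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # isomericSmiles=False drops stereo (and isotope) labels from the output.
    return Chem.MolToSmiles(mol, isomericSmiles=keep_stereo)


def same_molecule(pred, target, keep_stereo=False):
    """Compare two SMILES strings after canonicalizing both the same way."""
    p = canonical_smiles(pred, keep_stereo)
    t = canonical_smiles(target, keep_stereo)
    return p is not None and p == t


# Illustrative inputs: a prediction without stereo, a reference with stereo.
pred = "CC(N)C(=O)O"
target = "C[C@H](N)C(=O)O"

print(same_molecule(pred, target, keep_stereo=False))  # True
print(same_molecule(pred, target, keep_stereo=True))   # False
```

With keep_stereo=True the comparison matches the Chem.MolToSmiles(mol, isomericSmiles=True) evaluation mentioned above, so any reference molecule with a defined stereocenter counts as a miss for a stereo-free prediction.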
@arneschneuing @mrwnmsr Thank you for your patience in answering. BTW, could you provide references or sources for "uspto50k has a considerable proportion of stereochemistry errors"?
source: own analysis ;)
LOL, thank you for your analysis of the stereochemistry in USPTO-50k, which provides some basis for our follow-up research.
Could you share the methods, ideas, or tools you used to analyze the stereochemistry? This would be of great help to us.