Carboxylates from mol2 files produced by tleap have incorrect bond orders and formal charges
Describe the bug
Carboxylate functional groups are read incorrectly from mol2 files produced by Amber/tleap. Both carbon-oxygen bonds of the carboxylate are specified with bond order 1, so the carbon and each oxygen are assigned a formal charge of -1 and the functional group has a total charge of -3. The expected total charge is -1, with the two oxygens made equivalent by resonance.
To Reproduce
ASP.mol2 is the aspartate dipeptide from the OpenFF port of Amber ff14SB (AllDipeptides/MainChain/ASP/ASP.mol2 in AllDipeptides.tar.gz) and was produced by tleap. The file is reproduced below.
@<TRIPOS>MOLECULE
default_name
24 23 1 0 0
SMALL
No Charge or Current Charge
@<TRIPOS>ATOM
1 H1 25.9290 25.1730 24.5000 H 1 ACE 0.112300
2 CH3 25.9910 26.1800 24.0920 C.3 1 ACE -0.366200
3 H2 25.3370 26.8470 24.6490 H 1 ACE 0.112300
4 H3 25.7100 26.1710 23.0410 H 1 ACE 0.112300
5 C 27.4140 26.6690 24.2150 C.2 1 ACE 0.597200
6 O 28.2640 25.9470 24.7160 O.2 1 ACE -0.567900
7 N 27.6440 27.8970 23.7640 N.am 2 ASP -0.516300
8 H 26.8660 28.4410 23.4250 H 2 ASP 0.293600
9 CA 28.9330 28.6030 23.7820 C.3 2 ASP 0.038100
10 HA 29.5120 28.2670 24.6440 H 2 ASP 0.088000
11 CB 29.7230 28.2540 22.5020 C.3 2 ASP -0.030300
12 HB2 29.7230 27.1700 22.3750 H 2 ASP -0.012200
13 HB3 29.2100 28.6900 21.6420 H 2 ASP -0.012200
14 CG 31.1870 28.7210 22.5090 C.2 2 ASP 0.799400
15 OD1 31.6980 29.0740 23.5980 O.co2 2 ASP -0.801400
16 OD2 31.7820 28.7390 21.4100 O.co2 2 ASP -0.801400
17 C 28.6930 30.1240 23.9240 C.2 2 ASP 0.536600
18 O 27.5610 30.5970 23.7680 O.2 2 ASP -0.581900
19 N 29.7400 30.8950 24.2340 N.am 3 NME -0.415700
20 H 30.6470 30.4250 24.2560 H 3 NME 0.271900
21 CH3 29.7000 32.3450 24.3910 C.3 3 NME -0.149000
22 HH31 28.8290 32.6270 24.9840 H 3 NME 0.097600
23 HH32 30.6080 32.6840 24.8890 H 3 NME 0.097600
24 HH33 29.6290 32.8110 23.4080 H 3 NME 0.097600
@<TRIPOS>BOND
1 5 6 2
2 5 7 am
3 2 3 1
4 2 4 1
5 2 5 1
6 1 2 1
7 17 18 2
8 17 19 am
9 14 15 1
10 14 16 1
11 11 12 1
12 11 13 1
13 11 14 1
14 9 10 1
15 9 11 1
The following Python code reproduces the incorrect behavior.
from openff.toolkit.topology import Molecule
mol = Molecule.from_file('ASP.mol2')
print(mol.total_charge)
print(mol.to_smiles())
Output
-3.0 e
[H][C@@](C(=O)N([H])C([H])([H])[H])(C([H])([H])[C-]([O-])[O-])N([H])C(=O)C([H])([H])[H]
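The -3 total in the output above is consistent with simple valence counting. The sketch below is only an illustration under a simplified model (an under-valent C or O carries one negative charge per missing bond); it is not the toolkit's actual charge-assignment algorithm, and the names `STANDARD_VALENCE` and `formal_charge` are hypothetical.

```python
# Simplified valence-counting model (an assumption for illustration,
# not the algorithm used by the OpenFF toolkit or OpenEye).
STANDARD_VALENCE = {"C": 4, "O": 2}


def formal_charge(element, bond_orders):
    """Formal charge of an atom whose missing bonds are filled by lone pairs."""
    return sum(bond_orders) - STANDARD_VALENCE[element]


# Carboxylate as written by tleap: CG-CB, CG-OD1 and CG-OD2 are all single bonds.
as_read = {
    "CG": formal_charge("C", [1, 1, 1]),
    "OD1": formal_charge("O", [1]),
    "OD2": formal_charge("O", [1]),
}

# One resonance structure of the intended carboxylate: one C=O and one C-O(-).
intended = {
    "CG": formal_charge("C", [1, 2, 1]),
    "OD1": formal_charge("O", [2]),
    "OD2": formal_charge("O", [1]),
}

print(sum(as_read.values()))   # -3
print(sum(intended.values()))  # -1
```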
Computing environment
- Operating system: Ubuntu 20.04.2
- Output of running conda list:
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
abseil-cpp 20210324.2 h9c3ff4c_0 conda-forge
alembic 1.7.5 pyhd8ed1ab_0 conda-forge
amberlite 16.0 pypi_0 pypi
ambertools 21.11 py39hc630cb1_0 conda-forge
amberutils 21.0 pypi_0 pypi
argcomplete 1.12.3 pyhd8ed1ab_2 conda-forge
argon2-cffi 21.1.0 py39h3811e60_2 conda-forge
arpack 3.7.0 hdefa2d7_2 conda-forge
arrow-cpp 6.0.1 py39h1d68239_0_cpu conda-forge
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
async_generator 1.10 py_0 conda-forge
attrs 21.2.0 pyhd8ed1ab_0 conda-forge
aws-c-auth 0.6.7 hfef2836_0 conda-forge
aws-c-cal 0.5.12 h70efedd_7 conda-forge
aws-c-common 0.6.17 h7f98852_0 conda-forge
aws-c-compression 0.2.14 h7c7754b_7 conda-forge
aws-c-event-stream 0.2.7 hb80ed28_31 conda-forge
aws-c-http 0.6.10 h58a30cf_2 conda-forge
aws-c-io 0.10.13 he836878_5 conda-forge
aws-c-mqtt 0.7.9 h042a236_0 conda-forge
aws-c-s3 0.1.27 hae5f17b_11 conda-forge
aws-c-sdkutils 0.1.1 h7c7754b_4 conda-forge
aws-checksums 0.1.12 h7c7754b_6 conda-forge
aws-crt-cpp 0.17.8 h82bac0c_1 conda-forge
aws-sdk-cpp 1.9.145 hfe59705_2 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
basis_set_exchange 0.9 pyhd8ed1ab_0 conda-forge
bcrypt 3.2.0 py39h3811e60_2 conda-forge
bleach 4.1.0 pyhd8ed1ab_0 conda-forge
blosc 1.21.0 h9c3ff4c_0 conda-forge
boost 1.74.0 py39h5472131_4 conda-forge
boost-cpp 1.74.0 h359cf19_5 conda-forge
brotli 1.0.9 h7f98852_6 conda-forge
brotli-bin 1.0.9 h7f98852_6 conda-forge
brotlipy 0.7.0 py39h3811e60_1003 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.18.1 h7f98852_0 conda-forge
ca-certificates 2021.10.8 ha878542_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cachetools 4.2.4 pyhd8ed1ab_0 conda-forge
cairo 1.16.0 ha00ac49_1009 conda-forge
certifi 2021.10.8 py39hf3d152e_1 conda-forge
cffi 1.15.0 py39h4bc2ebd_0 conda-forge
chardet 4.0.0 py39hf3d152e_2 conda-forge
click 8.0.3 pypi_0 pypi
codecov 2.1.11 pyhd3deb0d_0 conda-forge
colorama 0.4.4 pyh9f0ad1d_0 conda-forge
coverage 6.1.2 py39h3811e60_0 conda-forge
cryptography 35.0.0 py39h95dcef6_2 conda-forge
cudatoolkit 11.5.0 h36ae40a_9 conda-forge
curl 7.80.0 h2574ce0_0 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
cython 0.29.24 py39he80948d_1 conda-forge
debugpy 1.5.1 py39he80948d_0 conda-forge
decorator 5.1.0 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
double-conversion 3.1.5 h9c3ff4c_2 conda-forge
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
fftw 3.3.10 nompi_h74d3f13_101 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.13.1 hba837de_1005 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
fonttools 4.28.1 py39h3811e60_0 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
geometric 0.9.7.2 py_0 conda-forge
gettext 0.19.8.1 h73d1719_1008 conda-forge
gflags 2.2.2 he1b5a44_1004 conda-forge
glog 0.5.0 h48cff8f_0 conda-forge
grpc-cpp 1.41.1 h75e9d12_2 conda-forge
h5py 3.4.0 nompi_py39h7e08c79_102 conda-forge
hdf4 4.2.15 h10796ff_3 conda-forge
hdf5 1.12.1 nompi_h2750804_101 conda-forge
icu 69.1 h9c3ff4c_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 4.8.2 py39hf3d152e_0 conda-forge
importlib_metadata 4.8.2 hd8ed1ab_0 conda-forge
importlib_resources 5.4.0 pyhd8ed1ab_0 conda-forge
iniconfig 1.1.1 pyh9f0ad1d_0 conda-forge
ipykernel 6.5.0 py39hef51801_1 conda-forge
ipython 7.29.0 py39hef51801_2 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.6.5 pyhd8ed1ab_0 conda-forge
jbig 2.1 h7f98852_2003 conda-forge
jedi 0.18.1 py39hf3d152e_0 conda-forge
jinja2 3.0.3 pyhd8ed1ab_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
jsonschema 4.2.1 pyhd8ed1ab_0 conda-forge
jupyter_client 7.0.6 pyhd8ed1ab_0 conda-forge
jupyter_core 4.9.1 py39hf3d152e_1 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_widgets 1.0.2 pyhd8ed1ab_0 conda-forge
kiwisolver 1.3.2 py39h1a9c180_1 conda-forge
krb5 1.19.2 hcc1bbae_3 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libblas 3.9.0 12_linux64_openblas conda-forge
libbrotlicommon 1.0.9 h7f98852_6 conda-forge
libbrotlidec 1.0.9 h7f98852_6 conda-forge
libbrotlienc 1.0.9 h7f98852_6 conda-forge
libcblas 3.9.0 12_linux64_openblas conda-forge
libcurl 7.80.0 h2574ce0_0 conda-forge
libdeflate 1.8 h7f98852_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 h9b69904_4 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 11.2.0 h1d223b6_11 conda-forge
libgfortran-ng 11.2.0 h69a702a_11 conda-forge
libgfortran5 11.2.0 h5c6108e_11 conda-forge
libglib 2.70.1 h174f98d_0 conda-forge
libgomp 11.2.0 h1d223b6_11 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 12_linux64_openblas conda-forge
libnetcdf 4.8.1 nompi_hb3fd0d9_101 conda-forge
libnghttp2 1.43.0 h812cca2_1 conda-forge
libopenblas 0.3.18 pthreads_h8fe5266_0 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libpq 13.3 hd57d9b9_3 conda-forge
libprotobuf 3.18.1 h780b84a_0 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libssh2 1.10.0 ha56f1ee_2 conda-forge
libstdcxx-ng 11.2.0 he4da1e4_11 conda-forge
libthrift 0.15.0 he6d91bd_1 conda-forge
libtiff 4.3.0 h6f004c6_2 conda-forge
libutf8proc 2.6.1 h7f98852_0 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libwebp-base 1.2.1 h7f98852_0 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxml2 2.9.12 h885dcf4_1 conda-forge
libxslt 1.1.33 h0ef7038_3 conda-forge
libzip 1.8.0 h4de3113_1 conda-forge
libzlib 1.2.11 h36c2ea0_1013 conda-forge
lxml 4.6.4 py39h107f48f_0 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
lzo 2.10 h516909a_1000 conda-forge
mako 1.1.6 pyhd8ed1ab_0 conda-forge
markupsafe 2.0.1 py39h3811e60_1 conda-forge
matplotlib-base 3.5.0 py39h2fa2bec_0 conda-forge
matplotlib-inline 0.1.3 pyhd8ed1ab_0 conda-forge
mdtraj 1.9.7 py39h138c130_0 conda-forge
mistune 0.8.4 py39h3811e60_1005 conda-forge
mmpbsa-py 16.0 pypi_0 pypi
mock 4.0.3 py39hf3d152e_2 conda-forge
more-itertools 8.11.0 pyhd8ed1ab_0 conda-forge
msgpack-python 1.0.2 py39h1a9c180_2 conda-forge
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
nbclient 0.5.8 pyhd8ed1ab_0 conda-forge
nbconvert 6.3.0 py39hf3d152e_1 conda-forge
nbformat 5.1.3 pyhd8ed1ab_0 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
nest-asyncio 1.5.1 pyhd8ed1ab_0 conda-forge
netcdf-fortran 4.5.3 nompi_h2b6e579_106 conda-forge
networkx 2.6.3 pyhd8ed1ab_1 conda-forge
nglview 3.0.3 pyh8a188c0_0 conda-forge
nomkl 1.0 h5ca1d4c_0 conda-forge
notebook 6.4.6 pyha770c72_0 conda-forge
numexpr 2.7.3 py39hbd72853_102 conda-forge
numpy 1.21.4 py39hdbf815f_0 conda-forge
ocl-icd 2.3.1 h7f98852_0 conda-forge
ocl-icd-system 1.0.0 1 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
openeye-toolkits 2021.1.1 py39_0 openeye
openff-forcefields 2.0.0 pyh6c4a22f_0 conda-forge
openff-fragmenter 0.1.2 pyhd8ed1ab_0 conda-forge
openff-fragmenter-base 0.1.2 pyhd8ed1ab_0 conda-forge
openff-qcsubmit 0.3.0 pyhd8ed1ab_0 conda-forge
openff-toolkit 0.10.1 pyhd8ed1ab_0 conda-forge
openff-toolkit-base 0.10.1 pyhd8ed1ab_0 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openmm 7.5.1 py39h71eca04_1 conda-forge
openmmforcefields 0.9.0 pyhd8ed1ab_0 conda-forge
openssl 1.1.1l h7f98852_0 conda-forge
orc 1.7.1 h68e2c4e_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
packmol 20.010 h86c2bf4_0 conda-forge
packmol-memgen 1.2.1rc0 pypi_0 pypi
pandas 1.3.4 py39hde0f152_1 conda-forge
pandoc 2.16.1 h7f98852_0 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
parmed 3.4.3 py39he80948d_1 conda-forge
parquet-cpp 1.5.1 2 conda-forge
parso 0.8.2 pyhd8ed1ab_0 conda-forge
pcre 8.45 h9c3ff4c_0 conda-forge
pdb4amber 20.1 pypi_0 pypi
perl 5.32.1 1_h7f98852_perl5 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.4.0 py39ha612740_0 conda-forge
pint 0.18 pyhd8ed1ab_0 conda-forge
pip 21.3.1 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
plotly 5.4.0 pyhd8ed1ab_0 conda-forge
pluggy 1.0.0 py39hf3d152e_2 conda-forge
postgresql 13.3 h2510834_3 conda-forge
prometheus_client 0.12.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.22 pyha770c72_0 conda-forge
psutil 5.8.0 py39h3811e60_2 conda-forge
psycopg2 2.9.2 py39h3811e60_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
py 1.11.0 pyh6c4a22f_0 conda-forge
py-cpuinfo 8.0.0 pyhd8ed1ab_0 conda-forge
pyarrow 6.0.1 py39hff6fa39_0_cpu conda-forge
pycairo 1.20.1 py39hedcb9fc_1 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pydantic 1.8.2 py39h3811e60_2 conda-forge
pygments 2.10.0 pyhd8ed1ab_0 conda-forge
pyopenssl 21.0.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.6 pyhd8ed1ab_0 conda-forge
pyrsistent 0.18.0 py39h3811e60_0 conda-forge
pysocks 1.7.1 py39hf3d152e_4 conda-forge
pytables 3.6.1 py39h2669a42_4 conda-forge
pytest 6.2.5 py39hf3d152e_1 conda-forge
pytest-cov 3.0.0 pyhd8ed1ab_0 conda-forge
python 3.9.7 hb7a2778_3_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.9 2_cp39 conda-forge
pytraj 2.0.6 pypi_0 pypi
pytz 2021.3 pyhd8ed1ab_0 conda-forge
pyyaml 6.0 py39h3811e60_3 conda-forge
pyzmq 22.3.0 py39h37b5a0c_1 conda-forge
qcelemental 0.23.0 pyhd8ed1ab_0 conda-forge
qcengine 0.20.1 pyhd8ed1ab_0 conda-forge
qcfractal 0.15.7 py39hf3d152e_0 conda-forge
qcfractal-core 0.15.7 py39hf3d152e_0 conda-forge
qcportal 0.15.7 pyhd8ed1ab_0 conda-forge
rdkit 2021.09.2 py39hccf6a74_0 conda-forge
re2 2021.11.01 h9c3ff4c_0 conda-forge
readline 8.1 h46c0cb4_0 conda-forge
regex 2021.11.10 py39h3811e60_0 conda-forge
reportlab 3.5.68 py39he59360d_1 conda-forge
requests 2.25.1 pyhd3deb0d_0 conda-forge
requests-mock 1.9.3 pyhd8ed1ab_0 conda-forge
s2n 1.3.0 h9b69904_0 conda-forge
sander 16.0 pypi_0 pypi
scipy 1.7.2 py39hee8e79c_0 conda-forge
send2trash 1.8.0 pyhd8ed1ab_0 conda-forge
setuptools 59.1.1 py39hf3d152e_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
smirnoff99frosst 1.1.0 pyh44b312d_0 conda-forge
snappy 1.1.8 he1b5a44_3 conda-forge
sqlalchemy 1.3.23 py39h3811e60_0 conda-forge
sqlite 3.36.0 h9cd32fc_2 conda-forge
tenacity 8.0.1 pyhd8ed1ab_0 conda-forge
terminado 0.12.1 py39hf3d152e_1 conda-forge
testpath 0.5.0 pyhd8ed1ab_0 conda-forge
tinydb 4.5.2 pyhd8ed1ab_0 conda-forge
tk 8.6.11 h27826a3_1 conda-forge
toml 0.10.2 pyhd8ed1ab_0 conda-forge
tomli 1.2.2 pyhd8ed1ab_0 conda-forge
tornado 6.1 py39h3811e60_2 conda-forge
torsiondrive 1.1.0 pyhd8ed1ab_0 conda-forge
tqdm 4.62.3 pyhd8ed1ab_0 conda-forge
traitlets 5.1.1 pyhd8ed1ab_0 conda-forge
typing-extensions 4.0.0 hd8ed1ab_0 conda-forge
typing_extensions 4.0.0 pyha770c72_0 conda-forge
tzcode 2021e h7f98852_0 conda-forge
tzdata 2021e he74cb21_0 conda-forge
urllib3 1.26.7 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
wheel 0.37.0 pyhd8ed1ab_1 conda-forge
widgetsnbextension 3.5.2 py39hf3d152e_1 conda-forge
xmltodict 0.12.0 py_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-libxt 1.2.1 h7f98852_2 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
zeromq 4.3.4 h9c3ff4c_1 conda-forge
zipp 3.6.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h36c2ea0_1013 conda-forge
zstd 1.5.0 ha95c52a_0 conda-forge
Additional context
Reading carboxylate-containing mol2 files produced by tleap into OpenEye also produces an incorrect molecule with a total charge of -3. After writing the OpenEye molecule to SDF, the carboxylate bond orders are 5.
This problem and a solution are discussed in this issue from amber-ff-porting, but the problem is not documented in the OpenFF toolkit.
Thanks @chapincavender for reporting this and linking to the issue on amber-ff-porting.
I'm not sure what action to take on this. The mol2 format famously lacks a rigorous specification, and these two tools clearly have different interpretations of the molecule in the file. Documenting this ambiguity may be good, and in the future we could have [C-]([O-])([O-]) motifs lead to an informative error.
"in the future we could have [C-]([O-])([O-]) motifs lead to an informative error."
I think this is the best way to address this in the long run. Throw a warning or error that this chemistry is unlikely but is known to show up in molecules produced by this pathway.
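As a rough illustration of what such a check could look like, here is a minimal sketch that works directly on the mol2 text rather than on a toolkit molecule. It is not toolkit code: the function name `find_single_bonded_carboxylates` is hypothetical, and it assumes whitespace-separated ATOM and BOND records in the layout shown in the file above.

```python
import warnings


def find_single_bonded_carboxylates(mol2_text):
    """Return IDs of carbons bonded to two or more O.co2 atoms by bonds typed '1'."""
    atom_types = {}  # atom id -> SYBYL atom type
    bonds = []       # (origin atom id, target atom id, bond type)
    section = None
    for line in mol2_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("@<TRIPOS>"):
            section = line[len("@<TRIPOS>"):]
            continue
        fields = line.split()
        if section == "ATOM" and len(fields) >= 6:
            # atom_id name x y z atom_type ...
            atom_types[int(fields[0])] = fields[5]
        elif section == "BOND" and len(fields) >= 4:
            # bond_id origin target bond_type
            bonds.append((int(fields[1]), int(fields[2]), fields[3]))

    single_bonded_oxygens = {}  # carbon id -> number of single-bonded O.co2 neighbors
    for a, b, bond_type in bonds:
        if bond_type != "1":
            continue
        for carbon, oxygen in ((a, b), (b, a)):
            is_carbon = atom_types.get(carbon, "").split(".")[0] == "C"
            if is_carbon and atom_types.get(oxygen) == "O.co2":
                single_bonded_oxygens[carbon] = single_bonded_oxygens.get(carbon, 0) + 1
    return sorted(c for c, n in single_bonded_oxygens.items() if n >= 2)


# Small hand-written fragment mimicking the carboxylate in ASP.mol2.
sample = """@<TRIPOS>ATOM
1 CB 0.0 0.0 0.0 C.3 1 ASP 0.0
2 CG 1.5 0.0 0.0 C.2 1 ASP 0.0
3 OD1 2.0 1.0 0.0 O.co2 1 ASP 0.0
4 OD2 2.0 -1.0 0.0 O.co2 1 ASP 0.0
@<TRIPOS>BOND
1 1 2 1
2 2 3 1
3 2 4 1
"""

flagged = find_single_bonded_carboxylates(sample)
if flagged:
    warnings.warn(
        "Carbons %s have two single-bonded O.co2 neighbors; this chemistry is "
        "unlikely but is known to appear in mol2 files written by tleap." % flagged
    )
print(flagged)  # [2]
```

On the ASP.mol2 listing above, this check should flag atom 14 (CG), since bonds 9 and 10 connect it to the O.co2 atoms 15 and 16 with bond type 1.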