Discussion

Otis Rothenberger | Wed, 02/15/2017 - 15:21

Bob,

I'm almost certain it's a Resolver issue. If I extract the JME file for the glycine zwitterion from JME, I get the following with charge designation on oxygen:

5 4 O- 13.15 -5.14 C 11.70 -5.97 O 11.70 -7.64 C 10.24 -5.14 N+ 8.80 -5.97 1 2 1 2 3 2 2 4 1 4 5 1

JME file on Resolver goes way back to Markus. I think this is a case of Resolver not keeping up with changes in the JME file structure. By the way, the above loads correctly in Jmol.
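For anyone who wants to check where the charges sit in a JME string like the one above, here is a minimal Python sketch. It assumes the simple layout seen in this example (atom count, bond count, then "label x y" per atom and "atom1 atom2 order" per bond) and that charges appear only as trailing + or - signs on the atom label. The function name parse_jme is made up for illustration; it is not part of JME, Resolver, or Jmol.

```python
# Glycine zwitterion string copied from the post above.
GLYCINE_ZWITTERION_JME = (
    "5 4 O- 13.15 -5.14 C 11.70 -5.97 O 11.70 -7.64 "
    "C 10.24 -5.14 N+ 8.80 -5.97 1 2 1 2 3 2 2 4 1 4 5 1"
)


def parse_jme(jme):
    """Split a simple JME string into atoms and bonds.

    Assumption: labels carry charge only as trailing '+'/'-' signs
    (e.g. 'O-', 'N+'); other JME label features are not handled.
    """
    tokens = jme.split()
    n_atoms, n_bonds = int(tokens[0]), int(tokens[1])
    atoms, bonds = [], []
    pos = 2
    for _ in range(n_atoms):
        label = tokens[pos]
        x, y = float(tokens[pos + 1]), float(tokens[pos + 2])
        symbol = label.rstrip("+-")          # element symbol without signs
        suffix = label[len(symbol):]         # the trailing signs, if any
        charge = suffix.count("+") - suffix.count("-")
        atoms.append((symbol, charge, x, y))
        pos += 3
    for _ in range(n_bonds):
        a, b, order = (int(t) for t in tokens[pos:pos + 3])
        bonds.append((a, b, order))
        pos += 3
    return atoms, bonds


if __name__ == "__main__":
    atoms, bonds = parse_jme(GLYCINE_ZWITTERION_JME)
    for number, (symbol, charge, _x, _y) in enumerate(atoms, start=1):
        print(number, symbol, charge)
```

Run on the string above, this reports a -1 charge on atom 1 (O) and a +1 charge on atom 5 (N), which is what the labels O- and N+ say.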

Otis

Otis Rothenberger | Wed, 02/15/2017 - 10:58

Bob,

Bob H is going to have to check me on this, but I think your problem relates to different ways that Resolver and JME treat quaternary nitrogen. If you right-click on JME, you will see a prompt to copy or paste a molfile. The JME molfile explicitly shows the quaternary nitrogen charge. If you paste this JME molfile back into JME, you get the correct result.

By the way, if you actually want to see the charges in Jmol, you will have to use some Jmol script. Right-click the Jmol window and select Console. In the console, type and then run: color label pink; select formalCharge <> 0;label %C

Bob can automate that for you in this Hack-a-Mol app if it's an important issue.

Otis

Robert Belford | Wed, 02/15/2017 - 10:10

When I edit the zwitterion in the 2D editor, starting from the file from NCI (not PubChem), it works, and I get the following mol file. But when I paste that file back in, it does not show the same 2D structure: the 2D editor adds a hydrogen, even though it is not in the molfile. The 3D works fine, and you can right-click, go to Surfaces, and plot the molecular electrostatic potential. I wonder if there is a way the charge can be labeled in the 3D the way it can in the 2D. And Evan is right, we need to look deeper into Dr. Hanson's "How it Works":
https://chemapps.stolaf.edu/jmol/docs/misc/hackamolworkings.pdf

Can you upload the file showing the radical? Like the one below?

C2H5NO2
APtclcactv02151710583D 0   0.00000     0.00000

 10  9  0  0  0  0  0  0  0  0999 V2000
    1.8469   -0.1026   -0.0000 N   0  3  0  0  0  0  0  0  0  0  0  0
    0.6941    0.8078    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5807    0.0042    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.5334   -1.2141   -0.0000 O   0  5  0  0  0  0  0  0  0  0  0  0
   -1.6595    0.5723    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.7005    0.4354    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.7274    1.4363    0.8900 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.7274    1.4363   -0.8900 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.8160   -0.6844    0.8238 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.8160   -0.6844   -0.8238 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
  2  3  1  0  0  0  0
  3  4  1  0  0  0  0
  3  5  2  0  0  0  0
  1  6  1  0  0  0  0
  2  7  1  0  0  0  0
  2  8  1  0  0  0  0
  1  9  1  0  0  0  0
  1 10  1  0  0  0  0
M  CHG  2   1   1   4  -1
M  END
$$$$
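A quick way to confirm that both charges really are in the file above is to read them back out. A V2000 molfile can record a formal charge in two places: the charge code in each atom line (the number after the mass-difference field, where 3 means +1 and 5 means -1) and the M CHG property line. The Python sketch below reads both. It is only an illustration: the function name molfile_charges is made up, and it splits on whitespace rather than on the fixed-width columns, which is fine for a small file like this one but would not be safe for files with 100 or more atoms.

```python
# V2000 atom-block charge codes; code 4 marks a doublet radical, not a charge.
CHARGE_CODES = {0: 0, 1: 3, 2: 2, 3: 1, 4: 0, 5: -1, 6: -2, 7: -3}

# Glycine zwitterion molfile from this post (hydrogens on C omitted from the
# sample would change nothing here, but all ten atoms are kept for clarity).
GLYCINE_ZWITTERION_MOL = """C2H5NO2
APtclcactv02151710583D 0 0.00000 0.00000

10 9 0 0 0 0 0 0 0 0999 V2000
1.8469 -0.1026 -0.0000 N 0 3 0 0 0 0 0 0 0 0 0 0
0.6941 0.8078 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.5807 0.0042 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.5334 -1.2141 -0.0000 O 0 5 0 0 0 0 0 0 0 0 0 0
-1.6595 0.5723 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
2.7005 0.4354 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
0.7274 1.4363 0.8900 H 0 0 0 0 0 0 0 0 0 0 0 0
0.7274 1.4363 -0.8900 H 0 0 0 0 0 0 0 0 0 0 0 0
1.8160 -0.6844 0.8238 H 0 0 0 0 0 0 0 0 0 0 0 0
1.8160 -0.6844 -0.8238 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
3 4 1 0 0 0 0
3 5 2 0 0 0 0
1 6 1 0 0 0 0
2 7 1 0 0 0 0
2 8 1 0 0 0 0
1 9 1 0 0 0 0
1 10 1 0 0 0 0
M CHG 2 1 1 4 -1
M END
"""


def molfile_charges(molfile):
    """Return (charges from atom block, charges from M CHG lines).

    Both results map 1-based atom numbers to formal charges.
    Assumption: fields can be separated by whitespace (small molecules only).
    """
    lines = molfile.splitlines()
    # Locate the counts line by its version stamp instead of a fixed index.
    counts_idx = next(
        i for i, ln in enumerate(lines) if ln.rstrip().endswith("V2000")
    )
    n_atoms = int(lines[counts_idx].split()[0])
    atom_lines = lines[counts_idx + 1:counts_idx + 1 + n_atoms]

    from_atom_block = {}
    for number, ln in enumerate(atom_lines, start=1):
        code = int(ln.split()[5])  # x, y, z, symbol, mass diff, charge code
        charge = CHARGE_CODES.get(code, 0)
        if charge:
            from_atom_block[number] = charge

    from_chg = {}
    for ln in lines:
        parts = ln.split()
        if parts[:2] == ["M", "CHG"]:
            # "M CHG n a1 c1 a2 c2 ...": n (atom, charge) pairs follow.
            pairs = parts[3:3 + 2 * int(parts[2])]
            for atom, charge in zip(pairs[0::2], pairs[1::2]):
                from_chg[int(atom)] = int(charge)
    return from_atom_block, from_chg


if __name__ == "__main__":
    print(molfile_charges(GLYCINE_ZWITTERION_MOL))
```

For this file both sources agree: +1 on atom 1 (N) and -1 on atom 4 (O). So the molfile itself describes the zwitterion, and whatever drops the negative charge happens after the file is read.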

Evan Hepler-Smith | Wed, 02/15/2017 - 09:57

Hi,

Take a look at my response to "Question on Amino Acid Zwitterion," here: http://olcc.ccce.divched.org/comment/890#comment-890

Olcc S15 | Wed, 02/15/2017 - 09:47

When studying amino acids in Hack-a-mol, for example glycine, we can use the JSME molecular editor to form a zwitterion. If we then copy the MOL/SDF file and try to reproduce the same zwitterion, we instead get a molecule (a radical) with only one charge. Is there a way to reproduce the same molecule from the same text file?

Damon Ridley | Wed, 02/15/2017 - 01:10

Bob

If you search the CAS RN in Reaxys you indeed get the substance, although the reference in Reaxys is to a 2012 Kinetics and Catalysis article. Note, however, that the substance in Reaxys does not have stereochemistry assigned. If you search the structure 'As drawn' in Reaxys you get the amino alcohol, but you also get the hydrogen sulfate which is referenced to a 2016 article in Applied Soil Ecology. The structure search 'As drawn' (through Reaxys) also picks up 9 substances in PubChem. None of these have the stereochemistry assigned; the 8 other substances are various salts of the amino alcohol although one substance appears to be a multicomponent substance where the second substance is ethane. This substance is totally crazy and whoever put it into PubChem doesn't know much chemistry. Actually all the other substances are also pretty crazy since aminoalcohols such as this would very readily lose water to give the imine (in this case the imine of hydroxyacetaldehyde). If you search this imine in Reaxys, As drawn, then you get substances in Reaxys and in PubChem.

I see that Evan also has commented on your question, and I agree totally with what Evan has said. CAS RNs are just another synonym and should not be relied upon when searching other databases which have their own rules. Note also that if the CAS RN is actually for the R-enantiomer then this illustrates another point, namely that the matching of CAS RNs with substances in Reaxys is essentially at the level of structure connection tables - where stereochemistry is effectively "ignored" (by the CAS computers that do the comparison).

To me, however, none of this is real chemistry because the hydroxyamine is so unstable, although I would need to read the original articles to see the evidence for the substance.

Damon

Evan Hepler-Smith | Tue, 02/14/2017 - 16:43

Let's see...

It looks like the 3D structure for glycineZixyz is more or less correct, but the negative charge on the carboxylate oxygen is missing in the 2D structure, correct?

In order to explain what I **think** is going on here, we'll have to take a look at "How it Works" that Dr. Hanson put together for Hack-a-mol. Here's the link (also linked at the bottom of the Hack-a-mol frame): https://chemapps.stolaf.edu/jmol/docs/misc/hackamolworkings.pdf

The relevant passage:

"When the user modifies the structure (or pastes into the textarea some sort of structure file data) and presses ENTER, [..details re. script...] the loadMol() method is executed. This method checks to see if the data is 2D mol data (the characters “2D” starting at column 21 on the second line of the file) or 3D data (anything else), and then passes the data either to the appropriate module."
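To make that test concrete, here is a small Python sketch of the check as the passage describes it. This is not Hack-a-mol's actual code, and the function name is_2d_mol_data is made up for illustration; it only looks for the characters "2D" at columns 21 and 22 of the second line and treats everything else as 3D.

```python
def is_2d_mol_data(data):
    """Return True if the second line has '2D' starting at column 21.

    Columns are 1-based in the description, so column 21 is index 20.
    Anything else (including single-line input such as a SMILES string)
    counts as "3D data" under the rule quoted above.
    """
    lines = data.splitlines()
    return len(lines) > 1 and lines[1][20:22] == "2D"


if __name__ == "__main__":
    # Second line of the glycine zwitterion molfile posted in this thread.
    header = "C2H5NO2\nAPtclcactv02151710583D 0 0.00000 0.00000\n"
    print(is_2d_mol_data(header))
```

The header of the molfile posted in this thread has "3D" in those two columns, so under this rule it would be routed to the 3D side.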

Okay, no "2D" in the specified place, so that's where the 3D structure comes from. How do we get from 3D to 2D? We go to that section of How it Works:

"We need to maintain both a 3D and a 2D representation when a 3D model is loaded. In order to do that, we again tap into the CIR at NCI. This is done in the to2D() function [...details re. script...] which then sends the following command to the CIR, again using a SMILES string to communicate"

Hack-a-mol also displays this SMILES string: "SMILES: [O]C(=O)C[NH3] at ChEMBL"

Note that we've got the extra proton on the N, but there's no charge on the O. My guess is that this is part of the reason why you're getting the glycinium cation rather than the zwitterion.

We could try directly inputting a SMILES string for the zwitterion [O-]C(=O)C[NH3+] into the text box. Same problem: we get the right 3D, but we're still missing the negative charge in the 2D. In fact, the same thing happens if we plug in, say, the SMILES for acetate CC([O-])=O . Same with azide [N-]=[N+]=[N-] . The negative ions just don't want to show up in the 2D.
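One way to compare these strings side by side is to pull out the bracket atoms and read the charge written on each. The Python sketch below does that with plain regular expressions. It is not a real SMILES parser (a toolkit would be the right tool for serious work), the names bracket_charges and formal_charge are made up for illustration, and it assumes the charge is the last thing inside the brackets (no atom-class suffix).

```python
import re

BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")   # text between [ and ]
CHARGE = re.compile(r"([+-]+)(\d*)$")        # trailing +, -, ++, +2, ...


def formal_charge(body):
    """Formal charge written at the end of a bracket-atom body, else 0."""
    match = CHARGE.search(body)
    if match is None:
        return 0
    signs, digits = match.groups()
    unit = 1 if signs[0] == "+" else -1
    # "+2" style uses the digits; "++" style counts the signs.
    return unit * int(digits) if digits else unit * len(signs)


def bracket_charges(smiles):
    """List (bracket body, formal charge) for every bracket atom."""
    return [(body, formal_charge(body)) for body in BRACKET_ATOM.findall(smiles)]


if __name__ == "__main__":
    for smiles in (
        "[O]C(=O)C[NH3]",      # string displayed by Hack-a-mol
        "[O-]C(=O)C[NH3+]",    # zwitterion
        "CC([O-])=O",          # acetate
        "[N-]=[N+]=[N-]",      # azide
    ):
        print(smiles, bracket_charges(smiles))
```

Read this way, the displayed string [O]C(=O)C[NH3] has no charge written on either bracket atom, while the zwitterion string carries -1 on the O and +1 on the N. That is consistent with the charge information being lost somewhere along the SMILES route.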

Since the path, as specified by How It Works, is mol file --> 3D --> 2D AND identifier --> 3D --> 2D, and since the 3D --> 2D conversion goes through a SMILES string, it seems reasonable to guess that something funky is going on with how the 3D renderer is handling the SMILES for anions.

We'll have to get Dr. Hanson to weigh in on this puzzle!

Thanks,
Evan

Robert Belford | Tue, 02/14/2017 - 14:39

Just to clarify the above question, I had asked the class to change the file for Glycine from its neutral form to the Zwitterion, which ends up being a bit more complicated than just changing the atom the carboxylic acid bonds to.

Leah McEwen | Tue, 02/14/2017 - 14:16

Jumping in with a data management perspective, where detail within and between resources counts -

1- Looking up CAS RN 13053-46-8 in the Chemical Abstracts Service database brings up an unspecified isomer, with 25 references, including some 2014 patents. Looking up 1-aminoethane-1,2-diol in the Chemical Abstracts Service database brings up CAS RN 1621416-22-5 for the R enantiomer, which is referenced in a 2014 patent.

2- This comment thread reinforces the lesson that different databases take different approaches to organizing chemical information, as we've been discussing, and shows some of the challenges in using record IDs outside their original context. Thus database record IDs are generally not best-practice standards for exchanging identifying information about chemical structures between systems, such as the many excellent chemical and material systems that are available for research. Database IDs can only be verified by looking them up in the original database, which is not always accessible and can be prohibitive when reviewing a large group of compounds, and the record may not match the local use case for a chemical form or type of material.

3- Using record IDs as an exchange link is especially inappropriate for automated applications used in cheminformatics that incorporate validation functions and machine learning and do not rely on human intervention to hand-curate every record. These functions need chemically defined, machine-accessible rule sets; this is the premise behind the InChI as an International Chemical Identifier. InChI in its current form also has some limitations at these levels of granularity, but comparison and validity can be checked and flagged much more readily by software.

It is one process for a well-versed individual chemist to check and confirm data points of interest in authoritative sources; it is another process for a single computer program to manage and normalize many incoming data points according to local priorities; it is quite another process to exchange and process data across multiple large systems used around the world in many sectors and disciplines.

olcc s16 | Tue, 02/14/2017 - 14:13

At the bottom of this page, there are 3 mol files for glycine. glycine.txt is the file Hack-a-mol obtained from PubChem. In glycineZi, we tried to get the zwitterion to work, but the nitrogen ended up with five bonds. In glycineZixyz, we changed the x coordinate for atom 10, but we cannot show the negative charge on the oxygen. Can you upload glycineZixyz to Hack-a-mol and tell us what we need to do to make it look right?