Discussion

Robert Belford | Fri, 03/24/2017 - 17:36
I think there is another way. When you go to the URL with the InChI key, you get a page asking if you want to create the page, because Wikipedia pages are not named after InChI keys. What we need is the name of the Wikipedia page, and that is embedded in the markup here: <a href="/wiki/Maneb" title="Maneb" data-serp-pos="0">Maneb</a>. I think the XPath approach Jordi was showing us can do this; we just need to figure out how to do it with Excel instead of Google Sheets.
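As a rough illustration of pulling the page name out of that markup, here is a minimal Python sketch. It only parses a string that looks like the anchor quoted above and makes no request to Wikipedia. The names `WikiLinkParser` and `first_wiki_title` are made up for this example, and the assumption that the page name sits in the `title` attribute of a link whose `href` starts with `/wiki/` is based solely on that one snippet.

```python
from html.parser import HTMLParser


class WikiLinkParser(HTMLParser):
    """Collect page names from <a> tags whose href starts with /wiki/."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        if href.startswith("/wiki/"):
            # Prefer the title attribute; fall back to the path after /wiki/.
            self.titles.append(attr_map.get("title") or href[len("/wiki/"):])


def first_wiki_title(html_text):
    """Return the first Wikipedia page name found in html_text, or None."""
    parser = WikiLinkParser()
    parser.feed(html_text)
    return parser.titles[0] if parser.titles else None


# The anchor quoted in the comment above (assumed to be representative).
snippet = '<a href="/wiki/Maneb" title="Maneb" data-serp-pos="0">Maneb</a>'
print(first_wiki_title(snippet))
```

For a list of InChI keys, the same function could be applied to each search-result page, with `None` meaning that no matching link was found in the text supplied.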

Robert Belford | Fri, 03/24/2017 - 17:22
I have a list of 100,002 InChI keys, and I am trying to see if those chemicals are on Wikipedia. A student created the following spreadsheet in Google Sheets: https://docs.google.com/spreadsheets/d/1Kit4f7frOnz-2SWSM6KV9ULZjffm8QcfQJx09GWSz3M/copy I can get column "B" to work, but I am stumped by column "C". There may be a different way to do this, but our objective is to take a list of InChI keys and determine how many are in Wikipedia, and then how many of those have GHS codes in the chembox. Any ideas?

Sunghwan Kim | Thu, 03/23/2017 - 21:06

On March 23, 2017.

The question numbers in the old version were confusing, so I've corrected the numbering in the Homework questions on this Module 6 web page. A corrected version of the .docx file for the Module 6 assignments (2017OLCCModule6Assignment.docx) has also been uploaded and is available at the top of this page.

(In the corrected version, Question 2, which has sub-questions (a), (b), and (c), is followed by Question 3. However, in the old version, the sub-questions of Question 2 were mistakenly labelled as 1, 2, and 3, and Question 3 as Question 4. Similar mistakes were made in several places in the old version. These errors are corrected in the new version. All questions are exactly the same as the old ones, except for the question numbers/labels.)

Sunghwan Kim | Thu, 03/23/2017 - 10:40

Some single-component compounds are negatively or positively charged, and their parent compounds are the neutralized forms. So, the acetate ion and acetic acid have the same parent (which is acetic acid).

Jordi Cuadros | Thu, 03/23/2017 - 10:30
Is there any charge normalization? Acetates and acetic acid seem to have the same parent.

Sunghwan Kim | Thu, 03/23/2017 - 10:27

A "parent" compound is a conceptually important component (part) of a compound. As an example, atorvastatin calcium (Lipitor: CID 15378998) has two unique components (atorvastatin and calcium), and we know that it is the atorvastatin part that binds to the target protein, so we view the atorvastatin component as more important than the calcium component.

Of course, the concept of "important components" is very ambiguous, so it needs a clear (mathematical) definition. The current definition of "parent compound" used in PubChem is as follows: if a component contains a supermajority (≥70%) of all heavy (non-hydrogen) atoms across all unique components of a mixture, and if that component has at least one carbon atom, it is designated as the parent component. Under this definition, a single-component (organic) compound is usually considered the parent compound of itself.

A caveat of this definition is that if you have a two-component compound whose components are similar in size (that is, the heavy atom count ratio of one component to the other is 50:50), no component is considered a parent.
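The rule described above can be sketched in a few lines of Python. This is only an illustration of the definition as worded in this comment, not PubChem's actual implementation. The function name `find_parent`, the input format, and the heavy atom counts in the examples (41 for atorvastatin, from the formula C33H35FN2O5, and 1 for calcium) are assumptions made for the example.

```python
def find_parent(components, threshold=0.70):
    """Return the name of the parent component, or None if there is none.

    components: list of (name, heavy_atom_count, has_carbon) tuples,
    one entry per unique component of the compound.
    """
    total = sum(count for _, count, _ in components)
    if total == 0:
        return None
    for name, count, has_carbon in components:
        # Parent: holds at least 70% of all heavy atoms and contains carbon.
        if has_carbon and count / total >= threshold:
            return name
    return None


# Atorvastatin calcium: atorvastatin holds 41 of the 42 heavy atoms.
print(find_parent([("atorvastatin", 41, True), ("calcium", 1, False)]))

# Two components of equal size: neither reaches 70%, so there is no parent.
print(find_parent([("A", 10, True), ("B", 10, True)]))
```

The first call returns the atorvastatin component, and the second returns `None`, which matches the 50:50 caveat.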

 

Jordi Cuadros | Thu, 03/23/2017 - 10:17
Some of the filters and links are related to the concept of a parent compound (or a shared parent compound). But I haven't been able to find a description or a definition of what a parent is or how it is computed. Can someone enlighten me? Thanks in advance. Jordi

Bob Hanson | Fri, 03/17/2017 - 15:35

Dear S04 (a more human name would be helpful here!), Sorry about the wording above. Herman is just saying that the wording you see there was intended for another purpose. Just realize that while Herman and I are here to help, it's got to be your project. The wording there about "leading you..." was in reference to a proposed module, not a project. Both Herman and I will be very happy to assist you in any web-based visualization project that you come up with. My suggestions for getting going on this:

  1. Bounce one or more ideas off us and see what we think.
  2. Take a look at what is out there already that might be similar to what you think you might want to do.
  3. Help us out by giving us some idea of your background.

I would recommend you do this by email directly with us, not in this comment/reply format. My email address is hansonr@stolaf.edu. Way simpler working with you that way, I think. We could also Skype if you are interested.