OLCC Project 3: User Specific Laboratory Chemical Safety Summary

Comments 11

Robert Belford | Wed, 02/03/2016 - 14:34
I embedded a copy of the sheet into a webpage and opened access to the public. So it may not last long, but it is here for discussion. Cheers, Bob

Ralph Stuart | Fri, 02/05/2016 - 06:43
I found a free web site with apps that convert files into and out of XML. It's at <a href="http://xmlgrid.net/xml2text.html">http://xmlgrid.net/xml2text.html</a> and might be useful in exploring the PubChem data in Excel. - Ralph

Robert Belford | Wed, 02/10/2016 - 13:42
For some reason, several Zip programs would not unzip the LCSS dump. I was able to do so with 7-Zip, but the file is too big to upload to our site. Fortunately, UALR has unlimited Google Drive storage, so I put the original file here: <a href="https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk">https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk</a> I had to remove the following in order to load it into Excel: <a href="https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk">https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk</a> And here it is in Excel: <a href="https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk">https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk</a> Brian and I have also found another way to get the list of chemicals for which there is a PubChem LCSS. Brian, are you going to try to use the NIH Resolver to convert the names to InChIKeys (I think it does that)? Sort of the way we did in the second module, except getting the InChIKey instead of the molar mass. Cheers,
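As a rough illustration of the name-to-InChIKey idea, here is a minimal Python sketch. It assumes the "NIH Resolver" mentioned above is the NCI/CADD Chemical Identifier Resolver at cactus.nci.nih.gov and that its `stdinchikey` representation is what we want. The function names are hypothetical, and the handling of an `InChIKey=` prefix in the response is an assumption about the output format.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the NCI/CADD Chemical Identifier Resolver.
BASE = "https://cactus.nci.nih.gov/chemical/structure"


def resolver_url(name, representation="stdinchikey"):
    """Build the resolver URL for a chemical name (spaces etc. are percent-encoded)."""
    return f"{BASE}/{quote(name, safe='')}/{representation}"


def strip_prefix(text):
    """Drop a leading 'InChIKey=' if the service includes one (assumed format)."""
    text = text.strip()
    prefix = "InChIKey="
    return text[len(prefix):] if text.startswith(prefix) else text


def name_to_inchikey(name, timeout=10):
    """Look up one name over the network; raises an HTTPError if the name is not resolved."""
    with urlopen(resolver_url(name), timeout=timeout) as resp:
        return strip_prefix(resp.read().decode("utf-8"))
```

Calling `name_to_inchikey("acetone")` for each name in the list would need network access, and names the resolver does not recognize would have to be caught and checked by hand.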

Ralph Stuart | Wed, 02/10/2016 - 15:06
Google Docs doesn't allow me access to the Excel files you posted; do you need to share them with me? My Google account is <a href="mailto:keenestateehs@gmail.com">keenestateehs@gmail.com</a>

Ralph Stuart | Wed, 02/10/2016 - 15:16
By hand, I generated an Excel file that compares Wikipedia safety information with PubChem LCSS information for about 90 chemicals. The list of chemicals came from the original LCSS roster in Prudent Practices. I suspect that I made clerical errors in the process, so replicating this effort electronically would be a good step. I hope that the Excel file that Bob generated will help us do this. These are the columns in the spreadsheet and what they indicate:
Wikipedia Entry: Is there a Wikipedia entry for this chemical?
Chembox: Does the Wikipedia entry have a Chembox?
Safety info: Does the Chembox contain any safety information?
GHS info: Does the Chembox contain GHS information?
PubChem LCSS?: Is there a PubChem LCSS for this chemical?
PubChem sections: How many content sections are there in the PubChem LCSS?
Number of sets: How many different sets of GHS symbols does PubChem present?
Distinct sets: How many different sets of symbols are there?
The interesting result is that Wikipedia has more coverage of the chemicals listed (95% vs. 82%), but less safety information (78% in Wikipedia, 82% in PubChem) and much less GHS info (33% of the chemicals listed have GHS info in Wikipedia). Brian, is it possible to verify these numbers?
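One way the percentages could be double-checked electronically is sketched below in Python. It assumes the spreadsheet is exported to CSV and that the yes/no columns hold the literal text "yes" or "no"; the actual cell values in the file may differ. The sample rows are placeholders, not real data.

```python
import csv
import io

# Placeholder rows standing in for a CSV export of the spreadsheet (not real data).
SAMPLE = """Chemical,Wikipedia Entry,Chembox,Safety info,GHS info,PubChem LCSS?
Chemical A,yes,yes,yes,yes,yes
Chemical B,yes,yes,yes,no,yes
Chemical C,yes,no,no,no,no
Chemical D,no,no,no,no,yes
"""


def coverage(rows, column):
    """Return the percentage of rows whose value in `column` is 'yes'."""
    rows = list(rows)
    if not rows:
        return 0.0
    hits = sum(1 for r in rows if r[column].strip().lower() == "yes")
    return 100.0 * hits / len(rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for col in ["Wikipedia Entry", "Safety info", "GHS info", "PubChem LCSS?"]:
    print(f"{col}: {coverage(rows, col):.0f}%")
```

With the real export, `io.StringIO(SAMPLE)` would be replaced by an open file handle, and the column names would need to match the header row exactly.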

Robert Belford | Wed, 02/10/2016 - 15:26
Ralph, can you upload the file to this page, or to the Google Drive folder I just shared with you? It would be nice to see how you structured the file. Cheers, Bob

Ralph Stuart | Wed, 02/10/2016 - 15:30
I just uploaded the file to the page. "Structured" is a little generous - it was more a note taking device with some calculations at the bottom. But I think that it gives us an idea of how to generate interesting data for SD. - Ralph

Brian Murphy | Wed, 02/10/2016 - 18:45
I've uploaded an Excel file with the list of chemicals from the LCSS. There are four tabs: Sheet0, Sheet1, Sheet2, and Sheet3. Sheet3 is the filtered-down list in alphabetical order. The other sheets are just part of the filtering process. There are many duplicates, which I can get rid of, and there are many chemicals that PubChem lists as numbers rather than names.
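The duplicate removal and the separation of number-only entries could be done along these lines. This is a minimal Python sketch: the function name is hypothetical, duplicates are assumed to differ only in letter case or surrounding whitespace, and "listed as numbers" is assumed to mean entries made up solely of digits and hyphens.

```python
import re


def clean_names(names):
    """Split a raw name list into (sorted unique names, number-only entries)."""
    seen = set()
    kept, numeric = [], []
    for raw in names:
        name = raw.strip()
        if not name:
            continue  # skip blank cells
        key = name.casefold()
        if key in seen:
            continue  # case-insensitive duplicate
        seen.add(key)
        # Assumption: digits and hyphens only means an identifier, not a name.
        if re.fullmatch(r"[\d\-]+", name):
            numeric.append(name)
        else:
            kept.append(name)
    return sorted(kept, key=str.casefold), numeric


names, ids = clean_names(["Benzene", "acetone", "Acetone ", "67-64-1", "", "180"])
print(names)
print(ids)
```

The number-only entries are returned separately instead of being dropped, so they can still be looked up later if they turn out to be registry numbers or PubChem IDs.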

Ralph Stuart | Mon, 02/15/2016 - 11:47
I've uploaded a first draft of an overview of how I think mining PubChem and the Wikipedia Chemboxes can help us work with the chemical safety information that is available on the web. The numbers in the paper are based on the manual, "spare time" review I did last week of the 88 or so chemicals named in the 1995 edition of Prudent Practices. It would be great if we could double-check them electronically and expand the number of chemicals reviewed, using the XML data from PubChem and scraping of the Chemboxes.

Brian Murphy | Mon, 02/15/2016 - 17:14
With the PubChem Identifier Exchange Service, we can convert PubChem IDs to InChIKeys as well as other identifiers. It can be found at <a href="https://pubchem.ncbi.nlm.nih.gov/idexchange/idexchange.cgi">https://pubchem.ncbi.nlm.nih.gov/idexchange/idexchange.cgi</a>
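For a scripted version of the same conversion, PubChem's PUG REST interface might be used instead of the Identifier Exchange web form. The Python sketch below takes that different route and is not the exchange service itself. The function names are hypothetical, and it assumes the plain-text response returns one InChIKey per line in the same order as the requested CIDs.

```python
from urllib.request import urlopen

# Base URL of PubChem's PUG REST interface.
PUG = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def inchikey_url(cids):
    """Build a PUG REST URL requesting the InChIKey property for the given CIDs."""
    cid_list = ",".join(str(int(c)) for c in cids)
    return f"{PUG}/compound/cid/{cid_list}/property/InChIKey/TXT"


def cids_to_inchikeys(cids, timeout=10):
    """Fetch InChIKeys over the network and pair them with the CIDs (order assumed)."""
    cids = list(cids)
    with urlopen(inchikey_url(cids), timeout=timeout) as resp:
        keys = resp.read().decode("utf-8").split()
    return dict(zip(cids, keys))
```

For a long list, the CIDs would probably need to be sent in batches to keep each URL a reasonable length.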

Robert Belford | Wed, 02/24/2016 - 14:43
Here is some interesting stuff: <a href="http://chem-bla-ics.blogspot.com/2016/01/adding-chemical-compound-to-wikidata.html">http://chem-bla-ics.blogspot.com/2016/01/adding-chemical-compound-to-wikidata.html</a> This led me to Hay's tools, <a href="http://tools.wmflabs.org/hay/">http://tools.wmflabs.org/hay/</a>, of which the tool directory may be of interest: <a href="http://tools.wmflabs.org/hay/directory/">http://tools.wmflabs.org/hay/directory/</a> <a href="https://www.ebi.ac.uk/efo/webulous/">https://www.ebi.ac.uk/efo/webulous/</a>