Discussion

Ralph Stuart | Wed, 02/10/2016 - 15:16
I generated an Excel file by hand that compares Wikipedia safety information with PubChem LCSS information for about 90 chemicals. The list of chemicals came from the original LCSS roster in Prudent Practices. I suspect that I made clerical errors in the process, so replicating this effort electronically would be a good step. I hope that the Excel file that Bob generated will help us do this.

These are the columns in the spreadsheet and what they indicate:

Wikipedia Entry: Is there a Wikipedia entry for this chemical?
Chembox: Does the Wikipedia entry have a Chembox?
Safety info: Does the Chembox contain any safety information?
GHS info: Does the Chembox contain GHS information?
PubChem LCSS?: Is there a PubChem LCSS for this chemical?
PubChem sections: How many content sections are there in the PubChem LCSS?
Number of sets: How many different sets of GHS symbols does PubChem present?
Distinct sets: How many different sets of symbols are there?

The interesting result is that Wikipedia has more coverage of the chemicals listed (95% vs. 82%), but less safety information (78% in Wikipedia, 82% in PubChem) and much less GHS info (33% of the chemicals listed have GHS info in Wikipedia). Brian, is it possible to verify these numbers?
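One way the percentages above could be re-checked electronically is to export the spreadsheet to CSV and count the yes-like entries in each column. The following is only a minimal sketch: it assumes the columns hold yes/no values, and the sample rows and chemical names are made-up placeholders, not data from the actual spreadsheet.

```python
import csv
import io

def coverage(rows, column, yes_values=("yes", "y", "1", "true")):
    """Return the percentage of rows whose `column` holds a yes-like value."""
    if not rows:
        return 0.0
    hits = sum(
        1 for r in rows
        if (r.get(column) or "").strip().lower() in yes_values
    )
    return 100.0 * hits / len(rows)

# Placeholder data for illustration only; the real input would be the
# spreadsheet exported to CSV (assumed to use yes/no cells).
SAMPLE = """Chemical,Wikipedia Entry,GHS info,PubChem LCSS?
chem-A,yes,yes,yes
chem-B,yes,no,yes
chem-C,no,no,no
chem-D,yes,no,yes
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for col in ("Wikipedia Entry", "GHS info", "PubChem LCSS?"):
    print(f"{col}: {coverage(rows, col):.0f}%")
```

To run this against the real file, the `io.StringIO(SAMPLE)` part would be replaced by an opened CSV export, and `yes_values` adjusted to whatever markers the spreadsheet actually uses.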

Ralph Stuart | Wed, 02/10/2016 - 15:06
Google Docs doesn't allow me access to the Excel files you posted; do you need to share them with me? My Google account is <a href="mailto:keenestateehs@gmail.com">keenestateehs@gmail.com</a>

Robert Belford | Wed, 02/10/2016 - 13:42
For some reason, several Zip programs would not unzip the LCSS dump. I was able to do so with 7-Zip, but the result is too big to upload to our site. Fortunately, UALR has unlimited Google Drive, so I put the original file here: <a href="https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk">https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk</a> I had to remove the following in order to load it into Excel: <a href="https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk">https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk</a> And here it is in Excel: <a href="https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk">https://drive.google.com/open?id=0ByRWZ4TaLO_0NndIMHFfVF9PbVk</a> Brian and I have since found another way to get the list of chemicals for which there is a PubChem LCSS. Brian, are you going to try to use the NIH Resolver to convert the names to InChIKeys (I think it does that)? It would be similar to what we did in the second module, except getting the InChIKey instead of the molar mass. Cheers,
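The name-to-InChIKey conversion might look something like the sketch below. It assumes that "NIH Resolver" means the NCI/CADD Chemical Identifier Resolver at cactus.nci.nih.gov and its `/chemical/structure/<name>/stdinchikey` URL pattern; the function names are hypothetical, and the handling of the `InChIKey=` prefix is an assumption about the response format.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the NCI/CADD Chemical Identifier Resolver.
BASE = "https://cactus.nci.nih.gov/chemical/structure"

def resolver_url(name, representation="stdinchikey"):
    """Build the resolver URL for a chemical name (spaces etc. are escaped)."""
    return f"{BASE}/{quote(name, safe='')}/{representation}"

def strip_prefix(text, prefix="InChIKey="):
    """Drop the leading 'InChIKey=' label if the response carries one."""
    return text[len(prefix):] if text.startswith(prefix) else text

def name_to_inchikey(name, timeout=10):
    """Look up one name; raises urllib.error.HTTPError if it is not resolved."""
    with urlopen(resolver_url(name), timeout=timeout) as resp:
        lines = resp.read().decode("utf-8").splitlines()
    # Keep only the first line in case several keys are returned.
    return strip_prefix(lines[0].strip()) if lines else ""

if __name__ == "__main__":
    # Requires network access; the names here are just examples.
    for chemical in ("acetone", "acetic acid"):
        print(chemical, name_to_inchikey(chemical))
```

Looping this over the LCSS chemical list would give one key per name, which could then be used to match PubChem records against Wikipedia entries. Names the resolver cannot interpret would need to be caught and checked by hand.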

Ralph Stuart | Fri, 02/05/2016 - 06:43
I found a free website with apps that convert files into and out of XML. It's at <a href="http://xmlgrid.net/xml2text.html">http://xmlgrid.net/xml2text.html</a> and might be useful for exploring the PubChem data in Excel. - Ralph

Robert Belford | Wed, 02/03/2016 - 14:34
I embedded a copy of the sheet into a webpage and opened access to the public. It may not last long, but it is here for discussion. Cheers, Bob

Sarah House (not verified) | Fri, 12/18/2015 - 17:28
Dr. Belford, I have uploaded the abstract I have saved. However, the project seems to have gone in a different direction since this was written.

Sarah House (not verified) | Fri, 12/18/2015 - 17:23
Test

John House (not verified) | Fri, 12/18/2015 - 14:24
Test