Using Google Sheets: Name2Spectra

Google Sheets Name2Spectra

This video will show you how to download to a google sheet from the NIST Webbook both the image and data for IR, UVis and Mass spectra.  

1.  Set up the cells to talk to the databases: in each cell identified by brackets type in the following

[A1]: Name  (you will place the name of the chemical into [B2}, the cell next to it)
[A2]: CASRN (we will remove the dashes from the CAS Registry Number as that is what the webbook uses to.

[A3] In this cell we will use Data validation to provide options between MASS, IR and UVVis spectra.  
After clicking on cell A3, go to (click):DATA/DATA VALIDATION, and in Criteria choose "list of items", and in the box type IR,MASS,UVVis , and click "SAVE. You now have a dropdown box with these three options.

[B1] - here you will type the chemical you are searching for.
[B2]: =MID(IMPORTXML("https://chem.nlm.nih.gov/chemidplus/name/" & B1,"//link[@rel='canonical']/@href"),40,60)
 

Here you are using the ChemIDplus service of NIH TOXNET to obtain the CAS registry number.  For example, if you look up anthracene at toxnet the URL is: https://chem.nlm.nih.gov/chemidplus/name/anthracene, and you can see the CAS number is 120-12-7.  Now if in your browser you click on "show page source", and search for that CAS number (ctrl "F"), <link rel="canonical" href="https://chem.nlm.nih.gov/chemidplus/rn/120-12-7" />, and you see that the CAS number starts at position 40. This search uses two Google SpreadSheets Functions, MID and IMPORTXML

What you want is the link "https://chem.nlm.nih.gov/chemidplus/name/[CAS of your Chemical]

IMPORTXML has syntax (url, expath_query), where "//link[@rel='canonical']@href" reads search for a "link" tag wherever in the pages such as it has the attribute "rel" that equals "canonical and pulls the value of its "href" attribute.

MID has syntax ("string", starting _at, exact_length)

note, https://chem.nlm.nih.gov/chemidplus/rn/120-12-7 has 39 characters, and so you take the 40th character of the string, with up to 60 in total.

[B3]: =SUBSTITUTE(B2,"-","")

The above removes the "-" from the CAS number because we need that when we find the spectra in the NIST webbook. To see this, go to http://webbook.nist.gov/chemistry/  click "search options"/name", type in anthracene, and click the buttons for IR, Mass and UVVis spectra. Scroll down the page until you see the UVVis, right click on the image and click copy image address, which gives you: http://webbook.nist.gov/cgi/cbook.cgi?Spec=C120127&Index=0&Type=UVVis, and note that the 120127 is the CAS number with no dashes.

2. Insert Image of spectra

Merge Cells over the ranges of C2 to L2 and and C19 to L19 making one large block, 

[Merged Cell Block]: =IMAGE("http://webbook.nist.gov/cgi/cbook.cgi?Spec=C" & B3 & "&Type=" &A5, 1)

Here, you go to the Nist webbook, and grab the image at this address, with the number in cell B3 and pull the spectra of the type in the dropdown box)

 

3. Download data

[F24] =IMPORTDATA("http://webbook.nist.gov/cgi/cbook.cgi?JCAMP=C" & B3 & "&Type=" &A5)

To understand the logic, go back to the UV-VIS spectra in the NIST book (see above), and scroll down to "Download spectrum in JCAMP-DX format," right click on "spectrum" and "Copy link address", given,http://webbook.nist.gov/cgi/cbook.cgi?JCAMP=C120127&Index=0&Type=UVVis.  The above code allows you to change the modified CAS number to the one of your compound [B1], and to choose the type of spectra from the dropdown box [A5]

 

Rating: 
0
No votes yet

Annotations