Project Mentors: Damon Ridley, Sunghwan Kim and Anja Brunner.
The wealth of information available creates information overload problems, and more than ever it is helpful if scientists understand the core collections in their field and know which sources to use and when. At some stage, evaluations of the different options need to be made, and since elsewhere in this course PubChem and Reaxys have been introduced, we now have the opportunity to evaluate them together.
We shall explore answers to questions such as: in what ways are they similar and in what ways are they complementary; what is their combined landscape, and how are the different systems searched?
Those interested in teaching chemical information retrieval may wish to explore the “big picture”, i.e., the overall content and search functionality of PubChem/Reaxys; those interested in finding information in their special field of study or research may wish to explore subject-specific information.
Evaluating systems is fraught with numerous difficulties pertaining not only to the database(s) but also to the knowledge and skills of the searchers. In this project we shall work as a team, with different participants evaluating areas of their choice. The outcome should be of interest to everyone in the OLCC program … and beyond …
Comments 30
Student Project
Hello,
I am interested in this project as part of this course. I hope to use both PubChem and Reaxys in my research work. Before I begin, can you tell me about PubChem and Reaxys, and their similarities and differences? How can they be used together?
Looking forward to hearing from you.
Thank you
Amita
PubChem and Reaxys
Amita
The history here is that Sunghwan, Anja, and I thought it may be interesting for students to make their own assessments on similarities and differences - in other words, exactly what you are asking is the type of thing we would like you to assess in the student project.
While we haven't designed the exact way the project would be run (we would rather students decide that), we are thinking that if we can get a few students interested then each of them may tackle one aspect. Finally the results students come up with would be collated.
In making comparisons like this you have to consider the content, which could be broken into documents, substances, reactions and properties - but that's not just a simple matter of saying (in comparing "substances"), e.g., there are 250 million PubChem Substances, 95 million PubChem Compounds and 28 million Reaxys Substance Records (or whatever the latest numbers are). What you have to do is to look at what type of information, on average, is in the Substance/Compound records.
Then you have to do the same for comparing "documents", "reactions", and "properties".
Another thing to consider is the search interface - how you search. Here I can perhaps let you know (without giving too much away) that you can indeed search PubChem Compounds through the Reaxys interface, but you cannot search Reaxys Substance Records through the PubChem interface. So one student may, for example, wish to investigate the different results you get when searching PubChem Compounds through the PubChem interface and the Reaxys interface. You may think you'd get the same result (you are searching the same data), but, NO, you may get different results because of the way the two systems search.
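If you do run the same search through both interfaces, one way to see where the results differ is to export the hit lists and compare them as sets. The snippet below is only a minimal sketch in Python: the identifiers are made-up placeholders (not real search results), and the function name compare_hits is hypothetical.

```python
def compare_hits(hits_a, hits_b):
    """Split two hit lists into shared hits and hits unique to each list."""
    a, b = set(hits_a), set(hits_b)
    return {
        "both": sorted(a & b),      # found through both interfaces
        "only_a": sorted(a - b),    # found only through the first interface
        "only_b": sorted(b - a),    # found only through the second interface
    }

# Placeholder identifiers for illustration only; replace with exported hit lists.
via_pubchem_interface = ["ID-1", "ID-2", "ID-3"]
via_reaxys_interface = ["ID-2", "ID-3", "ID-4"]

result = compare_hits(via_pubchem_interface, via_reaxys_interface)
print(result)
```

The hits that appear in only one list are the interesting ones, since they point to differences in how the two systems interpret the same query.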
From memory you were interested in nanomaterials and water-treatment, right? That being the case, you may wish to compare ways of searching for such topics in PubChem/Reaxys. Probably the way it would work is that Sunghwan would work with you on options in PubChem, while Anja/I would help you with Reaxys. However, while such an analysis may be of interest to you (and others working in the area), such a topic (and other very specific topics) may not be suitable for such analysis because it may turn out that you can only find this information in one of the products - and that can be found out very quickly.
Of course, one of the huge differences between PubChem and Reaxys is that one is in the public domain (and is funded effectively through "public money") whereas the other is a proprietary product (and is funded through charges to those who access it).
I don't want to set down rules at this stage; we'd rather interested students let us know, and once we have a group we would work out sub-projects together. Having said that, one of the things that does not seem to have been discussed during this Spring 2017 course is: out of all the available information "out there", which sources should I choose and when? We shall try to answer that question for PubChem/Reaxys, either together or separately, at least for the projects that we undertake together.
Damon
help
help
Please discuss your potential project with Prof. Belford.
Hi, Amita and Nwume,
Thank you for showing interest in this project. By the way, it seems that both of you are from UALR, and Professor Belford (at UALR) told me that each student at UALR would be required to work on a different project (for grading purposes), while it is okay for him or her to work with students at other schools. Since both of you are from UALR, I suggest that you discuss this issue with Professor Belford before starting work on it. Because your school is getting close to the end of the semester, I recommend that you do it as soon as possible (if both of you are still interested in this project).
student projects
Sorry for late reply. I was
See my comments below
How to design a project......
It seems that you are having a hard time figuring out where to start. To help you get started, let me tell you how you should approach it. Typically, a project has a specific goal or specific questions to answer, and it needs tasks/steps that you will take to achieve that goal.
I guess your goal is to compare PubChem and Reaxys, but what tasks should you do to achieve the goal? One issue here is that your goal is somewhat broad, making it difficult to identify the necessary tasks. The questions in your comment above are part of the process of narrowing down the scope of your study, which should make your goal more specific.
You asked how you can use the two databases to get useful information for your research on waste water treatment using nanomaterials. However, you did not really say "what specific information" (or specific analysis tools/services) you need for your research. Once you identify specific pieces of information relevant to your study, you can compare the two databases in the context of that information content. Do both databases provide what you need? Does one database have incorrect information? Is a particular type of information found in one database, but not in the other? So, essentially, you need to identify a set of specific questions that you can use when comparing the two databases.
Actually, you have asked me a series of questions, including:
While some of these questions are very subjective and ambiguous, I think they are a good starting point. Please define your questions more narrowly. For example, in the first question, what do you mean by "useful information"? By narrowing down your questions and searching for answers to them, you will be able to compare the two information resources.
Sorry for the confusion. My
Pick a question and plan a search
Thank you. I will follow the
QUATs and Cellulose Nanomaterials
Amita, greetings from Sydney! I am in a different time zone so was unable to reply earlier, although I note Anja and Dr Kim have commented on your project. Everything they have said is absolutely correct, in particular that you need to refine the project a little. A comment here is that there are a few steps in any search: first, define the question; second, understand the search landscape; third, proceed with the search(es).
Let me start from the third part first. Here you almost invariably need to trial a few searches first, so don't just do a single search and then stop.
The second part (the landscape) means that you need to have some idea what is in the database(s) and then start to think of searches based on that knowledge.
Which brings me to the first part, where my suggestion is that you focus on two general problems:
1. How to find relevant substances and then specific properties on them. I suggest you choose QUAT 188 (perhaps also include some related quaternary ammonium salts) and find biological activity information on them.
2. How to formulate and execute a (text) search relating to "removal and recovery of phosphorus and heavy metals (Selenium, Arsenic, cobalt, lead etc.) from waste water using renewable resources based nanomaterials prepared from wood or cellulose". In starting this search, you need to define the concepts then formulate a search based on a few of the concepts (the ones for which you feel you can search more efficiently).
Since you probably have not had a lot of experience with general strategies for the second problem, let me start you off with the concepts I see (and comments on how to search them):
- removal/recovery (authors/indexers would use all sorts of terms here, so I don't think this would be a good concept at least initially);
- phosphorus (problem here is that you don't mean 'phosphorus' (the element) at all - rather you mean phosphate. Herein lies a common problem in this field. Yes, I know that fertilisers are often defined in N:P:K content and I know some authors would write about phosphorus and mean phosphate - but the question is how to search for it. Again, I don't think this would be a good concept at least initially);
- heavy metals (similar problem. Some authors/indexers would talk about heavy metals, but others would be more specific - as you suggest: Se, As, Co, Pb etc. So yet again, I don't think I'd be searching for this concept at least initially);
- waste water (at last - something that is quite specific and commonly mentioned, although you have to think about searching wastewater/wastewaters and so forth);
- nanomaterials ('synonyms' may be nanosponges, nanoparticles, nano materials and so forth. These probably could be included in an initial search, and there is a very simple way to search them - right? :-) );
- wood/cellulose (this concept also would be worth considering in an initial search).
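To make the idea concrete, the three concepts that look searchable (waste water, nanomaterials, wood/cellulose) could be combined along the following lines. This is only a rough sketch in Python: it assumes a generic syntax with OR between synonyms, AND between concepts, quoted phrases, and `*` for truncation. The actual syntax differs between PubChem and Reaxys, so treat the output as a plan to adapt, not as a query to paste in. The function names are hypothetical.

```python
def or_group(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

def build_query(concepts):
    """Combine the concepts with AND so that every concept must be present."""
    return " AND ".join(or_group(terms) for terms in concepts)

# One list per concept; '*' marks truncation (assumed wildcard character).
concepts = [
    ["wastewater*", "waste water"],   # wastewater, wastewaters, waste water
    ["nano*"],                        # nanomaterials, nanoparticles, nanosponges, ...
    ["wood", "cellulose"],
]

query = build_query(concepts)
print(query)
# (wastewater* OR "waste water") AND (nano*) AND (wood OR cellulose)
```

Writing the concepts down as separate lists first makes it easy to add or drop a concept (for example, heavy metals) after a trial search.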
So, what do you think? If you agree with my suggestions, then the next thing I suggest you do is formulate specific search strategies in PubChem and in Reaxys, and then get them checked by Dr Kim and Anja/me. Damon
Finding Pharmacological data
Finding Pharmacological Data
Nanomaterials
Try to find the same information in PubChem, too.
Hi, Amita.
Try searching PubChem for the same compound. You will be able to find one record for the query. Go to its compound summary page and look into what kind of information is available, what is missing in PubChem that does exist in Reaxys, and vice versa. Then figure out how to go from this record to records in other databases (e.g., literature, protein, gene, disease, and so on). In this way, you will start noticing differences between the two databases.
psychotic disorder
See advice in comments above
psychotic disorder in Reaxys
Hi, Nwume,
First of all, you really need to have a specific question that you want to answer, and then you need to find a way to answer that question. When you don’t have a good question, you just can’t answer it. So, please write down your question so that it ends with a question mark, and then list the tasks that you need to do to answer that question. If you can’t identify a specific task (or tasks) for finding the answer, it is likely that your question is too broad, so you need to write a new question (or questions) that is narrower than the original one. Repeat this until you see what you really want to ask and how you can find the answer to it.
By the way, to find psychoactive drugs in PubChem, probably the best approach is to use the PubChem classification browser (https://pubchem.ncbi.nlm.nih.gov/classification).
(1) Go to the classification browser, select “MeSH” from the “Select classification” drop-down menu (on the top-left) and “Compound” for the “Data type counts to display” option.
(2) Type “antipsychotic agents” in the text box and run a search by hitting the search button. This returns ~9 MeSH terms with which any PubChem Compounds are annotated.
(3) Click the record count for one of the hits to explore (for example, “Antipsychotic Agents”). This gives you a list of 207 compounds annotated with the MeSH term “antipsychotic agents”.
(4a) Refine the resulting compound list using various filters available under “Refine your results” on the right column. You can retrieve the compounds with protein-bound experimental 3-D structures by clicking “Protein 3D Structure”, or retrieve the compounds with bioactivity data by clicking “BioAssays, Tested”.
(4b) Alternatively, you can click the “Structure Clustering” button under the “Actions on your results” section (on the top-right).
(4c) Another thing that you can try is to retrieve records in other databases associated with the antipsychotic drugs, using the dropdown menus in the “Find related data” section (on the right column). For example, by selecting “BioSystems” from the dropdown menu, you will get a list of pathways associated with the antipsychotic drugs.
Well, there are many other things you can do with PubChem as well as Reaxys, but you can get the most out of these resources only if you have a clear question that you want to answer. I think you need to come up with a good idea for one.
specific
Ziprasidone
searching documents on Reaxys
Truncation
Search compound in PubChem
Because you had a typo in your query.
It should be "chloro" not "chloor".
https://www.ncbi.nlm.nih.gov/pccompound/?term="(3-chloro-2-hydroxypropyl)trimethylammonium+chloride"
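If you want to avoid hand-typing such links, the URL can be built from the compound name. The snippet below is a small sketch in Python that reproduces the link above; it uses only the standard library, and the function name pccompound_url is made up for illustration. The name is wrapped in double quotes so it is searched as a phrase, and spaces become `+`.

```python
from urllib.parse import quote_plus

BASE = "https://www.ncbi.nlm.nih.gov/pccompound/"

def pccompound_url(name):
    """Build a phrase-search URL for the given compound name."""
    phrase = f'"{name}"'
    # Keep quotes and parentheses readable; encode spaces as '+'.
    return BASE + "?term=" + quote_plus(phrase, safe='()"')

url = pccompound_url("(3-chloro-2-hydroxypropyl)trimethylammonium chloride")
print(url)
```

Copying the name from a reliable source into the function, instead of retyping it, also reduces the chance of a typo like the one above.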
Thank you. I got it.
While using Reaxys and
Safety and hazard information in Reaxys