8. Building and Optimizing Information Search Strategies

Project mentors: Anja Brunner (Reaxys), Damon Ridley (Reaxys) and anyone else who would like to support the project!

At Elsevier, our searches in Reaxys are influenced by our knowledge of the content, structure and search options of our product. It is clear to us, however, that Reaxys users may approach finding information in Reaxys differently. So, we want to learn from you — users of Reaxys. How would you teach others to leverage the search capabilities of Reaxys? If you take on one of the following projects (or have an idea for another similar project), our hope is that you will explore the content and functions of Reaxys and develop some best practices or “tips and tricks” that help other users to take full advantage of what Reaxys has to offer.

OPERATORS AND TRUNCATION IN SEARCHES

Most of us know the Boolean operators AND, OR and NOT. In a search, these operators offer different ways of linking query terms and specifying how the terms relate to the resulting hits. In the same way, NEAR, NEXT and PROXIMITY help refine the input criteria for a search. Truncation also serves to optimize a search strategy by retrieving a broader range of information that shares a common element, such as the stem of a word. Another form of truncation is entering ? or * in a formula to be searched in Reaxys.
(1) How does Reaxys interpret operators and truncations?
(2) What is the impact of these different operators on the outcome of a search?
(3) How exactly does a hit set change depending on what operators or truncation are used?
(4) What rules can one follow in their use?
(5) How are operators implemented in other search engines/databases?

Prepare a set of screencasts to explain the role of operators and truncation in search strategies.
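
As a starting point for thinking about questions (2) and (3), the effect of operators and truncation on a hit set can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the four "documents" are invented titles, and the functions `search` and `near` are hypothetical helpers. The sketch models Boolean operators as set operations and truncation as prefix matching; it does not describe how Reaxys itself interprets queries.

```python
import re

# Toy corpus: invented titles for illustration, not Reaxys records.
DOCS = {
    1: "chloroquine resistance in malaria parasites",
    2: "artemisinin resistant strains of Plasmodium falciparum",
    3: "synthesis of artemisinin analogues",
    4: "malaria vaccine trials",
}


def tokens(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(term):
    """Return ids of documents containing the term.

    A trailing * is treated as right-hand truncation (prefix match).
    """
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        matches = lambda word: word.startswith(stem)
    else:
        matches = lambda word: word == term
    return {doc_id for doc_id, text in DOCS.items()
            if any(matches(word) for word in tokens(text))}


def near(a, b, max_distance):
    """Return ids of documents where a and b occur at word positions
    that differ by at most max_distance, in either order."""
    hits = set()
    for doc_id, text in DOCS.items():
        words = tokens(text)
        pos_a = [i for i, w in enumerate(words) if w == a.lower()]
        pos_b = [i for i, w in enumerate(words) if w == b.lower()]
        if any(abs(i - j) <= max_distance for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits


and_hits = search("artemisinin") & search("resistant")   # AND narrows
or_hits = search("malaria") | search("plasmodium")       # OR broadens
not_hits = search("artemisinin") - search("synthesis")   # NOT excludes
trunc_hits = search("resist*")                           # resistance, resistant
near_hits = near("resistance", "malaria", 2)             # proximity

print(and_hits, or_hits, not_hits, trunc_hits, near_hits)
```

In this toy model, AND and NOT shrink the hit set, OR and truncation enlarge it, and the proximity search keeps only documents where the two words sit close together. Comparing such expectations with what Reaxys actually returns is one way to approach question (1).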

BUILDING AN EFFECTIVE SEARCH STRATEGY

We are all used to the ease with which we enter a phrase into Google and get relevant answers to our question. At the same time, we also know that the long list of hits that emerges includes a lot of irrelevant results and we rarely go beyond the first couple of hit pages. A search for scientific information can be quite complicated. Ideally, we build a search to retrieve only relevant hits, but at the same time ensure that the answers we get are comprehensive.

A search strategy is our approach to finding answers to a question. In a natural language search engine, that approach may be figuring out the best way to phrase our question. In a user interface like Query builder in Reaxys, it involves figuring out what type of information we are looking for, what terms we should query, and how we will connect the search fields used. Another aspect of a search strategy may be any form of processing we do with the results from a search -- like combining hit sets from two or more searches, or filtering or analyzing hit sets.

Defining a search strategy can be difficult. It is influenced by the type of question asked, the type of search engine or system used, and the knowledge context of our question -- a search for a particular reaction can be approached in different ways depending on what we know about the reaction itself.

So how do you build a search strategy? What steps do you follow? How do you inform your approach and where do you find the right query terms to use?

Pick a question and show how you can use Reaxys to optimize your search strategy to answer the question. Just to narrow the scope of the project, focus on a specific type of search:
(1) search for information on a particular chemistry topic
(2) search for a substance or group of substances that meet certain criteria
(3) search for properties of substances that would help you identify an unknown
(4) search for a reaction and figure out how to optimize it

Use screencasts to show your thought process and the steps in generating optimal sets of answers. Based on your exploration, build guidelines you can share with others for building effective search strategies.

Comments 10

Olcc S15 | Thu, 03/16/2017 - 14:23
Greetings, my name is Daniel. I am one of Dr. Belford's students and I decided to choose Reaxys as my special project. After working with Reaxys at the start of the semester, I noticed that search strategies are quite important, especially for substances and materials that meet certain criteria. My focus will be on using screencasts to show my thought process and steps in searching Reaxys, and also the difficulties I encountered in searching for specific compounds. Thanks

Anja Brunner | Fri, 03/17/2017 - 02:36

Hi Daniel, Damon and I are happy to see interest in this project as it is a topic that is quite important to us. We really feel that we can learn quite a bit from users exploring Reaxys. Let's see who else joins the project and then we can have a coordinated discussion about specific topics each person (or a team) would like to tackle. In the meantime, perhaps you can plan out a topic or type of search you want to focus on. Looking forward to working with you! Anja

OLCC S17 | Mon, 04/10/2017 - 11:26
My name is Esther. I am a student in Dr. Belford's Cheminformatics class. I wish to write my project on drug resistance in malaria; I feel there is more to malaria than I know. I welcome any other ideas regarding my project. Looking forward to hearing back from you shortly. Thank you.

Damon Ridley | Mon, 04/10/2017 - 20:45
Esther, thank you for your interest in the project "Building and Optimizing Information Search Strategies" and for your suggested topic, "drug resistance in malaria". The topic clearly is a very important one, but the issue is that there are thousands of papers/patents published on the topic each year, so we first need to narrow the scope of the topic - but that in itself is one of the aspects of building search strategies!
To do this, we'd first like to know a little about you and your interest in the topic. Are you an undergraduate, postgraduate, or researcher in the field? Is your interest 'a general one' or do you have a specific aspect in mind?
If your interest is a general one, then to build the search strategy we first need to define the concepts, and in this case there would be three concepts: "drug"; "resistance"; "malaria". Next we need to ask whether each of these concepts is (equally?) important - and in this case I think you'd probably answer: "Yes". Assuming that, the next step in building the search strategy is to think of synonyms for each of the concepts.
For "drug" there would be many options, including text terms such as pharmaceuticals, names of substances (e.g., chloroquine or artemisinin), or classes of substances, where the classes could be defined by text terms such as antimalarials or antiplasmodials, or by structures such as "substances with the artemisinin substructure". For "resistance" the most common terms would be resistance/resistant - but these could easily be searched by use of a truncated term such as resist*. Lastly, you may think it would be simple to search the term "malaria" - but, no, since while that's a commonly accepted/recognised term, what scientists really are talking about is a group of Plasmodium species and in particular P. falciparum. The trouble here is that there are various strains/clones of P. falciparum, such as the Sierra Leone clone, the Indochina clone, and a bunch of other clones - named or unnamed.
I am not trying to make things complicated; rather, I'm being realistic and pointing out to you that chemical information retrieval really is a science (hmmm... there is also an 'art' component) that needs to be studied, and that researchers who rely on Google, or researchers who don't think the topic needs to be studied, are almost certainly not going to get the information they need or should get.
I suggest, then, that we focus on a more specific aspect and that we imagine we are serious researchers who want first to understand 'everything' about what is published on that specific aspect and then develop research projects based on this understanding and knowledge. We could focus on chloroquine resistance or pyrimethamine resistance or the like, but basically a current frontline research area is artemisinin (and its analogues). While at present most Plasmodium species have not found a way to develop resistance to the artemisinins, the world still needs to assume that this will happen and we'd better be prepared for it. One aspect of current research in the field is to investigate artemisinin combination therapies, and this may be a good area to research for your project.
If you agree, then let Anja or me know and we'll put together a plan of things for you to do. If, however, you would like to research another specific aspect, then that's fine - just let us know what you would like to do and we can work together on how to approach the topic. After all, what we want to do is to understand the literature in a specific area and then develop generic workflows. Looking forward to hearing from you. Damon
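
The concept-based approach described in this comment (split the question into concepts, collect synonyms for each concept, then combine) can be sketched in Python as follows. This is only an illustration under stated assumptions: the synonym lists are taken from the comment, the helper names `concept_block` and `build_query` are hypothetical, and the output is generic Boolean notation that has not been checked against Reaxys query syntax.

```python
# Synonyms for each concept, taken from the discussion above.
# The query notation is generic Boolean syntax, NOT verified Reaxys syntax.
CONCEPTS = {
    "drug": ["pharmaceuticals", "antimalarials", "chloroquine", "artemisinin"],
    "resistance": ["resist*"],
    "malaria": ["malaria", "Plasmodium", "P. falciparum"],
}


def quote(term):
    """Put multi-word terms in double quotes so they stay together."""
    return f'"{term}"' if " " in term else term


def concept_block(synonyms):
    """Join the synonyms for one concept with OR (any of them may match)."""
    joined = " OR ".join(quote(s) for s in synonyms)
    return f"({joined})" if len(synonyms) > 1 else joined


def build_query(concepts):
    """Join the concept blocks with AND (every concept must be present)."""
    return " AND ".join(concept_block(s) for s in concepts.values())


query = build_query(CONCEPTS)
print(query)
```

Synonyms within a concept are linked with OR because any one of them is acceptable, while the concepts themselves are linked with AND because all three were judged to be important. Narrowing the topic, for example to artemisinin only, amounts to shortening the "drug" list.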

OLCC S17 | Mon, 04/10/2017 - 23:08
Hi Damon, thank you for your message; I have read between the lines. I am a graduate student. Could I please have your email address? We are going to have more conversations, and it would give me direct access. Thank you.

Damon Ridley | Mon, 04/10/2017 - 23:14
Esther, great to hear from you. Can you ask Professor Belford to give you my email address? Yes, I am very happy to communicate with you directly. Damon

OLCC S197 | Sun, 04/23/2017 - 00:54
Please, how can I search for drugs that inhibit amino acid synthesis using Reaxys or PubChem?

Damon Ridley | Sun, 04/23/2017 - 03:45
Hi OLCC197, you have asked a very general question and I think you have to tell us more precisely what you want to do. I presume you are well aware that there are a number of naturally occurring amino acids and that the biosynthesis of each one of them is done in a number of steps, each of which involves a very specific enzyme. So can you think about what you really want to find and then let us know?
Another suggestion I have is that you read the general documents published in Part IV: Exploring Reaxys, in particular the one on Substance Records and the one on Text Searches. These largely teach the "search mechanics", and when you have a good grasp of these you can try a few of the workflows and see what you get. Then you will better understand why you need to narrow your search intent.
Having said that, on Reaxys you can easily find the substances (=inhibitors) that have been tested on an enzyme. In the first instance you can try to search for the enzyme involved in the (bio)synthesis in the Substance Basic Index (available through Query builder => Search properties: Substance Basic Index, then click, drag and drop the field into the main work space). So please define your question more precisely. Damon