SAO/NASA ADS Help Pages
2.2 - Abstract Query Form

The Abstract Query Form provides basic access into the ADS abstract databases. The query form is divided into three sections: the main search parameters, the filters, and the settings.

2.2.1 - Introduction

The ADS search engine has a number of features which have been developed to improve searching our databases. These include the use of synonyms, stop words, and word translations.

2.2.1.1 - Synonyms

By default, each search term specified in the title, abstract, or author query field triggers a search for all the term's synonyms as well as the term itself. Synonym searches for text words equate different tenses of the same verb, different genders of the same name, and different words with the same meaning. In addition, we use synonyms to perform foreign language translations, equating foreign words with their English counterparts. This feature can be turned off by modifying the default settings (for more information please see the section Synonym Replacement below).

For authors, synonym replacement allows the user to search on an author's name and get back any alternative or incorrect spellings of which we are aware. For example, the author synonym pair A'HEARN, M and AHEARN, M allows users to search on either spelling and retrieve abstracts which contain either spelling.

If you are aware of an author name, misspelling, or foreign word translation which does not seem to be working correctly, please inform us by email to ads@cfa.harvard.edu.

2.2.1.2 - Stop Words

The ADS search system recognizes certain words as being not important for searches and removes those words from a search. These are words used with great frequency in the English language, as well as adverbs, prepositions, and any other words not carrying a significant meaning when used in a scientific context. There are two types of stop words: case-insensitive stop words and case-sensitive stop words.
The former are words like "much" and "but". They are ignored in a search. The latter are words that are stop words in some contexts but are significant in others. For instance, "and" is a stop word, but "And" is significant. Since "and" does not occur at the beginning of a sentence, we assume that anytime "And" is in the text, it means Andromeda. The list of stop words is available on-line.

2.2.1.3 - Search Word Translations

Finally, we perform word translations for certain terms or patterns which are conventionally written in a few different ways. This is most commonly done with astronomical object names (e.g. M 31 and M31), as well as some composite words (e.g. X RAY, X-RAY and XRAY). The list of words which are currently translated is available. If you are aware of others which should be included, please inform us by email to ads@cfa.harvard.edu.

2.2.1.4 - Phrase Searches

In the title and text fields, phrase searches can be specified by enclosing several words in either single or double quotes, or by concatenating them with periods (".") or hyphens ("--"). All of these accomplish the same goal of searching the database for references that contain the specified sequence of words. The database is indexed for two-word phrases in addition to single words. A phrase with more than two words is treated as a search for a set of two-word phrases: the first and second word form the first phrase, the second and third word form the second phrase, etc. For instance, "black hole" searches for the words black and hole next to each other.

2.2.1.5 - Wildcard Searches

In order to be able to search for families of words, a limited wildcard capability is available. Two wildcard characters are defined: the question mark "?" specifies a single wildcard character, and the asterisk "*" specifies zero or more wildcard characters. The "?" can be used anywhere in a word. For instance, a search for M1? will find all Messier objects between M10 and M19.
A search for a?sorb will find references with absorb as well as adsorb. The asterisk can only be used at the beginning or at the end of a word. For instance, 3C* searches for all 3C objects, and *sorb searches for words that end in sorb, like absorb, desorb, etc. When synonym replacement is on, all their synonyms (e.g. absorption) will be found as well. The "?" and the "*" can be combined in the same search string.

2.2.2 - Search Fields

The Abstract Service Form allows users to enter keywords to be searched for in one or more of its search fields: Authors, Title, Abstract text, and Object Names (for Astronomy only). While author searches are case-insensitive, case does matter when searching the object name field and the title and abstract fields. Words that are entered in all uppercase are considered acronyms and will only match the same uppercase acronyms as found in the literature, while words entered in lowercase (or mixed case) will match words regardless of the case used in the original document. For example, compare the difference in results when searching for "FUSE" (an acronym for the Far Ultraviolet Spectroscopic Explorer) rather than "fuse".

2.2.2.1 - Authors

Authors may be entered one per line, or on the same line separated by semicolons. Authors may be entered by last name only, by last name and first initial, by last name and first name, or by last name, first name and middle name. Any of the following will work (the space after the comma is optional):

smith, arthur james
smith, arthur j
smith, arthur
smith, a james
smith, a j
smith, a
smith

The search will return all author names that potentially match the query string.
For instance, smith, arthur james will return articles with any of these author names:

Smith, Arthur James
Smith, Arthur J
Smith, A James
Smith, A J
Smith, Arthur
Smith, A

but not with these author names:

Smith, Arthur Joseph
Smith, Arthur M
Smith, Andrew James
Smith, Andrew
Smith, A M
Smith, C J

It is possible to disable this search behaviour and only search for a particular author string by checking the Exact name matching button above the author search box. This will cause the search engine to return only records for which the author has been entered as smith, arthur james. To find author names with different spellings, use the ADS Author Name Query form capability, which will display all the spellings matching a particular author name template.

You can search only for articles that contain the specified author as first author by specifying a "^" before the author name. For instance, ^smith, a searches for all articles with A. Smith as the first author.

In addition, you can search only for articles that contain the specified author as last author by specifying a "$" after the author name. For instance, smith, a$ searches for all articles with A. Smith as the last author.

A combination of these two specifications allows you to search for articles with just one author: ^smith, a$ searches for all articles with A. Smith as the only author.

The search:

smith, a #
smith, arthur #

finds only authors without middle initial or middle name.

Note: Authors whose last names contain an umlaut may be entered under both possible English spellings (e.g. Boehm and Bohm), as well as with the umlaut (Böhm).

2.2.2.2 - SIMBAD/NED/ADS Object Names/Position

The Object Names/Position search field (available only in the Astronomy and Astrophysics Abstract search form) allows users to search the literature for bibliographic records relevant to one or more astronomical objects or for a specified position on the sky.
Object names may be entered one per line, or on the same line separated by semicolons. By default, an object name query searches the contents of the ADS abstract database as well as the SIMBAD and NED services, which are databases of galactic (SIMBAD) and extragalactic (SIMBAD and NED) objects. The resulting list consists of papers which are relevant to the specified object(s). To get a list of object aliases, enter an object name in the appropriate window of the List Query.

Alternatively, a position on the sky and an optional radius can be specified for the search. Position searches locate the papers dealing with celestial objects located within the specified radius of the specified position. They work through the SIMBAD and NED databases, and can be combined with the other search criteria. The syntax for position searches is:

RA ±Dec : radius

where RA and Dec are right ascension and declination J2000 positions, expressed in decimal degrees or in sexagesimal notation (hours, minutes, seconds for RA and degrees, arcminutes, arcseconds for Dec). The plus or minus sign before the declination is mandatory. The search radius may be given in decimal or sexagesimal degrees. For example, a 10' radius may be written as 0.1667 or 0 10. The default search radius is 2' (0 2 = 0.033 deg).

2.2.2.3 - Publication Date

The Publication Date should be entered as two integers in the form MM and YYYY (e.g. 12 and 1988). For those cases when there was no month available, the month is entered in the database as 00. If no date is entered, the program will default to the full date range. If no From month is entered, the program will enter a default value of 00. If no To month is entered, the program will enter a default value of 12. The limit by publication date is inclusive, meaning that an end date of 08 2000 will include all articles published before and in August 2000.

2.2.2.4 - Title

This allows the user to search for words found in the title of the paper.
The user can select 'OR', 'AND', Simple Logic, or Boolean Logic.

2.2.2.5 - Abstract Text

This allows the user to enter any combination of words or sentences they choose. The search will be done on individual words (except for some of the most common English words such as "the", "a", and "and"). This is extremely useful for entering the text of an abstract from a previous query to find all the papers most related to that paper. All title words and keywords are also indexed together with the abstract text words. The user can select 'OR', 'AND', Simple Logic, or Boolean Logic.

2.2.2.6 - Number of Abstracts to Return

If you would like to see more than the first 100 references, change the appropriate number on the query form in the line Return __ Items. To retrieve the next set of references if the maximum count was exceeded, change the number in starting with number __ to the number from which you want to resume retrieving abstracts. Please note that currently our system will not return more than 500 records at a time.

2.2.3 - Filters

The filters section allows the user to select specific abstracts from the results of the search.

2.2.3.1 - Entry Date

The Entry Date allows the user to select data which have been entered in the database since a given date. Entering -31 in the Day field will select all new entries in the past month.

2.2.3.2 - Minimum Score

The Minimum Score allows the user to select data which have a score greater than the entered minimum score. This is most useful when used in conjunction with the entry date. See the What's New Service described below.

2.2.3.3 - Select References From

The default setting returns all abstracts which fit the query criteria. To select only those journals which are refereed (i.e. omit abstracts from publications such as conference proceedings, bulletins, newsletters), choose the Select References From: all refereed journals option.
The checkbox Select only articles removes entries that are not regular articles, for instance meeting abstracts, observing proposals, catalog descriptions, etc.

Users may select only from specific journals by choosing Select References From: All bibliographic sources and entering the bibstems of the journal(s), separated by a comma or a semicolon. To use this field, you must know the abbreviation for the journal. These abbreviations are listed in the journal abbreviations file and are linked to publications. Selecting one of the journals from that list will automatically put the abbreviation in this field.

The journal abbreviation can be prepended with a "-". This will return only references that are not from the specified journal(s). Prepending a journal abbreviation with a "+" will add this journal to the selected category (either refereed or non-refereed). You can also use the "?" as a wildcard, for example, to include all books with "?????books", or to exclude proposals with "-?????prop".

2.2.3.4 - Select References With

This allows users to select references which contain links to other information. You can select references that have all, any one, or none of the selected links (such as full article text, original author abstracts, electronic versions, data tables, etc.).

2.2.3.5 - Select References From Group

The Abstract Service Astronomy and Astrophysics search form contains a Group selection section, which gives the user the ability to have the results of a query limited to a set of abstracts which we have designated as a bibliographic group. The groups currently in use by the Abstract Service include the following:

2.2.4 - Sorting

The sorting section allows the user to sort the results list according to different criteria. The following sorting options are available.

2.2.4.1 - Sort by Score

This option sorts the results list according to the score, the measure of how well each article matches the query. This is the default.

2.2.4.2 - Sort by Normalized Score

This option sorts the results list according to the score, normalized to the number of authors in the article. This weights articles with fewer authors higher.

2.2.4.3 - Sort by Citation Count

This option sorts the results list according to the number of citations that each article has. It will return the most cited article in the list first.

2.2.4.4 - Sort by Normalized Citation Count

This option sorts the results list according to the number of citations that each article has, normalized to the number of authors for the article. It weights the citation count such that articles with fewer authors are deemed more important.

2.2.4.5 - Sort by First Author Name

This option sorts the results list alphabetically according to the last name of the first author.

2.2.4.6 - Sort by Date (Most Recent First)

This option sorts the results list according to publication date, with the most recent articles returned first.

2.2.4.7 - Sort by Date (Oldest First)

This option sorts the results list according to publication date, with the oldest articles returned first.

2.2.4.8 - Sort by Entry Date

This option sorts the results list according to the date at which the record was entered in the ADS database.

2.2.5 - Settings

The settings section allows the user to change default query conditions such as the logic of the query. The following settings are available.

2.2.5.1 - Require Field For Selection

This determines how results are combined when multiple fields are being searched.
If Require Field for Selection is turned on for a specific search field, a bibliographic record will fulfill a query only if the search term(s) entered in the field are found in the record. For example, if you specify an author and a title word in the respective search fields and then check YES for Require Field For Selection under the Authors column, all abstracts which are retrieved must contain that author. When this is not turned on, abstracts which do not contain that author, but which contain the title word, will also be returned (i.e. the default is to OR across fields).

2.2.5.2 - Synonym Replacement

If this is turned on, the synonym list will be used to replace words. This corrects for misspellings in the abstract text or author names, and equates words such as accelerate, accelerated, and acceleration (see also Synonyms).

It is also possible to turn synonym replacement on or off for individual words within a query. By default, synonym replacement is done for all words. To exclude a word from synonym replacement, use the "=" sign before that word (to exactly match that word and no synonyms). If you have turned off synonym replacement but want it turned on for a given word, use the "#" sign before the word to turn synonym replacement on for this particular word.

2.2.5.3 - Relative Weights

This is the relative weighting of the fields used in calculating the scores. If you want to give more weight to the authors than the title words, for example, these numbers would be changed. It is also possible to make one of these fields 0.0, in which case that item will not be considered when the relative scores of each abstract are calculated (the scores are descriptions of how well a given abstract matches the query conditions, normalized to 1.0). A negative weight will cause the result to return first all matching references that do not contain the field with the negative weight.
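
The effect of relative weights on the total score can be illustrated with a small sketch. The text above does not give the exact formula used by the ADS search engine, so the weighted average below is an assumption made for illustration only; the function name combine_field_scores and the field names are hypothetical, and negative weights are deliberately not modeled.

```python
def combine_field_scores(field_scores, weights):
    """Combine per-field scores into one score normalized to 1.0.

    field_scores: dict mapping a field name to its score in [0, 1].
    weights: dict mapping a field name to its relative weight.
    ASSUMPTION: a plain weighted average; the real ADS formula may differ.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for field, score in field_scores.items():
        weight = weights.get(field, 1.0)
        if weight <= 0.0:
            # A weight of 0.0 removes the field from the calculation.
            # Negative weights are not modeled in this sketch.
            continue
        weighted_sum += weight * score
        weight_total += weight
    if weight_total == 0.0:
        return 0.0
    return weighted_sum / weight_total


scores = {"authors": 1.0, "title": 0.5}
print(combine_field_scores(scores, {"authors": 1.0, "title": 1.0}))  # 0.75
print(combine_field_scores(scores, {"authors": 3.0, "title": 1.0}))  # 0.875
print(combine_field_scores(scores, {"authors": 1.0, "title": 0.0}))  # 1.0
```

Raising the author weight pulls the total toward the author score, and a weight of 0.0 takes the title out of the calculation entirely.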

2.2.5.4 - Use For Weighting

This determines if the field should be used in calculating the scores. If this is turned on, then this search field (authors, title, etc.) will be included in the calculation for the total score of each abstract. If it is not turned on, the search field will be used for selection as specified in the other flags but will be ignored for the score calculation. This serves two main functions: to selectively turn off portions of a complex query without editing the query, and, in conjunction with the Require Field for Selection flag, to permit a Boolean search on a field to be combined with a relevancy ranking.

2.2.5.5 - Weighted Scoring

This sets how the score for each abstract is calculated. If this is not turned on, the scoring is straight: each query condition that is met (a hit, e.g. if the abstract contains a word specified in the query) receives a score of 1 (before normalization). If this is turned on, the scoring is weighted: the score for a hit is the inverse log of the frequency of the specified condition (e.g. title word) in the total database. This gives a higher score to hits of rarely used words, since they are presumably a stronger search criterion. The final score is then normalized so that a reference that fulfills every query input gets a score of 1.

2.2.5.6 - Example

Assume there is a list of desired authors in the author field in the Abstract Query window. Assume that one wants papers where at least one of the desired names is an author, but does not want the paper to appear more relevant if more than one desired author is a co-author. In the Author Query Settings window, you would turn off Use for Weighting for the Author field and turn on Require Field for Selection for the author field. This would allow the rest of the query to determine the relevancy score, but return only abstracts with at least one of the authors.
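
The difference between straight and weighted scoring (section 2.2.5.5) can be sketched as follows. This is a minimal illustration, not the ADS implementation: the function name abstract_score and the word counts are hypothetical, and "inverse log of the frequency" is read here as 1 / ln(1 + count), where count is the number of database records containing the word. The base of the logarithm and the "1 +" offset are assumptions.

```python
import math


def abstract_score(query_words, abstract_words, database_counts, weighted):
    """Score one abstract against a list of query words.

    query_words: list of words entered in the query.
    abstract_words: set of words found in the abstract.
    database_counts: dict mapping a word to the number of records using it
        (hypothetical numbers, for illustration only).
    weighted: False for straight scoring, True for weighted scoring.
    """

    def hit_value(word):
        if not weighted:
            return 1.0  # straight scoring: every hit counts 1
        count = max(database_counts.get(word, 1), 1)
        # ASSUMPTION: "inverse log of the frequency" = 1 / ln(1 + count)
        return 1.0 / math.log(1.0 + count)

    # Best possible score: every query word is a hit.
    best = sum(hit_value(word) for word in query_words)
    if best == 0.0:
        return 0.0
    achieved = sum(hit_value(word) for word in query_words
                   if word in abstract_words)
    # Normalize so that an abstract matching every word scores 1.
    return achieved / best


counts = {"galaxy": 100000, "magnetar": 50}
query = ["galaxy", "magnetar"]
print(abstract_score(query, {"magnetar"}, counts, weighted=False))  # 0.5
print(abstract_score(query, {"magnetar"}, counts, weighted=True))
print(abstract_score(query, {"galaxy"}, counts, weighted=True))
```

With straight scoring, matching either word gives the same score. With weighted scoring, the abstract that matches the rare word "magnetar" scores higher than the one that matches the common word "galaxy".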

2.2.6 - Saving Queries

If you find yourself frequently modifying any of the default settings found on the Abstract Query Forms (for instance, if by default you want to only search for refereed papers), you may want to consider customizing the search form to fit your needs. If instead you find yourself frequently repeating the same type of query (for instance, to see the updated list of papers published on a particular topic), you may want to save one or more of such queries so you can easily re-execute them.

To create customized form(s), fill in the fields that will be the basis of a customized search query. Then press the Store Default Form button on the query page and the form will be saved on the ADS server. It will return a URL that can be used to recall this filled-in query form.

You can also download a filled-in query form by clicking on Return Query Form. You can then save this query form locally or simply bookmark it. Bookmarking such a filled-in form is the best way to save several different query forms: as all the necessary query parameters are specified in the URL of the document, there is no need to store this form anywhere.

Note: if you do bookmark different query forms, you will want to change their titles so that you can distinguish them. Netscape users can do this by selecting "Bookmarks" -> "Edit Bookmarks" -> "Edit" -> "Bookmark Properties".

2.2.7 - Query Logic

This section describes how to compose complex queries by using the different logic operators supported by the ADS search engine. To modify the default search logic, please use the appropriate check buttons, which can be found right above the input search box for the field in question.

2.2.7.1 - OR Queries

When OR is selected, abstracts are returned that contain any of the words specified in the search field. The score of the search depends on how many words are matched.

2.2.7.2 - AND Queries

When AND is selected, abstracts are returned that contain all of the words specified in the search field. The score of the search is 1 (since all words have been matched).

2.2.7.3 - Simple Logic Queries

The simple logic recognizes "+" and "-" before the search words. To require a word to be found in a search, it needs a "+" in front of it. A "-" before a word indicates that only references that do NOT contain that word should be returned. "and", "or", and "not" are stop words and will be ignored in the simple logic. Example:

+contact +binaries -eclipsing

This will search for references that contain both of the words contact and binaries, but not the word eclipsing.

2.2.7.4 - Full Boolean Queries

This allows more complex queries than just combining all words with OR or AND. The allowed boolean operators are: "and", "or", "not", "(", and ")". They can be used in any combination (as long as "(" and ")" match). For example, the query:

(redshift or survey) and not galaxy

finds all references that contain either redshift or survey, but not galaxy. The order of precedence of the operators is (...), not, and, or. For larger numbers of search terms, this type of search will be slower than regular searches, especially if the "not" operator is used. Regular scoring is done on any terms that are combined with "or". Combinations with "and" and "not" are scored as 1. Note that if you want to use "not", it is best if it is not the first search argument, since this requires every abstract in the system to be selected and propagated through the following selection process.
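
The operator precedence described for full Boolean queries, (...), not, and, or, can be made concrete with a small evaluator. This is a sketch for illustration only, not the ADS search engine: the function name matches is hypothetical, and the sketch tests a single abstract (given as a set of lowercase words) without synonym replacement, stop word removal, or scoring.

```python
import re


def matches(query, words):
    """Return True if the set of lowercase words satisfies the Boolean query.

    Precedence, highest first: parentheses, not, and, or.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", query.lower())
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():  # lowest precedence
        result = parse_and()
        while peek() == "or":
            take()
            right = parse_and()  # always parse the right side first
            result = result or right
        return result

    def parse_and():
        result = parse_not()
        while peek() == "and":
            take()
            right = parse_not()
            result = result and right
        return result

    def parse_not():
        if peek() == "not":
            take()
            return not parse_not()
        return parse_term()

    def parse_term():  # a parenthesized group or a single search word
        token = peek()
        if token is None or token in (")", "and", "or"):
            raise ValueError("search word expected")
        take()
        if token == "(":
            result = parse_or()
            if peek() != ")":
                raise ValueError("unbalanced parentheses")
            take()
            return result
        return token in words

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result


query = "(redshift or survey) and not galaxy"
print(matches(query, {"redshift", "cluster"}))  # True
print(matches(query, {"survey", "galaxy"}))     # False
```

Without the parentheses, "redshift or survey and not galaxy" is read as "redshift or (survey and (not galaxy))", so an abstract containing both redshift and galaxy would match.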