Astrophysics Data System

SAO/NASA Astrophysics Data System
Producer: Smithsonian Astrophysical Observatory for the National Aeronautics and Space Administration (United States)
History: 1992 to present
Access
Cost: Free
Coverage
Disciplines: Astronomy and physics
Record depth: Index & abstract & full-text
Geospatial coverage: Worldwide
Links
Website: ui.adsabs.harvard.edu

The SAO/NASA Astrophysics Data System (ADS) is a digital library portal for researchers in astronomy and physics, operated for NASA by the Smithsonian Astrophysical Observatory.[1] ADS maintains three bibliographic collections containing over 20 million records, including all arXiv e-prints. Abstracts and full text of major astronomy and physics publications are indexed and searchable through the portal.

Historical context

The importance of recording and classifying earlier astronomical knowledge and works was recognized in the 18th century, with Johann Friedrich Weidler publishing the first comprehensive history of astronomy in 1741 and the first astronomical bibliography in 1755. This effort was continued by Jérôme de La Lande, who published his Bibliographie astronomique in 1803, a work that covered the time from 480 B.C. to the year of publication. It was followed by the Bibliographie générale de l'astronomie, Volumes I and II, published by J.C. Houzeau and A. Lancaster in Brussels between 1882 and 1889.[2][3]

As the number of astronomers and astronomical publications grew, bibliographical efforts became institutional tasks, first at the Observatoire Royal de Belgique, where the Bibliography of Astronomy was published from 1881 to 1898, and then at the Astronomischer Rechen-Institut in Heidelberg, where the yearly Astronomischer Jahresbericht was published from 1899 to 1968. After this date it was replaced by the Astronomy and Astrophysics Abstracts yearly book series, which continued until the end of the 20th century.

History

The first suggestion of a digital database of journal paper abstracts was made at a conference on Astronomy from Large Data-Bases, held in Garching bei München in 1987.[4][5][6][7] An initial version of ADS, with a database consisting of 40 papers, was created as a proof of concept in 1988. The ADS Abstract Service became available for general use via proprietary network software in April 1993, and was connected to SIMBAD a few months later. In early 1994, the ADS web-based service was launched, which effectively quadrupled the number of active users in the five weeks following its introduction.[8]

In 2011 the ADS launched ADS Labs Streamlined Search, which introduced facets for query refinement and selection. In 2013 ADS Labs 2.0 was introduced, featuring a new search engine, full-text search functionality, scalable facets and an API. In 2015 the new ADS, code-named Bumblebee, was released as ADS-beta. The ADS-beta system features a microservices API and client-side dynamic page loading served on a cloud platform. In May 2018 the beta label was dropped and Bumblebee became the default ADS interface, with some legacy features (ADS Classic) remaining available.[9] Development continues to the present day, with an extensible API that enables users to build their own utilities on top of the ADS bibliographic record.

The ADS service is distributed worldwide, with twelve mirror sites in twelve countries. The database is synchronized by weekly updates using rsync, a mirroring utility which transfers only the portions of the database that have changed. All updates are triggered centrally, but they initiate scripts at the mirror sites which "pull" updated data from the main ADS servers.[10]

Data in the system

At first, the journal articles available via ADS were exclusively scanned bitmaps created from the paper journals, with the abstracts created using optical character recognition software. Some of these scanned articles, up to around 1995, are available for free by agreement with the journal publishers,[11] with some dating from as far back as the early 19th century. Eventually, as online editions of journals became more widespread, abstracts began to be loaded into ADS directly.

Papers are indexed within the database by their bibliographic record, which contains the details of the journal they were published in and various associated metadata, such as author lists, references and citations. Originally this data was stored in ASCII format, but the limitations of this eventually encouraged the database maintainers to migrate all records to an XML (Extensible Markup Language) format in 2000. Bibliographic records are now stored as an XML element, with sub-elements for the various metadata.[10]
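The idea of a record stored as one XML element with metadata sub-elements can be sketched as follows. This is only an illustration: the element and attribute names (record, bibcode, title, author, reference) are assumptions made for the example and are not taken from the actual ADS schema.

```python
import xml.etree.ElementTree as ET


def build_record(bibcode, title, authors, references):
    """Build one bibliographic record as an XML element with metadata sub-elements.

    All element and attribute names here are hypothetical.
    """
    record = ET.Element("record", attrib={"bibcode": bibcode})
    ET.SubElement(record, "title").text = title
    for name in authors:
        ET.SubElement(record, "author").text = name
    for ref in references:
        ET.SubElement(record, "reference").text = ref
    return record


record = build_record(
    "2000A&AS..143...85A",
    "The NASA Astrophysics Data System: Architecture",
    ["Accomazzi, A.", "Eichhorn, G."],
    ["2000A&AS..143...41K"],
)
# Serialize to text; special characters such as "&" are escaped automatically.
xml_text = ET.tostring(record, encoding="unicode")
```

Parsing `xml_text` back with `ET.fromstring` recovers the same bibcode and sub-elements, which is the property that makes a structured format easier to extend than flat ASCII fields.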

Scanned articles are stored in TIFF format, at both medium and high resolution. The TIFF files are converted on demand into GIF files for on-screen viewing, and into PDF or PostScript files for printing. The generated files are then cached to avoid needlessly regenerating popular articles. As of 2000, ADS contained 250 GB of scans, which consisted of 1,128,955 article pages comprising 138,789 articles. By 2005 this had grown to 650 GB, and was expected to grow further, to about 900 GB by 2007.[11] As of 2005, no further figures had been published.
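Converting on demand and caching the result can be sketched as below. The class and the stand-in converter are hypothetical; a real system would call an image conversion tool where `fake_convert` returns placeholder bytes.

```python
class RenderCache:
    """Convert an article to a requested format once, then reuse the result."""

    def __init__(self, convert):
        self._convert = convert   # function (article_id, fmt) -> bytes
        self._cache = {}          # (article_id, fmt) -> converted bytes
        self.conversions = 0      # how many real conversions were performed

    def get(self, article_id, fmt):
        key = (article_id, fmt)
        if key not in self._cache:
            self._cache[key] = self._convert(article_id, fmt)
            self.conversions += 1
        return self._cache[key]


def fake_convert(article_id, fmt):
    # Stand-in for a real TIFF-to-GIF/PDF/PostScript conversion.
    return f"{article_id}.{fmt}".encode()


cache = RenderCache(fake_convert)
first = cache.get("article-1", "gif")
second = cache.get("article-1", "gif")   # served from the cache
pdf = cache.get("article-1", "pdf")      # different format, new conversion
```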

The database initially contained only astronomical references, but has now grown to incorporate three databases, covering astronomy references (including planetary sciences and solar physics), physics references (including instrumentation and geosciences), as well as preprints of scientific papers from arXiv. The astronomy database is by far the most advanced and its use accounts for about 85% of the total ADS usage. Articles are assigned to the different databases according to the subject rather than the journal they are published in, so that articles from any one journal might appear in all three subject databases. The separation of the databases allows searching in each discipline to be tailored, so that words can automatically be given different weight functions in different database searches, depending on how common they are in the relevant field.[10]
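One common way to weight a word by how rare it is in a collection is an inverse-document-frequency score. The sketch below uses that technique as an assumption to illustrate per-database weighting; it is not the weight function ADS actually uses, and the sample texts are invented.

```python
import math


def term_weights(documents):
    """Return a weight per word: words that are rarer in this collection weigh more."""
    n_docs = len(documents)
    doc_freq = {}
    for text in documents:
        for word in set(text.lower().split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n_docs / df) + 1.0 for word, df in doc_freq.items()}


# Invented miniature collections standing in for two subject databases.
astronomy = ["star formation in galaxies", "star clusters", "galaxy rotation"]
physics = ["laser cooling", "star network topology", "laser optics"]

astro_w = term_weights(astronomy)
phys_w = term_weights(physics)
```

Here "star" is common in the astronomy collection and rare in the physics one, so it receives a lower weight in the former and a higher weight in the latter.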

Data in the preprint archive is updated daily from the arXiv, the main repository of physics and astronomy preprints. The advent of preprint servers has, like ADS, had a significant impact on the rate of astronomical research, as papers are often made available from preprint servers weeks or months before they are published in the journals. The incorporation of preprints from the arXiv into ADS means that the search engine can return the most current research available, with the caveat that preprints may not have been peer-reviewed or proofread to the required standard for publication in the main journals. ADS's database links preprints with subsequently published articles wherever possible, so that citation and reference searches will return links to the journal article where the preprint was cited.[12]

Software and hardware

The software runs on a system that was written specifically for it, allowing for extensive customization for astronomical needs that would not have been possible with general-purpose database software. The scripts are designed to be as platform independent as possible, given the need to facilitate mirroring on different systems around the world, although the growing use of Linux as the operating system of choice within astronomy has led to increasing optimization of the scripts for installation on that platform.[10]

The main ADS server is located at the Center for Astrophysics | Harvard & Smithsonian in Cambridge, Massachusetts, and is a dual 64-bit x86 Intel server with two quad-core 3.0 GHz CPUs and 32 GB of RAM, running the CentOS 5.4 Linux distribution.[11] Mirrors are located in Brazil, China, Chile, France, Germany, India, Indonesia, Japan, Russia, South Korea, the United Kingdom, and Ukraine.[13]

Indexing

As of 2005, ADS receives abstracts or tables of contents from almost two hundred journal sources. The service may receive data referring to the same article from multiple sources, and creates one bibliographic reference based on the most accurate data from each source. The common use of TeX and LaTeX by almost all scientific journals greatly facilitates the incorporation of bibliographic data into the system in a standardized format, and importing HTML-coded web-based articles is also simple. ADS uses Python and Perl scripts for importing, processing and standardizing bibliographic data.[10]

The apparently mundane task of converting author names into a standard Surname, Initial format is actually one of the more difficult to automate, due to the wide variety of naming conventions around the world and the possibility that a given name such as Davis could be a first name, middle name or surname. The accurate conversion of names requires a detailed knowledge of the names of authors active in astronomy, and ADS maintains an extensive database of author names, which is also used in searching the database (see below).
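A deliberately naive sketch shows why the conversion is hard to automate. The function below is hypothetical and handles only two input shapes; the optional `known_surnames` set stands in for the kind of author-name database described above, and without it an ambiguous name such as "Davis Smith" is resolved by position alone.

```python
def normalize_author(name, known_surnames=frozenset()):
    """Convert a free-form name to an upper-case 'SURNAME, I' form (naive sketch)."""
    name = name.strip()
    if "," in name:
        # "Surname, Given names" form.
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        tokens = name.split()
        first_known = tokens[0].upper() in known_surnames
        last_known = tokens[-1].upper() in known_surnames
        if first_known and not last_known:
            # The name database says the first token is the surname.
            surname, given = tokens[0], " ".join(tokens[1:])
        else:
            # Fall back to assuming "Given names Surname" order.
            surname, given = tokens[-1], " ".join(tokens[:-1])
    initial = given[:1].upper()
    return f"{surname.upper()}, {initial}" if initial else surname.upper()


a = normalize_author("Michael J. Kurtz")
b = normalize_author("Kurtz, Michael J.")
c = normalize_author("Davis Smith")
d = normalize_author("Kurtz Michael", known_surnames={"KURTZ"})
```

Both spellings of the first author reduce to the same key, while the last call shows how a list of known surnames can override the positional guess.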

For electronic articles, a list of the references given at the end of the article is easily extracted. For scanned articles, reference extraction relies on OCR. The reference database can then be "inverted" to list the citations for each paper in the database. Citation lists have been used in the past to identify popular articles missing from the database; mostly these were from before 1975 and have now been added to the system.
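Inverting a reference database into citation lists, and using those lists to spot frequently cited papers that are missing from the database, can be sketched as follows. The paper identifiers and the `min_citations` threshold are invented for the example.

```python
from collections import defaultdict


def invert_references(references):
    """Turn {citing paper: [cited papers]} into {cited paper: [citing papers]}."""
    citations = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            citations[cited].append(citing)
    return dict(citations)


def missing_popular(citations, known, min_citations=2):
    """List papers cited at least min_citations times that are not in the database."""
    return sorted(
        paper
        for paper, citing in citations.items()
        if paper not in known and len(citing) >= min_citations
    )


references = {"A": ["B", "C"], "D": ["B", "X"], "E": ["X"]}
citations = invert_references(references)
gaps = missing_popular(citations, known={"A", "B", "C", "D", "E"})
```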

Coverage

The database now contains over fifteen million articles. In the cases of the major journals of astronomy (Astrophysical Journal, Astronomical Journal, Astronomy and Astrophysics, Publications of the Astronomical Society of the Pacific and the Monthly Notices of the Royal Astronomical Society), coverage is complete, with all issues indexed from number 1 to the present. These journals account for about two-thirds of the papers in the database, with the rest consisting of papers published in over 100 other journals from around the world, as well as in conference proceedings.[11]

While the database contains the complete contents of all the major journals and many minor ones as well, its coverage of references and citations is much less complete. References in and citations of articles in the major journals are fairly complete, but references such as "private communication", "in press" or "in preparation" cannot be matched, and author errors in reference listings also introduce potential errors. Astronomical papers may cite and be cited by articles in journals which fall outside the scope of ADS, such as chemistry, mathematics or biology journals.[14]

Search engine

An example of a complex search combining object, title and abstract queries with a date filter

Since its inception, the ADS has developed a highly complex search engine to query the abstract and object databases. The search engine is tailor-made for searching astronomical abstracts, and the engine and its user interface assume that the user is well-versed in astronomy and able to interpret search results which are designed to return more than just the most relevant papers. The database can be queried for author names, astronomical object names, title words, and words in the abstract text, and results can be filtered according to a number of criteria. It works by first gathering synonyms and simplifying search terms as described above, and then generating an "inverted file", which is a list of all the documents matching each search term. The user-selected logic and filters are then applied to this inverted list to generate the final search results.[15]
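A minimal sketch of an inverted file, and of applying AND or OR logic to it, is given below. The tokenization (lower-casing and splitting on whitespace) and the sample documents are simplifying assumptions for illustration.

```python
def build_inverted_file(documents):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, terms, logic="AND"):
    """Combine the document sets of all terms with AND (intersection) or OR (union)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    if logic == "AND":
        return set.intersection(*postings)
    return set.union(*postings)


documents = {
    1: "planetary nebula radius",
    2: "nebula expansion velocity",
    3: "stellar abundance",
}
index = build_inverted_file(documents)
both = search(index, ["nebula", "radius"], logic="AND")
either = search(index, ["nebula", "radius"], logic="OR")
```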

Author name queries

The system indexes author names by surname and initials, and accounts for the possible variations in spelling of names using a list of variations. This is common in the case of names including accents such as umlauts and transliterations from Arabic or Cyrillic script. An example of an entry in the author synonym list is:

AFANASJEV, V
AFANAS’EV, V
AFANAS’IEV, V
AFANASEV, V
AFANASYEV, V
AFANS’IEV, V
AFANSEV, V
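A lookup over such synonym entries can be sketched as follows, assuming each entry is held as a set of equivalent spellings. The data structure and function name are hypothetical; only the spellings come from the entry above.

```python
# Each entry groups spellings that should be treated as the same author.
AUTHOR_SYNONYMS = [
    {
        "AFANASJEV, V",
        "AFANAS’EV, V",
        "AFANAS’IEV, V",
        "AFANASEV, V",
        "AFANASYEV, V",
        "AFANS’IEV, V",
        "AFANSEV, V",
    },
]

# Precompute spelling -> group so that each lookup is a single dictionary access.
_GROUP_OF = {name: group for group in AUTHOR_SYNONYMS for name in group}


def expand_author(name):
    """Return every spelling to search for, or just the name if it has no entry."""
    return set(_GROUP_OF.get(name, {name}))


variants = expand_author("AFANASEV, V")
```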

Object name searches

The capability to search for papers on specific astronomical objects is one of ADS's most powerful tools. The system uses data from SIMBAD, the NASA/IPAC Extragalactic Database, the International Astronomical Union Circulars and the Lunar and Planetary Institute to identify papers referring to a given object, and can also search by object position, listing papers which concern objects within a 10 arcminute radius of a given Right Ascension and Declination. These databases combine the many catalogue designations an object might have, so that a search for the Pleiades will also find papers which list the famous open cluster in Taurus under any of its other catalogue designations or popular names, such as M45, the Seven Sisters or Melotte 22.[16]
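A position search of this kind reduces to an angular-separation test. The sketch below uses the haversine formula with coordinates in decimal degrees; the catalogue entries and their coordinates are invented, and nothing here reflects how ADS or its partner databases actually implement the search.

```python
import math


def angular_separation_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcminutes; all inputs in decimal degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding just above the valid asin range.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 60.0


def objects_within(catalog, ra, dec, radius_arcmin=10.0):
    """Names of catalogue objects within radius_arcmin of the given position."""
    return sorted(
        name
        for name, (obj_ra, obj_dec) in catalog.items()
        if angular_separation_arcmin(ra, dec, obj_ra, obj_dec) <= radius_arcmin
    )


# Hypothetical objects with made-up coordinates (RA, Dec in degrees).
catalog = {"obj_a": (56.75, 24.12), "obj_b": (56.80, 24.15), "obj_c": (60.0, 24.0)}
nearby = objects_within(catalog, ra=56.75, dec=24.12)
```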

Title and abstract searches

The search engine first filters search terms in several ways. An M followed by a space or hyphen has the space or hyphen removed, so that searching for Messier catalogue objects is simplified and a user input of M45, M 45 or M-45 all result in the same query being executed; similarly, NGC designations and common search terms such as Shoemaker Levy and T Tauri are stripped of spaces. Unimportant words such as AT, OR and TO are stripped out, although in some cases case sensitivity is maintained, so that while "and" is ignored, "And" is converted to "Andromeda", and "Her" is converted to "Hercules" but "her" is ignored.[17]
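These filtering rules can be sketched with a regular expression and two small lookup tables. The stop-word list and the case-sensitive table below contain only the examples named in the text; the function is a hypothetical illustration and, as a simplification, treats a hyphen after NGC the same way as a space.

```python
import re

STOP_WORDS = {"at", "or", "to", "and", "her"}
# Capitalized forms that stand for constellation names rather than stop words.
CASE_SENSITIVE = {"And": "Andromeda", "Her": "Hercules"}


def preprocess(query):
    """Normalize catalogue designations, then drop stop words (case-aware)."""
    # "M 45" and "M-45" become "M45"; "NGC 6543" becomes "NGC6543".
    query = re.sub(r"\b(M|NGC)[ -](\d+)\b", r"\1\2", query)
    terms = []
    for word in query.split():
        if word in CASE_SENSITIVE:
            terms.append(CASE_SENSITIVE[word])
        elif word.lower() in STOP_WORDS:
            continue
        else:
            terms.append(word)
    return terms


same = [preprocess(q) for q in ("M45", "M 45", "M-45")]
mixed = preprocess("NGC 6543 and And")
```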

Synonym replacement

Once search terms have been preprocessed, the database is queried with the revised search term, as well as synonyms for it. As well as simple synonym replacement such as searching for both plural and singular forms, ADS also searches for a large number of specifically astronomical synonyms. For example, spectrograph and spectroscope have basically the same meaning, and in an astronomical context metallicity and abundance are also synonymous. ADS's synonym list was created manually, by grouping the list of words in the database according to similar meanings.[10]

As well as English language synonyms, ADS also searches for English translations of foreign search terms and vice versa, so that a search for the French word soleil retrieves references to Sun, and papers in languages other than English can be returned by English search terms.

Synonym replacement can be disabled if required, so that a rare term which is a synonym of a much more common term (such as 'dateline' rather than 'date') can be searched for specifically.
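Synonym expansion with an off switch can be sketched as follows. The groups contain only the examples mentioned in this section, and the `use_synonyms` flag is a hypothetical name for the option to disable replacement.

```python
# Manually curated groups of words treated as equivalent in searches.
SYNONYM_GROUPS = [
    {"spectrograph", "spectroscope"},
    {"metallicity", "abundance"},
    {"sun", "soleil"},
    {"date", "dateline"},
]


def expand_term(term, use_synonyms=True):
    """Return the set of words to search for in place of a single term."""
    term = term.lower()
    if not use_synonyms:
        return {term}
    for group in SYNONYM_GROUPS:
        if term in group:
            return set(group)
    return {term}


translated = expand_term("soleil")
exact = expand_term("dateline", use_synonyms=False)
```

With expansion on, "dateline" would also match every paper containing the far more common "date"; switching it off restricts the search to the rare term alone.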

Selection logic

The search engine allows selection logic both within fields and between fields. Search terms in each field can be combined with OR, AND, simple logic or Boolean logic, and the user can specify which fields must be matched in the search results. This allows complex searches to be built; for example, the user could search for papers concerning NGC 6543 OR NGC 7009, with the paper titles containing (radius OR velocity) AND NOT (abundance OR temperature).
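The example query can be written out as a predicate over paper records. The record layout (a title plus a list of object names) and the sample papers are invented for illustration.

```python
def title_matches(title):
    """(radius OR velocity) AND NOT (abundance OR temperature), on title words."""
    words = set(title.lower().split())
    wanted = bool(words & {"radius", "velocity"})
    unwanted = bool(words & {"abundance", "temperature"})
    return wanted and not unwanted


def select(papers, objects=("NGC 6543", "NGC 7009")):
    """Titles of papers about any of the objects whose title passes the logic."""
    return [
        paper["title"]
        for paper in papers
        if set(paper["objects"]) & set(objects) and title_matches(paper["title"])
    ]


papers = [
    {"title": "Expansion velocity of a nebula", "objects": ["NGC 6543"]},
    {"title": "Radius and temperature of a nebula", "objects": ["NGC 7009"]},
    {"title": "Velocity field of a galaxy", "objects": ["M31"]},
]
selected = select(papers)
```

Only the first paper is returned: the second is excluded by the NOT clause on the title, and the third concerns neither object.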

Result filtering

Search results can be filtered according to a number of criteria, including a range of years such as '1945 to 1975', '2000 to the present day' or 'before 1900', and the type of journal the article appears in: non-peer-reviewed articles such as conference proceedings can be excluded or specifically searched for, and specific journals can be included in or excluded from the search.

Search results

Search results page from ADS – A, F, G, C, R etc. are links to associated data for each abstract such as full-text article, citations, also-read papers and so on.

Although it was conceived as a means of accessing abstracts and papers, ADS provides a substantial amount of ancillary information along with search results. For each abstract returned, links are provided to other papers in the database which it references and which cite it, and a link is provided to a preprint, where one exists. The system also generates a link to 'also-read' articles, that is, those which have been most commonly accessed by those reading the article. In this way, an ADS user can determine which papers are of most interest to astronomers who are interested in the subject of a given paper.[15]

Also returned are links to the SIMBAD and/or NASA Extragalactic Database object name databases, via which a user can quickly find out basic observational data about the objects analyzed in a paper, and find further papers on those objects.

Impact on astronomy

ADS is almost universally used as a research tool among astronomers, and there are several studies that have estimated quantitatively how much more efficient ADS has made astronomy; one estimated that ADS increased the efficiency of astronomical research by 333 full-time equivalent research years per year,[8] and another found that in 2002 its effect was equivalent to 736 full-time researchers, or all the astronomical research done in France.[18] ADS has allowed literature searches that would previously have taken days or weeks to carry out to be completed in seconds, and it is estimated that ADS has increased the readership and use of the astronomical literature by a factor of about three since its inception.[18]

In monetary terms, this increase in efficiency represents a considerable amount. There are about 12,000 active astronomical researchers worldwide, so ADS is the equivalent of about 5% of the working population of astronomers. The global astronomical research budget is estimated at between US$4,000 million and US$5,000 million,[19] so the value of ADS to astronomy would be about US$200–250 million annually. Its operating budget is a small fraction of this amount.[18]

The great importance of ADS to astronomers has been recognized by the United Nations, the General Assembly of which has commended ADS on its work and success, particularly noting its importance to astronomers in the developing world, in reports of the United Nations Committee on the Peaceful Uses of Outer Space. A 2002 report by a visiting committee to the Center for Astrophysics, meanwhile, said that the service had "revolutionized the use of the astronomical literature", and was "probably the most valuable single contribution to astronomy research that the CfA has made in its lifetime".[20]

Sociological studies using ADS

Because it is used almost universally by astronomers, ADS can reveal much about how astronomical research is distributed around the world. Most users access the system from institutes of higher education, whose IP address can easily be used to determine the user's geographical location. Studies reveal that the highest per-capita users of ADS are astronomers based in France and the Netherlands, and while more developed countries (measured by GDP per capita) use the system more than less developed countries, the relationship between GDP per capita and ADS use is not linear. The range of ADS usage per capita far exceeds the range of GDP per capita, and basic research carried out in a country, as measured by ADS usage, has been found to be proportional to the square of the country's GDP divided by its population.[18] Statistics also imply that there are about three times as many astronomers in countries of European culture as in countries of Asian cultures, perhaps suggesting cultural differences in the importance attached to astronomical research.[18] The amount of basic research carried out in a country is found to be proportional to the number of astronomers in that country multiplied by its GDP per capita, with considerable scatter.

ADS has also been used to show that the fraction of single-author astronomy papers has decreased substantially since 1975 and that astronomical papers with more than 50 authors have become more common since 1990.[21]

References

  1. https://ui.adsabs.harvard.edu/about/
  2. Houzeau, J. C. (1887). Bibliographie générale de l'astronomie (in French). F. Hayez, Imprimeur de L'Académie Royale de Belgique.
  3. Houzeau, Jean-Charles (1882). Bibliographie générale de l'astronomie ou catalogue méthodique des ouvrages, des mémoires et des observations astronomiques publiés depuis l'origine de l'imprimerie jusqu'en 1880: Mémoires et notices insérés dans les Collections académiques et les Revues (in French).
  4. Squibb, G.F.; Cheung, C.Y. (1988). "NASA astrophysics data system (ADS) study". European Southern Observatory Conference and Workshop Proceedings. 28: 489. Bibcode: 1988ESOC...28..489S.
  5. Adorf, H.-M.; Busch, E.K. (1988). Intelligent access to a bibliographical full text data base. European Southern Observatory Conference and Workshop Proceedings. Vol. 28. p. 143. Bibcode: 1988ESOC...28..143A.
  6. Rey-Watson, J.M. (1988). Access to astronomical literature through commercial databases. European Southern Observatory Conference and Workshop Proceedings. Vol. 28. p. 453. Bibcode: 1988ESOC...28..453R.
  7. Rhodes, C.; Kurtz, M.J.; Rey-Watson, J.M. (1988). A library collection of software documentation specific to astronomical data reduction. European Southern Observatory Conference and Workshop Proceedings. Vol. 28. p. 459. Bibcode: 1988ESOC...28..459R.
  8. Kurtz, M.J.; Eichhorn, G.; Accomazzi, A.; Grant, C.S.; Murray, S.S.; Watson, J.M. (2000). "The NASA Astrophysics Data System: Overview". Astronomy and Astrophysics Supplement Series. 143 (1): 41–59. arXiv: astro-ph/0002104. Bibcode: 2000A&AS..143...41K. doi: 10.1051/aas:2000170. S2CID 17583122.
  9. Accomazzi, Alberto; Kurtz, Michael J.; Henneken, Edwin; Grant, Carolyn S.; Thompson, Donna M.; Chyla, Roman; McDonald, Steven; Shaulis, Taylor J.; Blanco-Cuaresma, Sergi; Shapurian, Golnaz; Hostetler, Timothy W.; Templeton, Matthew R.; Lockhart, Kelly E. (January 2018). ADS Bumblebee comes of age. 231st Meeting of the American Astronomical Society. 362.17. Bibcode: 2018AAS...23136217A.
  10. Accomazzi, A.; Eichhorn, G.; Kurtz, M.J.; Grant, C.S.; Murray, S.S. (2000). "The NASA Astrophysics Data System: Architecture". Astronomy and Astrophysics Supplement Series. 143 (1): 85–109. arXiv: astro-ph/0002105. Bibcode: 2000A&AS..143...85A. doi: 10.1051/aas:2000172. S2CID 7182316.
  11. "NASA ADS Abstract Service Mirroring Information". Harvard-Smithsonian Center for Astrophysics. 23 June 2005. Retrieved 2 November 2008.
  12. myADS-arXiv: A fully customized, open access virtual journal. March Meeting 2007, American Physical Society. Vol. 52. U20.9. Retrieved 30 October 2008.
  13. "SAO/NASA ADS at SAO: Mirror Sites". Harvard-Smithsonian Center for Astrophysics. Archived from the original on 27 February 2008. Retrieved 30 October 2008.
  14. "ADS Bibliographic Codes: Journal Abbreviations". Harvard-Smithsonian Center for Astrophysics. Archived from the original on 30 April 2008. Retrieved 30 October 2008.
  15. Eichhorn, G.; Kurtz, M.J.; Accomazzi, A.; Grant, C.S.; Murray, S.S. (2000). "The NASA Astrophysics Data System: The search engine and its user interface". Astronomy and Astrophysics Supplement Series. 143 (1): 61–83. arXiv: astro-ph/0002102. Bibcode: 2000A&AS..143...61E. doi: 10.1051/aas:2000171. S2CID 2787647.
  16. "SAO/NASA ADS HELP: Abstract Query Form". Harvard-Smithsonian Center for Astrophysics. 2.2.2.2 - SIMBAD/NED/LPI/IAUC Object Names/Position. Archived from the original on 9 May 2008. Retrieved 30 October 2008.
  17. "SAO/NASA ADS HELP: Abstract Query Form". Harvard-Smithsonian Center for Astrophysics. 2.2.1.2 - Stop Words. Archived from the original on 9 May 2008. Retrieved 30 October 2008.
  18. Kurtz, M.J.; Eichhorn, G.; Accomazzi, A.; Grant, C.S.; Demleitner, M.; Murray, S.S. (2005). "Worldwide Use and Impact of the NASA Astrophysics Data System Digital Library". Journal of the American Society for Information Science and Technology. 56 (1): 36–45. arXiv: 0909.4786. Bibcode: 2005JASIS..56...36K. doi: 10.1002/asi.20095. S2CID 15181632. (Preprint)
  19. Woltjer, L. (1998). "Economic Consequences of the Deterioration of the Astronomical Environment". Preserving The Astronomical Windows. Proceedings of Joint Discussion number 5 of the 23rd General Assembly of the International Astronomical Union held in Kyoto, Japan 22–23 August 1997. 23rd General Assembly of the International Astronomical Union. Vol. 139. p. 243. Bibcode: 1998ASPC..139..243W.
  20. "ADS Awards and Recognition". Harvard-Smithsonian Center for Astrophysics. Retrieved 25 March 2022.