Open Archives Initiative Protocol for Metadata Harvesting

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.[1][2]

The protocol is usually just referred to as the OAI Protocol.

OAI-PMH uses XML over HTTP. Version 2.0 of the protocol was released in 2002; the document was last updated in 2015. It is published under a Creative Commons BY-SA license.
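
Because responses are plain XML, a harvester can read them with any standard XML parser. The following is a minimal sketch, not a complete client: the sample response is hand-written for illustration (the identifier, datestamp, title, and creator are made up), and `parse_records` is a hypothetical helper name. The namespace URIs are those defined for OAI-PMH 2.0 and unqualified Dublin Core.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by OAI-PMH 2.0 responses and Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response; every value in it is illustrative only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-06-01T19:20:30Z</responseDate>
  <request verb="GetRecord">http://example.org/oai</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2002-05-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example title</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return a list of dicts, one per <record> element in the response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            # Dublin Core elements are repeatable, so collect them as lists.
            "titles": [t.text for t in rec.iterfind(".//dc:title", NS)],
            "creators": [c.text for c in rec.iterfind(".//dc:creator", NS)],
        })
    return records


records = parse_records(SAMPLE_RESPONSE)
```

Here `records` holds one dictionary for the single sample record. A real harvester would also need to handle error responses and deleted records, which this sketch ignores.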

History

In the late 1990s, Herbert Van de Sompel (Ghent University) was working with researchers and librarians at Los Alamos National Laboratory (US) and called a meeting to address interoperability problems among e-print servers and digital repositories. The meeting was held in Santa Fe, New Mexico, in October 1999.[3] A key development from the meeting was the definition of an interface that permitted e-print servers to expose metadata for the papers they held in a structured fashion, so that other repositories could identify and copy papers of interest. This interface/protocol was named the "Santa Fe Convention".[1][2][4]

Several workshops were held in 2000 at the ACM Digital Libraries conference,[5] at the 1st ACM/IEEE-CS joint conference on Digital libraries[6][7] and elsewhere to share the ideas from the Santa Fe Convention.[8] It was discovered at the workshops that the problems faced by the e-print community were also shared by libraries, museums, journal publishers, and others who needed to share distributed resources. To address these needs, the Coalition for Networked Information[9] and the Digital Library Federation[10] provided funding to establish an Open Archives Initiative (OAI) secretariat managed by Herbert Van de Sompel and Carl Lagoze. The OAI held a meeting at Cornell University (Ithaca, New York) in September 2000 aimed at improving the interface developed at the Santa Fe Convention.[11] The specifications were refined over e-mail.

OAI-PMH version 1.0 was introduced to the public in January 2001 at a workshop in Washington, D.C.,[12] and at another in February in Berlin, Germany.[13] Subsequent modifications to the XML standard by the W3C required minor modifications to OAI-PMH, resulting in version 1.1. The current version, 2.0, was released in June 2002. It contained several technical changes and enhancements and is not backward compatible.[14]

OAI workshops

Since 2001, CERN, later in collaboration with the University of Geneva, has organized bi-annual OAI workshops,[15] which over time have developed to cover most aspects of open science. Since 2021 the workshop series has been named the Geneva Workshop on Innovations in Scholarly Communication, with the nickname OAI reflecting its origin.[16]

Uses

Some commercial search engines use OAI-PMH to acquire more resources. Google initially included support for OAI-PMH when launching sitemaps, but decided in May 2008 to support only the standard XML Sitemaps format.[17] In 2004, Yahoo! acquired content from OAIster (University of Michigan) that was obtained through metadata harvesting with OAI-PMH. Wikimedia uses an OAI-PMH repository to provide feeds of Wikipedia and related site updates for search engines and other bulk analysis/republishing endeavors.[18] Especially when thousands of files are harvested every day, OAI-PMH can help reduce network traffic and other resource usage through incremental harvesting.[19] NASA's Mercury metadata search system uses OAI-PMH to index thousands of metadata records from the Global Change Master Directory (GCMD) every day.[20]

The mod_oai project uses OAI-PMH to expose content that is accessible from Apache Web servers to web crawlers.

OAI-PMH was later applied to the sharing of scientific data.[21]

Software

OAI-PMH is based on a client–server architecture, in which "harvesters" request information on updated records from "repositories". Requests for data can be based on a datestamp range, and can be restricted to named sets defined by the provider. Data providers are required to provide XML metadata in Dublin Core format, and may also provide it in other XML formats.
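
A request of this kind is an ordinary HTTP query string. The sketch below builds a `ListRecords` request restricted by datestamp range and set. The argument names `verb`, `metadataPrefix`, `from`, `until`, and `set`, and the `oai_dc` prefix for Dublin Core, come from the OAI-PMH 2.0 specification; the base URL, the set name `physics`, the dates, and the helper name `build_list_records_url` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, from_date=None, until_date=None,
                           set_spec=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL with optional selective-harvesting arguments."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date    # lower bound of the datestamp range
    if until_date:
        params["until"] = until_date  # upper bound of the datestamp range
    if set_spec:
        params["set"] = set_spec      # restrict to a set defined by the provider
    return base_url + "?" + urlencode(params)


# Hypothetical repository and set; nothing is sent over the network here.
url = build_list_records_url(
    "https://example.org/oai",
    from_date="2024-01-01",
    until_date="2024-01-31",
    set_spec="physics",
)
print(url)
```

For incremental harvesting, a harvester would typically remember when it last harvested and pass that value as `from_date` on the next run, so that only records changed since then are returned.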

A number of software systems support the OAI-PMH, including Fedora, EThOS from the British Library, GNU EPrints from the University of Southampton, Open Journal Systems from the Public Knowledge Project, Desire2Learn, DSpace from MIT, HyperJournal from the University of Pisa, Digibib from Digibis, MyCoRe, Koha, Primo, DigiTool, Rosetta and MetaLib from Ex Libris, ArchivalWare from PTFS, DOOR[22] from the eLab[23] in Lugano, Switzerland, panFMP from the PANGAEA data library,[24] SimpleDL from Roaring Development, and jOAI from the National Center for Atmospheric Research.[25]

Archives

A number of large archives support the protocol, including arXiv and the CERN Document Server.

References

  1. ^ a b Lynch, Clifford A. (August 2001). "Metadata harvesting and the Open Archives Initiative". ARL: A Bimonthly Report (217). Archived from the original (PDF) on 25 May 2012.
  2. ^ a b Marshall Breeding (September 2002). "Understanding the Protocol for Metadata Harvesting of the Open Archives Initiative". Computers in Libraries. 22 (8): 24–29. Retrieved 2021-02-08.
  3. ^ Marshall, E. (1999). "Researchers plan free global preprint archive". Science. 286 (5441): 887a–887. doi:10.1126/science.286.5441.887a. PMID 10577235. S2CID 178990556.
  4. ^ "The Santa Fe Convention by the Open Archives Initiative". Open Archives Initiative. February 15, 2000. Retrieved May 29, 2022.
  5. ^ "The Santa Fe Convention of the Open Archives Initiative". dspace.library.uu.nl. Retrieved 2021-02-10.
  6. ^ Edward A. Fox; Christine L. Borgman, eds. (2001). Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries. Roanoke, Virginia, United States: ACM Press. doi:10.1145/379437. ISBN 978-1-58113-345-5.
  7. ^ Lagoze, Carl; Van de Sompel, Herbert (2001). "The open archives initiative". Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries. Roanoke, Virginia, United States: ACM Press. pp. 54–62. CiteSeerX 10.1.1.161.6800. doi:10.1145/379437.379449. ISBN 978-1-58113-345-5. S2CID 1315824.
  8. ^ Van de Sompel, Herbert; Lagoze, Carl (2000). "The Santa Fe Convention of the Open Archives Initiative". D-Lib Magazine. 6 (2). doi:10.1045/february2000-vandesompel-oai. ISSN 1082-9873.
  9. ^ "Homepage". Coalition for Networked Information. Retrieved May 29, 2022.
  10. ^ "Homepage". Digital Library Federation. Retrieved May 29, 2022.
  11. ^ "OAi-tech Meeting, Cornell University, September 7-8 2000". www.openarchives.org. Retrieved 2021-02-10.
  12. ^ "The Open Archives Initiative: Open Meeting Renaissance Hotel, Washington DC January 23, 2001". www.openarchives.org. Retrieved 2021-02-10.
  13. ^ "The Open Archives Initiative: Open Meeting Staatsbibliothek zu Berlin, Germany February 26, 2001". www.openarchives.org. Retrieved 2021-02-10.
  14. ^ Van de Sompel, Herbert; Young, Jeffrey A.; Hickey, Thomas B. (2003). "Using the OAI-PMH... Differently". D-Lib Magazine. 9 (7/8). doi:10.1045/july2003-young. ISSN 1082-9873.
  15. ^ "Previous OAI Workshops – OAI". The Geneva Workshop on Innovations in Scholarly Communication. Retrieved 2023-01-13.
  16. ^ Azwa, Adnan Siti Norfateha. "Library Guide: Open Access Guide: The Latest on OA". umlibguides.um.edu.my. Retrieved 2023-01-13.
  17. ^ "Retiring Support for OAI-PMH in Sitemaps". Google Search Central Blog. April 23, 2008. Retrieved May 29, 2022.
  18. ^ "Wikimedia update feed service". Wikimedia Meta-Wiki. Retrieved 14 July 2013.
  19. ^ "OAI Harvesting System". DLXS. Retrieved May 29, 2022.
  20. ^ R. Devarakonda; G. Palanisamy; J. Green; B. Wilson (2010). "Data sharing and retrieval uses OAI-PMH". Earth Science Informatics. 4 (1). Springer Berlin / Heidelberg: 1–5. doi:10.1007/s12145-010-0073-0. S2CID 46330319.
  21. ^ Devarakonda, Ranjeet; Palanisamy, Giri; Green, James M.; Wilson, Bruce E. (2011). "Data sharing and retrieval using OAI-PMH". Earth Science Informatics. 4 (1): 1–5. doi:10.1007/s12145-010-0073-0. ISSN 1865-0473. S2CID 46330319.
  22. ^ "Overview". DOOR. Retrieved May 29, 2022.
  23. ^ "eLab". Universita della Svizzera italiana (in Italian). Retrieved May 29, 2022.
  24. ^ "PANGAEA® Framework for Metadata Portals". panfmp.org.
  25. ^ "NCAR/joai-project". Github.com. 31 May 2022.

