Harvesting Data from NTRS

The NTRS promotes the dissemination of NASA STI to the widest audience possible by allowing NTRS information to be harvested by sites using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH defines a mechanism for information technology systems to exchange citation information using the open standards HTTP (Hypertext Transfer Protocol) and XML (Extensible Markup Language). NTRS is designed to accept and respond to automated requests using OAI-PMH. Automated requests harvest only citation information, not the full-text document images.

If you are interested in harvesting from NTRS and you have any comments or questions about the process, please contact the STI Information Desk for assistance.

If you harvest from our database, please cite the NASA STI Program as a source of data.

Sites interested in harvesting from NTRS should review the following guidance before harvesting:

Use of Government Information
The NTRS serves unlimited, unclassified, publicly available NASA citations and full-text documents (PDFs). Persons, organizations, and sites interested in obtaining NASA information should review Disclaimers, Copyright Notice, Terms and Conditions of Use for guidance.

Harvesting Images
NTRS actively blocks spiders, robots, and intelligent agents from automatically retrieving the full-text images. Links to full-text documents (PDFs) are included in the citations. The image URL in the harvested NTRS metadata lets your users access the full-text document image residing on NTRS.

Harvesting Metadata Citations

  • The NTRS is an OAI-compliant data provider. Its harvesting interface implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), a standard for retrieving metadata from digital document repositories.
  • NTRS supports OAI-PMH version 2.0. It does not support earlier versions of the protocol.
  • The base URL for the NTRS is https://ntrs.nasa.gov/
  • An OAI request for harvesting from NTRS will return a maximum of 100 records per request. If you plan to harvest more than 100 records, please run those requests between 8 PM and 8 AM U.S. Eastern Time. Do not make more than one request every 3 seconds.
  • Users can harvest NTRS data by sending an OAI-compliant request to the NTRS archive. The request URL is formatted as https://ntrs.nasa.gov/oai?verb=XXX (where XXX is the verb value). The valid verb values are:
    Identify = Provides a description of the NTRS repository
    ListMetadataFormats = Gives the metadata format(s) available for request from NTRS
    ListSets = Provides a list of the NTRS defined sets. These results can help refine your request by asking for one specific set of data versus the entire NTRS collection
    ListIdentifiers = Gives a list of the OAI unique identifiers available within NTRS
    ListRecords = Gives a listing of N records at a time. NTRS is currently set to give 100 records at a time with a Resumption Token at the end if more records are available for the request received
    GetRecord = Will provide the user the XML file for a specific record
  • Records may be harvested from NTRS in the following formats: oai_dc and casi_dc. The casi_dc format is a more inclusive record format based on Dublin Core and supplemented with NASA terms. For more information, see the field description document for the casi_dc format, and the schema references returned from the following request: https://ntrs.nasa.gov/oai?verb=ListMetadataFormats.
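
The request pattern described in the list above can be sketched in Python as follows. This is a minimal illustration, not an official NTRS client. The function names (build_request_url, parse_page, harvest) are hypothetical. The endpoint, the 100-record paging, and the 3-second spacing come from the guidance above. The resumptionToken element and the rule that a resumed request carries only the verb and the token come from the OAI-PMH 2.0 specification.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_ENDPOINT = "https://ntrs.nasa.gov/oai"  # request URL given in the guidance
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 XML namespace
MIN_DELAY_SECONDS = 3  # guidance: no more than one request every 3 seconds


def build_request_url(verb, **params):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return OAI_ENDPOINT + "?" + urllib.parse.urlencode(query)


def parse_page(xml_data):
    """Return (records, resumption_token) for one ListRecords response."""
    root = ET.fromstring(xml_data)
    records = root.findall(f"./{OAI_NS}ListRecords/{OAI_NS}record")
    token_el = root.find(f"./{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    has_token = token_el is not None and token_el.text
    return records, (token_el.text.strip() if has_token else None)


def harvest(metadata_prefix="oai_dc", set_spec=None, fetch=None, sleep=time.sleep):
    """Yield every record, following resumption tokens and pausing between requests."""
    if fetch is None:
        def fetch(url):
            # Default fetcher: a plain HTTP GET returning the raw response bytes.
            with urllib.request.urlopen(url) as response:
                return response.read()
    params = {"metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # a setSpec value taken from a ListSets response
    while True:
        records, token = parse_page(fetch(build_request_url("ListRecords", **params)))
        yield from records
        if not token:
            break  # an absent or empty token means the list is complete
        # A resumed request carries only the verb and the token (OAI-PMH 2.0).
        params = {"resumptionToken": token}
        sleep(MIN_DELAY_SECONDS)
```

Iterating over harvest(metadata_prefix="casi_dc") would walk the whole collection 100 records at a time. The fetch and sleep parameters exist only so the loop can be exercised without network access. Scheduling large harvests in the 8 PM to 8 AM Eastern window is left to the caller.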

Updated, Modified, and Deleted Citations and Full-Text Documents
Over time, metadata citations and full-text document images may be updated, modified, and/or deleted as a result of regular data management. The best way to detect changes in NTRS information is to harvest NTRS regularly using OAI-PMH. Newly updated and/or modified records will automatically replace previously harvested records. Records marked as ‘deleted’ require additional processing on your site to identify the NASA citations that should be removed from your repository.
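
The additional processing for deleted records might look like the following Python sketch. It assumes NTRS flags deletions the way OAI-PMH 2.0 specifies, with a status="deleted" attribute on the record header. That is an assumption about NTRS responses, not something this page confirms. The function name split_deleted and the identifiers in SAMPLE are hypothetical.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 XML namespace

# Hypothetical response fragment; the identifiers are made up for illustration.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:1</identifier></header>
    <header status="deleted"><identifier>oai:example.org:2</identifier></header>
  </ListIdentifiers>
</OAI-PMH>"""


def split_deleted(xml_data):
    """Return (live_ids, deleted_ids) from a ListRecords or ListIdentifiers response."""
    root = ET.fromstring(xml_data)
    live, deleted = [], []
    for header in root.iter(f"{OAI_NS}header"):
        identifier = header.findtext(f"{OAI_NS}identifier")
        # Assumption: deletions carry status="deleted" on the header element.
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    return live, deleted


live_ids, deleted_ids = split_deleted(SAMPLE)
print("keep or update:", live_ids)
print("remove locally:", deleted_ids)
```

A harvesting site would then remove the citations in deleted_ids from its own repository and insert or overwrite those in live_ids.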
