Harvard Library Bibliographic Dataset


This dataset contains over 12 million bibliographic records for materials held by the Harvard Library, including books, journals, electronic resources, manuscripts, archival materials, scores, audio, video and other materials.

The metadata has been created, acquired, and modified over decades, and it reflects a range of cataloging rules and practices. The records were not altered or quality-checked during export and are offered as is. For more information about the dataset, please see the Documentation file below.

Use Terms

Harvard makes this set of bibliographic records available for public use under its Bibliographic Dataset Use Terms.


We suggest the following language to provide proper attribution when using this dataset:

This [title of report or article or dataset] contains information from the Harvard Library Bibliographic Dataset, which is provided by the Harvard Library under its Bibliographic Dataset Use Terms and includes data made available by, among others, OCLC Online Computer Library Center, Inc. and the Library of Congress.

Formats available:

MARC21 records

Download: openmetadata.lib.harvard.edu/bibdata/data

File date: 2015-08-25
File format: application/x-gzip
File size: 4213159049 bytes
Record count: 13512033
MD5 checksum: 4855a49f1a8e5b0fbd0ed27ae21665bb
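
The size and checksum listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library. It assumes the file size is given in bytes; the function names and the local filename in the usage note are hypothetical and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Values published on this page for the 2015-08-25 export.
EXPECTED_MD5 = "4855a49f1a8e5b0fbd0ed27ae21665bb"
EXPECTED_SIZE = 4213159049  # assumption: the listed file size is in bytes


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for a multi-gigabyte file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True only if both the file size and the MD5 digest match."""
    path = Path(path)
    # The size check is cheap, so do it before hashing the whole file.
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify_download("bibdata.gz")` (a hypothetical local filename) should return `True` for an intact copy of the 2015-08-25 file. MD5 is suitable here for detecting transfer errors, not for guarding against deliberate tampering.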


Other access:

A version of the data has been incorporated into the Digital Public Library of America (DPLA) beta platform. Access to the DPLA database is available through an application programming interface (API). Please see the DPLA API documentation for technical details on using the API.
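
As a rough illustration of what an API request might look like, the sketch below builds a search URL for the DPLA items endpoint. The base URL and the `q`, `page_size`, and `api_key` parameter names are assumptions based on the publicly documented DPLA API and are not stated on this page; confirm them against the DPLA API documentation before relying on them. The function name is hypothetical, and the sketch only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumption: base URL of the DPLA items endpoint (not stated on this page).
DPLA_ITEMS_URL = "https://api.dp.la/v2/items"


def build_dpla_query(search_terms, api_key, page_size=10):
    """Build a DPLA item search URL (hypothetical helper, no request is sent)."""
    params = {
        "q": search_terms,       # assumed name of the keyword search parameter
        "page_size": page_size,  # assumed name of the results-per-page parameter
        "api_key": api_key,      # assumed: an API key is required per request
    }
    # urlencode percent-encodes values and joins them with "&".
    return DPLA_ITEMS_URL + "?" + urlencode(params)
```

The resulting URL could then be fetched with any HTTP client, for example `urllib.request.urlopen`, once a valid API key is available.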