Using a custom agent, we harvested 6.95 million authority records through the publicly accessible interface to the Library of Congress authority files at authorities.loc.gov.
Retrieved records have been converted into MarcXML.
Accented characters have been normalized to NFC (Unicode Normalization Form C, canonical composition).
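To illustrate what NFC normalization does to an accented heading, here is a minimal sketch in Python using the standard library's unicodedata module. The function name to_nfc and the sample string are hypothetical, chosen for illustration only; this is not the code used by the harvesting agent.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


# Hypothetical sample: "Dvořák" stored in decomposed form, i.e. base letters
# followed by combining marks (U+030C caron, U+0301 acute accent).
decomposed = "Dvor\u030ca\u0301k"
composed = to_nfc(decomposed)

# After NFC, each base letter and its combining mark are merged into a
# single precomposed code point (U+0159 and U+00E1).
print(len(decomposed), len(composed))
print(composed == "Dvo\u0159\u00e1k")
```

Both strings display identically, but only the composed form compares equal byte for byte with other NFC text, which matters when matching headings across systems.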
Initial checks against authorities.loc.gov indicate that the retrieved data faithfully reflects what is held on the original system; however, these checks are still only preliminary.
Cross checks against Classification Web have revealed some inconsistencies. For this reason, we are releasing this data for research purposes only. This data is not suitable for production use.