A Linked Data Landscape

Critical decisions for data licensing, shared standards, and system design

January 4, 2016

Erik Mitchell: Dispatches from the Field

Linked data is an approach to publishing data that uses web technologies to create shareable information that both humans and computers can easily use. In the past few years, the library, archive, and museum (LAM) community has developed new tools and standards, published new vocabularies, and explored new applications and use cases (lists of steps that define interactions between a user and a system). All of this activity is helping to share more data across the web. Recently, librarians and archivists have been pondering how to license this data to enable widespread use, how to develop and make use of shared standards, and how to design useful and effective systems.

Data licensing. The practices that LAM communities established in developing open source tools and supporting open access are now influencing how we publish open data. Even though institutions are choosing different open-use licenses, open data is supporting new and broader uses of data. The Getty Museum, University of Pennsylvania, and University of British Columbia are among those that have released digital objects, full-text content, and metadata under open licenses. As these practices expand, we should see dramatic growth in new scholarship.

Shared standards. As major linked data projects such as BIBFRAME and the Digital Public Library of America progress, developers are making decisions about how to use vocabularies and standards that will affect the usefulness of LAM data. The LAM community has yet to reach consensus on these standards, and this poses a challenge to anyone seeking a way forward. As an example of potential confusion, data management company Zepheira, a partner of the Library of Congress (LC) in the development of BIBFRAME, developed a vocabulary called BIBFRAME Lite, which is similar but not always equivalent to the LC-managed BIBFRAME vocabularies. Element names that the two vocabularies share therefore belong to two schemas with no clear equivalence between them, and this impedes the interoperability of the standards.
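For readers who want to see why a shared element name does not guarantee interoperability, here is a minimal sketch in Python. It is an illustration only: the two namespaces, the book identifier, and the crosswalk are hypothetical stand-ins, not the actual BIBFRAME or BIBFRAME Lite vocabularies. Each statement is modeled as a simple (subject, predicate, object) triple.

```python
# Hypothetical namespaces for illustration; NOT the real BIBFRAME or
# BIBFRAME Lite URIs.
VOCAB_A = "http://example.org/vocab-a/"
VOCAB_B = "http://example.org/vocab-b/"
BOOK = "http://example.org/book/1"  # hypothetical resource identifier

# Two descriptions of the same book. Both use an element called "title",
# but each "title" lives in a different vocabulary, so the full predicate
# URIs differ.
record_a = [(BOOK, VOCAB_A + "title", "Moby-Dick")]
record_b = [(BOOK, VOCAB_B + "title", "Moby-Dick")]


def values_for(triples, predicate):
    """Return the objects of all triples that use exactly this predicate."""
    return [o for (_s, p, o) in triples if p == predicate]


def normalize(triples, crosswalk):
    """Rewrite predicates using an explicit, human-asserted mapping."""
    return [(s, crosswalk.get(p, p), o) for (s, p, o) in triples]


# An assumed crosswalk. Someone has to decide that the two "title"
# elements really mean the same thing before this mapping is valid.
CROSSWALK = {VOCAB_B + "title": VOCAB_A + "title"}

merged = record_a + record_b

# Without a crosswalk, a query on vocabulary A's title misses record B.
found_without_crosswalk = values_for(merged, VOCAB_A + "title")

# With the crosswalk applied, both statements are found.
found_with_crosswalk = values_for(normalize(merged, CROSSWALK), VOCAB_A + "title")
```

The software treats the two "title" elements as unrelated until someone asserts a mapping between them, and such a mapping is only trustworthy when the two definitions truly match. Where the vocabularies are similar but not equivalent, that assertion is itself a judgment call.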

System design and implementation. While much work on linked data is focused on converting existing data, a long list of projects explores new use cases and systems for linked data. Two high-profile projects are exploring new workflows for creating linked data: BIBFLOW, an IMLS-funded project led by the University of California, Davis, and Linked Data for Libraries (LD4L), an Andrew W. Mellon–funded partnership among the Cornell, Harvard, and Stanford university libraries.

BIBFLOW is investigating technical services workflows, taking updated standards and user needs as its starting point. The project is experimenting with the Open Library Environment to incorporate Resource Description Framework (RDF) data and linked data and so enable new workflows. Similarly, the LD4L community has explored the use of vocabularies in creating new linked data platforms. More information on products from the LD4L project can be found on its GitHub site.

What’s next? Libraries that want to prepare for a linked future should focus on education, experimentation, and flexibility. Here are three suggestions:

  • train your staff to deal with upcoming changes
  • explore new systems carefully before investing in them
  • monitor standards as they are developed

While the path ahead is not yet clear, individual libraries and museums will want to join in the effort to help solve these challenges.