Web-Scale Discovery

December 22, 2010

Connecting users with the information they seek is one of the central pillars of our profession. Web-scale discovery services for libraries are those services capable of searching quickly and seamlessly across a vast range of local and remote preharvested and indexed content, providing relevancy-ranked results in the kind of intuitive interface today’s information seekers expect. These rapidly evolving tools debuted in late 2007, and understanding them is more important today than ever.

Web-scale discovery services for the library environment are an evolution with great potential to connect researchers easily with the library’s vast information repository. That repository includes physical holdings, such as books and DVDs; local electronic content, such as digital image collections and institutional repository materials; and remotely hosted content purchased or licensed by the library, such as e-books and publisher or aggregator content for thousands of full-text and abstracting and indexing resources. For our purposes, web-scale discovery can be considered a service capable of searching across a vast range of preharvested and indexed content quickly and seamlessly. Such services provide discovery and delivery and often have the following traits:

  • Content harvested from local and remotely hosted repositories to create a comprehensive centralized index, down to the article level, based on a normalized schema across content types and well suited for rapid search and retrieval of results ranked by relevancy. The index is assembled by harvesting local library resources and through brokered agreements with publishers and aggregators that allow access to their metadata or full-text content for indexing purposes.
  • Discovery through a single search box that offers a Google-like search experience (as well as advanced searching capabilities).
  • Delivery of quick results ranked by relevancy in a modern interface offering functionality and design cues intuitive to and expected by today’s users, such as faceted navigation to drill down to more specific results.
  • Flexibility agnostic to underlying systems, whether hosted by the library or hosted remotely by content providers. These services are open compared to traditional library systems and allow a library greater latitude to customize the services and make them its own.
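
For readers who want a concrete picture of these traits, the following is a minimal sketch in Python, not a description of any vendor's product. It assumes a hypothetical normalized schema (`Record`), a toy preharvested index (`CentralIndex`), and a crude relevancy score in which title matches count double; all names, sample records, and weights are illustrative assumptions.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Record:
    """Hypothetical normalized schema shared by every content type."""
    record_id: str
    title: str
    description: str
    content_type: str  # e.g. "book", "image", "article"
    source: str        # e.g. "catalog", "repository", "aggregator"


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class CentralIndex:
    """Toy inverted index built ahead of time from harvested records."""

    def __init__(self):
        self.records = {}
        # term -> Counter mapping record_id to a weighted term count
        self.postings = defaultdict(Counter)

    def harvest(self, records):
        """Index records up front so that searches never query the sources."""
        for rec in records:
            self.records[rec.record_id] = rec
            for term in tokenize(rec.title):
                self.postings[term][rec.record_id] += 2  # assumed title boost
            for term in tokenize(rec.description):
                self.postings[term][rec.record_id] += 1

    def search(self, query, content_type=None):
        """Return (ranked hits, facet counts), optionally narrowed by facet."""
        scores = Counter()
        for term in tokenize(query):
            scores.update(self.postings.get(term, {}))
        # Highest score first; ties broken by record id for stable output.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        hits = [self.records[rid] for rid, _ in ranked]
        # Facet counts describe the full result set, before any narrowing.
        facets = Counter(rec.content_type for rec in hits)
        if content_type is not None:
            hits = [rec for rec in hits if rec.content_type == content_type]
        return hits, facets


# Invented sample records standing in for three kinds of harvested content.
index = CentralIndex()
index.harvest([
    Record("cat-1", "Desert Ecology",
           "A print book on desert plants and animals", "book", "catalog"),
    Record("repo-1", "Mojave Desert Photographs",
           "Digitized images from a local collection", "image", "repository"),
    Record("agg-1", "Water Use in Arid Cities",
           "Journal article on desert water policy", "article", "aggregator"),
])

hits, facets = index.search("desert")
for rec in hits:
    print(rec.record_id, rec.content_type, rec.title)
print(dict(facets))
```

A single query runs against one index regardless of where each record originated, and the facet counts give the interface what it needs to offer a drill-down by content type. Production services rank results with far more sophisticated relevancy models than the term counting assumed here.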

Why Web-Scale Discovery?

Research dating from the 1990s, if not earlier, through 2010 shows that library discovery systems within the networked online environment have evolved, yet continue to struggle to serve users. As a result, the library, or systems supported and maintained by the library, is often not the first stop for research, or worse, not a stop at all. Users have defected, and research continues to illustrate this fact.

Other factors, apart from user behavior and preferences, also give libraries reason to adopt web-scale discovery services. First, and most obvious, if something is not discovered, it has no chance of being used. Whether a librarian conducts a reference interview, a user browses the shelves, a friend passes along a recommendation, a user searches in Google or a library database, or a user scans issues and article titles in an electronic journal, discovery must happen, either by focused intent or serendipitously. Libraries often spend tremendous amounts of money every year to purchase or pay for access to an ever-growing body of electronic content, and the cost of that access often rises annually. But for the content to be used, it must be discoverable, and for today’s users, easily discoverable.

Jason Vaughan is the director of library technologies at the University of Nevada at Las Vegas. This is an excerpt from the January 2011 ALA TechSource on web-scale discovery.


