AI and Machine Learning

The challenges of artificial intelligence in libraries

March 1, 2019

Dispatches, by Jason Griffey

Artificial intelligence (AI) and machine learning are everywhere, giving driving directions and identifying objects in photographs. They are so ingrained in our technology that people often don’t realize that what they’re experiencing is a machine learning system. Everyone with a smartphone has an AI system that uses machine learning.

For example, Google’s Android operating system records, measures, and collects information and sends that data to servers. These servers use billions of data points collected from tens of millions of users as input for their machine learning systems. When you ask an Android phone to show you pictures from the beach, a complex set of data moves back and forth between your phone and Google’s servers, comparing your photos to the billions in its data set. The search results include pictures that the AI decided were most likely to be related.

Since Google has billions of photos to assess and millions of people helping it train its AI, the decisions that the AI makes are generally good. But AI is only as effective as its training data and the weighting given to the system as it learns to make decisions. If the data is biased, contains bad examples of decision making, or is simply collected in such a way that it doesn’t represent the full problem set, the system will produce broken, unrepresentative, or bad outputs.
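To make the point about unrepresentative training data concrete, here is a minimal sketch in Python. It is a toy, not a description of how Google's or anyone else's system works: the single "blueness" score per photo, the sample numbers, and the function names are all assumptions invented for illustration. The sketch learns a cutoff for "beach" versus "not beach" twice, once from examples that include overcast beaches and once from examples that contain only sunny ones.

```python
from statistics import mean

# Hypothetical data: each photo is reduced to one made-up "blueness" score
# between 0 and 1, paired with a label (True = beach photo).

def train_threshold(examples):
    """Learn a cutoff halfway between the average score of each class."""
    beach = [score for score, is_beach in examples if is_beach]
    other = [score for score, is_beach in examples if not is_beach]
    return (mean(beach) + mean(other)) / 2

def accuracy(threshold, examples):
    """Fraction of photos whose predicted label matches the true label."""
    correct = sum((score > threshold) == is_beach for score, is_beach in examples)
    return correct / len(examples)

# Representative training set: sunny AND overcast beaches are included.
representative = [(0.9, True), (0.8, True), (0.5, True), (0.55, True),
                  (0.2, False), (0.3, False), (0.4, False)]

# Skewed training set: only sunny beaches were collected.
skewed = [(0.9, True), (0.8, True), (0.85, True),
          (0.2, False), (0.3, False), (0.4, False)]

# Photos the system has not seen before, including overcast beaches.
unseen = [(0.88, True), (0.52, True), (0.56, True),
          (0.25, False), (0.35, False), (0.45, False)]

fair_cutoff = train_threshold(representative)
skewed_cutoff = train_threshold(skewed)

print("representative data:", accuracy(fair_cutoff, unseen))
print("skewed data:", accuracy(skewed_cutoff, unseen))
```

In this toy setup, the cutoff learned from the skewed set sits too high, so the overcast beach photos in the unseen group are missed even though the learning procedure itself is unchanged. Only the training data differs.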

Apple, on the other hand, has chosen to model its AI and machine learning by analyzing and weighting your data locally on the iOS devices themselves. Your devices apply the same machine learning algorithms, starting from Apple’s preset weights, to your photos, but the photos themselves aren’t pushed to Apple’s servers. Because each data set is analyzed locally, there is no shared decision making as there is with Google. Each device must do the heavy lifting itself, rather than rely on remote servers for the bulk of the work.

For data privacy and security concerns, localized machine learning has an advantage. If you don’t need to send photos and data back and forth from server to client, and if providers don’t need to store and host data, the data’s vulnerability to attack is greatly reduced.

The examples above focus on object and image recognition in photos by a machine learning system. This is only one of dozens of uses for AI and machine learning systems.

It’s also easy to see how an AI system is useful for libraries and archives in creating metadata from digitization projects. AI systems can be trained to recognize locations from a single photograph (including where the photographer was standing) based on angle, geography, and other factors. These systems can be enormously useful in processing and cataloging archives and collections, making them more discoverable.

As more libraries and library vendors move into developing AI and machine learning systems, we should be sensitive to the privacy implications of collecting and storing the data that’s needed to train and update those systems. As with existing systems where we outsource data collection and retention to vendors, libraries need to be aware of the mechanisms by which that data is protected and how it may be shared with others through training sets. Where libraries can provide local analysis in the style of Apple and iOS, they should.

New machine learning systems will offer rich and varied opportunities to reform large portions of library activities. While it will be some time before AI will conduct full conversations or reference interviews with students and patrons, the use of AI as an increasingly powerful lever inside other systems will progress quickly over the next three to five years. Libraries can watch these systems as they develop, work with vendors, and create their own services and systems so that our values and ethics are baked into the technology at the outset.

