Saving Digital Ephemera

Who’s collecting and archiving podcasts, tweets, emails, and other fleeting content?

By Adam Doster | January 4, 2016

In 2005, before the words “podcast” and “boom” ever appeared in the same sentence, an archivist named Jason Scott, proprietor of textfiles.com, attempted to collect every podcast in existence. (Many of those first files are still sitting on DVD-R discs in Scott’s attic.)

Larger institutions also got involved in attempting to preserve digital ephemera. That includes the Library of Congress (LC), which reached an agreement with Twitter in 2010 to build an onsite research archive.

“Archiving and preserving outlets such as Twitter will enable future researchers access to a fuller picture of today’s cultural norms, dialogue, trends, and events to inform scholarship, the legislative process, new works of authorship, education, and other purposes,” reads a 2013 white paper from LC on the topic.

However, Twitter’s users now send 200 billion tweets per year, and at that scale LC’s project eventually became unsustainable.

Academic libraries are helping to fill the void with social media research and data collection. George Washington Libraries, at George Washington University in Washington, D.C., created an open source tool called Social Feed Manager to capture social media data for research, archiving, and academic work. In 2013, the project received a $24,550 Sparks! Ignition Grant from the Institute of Museum and Library Services (IMLS).

Likewise, Syracuse University’s School of Information Studies has created Social Media Tracker, Analyzer, and Collector Toolkit at Syracuse (STACKS), an open source project that collects and analyzes social media data related to the 2016 presidential campaign.

“It’s getting less and less expensive to save things digitally; it’s less of an issue,” says Rachael Bower, director of the Internet Scout Research Group, at the University of Wisconsin–Madison. “On the other hand, storing oodles and oodles of digital material with no easy way to access it, to look through it and know what you have, doesn’t seem ideal either.”

Librarians and archivists must also consider the speed with which technology evolves. An archival copy of a podcast, say, must include the relevant software to play the actual show, even if advances in computing will eventually make that software obsolete.

Alexis Rossi, director of media and access at Internet Archive (IA), which maintains the Wayback Machine—a program that constantly browses the internet to record and replicate websites at specific moments in time—says preserving digital files remains a subjective task, for the most part.

“People self-select,” she says. “Somebody has decided that I have this amazing collection of [personal] material, and I need to find a home for it.”

But IA’s mission, as Rossi describes it, is “to archive all of human knowledge and to make it accessible to everyone.” She says more than 2 million people use the site daily, and her colleagues are working to make it more searchable. (IA recently received a couple of large grants, including one for more than $350,000 from IMLS to help expand the capacity for national web archiving.)

IA also houses collections of podcasts and blogs on its site, where individuals upload their own material onto secure IA servers. It’s where textfiles.com’s Scott now works.

Stanford University Libraries, meanwhile, with the assistance of a recent $685,000 National Leadership Grant for Libraries from IMLS, is developing the second phase of ePADD, an open source discovery module that will provide researchers with easier access to email archives.

But most of the work around born-digital content is still preliminary. Kari R. Smith, a digital archivist at the Massachusetts Institute of Technology, says that, within umbrella organizations like ALA and the Society of American Archivists, there are round tables and working groups that are constantly looking at how to describe and capture this kind of material and how to ensure like-minded people don’t waste finite resources on projects with duplicate aims.

“Making sure you’ve got some sense of why you’re preserving what you’re preserving long term,” Bower says, “is incredibly critical.”

ADAM DOSTER is a freelance writer living in Chicago.
