Saving Digital Ephemera

Who’s collecting and archiving podcasts, tweets, emails, and other fleeting content?

By Adam Doster | January 4, 2016

In 2005, before the words “podcast” and “boom” ever appeared in the same sentence, an archivist named Jason Scott, proprietor of textfiles.com, attempted to collect every podcast in existence. (Many of those first files are still sitting on DVD-R discs in Scott’s attic.)

Larger institutions have also tried to preserve digital ephemera. Among them is the Library of Congress (LC), which reached an agreement with Twitter in 2010 to build an onsite research archive.

“Archiving and preserving outlets such as Twitter will enable future researchers access to a fuller picture of today’s cultural norms, dialogue, trends, and events to inform scholarship, the legislative process, new works of authorship, education, and other purposes,” reads a 2013 white paper from LC on the topic.

However, Twitter’s users now send about 200 billion tweets per year, and at that scale LC’s project eventually became unsustainable.

Academic libraries are helping to fill the void with social media research and data collection. George Washington Libraries at George Washington University in Washington, D.C., created an open-source tool called Social Feed Manager to capture social media data for research, archiving, and academic work. In 2013, the project received a $24,550 Sparks! Ignition Grant from the Institute of Museum and Library Services (IMLS).

Likewise, Syracuse University’s School of Information Studies has created Social Media Tracker, Analyzer, and Collector Toolkit at Syracuse (STACKS), an open-source project that collects and analyzes social media data related to the 2016 presidential campaign.

“It’s getting less and less expensive to save things digitally; it’s less of an issue,” says Rachael Bower, director of the Internet Scout Research Group at the University of Wisconsin–Madison. “On the other hand, storing oodles and oodles of digital material with no easy way to access it, to look through it and know what you have, doesn’t seem ideal either.”

Librarians and archivists must also consider the speed with which technology evolves. An archival copy of a podcast, say, must include the relevant software to play the actual show, even if advances in computing will eventually make that software obsolete.

Alexis Rossi, director of media and access at Internet Archive (IA), which maintains the Wayback Machine—a program that constantly browses the internet to record and replicate websites at specific moments in time—says preserving digital files remains a subjective task, for the most part.

“People self-select,” she says. “Somebody has decided that I have this amazing collection of [personal] material, and I need to find a home for it.”

But IA’s mission, as Rossi describes it, is “to archive all of human knowledge and to make it accessible to everyone.” She says more than 2 million people use the site daily, and her colleagues are working to make it more searchable. (IA recently received a couple of large grants, including one for more than $350,000 from IMLS to help expand the capacity for national web archiving.)

The Wayback Machine also houses collections of podcasts and blogs on the site, where individuals upload their own material onto secure IA servers. It’s where textfiles.com’s Scott now works.

Stanford University Libraries, meanwhile, with the assistance of a recent $685,000 National Leadership Grant for Libraries from IMLS, is developing the second phase of ePADD, an open-source discovery module that will provide researchers with easier access to email archives.

But most of the work around born-digital content is still preliminary. Kari R. Smith, a digital archivist at the Massachusetts Institute of Technology, says that umbrella organizations like ALA and the Society of American Archivists have round tables and working groups that are constantly looking at how to describe and capture this kind of material. Those groups are also looking at how to ensure like-minded people don’t waste finite resources on projects with duplicate aims.

“Making sure you’ve got some sense of why you’re preserving what you’re preserving long term,” Bower says, “is incredibly critical.”

ADAM DOSTER is a freelance writer living in Chicago.
