Welcome back to the C-SPAN Chronicles! I am your host, Kelsey Kim, the newest addition to SCRC, and the new C-SPAN Project Archivist.
Hopefully our return hasn’t set your head spinning. Amanda Brent, processor extraordinaire, walked you through the lengthy course of processing all of the hundreds of boxes of C-SPAN material in 2017. I am here to walk you through the next stage: digitization.
Now, for many people, digitization conjures up images of a lone person sitting in a dark room, slowly moving a paper from the “to do” pile, to the scanner, to the “done” pile. While that is part of the process, it really just scratches the surface of all of the work that goes into digitizing archival records. In fact, the actual scanning is approximately a third of the work involved in digitization.
Before beginning any digitization project, there is a LOT of prep work to do. My first step was to get to know the records! This is no easy task: post-processing, the C-SPAN collection comprises 469 boxes filled with paper records, photographs, film negatives, audiotapes, videotapes, hand-drawn illustrations, and more! As I went through these records, I tried to get a sense of the breadth of the content, its organization, the physical condition of the material, the extent of protected information, and just generally what seemed interesting! This involved many days of sitting in the climate-controlled stacks room, going through boxes one at a time and taking notes on what I found. I also needed material counts from a sample of boxes so that I could roughly estimate how much there was to digitize. I took a 10% sample (keep in mind, that’s more than 40 boxes) and counted the items within them, calculating at the end that there were, on average, 766 documents per box. Oof!
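For the curious, here is a rough sketch of the arithmetic behind that estimate. The only numbers taken from this post are the 469 boxes, the 10% sample, and the 766-documents-per-box average; the function names are made up for illustration, and the total is a simple extrapolation that assumes the sampled boxes are representative of the rest.

```python
import math


def sample_size(total_boxes, fraction):
    """Number of boxes to count for a sample, rounded up to a whole box."""
    return math.ceil(total_boxes * fraction)


def mean_per_box(counts):
    """Average item count across the sampled boxes."""
    return sum(counts) / len(counts)


def estimate_total(total_boxes, avg_per_box):
    """Extrapolate a collection-wide count from the per-box average.

    Assumes the sample is representative, so treat the result as a
    ballpark figure rather than an exact count.
    """
    return round(total_boxes * avg_per_box)


TOTAL_BOXES = 469        # from the post
AVG_DOCS_PER_BOX = 766   # from the post

print(sample_size(TOTAL_BOXES, 0.10))                 # 47
print(estimate_total(TOTAL_BOXES, AVG_DOCS_PER_BOX))  # 359254
```

In other words, a ballpark of roughly 359,000 documents, which is why scanning everything one sheet at a time was never going to be realistic.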
The next step was to run some tests. In the SCRC, the digitization work is done on flatbed scanners, which makes the digitization project lengthy. As I tested how long it would take me to digitize a single box of material, I quickly learned that we needed to speed up the process significantly. This formed the next step of the process—equipment upgrades. Now, for a small sidebar.
*SIDEBAR* Document scanning is done in different ways, but for archival digitization, there are specific requirements to fulfill. Because these documents are more fragile, you cannot stick them in a sheet feeder (they might jam, and there goes your unique, historic document!), you can’t expose them to too much heat (they can fade or become more brittle or straight up burn!), and you can’t put too much pressure on them (they can actually break!).
Flatbed scanners have been a go-to for digitization in a bunch of places, from your grandma scanning old family photos, to your insurance agent sending you policy documents, to your local archives offering digital copies of original records. The reasoning here is that flatbed scanners are easy to use, inexpensive to purchase, and produce a quality image. They are generally safe for archival digitization because they are a low-heat, low-pressure, single-sheet solution. However, flatbed scanners have drawbacks: they are relatively slow (and sometimes need breaks to keep from overheating), they can only digitize flat records (don’t press down on old book bindings!), and the quality, while fairly good, has its limits.
As a result of these drawbacks, many institutions (university libraries, archives, museums, etc.) have switched to a new method: digital cameras. With a digital camera, a technician can get a high-quality image in the space of a second and can capture more than just flat material without applying unsafe pressure. The downside? They are more expensive (but getting cheaper) and require a lot of setup. For professional mass digitization, you don’t want to just stand there and shoot picture after picture like you do on your phone. You have to make sure the documents are properly lit (with cool lights that don’t radiate heat), the camera is stationary and level, the page is appropriately framed, and the color is accurate, and you have to keep all of that consistent from shot to shot. That calls for a copystand!
A copystand is essentially a camera mounted to a tall arm so that it shoots directly down at a document on a flat, nonreflective surface. The document is lit by LED or CFL bulbs arranged at angles to avoid glare. Sounds easy, right? Well, it can be, but we needed a BIG ONE, so we could fit records of all kinds and sizes on it. So my next task was to figure out how to get one of those! It took a lot of research, too many sales phone calls, and a whole lot of persuasion, but finally, the order was in! Now, back to getting the materials ready to go. *SIDEBAR OVER*
Given the number of documents and the time available, it was clear that digitizing the entire collection was not possible, so my next task was to use the notes from my initial perusal to prioritize what I thought should be digitized. I zeroed in on a few series which struck me as informative, interesting, and illustrative. Now, with the number of boxes whittled down, I had another big task before we could move to the actual “paper to computer” phase.
That big task? METADATA. For those unfamiliar with the term, here’s a description from the National Archives. Want a shorter summary? Metadata is basically information related to an object or document or file. Searching for an item in your library catalog? Metadata helps you find it. Looking for a song on Spotify? You’re searching metadata. Trying to find that unflattering photo of yourself that grandma recently scanned in?
Oof, better hope you have some metadata (or you’re in for a lot of random clicking)! Metadata can mean anything from title or author to file size or date modified. For an archival record, you might want to collect metadata on who the author is, when it was written, what the subject is, when it was digitized, what file format it’s now in, and so on. Of course, you have to be selective about what you need—you don’t want to give yourself 50 boxes to fill in for every item! Luckily, there are some guidelines to help identify what metadata is useful. For C-SPAN, we identified several elements we wanted, including title, description, date, location, and others. For our own purposes, we’ll also collect data on the digitization process.
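To make that a little more concrete, here is a minimal sketch of what a per-item metadata record could look like. The four descriptive fields (title, description, date, location) are the ones named above; everything else, including the class name, the two digitization-process fields, and the placeholder values, is a hypothetical stand-in and not our actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ItemRecord:
    # Descriptive elements named in the post
    title: str
    description: str
    date: str
    location: str
    # Hypothetical digitization-process fields (assumed for illustration)
    digitized_on: Optional[str] = None
    file_format: Optional[str] = None


DESCRIPTIVE_FIELDS = ("title", "description", "date", "location")


def missing_fields(record):
    """Return the names of descriptive fields that were left blank."""
    return [name for name in DESCRIPTIVE_FIELDS
            if not getattr(record, name).strip()]


# Placeholder values only; this is not a real item from the collection.
example = ItemRecord(
    title="Example item title",
    description="Example description",
    date="",  # left blank on purpose
    location="Example location",
)

print(missing_fields(example))  # ['date']
print(asdict(example)["title"])  # Example item title
```

Keeping the list of fields short is the point: a quick check like `missing_fields` can flag gaps before an item heads to the camera, without turning every record into 50 boxes to fill in.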
Now we’re done, right? Ready to scan? Almost! We still have to outline the workflow, physically prepare the records (remove staples, smooth out folds, etc.), set up rules about file naming, and determine the storage structure. For some records, we’ll also collect preliminary metadata before they even go to the camera! We definitely have more work ahead! But more on that later…
Fascinated and want to learn more? Here you go!
Why don’t archivists digitize everything?
https://peelarchivesblog.com/2017/05/31/why-dont-archivists-digitize-everything/
More on copystands
https://www.sustainableheritagenetwork.org/digital-heritage/copystand-equipment-and-setup-tutorial
More on metadata—it’s everywhere!
https://www.lifewire.com/metadata-definition-and-examples-1019177
The C-SPAN Chronicles: Season 2, Part I