Technology sometimes falls short

This issue is a collection of stories about new technologies, about optimism, and about limits. A technological advance that allows humans to travel to space, to build higher, longer, and faster, or to bridge one condition or another may not always have the outcome we hoped for. And some technologies reshape our ways of thinking and living to such an extent that they themselves become platforms for new speculation.


How to Access Digital Files from the Nineties

A Case Study by Tessa Walsh

Step 1: Start with a physical piece of media. Identify what it is and what hardware you need to read it.

(This is a Zip disk.1)

Step 2: Connect the drive to BitCurator2 using a hardware write blocker3 to prevent any unintentional changes to the disk.

Make a bit-by-bit disk image4 of the drive.
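To make "bit-by-bit" concrete, here is a minimal Python sketch of what an imaging tool does at its core: it reads the source in fixed-size chunks, writes each chunk unchanged to an image file, and computes a checksum along the way. The function names and paths are hypothetical, and this is not the tooling used in the case study; in practice a dedicated imaging tool and a hardware write blocker do this work.

```python
import hashlib


def image_device(source: str, destination: str, chunk_size: int = 1024 * 1024) -> str:
    """Copy every byte of `source` into `destination` and return the SHA-256 digest."""
    digest = hashlib.sha256()
    # The source is opened read-only; a hardware write blocker is still the real safeguard.
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means we reached the end of the media
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(image_path: str, expected_sha256: str, chunk_size: int = 1024 * 1024) -> bool:
    """Re-read the image and confirm that its checksum still matches."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

Recording the checksum at acquisition time means the image can be re-verified later, for example with `verify_image("zipdisk.img", recorded_digest)`, where both names are placeholders.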

Step 3: Analyze the disk image.

(In this case, an HFS-formatted disk5 indicates that the files originate from a Mac.)
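One way a tool can tell that a volume is HFS is by looking for a two-byte signature 1024 bytes into the volume: "BD" for HFS, "H+" for HFS+, and "HX" for HFSX. The Python sketch below checks for that signature. It assumes the volume's starting offset within the image is already known (0 for an unpartitioned image); disks carrying a partition map would need that offset worked out first. The function name is hypothetical.

```python
from typing import Optional

# Two-byte volume signatures found 1024 bytes into a Mac volume.
MAC_VOLUME_SIGNATURES = {
    b"BD": "HFS",
    b"H+": "HFS+",
    b"HX": "HFSX",
}


def detect_mac_volume(image_path: str, volume_offset: int = 0) -> Optional[str]:
    """Return the Mac file system name for the volume, or None if no signature matches.

    `volume_offset` is the byte position where the volume begins inside the
    image; it is an assumption supplied by the caller, not detected here.
    """
    with open(image_path, "rb") as image:
        image.seek(volume_offset + 1024)
        signature = image.read(2)
    return MAC_VOLUME_SIGNATURES.get(signature)
```

A result of `"HFS"` is a strong hint that the files came from a Classic Mac OS machine; `None` only means this particular check found nothing at the given offset.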

Step 4: Characterize the disk image and extract files.

(For raw disk images of HFS-formatted disks, we can use HFSExplorer,6 either by itself or from within Brunnhilde.7)

Step 5: Look at the results and determine what you have.

(What are these eight .sea files? What else is on the disk?)
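A quick way to answer "what else is on the disk?" is to count the extracted files by extension. The sketch below is a minimal Python illustration; the function name and the directory it is pointed at are hypothetical, and reports from a tool such as Brunnhilde go much further than this.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root: str) -> Counter:
    """Count the files under `root` by lower-cased extension."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files with no extension are grouped under a readable placeholder.
            counts[path.suffix.lower() or "(none)"] += 1
    return counts
```

Extensions are only a first clue. Classic Mac OS files often have none, and a name like "01.05" would be counted under ".05", which says nothing about the format. That limitation is why the later steps identify formats from the files' contents instead.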

Step 6: Figure out how to work with it.

(In this case, we should check whether switching to a modern Mac helps us identify and work with these Mac-originating files. It seems likely that the .sea files are some sort of archive file.8 We can see if the command-line9 version of The Unarchiver10 knows what to do with them.)

Repeat step 5: Look at the results and determine what you have.

(The .sea files are archive files: StuffIt self-extracting archives,11 to be precise. Our Mac recognizes some files within these packages as older Word documents and leaves others unidentified.)

(Brunnhilde can help us understand the contents of this .sea archive.)

(But there’s still a question of making each individual file usable. Let’s focus on one file in particular—file “01.05” in the folder “Building Manag”. Modern macOS does not natively know which application to use to render this file.)

Repeat step 6: Figure out how to work with it.

(We can use Siegfried12 to identify the precise file format and version of the file in question.)
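The idea behind signature-based identification can be shown with a toy Python sketch: compare the first bytes of a file against a small table of known "magic" byte sequences. This is not how Siegfried is implemented, and its format registries hold far richer signatures than the three well-known ones listed here; the table and function name are illustrative only.

```python
# (offset, magic bytes, label): a deliberately tiny, illustrative signature table.
SIGNATURES = [
    (0, b"%PDF-", "PDF document"),
    (0, b"PK\x03\x04", "ZIP container"),
    (0, bytes.fromhex("D0CF11E0A1B11AE1"), "OLE2 compound file (e.g. Word 97-2003)"),
]


def identify(data: bytes) -> str:
    """Return the label of the first signature found at its expected offset."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return "unidentified"


def identify_file(path: str, header_size: int = 512) -> str:
    """Identify a file from its leading bytes, ignoring its name and extension."""
    with open(path, "rb") as handle:
        return identify(handle.read(header_size))
```

Because only the bytes are consulted, a file named "01.05" is treated exactly like one named "report.doc".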

(Our file is a ClarisWorks word processing file! Maybe Word will know what to do with it… Hmm. We’re getting there, but what is with the weird formatting? Maybe LibreOffice13 can do a better job of rendering the file.)

(Voilà! Now we export the file into a preservation-friendly file format such as PDF/A14 so that we do not have to repeat this process every ten years.)
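The export step can also be scripted. The Python sketch below builds and runs a headless LibreOffice conversion to PDF. It assumes the LibreOffice binary is reachable as `soffice` (its name and location vary by platform), and it produces an ordinary PDF: selecting PDF/A requires additional export options that differ between LibreOffice versions and are not shown here. The function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(source: str, out_dir: str, soffice: str = "soffice") -> List[str]:
    """Build the argument list for a headless LibreOffice conversion to PDF."""
    return [
        soffice,
        "--headless",           # run without opening any windows
        "--convert-to", "pdf",  # target format (plain PDF, not PDF/A)
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source: str, out_dir: str, soffice: str = "soffice") -> None:
    """Run the conversion; raises CalledProcessError if LibreOffice reports failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(source, out_dir, soffice), check=True)
```

Keeping command construction separate from execution makes it easy to inspect or log exactly what will be run before touching any files.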

Step 7: Scale your operation.

(If we want to carry out this type of digital archaeology or digital preservation at larger scales, say for every file in that SEA package, every SEA package on that disk, everything in that archive, or the whole of the CCA Collection, we need to automate processes where we can. Luckily, we can use scripts and digital preservation software like Archivematica15 to handle tasks such as format identification, characterization, and normalization of the most common formats in batches, allowing us to spend our time on the difficult and interesting cases.)
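The division of labour described here, batches for the common formats and human attention for the rest, can be sketched in a few lines of Python. The rule table, format labels, and function name below are hypothetical and do not reflect how Archivematica is configured; they only illustrate the triage idea.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical rules mapping an identified format to a preservation action.
NORMALIZATION_RULES = {
    "ClarisWorks word processing": "convert to PDF/A",
    "Microsoft Word": "convert to PDF/A",
    "StuffIt archive": "unpack, then triage the contents",
}


def triage(identified: Iterable[Tuple[str, str]]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split (path, format) pairs into per-action batches and a manual-review list."""
    batches: Dict[str, List[str]] = defaultdict(list)
    manual_review: List[str] = []
    for path, file_format in identified:
        action = NORMALIZATION_RULES.get(file_format)
        if action is None:
            manual_review.append(path)  # no rule yet: a person needs to look at it
        else:
            batches[action].append(path)
    return dict(batches), manual_review
```

Each batch can then be handed to an automated step, while the manual-review list holds the difficult and interesting cases.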

This Zip disk is part of the KOL/MAC Project Records.

  • 1

    Zip disks are high-capacity floppy disks produced by Iomega that were commonly used as backup and transfer media in the nineties. 

  • 2

    The BitCurator Environment is a suite of free and open-source digital forensics tools and software. BitCurator was created at the University of North Carolina at Chapel Hill specifically for use by collecting institutions, and is now maintained by the non-profit organization BitCurator Consortium, of which the CCA is a member. 

  • 3

    Write blockers are devices that allow users to interact with and acquire data from digital storage media without altering their contents (i.e., files and file system metadata such as the dates files were last modified). 

  • 4

    A disk image is a digital file that perfectly replicates the content and structure of a physical storage medium such as a hard drive or floppy disk. Disk images retain all qualities of the original media in a software form that is easier to interact with and preserve over the long term. 

  • 5

    Storage media such as hard drives and floppy disks use file systems to keep track of files and their key metadata (dates, file type, permissions, and so on). Hierarchical File System (HFS) is a proprietary file system used by Apple on their “Classic” Mac OS–era devices. 

  • 6

    HFSExplorer is an application that can read and export files from Mac-formatted storage media, including those using the HFS file system. 

  • 7

    Brunnhilde is a free and open-source characterization tool for directories and disk images that produces human- and machine-readable reports to assist in the triage and description of digital archives. 

  • 8

    An archive file is a digital file composed of one or more digital files bundled together for portability and storage. Archive files are often but not always compressed.

  • 9

    Command-line utilities are programs that do not have graphical interfaces (i.e., windows with buttons). Instead, users interact with command-line utilities by issuing commands as text in a terminal.  

  • 10

    The Unarchiver is an archive un-packaging program for Mac designed to handle more archive file formats than can be natively handled by macOS, with both graphical and command-line interfaces. 

  • 11

    StuffIt is a proprietary archive utility that can create “self-extracting archive” files, which in theory unpack themselves without any additional software; StuffIt Expander is its companion unpacking utility. StuffIt was extremely popular in the 1990s. Despite the promise of the name, self-extracting archives are not always capable of unpacking themselves on modern computers without the assistance of additional software.

  • 12

    Siegfried is a free and open-source signature-based file format identification tool. “Signature-based” means that Siegfried identifies files by comparing byte sequences within the file against known signatures collected in format registry databases rather than relying on the file’s (arbitrary and sometimes inaccurate) extension.

  • 13

    LibreOffice is a free and open-source office suite forked from OpenOffice and maintained by the Document Foundation. LibreOffice is capable of reading files created in a number of obsolete and/or niche word processing formats. 

  • 14

    PDF/A is an ISO-standardized version of the popular PDF file format intended for use in long-term preservation and archiving. 

  • 15

    Archivematica is a free and open-source digital preservation system that manages the characterization, packaging, storage, monitoring, and retrieval of digital files in large-scale digital archives. A number of institutions besides CCA use Archivematica, including MIT Libraries, Tate, and the Museum of Modern Art. 
