Zen And The Art Of Data Restoration

Standardization is elusive, even though there are so many standards to choose from. Or perhaps BECAUSE there are so many. Nevertheless, for many years I have been pursuing an elusive goal of digitizing, standardizing, and archiving all of my data. In this chapter of the journey, I ended up purchasing a 15-year-old computer to restore one file.

By Erik J. Heels

First published 5/1/2003; Law Practice Management magazine, “nothing.but.net” column; American Bar Association

I have a love/hate relationship with computers (see “Top 10” lists below). I love them because they can be powerful and helpful, but I hate them because they can be annoying and counterproductive. Plus they do fail, they will break, people will make mistakes, and data will be lost.

In Robert M. Pirsig’s “Zen and the Art of Motorcycle Maintenance,” the storyteller (I hesitate to say “main character”) Phaedrus says, “Motorcycle maintenance gets frustrating. Angering. Infuriating. That’s what makes it interesting.” So, too, with computers. Through the years, I have modified my data backup procedures in order to account for the ever-increasing amount of data I create each year. There have been, however, notable failures along the way:

  • In May 1988, a Macintosh floppy disk containing my thesis failed the day before my thesis was due and resulted in the loss of three key hours of work.
  • In May 1999, a denial-of-service attack on my website led to a hard disk crash and partial loss of three months of recent data as well as most of my data from MIT and law school (http://www.abanet.org/lpm/mo/premium-ep/magazine/articles/magarticle31_front.shtml). My thesis was spared, thanks in part to excellent data recovery work by Ontrack (http://www.ontrack.com/).
  • In September 2000, my first Windows online backup provider failed to include the last month of data in restoring a full backup to CD-ROM (then went out of business). So I switched to a second Windows online backup provider, Connected.com (http://www.connected.com/), with whom I am still satisfied. I had used the first provider for both Macintosh and Windows backups, but Connected.com does not support Macintosh, so I also switched to a second Macintosh online backup provider.
  • In January 2002, I changed the name of the top-level directory on my Macintosh that contained all of the data that I back up. My second Macintosh online backup provider ended up making two copies of all of my data because their algorithm (unlike Connected’s) wasn’t sophisticated enough to figure out that the underlying files had not changed.
  • In March 2002, while doing a test restore from a Windows98 backup (from the first Windows online backup provider mentioned above), I discovered that my Windows2000 computer could not read the Windows98 backup CDs. Nor could any other computer. This made me want to sue that (out-of-business) first backup provider, but I started writing this article instead.
  • In April 2002, I started standardizing my Macintosh and Windows filenames. I adjusted the MIME settings in Windows, MacOS, Eudora, and Netscape to correctly handle downloads and attachments. I eliminated non-standard characters (i.e. all but a-z, 0-9, “.”, “-”, “_”) from filenames, and converted old files to more portable formats (e.g. JFax files to PDFs, MacPaint files to GIFs). For Macintosh files, I added three- or four-letter extensions to each filename. I used two programs to rename batches of files at once: UtilityDog (http://www.probabilityone.com/) for Macintosh files and Magic File Renamer (http://mfr.queryweb.com/) for Windows files. I consider both utilities essential. There were about five times as many file types on the Macintosh as on Windows, but the majority of files on both platforms were in a handful of formats. On the Macintosh, there were 12,500 files and 60 file types, and of these:
    1. 60% were standards-based web files (HTML, “.gif”, “.jpg”);
    2. 17% were proprietary MS Word files (“.doc”);
    3. 6% were standards-based Eudora mailbox files (“.mbx”) (yes that’s 745 mailbox files containing 80,000 messages);
    4. 2% were proprietary MS Excel files (“.xls”);
    5. 2% were proprietary (but portable) Adobe Acrobat files (“.pdf”); and
    6. 13% were everything else. I also noticed that I was no longer using PowerPoint, and I began to wonder why I kept purchasing MS Office if I was only using Word and Excel (but more on this topic later).
  • In May 2002, I decided to move all of my archived Eudora e-mail from my Macintosh to my Windows computer. I later decided to move all of the rest of my files so that I could have all of my data on one computer and backed up by a reliable online backup provider. I canceled my account with my second Macintosh online backup provider.
  • In October 2002, as I was preparing to purchase two new computers (http://www.abanet.org/lpm/mo/premium-ep/magazine/articles/magarticle31_front.shtml), I discovered that one file (out of 12,500) from the May 2002 Macintosh-to-Windows data migration had been lost: the Macintosh program that I wrote for my thesis. Essential? No. Sentimental value? Yes.
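
The April 2002 filename cleanup described above can be sketched in a few lines of Python. This is only an illustration of the rule stated in the text (keep a-z, 0-9, “.”, “-”, and “_”, and give extension-less Macintosh files an extension); it is not how UtilityDog or Magic File Renamer work, and the names `portable_name` and `default_ext` are hypothetical.

```python
import re

# Runs of anything outside a-z, 0-9, ".", "-", "_" are treated as non-portable.
_NON_PORTABLE = re.compile(r"[^a-z0-9._-]+")


def portable_name(name: str, default_ext: str = "") -> str:
    """Return a lowercase filename limited to a-z, 0-9, '.', '-' and '_'."""
    # Lowercase first, then collapse each run of other characters into one "_".
    cleaned = _NON_PORTABLE.sub("_", name.lower()).strip("_")
    # Classic Macintosh files often had no extension; optionally append one.
    if default_ext and "." not in cleaned:
        cleaned = f"{cleaned}.{default_ext}"
    return cleaned


if __name__ == "__main__":
    for old in ["My Thesis.c", "Résumé 2002", "notes"]:
        print(old, "->", portable_name(old, default_ext="txt"))
```

Replacing every run of disallowed characters with a single underscore is one possible choice; a real migration would also need to detect two different old names collapsing into the same new name.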

Problem – Finding an Old Macintosh

After checking with the MIT Libraries, my thesis advisor, and various backup providers (including my last Macintosh backup provider, which had already deleted all of my files), I concluded that the only way to recover my thesis would be to recompile the application from the source code, which I still had. I needed to purchase a computer, operating system, and the compiler in order to recompile my thesis.

I decided to try to purchase exactly the same environment that I had in 1988: a Macintosh SE with 4 MB of RAM, a 20 MB hard disk running MacOS 5.1 and the Think Lightspeed C 2.13 compiler. Thanks to eBay, I had to “settle” for a Macintosh SE with 4 MB of RAM and a 40 MB hard disk ($32 including postage) and Think C 3.0 ($8 including postage). I tried to purchase the compiler directly from Symantec, but they never responded to my e-mail. And Apple’s website (http://www.info.apple.com/support/oldersoftwarelist) included a newer version of Macintosh System 7.1 that would run on my Mac SE.

Problem – Reading/Writing Macintosh Disks in Windows

Although my Mac SE included a “SuperDrive” (what should be called a “normal drive” – see “Top 10” list below) for reading/writing both Macintosh and DOS disks, it did not include the software (i.e. PC Exchange by Apple, Access PC by Insignia Solutions, or DOSMounter by Dayna Communications) for enabling this feature. I had no analog modem, and the Ethernet card had no driver, so the floppy disk was the only way to copy files to the Mac SE. Fortunately, I found a piece of software for Windows, TransMac v5.4 (http://www.asy.com/), that was able to read/write Macintosh HFS disks. And because I had purchased a Dell laptop with a floppy disk drive, I ended up with the hardware and software necessary to get data onto my Mac SE (i.e. from the Internet, to the Dell, to TransMac, to a floppy disk, to the Mac SE).

Problem – Installing the Ethernet Driver and Connecting to the Internet

The Mac Driver Museum (http://www.macdrivermuseum.com/network.html) was an excellent source of drivers for Macintosh NICs (Network Interface Cards), but I didn’t know which brand of Ethernet card was installed in my Mac SE. I used Google to figure out the brand of Ethernet card by searching for the random text that was visible on the back of the Ethernet card (http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=tpl+tpn+tk). By doing this, I discovered that it was an Asante card, and I was able to find and install the correct driver.

My Macintosh does not use the newer OpenTransport networking but instead uses the older “classic networking,” which does not support dynamic IP addressing, but my network uses a DHCP router (Asante FR3004LC) to dynamically assign IP addresses. I also have a router at home and eventually plan to connect the two networks with a VPN (Virtual Private Network). As such, I have configured the routers so that my work router assigns IP addresses ending in 100-149 and my home router assigns IP addresses ending in 150-200. I then assigned my Mac SE a static IP address outside the ranges of IP addresses dynamically assigned by the routers.
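
As a rough sketch of the addressing plan above, the following Python checks whether a candidate static address avoids both DHCP pools. The last-octet ranges (100-149 and 150-200) come from the text; the 192.168.1.0/24 network prefix and the names `DHCP_POOLS` and `is_safe_static` are assumptions made only for illustration.

```python
import ipaddress

# Last-octet ranges handed out by each DHCP router, as described in the text.
DHCP_POOLS = {
    "work": range(100, 150),  # .100 through .149
    "home": range(150, 201),  # .150 through .200
}


def is_safe_static(address: str, network: str = "192.168.1.0/24") -> bool:
    """True if address is a usable host in network and outside every DHCP pool.

    The default network prefix is hypothetical; substitute the real one.
    """
    ip = ipaddress.ip_address(address)
    net = ipaddress.ip_network(network)
    # Reject addresses on another network, plus the network/broadcast addresses.
    if ip not in net or ip in (net.network_address, net.broadcast_address):
        return False
    last_octet = int(ip) & 0xFF
    return all(last_octet not in pool for pool in DHCP_POOLS.values())


if __name__ == "__main__":
    for candidate in ["192.168.1.50", "192.168.1.120", "192.168.1.201"]:
        print(candidate, is_safe_static(candidate))
```

Keeping the two pools disjoint, as the routers are configured here, means the same check keeps working once the two networks are joined by a VPN.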

Problem – Some Stuffit Archives Would Not Open on the Mac SE

I wanted to cut out the PC middleman and download software directly from the Internet to my Mac SE, but I needed to install various Internet software programs (web browsers and the like) in order to do this. As I did this, I discovered that I was able to open some Stuffit archives (Stuffit “.sit” files are to the Mac what WinZip “.zip” files are to Windows) on my new iMac but not on my Mac SE. I ended up learning more about compression from Stuffit’s website (http://www.stuffit.com/stuffit/compression.html). It turns out that not all Stuffit archives are created equal, that the old versions of Stuffit (which the Mac SE can run) can’t open files archived with newer versions of Stuffit (which the Mac SE can’t run), and if I wanted to open Stuffit archives on my Mac SE, they would have to be encoded with the older Stuffit Lite v3.5 (as opposed to the now-current Stuffit Deluxe v7.0.1). So I had to use my new iMac to read Stuffit archives and then re-archive them with Stuffit Lite v3.5 so that they could be expanded properly on the Mac SE. In order to make sure that all of my old Stuffit archives could be un-stuffed on the Mac SE, I re-stuffed these and used “.sit35” as the filename extension for these files. Finally, since Stuffit archives (and ZIP archives, for that matter) are binary (i.e. 8-bit) files, they must be FTP-ed in binary mode. If you mistakenly FTP a binary file in ASCII (i.e. 7-bit) mode, the file will be toast. So to be really safe, you should convert all of your binary archives to 7-bit files with BinHex (which Stuffit Lite v3.5 can also do). UUencode does the same thing for UNIX and Windows files, and WinZip includes the option to UUencode your ZIP archives.
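
The binary-versus-ASCII problem can be illustrated with a small Python sketch. It makes two simplifying assumptions: an ASCII-mode transfer is modeled as nothing more than rewriting each line feed as CR/LF, and uuencoding (via the standard `binascii` module) stands in for BinHex as the 7-bit encoding. The function names are hypothetical.

```python
import binascii


def ascii_mode_transfer(data: bytes) -> bytes:
    """Crude model of an ASCII-mode transfer: every LF becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


def to_seven_bit(data: bytes) -> bytes:
    """Uuencode arbitrary bytes as text lines (at most 45 input bytes per line)."""
    chunks = (data[i:i + 45] for i in range(0, len(data), 45))
    return b"".join(binascii.b2a_uu(chunk) for chunk in chunks)


def from_seven_bit(text: bytes) -> bytes:
    """Decode uuencoded lines, accepting either LF or CR LF line endings."""
    return b"".join(binascii.a2b_uu(line) for line in text.splitlines() if line)


if __name__ == "__main__":
    archive = bytes(range(256))  # stand-in for a binary archive
    # Sent raw in ASCII mode, the payload no longer matches the original.
    print(ascii_mode_transfer(archive) == archive)
    # Encoded as 7-bit text first, it survives the same line-ending rewrite.
    print(from_seven_bit(ascii_mode_transfer(to_seven_bit(archive))) == archive)
```

The encoded form is larger than the original, which is the price of surviving a transfer that only promises to preserve text.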

Problem – Finding Old Macintosh Software

Some of the original FTP shareware archives were operational but had only a smattering of shareware (what would Orwell say?), including:

  • Apple Computer (ftp://ftp.apple.com/dts/mac/).
  • Dartmouth College (ftp://ftp.dartmouth.edu/pub/), home of Fetch FTP software.
  • Info-Mac (ftp://sumex-aim.stanford.edu/info-mac/).
  • NCSA (ftp://ftp.ncsa.uiuc.edu/pub/), home of NCSA Telnet and NCSA Mosaic web browser.
  • University of Michigan (ftp://mac.archive.umich.edu/mac/).
  • Washington University (ftp://wuarchive.wustl.edu/mirrors/info-mac/).

Fortunately, much of this software has also been archived (albeit sporadically) on the web:

The following websites are good sources for current (and occasionally some older) Macintosh shareware:

Finally, the following websites were helpful in my quest to get my Mac SE configured and on the Internet:

Problem – Recompiling my Thesis

Once I managed to get my Mac SE configured and on the Internet, it was actually quite easy (comparatively) to recompile my thesis. I had to change a couple of things due to a few syntax changes from Think Lightspeed C 2.13 (which I originally used) to Think C 3.0. Whenever I got an error message, I searched Google and found the solution to my problem. So this is version 1.1 of my thesis. And it is very well backed up.

Lessons Learned and Next Steps

Along the way, I discovered that some old System 6 and System 7 applications (i.e. those written for Macs with Motorola 68000 CPUs, also called “68K Macs”) also ran on my new iMac, which runs Mac OS X natively but which also runs MacOS 9 in “classic mode.” But the iMac did not run the compiler, so my Mac SE purchase was not in vain. I now keep the Mac SE in my office as a reminder of why good data backup (and restore) procedures are important, why they should be periodically tested, and why old Macs are cool.

I was not able to find a copy of my favorite game MacCommand, but I was able to remember the author’s name simply by thinking really hard, which was an interesting sort of “restore” process in its own right. Was it Al Tervalon? No, he was a guy who sang with me in the MIT Logarhythms. Was it Linus Torvalds? No, he’s some Finnish programmer who went to school with my cousin Heidi and ended up writing a nifty piece of software you may have heard about. It was Avadis Tevenian, and Avie, if you’re out there, I’d enjoy purchasing a copy of MacCommand for my kids (OK, and for me).

This article took a long time to write, but (again quoting Pirsig’s “Zen and the Art of Motorcycle Maintenance”) “[w]hen you want to hurry something, that means you no longer care about it and want to get on to other things.” I do care about the Quality of my data. I do want to digitize my old photos, cassette tapes, VHS tapes, and vinyl records. I do feel a need to move away from proprietary file formats and operating systems to standards-based ones. In January 2003, I moved to a UNIX-like directory structure (e.g. keeping my data in “/usr/erik/”). But why do I keep purchasing Microsoft Office when all I use is Word and Excel? And why Windows? When I converted all of my Macintosh files to Windows-like filenames, I ended up moving all of my Macintosh data to my Windows computer. Will moving to a UNIX-like directory structure signal a move from Windows to UNIX? Stay tuned.


Sidebar – Top 10 Annoying Things About Macintosh Computers

  1. Mice. The one-button mouse provides limited functionality.
  2. Disks. The 3.5″ floppy disks cannot be read in a PC without additional software. When Apple finally fixed this flaw, they called the new drive a “SuperDrive,” which, annoyingly, requires additional software in order to read/write PC floppy disks.
  3. Monitors. Computers frequently come with integrated monitors, which leaves users with no display options.
  4. Keyboard Commands. The keyboard has oddly named keys such as “escape” and “control” and baffling commands such as command-control-power to reboot and command-option (on startup) to rebuild the desktop database.
  5. Text Files. In UNIX, text file line-endings are terminated with a newline (\n), which is also called a linefeed (LF), but in Macintosh (pre-OS X, which uses the Unix standard), line-endings are terminated with a single carriage return (CR) (\r).
  6. Files. Some versions of MacOS allow only 32-character file names. Some non-alphanumeric characters are allowed in filenames, which can cause problems on UNIX systems. Plus files are stored in two parts or “forks,” a data fork and a resource fork, so if you copy a binary file from a Macintosh computer to a non-Macintosh computer, you may end up losing half of the file (which could render the file useless).
  7. File/Application Mappings. Files include a four-letter “creator” code that dictates which program will open the file when you double-click on it. It’s nontrivial to change the mapping for a particular file, especially within e-mail and web applications.
  8. Networking. Rather than use existing Unix networking protocols, Apple decided to create its own protocols such as AppleTalk.
  9. User Interface. Apple arguably stole the user interface for the Macintosh from Xerox PARC and got away with it.
  10. Operating System. The MacOS operating system is inherently proprietary. Even though Mac OS X is based on FreeBSD, there is no indication that the operating system will ever be licensed to third party developers or converted to open source.

Sidebar – Top 10 Annoying Things About Windows Computers

  1. Mice. The two-button mouse is confusing and hard to use for left-handed people (or right-handed people like me who use the mouse with their left hand).
  2. Disks. PCs are not smart enough to know when floppy disks are inserted or removed. The “abort, retry, fail” error still occurs today and is a classic example of horrible user interface.
  3. Monitors. Computers typically do not include monitors, and there are too many monitors to choose from.
  4. Keyboard Commands. The keyboard has oddly named keys such as “escape” and “control” and baffling commands such as control-alt-delete to reboot and F8 (on startup) to boot in “safe mode.”
  5. Text Files. In UNIX, text file line-endings are terminated with a newline (\n), which is also called a linefeed (LF), but in Windows, line-endings are terminated with a combination of a carriage return (\r) and a newline (\n), also referred to as CR/LF.
  6. Files. Some versions of Windows allow only 32-character file names, and older versions allowed only 8-character filenames (plus three-letter extensions). Some non-alphanumeric characters are allowed in filenames, which can cause problems on UNIX systems. Plus many files are stored as “hidden” files by default.
  7. File/Application Mappings. Files include a three-letter “extension” that dictates which program will open the file when you double-click on it. It’s nontrivial to change the mapping for a particular file, especially within e-mail and web applications.
  8. Networking. Rather than use existing Unix networking protocols, Microsoft decided to create its own protocols such as Windows Networking.
  9. User Interface. Microsoft arguably stole the user interface for Windows from Apple and got away with it.
  10. Operating System. The Windows operating system is inherently proprietary, and Microsoft itself has admitted that Windows95 is “inherently insecure.” There is no indication that Windows will ever be licensed to third party developers or converted to open source.
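
The three line-ending conventions from the “Text Files” entries above (LF on UNIX, CR on the pre-OS X Macintosh, CR/LF on Windows) can be reconciled with a short Python sketch. The function name `normalize_newlines` is hypothetical, and the approach assumes the input is plain text rather than a binary file.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\n") -> bytes:
    """Rewrite CR LF, lone CR, and LF line endings as a single convention."""
    # Collapse CR LF first so it is not counted as two separate line endings.
    unix = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unix if newline == b"\n" else unix.replace(b"\n", newline)


if __name__ == "__main__":
    mixed = b"windows line\r\nclassic mac line\runix line\n"
    print(normalize_newlines(mixed))                   # UNIX (LF)
    print(normalize_newlines(mixed, newline=b"\r\n"))  # Windows (CR/LF)
    print(normalize_newlines(mixed, newline=b"\r"))    # classic Macintosh (CR)
```

Running this over a binary archive would damage it for the same reason an ASCII-mode FTP transfer does, so it should be applied to text files only.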
