Vintage Technology and Bit Rot

Posted on December 8th, 2017 in Blog Post, Privacy & Security

I’m still looking for a Word file containing the 1995 family history I published. When I find this, it will be on a 5 1/4 inch floppy disk, and it is quite possible that current versions of Word will not support it. Fortunately I’ve saved a USB floppy disk drive, and to my wife’s frustration still have an old Windows system or two. I also have a 9-track tape with files saved from an earlier life — nice for show and tell. Most of my punch cards are gone (including my Master’s project), and I’m not sure if I have any paper tape samples left. All of these point to a problem that Vint Cerf calls “bit rot.”

Vint, “father of the Internet,” raised these issues at least as early as 2008. He has been an advocate for addressing these concerns since then. The good news is that the Library of Congress shares his concerns and has started to address this issue, at least for their own massive collections. David Pogue describes his own concerns and their resolution in a Nov. 2017 Scientific American column. But these issues need consideration by all technologists and technology users.

I had the opportunity to speak with Vint after he presented his perspective to the Library of Congress. He pointed out that it is not just having the floppy disk that is the issue. You need the drive, the operating system version that supports that device, and the software package used to create the file. You also need some way to display the content or, better, convert it to something more current. Once you have brought your 8-inch floppy WordStar 95 file into a current format, you will need to consider how to maintain it in a format you can read in 2095. This is all happening in a world where many companies try to use “proprietary formats” to lock customers into their products.

The OSI Model for communications that informed many pre-Internet efforts to network computers had a key stumbling block: the “presentation layer.” This small bit of magic would allow the data transferred by the lower layers of the model to be understood and used by the highest level, the application layer. In the early 1990s a number of us were concerned about the presentation layer. It affected everything from computer architecture to document formats. Which bit is presented first, the low-order bit or the high-order bit? That is the “endian” problem. How do you store bytes in a 16- or 32-bit word? That is the NUXI problem (“UNIX” as it appears on systems with the other 16-bit byte arrangement). Tim Berners-Lee (“father of the Web”) effectively solved much of this problem with HTML. In part the HTML format was easy to adopt and present, and it also became ubiquitous. There are still format problems on the Internet, vestiges of the proprietary thinking of the 20th century, and results of more recent innovation.
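For readers who have not met the byte-order problem firsthand, here is a minimal Python sketch of it. It is only an illustration: the helper name `swap_bytes_in_16bit_words` and the sample value `0x0A0B0C0D` are my own choices, not anything from the OSI documents or a particular system.

```python
import struct


def swap_bytes_in_16bit_words(data: bytes) -> bytes:
    """Return data as it would read if each 16-bit word used the opposite byte order.

    Hypothetical helper for illustration only.
    """
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    count = len(data) // 2
    # Read the bytes as little-endian 16-bit words...
    words = struct.unpack(f"<{count}H", data)
    # ...and write the same word values back out big-endian.
    return struct.pack(f">{count}H", *words)


# The same 32-bit integer laid out in the two byte orders.
value = 0x0A0B0C0D
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first
print(little.hex(), big.hex())

# The NUXI problem: text moved between machines that disagree on byte order.
print(swap_bytes_in_16bit_words(b"UNIX"))
```

The same four bytes mean different things depending on which convention the reader assumes, which is why a format has to state its byte order explicitly if it is to outlive the machine that wrote it.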

Oddly this leads to the reverse issue. A question I asked students when I was teaching was “what is the half-life of a web page?” While it might seem that Web content will disappear in the mists of time, that may be optimistic. Students too often post “amusing” images from a frat party, or spring break, only to find these images can affect job interviews and future opportunities. And it’s not just the pictures you take: with face recognition, everyone else’s photos become a source as well.

I think I know how this will play out. My 1995 family heirloom is lost forever, a victim of bit rot. The unexpected photos, perhaps inadvertently captured by a 21st century life-logger … those are forever.