Can You Trust Your Computer? - Part 2

by Michael Gemignani

My late wife Carol collected editions of The Night Before Christmas. I duly cataloged her collection using a CP/M program called Superfile. (Anyone out there remember CP/M?) Many years have passed since she died, and I am happily remarried, but I happened to come across some diskettes that contained her collection’s database. I was able to recover Superfile, but not the database. Alas, the diskettes were 5.25-inch floppies, notorious for how quickly they developed bad sectors. I happened to have a working 5.25-inch floppy drive, but the diskettes could not be read. The database I had so lovingly compiled many years previously was lost.

Note how many contingencies had to work to recover the database. First, I had to have a drive that could read 5.25-inch floppies. Second, the database on the floppies had to be readable. Third, I had to have the program used to construct the database. Fourth, because the program was written for CP/M, I had to have a way to run the program on a Windows system. This would have required a CP/M emulator – yes, they are out there. All of these steps had to work to recover the database, a long shot at best.

There is, of course, a lesson here. Thoughtfully designed and well-kept data from years ago may be worthless today for any one of a number of reasons: 1) The media on which the data were recorded may not be readable, either because the media have become corrupted or because there no longer exists a device that can read them. 2) The program that was used to record the data may be unavailable, or it may not run on available operating systems (Electric Pencil, anyone?). 3) The file system used to record the data may be inaccessible to modern operating systems, or there may be no programs that recognize its structure.

To some extent, of course, someone willing to go to the trouble could circumvent these problems, at least enough to recover some of the data. But does anyone really want to do this for book collections, family pictures, recordings of family stories, etc.? And how many have the expertise to undertake such a recovery in the first place?

Thankfully, backup mechanisms have become more reliable, and many of the problems with early tape drives, floppies, etc., have largely been overcome. But it remains true that storing data requires media to write it on, a device and software to do the writing, a file structure of some sort to organize the data, and a device and software to get it back.
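
One modest safeguard is to find out that stored files have gone bad before the day they are needed. A common approach is to record a checksum for each file when the backup is made and to recheck it from time to time. What follows is only a minimal sketch in Python, assuming the backup is an ordinary folder of files; the function names are made up for the illustration and do not come from any backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damage(root, manifest):
    """Return the names of files that are missing, unreadable, or changed."""
    root = Path(root)
    damaged = []
    for name, expected in manifest.items():
        try:
            ok = file_digest(root / name) == expected
        except OSError:
            # A file that is missing or cannot be read counts as damaged.
            ok = False
        if not ok:
            damaged.append(name)
    return damaged
```

The manifest would be built once, when the backup is written, and kept somewhere other than the backup itself. A check like this detects damage; it does not repair it, and it does nothing about a program or file format that has become obsolete.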

Anyone who follows the current computer scene recognizes that media, input and output devices, and software are evolving rapidly, probably far too rapidly for the average user and far too rapidly to ensure that data stored today will be fully recoverable ten years from today. You may hope that the family history so lovingly collected from relatives all around the country or the world will be available for your great-grandchildren because you burned the files and pictures to a DVD. Maybe it will be available to them, or maybe not.

But there is a problem beyond that posed by rapidly evolving technology. The sheer volume of data being churned out on computers around the world is so enormous that it raises the question of how much of this data is readily available to do useful work or to aid in making intelligent and reasoned decisions. I have, therefore, formulated Gemignani’s Laws.

Gemignani’s First Law: Data is useless if you can’t find it.

Gemignani’s Second Law: Data is useless if there is so much of it that you cannot make sense of it.

If you have an important file you need immediate access to, but there are tens of thousands of files like it on your hard drive, you have a real problem. Search programs like X1 can be lifesavers, but you still have to have a reasonably good idea of what you are trying to find.
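
For readers comfortable with a little scripting, the basic idea behind such a search can be sketched in a few lines of Python. This is only an illustration and not how X1 or any other product works; the function name, and the choice to match a keyword against file names and text contents, are assumptions made for the example.

```python
import os


def find_files(root, keyword, search_contents=True):
    """Yield paths under root whose name or text contents mention keyword."""
    needle = keyword.lower()
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # A match on the file name is enough; skip reading the file.
            if needle in name.lower():
                yield path
                continue
            if not search_contents:
                continue
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as f:
                    if any(needle in line.lower() for line in f):
                        yield path
            except OSError:
                # Unreadable files are skipped rather than stopping the search.
                continue
```

A call such as `list(find_files("C:/Users/me/Documents", "christmas"))` would return every file under that folder (a hypothetical path) whose name or text mentions the word. Note that the sketch bears out the point above: it helps only if you can supply a keyword in the first place.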

My second law, I fear, is amply illustrated by the state of the economy, where very smart people can choose their data to fit the conclusions they want to reach. Too much data allows anyone to support even positions that common sense tells us are wrong. And where do most economists get most of their data? Either directly or indirectly from a computer.

So you can trust your computer with your data, but only for so long. And you can find the data useful, but only if you can find it, and if there is not so much of it that trying to make sense of it leaves your head spinning. There are serious limits to trusting the machines on which we have become so dependent.

The Rev. Dr. Michael Gemignani, an attorney and Episcopal priest, is also a former professor of computer science who has written extensively on legal issues related to computers. Although he is now retired, he enjoys writing and speaking about computer law and security. Contact him at mgmign2@hal-pc.org with any questions or comments about this topic.