The first time I heard about the Digital Dark Age was from Vinton Cerf, VP of Google, several years ago at a conference. At that time it sounded like an apocalyptic sci-fi story.
“As the way that we store information about ourselves develops, memories stored in files that use older technology are becoming harder to access. That could mean that historians of the future are unable to learn about our lives”
However, as the digital age progresses at an ever-increasing pace, that apocalyptic future seems to be here already. Code we wrote a few years ago depends on libraries and tools that are no longer maintained or supported. Software tools we used can no longer be started on newer operating systems. Virtual machine snapshots cannot be restored because they depend on a specific outdated OS, or even on a chip architecture that is no longer in use (think ARM vs Intel x86).
“Abandonware is a product, typically software, ignored by its owner and manufacturer, and for which no official support is available. Types of abandonware are: commercial software unsupported but still owned by a viable company, commercial software owned by a company no longer in business, unsupported or unmaintained shareware, open source and freeware programs that have been abandoned, orphaned code.”
It happened to me several times in the past few months. I had to restore some content from my backups and it failed!
- Code: 2 years ago I wrote some code in Node.js v8 for Lambda. It can no longer be executed, because Lambda has dropped support for the Node.js 8 runtime. => the only option is to replatform
- AMI: 4 years ago I created a snapshot of my EC2 virtual machine as an Amazon Machine Image. It uses an old Linux version that cannot be started again on EC2. => abandonware
- Database: 6 years ago I developed an application on MySQL v5.5 that I wanted to start again. Amazon RDS no longer supports that version. => abandonware
This is not specific to any public or private cloud vendor. It happens everywhere. There are tools out there (still, but for how long before they become abandonware as well?) that can migrate older formats to newer versions, but in the end it is an investment. It takes time and money, and the result might not be 100% successful (data corruption etc.).
I often see companies neglecting backup strategies or, even worse, not having backups at all.
“Jesus saves, but God backs up”
Saving data is not the same as preserving data. Data preservation is done through formal activities that are governed by policies, regulations and strategies directed towards protecting and prolonging the existence and authenticity of data.
There are industries where regulators demand that backups be kept for the last 10 years. If those backups are VM or DB snapshots (like my AMI and database cases above), then good luck with that. How and where will you restore them?
The question is: what can we do to avoid abandonware?
I am not sure I have the answer. I suppose careful thinking about the backup strategy is needed. The format we choose to write our data in should probably be textual wherever possible. Database tables should be exported to CSV or a similar format that any future database engine should be able to read and import. Source code is textual anyway. I wouldn’t compress text files (zip/tar) just to save space, as zip/tar decompression tools might not be available or compatible in the future. Images and videos are in danger too, but probably less than other formats. As for virtual machines, the best option is not to back them up at all.
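To make the CSV idea concrete, here is a minimal sketch of dumping every table of a database into its own plain-text file. It is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for whatever engine you actually run (my case was MySQL), and the function name export_tables_to_csv is made up for this example.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(conn: sqlite3.Connection, out_dir: Path) -> list[Path]:
    """Write every user table to its own CSV file (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: SQLite stand-in. Other engines list their tables differently.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    written = []
    for table in tables:
        # Table names come from the catalog; quote them as SQL identifiers.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        header = [column[0] for column in cursor.description]
        path = out_dir / f"{table}.csv"
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)   # column names first, so the file explains itself
            writer.writerows(cursor)  # NULL values end up as empty fields
        written.append(path)
    return written
```

Note that CSV does not keep column types, indexes or constraints, so it is probably worth storing the schema (the CREATE TABLE statements) as a text file next to the data.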
In any case, a backup strategy should include “restore and refresh” cycles every year or every second year. In each cycle, selected backups should be taken and restored. The data should be verified and, if needed, migrated to a newer platform before it is put back into backup storage. It’s manual and extensive work, but probably needed if you want to make any sense of your historical data at some point in time.
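The verification part of such a cycle could look roughly like the sketch below. It assumes the backup is a plain directory of files and that a checksum manifest is written at backup time and kept outside that directory. The names write_manifest and verify_restore are invented for this example and not part of any backup tool.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(backup_dir: Path, manifest_path: Path) -> dict[str, str]:
    """At backup time: record a checksum for every file (hypothetical helper)."""
    manifest = {
        p.relative_to(backup_dir).as_posix(): file_sha256(p)
        for p in sorted(backup_dir.rglob("*"))
        if p.is_file()
    }
    # The manifest itself is plain JSON text, readable without special tools.
    manifest_path.write_text(
        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
    )
    return manifest


def verify_restore(restored_dir: Path, manifest_path: Path) -> list[str]:
    """At restore time: list every file that is missing or has changed."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems = []
    for relative_name, expected in manifest.items():
        candidate = restored_dir / relative_name
        if not candidate.is_file():
            problems.append(f"missing: {relative_name}")
        elif file_sha256(candidate) != expected:
            problems.append(f"checksum mismatch: {relative_name}")
    return problems
```

An empty list from verify_restore only tells you the bytes survived. Whether anything can still open and interpret those bytes is the harder question, and that part remains manual.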
As for my backups, I will probably trash everything that is older than 3 years (except pictures/videos). I am under no regulatory obligations to keep them. But the same probably cannot be said for your company…