Changing How We Physically Back Up Data

Backing up data should extend further than just taking a "screenshot" of a drive.

Is the current way you're doing backup really the way it should be done? Is it time that we change the script on physical backups?

In a book I wrote a few years back, "Definitive Guide to Windows Application and Server Backup 2.0" (Realtime Publishers, 2010), I postulated a "mission statement" for backup and recovery: Backups should prevent us from losing any data or losing any work, and ensure that we always have access to our data with as little downtime as possible.

But here's the truth: Traditional backup and recovery products don't typically do a very good job of meeting this simple statement.

Traditional backup and recovery has essentially relied on snapshots: Grabbing the data at a certain point in time and dumping it to tape as fast as possible, so that we can grab as much data as possible in as short an amount of time as possible. Sometimes, our backup windows are so small and the data so large that we have to rely on differential and incremental backups, which grab the data faster but require even longer to perform a recovery. In the book, I coined the term "Backup 1.0" for this old-school style of backup, which has been basically unchanged since the 1960s.

We Can Do Better
I began using the term "Backup 2.0" to refer to a new way of thinking about backups. Backup 2.0 is fundamentally the concept of continuous data protection, where our servers and applications are backed up in real time or near-real time, so we never really have any at-risk data. A Backup 2.0 solution provides a way to reconstruct anything up to and including an entire disk volume to a very specific point in time, so that we can "roll back" a server to that point in time, or just access particular files or objects from that point in time without actually restoring the data anywhere.

Technically, this typically works through a file system "shim," the same kind of technology used to implement third-party disk quota systems. The shim is just a sort of file system driver that gets notified of every disk change at the block level. The shim can grab each disk block as it changes, and transmit that information -- along with a timestamp -- to a central backup server. The backup server can do fancy stuff like de-duplication and compression, if necessary, so that the backups are smaller (potentially much smaller) than the source data.
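
To make the server side of that idea concrete, here is a minimal Python sketch of a store that accepts timestamped block changes, de-duplicates them by content hash, and compresses what it keeps. It is an illustration under stated assumptions, not how any particular product works: the names BlockStore, record_change and stored_bytes are hypothetical, the 4 KB block size is arbitrary, and the shim itself (a kernel-level driver) is not modeled at all.

```python
import hashlib
import zlib
from dataclasses import dataclass, field


@dataclass
class BlockStore:
    """Hypothetical backup-server store: a de-duplicated, compressed
    pool of block contents plus a journal of every change received."""
    blobs: dict = field(default_factory=dict)    # sha256 hex digest -> compressed bytes
    journal: list = field(default_factory=list)  # (timestamp, block_number, digest)

    def record_change(self, timestamp: float, block_number: int, data: bytes) -> str:
        """Store one changed block as reported by the (assumed) shim."""
        digest = hashlib.sha256(data).hexdigest()
        # De-duplication: identical content is stored only once.
        if digest not in self.blobs:
            self.blobs[digest] = zlib.compress(data)
        # The journal still records every change, so history is complete.
        self.journal.append((timestamp, block_number, digest))
        return digest

    def stored_bytes(self) -> int:
        """Total size of the compressed, de-duplicated block pool."""
        return sum(len(blob) for blob in self.blobs.values())


store = BlockStore()
store.record_change(1.0, 0, b"A" * 4096)
store.record_change(2.0, 1, b"A" * 4096)  # same content written to another block
store.record_change(3.0, 0, b"B" * 4096)  # block 0 overwritten later
# The journal now has 3 entries, but only 2 unique blobs are stored.
```

Keeping the journal separate from the block pool is the design choice that matters here: the pool can shrink through de-duplication and compression without losing the record of what changed and when.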

Most importantly, the backup server can reconstruct disk volumes to a specific point in time by simply assembling the disk blocks leading up to that point in time. With the right tools, you could mount a backup image and browse it through the OS. If the solution had the right knowledge of database structures for popular products, you could restore anything from an individual message or document up to an entire data store, all to a specific point in time -- and all much more rapidly than streaming that same information from tape (although you'd likely still make copies of the backup data to tape for off-site storage, they wouldn't be your first line of defense).
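
The "assemble the blocks leading up to that point" step can be sketched in a few lines of Python. This is a simplified illustration, not a real product's restore path: it assumes the backup server holds a journal of (timestamp, block number, block data) tuples, and the name reconstruct_volume is hypothetical. A real solution would also need a base image and awareness of application and database structures, which this sketch leaves out.

```python
from typing import Dict, List, Tuple

# Assumed journal entry shape: (timestamp, block_number, block_data)
Change = Tuple[float, int, bytes]


def reconstruct_volume(journal: List[Change], point_in_time: float) -> Dict[int, bytes]:
    """Replay block changes in time order, stopping at point_in_time.

    Returns a mapping of block number to the block's contents as they
    stood at that moment. Later changes are ignored.
    """
    volume: Dict[int, bytes] = {}
    # sorted() is stable, so changes with equal timestamps keep arrival order.
    for timestamp, block_number, data in sorted(journal, key=lambda c: c[0]):
        if timestamp > point_in_time:
            break
        volume[block_number] = data  # a newer write replaces the older one
    return volume


journal = [
    (1.0, 0, b"v1"),  # block 0 written
    (2.0, 1, b"x"),   # block 1 written
    (3.0, 0, b"v2"),  # block 0 overwritten
]
before_overwrite = reconstruct_volume(journal, 2.5)  # {0: b"v1", 1: b"x"}
after_overwrite = reconstruct_volume(journal, 3.0)   # {0: b"v2", 1: b"x"}
```

Because nothing is copied back to the original server, the same replay could in principle feed a mounted, read-only view of the volume as easily as a full restore.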

Habits Are Horrible
I guess the real lesson here is that old habits -- like the backup techniques we've relied on for more than 40 years -- can die hard. But can you honestly say that you're satisfied with your old-school backup techniques? That you yearn to dig through tape indexes and wait for data to stream off tape? That you've never been let down by a corrupted tape, or a missing tape, or data that was lost in between backups? We should be constantly questioning the shortcomings of our technologies and processes, constantly defining our "pie in the sky" wishes for how they should work, and constantly pressuring vendors to deliver newer and better techniques and technologies.

About the Author

With more than fifteen years of IT experience, Don Jones is one of the world’s leading experts on the Microsoft business technology platform. He’s the author of more than 35 books, including Windows PowerShell: TFM, Windows Administrator’s Scripting Toolkit, VBScript WMI and ADSI Unleashed, PHP-Nuke Garage, Special Edition Using Commerce Server 2002, Definitive Guide to SQL Server Performance Optimization, and many more. Don is a top-rated and in-demand speaker and serves on the advisory board for TechMentor. He is an accomplished IT journalist with features and monthly columns in Microsoft TechNet Magazine, Redmond Magazine, and on Web sites such as TechTarget and Don is also a multiple-year recipient of Microsoft’s prestigious Most Valuable Professional (MVP) Award, and is the Editor-in-Chief for Realtime Publishers.
