Data Protection 101: Backup and Restore Methods
Here's a look at how full backups, incremental backups and differential backups differ when backing up enterprise data.
- By Scott D. Lowe
Do you back up your data? While it's one of the least appealing parts of the IT function, it's really one of the most important. In fact, I once worked for an organization that had, in its past, fired the entire IT department when it was discovered that the group had failed to back up the primary financial system for more than six months. This was discovered when the system failed and recovery was impossible.
For those who are new to backup, though, there are a number of different backup options to consider, and each affects how the really important part of data protection -- recovery -- is carried out. Below, you will learn the differences between a full backup, an incremental backup and a differential backup.
Full Backup
A full backup is exactly what it sounds like. Each time a full backup process runs, all of the data included in the backup job is copied from production data sources to a backup target. This backup target is generally either a disk or a tape. Backing up to disk is much faster than backing up to tape, but tape has the advantage of being more portable and is often used for archival purposes. As such, many companies have moved away from backing up directly to tape. Instead, they back up to disk first and then offload that data to tape during production hours. In addition, the cloud is an increasingly common backup target. Rather than backing up to an on-site repository, companies back up to a cloud storage service such as Amazon S3.
Full backups have some distinct advantages and disadvantages. On the plus side, all of the data is protected during each backup window. However, a full backup has a number of downsides, too:
- Long backup windows. Every byte of data has to be copied each and every night. This results in a backup window that may become insufficient to handle all of the company data. This also means that a full backup can't be run multiple times throughout the work day.
- Retention becomes a challenge. If you have 10 TB of data to back up every night and you want to keep backups for 6 months, you need a ton of capacity -- whether that's disk or tape -- to hold every one of those backups for the full retention period.
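To put a rough number on the retention problem, here is a minimal sketch of the arithmetic. It assumes 30-day months and no compression or deduplication, neither of which comes from the example above, so treat the result as an upper-bound estimate rather than a sizing recommendation.

```python
# Rough capacity estimate for retaining one full backup per night.
# Assumptions (illustrative only): 30-day months, no compression or dedup.

def full_backup_capacity_tb(data_tb: float, retention_days: int) -> float:
    """Capacity needed to keep every nightly full backup for the retention period."""
    return data_tb * retention_days

nightly_tb = 10            # size of each full backup, from the example above
retention_days = 6 * 30    # six months, assuming 30-day months
total_tb = full_backup_capacity_tb(nightly_tb, retention_days)

print(f"{total_tb} TB (~{total_tb / 1000:.1f} PB)")
```

Under those assumptions, six months of nightly 10 TB full backups works out to roughly 1,800 TB of backup storage.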
When it comes to simplicity, though, full backups are the gold standard. After all, the data is always there in each backup job. For recovery purposes, an administrator just needs the most recent backup set.
However, because of the aforementioned challenges, it's not uncommon to do a full backup just once per week and then use a different backup method during the work week.
Incremental Backup (Cumulative Incremental)
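An incremental backup solves some of the challenges associated with full backups, but it does add a little bit of complexity to restore operations. Be aware that terminology varies: many backup products call the cumulative scheme described here a differential backup, so check how your software defines the terms. Here's the way that an incremental backup works: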
- An incremental backup is always coupled with a full backup of some kind. For this scenario, let's say that our example company does full backups on Saturdays only.
- From there, throughout the week, the backup software is configured to perform an incremental backup each day. In this scenario, each night's backup copies all data that has changed since Saturday's full backup. So, during Monday's backup, any data changed since Saturday will be backed up, and the same holds true on Friday. This means that if data is changed on Sunday, it will appear in every incremental backup taken for the rest of the week.
- Once the next full backup takes place, the incremental process starts over, working with that full backup as its baseline.
An incremental backup can help to solve the challenges associated with full backups. Because only changed data is copied, backup windows are much shorter and less space is needed to protect data. During recovery, though, an administrator needs to have available both the most recent full backup and the most recent incremental backup. The recovery process will combine the two backup sets into a single restore.
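The selection and recovery rules described above can be sketched in a few lines of code. This is only an illustration: the file names, the change log and the backup set labels are all made up, and it assumes each nightly job runs at the end of the day.

```python
# Sketch of the cumulative scheme: each nightly job copies everything
# changed since the last full backup. All data below is hypothetical.

DAYS = ["Sat", "Sun", "Mon", "Tue", "Wed", "Thu", "Fri"]

# Day on which each file was last modified (made-up example data).
changed_on = {
    "ledger.db": "Sun",
    "report.doc": "Tue",
    "notes.txt": "Thu",
}

def cumulative_backup(day: str, full_day: str = "Sat") -> set[str]:
    """Files copied by the nightly job on `day`: all changes since the full backup."""
    start, end = DAYS.index(full_day), DAYS.index(day)
    return {f for f, d in changed_on.items() if start < DAYS.index(d) <= end}

def restore_sets(day: str, full_day: str = "Sat") -> list[str]:
    """Backup sets needed to restore to `day`: the full plus the latest nightly job."""
    if day == full_day:
        return [f"full-{full_day}"]
    return [f"full-{full_day}", f"nightly-{day}"]

print(cumulative_backup("Mon"))  # only the file changed on Sunday
print(cumulative_backup("Thu"))  # every file changed since Saturday
print(restore_sets("Thu"))       # two sets, no matter how late in the week
```

Note that the file changed on Sunday is copied again every night, which is why each job gets larger as the week goes on, while the restore never needs more than two backup sets.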
One challenge remains, though: If there is a high rate of data change, the incremental backups will increase in size throughout the week at a rate that might not be sustainable.
Differential Backup (Differential Incremental)
And that's where a differential backup comes into play. Like an incremental backup, a differential backup starts with a full backup -- again, often on a weekend. From there, the administrator would schedule daily differential backups. A differential backup, in the sense used here (many backup products call this scheme an incremental backup), backs up all data that has changed since the most recent full or differential backup. As such, on Sunday, the differential backup will back up data that was changed since Saturday's full backup. On Thursday, the differential backup will back up any data that was changed since Wednesday's differential backup. This approach protects just the data that changed that day, which minimizes the amount of space needed to store the backups.
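To see how the space needed differs between copying changes since the last backup of any kind and copying changes since the last full backup, here is a small sketch. The 200 GB daily change figure is purely an assumption for illustration, and it further assumes that each day's changes touch different data.

```python
from itertools import accumulate

# Nightly backup sizes for the six days after a Saturday full backup.
# Assumption: a hypothetical 200 GB of distinct data changes each day.
daily_change_gb = [200, 200, 200, 200, 200, 200]  # Sun through Fri

# Changes since the last backup of any kind: just that day's data.
since_last_backup = list(daily_change_gb)

# Changes since the last full backup: a running total of the week's changes.
since_last_full = list(accumulate(daily_change_gb))

print(since_last_backup, sum(since_last_backup))
print(since_last_full, sum(since_last_full))
```

With these made-up numbers, the nightly jobs that copy only that day's changes total 1,200 GB for the week, while the jobs that copy everything since the full backup total 4,200 GB.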
However, on the recovery side of the equation, restoring from differential backups can be difficult. You need to have the most recent full backup plus all of the differential backups that have taken place since that full backup in order to recover the most recently protected data. Imagine this scenario: it's Friday and you need to recover a database. You will need last Saturday's full backup as well as the differential backups for Sunday, Monday, Tuesday, Wednesday and Thursday in order to achieve a full recovery. If any of those backup sets is unavailable or is no good, the recovery process won't work.
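The dependency on an unbroken chain of backup sets can be sketched as a simple check. The day labels stand in for backup sets and the function names are hypothetical; real backup software tracks this in its own catalog.

```python
# Sketch of the recovery chain: Saturday's full backup plus every nightly
# set taken since then. Day labels stand in for backup sets (illustrative).

DAYS = ["Sat", "Sun", "Mon", "Tue", "Wed", "Thu", "Fri"]

def required_chain(last_backup_day: str) -> list[str]:
    """Saturday's full backup plus every nightly set through `last_backup_day`."""
    return DAYS[: DAYS.index(last_backup_day) + 1]

def missing_sets(available: set[str], last_backup_day: str) -> list[str]:
    """Backup sets that are required for the restore but not available."""
    return [d for d in required_chain(last_backup_day) if d not in available]

# It's Friday, so the most recent backup is Thursday night's.
print(required_chain("Thu"))

# Suppose Tuesday's set is unreadable: the chain is broken.
on_hand = {"Sat", "Sun", "Mon", "Wed", "Thu"}
gaps = missing_sets(on_hand, "Thu")
print("Recovery possible" if not gaps else f"Cannot fully recover, missing: {gaps}")
```

In this example a single bad set from Tuesday is enough to prevent a full recovery to Thursday night, even though five of the six required sets are intact.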
Scott D. Lowe is the founder and managing consultant of The 1610 Group, a strategic and tactical IT consulting firm based in the Midwest. Scott has been in the IT field for close to 20 years and spent 10 of those years filling the CIO role for various organizations. He has also authored or co-authored four books and is the creator of 10 video training courses for TrainSignal.