Automating Network Backup & Restore: Performance Considerations for DLT-based Backup Solutions
With storage requirements rising at an incredible rate of 50 percent per year and employees logging onto the network at all hours and from all places, when is a good time to back up your network? This article looks at common network performance issues, including bottlenecks and limitations, and provides a recommended configuration for network backup.
Storage requirements are skyrocketing at a rate of 50 percent per year according to some estimates. At the same time, data users are coming to work earlier and staying later, and logging on to the network at all hours – from home, in the office or on the road. When are you supposed to back up your network? How can you protect your company’s mission-critical data when backup windows get tighter each year?
This paper discusses common network performance issues, including bottlenecks and limitations, and provides a recommended configuration for network backup. It shows how automated backup solutions can bring tangible results to your organization, including more reliable backup, easier data management, and higher productivity for IT staff. Perhaps most importantly, it shows how automating your backup procedures helps protect your company’s most valuable asset, its data, and allows you to perform backup and restore operations with fewer headaches and fewer surprises.
Backup Performance: Options and Obstacles
There are two primary backup models for most corporate data: local and remote. Local backup performance is affected only by the components within a single computer. Remote backup must take into account not only the local components, but also the remote devices that comprise the network environment. In all cases, backup performance will be dictated by the slowest component in this chain.
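The "slowest component" rule can be sketched in a few lines of Python. The stage names and throughput figures below are hypothetical placeholders, not measurements from this paper; the point is only that the sustained rate of the whole chain is the minimum of its stages.

```python
def find_bottleneck(stage_rates_mb_s):
    """Return (stage_name, rate) for the slowest stage in a backup chain.

    stage_rates_mb_s maps a stage name to its sustained throughput in MB/s.
    """
    name = min(stage_rates_mb_s, key=stage_rates_mb_s.get)
    return name, stage_rates_mb_s[name]


# Hypothetical figures for a remote backup path (assumptions for illustration).
chain = {
    "client disk": 6.0,
    "network": 1.0,
    "server bus": 30.0,
    "tape drive": 5.0,
}

stage, rate = find_bottleneck(chain)
print(f"Bottleneck: {stage} at {rate} MB/s")
```

With these made-up numbers, upgrading the tape drive or the server would not help; only a faster network would raise the end-to-end rate.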
Every data bit that moves along the network faces potential bottlenecks. The file system, amount of server memory, RAID configuration, tape technology and client application can affect how data moves between the network and storage environments. File size and file compression utilities also affect data rates. But the biggest factor affecting the performance of network-based backup solutions is network infrastructure.
Network Infrastructure. Network infrastructure refers to the type of networks used, and how they are connected. Two of the most common types of networks in the industry today are 10baseT and 100baseT. Transferring data across a 10baseT network can be excruciatingly slow. Network transfers from Microsoft Windows NT and NetWare clients can manage only 40 MB/min at best, while transfers from Windows 3.1 clients can be as slow as 8 MB/min.
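To see what these rates mean for a backup window, the arithmetic can be sketched as follows. The two transfer rates are the 10baseT figures quoted above, read as megabytes per minute; the 2,400 MB data volume is an assumed example, not a figure from this paper.

```python
def backup_hours(data_mb, rate_mb_per_min):
    """Hours needed to move data_mb megabytes at a sustained rate in MB/min."""
    if rate_mb_per_min <= 0:
        raise ValueError("rate must be positive")
    return data_mb / rate_mb_per_min / 60


DATA_MB = 2400  # assumed client data volume, for illustration only

for label, rate in [("Windows NT / NetWare client", 40), ("Windows 3.1 client", 8)]:
    print(f"{label}: {backup_hours(DATA_MB, rate):.1f} hours")
```

Under these assumptions a single client takes between one and five hours, which shows how quickly a 10baseT network can consume an overnight backup window.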
Even in server-to-server network backup, 10baseT network bandwidth between servers can create a performance bottleneck. Faster networks can solve the problem: 100baseVG, 100baseT, FDDI or ATM network hardware delivers order of magnitude improvements in available network bandwidth. Upgrading the network to these faster technologies mitigates potential bottlenecks, but it’s still important to match the technology to your network.
For example, 100baseT and FDDI networks have the same performance specification of about 12 MB/s. But FDDI uses a slot-based protocol to guarantee bandwidth. This minimizes network collisions when multiple clients are trying to use the network at the same time. Performance in a 100baseT network drops off significantly when three or more clients are actively using the network. This can be resolved to some degree by using network switches instead of hubs. Another consideration is that FDDI is expensive, and the performance gains over 100baseT are not always significant.
Multiple Network Connections. In an environment with multiple network connections, network interface cards (NICs) are used to control the flow of network traffic to and from the backup server. A switch or hub connects these NICs to the network. A network composed exclusively of 100baseT NICs will be inherently fast, with transfer rates of about 8 MB/s. But if the network is mixed, transfer rates will be limited to the lowest common performer. If a 10baseT component is present, the maximum transfer rate will be about 800 KB/s. That rate can be improved somewhat by packet optimization software or by adding NICs to the backup server, which increases bandwidth when backing up multiple network clients.
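A rough model of the effect of adding NICs is sketched below. It uses the approximate rates quoted above (8 MB/s for 100baseT, 800 KB/s for 10baseT) and makes a simplifying assumption that is not from this paper: client traffic spreads evenly across the server's NICs, and protocol overhead and collisions are ignored.

```python
# Approximate sustained rates in MB/s, taken from the figures in the text.
RATE_MB_S = {"100baseT": 8.0, "10baseT": 0.8}


def path_rate(components):
    """A path is limited to its slowest component (the lowest common performer)."""
    return min(RATE_MB_S[c] for c in components)


def server_ingest_rate(client_paths, nic_count, nic_type="100baseT"):
    """Aggregate rate at the backup server.

    Simplified model: the sum of what the clients can offer, capped by the
    combined capacity of the server's NICs.
    """
    offered = sum(path_rate(p) for p in client_paths)
    capacity = nic_count * RATE_MB_S[nic_type]
    return min(offered, capacity)


fast_clients = [["100baseT", "100baseT"]] * 3  # three all-100baseT paths

print(path_rate(["100baseT", "10baseT"]))             # mixed path
print(server_ingest_rate(fast_clients, nic_count=1))  # one server NIC
print(server_ingest_rate(fast_clients, nic_count=2))  # two server NICs
```

In this model a second NIC doubles the server's ingest rate when several fast clients are backed up at once, but it does nothing for a client sitting behind a 10baseT component.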
Software. Push agents, interleaving and parallel streaming are techniques used by commercial applications to alleviate network bottlenecks.
• Push agents package data at the client machine or server, assembling it into packets that can be transmitted more efficiently using available network bandwidth.
• Interleaving allows data to be gathered and sent to tape from multiple clients, simultaneously, rather than backing up each client sequentially, so the data transfer rates maximize total network bandwidth.
• Parallel streaming allows multiple data backup streams to go to multiple backup drives automatically, potentially increasing performance.
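As an illustration of interleaving, the sketch below merges blocks from several clients into one tape-bound stream in round-robin order, tagging each block with its client so it can be separated again at restore time. This is a minimal model under assumed names (`interleave`, `restore`), not the on-tape format of any particular backup product.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel for clients that have run out of blocks


def interleave(streams):
    """Yield (client, block) pairs, taking one block per client per round.

    streams maps a client name to the list of data blocks it wants to send.
    """
    names = list(streams)
    for row in zip_longest(*(streams[n] for n in names), fillvalue=_MISSING):
        for name, block in zip(names, row):
            if block is not _MISSING:
                yield name, block


def restore(tape, client):
    """Recover one client's blocks, in order, from the interleaved stream."""
    return [block for name, block in tape if name == client]


streams = {"A": ["a1", "a2", "a3"], "B": ["b1"], "C": ["c1", "c2"]}
tape = list(interleave(streams))
print(tape)
print(restore(tape, "A"))
```

Because the tape never waits on a single slow client, the drive is more likely to be kept busy. The trade-off is that restoring one client means reading past the other clients' blocks.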
File Size and Compression. Other significant factors affecting performance are average file size and file compression. Large files reduce overhead both on the backup client, where file system look-ups and disk accesses are minimized, and on the backup server, where file information is recorded to a database for file-level restore.
File compression can increase backup performance by increasing the effective write speed of the tape. Data is typically compressed by hardware that resides in the DLT drive. When data is highly compressed, the drive can accept more data relative to the amount of data being written to tape. Data files that are very sparse or highly repetitive, such as database or text files, can be greatly compressed.
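The relationship between compression and effective write speed is simple multiplication, sketched below. The 5 MB/s native rate and the compression ratios are assumed values for illustration; actual ratios depend entirely on the data.

```python
def effective_write_rate(native_mb_s, compression_ratio):
    """Rate at which the drive accepts uncompressed data from the host.

    native_mb_s is the rate at which compressed data reaches the tape;
    compression_ratio is uncompressed size divided by compressed size.
    """
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return native_mb_s * compression_ratio


NATIVE_MB_S = 5.0  # assumed native drive rate, for illustration only

for ratio in (1.0, 2.0, 3.0):
    rate = effective_write_rate(NATIVE_MB_S, ratio)
    print(f"{ratio}:1 compression -> {rate} MB/s accepted from host")
```

The drive reaches the higher rate only if the rest of the chain can supply data that fast; otherwise the network or client remains the bottleneck.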
Backup Client Performance. The backup client is responsible for gathering the data through a series of file system operations during network backup, sending its shipment to the backup server, and communicating with the backup server to confirm proper delivery.
Overall backup performance is significantly affected by the computational and I/O capabilities of the client. Specific examples affecting client performance include amount of memory, processor type and clock speed, disk seek time and data rate, RAID configuration, and PCI bus architecture.
Dedicated Network Backup. Several elements should be considered when identifying network performance issues. Network backup increases traffic volume and can therefore reduce performance levels for clients on the network. And if the network infrastructure is inherently slow – 10baseT, for example – or includes gateways or routers between the backup server and clients, network clients will see significant performance drops during network backup. In many cases, deploying a separate dedicated backup network may make sense. A dedicated backup network uses a separate, private network for performing backup. Clients typically connect to it via a separate network card in the client system that talks only to the private backup network.
The preferred configuration for a dedicated backup network adds high speed (100baseT) NICs to each of the servers to be backed up, and connects the servers via a switch to the backup server. This provides a direct path for backup, and removes backup traffic from the client network, thus maintaining performance. The most significant advantage, however, is increased backup performance since a dedicated network provides the full bandwidth of the network for backup operations.
Automating the Backup Process
There are three common configurations in network backup: server-client, server-server and direct-server. Each of these configurations requires a separate tape drive attached to a PC or workstation, or multiple servers. While some of these configurations can be extremely beneficial for smaller locations, mid-size to enterprise-level businesses may find manual tape backup inefficient.
With an automated backup solution, servers can be backed up with less operator intervention and with fewer errors and mishaps. As a result, the overall cost of backup is lower, since fewer tape drives and fewer person-hours are required. At the same time, the ability to recover from a data disaster is improved. With automation, backup happens in a controllable, predictable fashion, which means your data can be restored in the same way: predictably, reliably and quickly.
Getting the backup performance you need takes a combination of skill and technology. The right approach for your network depends on many factors: performance requirements, data volumes, staff resources, ease-of-use requirements, data sensitivity, end-user expectations, system scalability and cost all come into play.
For many mid-size to enterprise-level businesses, automated backup solutions are a popular alternative to manual backup. Automation can meet today's extremely tight backup windows, and the technology continues to evolve to meet the even tighter windows expected in the near future. Just as importantly, performance meets the needs of most networks – a very real concern for IT managers with growing data volumes and shrinking backup windows.
The key to implementing a successful automated backup strategy – one that meets performance expectations – often rests with potential bottlenecks and network infrastructure. Address bottlenecks and infrastructure first, and an automated solution will meet your needs for increased reliability, manageability and productivity.
About the Authors: Jeff Dicorpo is the DLT Library Storage Solutions Project Manager for the Storage Systems Division at Hewlett Packard (Greeley, Colo.).
Scott Paul is a Future Product Marketing Manager for DLT Libraries in the Storage Systems Division at Hewlett Packard (Greeley, Colo.).