In-Depth
Hard-drive Shortage: Lessons Learned
The recent hard-drive shortage has shown us that there are many ways to manage data.
By Rick Vanover, Product Strategy Specialist, Veeam Software
Any infrastructure administrator takes pride in introducing as many separation layers and failure domains as possible into a design. As we have learned from the recent hard-drive shortage due to the flooding in Thailand, we cannot design or plan for a redundant supply chain.
It is difficult to buy any storage at all right now, and even if one is lucky enough to find some hardware for sale, it will be very expensive. Recent reports from Western Digital, for example, state that hard-drive unit sales are down by 51 percent and that selling prices for the drives that have shipped are up 50 percent. As IT professionals, we need to be ready with new solutions to tackle this challenge.
Because no single solution addresses the hard-drive shortage, I’ve collected a number of strategies I and other IT pros have used. We have different strategies because we also have different data profiles: backup data, primary storage for virtual machines, large expanses of unstructured data, or areas that we really can’t categorize. Storage requirements are not all created equal.
As our data profiles differ, how we address the shortage of hard drives also differs. This realization is critical.
Refined Storage Practices
The first way to address the shortage is operational. Simply put, does your organization have enough governance and policy for the data it has? This could take the form of a formal data retention policy or be as simple as checking with stakeholders to ask whether specific data is truly necessary to store long-term. I’ve no doubt that every data profile has data that really isn’t needed anymore, so re-assessing what should be kept and for how long can serve as an effective measure to reduce the amount of storage your organization needs.
It may not feel right, but you’re probably storing data that can be deleted. Whether it is simply a housekeeping endeavor or a full-blown data retention policy, chances are that there are substantial opportunities to delete data and reduce the storage footprint across all storage resources in your organization.
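As a starting point for the housekeeping conversation, a short script can show how much space sits in files nobody has touched for a long time. The following is a minimal sketch, not a retention tool: the function names, the use of last-modified time as the staleness test, and the age threshold are all assumptions made for illustration. It only reports candidates and never deletes anything.

```python
import os
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return (path, size_bytes) for files under root not modified in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                stale.append((path, info.st_size))
    return stale


def summarize(stale):
    """Total the bytes that removing the listed files would reclaim."""
    return sum(size for _path, size in stale)
```

Calling `find_stale_files` on a share (for example a hypothetical `/srv/share` with a three-year threshold) and passing the result to `summarize` gives a figure to bring to stakeholders before anything is removed.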
Can Deduplication be Part of the Solution?
Another way to reduce the amount of storage retained is deduplication. Deduplication simply replaces like patterns on disk with pointers to the original pattern. In this way, multiple occurrences of the same string of data are not repeatedly kept on disk. Typically, deduplication is used to reduce the footprint of backup storage data, either through the use of a hardware appliance or a software-based implementation, but deduplication can also be applied to primary storage resources. Looking ahead, Windows Server 8 will provide deduplication on volumes that will be optimized for file content. However, structured data types (such as databases and e-mail systems) are not good candidates for the Windows Server deduplication implementation.
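The idea of replacing repeated patterns with pointers can be illustrated in a few lines. This is a simplified sketch under stated assumptions: it uses fixed-size chunks and SHA-256 digests as the "pointers," and the function names are invented for the example. Commercial products typically use more sophisticated chunking and indexing, and the sketch ignores the space the pointers themselves consume.

```python
import hashlib


def deduplicate(data, chunk_size=4096):
    """Split data into fixed-size chunks and keep each unique chunk once."""
    store = {}   # digest -> chunk bytes (the single stored copy)
    recipe = []  # ordered digests: the pointers needed to rebuild the data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store only the first occurrence
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe):
    """Follow the pointers to reassemble the original data."""
    return b"".join(store[digest] for digest in recipe)


def stored_bytes(store):
    """Bytes of chunk data actually kept after deduplication."""
    return sum(len(chunk) for chunk in store.values())
```

Comparing `stored_bytes(store)` with `len(data)` shows the reduction for a given data set. Highly repetitive data such as successive backups shrinks substantially, while data with few repeated chunks shrinks very little.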
Deduplication is a critical part of a data management strategy, but alone it is not enough to correctly manage all data or get IT through this immediate crisis.
Are We Getting Enough from Existing RAID Technologies?
Refined storage practices and deduplication can help reduce the storage footprint, but these measures will most likely only slow the rate of storage growth. Organizations will still face a growing storage footprint. Next-generation RAID technologies may be able to help.
Traditional RAID technologies are good, but not very flexible, because they require same-size drives to fully utilize all available space. Some of the new RAID technologies are breaking this mold, however. Two technologies in particular, Drobo’s BeyondRAID and NETGEAR’s X-RAID2, are RAID algorithms that permit the use of dissimilar-drive geometries, which gives IT incredible flexibility given the shortage of certain drive types. With these products, an array can be built with drives of different sizes that use the same interface. Free space is aggregated across the drives and protected to allow one or more drive failures. This is clearly a flexible option but not necessarily a replacement for all storage resources given the current situation of hard-drive availability. For certain storage tiers, these products may fit the bill.
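To see why mixed-size arrays matter during a shortage, it helps to compare usable capacity. The sketch below is a simplified model, not Drobo's or NETGEAR's actual algorithm, which the vendors do not fully publish. It assumes single-drive redundancy and slices the drives into layers at each distinct size, protecting each layer that spans at least two drives. The function names are invented for the example.

```python
def traditional_raid5_usable(drives_tb):
    """Single-parity RAID treats every drive as if it were the smallest one."""
    if len(drives_tb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (len(drives_tb) - 1) * min(drives_tb)


def layered_usable(drives_tb):
    """Simplified mixed-size model with single-drive redundancy.

    Drives are sliced into horizontal layers at each distinct size. A layer
    spanning k >= 2 drives gives up one drive's worth of space to redundancy
    (a mirror when k == 2, single parity when k > 2). Space that exists on
    only one drive cannot be protected and is not counted.
    """
    if len(drives_tb) < 2:
        raise ValueError("need at least two drives for redundancy")
    sizes = sorted(drives_tb)
    usable = 0
    previous = 0
    for index, size in enumerate(sizes):
        drives_in_layer = len(sizes) - index  # drives at least this large
        height = size - previous              # capacity added by this layer
        if drives_in_layer >= 2:
            usable += (drives_in_layer - 1) * height
        previous = size
    return usable
```

Under this model, four drives of 1, 2, 2 and 3 TB yield 3 TB usable in a traditional single-parity array but 5 TB with layering, which works out to the total capacity minus the largest drive.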
The Time May be Ripe for the Cloud
Finally, when it comes to addressing enterprise storage, should we seriously consider leveraging the cloud? This may be the perfect end-state destination for certain data collections that we really just don’t know what to do with. There will surely be collections of data that we don’t want to delete completely but still want to keep easily accessible. Another eligible data profile could be certain archive data, which can easily be moved to the cloud with the right bandwidth.
Moving data to the cloud isn’t without considerations of its own, however. Among storage clouds, Amazon’s offerings are a good choice, even though select services in specific regions recently suffered a widespread outage. Designing for cloud storage is no different from designing internal infrastructure; both involve planning for domains of failure.
The S3 cloud now has seven regions around the globe. This is a key factor in ensuring that data moved to a public cloud is properly distributed (across two regions, for example) and easily retrievable when needed. One product that can move data to the cloud very economically is CloudBerry Server. With built-in encryption, datasets can be moved to multiple S3 regions or to different storage clouds such as Microsoft Azure blob storage and others.
Before considering cloud storage, however, the most important planning point is to ensure that the data can be removed or checked out of the cloud. Ease of use, available bandwidth and storage on which to land the data all determine whether the data is usable from the cloud when needed.
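Bandwidth is the easiest of these factors to quantify ahead of time. The following back-of-the-envelope sketch estimates how long a retrieval would take. The function name and the default 80 percent link efficiency are assumptions for illustration; real throughput depends on the provider, the network path and protocol overhead.

```python
def transfer_hours(data_gb, link_mbps, efficiency=0.8):
    """Estimate hours to move data_gb gigabytes over a link_mbps link."""
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    bits = data_gb * 1e9 * 8                      # decimal gigabytes to bits
    effective_bps = link_mbps * 1e6 * efficiency  # usable bits per second
    return bits / effective_bps / 3600            # seconds to hours
```

For example, pulling 450 GB back over a 100 Mbps link at the assumed 80 percent efficiency works out to about 12.5 hours, which is worth knowing before an archive is needed in a hurry.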
No One-size-fits-all Solution
The hard-drive shortage is not an impossible problem to solve when we apply the right techniques for each particular data profile. Whether the solution includes one or more of the tips outlined above -- simple housekeeping, a formal data-retention policy, deduplication, next-generation RAID technologies, or leveraging cloud storage -- there are options for just about every organization.
Rick Vanover (vExpert, MCITP, VCP) is a product strategy specialist for Veeam Software based in Columbus, Ohio. Rick is a popular blogger, podcaster, and active member of the virtualization community. Rick’s IT experience includes system administration and IT management, with virtualization as the central theme of his recent career. Follow Rick on Twitter @RickVanover or e-mail him at [email protected].