Q&A: Securing Storage in the Cloud

Moving your data to the cloud is not without risk. Here's what you need to know about securing your data.

Moving your data to the cloud is an attractive proposition -- for example, it can reduce the demand on corporate IT resources. Securing that data, however, requires a different approach than does securing on-premise data. To learn about the differences, and to understand when moving data to the cloud makes operational and economic sense, we turned to Jeffrey Bell, director of corporate marketing for Zetta, Inc.; the company offers a cloud-based storage service.

Enterprise Systems: What are the top requirements for entrusting enterprise primary data to the cloud?

Jeffrey Bell: There are a number of requirements that are remarkably similar to the concerns you would have for storing and protecting data on-premise. At the top of the list would be security and privacy, followed by data integrity or data protection, and then service availability. These requirements were specifically called out, in that order, in a recent survey of more than 400 enterprise IT professionals. There are other requirements, but these are the major ones that fall into the “entrust” category.

What are the requirements for ensuring data security in the cloud?

Security comes down to making sure that only authorized people can access the data. From a technology standpoint, data encryption is key. When the data is properly encrypted, unauthorized access becomes nearly impossible. Proper encryption should include both the transport of the data (the network) and the storing of the data (at rest). Encryption key management should be robust and include separate and expiring (changing) keys for each data volume.
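As a rough illustration of the "separate and expiring keys for each data volume" idea, here is a minimal Python sketch. It is not Zetta's implementation; the `VolumeKeyStore` class, its method names, and the in-memory storage are all hypothetical, and the sketch only tracks key material and expiry. Actual encryption of data in transit and at rest would be handled by a vetted cryptographic library.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeKey:
    key: bytes         # 256-bit random key material
    expires_at: float  # Unix timestamp at which the key must be rotated


class VolumeKeyStore:
    """Hypothetical in-memory registry holding one expiring key per volume."""

    def __init__(self, lifetime_seconds: float):
        self.lifetime_seconds = lifetime_seconds
        self._keys = {}  # volume_id -> VolumeKey

    def current_key(self, volume_id: str, now: Optional[float] = None) -> bytes:
        """Return the volume's key, creating or rotating it when needed."""
        now = time.time() if now is None else now
        entry = self._keys.get(volume_id)
        if entry is None or now >= entry.expires_at:
            # Assumption: a real system would keep the old key (or re-encrypt
            # the volume) so existing data stays readable after rotation.
            entry = VolumeKey(secrets.token_bytes(32), now + self.lifetime_seconds)
            self._keys[volume_id] = entry
        return entry.key
```

Each volume gets its own key, so exposing one key does not expose other volumes, and the expiry limits how long any single key stays in use.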

Other technology capabilities would include integration with existing enterprise security infrastructures such as Active Directory and LDAP. From a people standpoint, the service provider needs to have documented and audited processes in place to ensure good hiring practices and good physical security policies.

How does this compare to data security concerns with on-premise storage?

Although not all storage traffic goes across the network in “on-premise” cases, there is still a fair amount of corporate network traffic that needs to be secured. The concept of securing networks is not particularly new to storage. When thinking about data at rest, though, most on-premise storage systems do not encrypt data. This can actually be a point of vulnerability, especially when disk drives are replaced for maintenance reasons. From a people standpoint, the concerns are the same, but you are at arm’s length from the people managing the data facility. That’s why it’s important to look at a service provider’s procedures and ensure that they have passed audits such as SAS 70.

What levels of data protection are required in a cloud environment?

The first aspect of protection is simply “If I put the data in, can I be assured that I can get it back out?” In other words, customers need to make sure that the data will not be lost.

A second aspect of data protection is a little less obvious but may be even more important: “Can I be sure that the data hasn’t been changed, unintentionally or intentionally, since it was stored?”

These are the same issues any storage admin would think about. We’ve learned a lot about storing and protecting data over the past decades, and storage service providers should be held to those best practices. This should include advanced levels of data protection such as better than double-parity RAID protection. It should also include mechanisms for detection of “silent” data corruption -- that’s data corruption that occurs on disks and is not detected and reported by the disk firmware. Simply generating multiple copies of a customer’s data is not advanced data protection.
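One common way to detect silent corruption is to record a checksum when data is written and compare it when data is read back. The Python sketch below shows the idea under simplifying assumptions: the function names and the dictionaries standing in for a disk and a checksum store are hypothetical, and this is not a description of any particular provider's mechanism.

```python
import hashlib


def store_block(storage: dict, checksums: dict, block_id: str, data: bytes) -> None:
    """Write a block and record its SHA-256 digest in a separate store."""
    storage[block_id] = data
    checksums[block_id] = hashlib.sha256(data).hexdigest()


def verify_block(storage: dict, checksums: dict, block_id: str) -> bool:
    """Return True if the stored bytes still match the recorded digest."""
    return hashlib.sha256(storage[block_id]).hexdigest() == checksums[block_id]


storage, checksums = {}, {}
store_block(storage, checksums, "blk-1", b"quarterly-report")
intact = verify_block(storage, checksums, "blk-1")

# Simulate a single flipped bit that the disk firmware never reports.
corrupted = bytearray(storage["blk-1"])
corrupted[0] ^= 0x01
storage["blk-1"] = bytes(corrupted)
detected = not verify_block(storage, checksums, "blk-1")
```

A plain hash like this catches accidental changes. Guarding against intentional changes would additionally require a keyed check (for example, Python's `hmac` module) or digests kept where an attacker cannot rewrite them.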

Are these different from what you would require in an on-premise solution?

Great question! Sometimes people forget that large scale will magnify (or, more correctly, bring out) the natural failure rates of storage system components such as disks, memories, and processors. As storage systems approach multiple hundreds of terabytes and even petabyte scale, the probability of data corruption and potential data loss increases. These large-scale systems must be designed with this in mind and include the necessary checks and redundancies to detect and correct failures. Simple RAID protection and two-way failover are not sufficiently reliable for a service provider. Three-way failover offers orders of magnitude better availability, and multi-site operations will afford protection against area-wide or facility outages.
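The arithmetic behind both claims can be sketched in a few lines of Python. The failure rates used here are illustrative assumptions, not figures from the interview, and the model assumes failures are independent, which real systems only approximate.

```python
def prob_any_failure(per_component_prob: float, components: int) -> float:
    """Probability that at least one of n independent components fails."""
    return 1.0 - (1.0 - per_component_prob) ** components


def prob_all_replicas_down(per_replica_prob: float, replicas: int) -> float:
    """Probability that all n independent replicas are down at the same time."""
    return per_replica_prob ** replicas


# Assumed 2% chance that any given disk fails during the period of interest.
small_system = prob_any_failure(0.02, 10)     # ten disks
large_system = prob_any_failure(0.02, 1000)   # a thousand disks

# Assumed 0.1% chance that any given replica is unavailable at a given moment.
two_way = prob_all_replicas_down(0.001, 2)
three_way = prob_all_replicas_down(0.001, 3)
```

With these assumed inputs, a failure somewhere in the thousand-disk system is close to certain, while the chance of losing every replica at once drops from about one in a million with two-way failover to about one in a billion with three-way failover.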

What visibility does the end user need into how the cloud operates and what’s happening to their data?

That’s an interesting question and one that we gave a lot of thought to when designing our system. One of the significant benefits of a storage-as-a-service model is that you are not required to manage the physical storage infrastructure. Still, enterprise IT professionals have a need (or maybe it’s a desire) to see what’s happening to their data. A good service model provides transparency into the underlying infrastructure and events without actually requiring the end user to make decisions and take actions. System-wide events (such as software and configuration changes) and volume events (such as data rebuilds), along with replication, snapshot, and backup status, should all be visible. Customers should be able to view where their data is located and choose among different service centers. Data capacity, usage, availability, and performance statistics should all be available.

How does this compare with operational concerns for on-premise storage?

The difference is striking. With on-premise storage, the user bears the full brunt of every aspect of managing the storage and protecting their data. Think about womb to tomb: Selecting the storage, purchasing the storage, installing the storage, configuring the storage, maintaining the storage, fixing the storage, planning and implementing the data protection (RAID) schemes, planning and implementing the replication/snapshot processes, backing up the data, reloading data, planning capacity, managing utilization rates, analyzing performance, balancing loads, upgrading the storage hardware and/or software, expanding the storage system, retiring the storage system, and migrating the data -- and repeat multiple times.

With all of this considered, when does it make economic and operational sense to consider cloud storage as an option?

It is important to look at the data profile or use case and to match the requirements of that use case with available storage service offerings. In the enterprise context, unstructured data (think file-oriented data) makes the best candidate for moving to a storage service. Look for use cases that don’t have severe latency constraints. Many unstructured data use cases fall within those boundaries. We have seen a lot of movement from old data backup models (such as tape) to an on-line active archive where files are copied into the cloud and accessed as if still stored locally. We have also seen many people moving to file sharing and collaborative workgroups. There are a number of enterprise storage features we haven’t talked about, such as native file system access versus API access and POSIX compliance ensuring read/write fidelity, that really open up the gates in terms of the available application pool that could easily move to the cloud.

What role does Zetta play in this market?

Zetta provides enterprise-class storage-as-a-service on-demand. We consider our service “on-demand NAS.” It looks to the applications and users just like an on-premise NAS system with all the enterprise features you would expect. It also provides all the business benefits of a hosted, fully managed, instantly available, instantly deployable, and instantly scalable enterprise-class storage service. Zetta’s internal architecture was designed to deliver enterprise-class availability and data protection. Even early on, customers are deploying both primary data and secondary data applications to the Zetta storage service. When people see what we are offering, very often their comment back to us is, “This really isn’t cloud storage, is it?”
