Building a Data Center without Walls: The Need for Intelligent Networking
A new intelligent cloud network between data center and cloud is needed for efficient workload mobility.
By Jim Morin
Much has been written about the network within the data center, with Ethernet and Fibre Channel dominating the discussion about user-to-server connectivity and input/output (I/O). This internal data center network will change as increased server virtualization drives growth in I/O, and organizations upgrade from 1 Gigabit Ethernet (GbE) to 10 GbE to achieve greater capacity.
As data centers evolve, we also need to consider the network that connects the data centers -- particularly how these inter-data-center networks need to change to support new cloud use cases and associated network requirements for bandwidth scalability, low latency, security, virtualization, and automation.
To illustrate why this needs to happen, let's first look at today's most common enterprise cloud use: Software-as-a-Service (SaaS), or applications run in the cloud.
In a SaaS model, the cloud resource being accessed is an application; user connectivity is redirected from the enterprise data center to the application running in the cloud data center. Some applications have low bandwidth and lenient latency demands, making them suitable for a remote cloud server location. Security and virtualization for these types of applications are often implemented with a virtual private network (VPN) over the Internet using a public telecommunications infrastructure. Private MPLS/VPLS networks can also provide connectivity and be traffic-engineered for higher performance.
For example, many of us run Google Apps and Gmail from Google servers located in the Google cloud, either to supplement or replace Microsoft Office and Exchange applications that run on local servers in our corporate data center. These applications -- whether running in the corporate data center or in the cloud -- generally provide good performance over standard public Internet connections of 3 to 30 Mb/s, with network security provided by a VPN. Latency issues for these types of applications can be minimized with "store and forward" architectures and WAN optimization.
In addition to SaaS applications, the cloud is now increasingly being used for Infrastructure-as-a-Service (IaaS) applications, which place significantly greater workload demands on the network, as I will illustrate in the following example.
From SaaS to IaaS
The roots of IaaS date back to the emergence of utility computing in the early 2000s, which Amazon brought to market in 2006 as a public cloud service called Amazon Elastic Compute Cloud (EC2). For the first time, Amazon defined a computing unit, or "instance," as server processing speed, memory, and storage that can be purchased with per-hour pricing and consumed on-demand.
AWS EC2 Instance
Standard Small – 1 EC2
Standard Large – 4 EC2
Standard Extra Large – 8 EC2
Table 1: Sample AWS instances
Table 1 shows some of Amazon's EC2 instances. The company now markets several additional instance definitions, plus a separate storage infrastructure service. Its estimated 80-90 percent share of the public cloud market provides an industry benchmark for all other public cloud service providers and internal private clouds.
As cloud environments mature, many use cases have emerged for operating the server instance (also called a virtual machine or VM) in the cloud as well as for moving virtual machines and storage between data centers. For example, use cases might involve moving or re-balancing virtual machines over the network while keeping the application online. They could also involve moving VMs to a safe location for disaster avoidance, or migrating VMs to a new data center location.
Another typical IT use case is making a server platform change -- either to upgrade or change a server hardware platform -- where moving the VM over the network to the new platform can save time. This new network mobility, enabled by the virtualization of physical resources, places greater demand on the inter-data-center network -- far beyond the typical SaaS connection.
To more clearly see how workload mobility use cases affect the network, we will generalize virtual server instances, or VMs, to have the following characteristics:
Virtual Machine (VM)
Table 2: VM configurations
The number of VMs and their collective memory and storage become very important when sizing the amount of information to be transported over the network. For example, if it's only one server that needs a platform change, that may mean a need to move 10 VMs, as a typical physical server now operates anywhere from 4-15 virtual machines. Assuming 10 VMs of medium size, the total data to transfer would be about 10 terabytes (10x5 GB memory + 10x1000 GB storage).
Transferring 10 TB of data over a typical 40 Mb/s MPLS connection would take approximately four weeks, assuming no re-transmissions and 80 percent effective utilization of the link. Clearly this use case is not a good fit, time-wise, for a typical 40 Mb/s shared network, and it would negatively affect all other users.
At this point, some might respond by throwing more bandwidth at the problem. However, even at 1 Gb/s, moving the 10 VMs in our example would take about 28 hours, still pushing practical limits. Many other use cases (such as business continuity efforts, peak workload computing, and bulk migrations for data center re-location) could require the movement of hundreds of VMs of various sizes over the network in a relatively short time period. This workload might also need to be repeated periodically, putting increased and repeated strain on the network and its users.
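The arithmetic behind these estimates can be checked with a short calculation. The following is a minimal sketch, not the author's own method: the function name and the decimal unit conversion (1 GB = 8e9 bits) are assumptions made for illustration, while the VM sizes, link speeds, and 80 percent utilization come from the text.

```python
def transfer_time_hours(data_gb, link_mbps, utilization=0.8):
    """Hours needed to move data_gb gigabytes over a link_mbps link.

    Assumes decimal units (1 GB = 8e9 bits) and no re-transmissions;
    utilization is the fraction of the link actually carrying the transfer.
    """
    bits = data_gb * 8e9
    effective_bps = link_mbps * 1e6 * utilization
    return bits / effective_bps / 3600


# 10 medium VMs, each with 5 GB of memory and 1000 GB of storage
total_gb = 10 * (5 + 1000)

for label, mbps in [("40 Mb/s", 40), ("1 Gb/s", 1000)]:
    hours = transfer_time_hours(total_gb, mbps)
    print(f"{label}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions the 40 Mb/s link needs roughly 29 days and the 1 Gb/s link roughly 28 hours, in line with the figures quoted above.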
The Case for an Intelligent Cloud Network
Continuing to add more bandwidth for peak workloads leads to inefficient utilization and higher costs. Instead, IT needs to adopt a more strategic approach to intelligently networking with cloud-based infrastructure services. Today this may involve consolidating low-speed circuits onto shared, high-speed 10 Gb/s connections with quality-of-service controls. Another strategy may be to look into the rapidly growing Ethernet Layer 2 services that can easily scale from 1 Mb/s to 10 Gb/s with low, predictable metro latency. Bandwidth improvements can now be justified not only by improved data protection but also as an enabler for shifting computing from the server closet to the cloud.
The new intelligent cloud network characteristics would include:
- Greater availability and reliability from a flatter, switch-based architecture
- Easily scalable bandwidth using fiber transport that adapts to growing requirements
- Low latency for predictable user application experience enabling servers and storage to be confidently placed in cloud data center locations
- Network hypervisor-driven virtual capacity and automation for on-demand network provisioning
As with cloud computing processing power, memory, and storage, the network should also be considered a resource that can be virtualized and shared. Intelligent allocation or orchestration of network resources, provided by a network hypervisor, will maximize the efficiency for the network operator while providing the promise of lower-cost, pay-for-use billing for the network user.
Using the example above of a 10 TB data transfer, an intelligent network could more easily accommodate the workload by dynamically expanding to 5 Gb/s and completing the job in less than 5 hours at full line rate (closer to 5.6 hours at the 80 percent utilization assumed earlier). When the job is done, the network would immediately return to standard levels so that the premium bandwidth is billed only as used.
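To see how the burst rate and the completion window relate, the calculation can be run in both directions. This is an illustrative sketch only: the function names are hypothetical, decimal units are assumed (1 TB = 8e12 bits), and the utilization parameter is an assumption that defaults to full line rate.

```python
def burst_hours(data_tb, gbps, utilization=1.0):
    """Hours to move data_tb terabytes at a burst rate of gbps."""
    bits = data_tb * 8e12
    return bits / (gbps * 1e9 * utilization) / 3600


def required_gbps(data_tb, window_hours, utilization=1.0):
    """Burst rate (Gb/s) needed to move data_tb terabytes within window_hours."""
    bits = data_tb * 8e12
    return bits / (window_hours * 3600) / 1e9 / utilization


# 10 TB at a 5 Gb/s burst, at line rate and at 80 percent utilization
print(f"{burst_hours(10, 5):.2f} hours at line rate")
print(f"{burst_hours(10, 5, utilization=0.8):.2f} hours at 80% utilization")

# Rate needed to guarantee a 5-hour window at 80 percent utilization
print(f"{required_gbps(10, 5, utilization=0.8):.2f} Gb/s required")
```

A provisioning system could use the second function to choose how much bandwidth to request for a given maintenance window, then release it when the transfer completes.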
The ability to respond to IaaS workload demands with dynamic bandwidth is a key benefit of an intelligent network for the cloud. In addition, this network would have higher availability, lower latency and greater reliability, as it would be designed for critical infrastructure services.
The precursor to this operating model is found in another service from Amazon, called AWS Direct Connect. Just introduced in August 2011, the on-demand network service is available at an hourly cost at either 1 Gb/s or 10 Gb/s. The service connects Amazon data centers with partner data centers supported by an ecosystem of 14 network providers. The Direct Connect service validates the need for all cloud providers to offer an on-demand, high-capacity, dedicated network that can accommodate the large workload transfers described above.
The network itself might be the easy part, as we also need the cloud operation to provide on-demand, elastic, and measurable services, which will require new workload orchestration software. On-demand networking has been discussed for more than 10 years but has been difficult for carriers to deploy due to legacy billing and operations software. With the very nature of cloud services being on demand, service providers understand that the network will also need to have this operating characteristic, not just for bandwidth capacity but also for performance assurance.
Future data center architectures will federate everything -- networks, applications, and physical locations. The resulting "Data Center without Walls" operating model will give IT tremendous operational flexibility and agility to better respond to and support business initiatives by transparently using both in-house and cloud-based resources. This virtual data center, connected with an intelligent network, will become a key piece of IT cloud strategy as enterprises look to take advantage of the new opportunities cloud computing brings.
Jim Morin is the product line director for the managed services and enterprise group at Ciena and is the deputy commissioner at TechAmerica CLOUD2. You can contact the author at email@example.com