In-Depth
Web Data Centers: Big QoS and TCO Opportunities Require Innovation
Realizing the potential benefits for large-scale service deployments presents difficult challenges. We discuss these opportunities and challenges and recommend solutions.
By Dr. John Busch, Chief Technology Officer and Founder, Schooner Information Technologies
The explosion in Web usage and content places a tremendous load on data centers. IDC reports that 487 exabytes of new data were created in 2008, more than in the prior 5,000 years, and that IP traffic will quintuple by 2013. However, data center efficiency is low. The U.S. Department of Energy reports that for every hundred units of energy piped into a data center, only three are used for actual computing.
The success of a Web site is highly dependent on the quality of service (QoS) that it delivers. Providing excellent user response time and continuous service availability are fundamental requirements for success. The quantity of servers and storage required for the performance, scalability, and availability can be very large, and managing the provisioning, data partitioning, and high-availability configurations across them can be daunting.
Data center QoS and total cost of ownership (TCO) have a major impact on the enterprise’s business growth and profits. Application developers and IT departments seek architectures and deployments providing resilient performance scalability and high service availability to effectively meet rising service demand while controlling capital and operating expenses.
Tremendous technology advances have been made in recent years that offer the potential for major improvements in QoS and TCO. Commodity hardware industry advances in enterprise flash memory and multi-core processors offer significant potential improvements in performance while reducing power and space consumption. Software technologies (including virtualization, cloud, and data stores) provide new deployment architectures that offer large potential improvements in scalability, availability, and cost structure.
Realizing these potential benefits for large-scale service deployments presents difficult challenges. In this article we’ll look at these opportunities and challenges and recommend solutions.
Exploiting Key Technology Advances: Architectural Change Needed
Commodity hardware technologies continue to advance rapidly. In the past year, the number of cores on commodity Intel Nehalem processors has doubled. Flash memory provides about 1/100th the latency and about 100x the random I/O operations per second (IOPS) of hard drives, and about 1/100th the power dissipation of DRAM at much higher density and lower cost. In the past year, the capacity of enterprise flash memory SSDs has quadrupled, and when loaded with new enterprise multi-level cell (eMLC) flash memory, they provide low cost and high endurance in addition to high capacity. However, these fundamental advances in processors and I/O have not translated into commensurate gains in data center QoS and TCO. What’s holding us back?
The potential benefits of server virtualization and cloud computing are clear and compelling. Server virtualization provides increased computing resource utilization and elastic scaling. Additionally, cloud computing provides cloud infrastructure companies economies of scale by sharing computing resources among multiple tenants. Finally, instead of devoting capital expenditures up-front to buy hardware -- which has the risk of rapid obsolescence -- cloud computing lets IT organizations pay as they compute with a controllable operating expense.
Cloud computing’s elasticity and potential cost savings have made it a key piece of most organizations’ enterprise IT strategies. However, the cloud industry and enterprises are hitting barriers in deploying enterprise-class services into the cloud at scale. For many classes of applications and services, the realized performance and availability characteristics of cloud deployments at scale are disappointing, and the large quantity of cloud instances needed to scale a deployment drives its cost to unacceptable levels.
Every industry undergoing commoditization eventually faces this issue of diminishing returns on technology advances. As an industry commoditizes, standards are defined at every level of the technology, and sub-industries form around specializing and optimizing components that support those standards. Solutions are constructed from these standardized, locally optimized components and subsystems. Eventually, new integrated architectures are required to advance innovation, and these in turn become standardized and commoditized.
In the case of our industry, this effect has been accelerated in magnitude and compressed in time. The cost reductions and development productivity afforded by standardization and commoditization have been the driving forces enabling ubiquitous computing and Web content and services. The proliferation of computing devices and the innovation in Web services have been tremendous, but efficiency is trailing far behind. New, integrated architectures are required to leverage technology advances in a manner that efficiently meets service requirements while preserving business-application-level compatibility and investment.
Very large Web properties recognize this and understand that the standard tools and architectures are limiting and inefficient. They are investing billions of dollars and deploying tens of thousands of engineers and scientists to create proprietary architectures and designs for scalable software and efficient data centers, often designing and sourcing their own hardware and software rather than procuring from the standard industry value chain. This internal investment and innovation model is intended to provide them with a competitive advantage while reducing their operating costs.
With large-scale investments, the largest Web companies are making progress in both QoS and TCO, albeit with spurts and lags and proprietary solutions, most of which are opaque outside their enterprises in order to protect their intellectual property. However, our industry in general is making slow progress. To broadly realize the large potential data center productivity, scaling, availability, and cost efficiencies offered by technology advances, commensurate architectural advances are required, coupled with new, efficient, cost-effective, standard, integrated building blocks and services.
The Big Gaps and Big Levers
In an enterprise service deployment, Web application servers access the data-access tier, which consists of shared databases, key-value stores, document stores, wide-column stores, and caching services. The overall QoS and TCO of the service deployment are largely dependent on this data-access tier.
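To make this tier concrete, here is a minimal sketch in Go of the cache-aside read path an application server typically follows against it: check the caching service first, fall back to the database on a miss, and then populate the cache for subsequent reads. The Cache and Store interfaces and the in-memory stand-ins are illustrative assumptions, not any particular product’s API.

package main

import (
	"errors"
	"fmt"
)

// Cache and Store are hypothetical stand-ins for a caching service
// and a backing database in the data-access tier.
var ErrMiss = errors.New("cache miss")

type Cache interface {
	Get(key string) (string, error)
	Set(key, value string) error
}

type Store interface {
	Read(key string) (string, error)
}

// readThrough follows the cache-aside pattern: try the cache first,
// fall back to the database on a miss, then fill the cache.
func readThrough(c Cache, s Store, key string) (string, error) {
	if v, err := c.Get(key); err == nil {
		return v, nil // served from cache
	}
	v, err := s.Read(key) // authoritative copy
	if err != nil {
		return "", err
	}
	_ = c.Set(key, v) // best-effort cache fill
	return v, nil
}

// In-memory stand-ins so the sketch runs on its own.
type mapCache map[string]string

func (m mapCache) Get(k string) (string, error) {
	if v, ok := m[k]; ok {
		return v, nil
	}
	return "", ErrMiss
}
func (m mapCache) Set(k, v string) error { m[k] = v; return nil }

type mapStore map[string]string

func (m mapStore) Read(k string) (string, error) {
	if v, ok := m[k]; ok {
		return v, nil
	}
	return "", errors.New("not found")
}

func main() {
	cache := mapCache{}
	db := mapStore{"user:42": "Ada"}
	v, _ := readThrough(cache, db, "user:42") // first read misses and hits the store
	fmt.Println(v)
	v, _ = readThrough(cache, db, "user:42") // second read is served from cache
	fmt.Println(v)
}

Every request that misses the cache lands on the shared databases and stores behind it, which is why the efficiency of this tier dominates the overall QoS and TCO of the deployment.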
Schooner Labs has done extensive exploration and evaluation of advanced technologies and architectures for the data center’s data-access tier, analyzing their effects on data center QoS and TCO. We share those results on our blog; here we present a summary of the key gaps and innovation opportunities.
Big Lever #1: Balanced Systems with Multi-Core, Flash Memory, and Integrated Data Access Software
Commodity multi-core processors and flash memory offer huge potential for improving the QoS and TCO of the data center’s data-access tier. Databases, data stores, and caching services have the inherent potential to exploit parallel processing cores and flash devices since they are designed to process thousands of concurrent independent client connections and transactions using multi-threading.
However, current Web site databases, data stores, and application software fundamentally underutilize multi-core processors and flash IOPS. Current open source databases and data stores are typically only able to utilize two to four cores, and they gain less than 50 percent improvement from flash memory.
To effectively utilize multi-core and parallel flash memory, key software architecture changes are required. The data and resource management algorithms, originally designed for hard disk drives, must provide significantly higher levels of parallelism, more granular concurrency control, considerably more intelligent storage hierarchy management, and specific management algorithms tailored to multi-core and flash characteristics. These improvements can be made while preserving the client/server application layer API, thereby maintaining compatibility and leveraging all of the software investment in Web- and application-server business logic.
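As a rough illustration of the granular concurrency control described above, the Go sketch below stripes a key space across independent shards, each guarded by its own lock, so threads running on different cores rarely contend for the same lock. The shard count and hash function are illustrative assumptions; a production data store layers flash-aware storage management, logging, and replication on top of this kind of structure.

package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// Fine-grained concurrency control: instead of one global lock
// serializing all cores, the key space is striped across independent
// shards, each with its own mutex. The shard count and hash are
// illustrative choices, not taken from any particular product.
const numShards = 64

type shard struct {
	mu   sync.RWMutex
	data map[string]string
}

type shardedMap struct {
	shards [numShards]shard
}

func newShardedMap() *shardedMap {
	m := &shardedMap{}
	for i := range m.shards {
		m.shards[i].data = make(map[string]string)
	}
	return m
}

func (m *shardedMap) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &m.shards[h.Sum32()%numShards]
}

func (m *shardedMap) Get(key string) (string, bool) {
	s := m.shardFor(key)
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

func (m *shardedMap) Set(key, value string) {
	s := m.shardFor(key)
	s.mu.Lock()
	s.data[key] = value
	s.mu.Unlock()
}

func main() {
	m := newShardedMap()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ { // simulate concurrent client connections
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			key := fmt.Sprintf("key-%d", id)
			m.Set(key, fmt.Sprintf("value-%d", id))
		}(i)
	}
	wg.Wait()
	v, _ := m.Get("key-3")
	fmt.Println(v)
}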
When the data access software incorporates these mechanisms and is executed on balanced configurations with leading commodity multi-core processors and flash SSDs, databases and data stores achieve a tenfold improvement in throughput per watt per cubic centimeter compared with legacy data access software on hard-drive-based systems. Overall data center power consumption can be cut by 50 percent, along with a 10-to-1 consolidation in the number of servers in the data-access tier.
Big Lever #2: Data-Access-Tier Virtualization Paradigm and Clouds
Current server virtualization technologies rely on provisioning application instances in virtual machines onto servers under the management of a hypervisor. This is a straightforward way to map existing applications onto multi-core systems: running multiple virtual machines on a server managed by a hypervisor utilizes the cores by sharing them among the application instances. It also provides elasticity of service capacity through dynamic provisioning of more or fewer application instances based on current workload demand.
This virtual machine approach works well when the data needed by application instances fits in DRAM. When it doesn’t and applications require significant I/O, overall service performance and availability become highly variable and drop dramatically compared to non-virtualized servers. As a consequence of this gap, Gartner reports that less than 10 percent of data-tier-server workloads are virtualized today. When data-access-tier applications are executed in virtual machines, it becomes necessary to work around the diminished performance by adding data partitioning and caching layers, and by provisioning many more instances than in a non-virtualized environment. These numerous small data partitions, caches, and application instances drive up application and management complexity, increase cost, and reduce service availability.
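The Go sketch below illustrates the kind of client-side partitioning layer this workaround pushes into the application: a consistent-hash ring routes each key to one of many small cache or store instances, every one of which must be provisioned, monitored, and rebalanced as instances come and go. The instance names and virtual-node count are illustrative assumptions.

package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// A consistent-hash ring mapping keys to many small instances.
type ring struct {
	hashes []uint32
	owner  map[uint32]string
}

func newRing(instances []string, virtualNodes int) *ring {
	r := &ring{owner: make(map[uint32]string)}
	for _, inst := range instances {
		for v := 0; v < virtualNodes; v++ {
			h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s#%d", inst, v)))
			r.hashes = append(r.hashes, h)
			r.owner[h] = inst
		}
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// lookup returns the instance responsible for a key: the first ring
// position at or after the key's hash, wrapping around at the end.
func (r *ring) lookup(key string) string {
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0
	}
	return r.owner[r.hashes[i]]
}

func main() {
	// Many small VM instances instead of a few large servers: every one
	// of these is a partition the application must track and manage.
	instances := []string{"cache-vm-01", "cache-vm-02", "cache-vm-03", "cache-vm-04"}
	r := newRing(instances, 100)
	for _, key := range []string{"user:42", "session:9f3", "cart:7"} {
		fmt.Printf("%-12s -> %s\n", key, r.lookup(key))
	}
}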
In data center deployments, Web application servers can be effectively run in virtual machines in production, but data-access-tier servers can only be effectively run in virtual machines for development or small-scale production. Yet most clouds today require using virtual machines for all applications running in the cloud. Clouds limited to this virtualization approach for the data-access tier will not realize the potential QoS and TCO benefits of deploying enterprise-class, scaled services. In particular, the large QoS and TCO benefits achievable through the architectural innovation of balanced data-access-tier systems exploiting leading-edge commodity flash and multi-core with integrated data access software are lost. We require innovation in data-access-server virtualization technologies and in cloud architectures to capture these benefits for enterprise-class, scaled deployments.
In the short term, hybrid clouds can fuse the benefits of the compelling industry trends of architectural improvements in data-access-tier solutions and clouds. Hybrid clouds can use virtualized machine instances for the Web and application tier while exploiting non-virtualized, vertically scaling data-access-tier solutions in balanced flash-based system configurations. In these deployments, the optimized data-access tier servers integrate into cloud data centers as shared, networked servers with explicit virtualization based on management APIs for provisioning, accounting, monitoring, security, and multi-tenancy controls rather than as virtualized machine instances.
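Purely as a hypothetical sketch of what such explicit virtualization might look like, the Go code below defines a management interface covering provisioning, accounting, and multi-tenancy quota control, with a trivial in-memory stand-in so the example runs; none of the names or fields are taken from an actual product API.

package main

import (
	"errors"
	"fmt"
)

// Hypothetical management surface a shared, non-virtualized
// data-access-tier server might expose to a cloud control plane in
// place of machine-level virtualization. All names and fields are
// illustrative assumptions.
type TenantSpec struct {
	TenantID       string
	DRAMQuotaGB    int
	FlashQuotaGB   int
	MaxConnections int
}

type UsageRecord struct {
	Requests  uint64
	BytesRead uint64
}

type DataTierManager interface {
	Provision(spec TenantSpec) error              // carve out a tenant's share of the server
	Deprovision(tenantID string) error            // release it
	Usage(tenantID string) (UsageRecord, error)   // accounting and monitoring
}

// inMemoryManager is a trivial stand-in so the sketch runs on its own.
type inMemoryManager struct {
	tenants map[string]TenantSpec
	usage   map[string]UsageRecord
}

func newInMemoryManager() *inMemoryManager {
	return &inMemoryManager{
		tenants: make(map[string]TenantSpec),
		usage:   make(map[string]UsageRecord),
	}
}

func (m *inMemoryManager) Provision(spec TenantSpec) error {
	if _, exists := m.tenants[spec.TenantID]; exists {
		return errors.New("tenant already provisioned")
	}
	m.tenants[spec.TenantID] = spec
	m.usage[spec.TenantID] = UsageRecord{}
	return nil
}

func (m *inMemoryManager) Deprovision(tenantID string) error {
	delete(m.tenants, tenantID)
	delete(m.usage, tenantID)
	return nil
}

func (m *inMemoryManager) Usage(tenantID string) (UsageRecord, error) {
	u, ok := m.usage[tenantID]
	if !ok {
		return UsageRecord{}, errors.New("unknown tenant")
	}
	return u, nil
}

func main() {
	var mgr DataTierManager = newInMemoryManager()
	_ = mgr.Provision(TenantSpec{TenantID: "tenant-a", DRAMQuotaGB: 64, FlashQuotaGB: 512, MaxConnections: 4096})
	u, _ := mgr.Usage("tenant-a")
	fmt.Printf("tenant-a usage: %+v\n", u)
}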
In the longer term, improved virtualization technologies are needed so that integrated data-access-tier software, flash, and multi-core can be effectively virtualized with a unified virtual administration model applicable to all tiers in the data center, including dynamic provisioning, management, monitoring, and accounting.
The Bottom Line
We are at a point in our industry’s evolution where we can and must apply architectural innovation to effectively address QoS and TCO in the face of exponentially increasing service demands. Key opportunities lie in integrating databases, data stores, and data caching services with advanced commodity multi-core and flash memory, and in creating new cloud virtualization technologies. These offer the potential for order-of-magnitude improvements in performance scalability and service availability to meet rapidly expanding service demand while controlling capital and operating expenses and reducing power consumption.
Dr. John Busch is chief technology officer and founder of Schooner Information Technologies. You can contact the author at [email protected]