
Hey! You! Get Onto the Cloud

If your BI project needs more resources -- from processing power to storage space -- cloud computing may just be what you need.

Most business intelligence practitioners have, at one time or another, encountered a situation where they simply did not have the necessary resources to solve a pressing analytical business issue. If the required resources were processing power, storage, or software, the answer could be "cloud computing."

Ask six people to define cloud computing and you are likely to receive (at least!) half a dozen different answers. The definitions will almost certainly all involve using computer services external to the organization, and (more often than not) specify that access to these services be provided via the Internet. However, just as there are many types of cloud formations, there are many services that can be offered via cloud computing, including on-demand software, data storage, processing power, and perhaps even support for collaborative efforts through online meetings and events.

Historical Perspective

While some might think that cloud computing is a new concept, it dates back to at least 1964, when Martin Greenberger, an associate professor at the MIT Sloan School, wrote about organizations being able to plug into a computer utility in his May 1964 Atlantic Monthly article, "The Computers of Tomorrow." The comparison is to electricity: early in the U.S. industrial revolution, companies generated their own power with on-site plants, and the idea of wiring a city to carry electricity from a central power plant to individual homes and office buildings was considered too costly or impractical.

A computer utility is an analogous offering, where remote computing power and services are made available over a network. However, unlike the original electrical power plants, which also had to wire the city to create the electric distribution network, today's computer utility can rely on the Internet, which is already in place and readily accessible from almost anywhere.

One of the first practical applications of this utility concept was the advent of commercial time-sharing services in the late 1960s and 1970s. These services allowed users to connect remotely to mainframe computers via phone lines using dial-up modems and acoustic couplers. Time-sharing users were typically charged for the CPU time and memory their applications used as well as for the disk storage they occupied; in some cases there were also surcharges for the use of specific software applications.

In the 1990s, application service providers (ASPs) hosting commercial applications that were available via the Internet represented another incarnation of cloud computing. Yet another example (which never took off) was, in my opinion, Oracle's Network Computer initiative. It used a stripped-down desktop device with no local storage and a thin-client interface to access data, applications, and office productivity software. (Given the seemingly endless string of stories about PC problems -- such as upgrade issues, multi-vendor software incompatibilities, security threats, and data theft -- the concept of a Network Computer now seems more appealing. However, this is another topic for another day.)

Pluses and Minuses

Cloud computing can offer several advantages: it can offload some processing rather than requiring investment in hardware upgrades, provide the protection of off-site backup of primary data storage, accommodate one-time tasks such as massive file conversions, or permit organizations to use software applications on a subscription basis, all with minimal upfront investments in hardware, software licensing, or IT staff. One use of particular relevance to business intelligence is a massive data cleansing effort, perhaps as a result of a merger or acquisition, when you must consolidate customer or vendor master data files. Cloud computing vendors could provide raw processing power, incremental disk storage, or even on-demand data cleansing solutions. Recognizing the potential for cloud computing, Kognitio, a data warehouse appliance vendor, now markets DaaS (Data Warehousing as a Service), allowing companies to undertake large-scale analyses on a "pay-as-you-go" basis without having to first acquire additional in-house computing resources.
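To make the "pay-as-you-go" trade-off concrete, here is a minimal sketch in Python that compares the metered cost of a one-time job with the cost of buying hardware for it. Everything in it is an assumption made for illustration: the names OneTimeJob, pay_as_you_go_cost, and cheaper_option are hypothetical, and the rates and hardware price are invented figures, not any vendor's pricing.

```python
from dataclasses import dataclass


@dataclass
class OneTimeJob:
    """A one-off workload, such as a master-data cleansing run (hypothetical)."""
    cpu_hours: float   # total processing time the job needs
    storage_gb: float  # temporary disk space the job needs
    months: float      # how long that storage is held


def pay_as_you_go_cost(job, rate_per_cpu_hour, rate_per_gb_month):
    """Metered cost: pay only for the CPU hours and storage actually used."""
    compute = job.cpu_hours * rate_per_cpu_hour
    storage = job.storage_gb * job.months * rate_per_gb_month
    return compute + storage


def cheaper_option(job, hardware_cost, rate_per_cpu_hour, rate_per_gb_month):
    """Return the cheaper choice and its cost for this single job."""
    cloud = pay_as_you_go_cost(job, rate_per_cpu_hour, rate_per_gb_month)
    if cloud < hardware_cost:
        return ("cloud", cloud)
    return ("in-house", hardware_cost)


# Illustrative figures only: 2,000 CPU hours and 500 GB held for two months,
# at assumed rates of 0.50 per CPU hour and 0.25 per GB-month.
job = OneTimeJob(cpu_hours=2000, storage_gb=500, months=2)
print(cheaper_option(job, hardware_cost=20000,
                     rate_per_cpu_hour=0.5, rate_per_gb_month=0.25))
```

The sketch deliberately ignores bandwidth charges, staff time, and the resale or reuse value of purchased hardware, any of which could change the answer for a recurring workload.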

However, cloud computing raises other issues that organizations should be aware of: security, availability, and recovery. For example, while cloud computing can provide incremental or even primary storage, any data stored off the premises (in the cloud) must be secure, readily available, and quickly recoverable in the event of a disaster. Furthermore, there must be adequate bandwidth to allow for multiple users and quick response times.

A company must also be protected in the event the cloud vendor runs into financial difficulties. Even if contracts provide for monetary damages and guaranteed service levels, it may be more important that the organization can continue to conduct its business, so suitable procedures must be established in advance. Organizations using on-demand software-as-a-service applications should avoid situations that lock them into their on-demand vendor and leave them unable to move to other on-site or on-demand competitors in the future. In particular, these organizations must be sure to retain ownership of any of their files that reside in the cloud and be able to access them without undue delay. Although company financial and marketing analysts will be pleased that cloud computing may have allowed them to perform analyses that could not otherwise have been undertaken, these same analysts will be less than pleased (if not hostile!) if they can't refine their analyses because of a "cloudburst."

It is also important that cloud applications be able to integrate with on-site applications and share data where appropriate. I highly recommend that any initial foray into cloud computing involve a relatively independent, standalone application (perhaps for off-site data backup or to host a one-time standalone application such as a massive file conversion or a data profiling analysis). If an organization continues to move additional applications to the cloud, it must develop a computing architecture that encompasses and integrates both on- and off-site computing.

In addition to the growing cadre of application and BI vendors now offering on-demand, subscription-based applications and tools, several well-known companies (including Amazon, Hewlett-Packard, and Sun) currently offer the ability to use the excess capacity in their data centers. Their success has helped popularize cloud computing, perhaps leaving the impression that it is closely, if not exclusively, aligned with the selling of raw processing power. Cloud computing encompasses more than access to raw processing power, although that is certainly one of its components.

Organizations not already doing so should start to think about how cloud computing can benefit them. The time to gain experience with cloud computing is before any pressing need arises, so that your organization can plan for the technology, not react to it. Future computing platforms are likely to be even more Web-centric, and IT organizations would do well to consider "getting their heads into the clouds" soon.
