Data Visualization Principles

In a previous column I discussed some of the categories of data analysis that are applicable to business information. Effective data analysis and understanding can be helped or hindered through the use of data visualization technology. A good graphical presentation of data can make it easy to understand information otherwise buried in the data. Conversely, a poor graphical data presentation can make it impossible to comprehend what the data is trying to tell you.

In this column, I'll discuss a few effective data visualization principles, which are based on the work of Edward Tufte, a professor at Yale University. Given the increasing use of graphics, especially on the Web, this information can be very helpful if you need to design business graphics, presentations or a user interface.

Tufte has published three excellent books on data visualization. The Visual Display of Quantitative Information is about pictures of numbers and examines statistical charts, graphs and tables. Envisioning Information is about pictures of nouns and analyzes maps of data. Visual Explanations is about pictures of verbs and focuses on depicting cause and effect.

One fundamental principle is that a display should be focused on the data. While this seems like an obvious concept, it is one that is often ignored. For example, designers often clutter a chart with cute little icons or other visual elements that they think are cool or interesting, but these do nothing to clarify the viewer's understanding of the data. A related concept is that the order of the data can have a significant impact on a viewer's ability to understand it.

When you are designing a graph, chart or user interface, resist the temptation to add extraneous graphical elements that don't clarify or expose the deeper meaning of the data. Also, ensure that the data is presented in a meaningful order.

One example that Tufte relates is how the rocket scientists at Morton Thiokol -- who were concerned about the O-rings on the shuttle Challenger -- created ineffective charts and graphs to support their contention that the shuttle launch should be delayed. After the shuttle exploded, they prepared additional graphs to illustrate the relationship between launch pad ambient temperature and O-ring erosion. They still didn't get it right, however, for two reasons. First, the chart had little rocket icons on it, which cluttered the display of the data. Second, the data was ordered by launch sequence, from first to last, rather than by temperature.

This demonstrates the principle that clear, precise thinking about a problem will lead to a clear, precise understanding of how to present the data. In this case, the rocket scientists should have created a simple scattergram with two axes: one for the degree of O-ring damage and the other for the launch pad ambient temperature. Once the data is ordered and displayed this way, it becomes obvious that there was a clear relationship between O-ring damage and cold launch temperatures: the colder the launch, the greater the damage. An observer would also have noted that the Challenger was launched at an ambient temperature of 29° F, while the next coldest launch -- which had the highest degree of O-ring damage to date -- occurred at 53° F.
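To make the ordering point concrete, here is a minimal Python sketch. The launch records in it are invented purely for illustration and are not the actual shuttle data, and the function names are hypothetical. It reorders the records from coldest to warmest and draws a crude text chart, standing in for a real scattergram, so that damage can be read directly against temperature.

```python
# Hypothetical launch records: (launch_sequence, temperature_F, damage_score).
# These values are invented for illustration only; they are NOT real shuttle data.
launches = [
    (1, 70, 0),
    (2, 57, 4),
    (3, 75, 0),
    (4, 63, 2),
    (5, 53, 11),
    (6, 67, 0),
    (7, 81, 0),
]


def order_by_temperature(records):
    """Return the records sorted from the coldest launch to the warmest."""
    return sorted(records, key=lambda record: record[1])


def text_chart(records):
    """Build one text row per launch, coldest first.

    The length of the '#' bar encodes the damage score, so the
    relationship with temperature is visible at a glance.
    """
    rows = []
    for _, temp_f, damage in order_by_temperature(records):
        rows.append(f"{temp_f:>3} F | {'#' * damage}")
    return rows


if __name__ == "__main__":
    print("\n".join(text_chart(launches)))
```

Listed in launch sequence, the same records would scatter the damaged flights among the undamaged ones. Sorted by temperature, the damage in this made-up data clusters at the cold end of the chart.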

Another principle to consider is that a good graph is multidimensional. An example given by Tufte is a chart that shows the fate of Napoleon's army as it entered and exited Russia during the campaign of 1812. The chart has a band that illustrates the size of the army. Where Napoleon entered Russia with 400,000 men, the band is wide; the width is cut to one-fourth the size by the time 100,000 men reach Moscow. It is further reduced to a thin line to represent the 10,000 men who made it back to Poland. The graph illustrates the army's direction, shows its location over time and is tied to a temperature graph for the retreat. A good graph will visibly demonstrate the relationship between cause and effect. In this instance, the relationship between the number of surviving soldiers in the army and the temperature -- which fell to -30° C -- is striking. It leads to the conclusion that most of Napoleon's army was wiped out by the cold.
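The band-width idea can be sketched in a few lines of Python. This is only an illustration of one dimension of the chart, the size of the army, using the three troop counts quoted above. The linear scaling, the 40-character maximum width and the function names are assumptions of this sketch, not details of the original chart.

```python
# Troop counts as quoted in the column; the stage labels are shorthand.
stages = [
    ("Entering Russia", 400_000),
    ("Reaching Moscow", 100_000),
    ("Back in Poland", 10_000),
]

MAX_WIDTH = 40  # assumed width, in characters, of the widest band


def band_width(troops, largest, max_width=MAX_WIDTH):
    """Scale a troop count linearly to a band width, never thinner than 1."""
    return max(1, round(troops / largest * max_width))


def render(stage_list):
    """Return one text row per stage, with a band proportional to army size."""
    largest = max(count for _, count in stage_list)
    return [
        f"{label:<16}{'=' * band_width(count, largest)} {count:,}"
        for label, count in stage_list
    ]


if __name__ == "__main__":
    print("\n".join(render(stages)))
```

With these numbers the bands come out 40, 10 and 1 characters wide, which mirrors the one-fourth reduction at Moscow and the thin line that returns to Poland.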

In conclusion, use elements like color, direction and text to illustrate the richness and complexity of data. Once you learn to do this, you'll never be content with creating a two-dimensional pie or bar chart again.

--Robert Craig is vice president of marketing at WebXi Inc. (Burlington, Mass.), and a former director at the Hurwitz Group Inc. Contact him at