In-Depth
Achieving Optimal Database Application Performance: Benchmark Tips and Techniques (Part 1 of 3)
In this first installment in our three-part excerpt adapted from The Data Access Handbook, authors John Goodson and Rob Steward explain why benchmarks and how your database applications fit into the picture.
Excerpted from The Data Access Handbook by John Goodson and Rob Steward; published by Prentice-Hall; used by permission. For more information, visit http://dataaccesshandbook.com/
Odds are very good that one or more of your company’s database applications is suffering from performance issues. This should hardly come as a surprise to those of you who deal with application design, development, or deployment. What may be surprising to you is that the root cause of many of these issues is in the database middleware -- the software that connects applications to databases.
Ten to 15 years ago, if you were having performance issues with your database application, 95 percent of the time the issues were in your database management software. Back then, tuning the database was considered magic to everyone except for a select group of engineers and database experts, who most likely worked for the database vendors. They were keeping details about their databases a secret; it was proprietary.
The performance landscape started to change when these experts began writing books about database tuning and giving seminars to the public. Today, because of the volumes of database tuning information, years of experience, and vastly improved database monitoring tools, the task of tuning databases has become less painful.
Where Are We Today?
Today, even when the database is tuned well, we know that database applications don’t always perform as well as we’d like. Where does that leave us? Where do we find the performance issues today?
For the majority of database applications, the performance issues are now found in the database middleware. For most applications, 75-95 percent of the time it takes to process a data request is spent in the database middleware compared to 10 to 15 years ago, when the majority of the time was spent in the database management software.
In most cases, performance issues seen in database application deployments are caused by:
- The network
- Database drivers
- Software environment
- Poorly coded database applications
The Network
One of the most common performance issues of the network is packet size. Network packets carry an application’s messages via the database middleware to the database and vice versa. The size of the packets makes a difference in the performance of your database application. The main concept to remember is that fewer packets sent between the application and the database equates to better performance -- fewer packets mean fewer trips to and from the database.
Think of it this way. Jim’s manager, Cindy, asks him to move five cases of diet soda from her second floor office to the first-floor kitchen. If Jim’s packet size is a six pack rather than a case, then he has to make 20 trips to the kitchen instead of five, which means he is taking longer to move the soda to the kitchen.
The Database Driver
All database drivers are not created equal. The choice of which driver to use in a database application deployment can have a large impact on performance. The following real-world scenario explains how one company solved their performance issues by changing only the database driver.
DataBank serves the information needs of companies both large and small through the retrieval, storage, and delivery of data. DataBank’s reputation and financial security depends on response time and system availability. They have contracts with their customers that require specific response times and system availability. If these requirements are not met, they must pay their customers a fine.
After a mandated upgrade to a new version of their database and its accompanying middleware, DataBank began having serious performance issues. They were routinely paying over $250,000 a month in fines due to missed contractual obligations. The situation was unacceptable; the company had to find the performance issues in their database application deployment. They started by making sure that their database was optimally tuned. Even with the database performing well, the company was still missing their contractual service-level requirements.
The system architect made a phone call to a database consultant, and the consultant asked, "Have you considered trying a different database driver?" The architect responded, "I didn’t even know that was an option." The consultant recommended a database driver that he had used with success.
Losing no time, the architect had the recommended database driver installed in a test environment. Within two days, the QA department reported a 3x improvement in average response time between the new and the currently deployed database drivers and the elimination of stability problems.
Based on the results of their performance testing, DataBank purchased the new database driver. After the new database driver had been deployed, DataBank analyzed the revenue they were saving. In just the first two months, DataBank cut its fines from $250,000 to $25,000, a savings of 90 percent by simply changing the database driver. The new driver handled connection pooling and memory management more effectively than the old driver. DataBank solved several issues by deploying a new database driver: loss of revenue, customer dissatisfaction, and overworked IT personnel, to name a few.
Your Database Application
Another important performance-related component is your database application. If your application is not coded efficiently, the data request that your application passes along to the database middleware can negatively affect performance. One common example is transaction management. With most standards-based applications, the default transaction mode requires the database middleware to process expensive commit operations after every API request. This default "auto commit" mode can impose severe performance constraints on applications.
To illustrate this, consider the following real-world example. ASoft Corporation coded a standards-based database application and experienced poor performance in testing. Their performance analysis showed that their problem resided in the bulk five million Insert statements sent to the database. With auto-commit mode on, this meant an additional five million Commit statements were being issued across the network and that every inserted row was written to disk immediately following the execution of the insert. When auto-commit mode was turned off in the application, the number of statements issued by the driver and executed on the database server was reduced from ten million (5 million Inserts + 5 million Commits) to five million and one (5 million Inserts+ 1 Commit). As a consequence, application processing was reduced from 8 hours to 10 minutes. Why such a dramatic difference in time? There was significantly less disk I/O required by the database server and there were half as many network round trips.
In general, your database application should be written to:
- Reduce network traffic
- Limit disk input/output
- Optimize application-to-driver interaction
- Simplify queries
In the next article in our series we'll explore how to develop good benchmarks.
- - -
Rob Steward has spent more than 15 years developing database access middleware, including .NET data providers, ODBC drivers, JDBC drivers, and OLE DB data providers. He is currently vice president of research and development for DataDirect Technologies (http://www.datadirect.com). John Goodson is the executive lead at DataDirect Technologies. For more than 15 years, he has worked closely with Sun Microsystems and Microsoft on the development and evolution of database connectivity standards including J2EE, JDBC, .NET, ODBC, and ADO.NET.