Achieving Optimal Database Application Performance: Benchmark Tips and Techniques (Part 2 of 3)

In this second part of our three-part excerpt adapted from The Data Access Handbook, authors John Goodson and Rob Steward begin their discussion of nine guidelines that will help you create meaningful, useful benchmarks in your own organization.

by Rob Steward

Excerpted from The Data Access Handbook by John Goodson and Rob Steward; published by Prentice-Hall; used by permission. For more information, visit http://dataaccesshandbook.com/

Developing Good Benchmarks

In the first part of this series we discussed the elements of a good benchmark. We now turn to why benchmarks are an essential tool to measure and predict performance and how you can develop valid, usable benchmarks in your own organization using our set of nine guidelines.

Some factors that affect performance are outside of your control, but you can influence other factors, such as how efficiently your application is coded to access data and how efficiently your database middleware operates.

Benchmarks measure the performance of an application or system on a well-defined task or set of tasks and are often designed to accomplish one or more of the following goals:

  • Predict the performance of a new application or system component
  • Diagnose and pinpoint performance bottlenecks
  • Plan for a system’s future capacity
  • Determine the impact of a change to an application, such as a change to application code or a change in a hardware or software component
  • Ensure a minimum level of performance
  • Compare components, such as different database drivers

Performance is typically measured by throughput, scalability, response time, or a combination of these factors. In our line of work, we see a significant number of benchmarks. Because most of those we encounter don’t do what they were intended to do, we felt compelled to discus how to develop good benchmarks.

Developing the Benchmark

Standardized benchmarks are available for measuring the performance of databases (TPC, for example) and application servers (SPECjAppServer, for example), but standard industry benchmarks for measuring data access code and database middleware don’t exist. To help you develop benchmarks that measure these important factors, follow these guidelines.

  • Define benchmark goals
  • Reproduce the production environment
  • Isolate the test environment
  • Reproduce the workload
  • Measure the right tasks
  • Measure over a sufficient duration
  • Prepare the database
  • Make changes one at a time
  • Assess other factors

Define Benchmark Goals

Before you design your benchmark, carefully define what it is you want the benchmark to measure and define what you consider good performance. Benchmark goals typically are driven by business needs. For example:

  • The application must complete at least 10 transactions per second
  • The application must have a response time of no longer than 500ms when not executing transactions
  • The application must retrieve at least 100,000 rows in under 10 seconds
  • The application must insert at least a million rows in under 2 hours

In addition to measuring throughput, such as how many rows are retrieved, updated, or inserted over a period of time, measuring CPU and memory usage on the machine running your application can provide information about the scalability of that application or system. However, be careful to measure CPU and memory usage in a way that provides useful results.

For example, suppose your benchmark executes the same set of SQL statements in a loop over a period of 100 seconds. Let’s take a look at two different high-level approaches to measuring CPU usage. We arrive at the total CPU time used by taking snapshots of the CPU time using standard operating system calls. The difference between those snapshots allows us to calculate the total time spent by the process on the CPU.

Example A: Measuring Individual Operations

In this example, we take a CPU time snapshot within the loop, essentially measuring the elapsed CPU for each operation. To get the total CPU time, add each CPU time measurement. The problem with this approach is that the duration of the time that is measured for each operation is short. Benchmarks run over short durations provide results that often do not scale to real-world performance or results that may be inaccurate.

  1. Start the loop.
  2. Save the CPU time.
  3. Execute the SQL statements.
  4. Save the CPU time.
  5. End the loop.
  6. Determine the difference between each CPU time snapshot, and add those times to arrive at the sum of CPU time.
Example B: Measuring the Overall Operation

In contrast, this example takes a better approach because it samples the CPU time at the start and end of the loop; the duration of the benchmark is measured over an entire period of 100 seconds -- a sufficient duration.

  1. Save the CPU time.
  2. Start the loop.
  3. Execute the SQL statements.
  4. End the loop.
  5. Save the CPU time.
  6. Determine the difference between the CPU time snapshots to arrive at the CPU time.

Design Your Test Environment

Before you design your test environment, you need to gather information about the essential characteristics of your production environment so that you can duplicate them in your test environment. The following is a list of important questions you should ask before defining your test environment.

  • What is the version of your database?
  • Is the database installed on the same machine running the application, or is it installed on a different machine?
  • What are the model, speed, cache, and number of CPUs and cores of the processor hardware on your database server and application server?
  • How much physical memory (RAM) is available on your database server and clients?
  • What is the size and bus interface type of the hard disk on your database server and application server?
  • What is the speed of the network adapter on your database server and clients?
  • What is the version of the operating system on both the client and database server?
  • If your application uses a JVM, which JVM is it, and how is that JVM configured?
  • What compiler/loader options were used to build your application executable?
  • Does your application run on an application server?
  • At peak times, how many users run your application?
  • Do network requests travel over a LAN or a WAN? Do network requests travel over a VPN?
  • What tuning options are set for your database driver?

Make Your Test Data Realistic

Using a copy of your production data is a good idea, but that may not be possible in all cases. At a minimum, model your test data after real data, as shown in the following examples.

Example A: Design Test Data to Match Production Data

If your application retrieves data from a database table with 40 columns and 1,000 rows, design the test database table to have 40 columns with 1,000 rows.

Example B: Retrieve the Same Type of Data That Your Application Retrieves in Production -- Long Data

If your application retrieves long data, such as Blobsand Clobs, in addition to numeric and character data, make sure that your benchmark retrieves long data. Many database drivers emulate retrieving LOBs. You need to gauge how efficient the database driver is at retrieving long data.

Example C: Retrieve the Same Type of Data That Your Application Retrieves in Production -- Unicode

If your application retrieves Unicode data, make sure that your benchmark retrieves Unicode data. Unicode is a standard encoding that is used to support multilingual character sets. If your application, database driver, and database do not fully support Unicode, more data conversion is required, which affects performance. You need to gauge how efficient the database and database driver is at retrieving Unicode data.

Example D: Avoid Using Duplicate Values in Your Test Data

As a shortcut to creating test data, some benchmark developers populate test tables with duplicate values. For example, the following table contains a high percentage of duplicate values.

  first_name  

  last_name  

  SSN  

 Grover 
 Grover 
 Abraham 
 Grover 
 Grover 
 Grover 
 Grover 
 Abraham 
 Abraham 
 Ulysses 
 Cleveland 
 Cleveland 
 Lincoln 
 Cleveland 
 Cleveland 
 Cleveland 
 Cleveland 
 Lincoln 
 Lincoln 
 Grant 
 246-82-9856 
 246-82-9856 
 684-12-0325 
 246-82-9856 
 246-82-9856 
 246-82-9856 
 246-82-9856 
 684-12-0325 
 684-12-0325 
 772-13-1127 

Isolate the Test Environment

Because you must be able to reproduce consistent results to know whether changes have a positive or negative effect, it’s important to isolate the test environment from influences that can skew benchmarking results. For example, if your benchmark is influenced by the ebb and flow of corporate network traffic, how can you trust the benchmark results? Isolate the network traffic generated by your benchmark runs from corporate network traffic by connecting your test machines through a single router, which can then connect to the corporate network. In this way, all the network traffic of your test environment goes through the router and is not influenced by the rest of the corporate network.

For the same reason, make sure that your test machines are “clean.” Only run software that your application requires. Other applications running at the same time or in the background can profoundly influence test results. For example, if a virus-checking routine kicks off during a benchmarking run, it can slow performance significantly.

Reproduce the Workload

To design a good benchmark, you must have a solid understanding of the workload your application will deal with in the production environment. Ask yourself:

  • What tasks does my application commonly perform? Which tasks are significant enough to measure?
  • How many users does my application accommodate during peak traffic times?

Duplicating your real-world workload exactly can be impractical or impossible, but it’s important to emulate the essential characteristics of your workload and represent them accurately. For example, if you have a customer service application that typically performs the following actions, your test application should perform the same type of actions using the same data characteristics.

  • Retrieves the customer record (one large row) from a table
  • Retrieves invoices (multiple small rows) from another table
  • Updates an invoice (one small row) as part of a transaction

Emulate the peak traffic that your application encounters in the production environment. For example, suppose that you have an intranet application that has 500 end users, many working in an office on the West Coast of the United States. At 8:00 a.m. PST on a typical workday, as few as 20 users are active, whereas at 3:00 p.m. PST, approximately 400 users are active. In this case, design the benchmark to emulate 400 (or more) users. Commercial load test tools such as HP’s LoadRunner allow you to easily emulate many concurrent users.

In the final installment in our series, we'll conclude our discussion by examining the last five of our nine guidelines.

Rob Steward has spent more than 15 years developing database access middleware, including .NET data providers, ODBC drivers, JDBC drivers, and OLE DB data providers. He is currently vice president of research and development for DataDirect Technologies (http://www.datadirect.com). John Goodson is the executive lead at DataDirect Technologies. For more than 15 years, he has worked closely with Sun Microsystems and Microsoft on the development and evolution of database connectivity standards including J2EE, JDBC, .NET, ODBC, and ADO.NET.