WAP Analytics – Bottleneck Analysis

Introduction

This article offers some help with examining web performance bottleneck areas by stepping through a process. The process should take us from being aware of potential bottlenecks to confirming where bottlenecks exist, if any. Yes, it is possible for a performance test project to reveal no bottlenecks. This is a good problem to have, and bottleneck analysis can confirm that condition.

Allow me to communicate two definitions. First, what is application bottleneck analysis? And second, what is a bottleneck?

Application Bottleneck Analysis is a process tool which helps identify where application performance is constrained, find the root causes of those constraints, and address the root causes that have been identified.

A bottleneck is a congestion point in an application or in its related frontend and backend processes (such as a computer network). This condition occurs when business requests arrive, or responses return, too quickly for the application, network, web, client, or database processing to handle. The inadequacies brought about by the bottleneck often create response delays and, ultimately, higher costs.

Performance bottlenecks can slow an application to a crawl. The term “bottleneck” refers to the hardware and software supporting the application, network, web, client, and database. At least one of these areas is unable to keep pace with the rest, which slows overall response time.
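
To make the definition above concrete, here is a minimal sketch of what happens when requests arrive faster than one area can process them. The function name and the rates are hypothetical, chosen only for illustration; real traffic is far less regular.

```python
def backlog_over_time(arrival_rate, service_rate, seconds):
    """Return the number of waiting requests at the end of each second.

    arrival_rate and service_rate are requests per second (assumed values).
    """
    backlog = 0.0
    history = []
    for _ in range(seconds):
        # Requests that cannot be served this second stay in the queue.
        backlog = max(0.0, backlog + arrival_rate - service_rate)
        history.append(backlog)
    return history


# An area that keeps pace: the queue never builds.
print(backlog_over_time(arrival_rate=80, service_rate=100, seconds=5))
# An area that cannot keep pace: the queue grows every second.
print(backlog_over_time(arrival_rate=120, service_rate=100, seconds=5))
```

The second call is the bottleneck case: the backlog keeps growing for as long as the load stays above what the area can serve.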

Finding Bottlenecks

For web application processing, there are generally considered to be five potential bottleneck areas – client, network, web, application, and database. Any of these areas may become a bottleneck in a test or production environment. It helps to start with principles for discovering where bottlenecks lie. In other words, how do you identify where a bottleneck might exist in these areas?

Let’s consider the following principles of web application bottleneck analysis.

  1. Through a process of elimination, reduce the number of potential bottleneck areas. As I mentioned in a previous article, the network area can be eliminated as a potential bottleneck if you agree with the network engineers that it is not the cause. They are usually right in this assumption.
  2. Measure queue counts in each potential bottleneck area. For instance, web, application, and database servers place their respective requests into queues. A long queue is an indication that a constriction may be slowing down responsiveness.
  3. If possible, separate the overall user response time into the five potential bottleneck processing times. Use the timings for comparison. For instance, how long is the database taking to respond? Is the processing time significant in any of the potential bottleneck areas?
  4. Use the bottleneck area processing time to identify potential bottleneck areas. For instance, if the response time average is 5 seconds, and one or more of the areas is 2 seconds or more, consider those high areas the target for finding bottlenecks. Eliminate the other areas.
  5. If load balancers are in use for any of these potential bottleneck areas, verify that the load balancing method in use is not a root cause. Beyond that, make sure the I/O and request queues are not persistently long. If load balancing is not in use for an area other than the client and network areas, consider whether installing load balancers would improve performance.
  6. CPU, memory, and throughput metrics are like queues in that they can fill up to capacity. Capacity constraints result in bottlenecks. Reduce capacity constraints.
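
Principles 3 and 4 can be sketched in a few lines. This is only an illustration: the timings are made up, and the 40% cut-off simply mirrors the "2 seconds out of 5" example above rather than any standard.

```python
AREAS = ("client", "network", "web", "application", "database")


def candidate_areas(timings, share_threshold=0.4):
    """Return the areas whose share of the total response time meets the threshold.

    timings maps each area to its average processing time in seconds.
    share_threshold=0.4 is an assumed cut-off taken from the 2-of-5-seconds example.
    """
    total = sum(timings[area] for area in AREAS)
    if total == 0:
        return []
    return [area for area in AREAS if timings[area] / total >= share_threshold]


# Hypothetical breakdown of a 5-second average response time.
sample = {"client": 0.4, "network": 0.3, "web": 0.8, "application": 1.2, "database": 2.3}
print(candidate_areas(sample))
```

With these sample numbers only the database area crosses the cut-off, so it becomes the target and the other four areas are set aside for now.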

Analyzing Bottlenecks

Before you can adequately analyze a web application, it is necessary to take a decent inventory of the business flows or transactions. In the early stage of a performance test project, it is prudent to gather requirements. One of the deliverables should be an inventory of the business processes. You may want to dig deeper and account for the processes as I am showing in the next three spreadsheets. The objective is to understand:

  • how much data flows between the client and server machines,
  • how much data and how many pages are managed for page rendering, and
  • how much data is processed on the database.

It is important to confirm that the end user experience is within the expected SLA. Analyzing the bottlenecks brings you one step closer to that assurance. Or the analysis alerts you to where the bottleneck issues are lurking.

The larger the inventory of business processes, the more the charts can help. This kind of analysis can be a burden when the number of business flows or transactions is small. But keeping it in mind can take you a long way.

Treat the charts as a set of steps to complete. Step A is a first-level inventory. Here you identify the business processes and specify their functions in terms of data processing: how is the database managed for each business process? Step B takes the business processes and quantifies their impact at each of the performance areas where bottlenecks can occur. This data is crucial to Step C. In Step C the last five columns are transformed into process analysis items. Each process analysis item is assessed for its impact on capacity and data throughput.

Capacity and data throughput are two metrics that are important to monitor during performance testing. Capacity is a good measure of scalability. This is where different types of performance tests (Load, Volume, Stress, and Spike) can reveal the existence of bottlenecks. Using throughput to measure network speed is good for troubleshooting because it communicates data traffic volume. When traffic is moving slowly through the network, there is a higher potential for packet loss. Throughput can be measured in packets per second, bytes per second, or bits per second. It is a good measure for determining whether bandwidth is close to being fully consumed. This can also tell you whether an application is scalable.
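
As a rough sketch of the throughput arithmetic, the snippet below converts a byte count over a measurement interval into bits per second and compares it with the link capacity. The function names and the sample figures are my own assumptions for illustration.

```python
def throughput_bps(bytes_transferred, seconds):
    """Throughput in bits per second over a measurement interval."""
    return bytes_transferred * 8 / seconds


def bandwidth_utilization(bytes_transferred, seconds, link_capacity_bps):
    """Fraction of the link capacity consumed during the interval."""
    return throughput_bps(bytes_transferred, seconds) / link_capacity_bps


# Hypothetical sample: 600 MB moved in 60 seconds over a 100 Mbit/s link.
bits = throughput_bps(600_000_000, 60)
used = bandwidth_utilization(600_000_000, 60, 100_000_000)
print(f"{bits / 1_000_000:.0f} Mbit/s, {used:.0%} of the link")
```

A utilization figure that stays close to 100% during a load test suggests the bandwidth is nearly consumed and the network deserves a closer look.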

Step A

Business Process      Insert  Update  Inquire  Delete  Report
Business Process 1    X
Business Process 2            X
Business Process 3                    X
Business Process 4                             X
Business Process 5                    X                X
Business Process 6    X               X
Business Process 7            X       X
Business Process 8                    X        X
Business Process 9    X                                X
Business Process 10           X                        X
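
If you prefer to keep the Step A inventory as data rather than as a spreadsheet, a sketch like the following may help. The dictionary layout and the `request_type` helper are hypothetical; the helper only builds a label in the style of the request types used in Step B.

```python
OPERATIONS = ("insert", "update", "inquire", "delete", "report")

# Sample Step A rows: each business process maps to the operations it performs.
inventory = {
    "Business Process 1": {"insert"},
    "Business Process 5": {"inquire", "report"},
    "Business Process 9": {"insert", "report"},
}


def request_type(operations):
    """Build a Step B style label such as 'Inquire & Report requests'."""
    unknown = set(operations) - set(OPERATIONS)
    if unknown:
        raise ValueError(f"unknown operations: {sorted(unknown)}")
    # Put "inquire" first, then the rest in column order, to match the Step B wording.
    ordered = sorted(operations, key=lambda op: (op != "inquire", OPERATIONS.index(op)))
    names = " & ".join(op.capitalize() for op in ordered)
    return f"{names} only requests" if len(ordered) == 1 else f"{names} requests"


for process, operations in inventory.items():
    print(process, "->", request_type(operations))
```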

Step B

Request Type                    Client       Network        Web       Application  Database
User Login and Logoff requests  Single Page  Single Packet  User      Connect      Update
Insert only requests            Single Page  Single Packet  Data      Post         Insert
Update only requests            Single Page  Single Packet  Data      Patch        Update
Inquire only requests           Single Page  Single Packet  Data      Get          Read
Delete only requests            Single Page  Single Packet  Data      Delete       Delete
Inquire & Report requests       Mult Page    Mult Packets   Page/Rpt  Get/Put      Inquire
Inquire & Insert requests       Single Page  Single Packet  Page      Get/Post     Read/ISRT
Inquire & Update requests       Single Page  Single Packet  Page      Get/Patch    Read/UPD
Inquire & Delete requests       Single Page  Single Packet  Page      Get/Delete   Read/DEL
Insert & Report requests        Mult Page    Mult Packets   Page/Rpt  Post/Put     Insert
Update & Report requests        Mult Page    Mult Packets   Page/Rpt  Patch/Put    Update

Step C

Process Analysis                        Capacity Impact  Throughput Impact
Single page Client activity             Low              Low
Multi-page Client activity              Moderate-High    Moderate-High
Single packet network traffic           Low-Moderate     Low-Moderate
Multi-packet network traffic            Moderate-High    Moderate-High
User request web transmission           Moderate-High    Moderate-High
Data request web transmission           Moderate         Moderate
Page & Report request web transmission  Moderate-High    Moderate-High
Page request web transmission           Moderate-High    Moderate-High
Application Connect requests            Moderate         Moderate
Application Post requests               Moderate-High    Moderate-High
Application Patch requests              Moderate-High    Moderate-High
Application Get requests                Moderate         Moderate
Application Delete requests             Moderate-High    Moderate-High
Application Get/Put requests            Moderate-High    Moderate-High
Application Get/Post requests           Moderate-High    Moderate-High
Application Get/Patch requests          Moderate-High    Moderate-High
Application Get/Delete requests         Moderate-High    Moderate-High
Application Post/Put requests           Moderate-High    Moderate-High
Application Patch/Put requests          Moderate-High    Moderate-High
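
One way to put Step B and Step C together is to look up the rating of every item a request type touches and keep the highest one. The sketch below does that for two request types using a handful of the Step C capacity ratings. The names `CAPACITY_IMPACT`, `REQUEST_ITEMS`, and `highest_impact` are hypothetical, and taking the highest rating is my own simplification.

```python
# Ratings ordered from lowest to highest impact.
LEVELS = ["Low", "Low-Moderate", "Moderate", "Moderate-High"]

# A few Step C capacity ratings, copied from the chart.
CAPACITY_IMPACT = {
    "Single page Client activity": "Low",
    "Multi-page Client activity": "Moderate-High",
    "Single packet network traffic": "Low-Moderate",
    "Multi-packet network traffic": "Moderate-High",
    "Data request web transmission": "Moderate",
    "Page & Report request web transmission": "Moderate-High",
    "Application Get requests": "Moderate",
    "Application Get/Put requests": "Moderate-High",
}

# Two Step B rows expressed as the Step C items they touch.
REQUEST_ITEMS = {
    "Inquire only requests": [
        "Single page Client activity",
        "Single packet network traffic",
        "Data request web transmission",
        "Application Get requests",
    ],
    "Inquire & Report requests": [
        "Multi-page Client activity",
        "Multi-packet network traffic",
        "Page & Report request web transmission",
        "Application Get/Put requests",
    ],
}


def highest_impact(request_type):
    """Return the highest capacity rating among the items a request type touches."""
    ratings = [CAPACITY_IMPACT[item] for item in REQUEST_ITEMS[request_type]]
    return max(ratings, key=LEVELS.index)


for name in REQUEST_ITEMS:
    print(name, "->", highest_impact(name))
```

Request types that come out at the top of the scale are the ones I would watch most closely during a test run.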

Resolving Bottlenecks

In each bottleneck area the same constrictions are possible. The performance constrictions that are common to the bottleneck areas follow:

  • CPU utilization – a computer’s usage of processing resources, or the amount of work handled by a CPU.
  • Memory utilization – the percentage of available memory in use at a given instant in time, usually reported as an average.
  • Network utilization – the proportion of the current network traffic to the maximum amount of traffic that can be handled. It is normally expressed as bandwidth consumption.
  • Operating System limitations – an operating system that lacks adequate task management features, virus protection software, or thread management software.
  • Disk usage – the percentage of computer hard disk currently being used to execute programs and carry out I/O tasks. This includes database and other file I/O tasks.
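
The utilization measures above can be checked against a limit in the same way for every resource. The following is a minimal sketch, assuming utilization samples expressed as percentages. The 80% limit, the "most of the samples" rule, and the readings are all assumptions of mine, not recommended values.

```python
def persistently_high(samples, limit=80.0, fraction=0.8):
    """True when at least `fraction` of the samples are at or above `limit` percent."""
    if not samples:
        return False
    over = sum(1 for value in samples if value >= limit)
    return over / len(samples) >= fraction


def constrained_resources(readings, limit=80.0, fraction=0.8):
    """Names of the resources whose utilization stays near capacity."""
    return sorted(name for name, samples in readings.items()
                  if persistently_high(samples, limit, fraction))


# Hypothetical utilization samples (percent) for one server during a test run.
readings = {
    "cpu": [91, 95, 88, 97, 93],
    "memory": [60, 62, 61, 63, 59],
    "network": [40, 85, 42, 38, 41],
    "disk": [82, 79, 90, 86, 88],
}
print(constrained_resources(readings))
```

Note that the single network spike is ignored. Only resources that stay high across most of the samples are reported.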

For general help with tuning an application to mitigate performance bottlenecks, consider reviewing the links below. They are not exhaustive but can help lead you in the right direction.

Hardware Upgrades

Resolving bottlenecks by replacing hardware should always be a consideration unless the client does not have an adequate budget. If that is true, it is not the end of the world. There are other actions available. But when the budget is available, consider the following:

  • Memory issues can be resolved by installing larger memory modules
  • CPU utilization issues can be resolved with CPU upgrades
  • I/O issues can be resolved by disk upgrades
  • Network interface cards (NICs) can be upgraded
  • Old desktops, laptops, and servers can be replaced with newer machines
  • Network routers, switches, hubs, bridges, modems, and gateways can be upgraded or replaced

Software Upgrades

When hardware upgrades are not possible or not enough, software changes must be considered. The operating system software on each client machine and server used by an application should be under consideration.

The software scope includes:

  • Browsers – either upgrade or change to another that meets the need
  • Operating system release on client and server machines
  • Middleware release on servers
  • ERP vendor application releases, when involved

Application Improvements

Sometimes an application is homemade; that is, it was developed internally rather than purchased. In this case, the application design itself can be a bottleneck, with bad SQL calls, poor memory management, and high disk I/O. Sometimes business processes are designed poorly, so that too much data is sent to the client and many packets must be sent before a dialog or business request is complete.

Database Tuning

In the early part of this segment, I mentioned a couple of links regarding the use of Oracle or SQL Server DBMS. There are other database brands, such as MySQL and PostgreSQL. What about them?

Well, I intend this part of the discussion to be kept general. With that in mind, there are two essential thoughts for consideration.

First, consider the SQL statements themselves as one of the areas to address when tuning database activity.

  • Use SELECT fields instead of using SELECT *…
  • Refrain from using SELECT DISTINCT…
  • Create joins with INNER JOIN (not WHERE) …
  • Use WHERE instead of HAVING to define filters …
  • Use wildcards at the end of a phrase only …
  • Use LIMIT to sample query results.
  • Reference this link for more detail: DB Tuning
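
To show what several of these suggestions look like together, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The tables, column names, and data are invented for illustration. LIMIT is the syntax used by SQLite, MySQL, and PostgreSQL, while SQL Server and Oracle use different syntax for capping result rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Adams', 'Austin'), (2, 'Baker', 'Boston'), (3, 'Adler', 'Akron');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 75.0), (3, 2, 20.0), (4, 3, 300.0);
""")

# Name only the columns needed, join with INNER JOIN, filter rows with WHERE
# before grouping, keep the wildcard at the end of the phrase, and cap the
# result with LIMIT.
query = """
    SELECT c.name, SUM(o.total) AS spent
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    WHERE c.name LIKE 'Ad%'
    GROUP BY c.name
    ORDER BY spent DESC
    LIMIT 10
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Whether each change actually speeds up a query depends on the DBMS and its optimizer, so it is worth comparing execution plans before and after.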

Second, utilize DB tuning tools where possible. Oracle and SQL Server have built-in tuning software and extensive documentation on the subject. Tuning information exists for some other DBMS offerings as well, but you will need to do some research of your own.

Network Tuning

A little research on the internet via search engines reveals this: “One of the principle causes of poor data transfer performance is packet loss between the data transfer client and server hosts. There are many possible causes of packet loss, ranging from bad or failing hardware to misconfigured hosts or network equipment.” This is another reason I endorse getting network engineers involved in your performance test project if they are available. They can be very helpful if you engage them as the experts in their field. They know what to do with certain observations you present in your project analysis reporting.

Closing Remarks

Performance analysis and performance tuning are not exact sciences. But following a methodical approach will lead you to identify any bottlenecks that exist in the five potential bottleneck areas. In each of the areas are computers or servers that can be affected by CPU, memory, and disk I/O management issues.

Performance test tools can work together with Application Performance Monitoring (APM) tools. These tools work in tandem. While the performance test tool creates the traffic for the application business processes, the APM tool collects valuable performance metrics for all of the computers it has been configured to monitor. The result is performance analysis data from both tools that can be correlated to tell the story: were any performance bottlenecks encountered?
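
As a simple illustration of correlating the two data sources, the sketch below pairs response times from a load test with CPU samples from an APM tool by timestamp. The data layout, the thresholds, and the function name are assumptions of mine; real tools export richer data.

```python
def correlate(load_results, apm_samples, sla_seconds, cpu_limit):
    """Return the timestamps where slow responses coincide with high CPU.

    load_results maps a timestamp to the average response time in seconds.
    apm_samples maps the same timestamps to CPU utilization in percent.
    """
    suspects = []
    for timestamp, response_time in sorted(load_results.items()):
        cpu = apm_samples.get(timestamp)
        if cpu is not None and response_time > sla_seconds and cpu >= cpu_limit:
            suspects.append(timestamp)
    return suspects


# Hypothetical one-minute intervals from a load test and a database server.
load_results = {"10:00": 1.8, "10:01": 2.1, "10:02": 4.6, "10:03": 5.2}
database_cpu = {"10:00": 45, "10:01": 52, "10:02": 91, "10:03": 96}
print(correlate(load_results, database_cpu, sla_seconds=3.0, cpu_limit=85))
```

Intervals that show up in the result are where I would start looking for the root cause on that server.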

One more website that I think can help with a better understanding of web metrics posted an article entitled Web Application Performance Metrics. It focuses on the top 8 metrics that are important to track and interpret.

Let me know if there is more detail you would like to have to help you with performance testing, monitoring, and analyzing.