Thursday, January 5, 2012

Performance Test Monitoring


Performance monitoring is the process of collecting and analyzing server data to compare the server statistics against expected values. It helps the performance tester keep a health check on the server during the test. By monitoring the servers during the test, one can identify how the server behaves under load and take steps to change that behavior through software or hardware performance tuning activities.
Each performance counter identifies a specific aspect of server performance. For example, % CPU Utilization is a performance counter that indicates the utilization level of the CPU.
In a nutshell, server monitoring should provide information on four parameters of any system: Latency, Throughput, Utilization and Efficiency, which help in answering the following questions.

      • Is your server available?
      • How busy is your CPU?
      • Is there enough primary memory (RAM)?
      • Is the disk fast enough?
      • Are there any other hardware issues?
      • Is a hardware issue the result of software malfunctioning?
Always start with a few counters, and once you notice a specific problem, add a few more counters related to the symptom. Start by monitoring the performance of the major resources: CPU, memory, disk and network. This section provides details about key counters from each of these four areas that are very important for a performance tester to know.


Processor Bottlenecks
Bottlenecks related to the processor (CPU) are comparatively easy to identify. The important performance counters that help in identifying a processor bottleneck include:

% Processor Utilization (Processor_Total: % Processor Time) – This counter shows how busy the system is. It is the average percentage of elapsed time that the processor spends executing productive (non-idle) threads. A consistent level of more than 80% utilization (in the case of single-CPU machines) indicates that there is not enough CPU capacity and is worth further investigation using the other processor counters.

% User Time (Processor_Total: % User Time) – This refers to the processor time spent handling application (user-mode) processes. A high percentage indicates that the application is consuming a lot of CPU. The process-level counters need to be monitored to understand which user process consumes the most CPU.

% Privileged Time (Processor_Total: % Privileged Time) – This refers to the processor time spent handling kernel-mode processes. A high value indicates that the processor is too busy handling operating system activities. It needs immediate attention from the system administrator to check the system configuration or services.

% I/O Wait (%wio on UNIX platforms) – This refers to the percentage of time spent waiting for I/O activity to complete. It is a good indicator of whether threads are waiting on I/O completion.

Processor Queue Length (System: Processor Queue Length) – This counter identifies how many threads are waiting in the queue for execution. A consistent queue length of more than 2 indicates a bottleneck and is worth investigating. Generally, if the queue length is greater than the number of CPUs in the system, system performance may suffer. A high % User Time coupled with a high processor queue length indicates a processor bottleneck.

Other counters of interest:
Counters like Processor: Interrupts per second and System: Context Switches per second can be used in case of specific issues. Interrupts per second is the number of interrupts that hardware devices send to the processor. A consistent value above 1000 interrupts per second may indicate hardware failure or driver configuration issues. A context switch occurs when the processor switches from a lower-priority thread to a higher-priority one. A consistent value above 15000 per second per processor indicates the presence of too many threads of the same priority and, possibly, blocked threads.

The Windows operating system comes with a performance monitoring tool called Perfmon, which can be used to collect the server statistics during the performance test run. Most performance testing tools also have their own monitors for the system resources of the system under test, which makes it easy to compare the system load metrics with the resource utilization metrics to arrive at a conclusion. In fact, any performance testing tool that monitors a Windows machine internally talks to Perfmon to collect the resource utilization details.
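To make that concrete, here is a minimal sketch (not production code) of how a tool can read a Perfmon counter programmatically through the Windows PDH (Performance Data Helper) API. The counter path, the one-second sampling interval and the sample count are illustrative choices; link against pdh.lib.

#include <windows.h>
#include <pdh.h>
#include <stdio.h>

int main(void)
{
    PDH_HQUERY query;
    PDH_HCOUNTER counter;
    PDH_FMT_COUNTERVALUE value;
    int i;

    /* Open a query and add the counter Perfmon shows as
       Processor(_Total)\% Processor Time */
    if (PdhOpenQuery(NULL, 0, &query) != ERROR_SUCCESS)
        return 1;
    if (PdhAddCounter(query, "\\Processor(_Total)\\% Processor Time",
                      0, &counter) != ERROR_SUCCESS)
        return 1;

    PdhCollectQueryData(query);          /* first sample primes the rate counter */

    for (i = 0; i < 10; i++) {           /* take 10 one-second samples */
        Sleep(1000);
        PdhCollectQueryData(query);
        PdhGetFormattedCounterValue(counter, PDH_FMT_DOUBLE, NULL, &value);
        printf("%% Processor Time: %.2f\n", value.doubleValue);
    }

    PdhCloseQuery(query);
    return 0;
}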
There are many licensed tools in the market used for post-production monitoring. These tools capture web server traffic details and provide online traffic reports. A very popular tool in this category is WebTrends; many organizations use it to study the traffic trends of an application running in a production environment. Other tools, like the HP OpenView suite, run on production servers and monitor server resource utilization levels; they provide an alarm mechanism to indicate heavy usage and offer easy bottleneck isolation capabilities. Due to the cost involved, most small organizations don't opt for these kinds of tools, but post-production monitoring data is of great use for designing realistic performance tests. In short, server monitoring:
      • Allows you to analyze and isolate performance problems.
      • Helps you understand the resource utilization of the server and make the best use of it.
      • Feeds capacity planning activities based on the resource utilization levels.
      • Provides the server performance details offline (by raising an alert or sending a mail/message when resource utilization reaches a threshold value).

Tips to define the Performance Test Strategy

Best practices in defining performance test strategy
 
Most applications need Load tests, Stress tests and Stability (Endurance) tests to be planned before moving to production. For data-voluminous applications there may be more emphasis on Volume testing, and some applications, such as auction sites, may emphasize Spike tests with sudden load surges. For most other applications, it is common to plan at least 3 cycles of load tests followed by a stress test and an endurance test run (of at least 12 hours). (This is the general case; based on the application context there will certainly be changes to this approach.)

For any application, irrespective of time availability, it is always recommended to start with a Baseline test. A baseline test is simply a one-user test, run to verify the correctness of the test scripts; it also helps in checking whether the application meets the SLAs (Service Level Agreements) at a 1-user load. These values can be used as a benchmark for newer versions to measure performance improvement.
Next comes the Benchmark test, run at 15-20% of the target load. It helps in verifying the correctness of the test scripts and tests the readiness of the system before running the target load tests.
It is a very good practice to always start with Baseline and Benchmark tests on any application before conducting the scheduled load tests; in fact, this is the approach Scott Barber recommends.
When it comes to the scheduled Load Tests, always plan to run at least 3 rounds. Even if you use a slow ramp-up, it is advisable to have 3 individual load tests at 50%, 75% and 100% of the target load. (The load levels should be defined based on the system's performance rather than mechanically at 50%, 75% and 100% of target.)

Test Scenario - Have a slow ramp-up, followed by a stable period of at least an hour, and then a ramp-down. During the stable period, the target user load should perform various operations on the system with realistic think times. All the metrics measured should correspond only to the stable period, not to the ramp-up/ramp-down periods. Don't draw conclusions about a transaction's response time based on just 1 or 2 iterations; the server should be observed for a minimum of 5 iterations at the same load level, because there could be a one-off reason for an unusually high or low response time at any single point in time. That is why you should watch the server performance for at least 5 iterations at the same load level (during the stable period) and use the 90th percentile response time to report the response time metrics.

Test Metrics - Look at the 90th percentile response time and the standard deviation, of course along with hits/sec and the other server resource monitors. The higher the standard deviation (above 1), the more bursty the behavior, and it is recommended to rerun the test.
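For illustration, here is a small C sketch that computes the mean, standard deviation and 90th percentile from a set of response time samples. The sample values are made up; a real test produces thousands of samples.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static int cmp_double(const void *a, const void *b)
{
    double d = *(const double *)a - *(const double *)b;
    return (d > 0) - (d < 0);
}

int main(void)
{
    /* made-up response times in seconds, including one 4.8 s outlier */
    double rt[] = { 1.2, 1.4, 1.1, 1.3, 4.8, 1.2, 1.5, 1.3, 1.6, 1.4 };
    int n = sizeof(rt) / sizeof(rt[0]);
    double sum = 0.0, sumsq = 0.0, mean, stddev;
    int i, idx;

    for (i = 0; i < n; i++) {
        sum += rt[i];
        sumsq += rt[i] * rt[i];
    }
    mean = sum / n;
    stddev = sqrt(sumsq / n - mean * mean);   /* population std deviation */

    /* sort, then take the value below which 90% of the samples fall */
    qsort(rt, n, sizeof(double), cmp_double);
    idx = (int)ceil(0.9 * n) - 1;

    printf("Mean %.2f s, StdDev %.2f s, 90th percentile %.2f s\n",
           mean, stddev, rt[idx]);
    return 0;
}

Note how the single 4.8-second outlier inflates the mean and pushes the standard deviation above 1, while the 90th percentile barely moves; that is exactly why the percentile is the better number to report.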

Load Tests should always be followed by Stress Tests - Based on the load test results, increase the server load step by step and find the server's breaking point. For this test, realistic think time and cache settings are not required, as the objective is to learn where the server breaks and how it fails.

Endurance (Stability) Tests need to be run for at least 10-12 hours in order to identify memory bottlenecks. They need not be run at peak load; average load levels are sufficient.

Directions for fellow Performance Testers

Before certifying the application's performance, are you satisfied with your own performance? Are you sure that you are conducting an effective performance test? Have you ever thought about how to validate your performance test strategy, and are you sure that your performance test report is bug free?

The following tips will help you gain more maturity and do your job better.

Yes, you need to be comfortable with the terminology used in the performance engineering field, learn one or two popular and effective performance test tools, and develop bottleneck analysis skills. Those are the basic skills required of a performance tester. Then you are mature enough to look back at your own performance and ask how effective you are in your activities.

That's where you need to start looking at analytical modeling and simulation modeling approaches to move to your next level of expertise.

The foremost thing is to understand the operational laws and start applying them in your day-to-day test activities.

Secondly, start learning about workload modeling concepts and statistical distribution fitting, and apply those lessons to define performance testing goals/SLAs.

Thirdly, start learning about demand planning and performance prediction techniques. Learn Mean Value Analysis and Erlang's method.

The fourth step is to start learning about capacity planning.

This is the second level of maturity you need to gain. The third, advanced level is moving towards simulation modeling and advanced capacity planning.

I hope this helps you set your career path and plan to work towards it.

How do you know if your Load Test has a bottleneck?

The bottleneck in a system may not be obvious. (Life would be easier, but less fun, if they were always easy to find.) This is because there are two types: "hard" and "soft". A hard bottleneck is one where a resource such as the CPU is working flat out, which limits the ability of the system to process more transactions. A soft bottleneck is some internal limit, such as the number of threads or connections, which, once they are all in use, limits the ability to process more transactions.

So how do you know if you have a bottleneck? If you are looking at the results from a single load test you may not be able to tell; you need to run multiple load tests at different numbers of virtual users and then see whether the number of transactions per second increases with each increase in virtual users. The results can be seen in the two graphs below. The first shows how the throughput (transactions per second) increases and then levels off once saturated; the second shows the response time. You will probably have heard the expression "below the knee of the curve": the knee is the point just to the left of the bend in the response time graph.

Throughput Graph

Response Time Graph

The graphs above were actually generated using a spreadsheet model of the performance of a closed-loop system. This is like LoadRunner and other testing tools, where there is a fixed number of users who use the system, then wait, and then return to the system. In reality, the performance graphs may look different from the expected norm. An example from a LoadRunner test is shown below: the first graph shows how the number of VUsers was increased during the test, and the second graph shows the increase in response times. In this case the jump in response time is dramatic. However, in some cases the increase in response time will be less dramatic, because the system will start to produce errors at high loads, which distorts the response time figures.
Example LoadRunner VUser Graph

Example LoadRunner Graph Showing Increasing Response Times

Having discovered there is a bottleneck in the system, you then have to start looking for it.
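Incidentally, the idealized throughput and response time curves above are easy to reproduce yourself with a few lines of code instead of a spreadsheet. Below is a sketch of exact Mean Value Analysis (the technique mentioned earlier on this blog) for a closed model with a fixed user population and think time; the three service demands and the think time are made-up illustrative values.

#include <stdio.h>

#define K 3                              /* queueing stations: CPU, disk, net */

int main(void)
{
    double D[K] = { 0.05, 0.04, 0.02 };  /* service demand per station (sec) */
    double Z = 3.0;                      /* user think time (sec) */
    double Q[K] = { 0.0 };               /* mean queue length at each station */
    double R, Rk, X;
    int n, k;

    for (n = 1; n <= 100; n++) {         /* add one virtual user at a time */
        R = 0.0;
        for (k = 0; k < K; k++) {
            Rk = D[k] * (1.0 + Q[k]);    /* residence time at station k */
            R += Rk;
            Q[k] = Rk;                   /* hold Rk; rescaled by X below */
        }
        X = n / (R + Z);                 /* interactive response time law */
        for (k = 0; k < K; k++)
            Q[k] *= X;                   /* Little's law: Q = X * R */
        if (n % 10 == 0)
            printf("N=%3d  X=%6.2f tps  R=%6.3f sec\n", n, X, R);
    }
    return 0;
}

With these numbers the throughput levels off near 1/0.05 = 20 tps, and the knee appears at roughly (Z + D1 + D2 + D3) / 0.05, i.e. around 62 users, matching the shape of the graphs above.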

Importance of Little's Law

Little's law is quite simple and intuitively appealing.

The law states that the average number of customers in a system (over some time interval), N, is equal to their average arrival rate, X, multiplied by their average time in the system, R.

N = X . R (or, for easy remembrance, L = A . W)

This law is very useful for checking that the load testing tool itself is not a bottleneck.
For example, in a shop, if there are always 2 customers in the counter queue, and customers enter the shop at the rate of 4 per second, then the average time a customer spends in the queue can be calculated as

N = X . R
R = N / X = 2 / 4 = 0.5 seconds


A simple example of how to use this law to work out how many virtual user licenses are required:

Peak user load per hour = 2000 users
Expected response time per transaction = 4 seconds
Peak page hits/sec = 80 hits/sec

To carry out performance tests for the above web system, we need to calculate how many virtual user licenses to purchase. The throughput X is 80 hits/sec and the time each request spends in the system, R, is 4 seconds, so:

N = X . R
N = 80 x 4 = 320

Therefore 320 virtual user licenses are enough to carry out the Load Test.
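For completeness, a tiny C sketch that works through both of the calculations above:

#include <stdio.h>

int main(void)
{
    double N_shop = 2.0, X_shop = 4.0;   /* customers in queue, arrivals/sec */
    double X = 80.0, R = 4.0;            /* hits/sec, seconds per transaction */

    /* Shop example: rearrange N = X . R to get R = N / X */
    printf("Time in the queue R = %.1f seconds\n", N_shop / X_shop);

    /* Licensing example: N = X . R */
    printf("Virtual user licenses needed N = %.0f\n", X * R);
    return 0;
}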

Thursday, November 10, 2011

Welcome to Performance Testing Blog

Hi, I am Sudhakar. I am currently working as a Performance Test Engineer at an MNC in Hyderabad.

I provide training sessions (online and regular classes) on performance testing with the LoadRunner and Rational Performance Tester (RPT) tools, based on real-time processes.

If anybody is interested in learning performance testing tools, you can reach me at sudhakar6r@gmail.com.

Siebel Correlation with custom code in LoadRunner

"Siebel_Star_Array" is a function generated by LoadRunner for correlating Siebel field values separated by the token "*". A sample format of the server response is given below; these field values need to be passed to the subsequent requests.

"1*N9*1-18506217*GEOFFTH4*Open10*Unassigned3*OEM12*Hardware-OEM14*Administrative1*01*010*Symptom X18*MSE-Perf7*GEOFFTH1*112*Geoff Thomas1*10*6* Normal19*10/16/2002 03:40:127*Sev - 47*1-13NY5"

The trailing digit(s) of each field denote the length of the next field's value. For example, the leading "1*N" says the first field is 1 character long ("N"); the "9" at the end of "N9" says the next field ("1-1850621") is 9 characters long, and so on. Because of these shifting length prefixes, the left and right boundaries are dynamic and difficult to correlate.

The "Siebel_Star_Array" correlation function provided with LoadRunner (registered below) has some limitations and does not work at all times, and it is tough to debug the errors without its source code:

web_reg_save_param("WCSParam96",
"LB/IC=`ValueArray`",
"RB/IC=`",
"Ord=10",
"Search=Body",
"RelFrameId=1",
"AutoCorrelationFunction=flCorrelationCallbackParseStarArray",
LAST);

The sample code I created below can be used to parse the response and use it for correlation, as an alternative to LoadRunner's automatic correlation.

vuser_init()
{
    char str[1024];
    char separators[] = "*";
    char *token;
    char arrValues[50][20];      /* raw tokens, still carrying length prefixes */
    char arrFinalValues[50][20]; /* cleaned field values */
    int i;
    int j;
    int length_old;
    int length_new;
    char newlength[3];           /* holds a 1- or 2-digit length plus '\0' */

    /* ****************** Sample text format ****************** */

    strcpy(str, "1*N9*1-18506217*GEOFFTH4*Open10*Unassigned3*OEM12*Hardware-OEM14*Administrative1*01*010*Symptom X18*MSE-Perf7*GEOFFTH1*112*Geoff Thomas1*10*3 - Normal19*10/16/2002 03:40:127*Sev - 47*1-13NY5");

    lr_output_message("%s", str);

    /* Split the text into tokens based on the "*" separator */
    token = (char *)strtok(str, separators); /* Get the first token */
    if (!token) {
        lr_output_message("No tokens found in string!");
        return -1;
    }

    i = 0;
    while (token != NULL) { /* While valid tokens are returned */
        strcpy(arrValues[i], token);
        lr_output_message("Initial array value is %s", arrValues[i]);
        token = (char *)strtok(NULL, separators); /* Get the next token */
        i++;
    }

    /* Remove the trailing characters: the digit(s) at the end of each
       token are the length prefix of the next field's value */
    for (j = 0; j < i; j++)
    {
        if (j == 0) {
            /* the very first token is itself the length of the next field */
            strcpy(arrFinalValues[j], arrValues[j]);
            length_old = strlen(arrValues[j]);
        }
        else {
            length_new = strlen(arrValues[j]);
            strncpy(arrFinalValues[j], arrValues[j], length_old);
            arrFinalValues[j][length_old] = '\0'; /* strncpy does not terminate */
            if (length_new > length_old + 1) {
                /* two characters left over: a two-digit length prefix */
                sprintf(newlength, "%c%c", arrValues[j][length_old], arrValues[j][length_old + 1]);
                length_old = atoi(newlength);
            }
            else {
                /* one character left over: a single-digit length prefix */
                sprintf(newlength, "%c", arrValues[j][length_old]);
                length_old = atoi(newlength);
            }
        }
    }

    /* Final data in the array */
    for (j = 0; j < i; j++)
    {
        lr_output_message("Value after removing trailing characters: %s", arrFinalValues[j]);
    }
    return 0;
}