Monday, August 25, 2014

How to Improve Database Performance by Measuring User Experience



What is Response Time Analysis?
Response time analysis is a new approach to application and database performance improvement that allows DBAs and developers to manage their databases guided by the most important criterion: what causes application end-users to wait. Also referred to as wait time analysis, it allows IT teams to align their efforts with service-level delivery for IT customers.

The picture above illustrates the Response Time monitoring process: each SQL query request passes through the database instance, and by measuring the time spent at each step, the total Response Time can be broken down and analyzed.



Rather than watching server health statistics and making guesses about their performance impact, wait and response time methods measure the time taken to complete a desired operation. The best implementations break the time down into discrete, individually measurable steps and identify exactly which steps in which operations cause application delays. Since the database's primary mission is to respond with a result, response time is the most important criterion in making database performance decisions.

Response Time = Processing Time + Waiting Time
Response time is defined as the sum of actual processing time and the time a session spends waiting on the availability of resources such as a lock, a log file, or hundreds of other Wait Events or Wait Types. Even when a session has access to the CPU, it is not necessarily being actively processed: often the CPU is waiting for an I/O or other operation to complete before processing can continue. When multiple sessions compete for the same processing resources, wait time becomes the most significant component of the actual Response Time.
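
To make the decomposition concrete, Oracle's system time model exposes both sides of this equation. A minimal sketch (other databases expose analogous counters):

select stat_name,
       round(value / 1000000, 1) as seconds   -- values are reported in microseconds
from   v$sys_time_model
where  stat_name in ('DB time', 'DB CPU');

Here 'DB time' is the total time sessions spent in database calls and 'DB CPU' is the portion spent actively processing, so the difference approximates the waiting component.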

Wait Events and Wait Types
 
To accurately measure the Response Time for a database, the discrete steps that accumulate time must be identified. The steps corresponding to physical I/O operations, buffer manipulation, lock waits, and all the other minute database processes are instrumented by the database vendors. In SQL Server these steps are called Wait Types; in Oracle, Sybase and DB2 they are referred to as Wait Events. While the specifics are unique to each vendor, the general idea is the same: the Wait Types/Events record the amount of time sessions spend waiting for each database resource. If the Wait Types/Events can be accurately monitored and analyzed, the exact bottlenecks and the queries causing the delays can be determined.
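
For example, in Oracle the time accumulated per Wait Event since instance startup can be listed from v$system_event (SQL Server keeps the equivalent numbers per Wait Type in sys.dm_os_wait_stats). A minimal sketch, excluding idle waits:

select * from (
    select event, total_waits,
           round(time_waited_micro / 1000000, 1) as seconds_waited
    from   v$system_event
    where  wait_class <> 'Idle'           -- idle waits are not user-visible delay
    order  by time_waited_micro desc
) where rownum <= 10;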

Differences vs. Conventional Statistics
 
Typical database performance monitoring tools focus on server health measures and execution ratios. Even with a sophisticated presentation, these statistics reflect neither the end-user experience nor where a problem originated. Knowing that an operation took place millions of times does not tell you whether it actually caused an application delay.

Key criteria that distinguish Response Time analysis from conventional methods:

  • Measure the response time of an action, from receipt of the request to the beginning of the response.
  • Measure each SQL query separately, so the response time effects of a specific SQL statement can be isolated and evaluated. Measuring total response time across the whole instance does not give useful information.
  • Identify the discrete internal steps (Wait Types/Events) a SQL query passes through as it is processed; treating the instance as a black box, without seeing where the time is consumed internally, does not help problem solving (see the sketch below).
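
As an illustration of the last two points, here is a rough sketch against Oracle's Active Session History (a view that requires the Diagnostics Pack license) attributing sampled time to individual SQL statements and the internal steps they waited on:

select sql_id,
       case when session_state = 'ON CPU' then 'ON CPU' else event end as time_spent_on,
       count(*) as ash_seconds             -- each sample represents roughly 1 second
from   v$active_session_history
where  sample_time > sysdate - 1/24       -- last hour
group  by sql_id,
       case when session_state = 'ON CPU' then 'ON CPU' else event end
order  by ash_seconds desc;

The top rows answer both questions at once: which SQL statements consumed the time, and on which Wait Events they spent it.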

Practical Considerations for Response Time Analysis
The Response Time approach to performance monitoring is only practical if it can be implemented efficiently in a performance-sensitive production environment. Confio uses low-impact, agentless technology to meet this requirement. Here are some practical considerations:
  • Low-Impact Data Capture: Data capture should not place a burden on your production systems. Agentless architectures offload processing to a separate system, reducing the impact on the production database to less than 1%.
  • Agentless Database Operation: Eliminates the need to test, install and maintain software on production servers.
  • Passive Monitoring of Production Data: Monitor real production sessions, not simulated test transactions.
  • Continuous 24/7 Monitoring: Insist on continuous monitoring across all sessions on all servers so that any operation can be examined in depth at any time. Occasional trace files will not provide continuous coverage.

Do You Know If Your Database Is Slow?



The time to respond:

There was a question at Pythian a while ago about how to monitor Oracle database instance performance and alert if there is significant degradation. That got me thinking: while different DBAs would take different approaches to interactively measure current instance performance, here we need something simple. It would need to give a decisive answer, being able to say either "current performance is not acceptable" or "current performance is within normal (expected) limits".

Going back to the basics of how database performance can be described, we can simply say that database performance is either the response time of the operations the end-users perform and/or the amount of work the database instance does in a certain period of time, that is, its throughput.

We can easily find these metrics in the v$sysmetric dynamic view:

SQL> select to_char(begin_time,'hh24:mi') time, round(value * 10, 2) "Response Time (ms)"
from v$sysmetric
where metric_name = 'SQL Service Response Time';

TIME            Response Time (ms)
--------------- ------------------
07:20                          .32


So this is the last-minute response time for user calls, in milliseconds. We can check the throughput by looking at the number of logical blocks read per second (this already includes blocks read in from disk), plus direct-path reads, which bypass the buffer cache and so are not counted among the logical reads. The output below shows the last-minute and last-several-seconds intervals for a database with an 8 KB block size (131072 is the number of 8 KB blocks in one gigabyte):

SQL> select a.begin_time, a.end_time, round(((a.value + b.value)/131072),2) "GB per sec"
from v$sysmetric a, v$sysmetric b
where a.metric_name = 'Logical Reads Per Sec'
and b.metric_name = 'Physical Reads Direct Per Sec'
and a.begin_time = b.begin_time;

BEGIN_TIME           END_TIME             GB per sec
-------------------- -------------------- ----------
16-jun-2013 08:51:36 16-jun-2013 08:52:37        .01
16-jun-2013 08:52:22 16-jun-2013 08:52:37        .01

We can check more historical values through v$sysmetric_summary, v$sysmetric_history and dba_hist_sysmetric_summary.

So did these queries answer the basic question, "Do we have bad performance?" A throughput of 0.01 GB/sec and 0.32 ms per user call? We have seen better performance, but is it bad enough that we should alert the on-call DBA to investigate in more detail and look for the reason we are seeing these values? We cannot say. We need something to compare the values to, so that we can determine whether they are too low or too high.

It is somewhat like being in a train that passes another train moving in the same direction at a different speed. We don't know the speed of our train, and we don't know the speed of the other train, so we cannot answer the question "Are we going very fast?" If we turn to the other side and see a tree passing by, we can estimate the speed of our train (also drawing on our experience of what counts as very fast for a train). So we need something with an absolute value. In the case of the tree, we know that the tree has a speed of 0.
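
One way to get such a fixed reference point is to compare the current value with the instance's own history. A rough sketch, assuming AWR is available (dba_hist_sysmetric_summary requires the Diagnostics Pack license); averaging the standard deviations is crude, but it is enough for a first-cut alert threshold:

select round(avg(average) * 10, 2) as avg_ms,         -- metric is in centiseconds per call, * 10 gives ms
       round(max(maxval) * 10, 2) as worst_ms,
       round((avg(average) + 2 * avg(standard_deviation)) * 10, 2) as alert_above_ms
from   dba_hist_sysmetric_summary
where  metric_name = 'SQL Service Response Time';

If the current v$sysmetric value climbs past something like alert_above_ms, that is a reasonable trigger for paging the on-call DBA rather than guessing whether 0.32 ms is good or bad.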

The Difference Between HTTP and HTTPS



Most web addresses begin with "HTTP," which is an acronym for "Hypertext Transfer Protocol." It is the protocol your browser uses to communicate with web sites.


"HTTPS" stands for "Hyper Text Transfer Protocol Secure." It means that information exchanged between you and a web site is encrypted and cannot be hijacked by someone who might want to electronically eavesdrop when you type a credit card number, a password, a social security number, or any other person information.

The point of this reminder is to encourage you to check for the "HTTPS" before you give out financial information. Most web sites are not served over HTTPS, but when you click a link to make a purchase, many of them will direct you to an HTTPS site.


According to Verisign.com, a provider of Internet infrastructure services, Secure Sockets Layer (SSL) encryption is a technology that protects web sites and builds trust by means of an "SSL Certificate that enables encryption of sensitive information during online transactions. Each SSL Certificate contains unique, authenticated information about the certificate owner and a Certificate Authority verifies the identity of the certificate owner when it is issued."

The fact that a website uses SSL encryption does not by itself safeguard internet users from phishing and other schemes. When handing over financial information online, it is always wise to make sure the company is legitimate, has a good reputation for customer service, and uses SSL encryption in its transactions.