Three key symptoms are a good starting point for diagnosing performance problems:
* Resource bottlenecks: CPU, memory, and I/O bottlenecks are covered in this paper. We do not consider network issues. For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging that ultimately impacts performance.
* Tempdb bottlenecks: Since there is only one tempdb for each SQL Server instance, this can be a performance and a disk space bottleneck. A misbehaving application can overload tempdb both in terms of excessive DDL/DML operations and in space. This can cause unrelated applications running on the server to slow down or fail.
* A slow running user query: The performance of an existing query may regress or a new query may appear to be taking longer than expected. There can be many reasons for this. For example:
** Changes in statistical information can lead to a poor query plan for an existing query.
** Missing indexes can force table scans and slow down the query.
** An application can slow down due to blocking even if resource utilization is normal. Excessive blocking, for example, can be due to poor application or schema design or to choosing an improper isolation level for the transaction.
The causes of these symptoms are not necessarily independent of each other. The poor choice of a query plan can tax system resources and cause an overall slowdown of the workload. So, if a large table is missing a useful index, or the query optimizer decides not to use it, this not only causes the query to slow down but it also puts heavy pressure on the I/O subsystem to read the unnecessary data pages and on the memory (buffer pool) to store these pages in the cache. Similarly, excessive recompilation of a frequently running query can put pressure on the CPU.
Resource bottlenecks:-
This section discusses the CPU, memory, and I/O subsystem resources and how these can become bottlenecks. (Network issues are outside the scope of this paper.) For each resource bottleneck, we describe how to identify the problem and then iterate through the possible causes. For example, a memory bottleneck can lead to excessive paging, which can ultimately impact performance.
Before you can determine if you have a resource bottleneck, you need to know how resources are used under normal circumstances. You can use the methods outlined in this paper to collect baseline information about the use of the resource (when you are not having performance problems).
You might find that the problem is a resource that is running near capacity and that SQL Server cannot support the workload in its current configuration. To address this issue, you may need to add more processing power or memory, or increase the bandwidth of your I/O or network channel. But before you take that step, it is useful to understand some common causes of resource bottlenecks.
The following tools can be used to resolve a particular resource bottleneck.
* System Monitor (PerfMon): This tool is available as part of Windows. For more information, please see the System Monitor documentation.
* SQL Server Profiler: See SQL Server Profiler in the Performance Tools group in the SQL Server 2005 program group.
* DBCC commands.
A CPU bottleneck that happens suddenly and unexpectedly, without additional load on the server, is commonly caused by a nonoptimal query plan, a poor configuration, or design factors, and not insufficient hardware resources. Before rushing out to buy faster and/or more processors, you should first identify the largest consumers of CPU bandwidth and see if they can be tuned.
System Monitor is generally the best means to determine if the server is CPU bound. You should look to see if the Processor:% Processor Time counter is high; values in excess of 80% processor time per CPU are generally deemed to be a bottleneck. You can also monitor the SQL Server schedulers using the sys.dm_os_schedulers view to see if the number of runnable tasks is typically nonzero. A nonzero value indicates that tasks have to wait for their time slice to run; high values for this counter are a symptom of a CPU bottleneck.
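The two checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring tool: the function name `cpu_pressure_indicators` and its inputs are hypothetical, and the sample values are assumed to have been collected separately (the counter readings from System Monitor, the runnable task counts from `sys.dm_os_schedulers`). The `scheduler_id < 255` filter in the query string is an assumption intended to exclude internal schedulers.

```python
# T-SQL that could supply the runnable task counts (shown only as a string;
# this sketch does not connect to a server).
SCHEDULER_QUERY = """
SELECT scheduler_id, current_tasks_count, runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255
"""


def cpu_pressure_indicators(processor_time_pct, runnable_task_counts,
                            cpu_threshold=80.0):
    """Summarize the two CPU indicators discussed in the text.

    processor_time_pct: one '% Processor Time' reading per CPU.
    runnable_task_counts: one runnable_tasks_count value per scheduler.
    cpu_threshold: the 80% rule of thumb from the text.
    """
    # CPUs whose utilization exceeds the rule-of-thumb threshold.
    cpus_over = [i for i, pct in enumerate(processor_time_pct)
                 if pct > cpu_threshold]
    # Schedulers on which at least one task is waiting for a time slice.
    waiting = [i for i, n in enumerate(runnable_task_counts) if n > 0]
    return {
        "cpus_over_threshold": cpus_over,
        "schedulers_with_waiting_tasks": waiting,
        "total_runnable_tasks": sum(runnable_task_counts),
    }


if __name__ == "__main__":
    print(cpu_pressure_indicators([92.0, 45.0], [3, 0]))
```

A single sample says little; the indicators are more meaningful when they stay elevated across repeated samples and are compared against a baseline.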
Compilation and recompilation:-
When a batch or remote procedure call (RPC) is submitted to SQL Server, the server checks the validity and correctness of the query plan before it begins executing. If one of these checks fails, the batch may have to be compiled again to produce a different query plan. Such compilations are known as recompilations. These recompilations are generally necessary to ensure correctness and are often performed when the server determines that there could be a more optimal query plan due to changes in underlying data.
SQL Server 2005 introduces statement-level recompilation of stored procedures. When SQL Server 2005 recompiles stored procedures, only the statement that caused the recompilation is compiled—not the entire procedure. This uses less CPU bandwidth and results in less contention on lock resources such as COMPILE locks. Recompilation can happen due to various reasons, such as:
• Schema changed
• Statistics changed
• Deferred compile
• SET option changed
• Temporary table changed
• Stored procedure created with the WITH RECOMPILE option, or a statement that uses the OPTION (RECOMPILE) query hint
* System Monitor (Perfmon)
The SQL Statistics object provides counters to monitor compilation and the type of requests that are sent to an instance of SQL Server. You must monitor the number of query compilations and recompilations in conjunction with the number of batches received to find out if the compiles are contributing to high CPU use. Ideally, the ratio of SQL Recompilations/sec to Batch Requests/sec should be very low unless users are submitting ad hoc queries.
The key data counters to look at are as follows.
* SQL Server: SQL Statistics: Batch Requests/sec
* SQL Server: SQL Statistics: SQL Compilations/sec
* SQL Server: SQL Statistics: SQL Recompilations/sec
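To make the ratio concrete, the following Python sketch relates the three counters above. The function name `compile_ratios` is hypothetical, and the inputs are assumed to be per-second values already read from System Monitor; the text gives no numeric cutoff for "very low", so none is applied here.

```python
def compile_ratios(batch_requests_per_sec, compilations_per_sec,
                   recompilations_per_sec):
    """Return compilations and recompilations per batch request.

    All inputs are per-second rates sampled over the same interval.
    Returns None for both ratios when no batches were received.
    """
    if batch_requests_per_sec <= 0:
        return {"compiles_per_batch": None, "recompiles_per_batch": None}
    return {
        "compiles_per_batch": compilations_per_sec / batch_requests_per_sec,
        "recompiles_per_batch": recompilations_per_sec / batch_requests_per_sec,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from a real server.
    print(compile_ratios(200.0, 40.0, 10.0))
```

Tracking these ratios over time, rather than the raw recompilation count, helps distinguish a genuinely recompile-heavy workload from one that is simply busy.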
* SQL Trace:-
If the PerfMon counters indicate a high number of recompiles, the recompiles could be contributing to the high CPU consumed by SQL Server. We would then need to look at the profiler trace to find the stored procedures that were being recompiled. The SQL Server Profiler trace gives us that information along with the reason for the recompilation. You can use the following events to get this information.
SP:Recompile / SQL:StmtRecompile. The SP:Recompile and the SQL:StmtRecompile event classes indicate which stored procedures and statements have been recompiled. When you compile a stored procedure, one event is generated for the stored procedure and one for each statement that is compiled. However, when a stored procedure recompiles, only the statement that caused the recompilation is recompiled (not the entire stored procedure as in SQL Server 2000). The EventSubClass data column in particular is important for determining the reason for the recompile. SP:Recompile is triggered once for the procedure or trigger that is recompiled and is not fired for an ad hoc batch that could likely be recompiled. In SQL Server 2005, it is more useful to monitor SQL:StmtRecompile, as this event class is fired when any type of batch (ad hoc, stored procedure, or trigger) is recompiled.
The key data columns we look at in these events are as follows.
• ObjectID (represents stored procedure that contains this statement)
* Showplan XML For Query Compile. The Showplan XML For Query Compile event class occurs when Microsoft SQL Server compiles or recompiles a SQL statement. This event has information about the statement that is being compiled or recompiled. This information includes the query plan and the object ID of the procedure in question. Capturing this event has significant performance overhead, as it is captured for each compilation or recompilation. If you see a high value for the SQL Compilations/sec counter in System Monitor, you should monitor this event. With this information, you can see which statements are frequently recompiled. You can use this information to change the parameters of those statements. This should reduce the number of recompiles.
DMVs. When you use the sys.dm_exec_query_optimizer_info DMV, you can get a good idea of the time SQL Server spends optimizing.
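One way to use this DMV is to take two snapshots and compare them. The Python sketch below assumes each snapshot is a dictionary built from the `counter`, `occurrence`, and `value` columns of `sys.dm_exec_query_optimizer_info`, and that for the `elapsed time` counter `occurrence` is the number of optimizations and `value` is the average elapsed time per optimization in seconds. The function name and snapshot layout are hypothetical.

```python
def optimization_delta(before, after):
    """Estimate optimizer work done between two DMV snapshots.

    Each snapshot maps a counter name to an (occurrence, value) pair.
    Returns (number_of_optimizations, total_seconds) for the interval.
    """
    def total_seconds(snapshot):
        # Cumulative time = number of optimizations * average time each.
        occurrence, average = snapshot["elapsed time"]
        return occurrence * average

    count = after["elapsed time"][0] - before["elapsed time"][0]
    seconds = total_seconds(after) - total_seconds(before)
    return count, seconds


if __name__ == "__main__":
    # Illustrative values only.
    snap1 = {"elapsed time": (100, 0.5)}
    snap2 = {"elapsed time": (150, 0.5)}
    print(optimization_delta(snap1, snap2))
```

Comparing the total seconds against the length of the sampling interval gives a rough sense of how much time went to optimization during that window.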
Consider the following options if you have detected inefficient query plans.
• Tune the query with the Database Engine Tuning Advisor to see if it produces any index recommendations.
• Check for issues with bad cardinality estimates.
  • Are the queries written so that they use the most restrictive WHERE clause that is applicable? Unrestricted queries are resource intensive by their very nature.
  • Run UPDATE STATISTICS on the tables involved in the query and check to see if the problem persists.
  • Does the query use constructs for which the optimizer is unable to accurately estimate cardinality? Consider whether the query can be modified in a way so that the issue can be avoided.
• If it is not possible to modify the schema or the query, SQL Server 2005 has a new plan guide feature that allows you to specify query hints to add to queries that match certain text. This can be done for ad hoc queries as well as inside a stored procedure. Hints such as OPTION (OPTIMIZE FOR) allow you to impact the cardinality estimates while leaving the optimizer its full array of potential plans. Other hints such as OPTION (FORCE ORDER) or OPTION (USE PLAN) allow you varying degrees of control over the query plan.
Cursor usage: Cursors allow row-by-row processing of the result sets. Types of cursors: Static, Dynamic, Forward-only, Keyset-driven.
Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip, whereas a normal SELECT query makes only one roundtrip, however large the result set is. Cursors are also costly because they require more resources and temporary storage (which results in more I/O operations). Further, there are restrictions on the SELECT statements that can be used with some types of cursors.
Versions of SQL Server prior to SQL Server 2005 only supported a single active command per connection. A query that was executing or had results pending to send to the client was considered active. In some situations, the client application might need to read through the results and submit other queries to SQL Server based on the row just read from the result set. This could not be done with a default result set, since it could have other pending results. A common solution was to change the connection properties to use a server-side cursor.
When using a server-side cursor, the database client software (the OLE DB provider or ODBC driver) transparently encapsulates client requests inside of special extended stored procedures, such as sp_cursoropen, sp_cursorfetch, and so forth. This is referred to as an API cursor (as opposed to a TSQL cursor). When the user executes the query, the query text is sent to the server via sp_cursoropen. Requests to read from the result set result in an sp_cursorfetch call that instructs the server to send back only a certain number of rows. By controlling the number of rows that are fetched, it is possible for the ODBC driver or OLE DB provider to cache the row(s). This prevents a situation where the server is waiting for the client to read all the rows it has sent. Thus, the server is ready to accept a new request on that connection.
On a high-latency network such as a WAN with many different user connections, the overhead required to process many cursor requests may become significant. Because of the overhead associated with repositioning the cursor to the appropriate location in the result set, per-request processing overhead, and similar processing, it is more efficient for the server to process a single request that returns 100 rows than to process 100 separate requests which return the same 100 rows but one row at a time.
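The arithmetic behind the 100-row example can be written out as a small Python sketch. The function name `fetch_requests` is hypothetical, and the sketch only counts fetch requests; it makes no claim about the actual cost of each request on a given server.

```python
def fetch_requests(total_rows, fetch_size):
    """Number of fetch requests needed to read total_rows rows
    when each request returns at most fetch_size rows."""
    if fetch_size < 1:
        raise ValueError("fetch_size must be at least 1")
    # Ceiling division without floating point.
    return -(-total_rows // fetch_size)


if __name__ == "__main__":
    for size in (1, 10, 100):
        print(size, fetch_requests(100, size))
```

With a fetch size of 1, reading 100 rows takes 100 requests; with a fetch size of 100 it takes one. Every request carries its own repositioning and processing overhead, so the request count is the figure to reduce.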
Troubleshoot poor cursor usage.
System Monitor: The SQL Server:Cursor Manager By Type – Cursor Requests/Sec counter gives a general feel for how many cursors are being used on the system. Systems that have high CPU utilization because of small fetch sizes typically have hundreds of cursor requests per second. There are no specific counters to tell you about the fetch buffer size.
SQL Trace: Use a trace that includes the RPC:Completed event class and search for sp_cursorfetch statements. The value of the fourth parameter is the number of rows returned by the fetch. The maximum number of rows that are requested to be returned is specified as an input parameter in the corresponding RPC:Starting event class.
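If the trace is exported to text, the fetch sizes can be pulled out with a short script. The Python sketch below is an illustration under assumptions: it expects the statement text to look like `exec sp_cursorfetch 180150003,2,1,1` with positional, comma-separated parameters, which may not match every trace, and the function name `fetch_row_count` is hypothetical.

```python
import re

# Matches the positional parameter list that follows sp_cursorfetch.
_FETCH_RE = re.compile(r"sp_cursorfetch\s+(.+)", re.IGNORECASE)


def fetch_row_count(text_data):
    """Return the fourth sp_cursorfetch parameter as an int,
    or None if the text is not a recognizable sp_cursorfetch call."""
    match = _FETCH_RE.search(text_data)
    if not match:
        return None
    params = [p.strip() for p in match.group(1).split(",")]
    if len(params) < 4:
        return None
    try:
        return int(params[3])
    except ValueError:
        # Named or non-numeric parameters are outside this sketch's scope.
        return None


if __name__ == "__main__":
    print(fetch_row_count("exec sp_cursorfetch 180150003,2,1,1"))
```

Many calls reporting a value of 1 would suggest that the application is fetching a single row per request.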
• Determine if cursors are the most appropriate means to accomplish the processing or whether a set-based operation, which is generally more efficient, is possible.
• Consider enabling multiple active result sets (MARS) when connecting to SQL Server 2005.
• Consult the appropriate documentation for your specific API to determine how to specify a larger fetch buffer size for the cursor:
  • ODBC – SQL_ATTR_ROW_ARRAY_SIZE
  • OLE DB – IRowset::GetNextRows or IRowsetLocate::GetRowsAt
The following table shows the maximum memory support options for different configurations of SQL Server 2005. A particular edition of SQL Server or Windows may put more restrictive limits on the amount of supported memory.
|Configuration||VAS||Max physical memory||AWE/locked pages support|
|Native 32-bit on 32-bit OS||2 GB||64 GB||Yes|
|Native 32-bit on 32-bit OS with /3GB boot parameter||3 GB||16 GB||Yes|
|32-bit on x64 OS (WOW)||4 GB||64 GB||Yes|
|32-bit on IA64 OS (WOW)||2 GB||2 GB||No|
|Native 64-bit on x64 OS||8 terabyte||1 terabyte||Yes|
|Native 64-bit on IA64 OS||7 terabyte||1 terabyte||Yes|
Identifying when SQL Server runs under memory pressure will help you troubleshoot memory-related issues. SQL Server responds differently depending on the type of memory pressure that is present. The following table summarizes the types of memory pressure and their general underlying causes. In all cases, you are more likely to see timeout or explicit out-of-memory error messages.
|Pressure||External||Internal|
|Physical||Physical memory (RAM) running low. This causes the system to trim working sets of currently running processes, which may result in overall slowdown. SQL Server detects this condition and, depending on the configuration, may reduce the commit target of the buffer pool and start clearing internal caches.||SQL Server detects high memory consumption internally, causing redistribution of memory between internal components. Internal memory pressure may be a result of:|
|Virtual||Running low on space in the system page file(s). This may cause the system to fail memory allocations, as it is unable to page out currently allocated memory. This condition may result in the whole system responding very slowly or even bring it to a halt.||Running low on VAS due to fragmentation (a lot of VAS is available but in small blocks) and/or consumption (direct allocations, DLLs loaded in SQL Server VAS, high number of threads). SQL Server detects this condition and may release reserved regions of VAS, reduce buffer pool commit target, and start shrinking caches.|
Windows provides a notification mechanism that signals if physical memory is running high or low. SQL Server uses this mechanism in its memory management decisions.
The following tools and sources of information could be used for troubleshooting.
- Memory related DMVs
- DBCC MEMORYSTATUS command
- Performance counters: performance monitor or DMV for SQL Server specific object
- Task Manager
- Event viewer: application log, system log
More updates soon.
Khan MS SQL DBA – MCTS www.addarr.com