Awesomesauce: Perf Dashboard on Native SSRS

I had been asked the question about using the Perf Dashboard reports from a Reporting Services instance multiple times. Though this was not the intention of launching the SQL Server Performance Dashboard Reports but there is a compelling need at times to have these accessible from a web URL. In today’s age, you will find a lot of DBAs monitor their SQL Server instances remotely. Sometimes, there is another layer of complexity added to this when they want to just look at the health of the SQL Server instance by accessing a URL exposed through a corporate server in their environment. In such cases, you wouldn’t want to jump through a few hoops of setting up your VPN connection and blah blah.

So with all that in mind, let’s talk about how you can get your SQL Server 2012 Performance Dashboard to your existing SQL Server Reporting Services instance.

  • First you need to create a new Reporting Services Project using SQL Server Data Tools (SSDT).
  • Use the solution explorer to add all the existing performance dashboard reports from the C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Performance Dashboard folder.
  • Modify the project properties to ensure that the data source gets written to the same folder as the reports.
  • Create a shared data source pointing to your SQL Server instance.
  • Open each report that was imported into the project and modify the data source properties to use the shared data source that you created in the above step.
  • Modify the project deployment properties as shown below.

image

Now you are ready to deploy your Performance Dashboard SSRS project to your reporting services instance.

What can you do next?

You can now set up Report Server subscriptions for the set of reports that you wish to receive via email. You can get a status report of your SQL Server instance without even having to lift a finger and that to right in your mailbox! Smile

Note that Performance Dashboard is a tool which is provided “AS-IS” by Microsoft. The steps mentioned above will help you deploy the existing Performance Dashboard reports to a SQL Server Reporting Services instance. However, the failure to deploy these reports as mentioned above is not liable for support by Microsoft SQL Server Support team.

Advertisement

System Health Session: Part 4

This is the last post for the System Health Session series. In this post, I shall show you how to extract deadlock related information from the deadlock graph captured by the System Health Session.

The deadlock graph captured typically has three distinct nodes:

victim-list – Deadlock victim’s process identifier
process-list – Information pertaining to all the processes involved in the deadlock
resource-list – Information about the resources involved in the deadlock

The query below will provide you with the time stamp when the deadlock was reported along with victim process identifier.

 -- Fetch the Health Session data into a temporary table

SELECT CAST(xet.target_data AS XML) AS XMLDATA
INTO #SystemHealthSessionData
FROM sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe
ON (xe.address = xet.event_session_address)
WHERE xe.name = 'system_health'
-- Gets the Deadlock Event Time and Victim Process
SELECT C.query('.').value('(/event/@timestamp)[1]', 'datetime') as EventTime,
CAST(C.query('.').value('(/event/data/value)[1]', 'varchar(MAX)') AS XML).value('(<a>/deadlock/victim-list/victimProcess/@id)[1]','varchar(100)'</a>) VictimProcess
FROM #SystemHealthSessionData a
CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)
WHERE C.query('.').value('(/event/@name)[1]', 'varchar(255)') = 'xml_deadlock_report'
-- Drop the temporary table
DROP TABLE #SystemHealthSessionData 

The next query (when provided with an event time from the above query output)
will provide you a parsed version of the process list in a tabular format which
can be easier to read when you have a large number of sessions involved in the
deadlock.

 -- Fetch the Health Session data into a temporary table

SELECT CAST(xet.target_data AS XML) AS XMLDATA
INTO #SystemHealthSessionData
FROM sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe
ON (xe.address = xet.event_session_address)
WHERE xe.name = 'system_health'

-- Parses the process list for a specific deadlock once provided with an event time for the deadlock from the above output

;WITH CTE_HealthSession (EventXML) AS
(
SELECT CAST(C.query('.').value('(/event/data/value)[1]', 'varchar(MAX)') AS XML) EventXML
FROM #SystemHealthSessionData a
CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)
WHERE C.query('.').value('(/event/@name)[1]', 'varchar(255)') = 'xml_deadlock_report'
AND C.query('.').value('(/event/@timestamp)[1]', 'datetime') = '2011-09-28 06:24:44.700' -- Replace with relevant timestamp
)
SELECT DeadlockProcesses.value('(@id)[1]','varchar(50)') as id
,DeadlockProcesses.value('(@taskpriority)[1]','bigint') as taskpriority
,DeadlockProcesses.value('(@logused)[1]','bigint') as logused
,DeadlockProcesses.value('(@waitresource)[1]','varchar(100)') as waitresource
,DeadlockProcesses.value('(@waittime)[1]','bigint') as waittime
,DeadlockProcesses.value('(@ownerId)[1]','bigint') as ownerId
,DeadlockProcesses.value('(@transactionname)[1]','varchar(50)') as transactionname
,DeadlockProcesses.value('(@lasttranstarted)[1]','varchar(50)') as lasttranstarted
,DeadlockProcesses.value('(@XDES)[1]','varchar(20)') as XDES
,DeadlockProcesses.value('(@lockMode)[1]','varchar(5)') as lockMode
,DeadlockProcesses.value('(@schedulerid)[1]','bigint') as schedulerid
,DeadlockProcesses.value('(@kpid)[1]','bigint') as kpid
,DeadlockProcesses.value('(@status)[1]','varchar(20)') as status
,DeadlockProcesses.value('(@spid)[1]','bigint') as spid
,DeadlockProcesses.value('(@sbid)[1]','bigint') as sbid
,DeadlockProcesses.value('(@ecid)[1]','bigint') as ecid
,DeadlockProcesses.value('(@priority)[1]','bigint') as priority
,DeadlockProcesses.value('(@trancount)[1]','bigint') as trancount
,DeadlockProcesses.value('(@lastbatchstarted)[1]','varchar(50)') as lastbatchstarted
,DeadlockProcesses.value('(@lastbatchcompleted)[1]','varchar(50)') as lastbatchcompleted
,DeadlockProcesses.value('(@clientapp)[1]','varchar(150)') as clientapp
,DeadlockProcesses.value('(@hostname)[1]','varchar(50)') as hostname
,DeadlockProcesses.value('(@hostpid)[1]','bigint') as hostpid
,DeadlockProcesses.value('(@loginname)[1]','varchar(150)') as loginname
,DeadlockProcesses.value('(@isolationlevel)[1]','varchar(150)') as isolationlevel
,DeadlockProcesses.value('(@xactid)[1]','bigint') as xactid
,DeadlockProcesses.value('(@currentdb)[1]','bigint') as currentdb
,DeadlockProcesses.value('(@lockTimeout)[1]','bigint') as lockTimeout
,DeadlockProcesses.value('(@clientoption1)[1]','bigint') as clientoption1
,DeadlockProcesses.value('(@clientoption2)[1]','bigint') as clientoption2
FROM (select EventXML as DeadlockEvent FROM CTE_HealthSession) T
CROSS APPLY DeadlockEvent.nodes('//deadlock/process-list/process') AS R(DeadlockProcesses)

-- Drop the temporary table
DROP TABLE #SystemHealthSessionData

The script file for the above queries can be downloaded from here.

A sample output of the above two queries is shown below:

The second dataset which shows the parsed process list from the deadlock graph is for the timestamp highlighted above.

As mentioned in one of my previous posts, the custom reports used in the previous posts can be downloaded from here (Filename: System_Health_Session_Custom_Reports.zip).

Modification April 20th, 2012: Just updated the .sql files and added the Deadlock Report to the SkyDrive location.

Note: To make use of the deadlock graph captured by the System Health Session, you need to have the required update applied to avoid the issue mentioned in KB978629. The issue is addressed in:
981355    Cumulative Update package 1 for SQL Server 2008 R2
http://support.microsoft.com/default.aspx?scid=kb;EN-US;981355
977443    Cumulative update package 6 for SQL Server 2008 Service Pack 1
http://support.microsoft.com/default.aspx?scid=kb;EN-US;977443

If you don’t have the updates installed, then Jonathan (Blog | Twitter) has shown in his article Retrieving Deadlock Graphs with SQL Server 2008 Extended Events how to workaround the issue. Michael Zilberstein’s (Blog) article, Parsing Extended Events xml_deadlock_report, has an updated/corrected version of the T-SQL to fetch the deadlock information.

References:

Bart Duncan’s Weblog:
Deadlock Troubleshooting, Part 1
Deadlock Troubleshooting, Part 2
Deadlock Troubleshooting, Part 3

System Health Session: Part 3

In my last post, I had demonstrated how to fetch the waits information captured by the System Health session. In this post, I shall show the custom reports that can be built using the waits information fetched.

image

The above report gives a summary of all the waits recorded by the health session. I have a second level drill-down available which allows me to get specific information for each of the distinct waits reported above and lets me get the SQL Query that experienced the wait.

image

As you can see above, that I have the SQL query, the time when the wait was reported along with duration statistics of the wait.

Another category of events that are tracked by the System Health session are non-yielding scheduler conditions (reported using the error number 17883) in SQL Server. So if your SQL Server instance encountered a non-yielding scheduler condition, then this information can be tracked using System Health session. The query below will give you the information of all the non-yielding conditions detected by the SQL Server instance subject to condition that the ring buffer storing this information has not been overwritten.

-- Query to fetch non-yielding errors captured by the System Health Session
SET NOCOUNT ON

SELECT CAST(xet.target_data AS XML) AS XMLDATA
INTO #SystemHealthSessionData
FROM sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe ON (xe.address = xet.event_session_address)
WHERE xe.name = 'system_health'

;WITH CTE_HealthSession (EventXML) AS
( SELECT C.query('.') EventXML
FROM #SystemHealthSessionData a
CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)
WHERE C.query('.').value('(/event/@name)[1]', 'varchar(255)') = 'scheduler_monitor_non_yielding_ring_buffer_recorded' )
SELECT EventXML.value('(/event/@timestamp)[1]', 'datetime') as EventTime,
EventXML.value('(/event/data/value)[4]', 'int') as NodeID,
EventXML.value('(/event/data/value)[5]', 'int') as SchedulerID,
CASE EventXML.value('(/event/data/value)[3]', 'int') WHEN 0 THEN 'BEGIN' WHEN 1 THEN 'END' ELSE '' END AS DetectionStage,
EventXML.value('(/event/data/value)[6]', 'varchar(50)') as Worker,
EventXML.value('(/event/data/value)[7]', 'bigint') as Yields,
EventXML.value('(/event/data/value)[8]', 'int') as Worker_Utilization,
EventXML.value('(/event/data/value)[9]', 'int') as Process_Utilization,
EventXML.value('(/event/data/value)[10]', 'int') as System_Idle,
EventXML.value('(/event/data/value)[11]', 'bigint') as User_Mode_Time,
EventXML.value('(/event/data/value)[12]', 'bigint') as Kernel_Mode_Time,
EventXML.value('(/event/data/value)[13]', 'bigint') as Page_Faults,
EventXML.value('(/event/data/value)[14]', 'float') as Working_Set_Delta,
EventXML.value('(/event/data/value)[15]', 'bigint') as Memory_Utilization
FROM CTE_HealthSession
ORDER BY EventTime,Worker

DROP TABLE #SystemHealthSessionData

The .sql file for the above script is available here.

image

A sample output of the above query is shown on the left. An important information in this output is the worker address. Using this worker address, I can get the relevant messages pertaining to the non-yielding condition.

2011-09-27 21:57:51.560 Server       Process 0:0:0 (0x18c0) Worker 0x000000000606A1A0 appears to be non-yielding on Scheduler 5. Thread creation time: 12961597452926. Approx Thread CPU Used: kernel 0 ms, user 0 ms. Process Utilization 0%. System Idle 98%. Interval: 70077 ms.
2011-09-27 21:58:51.660 Server       Process 0:0:0 (0x18c0) Worker 0x000000000606A1A0 appears to be non-yielding on Scheduler 5. Thread creation time: 12961597452926. Approx Thread CPU Used: kernel 0 ms, user 0 ms. Process Utilization 0%. System Idle 97%. Interval: 133017 ms.
2011-09-27 21:59:51.760 Server       Process 0:0:0 (0x18c0) Worker 0x000000000606A1A0 appears to be non-yielding on Scheduler 5. Thread creation time: 12961597452926. Approx Thread CPU Used: kernel 0 ms, user 0 ms. Process Utilization 2%. System Idle 94%. Interval: 193116 ms.
2011-09-27 22:00:51.860 Server       Process 0:0:0 (0x18c0) Worker 0x000000000606A1A0 appears to be non-yielding on Scheduler 5. Thread creation time: 12961597452926. Approx Thread CPU Used: kernel 0 ms, user 0 ms. Process Utilization 3%. System Idle 93%. Interval: 253215 ms.

If you look at the Errorlog snippet above, you will notice that there are entries for the highlighted worker address are same as the one reported in the management studio output screen shot above. The time reported in the Errorlog is the server time (in my case it is UTC+5:30) where as the time stamps reported by the health session is UTC time which means that the timestamps reported also match. Using the events from the System Health Session and the Errorlog entries, I will be able to figure out when a non-yielding condition occurred on the server instance.

The non-yielding condition report is quite plain and has a table output showing the different columns returned by the query above:

image

In the last post for this series, I shall provide the set of queries that can be used to parse a deadlock graph collected by the System Health session along with the .rdl files for the custom reports used in this series.

System Health Session: Part 2

I had written an introductory post on monitoring the system health using the default extended events sessions that runs on a SQL Server 2008 instance and above. Now it is time for next part for this post. In the first post, I provided a set of queries which would be used for getting all the errors that were recorded by the System Health Extended Events session. Now I can create a set of reports using Business Intelligence Development Studio which can be used by the Custom Reports feature of Management Studio.

I put together a three-level drill down report to get a summary report for all the errors reported by the T-SQL queries in my previous post. The dashboard report which will basically serve as the landing page for what I am terming as the System Health Dashboard looks something like this: image

As you can see above, the report shows me the different events captured by the Extended Events session. The first level drill down provides a summary of all the different errors reported.

image

The last drill-down option is to go look into every occurrence of a specific error number. This report basically shows all the occurrences of a specific error number along with the query text (if captured) and specifics for the event recorded rather than the generic error description that you see in the above report.

image

The other category of events captured by the Health Session are wait information which fall under the following category:

  • Any sessions that have waited on latches (or other interesting resources) for > 15 seconds.
  • Any sessions that have waited on locks for > 30 seconds.
  • Any sessions that have waited for a long time for preemptive waits. The duration varies by wait type. A preemptive wait is where SQL Server is waiting for external API calls.

The query below will give you the query which experienced the wait along with the wait times which were recorded by the System Health session.

SELECT CAST(xet.target_data AS XML) AS XMLDATA

INTO #SystemHealthSessionData

FROM sys.dm_xe_session_targets xet

JOIN sys.dm_xe_sessions xe

ON (xe.address = xet.event_session_address)

WHERE xe.name = 'system_health'

;WITH CTE_HealthSession (EventXML) AS

(

SELECT C.query('.') EventXML

FROM #SystemHealthSessionData a

CROSS APPLY a.XMLDATA.nodes('/RingBufferTarget/event') as T(C)

WHERE C.query('.').value('(/event/@name)[1]', 'varchar(255)') in ('wait_info','wait_info_external')

)

SELECT

EventXML.value('(/event/@timestamp)[1]', 'datetime') as EventTime,

EventXML.value('(/event/data/text)[1]', 'varchar(50)') as WaitType,

EventXML.value('(/event/data/value)[3]', 'int') as Duration,

EventXML.value('(/event/data/value)[4]', 'int') as Max_Duration,

EventXML.value('(/event/data/value)[5]', 'int') as Total_Duration,

EventXML.value('(/event/action/value)[2]', 'varchar(10)') as Session_ID,

EventXML.value('(/event/action/value)[3]', 'varchar(max)') as sql_text

FROM CTE_HealthSession

ORDER BY EventTime DESC

DROP TABLE #SystemHealthSessionData

A sample output of the above query is shown above. The .sql file for the above query can be downloaded from here.

image

Note: Beware of false positives for PREEMPTIVE_OS_GETPROCADDRESS waits described here.

In the next post, I shall give a preview of the wait reports and provide another set of queries to track a category of events tracked by the System Health session.

Once I have completed this series, I shall upload all the series of reports to my SkyDrive so that they can be downloaded for your benefit.

System Health Session: Part 1

SQL Server 2008 introduced a new feature called Extended Events which opened a new vista for troubleshooting SQL Server issues. Bob Ward in a post on the CSS SQL Server Engineers blog had written about the system_health Extended Events session which is started by default on any instance of SQL Server 2008 or above. This session uses a ring buffer target so the information stored about the events tracked by this session don’t stick around forever much like the SQL Server default trace. Jonathan (Blog | Twitter) blogged about the System Health session and way it is setup in detail. Note that if you have an event session setup for a ring_buffer target and the data you feed the target exceeds 4Mb, you may not be able to retrieve all XML nodes from the target data. Bob Ward mentioned about this in detail in his blog post. So the System_Health session also suffers from this drawback.

This is the first post of a series of post that I will be writing about the System Health session and how you can use this to your SQL Server 2008 monitoring on the next level without having to install anything new on your servers… Maybe for the first few posts Winking smile

The first thing that you would need to do is get the different events reported with the oldest event time present in the ring buffer and the last event reported time using the query below:

SET NOCOUNT ON

— Store the XML data in a temporary table
SELECT CAST(xet.target_data as xml) as XMLDATA
INTO #SystemHealthSessionData
FROM sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe
ON (xe.address = xet.event_session_address)
WHERE xe.name = ‘system_health’

— Get the different events reported in a grouped manner
;WITH CTE_HealthSession AS
(
SELECT C.query(‘.’).value(‘(
/event/@name)[1]’, ‘varchar(255)’) as EventName,
C.query(‘.’).value(‘(
/event/@timestamp)[1]’, ‘datetime’) as EventTime
FROM #SystemHealthSessionData a
CROSS APPLY a.XMLDATA.nodes(‘/RingBufferTarget/event’) as T(C))
SELECT EventName,
COUNT(*) as Occurrences,
MAX(EventTime) as LastReportedEventTime,
MIN(EventTime) as OldestRecordedEventTime
FROM CTE_HealthSession
GROUP BY EventName
ORDER BY 2 DESC
— Drop the temporary table
DROP TABLE #SystemHealthSessionData

One of the set of  events tracked are errors reported by the database engine. The u_tables.sql(C:\Program Files\Microsoft SQL Server\<install folder>\MSSQL\Install) script which contains the specification of the system_health session shows that the following errors are tracked:
1. Any error which has a severity greater than or equal to 20.
2. Errors reported for out of memory conditions (Error #: 17803, 701, 802, 8645, 8651, 8657 or 8902)

The script below will give you a total number of errors reported along with the error reported counts. The second query will list out all the recorded information for a particular error number shown in the first queries output:

SET NOCOUNT ON

— Store the XML data in a temporary table
SELECT CAST(xet.target_data as xml) as XMLDATA
INTO #SystemHealthSessionData
FROM sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe
ON (xe.address = xet.event_session_address)
WHERE xe.name = ‘system_health’

— Get statistical information about all the errors reported
;WITH CTE_HealthSession (EventXML) AS
(
SELECT C.query(‘.’) EventXML
FROM #SystemHealthSessionData a
CROSS APPLY a.XMLDATA.nodes(‘/RingBufferTarget/event’) as T(C)
),

CTE_ErrorReported (EventTime, ErrorNum) AS
(
SELECT EventXML.value(‘(
/event/@timestamp)[1]’, ‘datetime’) as EventTime,
EventXML.value(‘(/event/data/value)[1]’, ‘int’) as ErrorNum
FROM CTE_HealthSession
WHERE EventXML.value(‘(
/event/@name)[1]’, ‘varchar(255)’) = ‘error_reported’
)
SELECT ErrorNum,
MAX(EventTime) as LastRecordedEvent,
MIN(EventTime) as FirstRecordedEvent,
COUNT(*) as Occurrences,
b.[text] as ErrDescription
FROM CTE_ErrorReported a
INNER JOIN sys.messages b
ON a.ErrorNum = b.message_id
WHERE b.language_id = SERVERPROPERTY(‘LCID’)
GROUP BY a.ErrorNum,b.[text]

— Get information about each of the errors reported
;WITH CTE_HealthSession (EventXML) AS
(
SELECT C.query(‘.’) EventXML
FROM #SystemHealthSessionData a
CROSS APPLY a.XMLDATA.nodes(‘/RingBufferTarget/event’) as T(C)
WHERE C.query(‘.’).value(‘(
/event/@name)[1]’, ‘varchar(255)’) = ‘error_reported’

)
SELECT
EventXML.value(‘(
/event/@timestamp)[1]’, ‘datetime’) as EventTime,
EventXML.value(‘(/event/data/value)[1]’, ‘int’) as ErrNum,
EventXML.value(‘(/event/data/value)[2]’, ‘int’) as ErrSeverity,
EventXML.value(‘(/event/data/value)[3]’, ‘int’) as ErrState,
EventXML.value(‘(/event/data/value)[5]’, ‘varchar(max)’) as ErrText,
EventXML.value(‘(/event/action/value)[2]’, ‘varchar(10)’) as Session_ID
FROM CTE_HealthSession

— Drop the temporary table
DROP TABLE #SystemHealthSessionData

Now you have a set of queries which you can use to fetch information about all the errors reported for your SQL Server instance. Remember that the target of the extended events session is a ring buffer. So, if your server instance have been running for a long period of time, the data will be overwritten.

Modification: Sept 15, 2011: WordPress doesn’t seem to like the @name XML attribute in the XQuery. After multiple attempts at formatting the query above, I finally decided to upload the query as System_Health_Session_1.sql to SkyDrive. You can download it here.

Modification: Sept 21, 2011: Jonathan (Blog | Twitter) dropped me an email stating that the above script could fall prey to incorrect timestamp issue. This issue is already filed as a Connect item. Jonathan described on a viable workaround for this solution in his blog post.

Modification: March 3, 2012: The timestamp value reported is in UTC (GMT) format. So if you want to report event time as per your SQL Server instance time, then you would need to perform the calculation and account for the GMT time difference.

As the title says, Part 1… There is definitely more to come on this topic. Stay tuned!