Extended Events Intro

Extended Events is supposed to be taking over for most of Profiler and server-side tracing functionality, but there were people like me out there that took their time making the switch. For me the excuse was SQL Batch Completed isn’t in 2008 / 2008 R2, most databases are slow to move off of 2008 because it’s hard to say 2008 can’t do everything you want, and, well, I really like SQL Batch Completed!!!

Now I’m losing my excuses. It’s been three years since 2012 came out and all the new servers I’m installing are 2012 or 2014, so I lost that excuse. Then I came up with “I don’t know how to use them, but I know how to use something that does almost the same thing”; that’s great logic during an outage, but terrible logic for the long term. So, I’m learning it, still learning it, and I have something to help out those who made it to 2012 but are still stuck on the “I don’t know how to use them” phase of denial.

In several of my blog posts I referenced my favorite server-side trace. Capture everything on the server that takes over X seconds, with 5 being my favorite starting place. Yes, this adds about 1ms to anything that takes over 5 seconds, which adds up to…insignificant. Seriously, if this kills your server then it was dead already, but you will see arguments against creating this for long-term runs.  I don’t agree with those arguments, but they’re your servers and I want you to make an informed decision on what you run.

Anyways, here’s how I set it up with Extended Events, for which I used Jonathan Kehayias’s (b|t) Trace to XE Converter to get started.

IF EXISTS (SELECT 1 FROM sys.server_event_sessions WHERE name = 'DurationOver5Sec')
	DROP EVENT SESSION [DurationOver5Sec] ON SERVER;
GO
CREATE EVENT SESSION [DurationOver5Sec]
ON SERVER
ADD EVENT sqlserver.rpc_completed(
	ACTION 
	(
		  sqlserver.client_app_name	-- ApplicationName from SQLTrace
		, sqlserver.client_hostname	-- HostName from SQLTrace
		, sqlserver.client_pid	-- ClientProcessID from SQLTrace
		, sqlserver.database_id	-- DatabaseID from SQLTrace
		, sqlserver.request_id	-- RequestID from SQLTrace
		, sqlserver.server_principal_name	-- LoginName from SQLTrace
		, sqlserver.session_id	-- SPID from SQLTrace
	)
	WHERE 
	(
			duration >= 5000000
	)
),
ADD EVENT sqlserver.sql_batch_completed(
	ACTION 
	(
		  sqlserver.client_app_name	-- ApplicationName from SQLTrace
		, sqlserver.client_hostname	-- HostName from SQLTrace
		, sqlserver.client_pid	-- ClientProcessID from SQLTrace
		, sqlserver.database_id	-- DatabaseID from SQLTrace
		, sqlserver.request_id	-- RequestID from SQLTrace
		, sqlserver.server_principal_name	-- LoginName from SQLTrace
		, sqlserver.session_id	-- SPID from SQLTrace
	)
	WHERE 
	(
		duration >= 5000000
	)
),
ADD EVENT sqlos.wait_info(
	ACTION 
	(
		  sqlserver.client_app_name	-- ApplicationName from SQLTrace
		, sqlserver.client_hostname	-- HostName from SQLTrace
		, sqlserver.client_pid	-- ClientProcessID from SQLTrace
		, sqlserver.database_id	-- DatabaseID from SQLTrace
		, sqlserver.request_id	-- RequestID from SQLTrace
		, sqlserver.server_principal_name	-- LoginName from SQLTrace
		, sqlserver.session_id	-- SPID from SQLTrace
	)
	WHERE
	(
		duration > 5000 --This one is in milliseconds, and I'm not happy about that
            AND ((wait_type > 0 AND wait_type < 22) -- LCK_ waits
                    OR (wait_type > 31 AND wait_type < 38) -- LATCH_ waits
                    OR (wait_type > 47 AND wait_type < 54) -- PAGELATCH_ waits
                    OR (wait_type > 63 AND wait_type < 70) -- PAGEIOLATCH_ waits
                    OR (wait_type > 96 AND wait_type < 100) -- IO (Disk/Network) waits
                    OR (wait_type = 107) -- RESOURCE_SEMAPHORE waits
                    OR (wait_type = 113) -- SOS_WORKER waits
                    OR (wait_type = 120) -- SOS_SCHEDULER_YIELD waits
                    OR (wait_type = 178) -- WRITELOG waits
                    OR (wait_type > 174 AND wait_type < 177) -- FCB_REPLICA_ waits
                    OR (wait_type = 186) -- CMEMTHREAD waits
                    OR (wait_type = 187) -- CXPACKET waits
                    OR (wait_type = 207) -- TRACEWRITE waits
                    OR (wait_type = 269) -- RESOURCE_SEMAPHORE_MUTEX waits
                    OR (wait_type = 283) -- RESOURCE_SEMAPHORE_QUERY_COMPILE waits
                    OR (wait_type = 284) -- RESOURCE_SEMAPHORE_SMALL_QUERY waits
	--OR (wait_type = 195) -- WAITFOR
                )
	)
)
ADD TARGET package0.event_file
(
	SET filename = 'DurationOver5Sec.xel',
		max_file_size = 10,
		max_rollover_files = 5
)
WITH 
(
	MAX_MEMORY = 10MB
	, MAX_EVENT_SIZE = 10MB
	, STARTUP_STATE = ON
	, MAX_DISPATCH_LATENCY = 5 SECONDS
	, EVENT_RETENTION_MODE = ALLOW_MULTIPLE_EVENT_LOSS
);

ALTER EVENT SESSION DurationOver5Sec
ON SERVER
STATE = START;
	

What’s it all mean?

This captures all SQL Batch Completed and RPC Completed events that took over 5 seconds along with any waits that took over 5 seconds. Seems kind simple and much easier to read than the script to create a trace, but there are some things to point out here.

  • Duration – Milliseconds or Microseconds
  • File Location
  • Restart on server reboot

First, duration, is it milliseconds or microseconds? Trick question, it’s both!!! Ola Hallengren (b|t) opened This Connect Item which resulted in Microsoft apologizing and saying they’ll fix it in SQL 2016. They can “fix” it one of two ways. Either they’ll document them as being different scales and your script will still work without any modifications, or they’ll make them all be the same scale and the script I have here will behave differently on different versions of SQL Server. Anyways, the script I have here is 5 seconds all around, which is 5,000,000 microseconds for the SQL and 5,000 milliseconds for the waits.

Continuing on the duration confusion, the wait info’s duration still isn’t what you might think it is.  This is the duration for a single instance of a wait, not cumulative waits for the duration of a specific query or batch.  If you made a loop that executed 10 times calling a wait for delay of 1 second each time then those are 10 separate 1 second waits.  More important, if your query is doing physical reads from disk then it’s a ton of small PageIOLatch waits, not one large one this session will see.  I understand why it has to be this way, but understanding and enjoying are two different things.

The rest of this isn’t nearly as confusing, I promise.

The file location I’m using is just a filename without a path, which will default to where your error log files are. It’s going to be a much more complex name once SQL Server is done with it, and not just a underscore with a rollover count like server-side traces.  However, it will start how it’s written here and you can use wildcards to say which files you want to read.

Now when you restarted the server, or just SQL services, with server-side traces you’d have to manually start that trace again if you wanted it to run all the time. This is exactly how the script I have here works, too. However, Extended Events also added “Startup State” which means it will start when the server starts. The only issue I have with this is that it will ONLY automatically start when the services start. That’s all fine and good if you’re a patient person….it’s not all fine and good for me, so I manually start it when it’s created.

The other minor details I can see having questions on are on the Max Dispatch Latency and Event Retention Mode. These are limits on how it writes to the file location I’m using. Max Dispatch means that SQL Server will write it to the output file within that many seconds after the event actually happens, so it’s obviously an asynchronous action. Event Retention Mode can give SQL Server permission to say it’s too overworked to write the entries and skip them if needed, and I chose to tell SQL Server to drop as many as it wants to if I’d be slowing it down.  It’s all documented in BOL, and it’d be good for you to read through that instead of just listening to me picking out select details.

So, how do you read this? It’s going to put it all in XML, isn’t it!?!?

If you can read this, Kendra Little (b|t) may suggest you’re a demon robot! (While her comment makes me laugh, that post is a great XE reference or it wouldn’t be on here.) Well, she’s probably right in calling this next script names because friendly robots don’t treat your CPU like this. In fact, if you’re on a stressed server or want to do a lot of analysis on this, it’s going to be best to copy the .xel files to a server you care a little less about and shred the XML there.

Unfortunately, I wrote my script to automatically find the files and add a wildcard character, so you’d have to modify my script to run it on another server. The point is getting you started with as little frustration as possible, and this is perfect for that reason. The biggest catch is that the session has to be running for this to work without changes because it’s grabbing the full file name with path from the DMVs for running sessions.

Anyways, here goes….

DECLARE 
	@SessionName SysName 
	, @TopCount Int = 1000

--SELECT @SessionName = 'UserErrors'
SELECT @SessionName = 'DurationOver5Sec'
--SELECT @SessionName = 'system_health'
/* 
SELECT * FROM sys.traces

SELECT  Session_Name = s.name, s.blocked_event_fire_time, s.dropped_buffer_count, s.dropped_event_count, s.pending_buffers
FROM sys.dm_xe_session_targets t
	INNER JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address
WHERE target_name = 'event_file'
--*/

SET STATISTICS IO, TIME ON

IF OBJECT_ID('tempdb..#Events') IS NOT NULL BEGIN
	DROP TABLE #Events
END

IF OBJECT_ID('tempdb..#Queries') IS NOT NULL BEGIN
	DROP TABLE #Queries 
END

DECLARE @Target_File NVarChar(1000)
	, @Target_Dir NVarChar(1000)
	, @Target_File_WildCard NVarChar(1000)

SELECT @Target_File = CAST(t.target_data as XML).value('EventFileTarget[1]/File[1]/@name', 'NVARCHAR(256)')
FROM sys.dm_xe_session_targets t
	INNER JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address
WHERE s.name = @SessionName
	AND t.target_name = 'event_file'

SELECT @Target_Dir = LEFT(@Target_File, Len(@Target_File) - CHARINDEX('\', REVERSE(@Target_File))) 

SELECT @Target_File_WildCard = @Target_Dir + '\'  + @SessionName + '_*.xel'

--SELECT @Target_File_WildCard

SELECT TOP (@TopCount) CAST(event_data AS XML) AS event_data_XML
INTO #Events
FROM sys.fn_xe_file_target_read_file(@Target_File_WildCard, null, null, null) AS F
ORDER BY File_name DESC
	, file_offset DESC 

SELECT  EventType = event_data_XML.value('(event/@name)[1]', 'varchar(50)')
	, Duration_sec = CAST(event_data_XML.value ('(/event/data[@name=''duration'']/value)[1]', 'BIGINT')/CASE WHEN event_data_XML.value('(event/@name)[1]', 'varchar(50)') LIKE 'wait%' THEN 1000.0 ELSE 1000000.0 END as DEC(20,3)) 
	, CPU_sec = CAST(event_data_XML.value ('(/event/data[@name=''cpu_time'']/value)[1]', 'BIGINT')/1000000.0 as DEC(20,3))
	, physical_reads_k = CAST(event_data_XML.value ('(/event/data  [@name=''physical_reads'']/value)[1]', 'BIGINT')/1000.0 as DEC(20,3))
	, logical_reads_k = CAST(event_data_XML.value ('(/event/data  [@name=''logical_reads'']/value)[1]', 'BIGINT') /1000.0 as DEC(20,3))
	, writes_k = CAST(event_data_XML.value ('(/event/data  [@name=''writes'']/value)[1]', 'BIGINT')/1000.0 as DEC(20,3))
	, row_count = event_data_XML.value ('(/event/data  [@name=''row_count'']/value)[1]', 'BIGINT')
	, Statement_Text = ISNULL(event_data_XML.value ('(/event/data  [@name=''statement'']/value)[1]', 'NVARCHAR(4000)'), event_data_XML.value ('(/event/data  [@name=''batch_text''     ]/value)[1]', 'NVARCHAR(4000)')) 
	, TimeStamp = DateAdd(Hour, DateDiff(Hour, GetUTCDate(), GetDate()) , CAST(event_data_XML.value('(event/@timestamp)[1]', 'varchar(50)') as DateTime2))
	, SPID = event_data_XML.value ('(/event/action  [@name=''session_id'']/value)[1]', 'BIGINT')
	, Username = event_data_XML.value ('(/event/action  [@name=''server_principal_name'']/value)[1]', 'NVARCHAR(256)')
	, Database_Name = DB_Name(event_data_XML.value ('(/event/action  [@name=''database_id'']/value)[1]', 'BIGINT'))
	, client_app_name = event_data_XML.value ('(/event/action  [@name=''client_app_name'']/value)[1]', 'NVARCHAR(256)')
	, client_hostname = event_data_XML.value ('(/event/action  [@name=''client_hostname'']/value)[1]', 'NVARCHAR(256)')
	, result = ISNULL(event_data_XML.value('(/event/data  [@name=''result'']/text)[1]', 'NVARCHAR(256)'),event_data_XML.value('(/event/data  [@name=''message'']/value)[1]', 'NVARCHAR(256)'))
	, Error = event_data_XML.value ('(/event/data  [@name=''error_number'']/value)[1]', 'BIGINT')
	, Severity = event_data_XML.value ('(/event/data  [@name=''severity'']/value)[1]', 'BIGINT')
	, EventDetails = event_data_XML 
INTO #Queries
FROM #Events

SELECT q.EventType
	, q.Duration_sec
	, q.CPU_sec
	, q.physical_reads_k
	, q.logical_reads_k
	, q.writes_k
	, q.row_count
	, q.Statement_Text
	, q.TimeStamp
	, q.SPID
	, q.Username
	, q.Database_Name
	, client_app_name = CASE LEFT(q.client_app_name, 29)
					WHEN 'SQLAgent - TSQL JobStep (Job '
						THEN 'SQLAgent Job: ' + (SELECT name FROM msdb..sysjobs sj WHERE substring(q.client_app_name,32,32)=(substring(sys.fn_varbintohexstr(sj.job_id),3,100))) + ' - ' + SUBSTRING(q.client_app_name, 67, len(q.client_app_name)-67)
					ELSE q.client_app_name
					END  
	, q.client_hostname
	, q.result
	, q.Error
	, q.Severity
	, q.EventDetails
FROM #Queries q
--WHERE eventtype NOT IN /*rather typical filtering*/ ('security_error_ring_buffer_recorded', 'sp_server_diagnostics_component_result', 'scheduler_monitor_system_health_ring_buffer_record')
	--AND eventtype NOT IN /*specific troubleshooting filtering*/ ('connectivity_ring_buffer_recorded', 'wait_info')
ORDER BY TimeStamp DESC 

The top is all I typically edit. The Session Name is ‘DurationOver5Sec’ for this purpose, but as you can see it can also read some of the information from the system_health session. The TopCount is just saying I’m only going to look at the most recent 1,000 records BEFORE I shred and filter it, mostly because I love my CPU and this changes how much XML I’m throwing at it.

No matter how I write this to filter before playing with XML or try to be 100% safe, my scripts will not be perfect for you.  In fact they may crash your server. Unfortunately, I’m only half kidding, because nothing is 100%. The space is limited to 50 MB, the overhead of writing to this file should be fine since we’re only working with events that took over 5 seconds asynchronously with data loss allowed, so I don’t foresee any issues in most cases unless you change these numbers. However, reading them gets expensive and you have to use the most caution here.  Keep in mind that you can put the .xel files on any server with SQL installed, even developer edition, and query it.

Why’d I write this and what else is going through my head?

The inspiration for this post actually came from a question a former coworker asked me quite a while ago, just seeing if I knew a lot about Extended Events.  At the time the technically correct answer was “I don’t know”, but instead I told him that although I didn’t have experience with Extended Events that he could look at Jonathan Kehayias’s (b|t) blog, specifically his XEvent A Day Series.  However, I also warned him that Jonathan’s work tends to go deeper than a beginner is prepared for and that he should start with his coworker, Erin Stellato’s (b|t), version that is easier to start with, specifically her Making the Leap From Profilers to Extended Events session she made available on YouTube. This actually inspired two blog posts, I just haven’t written the How to Say “I don’t know” Better post yet.

More Links

It is so wrong that I originally posted this without linking to Erin Stellato’s Extended Events Stairway Series on SQL Server Central.  There, problem is fixed.

Have fun, experiment (with queries), and ask questions.

Advertisements

Wait Stats – Monitoring and Using

sys.dm_os_wait_stats is one of the most important DMVs out there, and one that you should know the historical values of for every server you care about. This DMV will tell you the cumulative amount of time you waited for each wait type since SQL Server services started, which is nice to know, but limited in its usefulness as-is.

If, however, you knew what the values were 30 minutes ago and what they are now then you could know why recent queries were waiting. Take it a step further and know what you were waiting for in the same 30 minute period yesterday and you have a bit of a baseline to go off of. Now you can compare what you’re waiting for now to your baseline so you can say if what you’re seeing now is just normal for the server or if you’re seeing a true incident. Remember, the only way to know if numbers on a server are a problem are not is to know what’s normal for that specific server.

A half-hour increment is a bit much for me, and I would prefer to know it every 5 minutes to give a little more detail. Also, just looking at yesterday’s stats for comparison isn’t good enough for me, especially on when looking into an incident on a Monday where Sunday isn’t a very good baseline. Because of that, I want to look at the baseline data from the same time period yesterday and exactly a week ago.

To capture the data you need to capture a snapshot of sys.dm_os_wait_stats along with the time you grabbed it. Then grab it again, do a diff between the two values, and save that off with the starting and ending times. Keep this running every so many minutes in a SQL job and you have a running baseline to compare what you’re seeing now to what the server is typically doing.

After that, keep the data trimmed down to what you’d actually use. Personally, I’m a huge fan of 13 months. This running every 5 minutes for 13 months will accumulate about 750 MB of data, and you can adjust these numbers as you see fit.

Here are the tables; I comment out stuff if I would cry if I accidently ran it at the wrong time.

/*
IF OBJECT_ID('WaitStats') IS NOT NULL BEGIN
    DROP TABLE WaitStats
END

IF OBJECT_ID('WaitStatsLast') IS NOT NULL BEGIN
    DROP TABLE WaitStatsLast
END
*/
GO

CREATE TABLE WaitStats (
    DateStart datetime
    , DateEnd datetime
    , wait_type nvarchar(60)
    , waiting_tasks_count bigint
    , wait_time_ms bigint
    , max_wait_time_ms bigint
    , signal_wait_time_ms bigint
)
GO

CREATE CLUSTERED INDEX IX_WaitStats_DateStart_waittype_U_C ON WaitStats
(
    DateStart
    , wait_type
) WITH (Fillfactor = 95)
GO

CREATE TABLE WaitStatsLast (
    DateAdded datetime
    , wait_type nvarchar(60)
    , waiting_tasks_count bigint
    , wait_time_ms bigint
    , max_wait_time_ms bigint
    , signal_wait_time_ms bigint
)
GO

CREATE CLUSTERED INDEX IX_WaitStatsLast_waittype_U_C ON WaitStatsLast
(
    wait_type
) WITH (Fillfactor = 95)
GO

Notice the lack of creativity. It is very intentional. The code is more stable as I’m copying the field sizes directly from the DMV, and the data is more legible to an outsider (new employee, Microsoft engineer, consultant, vendor) because the fields are exactly the same as the DMV they’re used to seeing. I’m not a fan of non-creative solutions, so there’s a reason to go with it when I decide to be boring.

Now to populate the tables. This script will need to be ran on a regular schedule. Preferably, throw it in a proc and run that proc every 5 minutes, but that’s just my opinion in a discussion about how to watch your servers.

IF OBJECT_ID('tempdb..#WaitStatsNew') IS NOT NULL BEGIN
    DROP TABLE #WaitStatsNew 
END

SELECT DateAdded = GETDATE()
    , wait_type
    , waiting_tasks_count 
    , wait_time_ms
    , max_wait_time_ms
    , signal_wait_time_ms 
INTO #WaitStatsNew
FROM sys.dm_os_wait_stats

INSERT INTO WaitStats (DateStart, DateEnd, wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms, signal_wait_time_ms)
SELECT DateStart = ISNULL(l.DateAdded, (SELECT create_date FROM sys.databases WHERE name = 'tempdb'))
    , DateEnd = n.DateAdded
    , wait_type = n.wait_type
    , waiting_tasks_count = n.waiting_tasks_count - ISNULL(l.waiting_tasks_count, 0)
    , wait_time_ms = n.wait_time_ms - ISNULL(l.wait_time_ms, 0)
    , max_wait_time_ms = n.max_wait_time_ms --It's a max, not cumulative
    , signal_wait_time_ms = n.signal_wait_time_ms - ISNULL(l.signal_wait_time_ms, 0)
FROM #WaitStatsNew n
    LEFT OUTER JOIN WaitStatsLast l ON n.wait_type = l.wait_type AND l.DateAdded > (SELECT create_date FROM sys.databases WHERE name = 'tempdb')

TRUNCATE TABLE WaitStatsLast 

INSERT INTO WaitStatsLast (DateAdded, wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms, signal_wait_time_ms)
SELECT DateAdded
    , wait_type
    , waiting_tasks_count
    , wait_time_ms
    , max_wait_time_ms
    , signal_wait_time_ms
FROM #WaitStatsNew 

DROP TABLE #WaitStatsNew

Finally, querying it. Give it a descent time frame, I typically start at 2 hours to get an overall picture without having a single event skew the numbers too much. Then look at your most relevant baseline data – yesterday and 7 days ago during the same timeframe.

DECLARE @dStart datetime
    , @dEnd datetime

SELECT @dStart = DATEADD(Hour, -2, GETDATE())
    , @dEnd = GETDATE()

IF OBJECT_ID('tempdb..#IgnoredWaits') IS NOT NULL BEGIN
    DROP TABLE #IgnoredWaits
END

CREATE TABLE #IgnoredWaits (Wait_Type NVarChar(60) PRIMARY KEY) 
--/*
--Values taken from p25 of "Troubleshooting SQL Server - A Guide for the Accidental DBA" by Jonathan Kehayias and Ted Krueger  
INSERT INTO #IgnoredWaits 
SELECT 'BAD_PAGE_PROCESS'
UNION SELECT 'BROKER_EVENTHANDLER'
UNION SELECT 'BROKER_RECEIVE_WAITFOR'
UNION SELECT 'BROKER_TASK_STOP'
UNION SELECT 'BROKER_TO_FLUSH'
UNION SELECT 'BROKER_TRANSMITTER'
UNION SELECT 'CHECKPOINT_QUEUE'
UNION SELECT 'CLR_AUTO_EVENT'
UNION SELECT 'CLR_MANUAL_EVENT'
UNION SELECT 'DBMIRROR_EVENTS_QUEUE'
UNION SELECT 'DISPATCHER_QUEUE_SEMAPHORE'
UNION SELECT 'FT_IFTS_SCHEDULER_IDLE_WAIT'
UNION SELECT 'FT_IFTSHC_MUTEX'
UNION SELECT 'KSOURCE_WAKEUP'
UNION SELECT 'LAZYWRITER_SLEEP'
UNION SELECT 'LOGMGR_QUEUE'
UNION SELECT 'ONDEMAND_TASK_QUEUE'
UNION SELECT 'PREEMPTIVE_OS_AUTHENTICATIONOPS'
UNION SELECT 'PREEMPTIVE_OS_GETPROCADDRESS'
UNION SELECT 'REQUEST_FOR_DEADLOCK_SEARCH'
UNION SELECT 'RESOURCE_QUEUE'
UNION SELECT 'SLEEP_BPOOL_FLUSH'
UNION SELECT 'SLEEP_SYSTEMTASK'
UNION SELECT 'SLEEP_TASK' 
UNION SELECT 'SQLTRACE_BUFFER_FLUSH'
UNION SELECT 'WAITFOR'
UNION SELECT 'XE_DISPATCHER_JOIN'
UNION SELECT 'XE_DISPATCHER_WAIT'
UNION SELECT 'XE_TIMER_EVENT'

SELECT TOP 7 TimeFrame = 'Right Now'
    , DateStart = MIN(DateStart)
    , DateEnd = MAX(DateEnd)
    , wait_type
    , wait_minutes = SUM(wait_time_ms)/1000/60
FROM WaitStats
WHERE DateStart >= @dStart 
    AND DateEnd <= @dEnd 
    AND Wait_Type NOT IN (SELECT Wait_Type FROM #IgnoredWaits)
GROUP BY wait_type
ORDER BY 5 DESC

SELECT TOP 7 TimeFrame = 'Yesterday'
    , DateStart = MIN(DateStart)
    , DateEnd = MAX(DateEnd)
    , wait_type
    , wait_minutes = SUM(wait_time_ms)/1000/60
FROM WaitStats
WHERE DateStart >= @dStart - 1
    AND DateEnd <= @dEnd - 1
    AND Wait_Type NOT IN (SELECT Wait_Type FROM #IgnoredWaits)
GROUP BY wait_type
ORDER BY 5 DESC

SELECT TOP 7 TimeFrame = 'Last Week'
    , DateStart = MIN(DateStart)
    , DateEnd = MAX(DateEnd)
    , wait_type
    , wait_minutes = SUM(wait_time_ms)/1000/60
FROM WaitStats
WHERE DateStart >= @dStart - 7
    AND DateEnd <= @dEnd - 7
    AND Wait_Type NOT IN (SELECT Wait_Type FROM #IgnoredWaits)
GROUP BY wait_type
ORDER BY 5 DESC

The only thing left to do is wait and interpret the data. All the wait types mean something, with some easily pointing to an issue and others just being normal operations. Doing a dive into what wait types are out there, what they mean to you, and what they mean when they’re not what they were yesterday isn’t something I’m going to get into with this post.  However, there is an Accidental DBA book out there that explains this amazingly well.  Here are links to the FREE PDF Version and the Physical Book.