Database Log VLFs

Virtual Log Files (VLFs) split a physical database log file into smaller segments, which are required for how log files work in the background.  These are created automatically, but automatically doesn’t always mean perfect.  Here’s a practical view of what you need to know and how you can set them up properly.

What VLFs Do

My initial view of how a transaction log worked was that each change was written to a log file, and, as the documentation of those changes was no longer required, it was deleted from the log file.  I think this is a common way to view it, and it’s close to the truth.  However, it’s not close enough for a conversation on VLFs.

What really happens is that changes are written to the first VLF, which is just a segment of the log file.  When that VLF fills up, it moves on to the next VLF in a systematic order.

Each change made to the database is assigned a Log Sequence Number (LSN), and SQL Server keeps track of the oldest LSN it still needs for any purpose.  This purpose can be many things, but is typically the oldest LSN of the following:

  • Oldest LSN on a log backup (full or bulk-logged recovery)
  • LSN at the start of the oldest active transaction
  • LSN last replicated when using certain types of replication

Here’s my extremely complicated script to tell you what that reason is for your database:

SELECT name
	, log_reuse_wait_desc 
FROM sys.databases

If the last LSN in a VLF is older than what you need to keep, that VLF can be reused.  This makes it very efficient because it becomes something like a set-based operation for SQL Server: the entire VLF is cleared and marked for reuse at once instead of line-by-line.

To help you picture how the VLFs work, someone at Microsoft drew you a picture on Transaction Log Physical Architecture.  Then you can look at the results of DBCC LogInfo, and it will make a lot more sense when you see a VLF on each line there along with its status.
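If you want to see this for yourself, here’s a quick sketch; swap in your own database name.  The DMV version only exists on newer builds (SQL Server 2016 SP2 and later, if I remember the version right), so the older DBCC command is included as well.

--One row per VLF; a status of 2 means the VLF is still active and can't be reused yet
DBCC LOGINFO ('DatabaseName')

--Newer builds have a documented DMV with the same information, including the VLF size in MB
SELECT file_id
	, vlf_size_mb
	, vlf_active
	, vlf_status
FROM sys.dm_db_log_info(DB_ID('DatabaseName'))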

Only One Log File, Dedicated Drive

Adding more than one log file on a single database isn’t going to help your performance.  SQL Server is writing to a single VLF at a time, and that single VLF is part of a single log file.  If you have multiple log files then you’re writing to one while the other sits idle.  Once you fill up the last VLF on the first file it starts writing to the second file while the first sits idle.  If these are on separate drives that means each drive has to be able to handle the I/O, but it’s hit or miss if you’re using the I/O which leads to wasted resources and inconsistent performance.
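If you’re not sure whether any of your databases picked up an extra log file somewhere along the way, a quick check like this against sys.master_files will list the offenders:

--Databases with more than one log file
SELECT DatabaseName = DB_NAME(database_id)
	, LogFileCount = COUNT(*)
FROM sys.master_files
WHERE type_desc = 'LOG'
GROUP BY database_id
HAVING COUNT(*) > 1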

As for how you’re writing to the log, most of the work is writing to the tail end of it, so spinning disks do really well just keeping the head in one place and writing to the file.  That is unless you have other types of files on this disk, so the head writes a little to the log, jumps over to write a little to a data file, writes a bit more to the log, then jumps over to read a couple of indexes.  This is a lot of where the recommendation to keep your data and log files on separate disks comes from, but there is more to it than I’ll get into here.

However, we’re going to use RAID for redundancy, making the drives too large for just a log; then we’ll put it on a SAN with a write cache and do so many other things to make it more complicated.  If a server is extremely stressed and highly critical, look into dedicated spindles, SSDs, or other options.  On general shared disk arrays, it’s nice if you can have an array dedicated to just log files, even if it’s log files for multiple servers.

VLF Size Matters

When you’re writing to VLFs there can be two issues.  First, the VLFs are very small and SQL Server is jumping all over the place maintaining small files and figuring out where it should write the next entry – this is common and a big performance hit.  Second, the VLFs are too large and SQL Server is reserving a ton of space when only a small part of the tail end of the VLF is in use, then trying to clear it all at once – this is less common and less of an issue.

In addition to this, VLFs each need to be processed when restoring or recovering a database.  Recovering is part of restarting SQL services, so you should be hitting this issue at least once a month with your Windows Updates.  Every VLF adds some overhead to this process, and a huge number of them lengthens it considerably.

Details of this are on the Microsoft Customer Service and Support blog post How a log file structure can affect database recovery time, and it includes this eye-opening quote, “The first phase of recovering a database is called discovery where all the VLFs are scanned (in serial and single threaded fashion) before actual recovery starts.”

Are your log files pre-sized to take up the entire drive even though they only ever use 5% of that space?  Is that hurting you on recovery?

How Big Are My VLFs?

I have a script for that.  This is set to filter only logs that I want to look at, but you can comment out the WHERE clause on the final statement to see it all.  Also, it would be extremely rare for me to look at individual VLFs, so this is only looking at the sums and averages for each file.

I used to manually comment out a line on this depending on the version of SQL Server until I read a post on VLFs by Andy Galbraith (b|t) Counting Your VLFs, or, Temp Tables Inside IF…ELSE Blocks.  Thank you, Andy.  The timing of your post was perfect to help me make this easier for everyone.

CREATE TABLE #VLF_temp 
(
	RecoveryUnitID int
	, FileID varchar(3) 
	, FileSize numeric(20,0)
	, StartOffset bigint 
	, FSeqNo bigint 
	, Status char(1)
	, Parity varchar(4) 
	, CreateLSN numeric(25,0)
)

CREATE TABLE #VLF_db_total_temp
(
	DatabaseName sysname 
	, LogiFilename sysname
	, PhysFileName sysname
	, AVG_VLF_Size_MB DECIMAL(12,2)
	, vlf_count int
	, log_size_mb FLOAT
	, log_growth_mb FLOAT
)

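--DBCC LOGINFO added the RecoveryUnitID column in SQL 2012, so drop it from the temp table on older versions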
IF (SELECT LEFT(cast(serverproperty('ProductVersion') as varchar),2)) in ('8.','9.','10') BEGIN
	ALTER TABLE #VLF_temp DROP COLUMN RecoveryUnitID
END

DECLARE db_cursor CURSOR READ_ONLY FOR 
SELECT name FROM sys.databases
WHERE State = 0

DECLARE @name sysname, @stmt varchar(40)

OPEN db_cursor

FETCH NEXT FROM db_cursor INTO @name

WHILE (@@fetch_status <> -1) BEGIN
	IF (@@fetch_status <> -2) BEGIN
		INSERT INTO #VLF_temp
		EXEC ('DBCC LOGINFO ([' + @name + ']) WITH NO_INFOMSGS')

		INSERT INTO #VLF_db_total_temp (DatabaseName, LogiFilename, PhysFileName, vlf_count, AVG_VLF_Size_MB, log_size_mb, log_growth_mb)
		SELECT DatabaseName = @name
			, LogiFilename = mf.name
			, PhysFileName = mf.physical_name
			, vlf_count = COUNT(*)
			, AVG_VLF_Size_MB = (AVG(FileSize)/1024)/1024
			, log_size_mb = (mf.size * 8)/1024 
			, log_growth_mb = CASE mf.is_percent_growth
									WHEN 1 THEN (mf.size * 8)/1024 * mf.growth/100
									WHEN 0 THEN  (mf.growth * 8)/1024 
									END
		FROM #VLF_temp vt
			INNER JOIN sys.master_files mf ON mf.database_id = db_id(@name) AND mf.file_id = vt.fileid
		GROUP BY mf.name, mf.physical_name,mf.size,mf.growth,mf.is_percent_growth;           

		TRUNCATE TABLE #VLF_temp           
	END

	FETCH NEXT FROM db_cursor INTO @name
END

CLOSE db_cursor
DEALLOCATE db_cursor

SELECT * 
	, USEDB = 'USE [' + DatabaseName + ']' 
	, CP = 'CHECKPOINT'
	, ShrinkZero = 'DBCC SHRINKFILE (N''' + LogiFileName + ''' , 0)'
	, GrowBack = 'ALTER DATABASE [' + DatabaseName + '] MODIFY FILE ( NAME = N''' + LogiFileName + ''', SIZE = ' + CAST(log_size_mb AS VARCHAR(20)) + 'MB ) --I split this up when over 10,000 MB'
	, ChangeGrowth = CASE --Note, never gets up to the 8000 MB I recommend for manual growths, but still 250 MB VLFs
						WHEN DatabaseName = 'tempdb' THEN ''
						WHEN log_size_mb <= 128 THEN 'ALTER DATABASE [' + DatabaseName + '] MODIFY FILE ( NAME = N''' + LogiFileName + ''', FILEGROWTH = 32MB )'
						WHEN log_size_mb <= 512 THEN 'ALTER DATABASE [' + DatabaseName + '] MODIFY FILE ( NAME = N''' + LogiFileName + ''', FILEGROWTH = 128MB )'
						WHEN log_size_mb <= 4000 THEN 'ALTER DATABASE [' + DatabaseName + '] MODIFY FILE ( NAME = N''' + LogiFileName + ''', FILEGROWTH = 1000MB )'
						WHEN log_size_mb <= 16000 THEN 'ALTER DATABASE [' + DatabaseName + '] MODIFY FILE ( NAME = N''' + LogiFileName + ''', FILEGROWTH = 1000MB )'
						ELSE 'ALTER DATABASE [' + DatabaseName + '] MODIFY FILE ( NAME = N''' + LogiFileName + ''', FILEGROWTH = 1000MB )'
						END
FROM #VLF_db_total_temp
WHERE (vlf_count > 50 
		AND Avg_VLF_Size_MB < vlf_count / 3) 
	OR (vlf_count > 250
		AND AVG_VLF_Size_MB < vlf_count / 2)
ORDER BY vlf_count DESC 

DROP TABLE #VLF_temp 
DROP TABLE #VLF_db_total_temp

What’s the Right Size?

The most common thing you’ll see on the internet is people talking about having too many VLFs, with the limit being 1,000 VLFs and ideally 50 or fewer.  I’m a little different and focus more on size than number.  The concept I’m going for is that excessive context switching is what causes me grief, and a reasonable size VLF will keep that under control.

Now there’s no magic number that will work noticeably better in all cases, so I won’t give you static numbers and say they’re magic.  What I can say is that too small is basically fragmentation with the overhead of context switching, too many increases recovery time, too big means huge chunks, and too few could cause extra growths as unused space is still being reserved.

It’s easy to visualize how smaller and excessive numbers of VLFs can cause issues, but that doesn’t mean huge VLFs are the way to go.  Tammy Richter Jones (b) gets into details with her post If > 50 VLFs are bad, then 16 is good, right?.  I’d love to see more on the effects of huge VLFs to have a more complete understanding of whether a 1 TB log file is better off with 50 20 GB VLFs or 2,000 500 MB VLFs.  I’m not convinced that >50 is bad.

Personally, I like formulas a lot more than static numbers, especially since they allow me to avoid extremes without stressing out.  The formulas I used broke it down into three categories:

  1. Below 50 VLFs, this is a reasonable number, they get a free ride no matter how big they are.
  2. Between 50 and 250 VLFs, the average size in MB has to be at least 1/3 of the count of VLFs.
    • For a file with 100 VLFs, the average VLF has to be over 33 MB.
  3. Over 250 VLFs, the average size in MB has to be at least 1/2 of the count of VLFs.
    • For a file with 500 VLFs, the average VLF has to be over 250 MB.

This is good enough for almost any database.  I don’t care about there being a couple small VLFs mixed in there because I’ll spend most of my time in the large ones.  I don’t care about the numbers getting larger because the average VLF is large enough to avoid too much context switching.  You’re not out to micromanage, so take a relaxed approach to this unless you have a reason to not be relaxed.

What’s Default?

Remember how I said there were no magic numbers?  Well, that doesn’t mean there aren’t wrong numbers.

By default, every database is based off of model, which has a 1 MB log file growing in 10% increments.  So 1/10th of a MB for the first growth, which is just a tiny fragment.  If you leave it this way on large databases you could end up with a 1 TB log trying to grow 100 GB at a time, which your users will notice.

By default, you get the worst of both worlds!  The goal here is to avoid extremes, so you’ll want to change the growth rates away from the default.

How Do I Change the Size?

There’s only one way to change the size of your VLFs.  Delete them and recreate them.

For every database EXCEPT TempDB, this means shrinking the logs (it deletes VLFs to get to the size you want), then growing the logs again.  Both manual and automatic growths will split the new physical space into VLFs, but how many VLFs that space is split into depends on your version of SQL Server.

Here are the growth rates I pulled from Paul Randal’s post Important change to VLF creation algorithm in SQL Server 2014.

SQL 2012 and earlier uses this formula:

  • Growth <= 64 MB: 4 VLFs
  • Growth > 64 MB and <= 1 GB: 8 VLFs
  • Growth > 1 GB: 16 VLFs

SQL 2014+ uses this formula:

  • Is the growth size less than 1/8 of the current log size?
    • Yes: create 1 new VLF equal to the growth size
    • No: use the formula above

So if you have too many VLFs you can shrink your log file.  It will delete VLFs that are not in use to make it smaller.  Then grow the file in increments that give you VLFs sized how you want them.  If you’re going for 500 MB VLFs then you grow your file 8,000 MB at a time.

The VLFs that weren’t deleted in this process, because they were in use or you didn’t shrink the file as far as you could have, will not be affected.  This means you’ll have different sized VLFs throughout your file, but that doesn’t really matter.  What does matter is that you don’t have any ridiculously large VLFs and you’re spending most of your time in properly sized VLFs.

Here’s how I do it.  Shrink it as much as possible.  Shrinking is never as easy as it should be, so verify it shrank by a significant amount.  If it didn’t, take log backups and try again, repeating this process a handful of times if needed.  Then, if I want the log to be 32,000 MB, I just grow it by 8,000 MB 4 times.

--Run a log backup before this if database is not in simple and log backups run less frequently than every 5 minutes.
USE [DatabaseName]
GO
CHECKPOINT
GO
DBCC SHRINKFILE (N'LogicalFileName_Log' , 0)
GO
USE [master]
GO
ALTER DATABASE [DatabaseName] MODIFY FILE ( NAME = N'LogicalFileName_Log', SIZE = 8000MB )
ALTER DATABASE [DatabaseName] MODIFY FILE ( NAME = N'LogicalFileName_Log', SIZE = 16000MB )
ALTER DATABASE [DatabaseName] MODIFY FILE ( NAME = N'LogicalFileName_Log', SIZE = 24000MB )
ALTER DATABASE [DatabaseName] MODIFY FILE ( NAME = N'LogicalFileName_Log', SIZE = 32000MB )
GO

Assuming you were able to shrink this down to 10 VLFs remaining and you want it to be 32,000 MB, this would give you 74 VLFs with all new ones being 500 MB.  It goes over the 50 VLF limit I’ve seen elsewhere, but I’m very happy with these sizes and numbers.

With 8,000 MB growths the new 2014 calculation won’t kick in until the log is already 64,000 MB, and at that point an 8,000 MB VLF probably isn’t a bad thing.

Autogrowth Rates

I recommend manually growing larger log files between 8,000 and 16,000 MB at a time, but not relying on autogrowths of that size.  The process that causes a file to grow, and any other process that needs that space, will sit around and wait for an autogrowth to complete.  That could take a minute on decent disks, which is longer than many timeouts I’ve worked with.

Manually growing files is best, but almost every database will rely on autogrowth.  Know your timeouts, know how long it takes to grow a file, and size it accordingly.  I like to do 1,000 MB here, but will do less if I have to.  If you can’t make a decent-sized growth based on this, be more paranoid about manually growing files while leaving autogrowth set to smaller sizes.

The reason for 1,000 MB is that this is the largest size that will still give you 8 VLFs instead of being split into 16 VLFs, with the actual limit being at 1,024 MB.  If you did 2,000 MB, you’d still end up with 125 MB VLFs.  You can get that up to 250 MB each at 4,000 MB, but you’re telling your users to wait while 4,000 MB is zeroed out before they can move on.  Although the calculation changes for larger logs on SQL 2014, I’d still stick with 1,000 MB for a good autogrowth size.
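If you want to see which log files are still sitting on percent growth or tiny fixed growths, here’s a quick sketch reading sys.master_files.  Growth is stored in 8 KB pages unless it’s a percentage, so the math below converts it to MB:

SELECT DatabaseName = DB_NAME(database_id)
	, LogicalName = name
	, Growth_MB = CASE is_percent_growth WHEN 0 THEN growth / 128 END
	, Growth_Percent = CASE is_percent_growth WHEN 1 THEN growth END
FROM sys.master_files
WHERE type_desc = 'LOG'
ORDER BY DatabaseName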

Note, there is a resolved bug that would cause issues when log files grew in 4 GB increments, so you’ll see a lot of people using 8,000 MB instead of 8,192 MB.  We aren’t using an affected patch level anymore, but DBAs are made to be paranoid.  We’re not just lazy at math, although this doesn’t rule out that possibility.

TempDB

TempDB is different.  The database is recreated every time the SQL service restarts, so the log is recreated as well, following the formulas above.  If the log is over 1 GB, it starts out with 16 VLFs.  Jonathan Kehayias got into the details in his post TSQL Tuesday #11 – Misconceptions – The TempDB Log File and VLF Counts.

So TempDB never has too many on startup, but is this too few?  What if TempDB’s log is 40 GB, do you want a 2.5 GB VLF?  Is setting up TempDB to have 5 log files that are 8,000 MB each so it starts up with 500 MB VLFs a better idea?

Seriously, someone answer this one, I’m curious myself.

Changes Have Risk

The script above creates scripts to make changes.  It’s not perfectly safe to make those changes.  Test it, understand it, and be careful running it.

One of the most important pieces is that the scripts it generates will try to shrink the log as much as possible before growing it again.  In terms of cleaning up VLFs, this is perfect.  In terms of trying to run this during the day, it risks transactions failing because they’re trying to grab the next VLF as you’re running a shrink job to delete all unused VLFs.

You might be able to get away with this during the day most of the time, but it’s best to wait for a maintenance window to be safe.

What Others Did

Linchi Shea showed performance degradation from having too many small VLFs on Performance impact: a large number of virtual log files – Part I.  Updates can take almost 10x as long, and I would assume it’s due to the size and not the number of VLFs.

SQLskills.com has blog posts that read like a series on this.  Kimberly Tripp wrote 8 Steps to better Transaction Log throughput, then followed that up with Transaction Log VLFs too many or too few?, which was updated to link to Paul Randal’s post Important change to VLF creation algorithm in SQL Server 2014.  All are must-reads if you want to really understand this topic.

Tony Rogerson (b|t) at the time of this writing has only put one post on his new blog, Transaction Log Concepts: Part 1.  If this is his typical quality of work, we have a lot to look forward to.  I’m personally looking forward to the other 3 parts of this 4 part series.

Graham Kent (b|t) ran tests on the performance differences between two log files in his post Slow recovery times and slow performance due to large numbers of Virtual Log Files.  Both were 5 GB, but one had 16 VLFs (320 MB each) while the other had 20,480 VLFs (0.25 MB each).  It was good to see transaction performance as well as recovery performance covered.

Test It

If your database is absolutely critical to the point that squeezing an extra couple milliseconds of performance throughout the day would be noticed or if a server coming back online 15 seconds faster would save thousands of dollars, test it.  See how it performs in load tests and service restarts using your test servers and your database.

If you have a 1 TB log file and verified it needs to be about that big, what happens when you size it so you have 2,000 VLFs that are 500 MB each?  What happens when you have 100 VLFs that are 10 GB each?  Did it change recovery, any replication you’re doing, backup times, etc.?

Know that this can make a difference and that this is just another configuration change you can test out.  Crazy numbers on either side can be bad, and perfect numbers in the middle aren’t going to come from some stranger on the internet that never saw your server.

TempDB Excessive Memory Usage Example

I see TempDB using more memory than I feel it should, and I found a way to resolve it.  Previously I dove into the technical details and questioned what I saw in my post TempDB Memory Leak?, which is great for proving and debugging an issue.  Now I want to step back and talk about it from a practical perspective.

To keep it practical, I’m going to walk through a single server, step-by-step: what caught my eye and how I applied my workaround.  This is a real, production server with screen shots from Idera Diagnostic Manager for both before and after comparisons.

What I’m Seeing

A large portion of my buffer pool is being utilized by TempDB pages which are unallocated on disk.  The easiest way to see this is this quick query:

SELECT Unallocated_MB = COUNT(1)/128
FROM sys.dm_os_buffer_descriptors bd
	LEFT JOIN sys.allocation_units au ON bd.allocation_unit_id = au.allocation_unit_id
WHERE bd.database_id = 2
	AND au.allocation_unit_id IS NULL

The details of everything in memory can be seen on my post Query the Buffer Pool.

Since these are all pointing to real, but unallocated TempDB pages on disk, I was able to develop a workaround of simply having fewer TempDB pages on disk.  AKA, I made TempDB data files smaller.

Sizing TempDB

Here we’re talking about being hurt by TempDB being too large, but we can’t lose sight of TempDB being too small hurting you as well.  Specifically, when your server is the busiest with a small TempDB it will have to stop and grow the files, then deal with any fragmentation that process caused.  So, while I’m not going to recommend filling your dedicated drive, I’m not going to recommend making TempDB tiny, either.

My recommendation is to monitor your server, find out how much space TempDB uses, and make your TempDB significantly larger than that.  The longer you have watched your server, the better.  The goal is still to avoid almost any growth possible, but I’m adding in a “don’t go crazy” clause.

You can do this yourself as I talked about in Monitoring Database and Table Sizes using the function call FILEPROPERTY(<filename>, 'SpaceUsed').  If you do it this way, make sure you have this running for a minimum of several weeks at least once an hour, day and night.  It’s possible and not difficult to do, but not recommended for servers that already have a reasonable monitoring system in place.
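If you do go the do-it-yourself route, the core of each capture is something like this bare-bones sketch; you’d insert the results into a logging table on a schedule:

USE tempdb
GO
--Space allocated vs space actually used per data file, in MB
SELECT name
	, Size_MB = size / 128
	, SpaceUsed_MB = CAST(FILEPROPERTY(name, 'SpaceUsed') AS INT) / 128
FROM sys.database_files
WHERE type_desc = 'ROWS'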

If you’re doing this to your production servers you should already have monitoring software that’s watching file sizes for you, even if it’s only tracking this in the background with no obvious way to find the information in the GUI.  This is the situation I find myself in using Idera Diagnostic Manager.  The data’s there, but the GUI is set up for more immediate concerns.  I need this query to get what I want out of it.

SELECT TOP 100 S.InstanceName
	, DS.UTCCollectionDateTime
	, D.DatabaseName 
	, DataUsed_GB = CAST((DataSizeInKilobytes + IndexSizeInKilobytes) / 1024 / 1024.0 AS DEC(20,1)) 
	, LogUsed_GB = CAST(LogSizeInKilobytes / 1024 / 1024.0 AS DEC(20,1))
FROM SQLdmRepository.dbo.DatabaseSize DS
	INNER JOIN SQLdmRepository.dbo.SQLServerDatabaseNames D ON DS.DatabaseID = D.DatabaseID
	INNER JOIN SQLdmRepository.dbo.MonitoredSQLServers S ON S.SQLServerID = D.SQLServerID 
WHERE UPPER(S.InstanceName) = 'SERVER\INSTANCE'  
	AND D.DatabaseName = 'tempdb'
	AND UTCCollectionDateTime > GetUTCDate()-180
ORDER BY 4 DESC

My recommendation is to take the peak size outside of a one-time process and make the total TempDB data files at least 150% of that size.  If you have 4 TempDB data files and the peak size used is 4 GB, 150% of that is 6 GB, so make each file at least 1.5 GB.  Then set autogrowth to something reasonable because one-time processes will happen.

My Before Observations

All of the screen shots and trending information are from Idera Diagnostic Manager, but you should be able to get the same basic information out of any monitoring software.

I had a server with rather low PLE and a lot of physical I/O.

Server_Oversized_TempDB

The physical I/O is difficult to see here because this server has a lot of writes.  When you look closer you can see that it rather regularly went to 400 page reads per second.

PhysicalIO_Oversize_TempDB

All of those reads caused lots of waits on the server.  In my 12-hour sample period I saw a total of 34,000.  The units shown for this field are fine for comparison reasons, but they appear to be a total of the ms/s on the chart for values collected every 6 minutes, not the total waits.  You have to query the tables behind Idera Diagnostic Manager to get the real values.  In this case, the total waits were 138 minutes for shared and 49 for exclusive latches, for a total of 187 minutes of waiting for the 12 hours.

Waits_Oversize_TempDB

Just seeing this I wanted to request more memory on the VM, after all it only had 16 GB.  However, I made it a rule to never add memory without justifying what’s in memory first.  To do that, I used my query on Query the Buffer Pool, and this was the first line in the results:

Cache_Oversized_TempDB

Of the 8.5 GB of memory used by the buffer pool, about 6 GB of that was used by unallocated space in TempDB.  I can accept some because of how TempDB caches temp tables and other stuff, but to have 70% of the buffer pool used up while I’m seeing low PLE and high I/O is not acceptable.

My Fix

I had the files set to practically fill the dedicated drive according to best practices, so each of the 4 files was 7,000 MB adding up to almost 28 GB on a 30 GB drive.

Using the query in the Sizing TempDB section above, this server has never used more than 1 GB of data in TempDB at one time.  While that makes for some interesting questions on how it’s using 6 GB of cache, it does give me a lot of room to reduce the size.

There were no complaints about performance due to what this server is being used for, so I didn’t try to shrink TempDB.  Instead I took the patient approach and resized TempDB, which took effect the next time SQL Services restarted.  Then I waited for our scheduled server reboots for Windows Updates.

ALTER DATABASE [tempdb] MODIFY FILE ( NAME = N'tempdev',SIZE = 512MB , FILEGROWTH = 128MB )
ALTER DATABASE [tempdb] MODIFY FILE ( NAME = N'tempdev2',SIZE = 512MB , FILEGROWTH = 128MB )
ALTER DATABASE [tempdb] MODIFY FILE ( NAME = N'tempdev3',SIZE = 512MB , FILEGROWTH = 128MB )
ALTER DATABASE [tempdb] MODIFY FILE ( NAME = N'tempdev4',SIZE = 512MB , FILEGROWTH = 128MB )

Note that my recommendation above would have been to make the total at least 1.5 GB, and I made it 2 GB here.  I’m not worried about exact numbers, I’m worried about reasonable sizes that also avoid growths.  With peak observed usage at 1 GB and TempDB sized at 2 GB, any growths should be very rare.

Overall, the data files will grow at 512 MB at a time, which, when it happens, is large enough to avoid extreme fragmentation yet small enough to avoid extreme waits.  This should only happen during “I didn’t mean to do that” or “how’d it get that execution plan” moments, but it will happen.

My After Observations

After the change I waited a couple days to let the cache stabilize after a reboot.  We rebooted on Saturday, and all of these numbers are from the Tuesday after.

When I looked again, TempDB was using 1/10 of the space it was before for unallocated pages in memory.  It’s not perfect, but it’s much better.

Cache_RightSize_TempDB

TempDB using less cache allowed PLE to be a lot more stable, regularly staying well above 2,000.  Using Jonathan Kehayias’s formula, PLE should be above 1,050 on a server with 14 GB of cache, so this is looking good.

Server_RightSize_TempDB

With data staying in cache more, physical I/O also dropped.  Before, it was normal to see spikes to 400 page reads per second; now you’re seeing about half of that.

PhysicalIO_RightSize_TempDB

Less activity also meant fewer waits.  The couple spikes still went almost as high, but those are things you’ll fix more through tuning than memory management.  The totals at the bottom are what I’m focused on right now: the PageIOLatch_xx waits combined added up to about 11,000 for the day, so about 1/3 of what it was before.

Again, I don’t like how Idera Diagnostic Manager calculates the totals.  When I queried the tables I saw that there was 60 minutes of waits on shared latches and 13 minutes on exclusive latches, for a total of 73 minutes.  This is in comparison to 138 + 49 = 187 minutes before.

Waits_Rightsize_TempDB

The Downside

Even with the new, smaller size, TempDB is a lot larger than it needs to be so it would be very rare to see any growths in the near future.  However, I have to keep in mind that the load on my servers will always be increasing, so I may end up hitting autogrowth down the road.  Then I’ll reboot the servers each month with Windows Updates, it’ll shrink TempDB back down for me, and it will have to grow again the next month.

Because of this, I’ll need to visit every one of my servers from time to time (semi-annually?) to see if this is occurring and increase the size of TempDB if it is.  I’ll use the same script for determining the peak sizes because I have software that’s capturing that already, but I could find other methods to get that information if I needed to.

The only other downside I see is with servers that hit a high peak TempDB usage overnight then never come close to it again during the day.  This workaround won’t do much for those servers.  You need that space overnight, so trying to make the size smaller than what it needs at 2 AM will just make everything worse.  However, knowing about this will drive you nuts enough to look into the overnight processes to see if  you can lower peak usage.  Perhaps you can spread out some concurrent tasks or tune some bad queries that never mattered due to their timing.  It’s not a bad thing to do this work, it’s just that this probably wasn’t a good time for something to climb up your priority list.

Take a Look

Take a look at your servers.  How much cache is TempDB using?  Does TempDB ever even use half of the space allocated to it?  Is it causing issues on your servers?

Never make a change just because you saw someone say it’s a good idea, especially if they’ve never seen your servers before.  Look at the numbers on your servers for yourself, ask questions to understand it, and make the best decision for those specific servers.

If you do make this change, check TempDB before your next reboot.  Did it have to grow?  If it did, you need to change the size manually so it doesn’t have to hit an autogrowth every month.  If you shrunk it down to 2 GB total then it grew to 2.5 GB, consider making it 3.5 GB to avoid future growths while keeping the files reasonably sized.
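One way to check is to compare TempDB’s current file sizes against the startup sizes stored in master; if the current size is larger, an autogrowth happened since the last restart.  A quick sketch:

SELECT mf.name
	, Startup_Size_MB = mf.size / 128
	, Current_Size_MB = df.size / 128
FROM sys.master_files mf
	INNER JOIN tempdb.sys.database_files df ON df.file_id = mf.file_id
WHERE mf.database_id = 2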

If this was an issue on your servers and you feel SQL Server shouldn’t have functioned this way, please visit the Connect Item on this.  An up-vote is great, a comment or anything else is even better.

Tim Ford’s Entry-Level Content Challenge

All bloggers started as beginners.  We picked things up, learned them, and got to the point where we wanted to start sharing our knowledge.  The problem is that most of us moved on to talking about advanced topics and left people to fend for themselves on the difficult entry-level learning curve.  My blog, Simple SQL Server, was actually created with me saying I was going to fill that gap, right up until the time I figured out the best way to learn advanced topics was to write things down as I learned.

Entry-Level Content Challenge

Tim Ford (b|t) noticed how we moved to advanced topics and asked us to remember where we came from in his Entry-Level Content Challenge.

It’s very simple.  Create one blog post a month that someone just learning your job could understand and benefit from, use the #EntryLevel tag on your blog and notification on twitter, and link to his original post.

This is for everyone, I don’t care what job you have.  I’m a SQL Server DBA, Tim’s a SQL Server DBA, but every profession has beginners that could use some help.

We’re Outnumbered

One of the things, if not THE biggest thing, you hope to do with a blog is to make a difference.  Then you talk about advanced topics to help other people near your level keep moving forward.  You also realize that teaching the details on these topics accelerates your learning and confidence faster than you thought possible.  It’s great; I recommend everyone does it.

However, there are a couple problems with this logic.  By the time people feel confident enough to blog, they aren’t talking about the fundamentals anymore.  Also, for every advanced person in any field, there are many, many more entry-level people who could benefit from learning the fundamentals properly from the start.

Bloggers aren’t talking to the largest portion of their potential audience, who just happen to be going through the most difficult part of the learning curve.

It’s the Right Thing To Do

In no specific order, here’s why it’s the right thing to do.  This list contains selfish reasons, moral reasons, and everything in-between.

  • You want to reach a larger audience, and there’s more people who need this content.
  • You were a beginner once and learned off of others.
  • You wish you learned the fundamentals right from the start, but you didn’t always.
  • You learn everything you teach better, even if you consider it simple.
  • Tim challenged you, and it’s hard to say no to a challenge.
  • You always wanted to make a difference.

New Bloggers

This is also a great opportunity for new bloggers.  Talk about subjects you know well, even if you’re not at the level you feel comfortable teaching advanced topics.

The content is the only difficult part of blogging.  Setting up a blog and creating a post is practically identical to setting up a new email account and sending an email.  If you want to get into blogging and need help, let me know.

If Einstein Can Do It…

Einstein said “If you can’t explain it simply, you don’t understand it well enough.”  I took this quote to heart enough to name my blog after it.  Try explaining the fundamentals simply, you’ll see how much your understanding of the topic increases to accomplish that feat.

Querying Deadlocks From System_Health XEvent

It’s easy to query XEvents to see some of the basic info in deadlocks, including the system_health session which is already capturing this information by default.

This script will, by default, read from system_health.  However, change the name of one parameter at the start to have it read from any active session.  I pull the file name automatically, so the session has to be active to avoid many changes to the script.

The Script

DECLARE @SessionName SysName 

SELECT @SessionName = 'system_health'

/* 
SELECT  Session_Name = s.name, s.blocked_event_fire_time, s.dropped_buffer_count, s.dropped_event_count, s.pending_buffers
FROM sys.dm_xe_session_targets t
	INNER JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address
WHERE target_name = 'event_file'
--*/

IF OBJECT_ID('tempdb..#Events') IS NOT NULL BEGIN
	DROP TABLE #Events
END

DECLARE @Target_File NVarChar(1000)
	, @Target_Dir NVarChar(1000)
	, @Target_File_WildCard NVarChar(1000)

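--Pull the event_file path from the active session's target, then build a wildcard to read all of its rollover files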
SELECT @Target_File = CAST(t.target_data as XML).value('EventFileTarget[1]/File[1]/@name', 'NVARCHAR(256)')
FROM sys.dm_xe_session_targets t
	INNER JOIN sys.dm_xe_sessions s ON s.address = t.event_session_address
WHERE s.name = @SessionName
	AND t.target_name = 'event_file'

SELECT @Target_Dir = LEFT(@Target_File, Len(@Target_File) - CHARINDEX('\', REVERSE(@Target_File))) 

SELECT @Target_File_WildCard = @Target_Dir + '\'  + @SessionName + '_*.xel'

--Keep this as a separate table because it's called twice in the next query.  You don't want this running twice.
SELECT DeadlockGraph = CAST(event_data AS XML)
	, DeadlockID = Row_Number() OVER(ORDER BY file_name, file_offset)
INTO #Events
FROM sys.fn_xe_file_target_read_file(@Target_File_WildCard, null, null, null) AS F
WHERE event_data like '<event name="xml_deadlock_report%'

;WITH Victims AS
(
	SELECT VictimID = Deadlock.Victims.value('@id', 'varchar(50)')
		, e.DeadlockID 
	FROM #Events e
		CROSS APPLY e.DeadlockGraph.nodes('/event/data/value/deadlock/victim-list/victimProcess') as Deadlock(Victims)
)
, DeadlockObjects AS
(
	SELECT DISTINCT e.DeadlockID
		, ObjectName = Deadlock.Resources.value('@objectname', 'nvarchar(256)')
	FROM #Events e
		CROSS APPLY e.DeadlockGraph.nodes('/event/data/value/deadlock/resource-list/*') as Deadlock(Resources)
)
SELECT *
FROM
(
	SELECT e.DeadlockID
		, TransactionTime = Deadlock.Process.value('@lasttranstarted', 'datetime')
		, DeadlockGraph
		, DeadlockObjects = substring((SELECT (', ' + o.ObjectName)
							FROM DeadlockObjects o
							WHERE o.DeadlockID = e.DeadlockID
							ORDER BY o.ObjectName
							FOR XML PATH ('')
							), 3, 4000)
		, Victim = CASE WHEN v.VictimID IS NOT NULL 
							THEN 1 
						ELSE 0 
						END
		, SPID = Deadlock.Process.value('@spid', 'int')
		, ProcedureName = Deadlock.Process.value('executionStack[1]/frame[1]/@procname[1]', 'varchar(200)')
		, LockMode = Deadlock.Process.value('@lockMode', 'char(1)')
		, Code = Deadlock.Process.value('executionStack[1]/frame[1]', 'varchar(1000)')
		, ClientApp = CASE LEFT(Deadlock.Process.value('@clientapp', 'varchar(100)'), 29)
						WHEN 'SQLAgent - TSQL JobStep (Job '
							THEN 'SQLAgent Job: ' + (SELECT name FROM msdb..sysjobs sj WHERE substring(Deadlock.Process.value('@clientapp', 'varchar(100)'),32,32)=(substring(sys.fn_varbintohexstr(sj.job_id),3,100))) + ' - ' + SUBSTRING(Deadlock.Process.value('@clientapp', 'varchar(100)'), 67, len(Deadlock.Process.value('@clientapp', 'varchar(100)'))-67)
						ELSE Deadlock.Process.value('@clientapp', 'varchar(100)')
						END 
		, HostName = Deadlock.Process.value('@hostname', 'varchar(20)')
		, LoginName = Deadlock.Process.value('@loginname', 'varchar(20)')
		, InputBuffer = Deadlock.Process.value('inputbuf[1]', 'varchar(1000)')
	FROM #Events e
		CROSS APPLY e.DeadlockGraph.nodes('/event/data/value/deadlock/process-list/process') as Deadlock(Process)
		LEFT JOIN Victims v ON v.DeadlockID = e.DeadlockID AND v.VictimID = Deadlock.Process.value('@id', 'varchar(50)')
) X --In a subquery to make filtering easier (use column names, not XML parsing), no other reason
ORDER BY DeadlockID DESC

System_health session

The system_health XEvent by default is turned on, captures deadlocks, and is limited to four 5MB files.  You can change all three of those, but it’s not common for people to do that.

Yes, you’re limited to more recent events, but the information is there.  If you find the limitations are too much, you have a couple options that should be considered in this order.

  1. Is anything else capturing deadlocks?  For me, I have Idera Diagnostic Manager capturing deadlocks as well for my production servers.  This is my main source of looking at deadlocks, so I actually rarely use system_health for this purpose in production.  The first goal is to avoid doing anything additional.
  2. Consider setting system_health to use bigger files (not too big) or have more rollover files. The more you do the more resources it will take to read it all at once, so don’t go crazy.  However, if you have 5 days of data when you wish you had 7, it’s not horrible to double the size of data you retain.  The second goal is to increase disk usage before you add more overhead.
  3. Consider a separate XEvent session.  Ugh…I wish I didn’t have to throw this out there as an option.  Love your servers, don’t add more overhead if you don’t have to.  The last resort is to not end up here at your last resort.
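For option 2, here’s a minimal sketch of what that change could look like.  The target name is the default one, the sizes are made up, and dropping then re-adding the file target starts a new set of files, so test it somewhere safe first.

ALTER EVENT SESSION [system_health] ON SERVER
	DROP TARGET package0.event_file

ALTER EVENT SESSION [system_health] ON SERVER
	ADD TARGET package0.event_file
	(SET filename = N'system_health.xel', max_file_size = (10), max_rollover_files = (8))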

Idera Diagnostic Manager

At the time of me writing this, I have a script to look at all of the deadlocks in Idera DM that I use quite regularly for my production servers.  However, they just came out with a new major version and I want to make sure my query runs there before posting my script publicly.  If you don’t care and just want the script, email me.

Viewing the Deadlock

The details of this script give you a great overview of the deadlock.  You know the code involved on each side, the tables and applications involved, etc.   It’s enough to look for trends, know if a deadlock was one you expected, and other basics.

However, this is just a summary list of your recent deadlocks, it’s not the details.  For me, I’m very happy just looking at the XML output to see the details.  This is how I look at it from XEvents and my monitoring software.  That makes me odd, and I’m ok with that.

Jonathan Kehayias (b|t) talks about viewing them in SQL Sentry Plan Explorer Pro in his post Graphically Viewing Extended Events Deadlock Graphs.  There’s no single right answer on how to view deadlocks, so do what works best for you.

Troubleshooting Deadlocks

This topic goes well beyond what I’m going for in this post.  Jonathan Kehayias gets into it some in his post The Accidental DBA (Day 29 of 30): Troubleshooting Deadlocks, which is just a start.  The book he co-authored with Ted Krueger (b|t), Troubleshooting SQL Server – A Guide for the Accidental DBA, is available as a free PDF and has an entire chapter on this subject.

Shrinking Database Data Files

Normal DBAs will sum up shrinking a database data file in one word: NO!

I’ll talk to you about this subject because I’m not normal.  Still, my goal is that you’ll go from “why can’t I shrink my files” to “I don’t want to shrink my files.”

Truth be told, even when I HAVE to shrink files, I don’t want to.

Basically, you needed the space at one point in time, so you’ll probably need it again.  Unless you changed a process or something extremely rare happened, you’ll use the space again and there’s no reason to take on the issues involved with shrinking just to have a file grow again.  While this advice is mostly the same for log files, it’s different enough that I’m making a separate post for shrinking database log files.

When should you shrink data files

First, a process changed where you’ll never need that amount of space again, such as the database used to keep 10 years of history but now an automated process keeps it pruned to 6 months.  This change also left you with a very significant amount of free space that you need for another purpose.

Really, that’s about it.

What happens when you shrink data files

The most common way to shrink a file is to have it reorganize pages before releasing free space, so I’ll cover that.  It basically has two steps:

  1. Fragment the file as much as possible (killing performance)
  2. Truncate the end of the file

By “reorganize” it means SQL Server looks at the tail end of the file, past the size you said you’d like the file to be shrunk to, finds every populated page of data after that point, then moves those pages to the first available spot earlier in the file.  When the end of the file is empty, it truncates it.

It’s an expensive process that leaves a big mess behind, but there are times you’ll need to go through with it (kicking and screaming).  If you do, schedule an index maintenance job immediately after this to undo most of the damage you just did and make sure the server still loves you.

Think about this, you’re considering kicking off a process where you have to tell the server you’re sorry afterwards.  You better have a good reason for this.  A 100 GB database with only 2 GB used may be a good reason to shrink.  A 100 GB database with 75 GB used is normal and healthy.

If you are going to shrink, make sure the database won’t be using that space again.  If you have a 100 GB database with only 2 GB used, does it populate, process, and purge 90 GB in an overnight process?  Find out before you shrink.

Manual Shrinking

If I haven’t scared you off yet (and I hope I did) then here’s how you shrink.

Here’s the link how to shrink a file using T-SQL: DBCC ShrinkFile

Here’s the link how to shrink a file using the GUI: SSMS Shrink File
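If you want the T-SQL version instead of clicking through the GUI, a minimal sketch with hypothetical names looks like this; pick a target size that leaves reasonable free space instead of shrinking to 0.

USE [DatabaseName]
GO
--Shrink the data file down to 10,240 MB, reorganizing pages as it goes
DBCC SHRINKFILE (N'DataFileLogicalName' , 10240)
GO
--Follow this up with index maintenance to undo the fragmentation the shrink just caused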

There are Shrink Database options as well.  I don’t use them and don’t recommend them.  You know what files you want to shrink, so don’t shrink them all.  When this is used it’s typically because people want to shrink their data files, but they mess up their log files while they’re at it.  Be nice to yourself, don’t use shrink database.

What about the Auto-Shrink option

No.

Just no.

It’s not a feature, it’s a threat.

If you have a certain amount of free space it will automatically shrink your database causing all the problems I just discussed, but it will probably kick in during your busiest time of day.  Then it won’t follow that up with index maintenance, so it just left you hopping all over your disk to find data.  The best part is that it didn’t check to see if you would need that space again, so it’ll probably grow tomorrow to get that space back.

In the example above of a 100 GB database that only uses 2 GB during the day but populates 90 GB for overnight processing, those 2 GB will be horribly fragmented, you’ll spend the resources shrinking, then you’ll take the performance hit growing the database every night as it puts the growth anywhere it can, which typically means physical file fragmentation as well.  In the meantime your monitoring software will show that you have tons of free space, right up until you get the disk full error.  You can auto-shrink that database and get all those issues, or you can just let it do its job as you do yours without a maintenance nightmare.

The Connect Item on this is marked as fixed, stating that they’ll consider removing it in future versions.  This wasn’t a joke opened by some wannabe DBA.  This was serious and opened by someone who went on to become the president of the Professional Association for SQL Server.

To be fair, we all word our feelings on this differently.  I say no, Tom LaRock says never, Paul Randal says Turn It Off!, and Brent Ozar is colorful enough to come up with the term Hamster Wheel of Death.

As of the time of this writing, no one has used the term “Magical Unicorn” to describe this feature.
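If you want to see where this threat is enabled and script it away, something like this does the trick; the database name in the second statement is hypothetical.

--Find databases with auto-shrink turned on
SELECT name
FROM sys.databases
WHERE is_auto_shrink_on = 1

--Turn it off for a specific database
ALTER DATABASE [DatabaseName] SET AUTO_SHRINK OFF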

Sum it up

  1. Don’t shrink data files.
  2. If you didn’t listen to #1, why?
  3. Follow it up with index maintenance.
  4. Every time you try to automate this process a unicorn dies.

Entry-Level Content Challenge

I’m taking Tim Ford’s (b|t) Entry-Level Content Challenge, and this is my first post with the #EntryLevel tag.  The challenge is to have one post that earns that tag each month, but I hope to exceed that.

SQL Saturday Cleveland #SQLSat473

On February 6, 2016, Cleveland is hosting a free training event for SQL Server.  This has a lot of the great stuff from the big, paid events, and skips some of the negatives.

There’s a great team from the North Ohio SQL Server Users Group that took on a very difficult task to put all of this together.  They hand selected every presentation, being forced to turn away over half of the abstracts and many speakers.  What they came up with was a variety of topics that span the breadth of SQL Server and the experience levels of the professionals working with databases.

What is this SQL Saturday?

For those of you who have never been to a SQL Saturday, here’s how this one works.  Other SQL Saturdays will vary from this slightly, but not too much.

There are 6 training rooms with hour-long presentations running simultaneously in each.  You move from room to room between each session choosing the ones that apply to you the most.  The schedule for the sessions is posted in advance, and it’s impressive.

An optional $10 lunch, which is typically very obvious that no one is making a profit off of, is the closest thing the event has to an entrance fee. The stuff you’d expect to cost you is provided by volunteers and sponsors who pull together the venue, presenters, and community for you.

There’s time to network with vendors, attendees, and speakers before the event and during breaks and lunch, and you often end up in conversations that last through the next session.  This differs from the big events in that it’s much smaller and more personable, with most attendees being relatively local.  The people you meet at a regional event like this are going to be more relevant to your life because they’re part of your local community.

The event at Hyland Software concludes with prizes ranging from $100 gift cards to new laptops being handed out to random attendees.  Yes, you can basically get paid to go to free training, which I still don’t quite understand.

Then there’s the after-party, also provided by the SQL Saturday volunteers and sponsors.  There’s food, fun, and more time to get to know some great people.

The venue can’t hold everyone who should come, and often can’t hold everyone who wants to come.  Register now to make sure there’s room for you, and make sure to cancel if your plans fall through.

What is SQL Saturday?

Not just this event, what is SQL Saturday as a whole?  It’s a program specifically set up to help everyone in the community to develop.

The obvious people benefiting are the attendees who get a chance to see all the sessions and meet the vendors.  You can see how they would benefit from here, but not quite how much until you’re at the event.

The less obvious are the speakers, some speaking publicly for the first time.  As one of the speakers, I can personally say that I’ve grown more as a person and as a professional than I thought possible.  It’s a step I highly recommend, and one I’m very grateful to have taken with SQL Saturday.

The even less obvious are the vendors.  They’re speaking to the public, not just their established customers, getting candid feedback on what works, what doesn’t work, and how they can make their offerings better.  Even if they don’t make a single sale from attending the event, they can always come out ahead.

SQL Saturday didn’t just pop out of nowhere, and someone told me we should give thanks to the people who started it all.

Thank you:

Steve Jones (b|t)

Andy Warren (b|t)

Brian Knight (b|t)

These events are held around the world, with one or more occurring almost every weekend of the year, and most major cities host one annually.  See SQL Saturday’s website for a full list of what’s coming up.  If your city isn’t coming up soon, look in past events because most cities hold it about the same time every year.

Who’s speaking?

Where do I start?  Well, by listing EVERYONE on the current schedule, along with the blogs and twitter handles I’m aware of.

  • Deji Akomolafe (b|t)
  • Jim Arko (b|t)
  • Adam Belebczuk (b|t) <– Organizer
  • Chris Bell (b|t)
  • Delora Bradish (b|t)
  • Mindy Curnutt (b|t)
  • Aaron Cutshall (t)
  • Joey D’Antoni (b|t)
  • David Eldersveld (b|t)
  • Cassandra Faris (t)
  • Kevin Feasel (b|t)
  • Frank Gill (b|t)
  • Amy Herold (b|t)
  • Paul Hiles (t) <– Organizer
  • Steve Hood (b|t) <– What am I doing on a list this awesome?
  • Michael John
  • Jonathan Kehayias (b|t)
  • Dave Mattingly (t)
  • David Maxwell (b|t)
  • Evelyn Maxwell <– bright future ahead for her
  • Eugene Meidinger (b|t)
  • Kon Melamud (t)
  • Ben Miller (b|t)
  • Jeff Moden (b)
  • Colleen Morrow (b|t) <– Organizer
  • Matt Nelson (b|t)
  • Wendy Pastrick (b|t)
  • Pat Phelan (t)
  • Justin Randall (b|t)
  • Wayne Sheffield (b|t)
  • Peter Shore (b|t)
  • Warren Sifre (b|t)
  • Julie Smith (b|t)
  • Erin Stellato (b|t) <– Organizer
  • Michael J. Swart (b|t)
  • Allen White (b|t) <– Organizer
  • Andy Yun (b|t)

Note that the schedule may change some before the event, but it never changes too much.

What do we ask of you?

Show up.

Despite the amazing list of speakers, we’re all volunteers.  That means we do it because we want to do it, and truly enjoy it.  If you want to help us enjoy it even more, be involved.  Talk to us and the other attendees, ask questions, and give feedback.

If you want to absolutely blow us away, get in touch afterwards and let us know the difference it made for you.  Yes, we want this, it will absolutely make our day!  Let the organizers know what you think of the event.  Tell the presenters how you were able to use their advice at work, and even ask questions down the road.  Get in touch with the vendors to discuss their products, even if you aren’t planning a purchase, to give them real feedback and see what’s out there.  I used to think this was imposing, but now I realize this is the one of the most awesome compliments you can get.

The most important part….

Show up!

Hope to see you there!

UPDATE – 2016-01-13 – On the waiting list

If you registered before today, you’re in.  If you’re registering now, you’re on the waiting list.

If you can’t make it, please cancel your registration so someone on the waiting list can get in.  We’ll miss you, but at least there will be someone else we won’t be missing.

TempDB memory leak?

I found a bug where I’m seeing TempDB use more memory than it should on multiple versions of SQL Server, especially on servers set up with common best practices.  There’s a workaround that has a profound effect on server performance, adding to my belief that this is a legitimate bug and leading me to open a Connect Item on the issue.

Querying the Buffer Pool

I have a query to show me what’s in the buffer pool for the entire server, showing the expected results as well as excessive space being used by TempDB.  The newest version showing these details is on my post Querying the Buffer Pool.

It shows number of pages in the buffer pool grouped by the database, table, and index.  The query makes use of left joins so it can see space in memory that’s not currently allocated to a specific object.
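The full query is on that post, but a stripped-down sketch that just groups the buffer pool by database gives you a feel for what it’s doing:

--Pages in the buffer pool grouped by database; the full query joins out to allocation units
--so it can break this down by table and index and show unallocated space
SELECT DatabaseName = CASE bd.database_id WHEN 32767 THEN 'ResourceDb' ELSE DB_NAME(bd.database_id) END
	, Cached_MB = COUNT(1) / 128
FROM sys.dm_os_buffer_descriptors bd
GROUP BY bd.database_id
ORDER BY COUNT(1) DESC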

The results are surprising in many ways.

The good surprises are seeing what indexes are hogging up your buffer pool so you have an idea of where to start tuning.  I’m a huge fan of this and have blogged about it in Cleaning Up the Buffer Pool to Increase PLE, although the name of my older post is misleading because it does more than just help memory management in SQL Server.

The Bug

The bad surprise was a bug which has been harassing me for quite some time now.  As I mentioned, the query will return all the space in the buffer pool, specifically the contents of sys.dm_os_buffer_descriptors, and does a left join to the tables leading up to and including sys.indexes so space not currently allocated to a table will show up.  The problem is that the space that shows up as unallocated for TempDB is much larger than expected, in this case taking up 1/3 of my buffer pool.

QueryBufferPool_TempDB

In this post I’m talking about a single server, but the problem wasn’t limited to one.  It showed up at the same time, caused by the same change (implementing a common best practice), and was partially resolved by the same partial rollback (undoing the best practice) on SQL 2008 R2, SQL 2012, and SQL 2014.

Details About the Bug

So the query I have on yesterday’s post, Querying the Buffer Pool, showed I had unallocated space in TempDB in memory, and a lot of it.  However, it doesn’t show details.

To start looking at the details, what kind of pages are these that exist in sys.dm_os_buffer_descriptors, but not in sys.allocation_units?

SELECT bd.page_type
	, MB = count(1) / 128
FROM sys.dm_os_buffer_descriptors bd
	LEFT JOIN sys.allocation_units au ON bd.allocation_unit_id = au.allocation_unit_id
WHERE bd.database_id = 2 --TempDB
	AND bd.is_modified = 0 --Let's not play dirty, only clean pages
	AND au.allocation_unit_id IS NULL --It's not even allocated
GROUP BY bd.page_type 
ORDER BY 2 DESC

TempDB_BufferPool_Unallocated

Ok, so we’re dealing with typical data in TempDB.  Well, other than it not being allocated, of course.

So I run another query to get more details.  This time I want to look inside the pages to see if they tell a different story.

SELECT TOP 100 bd.*
FROM sys.dm_os_buffer_descriptors bd
	LEFT JOIN sys.allocation_units au ON bd.allocation_unit_id = au.allocation_unit_id
WHERE bd.database_id = 2 --TempDB
	AND bd.is_modified = 0 --Let's not play dirty, only clean pages
	AND au.allocation_unit_id IS NULL --It's not even allocated

TempDB_PageLevel_Unallocated

Then I follow that up with Paul Randal’s How to use DBCC PAGE, which comes with all the disclaimers about using an undocumented and unsupported trace flag and command.  This one isn’t horrible in my mind or Paul’s comments, but remember the undocumented and unsupported parts.

DBCC TRACEON (3604);

DBCC PAGE (2, 5, 502219	, 0)
DBCC PAGE (2, 5, 374929	, 0)
DBCC PAGE (2, 5, 69868	, 0)
DBCC PAGE (2, 5, 453687	, 0)
DBCC PAGE (2, 5, 214988	, 0)
DBCC PAGE (2, 5, 440966	, 0)

DBCC TRACEOFF (3604);

The results all looked about the same to me.

DBCC_Page_Results

There are several important parts to me.  The m_objId is a negative value I can’t find in TempDB.sys.objects, so it WAS a temporary object that no longer exists.  Across the board, these are “NOT ALLOCATED”, “NOT CHANGED”, “NOT MIN_LOGGED”, “0_PCT_FULL”, so there’s nothing there.

To me it looks like temp objects made it into memory and remained in memory after the temporary objects were dropped.  I have no idea what objects these were or how they were dropped, but I’m imagining these were temp tables automatically dropped when the session was either closed or reset.

A Recent Change (A CLUE)

I found this by noticing that PLE for several servers was lower now than it has been in the past, so I was peeking in the buffer pool to see who was playing nice.  Going off of “when did PLE start to be lower” I noticed that I implemented a change around that time to use a common best practice.

That change was presizing TempDB data files to take up a vast majority of the dedicated LUN instead of letting them grow as needed.  It avoids waiting for file growth, especially if you’re using TDE (I’m not) and can’t use IFI (I can), but for several other reasons as well, including file fragmentation and the slight pause even IFI causes.  So at the start of all these festivities, I took the 4 TempDB data files from 100 MB each to 12 GB each, using up 48 GB of the 50 GB available.

A Workaround

Seeing this, I wanted to partially roll back the change the next opportunity I had.  100 MB was too small and I was aware that it invoked file growths every month (we reboot monthly for OS updates).  48 GB wasn’t right though, we just have that much space on the drive due to server build standards and paranoia (I’m a DBA).  So I went through our Idera Diagnostic Manager monitoring software and found the most space TempDB used, which is captured once an hour.  I found that 4.8 GB was the peak usage with several incidents of usage going over 4.5 GB.

With that information available and still not wanting an autogrowth for all the reasons listed above, I decided that all 4 files should be 1.5 GB, so 6 GB total.  That means peak usage was about 75% full, leaving plenty of room for error, especially with my baseline only being captured once an hour.  Autogrowth is set to 256 MB, so it’d add 1 GB total each growth.  I can live with that.

I can’t say it eliminated the issue because I still have 2 GB of unallocated TempDB space in cache, but it’s better than 8 GB.  It can be considered more acceptable than other issues I need to tackle right now, but it still bugs me.

What’s the Best Practice?

It’s a best practice to have TempDB data files on their own LUN, drive, array, however you want to word it.  Then it just makes sense to have the total size of your data files add up to 90% or more of the drive size.  I see this advice everywhere, with these two standing out:

  • Solar Winds – Configuration Best Practices for SQL Server Tempdb–Initial Sizing
    • “Next, if you can give tempdb its own disk, then configure it to almost fill the drive. If nothing else will ever be on the drive, then you’re better off setting it to be larger than you’ll ever need. There’s no performance penalty, and you’ll never have to worry about autogrow again.”
  • Brent Ozar – SQL Server 2005/2008/2012/2014 Setup Checklist
    • “Notice that I don’t have filegrowth enabled.  You want to proactively create the TempDB files at their full sizes to avoid drive fragmentation.”

Jonathan Kehayias does it a little bit differently in his post SQL Server Installation Checklist saying to add space to TempDB files in 4 GB increments.  Although he doesn’t fill the drive by default, this isn’t mentioned by him, either.

Now I need to be perfectly clear on this, I trust these three sources.  I trust Jonathan and Brent more than I trust myself with setting up SQL Server.  I also feel the same about the authors I know on the Solar Winds post.  This does not change that.

Sizing TempDB like that often means it’s much larger than you need.  The workaround I’m using is to right-size these files instead.  For me, for now, I’m going to stick with seeing how large TempDB gets and make it slightly larger than that until I have a solid answer to my problem.

What Was It?

I still don’t know.  The workaround managed to knock it off of my priority list enough where I’m not actively working on it.  However, my drive to understand SQL Server better won’t leave me alone.

This post is my solution.  I have some very intelligent people reading this who I hope will at least lead me further down the rabbit hole, even if they don’t have a conclusive answer.  There’s a good chance I’ll be asking for help on Twitter with #sqlhelp or opening a connect item on this, for which I have a very well documented description of the issue that I can link to.

Updates:

2016-01-06 – Caching of Temporary Objects

Due to a comment, I started looking into the caching of temporary objects to see if this was the root cause.  The comment specifically mentioned Paul White’s (b|t) post Temporary Object Caching Explained, and I also read over Itzik Ben-Gan’s (b|t) post Caching Temporary Objects.

Both of these left me with the impression that smaller amounts of data would be left in the cache linked to temporary objects linked to the proc cache.  What I’m seeing is large amounts of data in the buffer pool that did not drop when I ran DBCC FREEPROCCACHE (on a test server that wasn’t in active use), which I would have expected if this was the full explanation.

While it’s very likely this is related to the issue on hand, I’m not ready to accept it as a full explanation.  If the memory associated with TempDB dropped when clearing the proc cache (on a test server) then it would have been a great explanation with a poor side effect of going too far with the memory being used.

2016-01-07 – Opened a Connect Item

I mentioned this issue on the comments of Paul White’s blog post mentioned in the last update and comments below on this post.  His response concluded with this:

So, when memory pressure is detected, I would expect memory use like this to be freed up for reuse in pretty short order, by design. If it is not, and bad things happen because memory for unallocated tempdb is not released/reused, that would be a bug.

While I was already leaning that way, it pushed me over the edge to decide it was time to open up a connect item on this issue.  I feel it’s well worth the read going through Paul’s post and the connect item.  Also, if you’re seeing this as well, an upvote on connect is very appreciated.

https://connect.microsoft.com/SQLServer/feedback/details/2215297