Why are Primary Databases Smaller Than Log Shipped Secondaries?

This week I was asked a ‘simple’ question by a client – why are the files for the primary databases smaller than the corresponding files on the log shipped secondary?  Obviously it had to be something to do with either growth or shrink operations, so I set up a test to find the answer.

First, a recap on what log shipping does.  Basically it is the process of backing up your databases and then restoring them automatically on another server, with a few bits of monitoring thrown in via Management Studio to make it a more manageable and versatile solution.  So any question about differences between the primary and secondary databases comes down to this: “Is it a logged operation?”  In this case the DBA who drew my attention to the question had already investigated shrinking the log file, and after the next backup, copy and restore the log file duly shrank on the secondary server – so it appears that shrink is logged.  That leaves growth.  It’s possible that the original database had a larger autogrowth which was later adjusted, and this change wasn’t transferred to the secondary.  Let’s test.

Create the database with a stupid autogrowth setting:

CREATE DATABASE [Growtest]
ON PRIMARY
( NAME = N'Growtest', FILENAME = N'D:\SQLData\growtest\Growtest.mdf', SIZE = 25600KB, FILEGROWTH = 50%)
LOG ON
( NAME = N'Growtest_log', FILENAME = N'D:\SQLData\growtest\Growtest_log.ldf', SIZE = 18432KB, FILEGROWTH = 10%)
GO

And set it up for log shipping to Growtest_DR.  For brevity's sake I won't include the instructions for that, but if you need them there's a pretty good step-by-step guide here.  Take a backup, copy and restore to make sure it's all running properly, and check the file sizes:

[Image: LS Sizes 1]
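If you'd rather check from a query window than through the GUI, a query along these lines (run on each server, swapping in Growtest_DR on the secondary) shows the sizes and growth settings; sizes in sys.master_files are reported in 8KB pages, hence the arithmetic:

-- Compare file sizes and growth settings on each server
SELECT name,
       size * 8 / 1024 AS size_mb,
       growth,
       is_percent_growth
FROM sys.master_files
WHERE database_id = DB_ID(N'Growtest');   -- use N'Growtest_DR' on the secondary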

As expected, both files are the same size.  Now let's change the autogrowth on Growtest from 50% to 10%:

ALTER DATABASE [Growtest] MODIFY FILE ( NAME = N'Growtest', FILEGROWTH = 10%)
GO

And create a table and fill it with enough data to trigger an autogrowth:

CREATE TABLE [dbo].[SpamTable](
    [randomspam] [char](8000) NOT NULL
) ON [PRIMARY]


BEGIN TRAN
DECLARE @counter INT
SET @counter = 1
WHILE (@counter < 3000)
BEGIN
    SET @counter = @counter + 1
    INSERT INTO dbo.SpamTable VALUES (@counter)
END
COMMIT TRAN


Then do a backup, copy, restore and check the file sizes again:

[Image: LS Sizes 2]

The files on DR and production are still the same size.  What happens if we run the transaction but then roll it back instead of committing it?  Does SQL bother to apply and roll back the transaction, or does it just remove the transaction from the log file?

BEGIN TRAN
DECLARE @counter INT
SET @counter = 1
WHILE (@counter < 500)
BEGIN
    SET @counter = @counter + 1
    INSERT INTO dbo.SpamTable VALUES (@counter)
END
ROLLBACK TRAN

[Image: LS Sizes 3]

After a backup, copy and restore we still have the same file sizes on DR and Prod, so there goes our growth theory.  Let's forget about what we were told when the problem was handed to us and do some shrinking.  First, the test that was already performed:

DBCC SHRINKFILE (N'Growtest_log', 0, TRUNCATEONLY)

Sure enough the log file truncation is transferred across to the DR server:

[Image: LS Sizes 4]

Running out of ideas, we’ll try a shrink of the data file:

DBCC SHRINKFILE (N'Growtest', 0, TRUNCATEONLY)

And after a backup, copy and restore:

[Image: LS Sizes 5]

Finally, we have an answer.  While autogrowth changes and log file shrinks all appear to be logged, the shrink of the data file is not.  So at some point someone has shrunk the data file on production, which is why the production data files take less space than the DR data files.  Where tables are archived in production to free space and the data file is shrunk (either with SHRINKFILE or SHRINKDATABASE), we need to be aware that this shrink is not transferred to the DR server.  If we expect to reclaim that space on DR as well, we need to refresh the log shipping by taking a full backup from production and restoring it over the DR database.
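As a rough sketch of what that refresh looks like (the backup share and DR file names below are illustrative only, not taken from the original setup):

-- On production: take a fresh full backup
BACKUP DATABASE [Growtest]
TO DISK = N'\\BackupShare\Growtest_reseed.bak' WITH INIT;

-- On the DR server: restore over the secondary, leaving it ready for further log restores
RESTORE DATABASE [Growtest_DR]
FROM DISK = N'\\BackupShare\Growtest_reseed.bak'
WITH REPLACE, NORECOVERY,
     MOVE N'Growtest'     TO N'D:\SQLData\growtest\Growtest_DR.mdf',
     MOVE N'Growtest_log' TO N'D:\SQLData\growtest\Growtest_DR_log.ldf';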


Where Do Default Settings For A New Database Come From?

Recently I was reviewing one of our supported instances and saw a bunch of databases with 1MB autogrowth on the data file.  Most were quite new, and some had grown to quite a large size.  Paul Randal deals with 'the default' settings for autogrowth in this blog post, and says 1MB is not a good option.  No surprise there, as on a highly transactional system you are going to have the data file growing all the time.  So I want to change it, but I also want to understand where it came from.

This is a managed server, so I know the model database is not going to have 1MB autogrowth set up, but just to be sure I check and confirm.  So if it's not getting the default setting from the model database, where is it coming from?  Time to test.  When I use the Management Studio GUI it pulls settings from the model database, as I'd expect, and when I script those out using the 'Script' option in the create database wizard it faithfully transfers all the settings.  But what if I just use CREATE DATABASE by itself?

CREATE DATABASE MODELTEST

Sure enough this grabs the default setting from when you install SQL Server – 1MB, and not the updated value in the model database.  I also noted it grabs the default data file size, but not the default log file size.  This was news to me, and led me to wonder what other settings were not being pulled from model.  So I changed a bunch of options in model on the options tab to test.
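To see the difference side by side, a quick comparison of the new database against model (MODELTEST being the test database created above):

-- Compare file sizes and growth settings for model and the new database
SELECT DB_NAME(database_id) AS database_name,
       name,
       size * 8 / 1024 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS VARCHAR(10)) + '%'
            ELSE CAST(growth * 8 / 1024 AS VARCHAR(10)) + ' MB'
       END AS growth_setting
FROM sys.master_files
WHERE database_id IN (DB_ID(N'model'), DB_ID(N'MODELTEST'));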

All of the other options I tested transferred properly, or at least as I anticipated.  So… nice one SQL Server… you will transfer auto-shrink being enabled, but not the log startup size or the data file growth settings.
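Fixing the databases that have already been created is at least straightforward; the database name, file name and growth increment below are examples only, so pick values that suit the system:

-- Set a sensible fixed growth increment on the data file (example names and size)
ALTER DATABASE [SomeDatabase]
MODIFY FILE (NAME = N'SomeDatabase_Data', FILEGROWTH = 256MB);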

Resolving Performance Problems by Rebuilding Indexes

From the moment I started playing with this SQL Server thing I have had certain things repeated to me over and over again.  One of those is the importance of removing fragmentation from indexes.  There's a whole bunch of stuff out there on the interweb about people rebuilding indexes and performance increasing, and so now, a few years further on in my journey, it's quite common for a client to come to me with questions about whether such and such an index has been rebuilt, or with concerns over fragmentation levels.  Increasingly I find myself explaining that rebuilding the index may help, but it's quite an expensive operation, can cause blocking, and may not be the problem anyway.

Now, I'm not saying that indexes shouldn't be rebuilt or maintained.  Clearly they should be, but there are a bunch of other things that happen when we rebuild an index, and they may be what is actually resolving the performance issue that raised the original flag.  Let's look at them:

Statistics Are Updated

Whenever you rebuild an index, all the statistics related to that index are updated.  (Take note: this applies to rebuilding the index; if you are reorganising it, the statistics are not automatically maintained.)  Statistics are a very powerful part of how SQL Server decides to access data.  Even with auto update statistics enabled, statistics can get out of date really quickly, and query plans can be generated using incorrect information.  That can bring data back significantly slower than if the statistics were up to date.  The most common effect I see is around queries based on dates (in particular those that compare against something like GETDATE() or CURRENT_TIMESTAMP).  Recently I saw a query which was running around 800 times a day and taking around 17 seconds per execution.  Without adjusting anything other than updating the statistics, that dropped to 8 seconds.  Rebuilding the index would have resulted in the same performance improvement, but with a much higher impact on the server to achieve the same result.
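If stale statistics are the suspect, they can be refreshed directly without paying for a full rebuild; the table name here is just an example:

-- Refresh statistics on a single table (cheaper than rebuilding its indexes)
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- Or sweep the database, updating only statistics that have had modifications
EXEC sp_updatestats;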

Plans Are Recalculated

Related to the above, any plan that uses the index is recalculated based on the latest information – fragmentation is removed (hopefully) and statistics are up to date, so SQL should choose a more appropriate plan.  But the query could also be slow due to a parameterisation issue, where a plan generated for one value is being re-used for other, less appropriate values.  Imagine a database which contains every person in the world's favourite colour.  If the plan is built the first time for the New Zealand population, and all subsequent queries are asking about the US or Europe, SQL is obviously going to have trouble accessing those much larger volumes of data in the same way it would the smaller set associated with New Zealand.  Now, everyone in the world is going to be a big index to rebuild, whereas in that case a recompile hint could deliver the same performance improvement.
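As a rough illustration of that last point (the table and procedure names are made up for the favourite-colour example):

-- Build a plan for the actual value being used, rather than re-using a sniffed one
DECLARE @Country NVARCHAR(60) = N'New Zealand';

SELECT Person, FavouriteColour
FROM dbo.WorldPopulation          -- illustrative table from the example above
WHERE Country = @Country
OPTION (RECOMPILE);               -- the plan is built for this value and discarded

-- Or mark an existing procedure so its next execution compiles a fresh plan
EXEC sp_recompile N'dbo.GetFavouriteColours';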

So, in conclusion – yes, your indexes need to be maintained, but if there is a performance issue on a server, it's much better if you can narrow it down to a specific problem which can potentially be resolved surgically than to get out the second biggest hammer you have (behind restarting your server… and one day, when I am brave enough, I'll post on what that is actually doing when it magically fixes your performance issues).

How to change the location of the Default Trace Files?

I had a situation where a 500GB drive needed to be repurposed and the… oh, 20GB worth of databases on it needed to be moved somewhere else.  Even though this included the system databases it was not overly complicated, and it's nothing you can't find instructions for elsewhere thanks to our friend Google.  A little more complicated was that the default trace files were also being written to that location.  Although I suspect the worst result of leaving that location would be a failure message in the logs and no default trace, I didn't want SQL to do anything silly like fail to start the moment it couldn't write to the default trace.  It seems this is either such a blindingly simple thing to do that people haven't posted a how-to online, or it's not something people have a call to do much.  At any rate, I found this blog post on sqlconcepts that indicates you can't change the default settings for the default trace.  Luckily, while that is true of the size and the number of files retained, it is a relatively simple matter to change the location.

Reading the Microsoft article on the default trace, it says the default location is MSSQL\LOG, which is technically true, and it also led me to believe that the default trace is stored, by default, in the same location as the error logs.  You can change the location your error logs are written to by adjusting the error log path in your SQL Server startup parameters.  I tested this on my local instance (SQL 2012) and everything ran fine, with the trace switching to the new location after a server restart.  I guess this makes sense given that the default trace really is another sort of error log.
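In other words, the error log path set by the -e startup parameter (editable through SQL Server Configuration Manager) is the one the default trace follows.  A couple of quick checks before and after the restart should confirm it:

-- List the startup parameters; the -e argument is the error log path
SELECT value_name, value_data
FROM sys.dm_server_registry
WHERE value_name LIKE N'SQLArg%';

-- Confirm where the error log is currently being written
SELECT SERVERPROPERTY('ErrorLogFileName') AS current_errorlog;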

On a slight tangent prompted by the sqlconcepts post above: if you want to retain more history than the default (five 20MB files, which can equate to not very much at all on a heavily used server), it is possible to do so by simply scheduling a job which copies the trace files to another location.  You can either copy all *.trc files out of your error log location to your archive location, or, if you want to get fancy and only copy the new files, you can get the current file name with:

select path from sys.traces where id=1

So long as the default trace is enabled it should always have an id of 1, as it is started (and rolled over) whenever SQL Server restarts.
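If you do want the 'only copy the new files' version, here's a rough sketch of the idea; the archive folder is made up, the file pattern assumes the standard log_*.trc naming, and it relies on xp_cmdshell (and robocopy) being available:

-- Archive closed default trace files, skipping the one currently being written
DECLARE @current NVARCHAR(260), @folder NVARCHAR(260), @filename NVARCHAR(260), @cmd NVARCHAR(1000);

SELECT @current = [path] FROM sys.traces WHERE id = 1;

SET @filename = RIGHT(@current, CHARINDEX('\', REVERSE(@current)) - 1);
SET @folder   = LEFT(@current, LEN(@current) - LEN(@filename) - 1);

-- /XF excludes the in-use file; everything else matching log_*.trc is copied
SET @cmd = 'robocopy "' + @folder + '" "D:\TraceArchive" log_*.trc /XF "' + @filename + '"';
EXEC master..xp_cmdshell @cmd;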


I hope this is helpful to someone.

A Merry Christmas to the Nephews

Following on from the Birthday Present I made my daughters, I promised my sister some games for her boys' Christmas presents.  December turned into a pretty crazy month, but I got the last one finished on Christmas Eve.

James' Space game is here.

Alex’s Very Noisy motorbike game is here.

And Daniel's memory game is here.


Best played in Google Chrome or Mozilla Firefox, because some versions of Internet Explorer will not deal with the sound files correctly.


Why log shipping backups fail in an Always On Availability Group – Resolved

Sometimes it bothers me how long it can take to throw the single switch which resolves a problem.  This was one of those times.

Here’s the scenario.  An availability group is set up and you want to log ship one of the databases to a designated DR server so you have both HA and DR at your disposal.  Everything sets up just as it should, but when you run the backup no file appears in your backup directory.  What’s going on?

For a while I went around in circles trying to second-guess my log shipping setup.  Then I made a second database and successfully configured log shipping, but the moment I added it to the availability group it began exhibiting the same behaviour.  There are no errors in the logs, the backup job says it's successful, but when you go into the error logs all you find is it deleting old backups that have fallen out of retention.  No backup is made.  The database's last backup date does not change.

I went so far as to read the entire set of support notes on Books Online around availability groups, trying to find anything to suggest this couldn't be done.  Then I repeated my experiments to confirm that log shipping worked fine outside the availability group but failed once the database was added.  Finally I came to the conclusion that should have been obvious an hour earlier: the problem must be something to do with the group setup.  Once I got there the likely culprit was clear – the availability group's backup preference was directing backups to a secondary replica, and the log shipping job didn't like that.  It seems to check whether it should take a local backup, see that it shouldn't, and skip straight to the historic backup purge.
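You can see the check the backup job is presumably making by using the function SQL Server exposes for exactly this purpose; it returns 0 on the primary while the group prefers backups on a secondary (the database name below is a placeholder):

-- Returns 1 if this replica is the preferred place to back up the named database
SELECT sys.fn_hadr_backup_is_preferred_replica(N'YourDatabase') AS is_preferred_backup_replica;

-- And the current preference for each availability group
SELECT name, automated_backup_preference_desc
FROM sys.availability_groups;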

Here’s how to correct it.

1. In Management Studio expand Always On High Availability and then Availability Groups.
2. Right click on your chosen group and choose Properties.
3. In the left pane select Backup Preferences.
4. Select 'Primary' as your backup preference.  Click OK.

Now run your log shipping log backup job and everything should perform as expected.
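If you'd rather script the change than click through the GUI, the T-SQL equivalent should be a one-liner (the group name here is a placeholder):

-- Tell the availability group that backups should run on the primary replica
ALTER AVAILABILITY GROUP [YourAvailabilityGroup]
SET (AUTOMATED_BACKUP_PREFERENCE = PRIMARY);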


How to be a programmer…

This article was linked to from a recent Brent Ozar PLC newsletter.  It contains a bunch of tips about how to be a programmer (not how to program, but how to be a programmer – the distinction is important).  I think it is absolutely brilliant and has huge areas of crossover to other technical roles… such as DBAs.


http://samizdat.mines.edu/howto/HowToBeAProgrammer.html?x#id2855381 (dead link)

I particularly liked the sections on how to deal with difficult people and how to provide estimates, and perhaps best of all was knowing when to go home :)