Why are Primary Databases Smaller Than Log Shipped Secondaries?

1 Reply

This week I was asked a ‘simple’ question by a client – why are the files for the primary databases smaller than the corresponding files on the log shipped secondary? Obviously it needed to have something to do with either Growth or Shrink operations, so I set up a test to find the answer.

First a recap on what log shipping does. Basically it is the process of backing up your databases and then restoring them automatically on another server, with a few bits or monitoring thrown in via management studio to make it a more manageable and versatile solution. So any question around differences between the primary and secondary databases come down to the following question “Is it a logged operation?” In this case the DBA that drew my attention to the question had already investigated shrinking the log file, and after the next backup, copy and restore the log duly shrunk on the secondary server – so it appears that shrink is logged. So that leaves growth. It’s possible that the original database had a larger autogrowth which was later adjusted, and this change wasn’t transferred to production. Let’s test.

Create the database with a stupid autogrowth setting:

CREATE DATABASE [Growtest]
ON PRIMARY
( NAME = N’Growtest’, FILENAME = N’D:\SQLData\growtest\Growtest.mdf’ , SIZE = 25600KB , FILEGROWTH = 50%)
LOG ON
( NAME = N’Growtest_log’, FILENAME = N’D:\SQLData\growtest\Growtest_log.ldf’ , SIZE = 18432KB , FILEGROWTH = 10%)
GO

And set it up for log shipping to Growtest_DR. For brevity sake I won’t include the instructions on that, but if you need them there’s a pretty good step by step guide here. Take a backup, copy and restore to make sure it’s all running properly, and check the file sizes:

As we expect both files are the same size. Now let’s reset the autogrowth on Growtest from 50% to 10%:

ALTER DATABASE [Growtest] MODIFY FILE ( NAME = N’Growtest’, FILEGROWTH = 10%)
GO

And create a table and fill it with enough data to trigger an autogrowth:

CREATE TABLE [dbo].[SpamTable](

[randomspam] [char](8000) NOT NULL
) ON [PRIMARY]

BEGIN TRAN
DECLARE @counter INT
SET @counter = 1
WHILE (@counter <3000)
BEGIN
Set @counter=@counter+1
INSERT into dbo.spamtable values (@counter)
END

COMMIT TRAN

Then do a backup, copy, restore and check tables again:

The files on DR and production are still the same size. What happens if we run the transaction but then roll it back instead of comitting it. Does SQL bother to apply and rollback the transaction, or does it just remove the transaction from the logfile?

BEGIN TRAN
DECLARE @counter INT
SET @counter = 1
WHILE (@counter <500)
BEGIN
Set @counter=@counter+1
INSERT into dbo.spamtable values (@counter)
END
ROLLBACK TRAN

After a backup, copy, restore we still have the same file sizes on DR and Prod, so there goes our growth theory. So let’s forget about what we were told when the problem was handed to us and do some shrinking. First the test that was already performed:

DBCC SHRINKFILE (N’Growtest_log’ , 0, TRUNCATEONLY)

Sure enough the log file truncation is transferred across to the DR server:

Running out of ideas, we’ll try a shrink of the data file:

DBCC SHRINKFILE (N’Growtest’ , 0, TRUNCATEONLY)

And after a backup, copy and restore:

Finally, we have an answer. While autogrowth changes, and log file shrinks all appear to be logged, the shrink of the data file is not. So at some point someone has shrunk the data file on production, which is why the production data files take less space than the DR data files. So where tables are archived in production to free space, and the data file is shrunk(either with shrinkfile or shrink database), we need to be aware that this shrink is not transferred to the DR server. If we expect to reclaim that space we need to refresh the logshipping by taking a full backup from Production and restoring it over the DR database.

Where Do Default Settings For A New Database Come From?

Resolving Performance Problems by Rebuilding Indexes

Leave a reply

From the moment I started playing with this SQL Server thing I have certain things repeated to me over and over again. One of those is the importance of removing fragmentation from indexes. There’s a whole bunch of stuff out there on the interweb on people rebuilding indexes and performance increasing, and so now…a few years further on in my journey it’s quite common for a client to come to me with questions about whether such and such an index has been rebuilt, or concern over fragmentation levels. Increasingly I find myself explaining that rebuilding the index may help, but it’s quite an expensive operation, can cause blocking and may not be the problem anyway.

Now, I’m not saying that indexes shouldn’t be rebuilt or maintained. Clearly they should, but there’s a bunch of other things that happen when we rebuild an index which may be what is actually resolving the performance issue that has raised the original flag. Let’s look at them:

Statistics Are Updated

Whenever you rebuild an index, all the statistics related to that index are updated. (Take Note: This applies to rebuilding the index, if you are reorganising it the statistics are not automatically maintained). Statistics are very powerful part of how SQL Server decides to access data. Even with autoupdate statistics enabled statistics can get out of date really quickly, and query plans can be generated using incorrect information. That can bring back information significantly slower than if the statistics are up to date. The most common affect I see around this is around queries based on dates(and in particular that use a comparison against something like getdate() or CURRENT_TIMESTAMP). Recently I saw a query which was running around 800 times a day and taking around 17 seconds per execution. Without adjusting anything but updating the statistics that dropped down to 8 seconds. Rebuilding the index would have resulted in the same performance improvement, but has a much higher impact on the server to achieve the same result.

Plans are recalculated

Related to the above, any plan that uses the index is recalculated based on the latest information – Fragmentation is removed(hopefully) and statistics are up to date, so SQL should choose a more appropriate plan. Or it could be slow due to a parametrization issue which generated a plan based on one value and is now re-using the plan for other, less appropriate values. Imagine a database which contained everyone in the worlds favourite colour. If a plan is run the first time to get information for New Zealand population, and all subsequent queries are trying to get information about the US or Europe, SQL is obviously going to have trouble accessing the larger volumes of data in the same way it would for the smaller set of data associated with New Zealand. Now, everyone in the world is going to be a big index to rebuild, where as in that case a recompile hint could see the same performance improvement.

So, in conclusion – yes your indexes need to be maintained, but if there is a performance issue on a server, it’s much better if you can narrow it down to a specific problem which can potentially be resolved surgically, than get out the second biggest hammer you have(behind restarting your server….and one day when I am brave enough I’ll post on what that is actually doing when it magically fixes your performance issues.)

How to change the location of the Default Trace Files?

A Merry Christmas to the Nephews

1 Reply

Following on from the Birthday Present I made my daughters, I promised my sister some games for her boys Christmas presents. December turned into a pretty crazy month, but I got the last one finished on Christmas eve.

James Space game is here.

Alex’s Very Noisy motorbike game is here.

And Daniels memory game is here.

Best played in Google Chrome or Mozilla Firefox, because some versions of Internet Explorer will not deal with the sound files correctly.

Why log shipping backups fail in an Always On Availability Group – Resolved

2 Replies

Sometimes it bother me how long it takes to throw a single switch which resolves a problem. This was one of those times.

Here’s the scenario. An availability group is set up and you want to log ship one of the databases to a designated DR server so you have both HA and DR at your disposal. Everything sets up just as it should, but when you run the backup no file appears in your backup directory. What’s going on?

For a while I went around in circles trying to second guess my log shipping setup. Then I made a second database, successfully configured log shipping, but the moment I added it to the availability group it began exhibiting the same behaviour. There are no errors in the logs, the backup job says it’s successful, but when you go into the error logs all you find is it is deleting old backups that have fallen out of retention. No backup is made. The databases last backup date does not change.

I went so far as reading the entire support notes on books online around availability groups trying to find anything to suggest this couldn’t be done. Then I repeated my experiments to confirm that log shipping worked fine outside the availability group, but failed when it was added. Finally I came to the conclusion that should have been obvious an hour earlier. The problem must be something to do with the group set up. Once I got there the likely culprit was clear – the availability group was trying to enforce a backup being taken off the secondary replica, and the log shipping job didn’t like that. It seems to check if it can take a local backup, see that it can’t and so skip straight to the historic backup purge.

Here’s how to correct it.

In management studio expand Always On High Availability and then Availability Groups.
Right click on your chosen group and choose properties.
In the left pane select backup preferences.
Select ‘Primary’ as your backup source. Click Okay.

Now run your log shipping log backup job and everything should perform as expected.

How to be a programmer………

1 Reply

This article was linked to from a recent Brent Ozar PLC newsletter. It contains a bunch of tips about how to be a programmer(Not how to program, but how to be a programmer – the distinction is important). I think it was absolutely brilliant and has huge areas of crossover to other technical roles…such as DBA’s.

(Dead Link)http://samizdat.mines.edu/howto/HowToBeAProgrammer.html?x#id2855381(Dead Link)

I particularly liked the sections on how to deal with difficult people, how to provide estimates, and perhaps best was knowing when to go home:)

It Depends

Rob's SQL Server and Technology blog

Why are Primary Databases Smaller Than Log Shipped Secondaries?

Where Do Default Settings For A New Database Come From?

Resolving Performance Problems by Rebuilding Indexes

How to change the location of the Default Trace Files?

A Merry Christmas to the Nephews

Why log shipping backups fail in an Always On Availability Group – Resolved

How to be a programmer………