Why log shipping backups fail in an Always On Availability Group – Resolved

Sometimes it bother me how long it takes to throw a single switch which resolves a problem.  This was one of those times.

Here’s the scenario.  An availability group is set up and you want to log ship one of the databases to a designated DR server so you have both HA and DR at your disposal.  Everything sets up just as it should, but when you run the backup no file appears in your backup directory.  What’s going on?

For a while I went around in circles trying to second guess my log shipping setup.  Then I made a second database, successfully configured log shipping, but the moment I added it to the availability group it began exhibiting the same behaviour.  There are no errors in the logs, the backup job says it’s successful, but when you go into the error logs all you find is it is deleting old backups that have fallen out of retention.  No backup is made.  The databases last backup date does not change.

I went so far as reading the entire support notes on books online around availability groups trying to find anything to suggest this couldn’t be done.  Then I repeated my experiments to confirm that log shipping worked fine outside the availability group, but failed when it was added.  Finally I came to the conclusion that should have been obvious an hour earlier.  The problem must be something to do with the group set up.  Once I got there the likely culprit was clear – the availability group was trying to enforce a backup being taken off the secondary replica, and the log shipping job didn’t like that.  It seems to check if it can take a local backup, see that it can’t and so skip straight to the historic backup purge.

Here’s how to correct it.

  1.  In management studio expand Always On High Availability and then Availability Groups.
  2. Right click on your chosen group and choose properties.
  3. In the left pane select backup preferences.
  4. Select ‘Primary’ as your backup source.  Click Okay.

Now run your log shipping log backup job and everything should perform as expected.

 

How to be a programmer………

This article was linked to from a recent Brent Ozar PLC newsletter.  It contains a bunch of tips about how to be a programmer(Not how to program, but how to be a programmer – the distinction is important).  I think it was absolutely brilliant and has huge areas of crossover to other technical roles…such as DBA’s.

 

(Dead Link)http://samizdat.mines.edu/howto/HowToBeAProgrammer.html?x#id2855381(Dead Link)

I particularly liked the sections on how to deal with difficult people, how to provide estimates, and perhaps best was knowing when to go home:)

How to Find Which Databases Have Indexes With Page Level Locking Disabled

I’ve recently had to look over a client server where a maintenance plan had been failing for some time.  The inhouse DBA was unaware of the failures because he had no alerting on the jobs and the big red crosses weren’t big or red enough to have caught his attention.  When I looked at the cause of the failures it was due to a native maintenance plan not being able to reoganise an index with page level locking disabled.  Whenever a reorganise plan hits a database with page level locking disabled it generates the following error:

Executing the query “ALTER INDEX Indexname] ON [dbo]….” failed with the following error: “The index “Indexname” (partition 1) on table “tablename” cannot be reorganized because page level locking is disabled.”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.

There are two ways of getting around this.  The first way is to generate TSQL scripts to enable Page level locking prior to the re-organise and then re-enable it after the maintenance is completed.  JP Chen has the scripts to do that here.  Note:  Haven’t tested those, but they look right.  I suspect you may need to exclude the system tables as well.

The second, and easier way, is to set up a rebuild step and move the affected databases into that step of your maintenance plan.  Rebuilds do not require a page lock because the index is completely rebuilt rather than shuffled around.  The TSQL to figure out which database you need to move from the re-organise to the rebuild step is:

exec sp_MSforeachdb ‘use ? select db_name(), (case when count(*)>0 then ”Page Locks Disabled:” ELSE ”Page Locks” END) from sys.indexes where allow_page_locks=0 and name not in(”queue_clustered_index”,”queue_secondary_index”)’

and if you want to get the specific indexes you can use this:

exec sp_MSforeachdb ‘use ? select db_name(), object_id,name from sys.indexes where allow_page_locks=0 and name not in(”queue_clustered_index”,”queue_secondary_index”)’

Standard disclaimer about sp_MSforeachdb being an undocumented stored procedure and maintenance plans being somewhat limited form of doing your index mainteance apply:)

 

 

TSQL – get sizes of unused indexes

Here’s a TSQL snippet that will give you the sizes of the indexes that the query optimiser isn’t using.  Note that the NULL values are SINCE THE LAST RESTART.  So if you just restarted your server then this is rather a waste of time.

 
SELECT object_name(ui.object_id), si.name as [Index Name],ui.index_id, ui.last_user_seek, ui.last_user_scan, ui.user_seeks, ui.user_scans,
ui.user_lookups,ui.user_updates,
8 * sa.used_pages AS [Index Size]
from sys.dm_db_index_usage_stats ui
inner join sys.indexes si on ui.object_id=si.object_id and ui.index_id=si.index_id
INNER JOIN
sys.partitions sp ON si.object_id = sp.OBJECT_ID AND si.index_id = sp.index_id
INNER JOIN sys.allocation_units AS sa ON sa.container_id = sp.partition_id
where database_id=7
and object_name(ui.object_id) not like ‘sys%’
and ui.last_user_seek is NULL
and ui.last_user_scan is NULL
ORDER by 10 desc

Error while enabling windows feature: netfx3. – resolved

Something I have run into a few times now which is an annoying timewaster is the following error:

Error While Enabling Windows Feature netfx3

Error While Enabling Windows Feature netfx3

You’ll get it on windows 2012 installations when installing SQL 2012(And probably lower versions) and it’s caused by dotnet 3.5not being enabled on the windows installation. It’s frustrating because it happens during the actual installation and is not flagged in any of the pre-requisite checks.  This is because Windows assumes it either has access to the install media to install it from, or can download it from the internet.  On your SQL Servers you shouldn’t have direct internet access, and you are unlikely to have the install media lying around.  At any rate – here’s how to fix it:

1.  Place the install media back to it’s original location(Most likely the CD drive).

2.  From Server Manager choose ‘Add Roles and Features’.

3.  Add ‘.NET framework 3.5 features’.

4.  Now redo your SQL Install and you should be fine.

Enable dotnet 3_5

TSQL to Send Email From a SQL Server With Attachment

There are about a billion different technologies used to connect remotely to SQL Servers and some are better than others.  Some allow file transfer, some allow sharing clipboards and some don’t.  Of course, better depends entirely on your perspective.  In most cases those ‘inconvenient’ things are there for a reason – security.  nonetheless it’s a pain when you have some nice query output that you want to drop into a CSV file and you can’t get it to your own laptop.  Well….here’s how.

First of all you need a few things set up.  you need:

  • Database Mail XP’s enabled.
  • A SQl Databasemail profile set up.
  • mail relaying enabled from the server you are sending from.

In other words, you aren’t going to sneakily move data off a server.  You have to have the rights to set those things up and the co-operation of a system administrator to get the mail sorted.  Now, if it’s just a query result you want sent back you can just use this:

USE msdb
EXEC sp_send_dbmail
@profile_name = ‘ManagedSQL’,
@recipients = ‘me@mymail.com’,
@subject = ‘ROSA OUTPUT’,
@body = ‘Daily Statistics are below.’,
@execute_query_database = ‘MYDB’,
@query = ‘SELECT * FROM [MYDB].[dbo].[tbl_color]’

But if you’ve already got a large output you may have to save it to the server and then send it as an attachment:

USE msdb

GO

EXEC sp_send_dbmail
@profile_name=’ManagedSQL’,
@recipients=’me@mymail.com’,
@subject=’Healthcheck Output’,
@body=’Server Healthcheck data attached.’,
@file_attachments=’C:\temp\config.txt’

An Introduction to Always On Availability Groups

The below is an extract from my presentation at SQL Saturday 252 on high availability and disaster recovery.  This post covers the basics of Always On Availability Groups.

So if you have mirroring you need to fail over each database, and sometimes you are going to have multiple databases servicing the same application and some will fail over automatically and leave you with databases in different places, attempting to join with each other and failing, and connection strings bouncing back and forwards between servers and generally….a big mess.

Wouldn’t it be nice if we could guarantee that if one database failed over, it took all of the other associated or reliant databases across with it.  And wouldn’t it be nice if we could use a virtual name like with a cluster so we could point an application at one place and then the windows cluster, active directory and SQL server would do the donkey work?  Well….the good news is if you have SQL Server 2012 then you can.

Availability groups are like mirroring, except well…better.  You can have multiple databases in a group and guarantee you fail them over together, you can reliable report off any of those mirrors and you can use a single connection string and point it at a virtual name and let it decide which server it needs to action your query on.

You can have up to 4 secondary replicas as well as your read\write primary.  You have the same options around whether you want any individual replica to be Synchronous or Asynchronous.  You can specify a shared backup location so when you fail over between replicas your recoverability chain remains intact.  You can take backups from replicas in copy only mode.

Initially….initially it looks like all the best bits of mirroring and clustering have been beautifully merged together to give us the HA\DR solution we have been waiting for.   But…not quite.

SQL 2012 is the first crack at Availability Groups, and they are pretty good, but they still have a few things to sort out and are going to get better.

Let’s start with backups.  If you take backups on different replicas you need to be aware that each instance only keeps backup history from it’s own msdb database.  So while the backup chain itself will still be fine, you can’t query the history of backups on any individual instance, you need to write some smarts which looks at all the msdb databases and collates that information.  Now obviously you can be taking copy only backups off your secondary replicas, but they are going to be copy only.

SQL 2012 is really built to be running on Windows 2012 server, but it’s not required.  And there are a whole heap of considerations if you are using Windows Server 2008 R2.  There’s bugs that are known(and you can do a google to find out what they are) running availability groups on Windows 2008 R2 purely because when Windows Server 2008 R2 was being developed the whole concept of “Always On” availability was just an idea in the back of some developers head.  Mostly these are around the way and the times that the server quorums are calculated, but availabilities groups push windows clustering technology as hard….or probably harder….than any other application does, so there’s things being found all the time, and patches coming out all the time.

There are a lot of single points of failure.  Is the listener accepting writeable connections.  Is it sending read connections to the correct place.  Is load being balanced between replicas?  Are the replicas up to date.  These are all new considerations that you probably aren’t actively monitoring for with your current environment.  There’s a whole bunch of new moving parts in this setup and if one of them fails you are not necessarily going to know what and how if you haven’t worked out what needs to be monitored.  At best you may find that everything is going to your primary replica so you aren’t getting any scaled out performance, at worst – you may not even get that.

There’s no combined statistics and DMV’s.  So if you are doing your performance tuning you are going to look at a DMV and note a whole bunch of indexes not being used.  But the reason they aren’t being used might not be that they aren’t good….it might be that you are offloading all of the queries that would be using them to a different replica.  Likewise, a query might be performing badly and you go to troubleshoot it and find it’s incredibly fast on the server you test it on.  But maybe that’s because of the load on the server the user was being directed to.

Basically using Always on Availability groups is introducing a more complex environment, and while not all the things above are bad things, they are things you need to be aware of and consider.  The effective monitoring of an AOAG environment has different rules than monitoring a standalone server and the DBA needs to keep up with those.  But change is what we are all about in IT right?