Author Archives: Rob

About Rob

http://gumption.co.nz/about-me/

How to Gain Emergency Sysadmin rights to a SQL Server

As a SQL Consultant there have been a number of times where I have had to gain ‘emergency access’ to a SQL Server installation.  This can be because a previous DBA has not documented login details, a server has always been around and no-one is sure exactly what it does, or because all staff with administrator rights have left the company.  On several occasions I have had to resolve an issue where a junior DBA or staff member has removed sysadmin rights from the BUILTINAdministrators group without realizing that this was the very group they were gaining their sysadmin rights from (Yes, SQL Server let’s you drop your own sysadmin rights, and once you’ve done it you can’t undo it under your own credentials!).

Regardless of the reason if you need emergency access to a server, there is an easy way and a hard way.  There’s also a really easy way, but it does rely on the stars being aligned in a certain fashion, or more precisely the server being configured in a particular fashion.

The Really Easy Way

   In SQL Server 2005 the BUILTINAdministrators group was automatically assigned full system administrator rights within SQL Server.  This was considered something of a security hole and in most cases the rights were removed by the DBA’s looking after the servers.  However it’s possible the SQL instance you are looking at is a default install of 2005, has been upgrade from a default install, or has had the logins transferred from one.  So the first thing to try is simply to add your account in as a local administrator on the windows server and attempt to log in to SQL Server.  You might get lucky.

The Easy Way

If the really easy way fails, you are left with the easy way and the hard way.  Let’s start with the easy way, because it’s…well…easier.  The downside of the easy way is that it requires a server restart.  If you can’t restart the server then I’m afraid you’ll have to do things the hard way.

The easy way is documented in this Microsoft article and simply involves restarting the SQL Server in single user mode and then logging in.  When logging in to single user mode any member of the local administrator group on the windows server will also have system administrator rights within SQL Server.  It’s basically a more secure method of the BUILTINAdministrator method described above.

The Hard Way

Finally we come to the hard way.  The hard way makes use of the fact that the NT Authority System login is automatically given administrator rights in a SQL instance.  Again this can sometimes be removed by a protective DBA, but there’s less reason to than with BUILTINAdministrators as this method is far less well known and also requires a deliberate attempt to gain sysadmin rights.  You can’t accidently follow this process whereas there are a number of reasons for adding a login to the local administrators group.

Let’s begin:

  1.  Open up notepad and enter the following text.

 sqlcmd.exe -S MYSQLSERVER -q “Exec sp_addsrvrolemember ‘BUILTINAdministrators’, ‘sysadmin’;”

pause

‘MYSQLSERVER’ needs to be replaced with the name of your SQL instance.  Now save this as type ‘all files’ with an extension of ‘.cmd’.  You’ve just created a batch file that if run will add the group ‘BUILTINAdministrators’ to the system administrator role.  Of course, if you run that file it will fail, because you don’t have the rights to do that just yet.

  1. So, to run it as NTAUTHORITY SYSTEM we need to make use of the ‘AT’ command.  Schedule a task to run your file by doing the following:
      1. Open a command prompt(cmd at the run menu).  You may need to do this as an administrator.
      2. Type the following:  at 16:30 /interactive “C:tempsqladmin_restorer.cmd” – Obviously substitute the path and filename for your own file.

2.  This job will pop up on your screen at 4:30pm letting you know the batch file has run.  The BUILTINAdministrator should now be a member of the sysadmin role, and anyone set up as a local administrator will have full sql admin access.

All three of these methods do rely on you having local administrator rights on your SQL Server machine.  The first 2 methods are built in functionality from Microsoft.  It’s only the final method which is a true workaround.

 

An Introduction to Log Shipping

The below is an extract from my presentation at SQL Saturday 252 on high availability and disaster recovery.  This post covers the basics of SQL Server log shipping.

Log shipping is the natural extension of relying on backups alone, because it’s a process which automates the restoration of your backups.  It’s basically a restore of your full backup on the day you establish log backups followed by a series of log backup restores.  So on your primary server you are going to take log backups at a set interval(the default is every 15 minutes) and your secondary has a job which will periodically check for any new log backups and copy them from the primary to the secondary server.  A third job will then look at the settings and restore any appropriate transaction logs.  So we are achieving a few things here.  Not only have we set up a warm standby of the production server which is up to date within 15-30 minutes, we are also actively confirming our log backups are usable, because if they are not log shipping will break.

One of the things that makes log shipping such an attractive DR option is that it is incredibly easy, and in most cases FREE!  Why do I say most cases?  Well, you basically have a passive SQL license available to you for each licensed instance, provided that it is a purely passive server for disaster recovery purposes.  I’ll throw out my standard disclaimer that I’m not Microsoft, and don’t sell their licenses for them, so check with your licensing guys, because there are a few gotchas.

For all it’s simplicity, log shipping gives us a lot of flexibility.  Firstly, you can logship to multiple servers.  Secondly you have the option to restore the database in standby mode which allows you to report off it.  Thirdly you can put in a load delay, so the secondary won’t load logs until, say, 8 hours have passed(You can configure this entirely to your own requirements).  And fourthly – you can combine all the above things together to suit both your disaster recovery and business requirements.  So you can have one production database taking it’s backups each 15 minutes.  Then you can have a pure DR secondary which restores a copy of the database and is the target for a failover.  Then add another secondary on your BI server and set that up so it’s only running restores at 8 o’clock in the morning, but doing it in read only so it can act as a data source for your BI reports.  So BI is now accessing all the data in your prod database, up to the current day – and it’s not messing up production.

And because you have a load delay, when your rogue developer hits F5 in the wrong window and updates the account balance of everyone to $100(because he forgot he commented the where clause and he thought he was on the dev box anyway) – well, you can bring that standby online and grab that table data and take it back to production.

You can even logship to the same server as the production database, because logshipping allows you to rename the destination database to whatever you want.  Why would you do that?  Well it allows you to avoid locking issues by reporting off a second database.

There’s a few gotchas though.  Firstly log shipping relies on your recovery chain being intact.  If someone takes a single log backup that isn’t available to the secondary then logshipping is going to break.  Remember that log shipping is just the process of going through a regular restore.  If you lose a log backup out of your regular recovery chain then you can’t restore further.  Of course, personally I’d prefer find out some rogue 3rd party application is taking random log backups by log shipping falling over than in a live disaster situation.  Setting up log shipping has actually revealed exactly this issue on more than one client site I’ve dealt with.

Secondly – if you want to use the secondary server to report off then it is considered a production server for licensing purposes.  You can also have only one passive standby license.  So while you can theoretically have as many logshipped secondaries as you like, after the first one you are paying for them all.

Are Backups Valid DR?

The below is an extract from my presentation at SQL Saturday 252 on high availability and disaster recovery.  In this section we talk about the basic process of backing up databases and whether this is enough to meet your requirements.

Let’s leave aside all the cool options you get natively and talk about something more mundane.  Backups.  Regardless of what else you have in place, you still want to be doing your backups regularly.  If everything goes completely pear shaped, if all your best laid plans fall apart, so long as you have a native SQL backup then you still have your data.  Without all the other options we are going to be discussing you can still recover from a serious outage if you have a secured backup.  There might be some trial and error if you haven’t documented users and settings, it might take some time and whenever you resort to restoring from backup you are making a call on acceptable data loss – but you haven’t lost everything.  You can recover.

To make it easier on yourself there’s other things that are critical to the application using the database you should be documenting and regularly scripting as well.  What version and patch level is the server? Are the logins being scripted?  What about agent jobs?  Linked servers?  SQL config settings?

I’ve come across businesses whose only recovery options was from their scheduled backups and they were not interested in any additional options.  Sometimes it’s a little surprising considering the nature of their businesses.  Some are financial and any dataloss would be measured in actual dollars, actual transactions being unrecorded and sold items not being sent to customers who would probably not be too happy about that.  One in particular springs to mind where the company had a defined and documented RTO and RPO.  It was backed by SLA’s around those RTO’s and RPO’s and they could take a day of downtime and lose up to 24 hours of data and still meet those targets.  In a disaster they had a documented build process and they would simply install(from scratch) a new server, install and patch SQL, restore the databases, reapply their standard permissions and repoint their application server.

This all raises the question of whether just taking your backups qualifies as Higher Availability or Disaster Recovery.  To me…..so long as you are able to meet the RTO and RPO then it absolutely is.  It’s also going to be largely dependent on the data sizes you are talking about.  When you are talking megabytes or low gigabytes you can probably get away with this sort of strategy.  But when you are talking terrabytes just the time to retrieve and copy your backup to a new server will probably blow your RTO, let alone restoring it and hoping like hell it works right on the first attempt.  It comes back to the process of establishing those RTO and RPO values.  If everyone agrees on values that are very high, and they will actually admit they agreed to them if there really is a disaster, then using a standard backup strategy as your only form of DR\HA is fine.  It’s something that should be done even if you do have yourself covered with the other HA\DR options because it gives you a backstop should everything else go wrong.

Now having said that…….and this applies to everything I’ll be saying about the other technologies as well…..test it.  When you start putting together your backup strategy you need to think backwards.  Think of the process you are going to go through to restore everything.  Not just what do you need, but what order are you going to bring it back.  And document all the steps, because it may ultimately not be you doing the process, and it should be a process that more than one person can pick up and run with.

One of the other speakers at SQL Saturday was Argenis Fernandez.  I think the first blog post of his I ever read he was talking about a disaster situation where the DBA was confident he had everything in space.  Regular backups – taken off to tape each night – tape stored securely in a safe offsite.  Unfortunately he never tested the restore process.  So when a disaster rolled around and he put the tape back in to grab the backup from – the tape was completely blank.  The magnetic lock on the safe erased it.  You get a lot of peace of mind from knowing that the process that you are going to use to recover your system has been tested and works.

(EDIT – For the life of me I can’t find the post from Argenis I refer to above.  So perhaps I dreamt it).

(EDIT 2 – Here I am 3 years later re-writing my presentation from 2013 and I found the post I incorrectly attributed to Argenis.  It was actually written by Anthony Bartolo and can be found here.)

Are You High? Can You Recover?

The below is an extract from my presentation at SQL Saturday 252 on high availability and disaster recovery.  This post deals with the concepts of RTO and RPO.

Let’s start by covering off just what we are talking about when we are using these terms – Higher Availability and Disaster Recovery.  Basically what we are referring to is a couple of acronyms – RTO and RPO.  That’s Recovery Time Objective and Recovery Point Objective.  Or put another way, how long can a service be down, and how much data loss is acceptable following an outage.

Now, in a lot of places these aren’t defined.  They are replaced by things like “best efforts” and the expectations are that you’ll get things up as soon as you can and lose as little as you can.  This is a dangerous mentality, because it should be the expectation anyway.  If you have a recovery time objective of an hour, and you get things fixed after 15 minutes, you aren’t going to look at your watch, see you’ve got another 45 minutes before your RTO runs out so go get a cup of coffee before you throw the switch.  You’re gonna turn stuff back on and then get the coffee.  Ditto for the RPO.  If you can retrieve all the data, you will do that, not delete or not restore up to the defined RPO.     Of course you are going to do things as quick as you can and lose as little as you can.

When I say it’s dangerous I’m looking at the outage that drags out.  If things take an hour to bring back online, or if 15 minutes of transactions are lost….or if you don’t know how much data is lost…then you put yourself in a position where someone can come back and question if your “Best Effort” was good enough, or perhaps if someone else might have done things better.

So you need to get these things defined by the business, or the application owner etc.  If they can’t give you hard numbers, or if they just drag the chain on doing it, then you need to give the numbers to them.  Tell them…”Okay, right now if this system goes down we stand to lose 15 minutes data and it’s take me an hour to bring everything back up.”  If they accept that – fantastic, you now have a documented RTO and RPO.  If they don’t then you’ve started the conversation about what additional resource you need to hit a different target.

Another important consideration for the recovery time objective is when the clock starts.  If you have an RTO of 1 hour – is that one hour from the time the issue arises, one hour from the time an automated alert is sent.  1 hour from the time you see that alert, or one hour from the time you actually get onto the server to look at the problem or one hour from the time the decision is made that you need to invoke DR.  Obviously this matters.

The bottom line is that if you don’t have those two numbers defined then you cannot define whether your higher availability and disaster recovery strategy is a success or not.

A note on ‘uptime’

Now when we define successful uptime it’s often defined as a percentage or number of 9’s.  Let’s take a quick look at that.  Let’s say you go to your boss and say we had 95% uptime.  That’s a nice number right?  On a test you’d be happy with a 95?  But in uptime terms that translates to 72 minutes downtime every day, or over 18 days a year of downtime.  So the more 9’s we add the narrower those allowances become.  At 99.999% downtime we are talking about less than a second per day and just over 5 minutes per year.  So basically you are talking one unscheduled restart a year.

What Disaster?

The final thing I want to talk about before we move on is the different types of disaster.  You need to think of HA and DR basically like insurance policies.  I live in Christchurch, which got rather smashed up in a recent earthquake, and once upon a time insurance policies were simple here.  You insured your house for a lump sum, and so long as you kept up your regular maintenance(payments) then you knew that should disaster strike your insurance company would come along and you’d have your new house up and running within a sensible time(Currently insurance companies are defining ‘reasonable time as around 5-10 years).  This is an example of responding to setting an RTO RPO with “Just do your best”.  Who’s to say 5 years isn’t reasonable if it’s not defined?  Nowadays you have to provide all types of addition information on that insurance policy – geotech reports and the like, because the Insurance companies have discovered that to replace a $400k house can often cost more that $400k.

In the SQL world it’s the same.  We need to consider more than the SQL server when we are talking about these things, because SQL Servers are not independent of everything else in your infrastructure and they are affected by different types of ‘disaster’.  You can have your server go down.  You can have a patch fail.  You can have your service account password changed and prevent SQL from starting.  You can have network or active directory problems.  Or you can have a data issue, where something that shouldn’t have been deleted is, where a “WHERE” clause is missed when executing a query.

And don’t forget this(We are in Christchurch so we are thinking of big natural disasters, and we are in IT so we know that sometimes hardware and software don’t do what you expect):  But not all disasters are accidents.  There are people out there who can and will deliberately do damage to your system – and anyone who came along to the SQL User Group a few months back will be able to see just how easily that can potentially happen.

So as we move through the rest of this presentation remember…..there are different sorts of disaster, and we need to consider those in our HA and DR strategy.

A Happy Birthday to My Daughters

I have twin daughters who turned five years old today.  They’ve had a fun weekend of partying, first with their kindy friends on Friday and then with both sides of the family on Sunday.  They have been thoroughly spoilt with all types of different presents, but I’m a strong believer in giving gifts that didn’t just come off the shelf of the local store.  To me that shows a bit of a personal touch and some thought about what it is the recipient of the gifts will like.  It has nothing at all to do with me being cheap:)

Now being a database guy there’s not much of my job that I can share with the girls.  Even though Kaylee asked for a database for her Birthday, we later deciphered she actually meant her own laptop, which is probably a bit much for a five year old.  But a week before the girls birthdays I started dabbling with a tool called construct 2, which allows even a programming Muppet like me to be able to put together a reasonable sort of a game in a short space of time.

So apart from the scooters and My Little Ponies, the girls also got this game made for them for their birthday:  http://www.gumptiongames.net/happybirthday/index.html

Check it out and let me know what you think.

How to Use TSQL to Get Database Backup History

Having a record of the database backup history is particularly useful.  I use the following script a lot, but I find it’s main use is when I am logshipping and SOMEBODY breaks the recovery chain by taking an extra log backup.  This can give you an idea of who did it and where the log backup went.

SELECT TOP 50 bs.database_name, m.physical_device_name,
bs.user_name, bs.backup_start_date,
CASE bs.[type]
WHEN 'D' THEN 'Full'
WHEN 'I' THEN 'Differential'
WHEN 'L' THEN 'Transaction Log'
END
AS BackupType
FROM msdb.dbo.backupset bs
INNER JOIN msdb.dbo.backupmediafamily m ON bs.media_set_id = m.media_set_id
WHERE bs.database_name = 'AdventureWorks'
ORDER BY backup_start_date DESC, backup_finish_date
GO

If you are just after the last backup date of each database take a look at this one:

select T1.name as [DatabaseName],
MAX(T2.backup_finish_date) as [Last Backup Date]
from sys.databases T1 LEFT OUTER JOIN msdb.dbo.backupset T2
on t2.database_name=t1.name
where t1.name not in ('model','tempdb')
and t2.type in ('D', 'I', 'G','F') — Data, incremental, file/filegroup or file/filegroup incremental
group by T1.name
order by t1.name

How to Use TSQL to Monitor Index Fragmentation

The script I use to monitor fragmentation is a very slightly modified version of the one found here which gives me a little additional information and excludes a bunch of indexes I probably don’t care about:

SELECT OBJECT_NAME(ind.OBJECT_ID) AS TableName,
ind.name AS IndexName,indexstats.index_type_desc AS IndexType,indexstats.page_count AS [Page Count],
indexstats.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,NULL) indexstats
INNER JOIN sys.indexes ind
ON ind.object_id=indexstats.object_id
AND ind.index_id=indexstats.index_id
WHERE indexstats.avg_fragmentation_in_percent> 10
and indexstats.page_count>500
ORDER BY indexstats.avg_fragmentation_in_percent DESC

According to Paul Randall’s blog post on Where the Books Online Index Fragmentation Thresholds Come From, the generally accepted guidelines are to rebuild indexes larger than 1000 pages that have more than 30% fragmentation.  So in the query above I have trimmed out indexes with less than 500 pages and included page count in the output.  You can go to 1000 pages if you want, but I like to start with a more aerial view when I have to troubleshoot index fragmentation.  It’s also worth reorganising Indexes which are fragmented below 30% – according to the above article anywhere between 5 and 30% should be reorganised.

Paul also makes a really important point in the article quoted above, and that is that these numbers are made up to give us a starting point.  So use the script above freely, but don’t forget there are a lot of other factors involved in whether the indexes it returns are causing you problems, or your constant attempts to get them sitting nicely are actually doing nothing to improve your server performance and are just sucking up your time and effort.