SOS – I’m dead.

In a strange announcement this week, Microsoft announced on February 7th that Azure Data Studio would be retired, with the retirement date listed as February 6th. You still have until February next year before it becomes unsupported, but I found the bungled announcement of the end of the product's life somehow fitting for a product that never quite took off. Personally I'd used it a few times, never really saw a compelling reason to dig deeper when Management Studio (SSMS) mostly did what I wanted, and just assumed I wasn't the target market and went back to doing what I always did. Seems I wasn't alone.

Microsoft Azure Data Studio (ADS) originally launched as SQL Operations Studio in November 2017, and I can still remember my reaction when someone first pointed out that the acronym for that was SOS. In IT, everything is going to get shortened to an acronym sooner or later, so releasing a product without realizing what its acronym is likely to be showed... a certain lack of planning. Maybe we should have seen the writing on the wall.

Initially focused on SQL Server database management, the software aimed to offer a modern, user-friendly interface, emphasizing usability and collaboration in data environments. By April 2018, Microsoft had rebranded SQL Operations Studio as Azure Data Studio, maybe aligning it with its cloud platform, but probably just acknowledging that the original name was badly thought out. The rebranding highlighted its focus on cloud-based data operations while still supporting on-premises environments. The product was designed as a lightweight, cross-platform tool available for Windows, macOS, and Linux, providing flexibility for developers working in diverse ecosystems. The multi-platform support is something that SSMS can't boast, but I'm a Windows user, my clients are Windows users, and it was never functionality that those around me needed. It also offered themes, which some people who aren't me care about, and which seems to be a problem too hard for SSMS to crack.

One of the touted features of Azure Data Studio was that it allowed users to integrate various extensions, enabling customization of the environment to meet specific workflow requirements. Notably, Microsoft included extensions for popular programming languages like Python and R, facilitating data-science-centric workflows. The tool also supported Jupyter notebooks, boosting its appeal within the data analysis and machine learning communities. All of that sounds nice until you try to use it the same way you would use, say, Visual Studio Code, and quickly realize that the extension model was heavily curated by what Microsoft was offering you. VSCode, by comparison, has a huge library of extensions with a fairly straightforward pathway to create your own.

So, without really seeing a great benefit for me, I continued using SSMS, and was disappointed to find that ADS didn't want to be ignored. If you installed SSMS you got ADS anyway. This is great fun to explain to clients when you are installing it into their environments: "You said you wanted to install SSMS, why are you installing this other stuff?"

Apart from me never really seeing the point, ADS also faced other criticisms from people who were using it. One recurring complaint was its performance issues when handling large datasets compared to SSMS. Users reported lag and responsiveness challenges, particularly when executing complex queries or managing big data environments.

So the announcement comes as a bit of a surprise to me, but it is not likely to change the way I do things. I guess some of the functionality that has been pushed into extensions for ADS (such as Azure migration assessments) will need to be rethought and repackaged over the next year, but overall I think it's best for Microsoft to focus on SSMS as its primary tool for SQL database management. Version 21 apparently gives themes to people who feel that's important, and also has Copilot functionality.

TSQL Tuesday – Managing Permissions and the importance of visibility

Steve Jones (Blog) has taken off the co-ordinator hat and put on the hosting hat for this month's T-SQL Tuesday, and he's come up with something that's quite topical for me: database permissions. In the last week I've been in calls discussing internal permissions on our own databases, calls around managing application permissions and Active Directory groups, and calls about permissions' security cousin, encryption (hint: if you are looking at a SaaS solution, ask if they encrypt their databases, and then ask for proof).

The common thread is that this is not usually initiated from the application owner's side; it is initiated from the DBA side. Application owners like their application to work, and it seems common that security is just an assumed thing that happens. It's not that they don't care about security, it's just that they hope someone else is taking care of it. And that can easily be where it ends unless someone asks some very pointed questions about who all these people with access are and whether they really need it. So let's start with happy land, or how I like to see things set up, and we can talk about reality later:

  • There is a single connection used by the application and actual user permissions to data are handled by roles within the database.
    • Preferably this is a Windows login, but it’s no big deal if it’s a SQL login, so long as the only thing using that SQL login is the application, and the password is sensible.
    • That login should have the minimum permissions that the application needs in the database. Usually this is going to be read and write permissions; maybe it also needs to alter objects, depending on how your releases hang together. When the documentation says the 'sa' account is required... it usually isn't, certainly past the initial install. (A minimal sketch of this setup follows the list below.)
  • Any direct access to the database is handled by Windows logins, preferably with groups. Groups are something I love and hate. They are great for managing the reasons for accessing a database and assigning something equivalent to role-based access. They are less great for visibility into who can actually see what data. When I get access to a client system it is usually at the SQL level, and auditing nested AD groups to produce a list of who can actually see what is not always easy. Regardless, for any direct access, least privilege is a must.
  • Changes are completed under change control with a recorded reason and authorization.
  • BI Developers don’t exist.
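
For those first couple of bullets, here is a minimal sketch of what I mean in T-SQL. Every name in it (AppLogin, AppDatabase, app_readwrite) is made up for illustration:

-- Hypothetical names throughout: AppLogin, AppDatabase, app_readwrite
CREATE LOGIN [AppLogin] WITH PASSWORD = 'UseALongGeneratedPasswordHere!';

USE [AppDatabase];
CREATE USER [AppLogin] FOR LOGIN [AppLogin];

-- A role that holds exactly what the application needs, in this case just read and write
CREATE ROLE [app_readwrite];
ALTER ROLE db_datareader ADD MEMBER [app_readwrite];
ALTER ROLE db_datawriter ADD MEMBER [app_readwrite];
ALTER ROLE [app_readwrite] ADD MEMBER [AppLogin];
-- No db_owner, and definitely no sysadmin, unless the vendor can prove it is needed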

Now, in the real world you don't often get that. At a minimum, that last one is going to be tricky, because Business Intelligence provides massive business benefit when done right, and most applications don't have quality reporting built in, or the reporting isn't specific enough to your business, or it doesn't provide the insights you are looking for. It's not that I don't like BI developers, it's just that they introduce a new variable in controlling my data. Specifically, they want to query it in different ways than the application developers intended, and usually they want to move it somewhere I no longer have control of it.

So out in the real world we don't often see my happy-land simple implementation, and instead we see a huge mish-mash of systems that talk to each other, replicate to secondary servers, and have access from people all over the place. That access may or may not still be needed, and may or may not have been appropriate in the first place. My personal favourite was a server we reviewed just before Christmas that had "Domain Users" set up as a sysadmin. I'm sure Joe the cleaning temp enjoys knowing what his bosses make and will resist the temptation of bumping up his own salary.

If we can’t necessarily control all of the various ways that the databases are accessed, then we should at least ensure that we have visibility. This comes in two forms:

Visibility of what the current state is: who has server-level roles, who has database-owner-level access, and who can read or write to each database. Even if you don't go beyond the built-in server and database roles, if you have this information then you are well ahead of a good 90% of organizations out there. There are some specialized products that can help with this, and some organizations have custom solutions. Others simply export the environment permissions and keep a record in source control. Most don't have it at all.

I want to include quite a few things in this visibility. System-level role membership is a given. Database-level role membership should also be relatively easy to collect. Individual permissions, including allow and deny rules, get complicated, and an often overlooked step is comparing permissions between servers, particularly where high availability is involved. Following that, one of the key pieces of information I look for is 'who has the ability to change permissions', with the preferred answer being as few people as possible.
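
For the role-membership piece, the catalog views get you most of the way there. A rough sketch of the sort of queries I mean (run the database-level one in each database you care about):

-- Who is in which server-level role (sysadmin, securityadmin, etc.)
SELECT r.name AS server_role, m.name AS member, m.type_desc
FROM sys.server_role_members AS srm
JOIN sys.server_principals AS r ON srm.role_principal_id = r.principal_id
JOIN sys.server_principals AS m ON srm.member_principal_id = m.principal_id
ORDER BY r.name, m.name;

-- Who is in which role in the current database
SELECT r.name AS database_role, m.name AS member, m.type_desc
FROM sys.database_role_members AS drm
JOIN sys.database_principals AS r ON drm.role_principal_id = r.principal_id
JOIN sys.database_principals AS m ON drm.member_principal_id = m.principal_id
ORDER BY r.name, m.name;

-- Explicit server-level grants that imply the ability to change security
SELECT p.name AS grantee, sp.permission_name, sp.state_desc
FROM sys.server_permissions AS sp
JOIN sys.server_principals AS p ON sp.grantee_principal_id = p.principal_id
WHERE sp.permission_name IN ('CONTROL SERVER', 'ALTER ANY LOGIN', 'ALTER ANY SERVER ROLE');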

Visibility of changes: this is trickier, but having an alert when permissions change is very nice to have. The reality is that your SQL Server permissions just shouldn't change that often. Ideally you set up the application and the permissions are static beyond that; new permissions should be tied to a new or changed business requirement. We have this implemented in a few places, so when a permissions change is made an immediate alert goes out to an administrator with what changed and who changed it. That gives some nice visibility, and just the fact that someone is being notified actually seems to reduce the number of random, inappropriate escalations.
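
How the alert gets delivered will depend on your tooling, but the basic building block for catching the change in the first place is a DDL trigger on the security event groups. A very rough sketch, assuming Database Mail is configured (the profile name and recipient are placeholders, and you would want logging and error handling around this in real life):

-- Sketch: capture server-level security changes and email an administrator
CREATE TRIGGER trg_AlertOnPermissionChange
ON ALL SERVER
FOR DDL_SERVER_SECURITY_EVENTS   -- GRANT/DENY/REVOKE, login and server role changes
AS
BEGIN
    DECLARE @event XML = EVENTDATA();
    DECLARE @body NVARCHAR(MAX) =
          N'Event: '   + @event.value('(/EVENT_INSTANCE/EventType)[1]', 'NVARCHAR(128)')
        + N' | By: '   + @event.value('(/EVENT_INSTANCE/LoginName)[1]', 'NVARCHAR(128)')
        + N' | TSQL: ' + ISNULL(@event.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)'), N'(not captured)');

    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = 'DBA_Mail_Profile',          -- placeholder profile
        @recipients   = 'dba-team@example.com',      -- placeholder recipient
        @subject      = 'SQL Server permission change detected',
        @body         = @body;
END;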

If you have the above two levels of visibility, then you are actually armed to start tidying up permissions or (we can hope) verifying that your current permissions structures and processes are appropriate and working well. If you do not, then getting that visibility should be a priority.

Starting the year thinking about Security

Okay, so last year it happened. A client got their entire infrastructure cryptolocked, and recovery has been a dog. This was not a SQL Server problem, nor a SQL breach, but it puts into perspective so much of what I've been talking about on this blog: how SQL is part of a wider environment, and how everyone has to be on the same page when it comes to securing and, if worst comes to worst, recovering that environment.

My focus during the start of 2025 is going to be the development of our DataAssure Security module. We've already got some pretty neat stuff in our managed service on the security side, but we have developed it based on a minimum threshold we want everyone to be at. So we capture your failed logins, we keep track of your system administrators, and we highlight some glaring security holes (like weak passwords and incorrectly configured linked servers), but the Security module will take that to the next level for our clients who are after a more complete solution.
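
To give a flavour of the 'glaring security holes' checks, the built-in PWDCOMPARE function makes the really bad password cases easy to spot. This is a simplified sketch, not the actual DataAssure check:

-- SQL logins with a blank password or a password equal to the login name
SELECT name AS risky_login
FROM sys.sql_logins
WHERE PWDCOMPARE(N'', password_hash) = 1
   OR PWDCOMPARE(name, password_hash) = 1;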

Building the module is going to mean a refresher for me on many of the security features of SQL, and we'll build out a bunch of the below:

  • TDE and Backup Encryption
  • Certificate Rotation
  • Monitoring Encrypted Connections
  • SQL Audit and Extended Events
  • Server and Database level permissions changes
  • Inactive and over-privileged logins

Security is hard enough when you are just thinking about your own servers, but designing solutions that are flexible enough to cover everything from hardware stores to financial institutions is especially difficult. Still, no point in starting the year off with easy tasks is there?

Let’s go.

SQL Saturday Wellington 2024

I wanted to make sure that everyone in the Wellington region is aware that Anu and the team are lining up another weekend of high-class SQL training THIS WEEKEND. I do apologize for the late post, but I've been chatting to a few Wellington-based clients in the past week to let them know, and I realize that not everyone is aware, so I need to do my bit to get the message out.

I'm going to be sharing some of the good, the bad and the ugly that we've seen with live moves of various clients into the Azure cloud. It's a session that compresses masses of experience into a (probably less than) one-hour slot, so come along and take the opportunity to learn from other people's mistakes.

But check out the schedule, because there are three tracks of data goodness, and what looks to be a fascinating keynote from Kevin McCombe and Mike Winiata from the New Zealand Fire Service.

SQL Saturday South Island 2024

Hi all. Just a reminder if you are not already signed up to get over to:

https://sqlsaturday.com/2024-07-27-sqlsaturday1084

and sign up for SQL Saturday South Island 2024 edition. I’ll be giving a talk on Azure Database migration and sharing some of the good, the bad and the ugly we’ve experienced in moving clients to (and from) Microsoft Azure.

There are also four streams of data goodness with the usual suspects all making an appearance. Check out the link above for the full list. See you there!

How to find and review all backups on a drive

This is a quick piece of code that searches a drive for SQL backups (or at least anything with a .bak extension) and runs RESTORE HEADERONLY on each one to determine what database was backed up and when. There's nothing especially clever going on here, but it took me longer than I care to admit, so I thought it was worth dropping here so I can find it next time I need to do something like this. The trickiest part is getting the RESTORE HEADERONLY output into a temp table: you have to create the table and exactly match the column names and numbers. Luckily this work was already done for me here.

SET NOCOUNT ON;

DECLARE @looper INT = 0;
CREATE TABLE #BakFiles (
    FilePath NVARCHAR(MAX)
);
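-- Note: xp_cmdshell is disabled by default and needs to be enabled via sp_configure
-- before the directory listing below will work.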

INSERT INTO #BakFiles
EXEC xp_cmdshell 'dir H:\*.bak /s /b'; -- Change H:\ to your desired drive

DECLARE @FilePath NVARCHAR(MAX);
DECLARE @SQL NVARCHAR(MAX);

CREATE TABLE #BackupInfo (
    DatabaseName NVARCHAR(255),
    BackupFileLocation NVARCHAR(MAX),
    BackupStartDate DATETIME
);

DECLARE FileCursor CURSOR FOR
SELECT FilePath
FROM #BakFiles
WHERE FilePath IS NOT NULL AND FilePath <> ''; -- Filter out null or empty paths

OPEN FileCursor;

FETCH NEXT FROM FileCursor INTO @FilePath;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @looper = @looper + 1;

    SET @SQL = 'RESTORE HEADERONLY FROM DISK = N''' + @FilePath + '''';

    PRINT 'Processing file ' + CAST(@looper AS NVARCHAR(100)) + ': ' + @SQL;
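    -- The column list below has to exactly match what RESTORE HEADERONLY returns on your
    -- SQL Server version; if you get a column-count error, check the documentation for
    -- your version, as newer versions add extra columns to the output.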

    CREATE TABLE #HeaderInfo (
    BackupName nvarchar(128),
    BackupDescription nvarchar(255),
    BackupType smallint,
    ExpirationDate datetime,
    Compressed bit,
    Position smallint,
    DeviceType tinyint, 
    UserName nvarchar(128),
    ServerName nvarchar(128),
    DatabaseName nvarchar(128),
    DatabaseVersion int,
    DatabaseCreationDate datetime,
    BackupSize numeric(20, 0),
    FirstLSN numeric(25, 0),
    LastLSN numeric(25, 0),
    CheckpointLSN numeric(25, 0),
    DatabaseBackupLSN numeric(25, 0),
    BackupStartDate datetime,
    BackupFinishDate datetime,
    SortOrder smallint,
    [CodePage] smallint,
    UnicodeLocaleId int,
    UnicodeComparisonStyle int,
    CompatibilityLevel tinyint,
    SoftwareVendorId int,
    SoftwareVersionMajor int,
    SoftwareVersionMinor int,
    SoftwareVersionBuild int,
    MachineName nvarchar(128),
    Flags int,
    BindingId uniqueidentifier,
    RecoveryForkId uniqueidentifier,
    Collation nvarchar(128),
    FamilyGUID uniqueidentifier,
    HasBulkLoggedData bit,
    IsSnapshot bit,
    IsReadOnly bit,
    IsSingleUser bit,
    HasBackupChecksums bit,
    IsDamaged bit,
    BeginsLogChain bit,
    HasIncompleteMetaData bit,
    IsForceOffline bit,
    IsCopyOnly bit,
    FirstRecoveryForkID uniqueidentifier,
    ForkPointLSN numeric(25, 0),
    RecoveryModel nvarchar(60),
    DifferentialBaseLSN numeric(25, 0),
    DifferentialBaseGUID uniqueidentifier,
    BackupTypeDescription nvarchar(60),
    BackupSetGUID uniqueidentifier,
    CompressedBackupSize bigint,
    Containment tinyint,
    KeyAlgorithm nvarchar(32),
    EncryptorThumbprint varbinary(20),
    EncryptorType nvarchar(32)
);

    INSERT INTO #HeaderInfo
    EXEC sp_executesql @SQL;
	
    INSERT INTO #BackupInfo (DatabaseName, BackupFileLocation, BackupStartDate)
    SELECT DatabaseName, @FilePath, BackupStartDate
    FROM #HeaderInfo;

    DROP TABLE #HeaderInfo;

    FETCH NEXT FROM FileCursor INTO @FilePath;
END;



SELECT DatabaseName, BackupFileLocation, BackupStartDate
FROM #BackupInfo;

-- Cleanup
CLOSE FileCursor;
DEALLOCATE FileCursor;
DROP TABLE #BakFiles;
DROP TABLE #BackupInfo;

Optimizing Azure SQL Database Costs

Best Practices for Performance and Affordability

Losing control of your spend in Azure is ridiculously easy, like absolutely crazy easy. A few months ago, I set up a test environment in Azure to experiment with new features and test workloads. I created several Azure SQL databases with generous resource allocations, which were pretty dumb backends to the Azure App Services that were the thing I was actually testing. Over the next few weeks, I worked through different tests, the databases continued to run, and every now and then I hit a bottleneck and bounced the resource up. One day, everything stopped. I was using a test account with a $250/month budget, and I had spent the lot in a few days.

I had underestimated the potential costs and not kept an eye on them, never scaling back down once I'd worked around my bottleneck. I had done, in short, what so many people do in Azure. It's a classic example of how easy it is to overspend in Azure if you're not actively managing and optimizing your database resources. In my case it didn't matter too much; it just meant I had to wait until the end of the month for my budget to reset. I was also working at a pretty small scale, but I have seen, and heard of, people who hit this problem at alarmingly large scales.

Azure SQL Database is a powerful, scalable solution, but if not managed properly, the costs can quickly spiral out of control. So I thought I would share a few strategies for maintaining high performance while keeping your spending in check.

1. Choose the Right Service Tier

Azure SQL Database offers multiple service tiers and performance levels, such as Basic, Standard, and Premium, each designed for different workloads. The key to cost optimization is selecting the right tier for your workload. For test and development environments, avoid Premium or Business Critical tiers unless you truly need the high performance. Instead, opt for Basic or Standard tiers, which offer more cost-effective solutions for lower workloads.
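
If you want to check what a database is currently running as (and therefore roughly what you are paying for) without opening the portal, the service objective is visible from T-SQL. A quick sketch, run against the logical server's master database:

-- Current edition and service objective for each database on the logical server
SELECT d.name, dso.edition, dso.service_objective, dso.elastic_pool_name
FROM sys.databases AS d
JOIN sys.database_service_objectives AS dso
    ON d.database_id = dso.database_id;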

In my example above, I had chosen a lower-tier service but then scaled it up. By understanding the needs of your application, you can select a service tier that ensures performance without inflating your costs. If you do choose to scale up, work out what that new cost is and be sure you are happy to keep paying it. If not, think about when you actually have higher needs and scale only then, which brings us to our next point.

2. Scale Resources Based on Demand

Azure SQL Database allows you to scale resources up or down depending on the workload. For example, you can scale up during high-demand periods (like running large queries or processing data) and scale down when the demand is low (such as during off-hours or when the database is idle).

In the case of the test account, I didn't scale down after tests were completed. Instead of manually resizing the databases, I could have automated the scaling process using Azure Automation or scheduled tasks to reduce resources when the system wasn't actively being used. The serverless tier is actually really good for this if you don't want to go through the manual process of doing it yourself.
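
Scaling can also be scripted with plain T-SQL, which makes it easy to wire into an automation job or schedule. A rough sketch (the database name and service objectives are examples only, and the change itself completes asynchronously):

-- Scale up ahead of a heavy processing window (example objectives only)
ALTER DATABASE [MyTestDb] MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S3');

-- ...and back down again when the work is done
ALTER DATABASE [MyTestDb] MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S0');

-- Or move to the serverless compute tier and let it scale (and pause) itself
ALTER DATABASE [MyTestDb] MODIFY (EDITION = 'GeneralPurpose', SERVICE_OBJECTIVE = 'GP_S_Gen5_1');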

3. Utilize Reserved Instances

For long-term projects, using Azure’s Reserved Instances can help reduce costs significantly. By committing to a one- or three-year term, you can lock in discounted rates compared to pay-as-you-go pricing. Reserved Instances are an excellent choice for production databases that run continuously, as the cost savings can be substantial over time.

4. Monitor Usage and Set Alerts

Azure provides built-in tools like Azure Cost Management and Azure Advisor that can help you monitor usage and track costs in real time. Setting up alerts when usage reaches certain thresholds can stop you overspending unexpectedly. I've now taken to setting up budget alerts on my test accounts because, honestly, the chance of me getting called onto something else two seconds after I scale up a database I fully intend to scale back down is pretty high.
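
Cost Management tells you what you are spending; it is also worth checking from inside the database whether you are actually using what you pay for. A quick sketch against the resource stats DMV, which keeps roughly the last hour of 15-second samples:

-- Average and peak resource use for this database over the retained window
SELECT
    AVG(avg_cpu_percent)       AS avg_cpu,
    MAX(avg_cpu_percent)       AS peak_cpu,
    AVG(avg_data_io_percent)   AS avg_data_io,
    AVG(avg_log_write_percent) AS avg_log_write
FROM sys.dm_db_resource_stats;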

5. Cleanup Unused Databases – Automatically?

Lastly, it’s important to manage resources that are no longer in use. For temporary or test environments, make sure to regularly clean up unused databases. You can automate this cleanup process to prevent old, idle databases from consuming unnecessary resources and driving up costs.
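
A starting point for finding those candidates is the sys.resource_stats view in master, which keeps around two weeks of usage history per database. A rough sketch (the threshold is arbitrary, and you would still confirm with the owners before dropping anything):

-- Databases that have barely touched CPU over the retained history (~14 days)
-- Run in the master database of the logical server.
SELECT database_name,
       MAX(end_time)        AS last_sample,
       MAX(avg_cpu_percent) AS peak_cpu_percent
FROM sys.resource_stats
GROUP BY database_name
HAVING MAX(avg_cpu_percent) < 1   -- arbitrary "looks idle" threshold
ORDER BY peak_cpu_percent;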

Conclusion

I know it sounds simple, but the biggest thing you can do to optimize costs is to actually look at them. It's honestly the not paying attention that is the biggest danger. If you are looking, you are very likely to see places where you can trim expense; the real risk with cloud spend is not looking at the numbers as you go, and only seeing them when the bill arrives at the end of the month.