AI is coming for your job… but… you’ll be fine.

This month’s TSQL Tuesday is hosted by Pinal Dave, who asks the question “Has AI helped you with your SQL Server Job?“. It’s a fascinating question, because the marketing on numerous products gets me very excited, but then I use them and… well… I’m underwhelmed.

First of all, the term AI is a buzzword, and as such it gets picked up by a whole bunch of people who don’t really know what it is, how it works or what its current limitations are, and chucked into any presentation they are doing. I’ve heard formulas in spreadsheets, threshold-based alerting and indexed searching all described as AI. Quite honestly, if you say your product has AI in it, you seem more likely to get your project greenlit or funded. Sometimes even the promise that you will use AI later is enough to get investors and decision makers excited, because what they hear is “AI = computer doing work = human not doing work = don’t have to pay wages anymore”. Even big companies who should know better fall into this pattern, such as Amazon, whose Just Walk Out technology, supposedly powered by leading-edge vision technology, actually turned out to be powered by hundreds of workers in India watching your shopping session on video.

So far I’ve seen the AI-oversell pattern repeat over and over, and in my own workplace I have to say that the integration of AI has been time-consuming and painful. I was lucky enough to attend PASS Summit last year, and some of the demos of AI in the Azure Portal and in vendor products were quite impressive. I couldn’t wait to get back home and try some of them out. But the thing about demos is that they are easy to get to behave exactly how you want – especially if you control all the inputs and record the demo. I was able to duplicate some, but not all, of the demonstrations I saw, and those imperfectly. Some of the features were in preview and just not available yet.

How code is generated and used will be another interesting area to watch. The old meme that “for AI to take over programmers’ jobs, managers will need to come up with accurate requirements” is very true. Writing code is quite a small part of a programmer’s job. But when it comes to my code, I’m at a level that AI works well at. I will ask for a small, specific thing, and I will get back a specific answer that I can plug into a wider application. For the most part that means I don’t have to remember syntax in languages I might use monthly rather than daily. Then add in IntelliSense (itself a type of AI) to make sure the code includes keywords that actually exist, and I can get things working reasonably quickly.

I was also particularly keen to get my hands on CoPilot for Office 365. It’s been good for drafting emails, but really only for getting from a blank page to ‘something’ and then shaping in the information I want to include. I’ve dropped the use of CoPilot for analyzing my outgoing emails though. My emails tend to answer questions, and I try to do this quickly and concisely. CoPilot would prefer I spend more time with cuddly language. Maybe I should, but the emails don’t sound like me when I send them.

What I really wanted to be able to do with CoPilot was take a document I’d prepared for a client, such as a site review, and quickly turn it into a PowerPoint presentation. This can sometimes take up to a day manually, depending on the volume and depth of the source document. I had hoped to cut that down to an hour or two, but alas, CoPilot has not been very good at picking out key pieces of information, and the formatting has also been quite disappointing. With experimentation I’ve found that presenting the source document in a specific format assists greatly with the conversion process and the accuracy of the information presented. But I want the PowerPoint to be a summary of the Word document – the conversion shouldn’t get to dictate the format of the source document in order to do that.

And this leads me to the title of my post today. If AI can get these things right, it can shave hours off manual work processes, and obviously that means the same work can be achieved by fewer staff. AI will cost people jobs, but how many people and how many jobs? There’s no doubt that AI will just keep getting smarter and will be capable of achieving more and more tasks, so the real question is how standardized your job is. If you are doing the same thing every day and your decisions are fairly simple, you can expect someone somewhere to start looking at whether AI (or simply well-written automation) can do that more cost-effectively. If you have a reasonably complex job, and particularly if your job involves complex discussions with other actual human beings, AI is not the threat you should be worried about. The thing that will take your job is another human being who has figured out how to use AI to cut out the time-consuming bits of the role and manage the human interactions well.

Microsoft did a very smart thing when they named their AI family of products CoPilot. The not-so-subtle implication is that AI is something that supports you in your job, rather than something that comes and takes it off you. In the short term at least, that’s how I see things playing out. For me AI has been things like CoPilot and ChatGPT getting me off a blank page. Asking a question and seeing if the answer tracks. The search functionality providing its sources is a great addition, because I can fact-check the source before I read too much into the actual answer.

The Documentation Problem with AI

Because I’m lazy, I’m going to say that this post is a response to Brent Ozar’s TSQL Tuesday invite. It wasn’t originally, and it’s a week late, but it kinda fits, so let’s roll with it and not ask too many questions.

Brent asks “What was the last ticket you closed?” Unlike most of the respondents, I do work in a system where tickets are a real thing and they do largely drive my day. Everything gets turned into tasks, these are tracked against budgets and deadlines, and the ticketing system RULES OUR LIVES!

But the last ticket I closed wasn’t actually a ticket, it was a task in Microsoft To Do, or Planner, or whatever it’s being called this month. I have a recurring task for several pieces of our documentation that prompts me to go and review what is written, and if necessary to change it.

Why would I do that? Well, put simply, I hate documentation, but I hate documentation slightly less than I hate ‘lack of documentation’. I’m primarily a technical person and I like to have a list of things or a process to follow for repeatable tasks. Ideally I like to automate them, but that’s something you do after you have a reliable process, not before. I also like to make sure that any process my colleagues have to work through has a nicely documented set of steps too. The problem is that those steps change as your infrastructure changes, or your SaaS products change, or your software changes, or a bunch of other things. Documentation is seldom scoped as part of any project, and if it is, it’s rare that it’s scoped sufficiently to actually do it justice.

And why is hunting down outdated documentation particularly important right at this point in time? Because the world is changing, and we are unleashing AI on our systems, and it’s pulling recommendations from documentation we wrote 5 years ago and never updated or removed from the system. I want to work in an organization that is AI-ready, and unfortunately that means a certain amount of legwork to work through all the different boring bits of source material that AI is going to be using to create its responses.

There’s a huge amount of buzz about Artificial Intelligence at the moment, and rightly so. But there’s not enough discussion about the groundwork that’s needed for good results. Put simply, the businesses that have prioritised structure, process and documentation will reap the benefits of that investment as they integrate AI into their workflows. The businesses that haven’t are the ones that will find AI giving them the strangest answers and will have the lowest level of trust in what AI is doing for them. Let’s face it, there are a lot of these businesses out there, and they are the most likely to jump in feet first to the AI world looking for a miracle that they simply haven’t earned yet.

So, closed? Temporarily. I’ve completed the key bits, but this is a ticket that is never really closed. If we want to get the best value from AI we need to make sure its source data is as accurate as possible, and that’s an ongoing process. Maybe with the right prompting it’s a task AI can eventually take over itself, but until then… garbage in, garbage out. Funny how working with data just keeps coming back to the same concepts, isn’t it?

TSQL Tuesday – ’Tis the Season to Say Thanks… to ourselves?

This month’s TSQL Tuesday is brought to us by Kay Sauter, who asks us to have a think about who needs a thank you thrown their way. As I wind down to go on leave for the year, I’m reflecting on what a crazy year 2023 has been. I wrote this a week ago, and then I got covid, which gave me even more time to mull things over, so I’m going to completely rewrite it and thank someone different!

First thanks obviously goes to family and friends who put up with me. To my wife, who rings me and reminds me that I stopped getting paid at 5 and maybe it’s time to come home. To my girls, for listening to me recite the latest episode of Darknet Diaries or prattle on about why their school grades are stupid and ours were so much better. “Daddy, I got an E” still sounds like a bad thing to me. I will never accept it is the best grade there is. How am I meant to index things in my head if they aren’t alphabetical? We don’t live in Hogwarts!

Oh… and I should mention a quick thanks to my blog readers for holding back on the flames when I take off on a tangent.

Secondly, I want to throw out a thank you to some people who really don’t get thanked enough – ourselves. I’m talking about all the people who collect, guard and use data. Every now and then I stop and think about all the amazing things that we do with data, and how very lucky we are to be right on the edge of so many of the exciting changes that are happening in the world.

As SQL professionals we are doubly lucky, because not only do we have a great product to work with, we have an amazing community to be a part of. I had the fantastic opportunity of travelling to PASS Summit in Seattle last month and was reminded what a special group of people form this community. I want to call out all those who lead sessions – not just at PASS but at SQL and data events everywhere. We deal with a lot of complex topics, and sometimes it just takes the right presentation of something and it all clicks into place. For me that usually comes with a comparison, or a story about when something was used in the wild, and those come from blogs, SQL events, or catching up with people in the industry for a beer. I want to extend that community thanks to the folks who have made products, training and code available. Imagine being a DBA without the First Responder Kit, dbatools, Ola’s maintenance scripts or sp_whoisactive! Imagine if you had to solve every problem from scratch instead of pasting an error message into Google and being led straight to someone’s blog where you can read about the why and the how. We are part of an amazing community, and I want to take this chance to thank all the SQL heroes – past, present and future.

The thing that makes Encryption work

For this month’s TSQL Tuesday, Matthew McGiffen asks us some broad questions around Encryption and Data Protection. His timing is great, because that’s exactly what I gave a talk about last month at SQL Saturday South Island. My timing is not so great. It’s Thursday. I could tell you about Tuesday, but I’m not sure you would believe me. So apologies for being late, Matthew, and thanks for the great topic, which I keep telling everyone who will listen we all need to keep front of mind.

I’m going to start off in the past, where encryption was a big scary thing for me. I didn’t get it. It was complicated and seemed like some hidden voodoo science that I just wasn’t smart enough to understand. One thing I’ve learnt, though, is that not understanding the intricacies of how something works is actually not that big a deal, so long as you understand what it is used for, when it’s appropriate to use and how to go about implementing it. Embracing my inability to know everything about everything is something I actually recognize as a big career turning point for me, and if it hasn’t been for you yet, then I’ve got a thing called AI that we need to have a chat about.

At any rate, a little less distantly in the past, I presented at SQL Saturday South Island and, inside an hour, demoed how to do the following forms of ‘encryption’:

  • WITH ENCRYPTION object ‘encryption’
  • Connection Encryption
  • Transparent Data Encryption
  • Backup Encryption
  • Always Encrypted Column Encryption

Five forms of encryption in an hour. When to do it, why to do it, and how to do it – take my code. But that wasn’t the point of the talk. Hopefully what people took away from it was that there is no technical barrier to going away and implementing encryption in their environment straight away. The features are available in all paid editions of SQL Server, the process is really well documented, and they had just witnessed someone implement them all in a handful of minutes.
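
As a flavour of how small the technical step can be, here’s a minimal sketch of backup encryption. The certificate name, database name, paths and password are all made up for illustration – swap in your own, and test a restore before trusting it:

        use master;

        -- a database master key and a certificate in master are prerequisites for encrypted backups
        create master key encryption by password = 'UseAStrongPasswordHere!1';
        create certificate BackupCert with subject = 'Backup encryption certificate';

        -- take a compressed, encrypted backup of a (hypothetical) user database
        backup database MyDatabase
        to disk = 'D:\Backups\MyDatabase.bak'
        with compression,
             encryption (algorithm = AES_256, server certificate = BackupCert);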

The stuff that really matters is the bits around the protection. What is the impact of encryption on your database size… your backup size… your ability to use third-party dedupe products… your developers’ processes… your compliance… your security? Each one of those things is a blog post in itself (oh good… no more wondering what to blog about next month…) but the one I want to focus on is the human element, and it’s something I have had to deal with directly in the past week (see the reference to Tuesday above).

As employees we are often tasked with figuring out how to get something done quickly. We have an incredible ability to figure out the ‘how’ of a problem, and we live in a glorious age where the ‘how’ knowledge is readily and generously shared by a multitude of online genius types. So when someone says ‘encrypt this column’ it’s quite possible that a moderately competent IT professional can figure out the how, make that column encrypted and tick whatever compliance box their boss was worried about that week.

What we are often not so good at is documentation and knowledge transfer, particularly for an ad hoc task that took us a couple of hours several years ago. The biggest problem I have with encryption is that people change roles, and change jobs, and fall out of trees. When any form of encryption is in place, we run the risk of ending up in a state where data is irreversibly encrypted because the certificate is lost – or we have a backup, but not the private key or the decryption password. I don’t want to make people afraid of encryption, but I do want people to be very aware that the technical implementation of encryption is not the hard part. The hard part is agreeing and documenting all the things that go around encryption.

So my call to action to you – person with TDE in your environment, and you – person with Always Encrypted implemented on a server near the end of its life – is to make sure that more than just you knows how to access the backups or certificates, and knows where your process is documented.
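
If you’re wondering where to start, the piece that most often goes missing is a backup of the certificate and its private key that someone other than you can find. Here’s a minimal sketch – the certificate name, file paths and password are illustrative, and the files and password need to be stored (separately, and documented) somewhere off the server:

        use master;

        -- back up the certificate plus its private key so whatever it protects can be recovered elsewhere
        backup certificate TDECert
        to file = 'D:\Keys\TDECert.cer'
        with private key (
            file = 'D:\Keys\TDECert.pvk',
            encryption by password = 'StoreThisPasswordSomewhereSafe!1'
        );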

You won’t thank me for this, it will be one more long boring task to work your way through, but if you don’t do it – you, or whoever inherits your environment, may well curse the name of whoever introduced encryption into your workplace. And that would be a shame.

SQL Saturday South Island – Wrap Up

What an amazing weekend.

Sometimes you can go to a free training event and take a few notes of things to go look up and that’s a win. And some days the stars align, you pick the right sessions and you walk away with solutions to problems you are working on right now.

I was lucky enough to attend Craig Ryan’s session on reducing your SQL spend. This is something I focus on in most of my client engagements, so generally I thought we had the bases covered. Not even close! Craig had a handful of different considerations that I need to work into our own offerings, including comparing pricing in different regions and dusting off the cobwebs from the Web Edition of SQL Server – which, to be quite honest, I haven’t considered in my calculations for years. Then add in changing operating system, and we saw an example of how to achieve the same levels of performance on the same hardware (or often better hardware) for half the price. Great session, well worth the price of admission (which was free, but to be honest I would have paid twice that).

My own session went okay. We went through the 5 types of encryption that are available within SQL Server and demoed each, making it clear that technical know-how is not a reason to not start the process of implementing encryption in your environments. And a great thing happened after that session. I had fumbled a couple of times with the technical setup and switching between slide decks and Management Studio, and one of the audience came up to me afterwards with a recommendation on using colour coding for session connections. This is something I often do when I am connecting to prod or non-prod client environments, but I think it would be a great addition to a session where I’m demoing in multiple different environments – both for me and the audience. So thank you, anonymous random guy.

The next session I started going to was Warwick Rudd’s session on Azure Data Studio, but Warwick hit a technical snag when the presentation room monitor decided it had had enough and went to sleep. Warwick always presents great content, but I also know he distributes it really well, so I promised to catch it online and took the opportunity to grab a Lime scooter back to the hotel and drop off my laptop, so I was set for a big night after the event without having to lug it around.

I got back in time for Heidi Hasting to talk about some of the processes she has automated and the tools she uses. I had picked a couple of key words out of her presentation abstract and thought it might just help with an issue that a colleague is currently working on. Heidi had flown in late, had the last session of the day, and presented on 2 hours’ sleep. But man, did she deliver just what I was after, with some pieces being perfect for solving the PowerShell / Power BI analytics / MS licensing issues that have been on my periphery over the last week or two.

And as always, the real value in these events is that if you don’t get the exact piece of information you are after in a session, you can go out afterwards, buy someone a drink, chat about the technologies we are working with and chew over the problems we are faced with in our day jobs. I literally got thousands of dollars of value out of the event and had an amazing time doing it. Thanks to Hamish and the team for another successful and enjoyable event.

Come to SQL Saturday South Island

Hi all – a quick note telling you to come along and support SQL Saturday South Island.

Once again Hamish has pulled together a great group of speakers from New Zealand, Australia and even the USA. As always, it is a free day of training, and you even get a free lunch thrown in – so if not for bettering your professional career and learning lots of cool new stuff, come along and grab a free lunch, ’cause it turns out there IS such a thing as that.

SQL Saturday South Island 2023 (#1061)

Managed Instance duplicate system databases in file_stats

Suppose you use sys.dm_io_virtual_file_stats to get performance metrics on your server. Further suppose you then drop that into a table, using the database name and the collection time as your primary key. If you do this on a Managed Instance, and you group by database_id for your data collection, you are going to hit a problem.

Why?

Because this:

        -- summarise IO per database from the virtual file stats DMV
        select
            database_id,
            db_name(database_id) as database_name,
            cast(sum(num_of_bytes_read + num_of_bytes_written) / 1048576 as decimal(18, 2)) as io_total_mb,
            -- report at least 0.001 MB for reads/writes rather than zero
            case when convert(decimal(18, 2), sum(num_of_bytes_read) / 1048576) = 0 then 0.001
                 else convert(decimal(18, 2), sum(num_of_bytes_read) / 1048576) end as io_read_mb,
            case when convert(decimal(18, 2), sum(num_of_bytes_written) / 1048576) = 0 then 0.001
                 else convert(decimal(18, 2), sum(num_of_bytes_written) / 1048576) end as io_write_mb
        from
            sys.dm_io_virtual_file_stats(null, null) dm_io_stats
        group by
            database_id

will give you a result like this:

This stuffed me up quite a bit, but once you know that you are looking at stats for invisible, magic Azure Managed Instance databases messing up your results, there are several options:

  • Filter out the stupid high database_ids (see the sketch below)
  • Be consistent in whether you use name or id for your grouping and primary keys.
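
Here’s a minimal sketch of the first option. It assumes the hidden internal databases are the ones that don’t appear in sys.databases, which is how they’ve shown up for me – adjust the filter if yours behave differently:

        select
            vfs.database_id,
            db_name(vfs.database_id) as database_name,
            cast(sum(vfs.num_of_bytes_read + vfs.num_of_bytes_written) / 1048576 as decimal(18, 2)) as io_total_mb
        from
            sys.dm_io_virtual_file_stats(null, null) vfs
            -- joining to sys.databases drops rows for databases that aren't visible on the instance
            inner join sys.databases d on d.database_id = vfs.database_id
        group by
            vfs.database_id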