2021 has been a tough year and it’s been one where I have heard SQL Server, delivered as an on premise platform, referred to several times as a legacy technology. I guess nobody likes to hear the product they work on all day referred to as old, but I thought I’d use this post to get my thoughts in order around this so I have a clear response next time it gets casually dropped into a conversation.
The argument for SQL Server being dead?
The argument goes something like this. NoSQL databases started becoming popular in the late 2000’s and the promotion of them was around faster queries, huge ability to scale and querying vast pools of data a more simple exercise for developers. SQL databases were created last millennium with a focus on reducing duplication and adding structure to IT real estates that were already beginning to sprawl as computing systems were in their infancy and adoption was driven by immediate inhouse needs rather than a centrally planned and implemented strategy. SQL databases solved a number of problems, but the primary one was data-sprawl, or more specifically data being duplicated in multiple places, and the inefficiency and storage costs associated with that.
NoSQL databases rose to prominence in a different time in the IT world, a time when IT systems were already critical in the day to day jobs of most office staff and importantly, when the cost of storing data itself had substantially decreased. This meant that there was a shift in what was costing money to businesses when it came to data driven applications. The cost was not so much the infrastructure/storage anymore, so much as the server license fees and the direct development cost of the application(ie/ the developers). Born in the age of DevOps NoSQL developers were keen for a system that allowed them the flexibility of not having to define structure upfront and then be locked into that structure for the life of the application. They wanted, in short, a database system that could be modified as often as the requirements they were fed were. And at the same time cloud computing was on the rise, giving the developers an ability to distribute their code rapidly to multiple different datacenters, housed in multiple different regions, provided by multiple different providers. The infrastructure now existed to make their applications more flexible, reliable and resilient, but a central database server became a huge bottleneck.
NoSQL addressed all of these problems. Typically a NoSQL database will have the following attributes:
- Ease of use for developers
- Flexible Schemas
- Horizontal Scaling
- Fast Queries due to the data model
In short, the argument goes, when you are working with NoSQL you are working with DATABASES 2.0 – the new and improved way of managing data; the evolution of databases; the purpose built solution for data management in this millennium. Ergo, NoSQL is better. SQL is Legacy. SQL Server is dead.
Technology changes
And all that’s nice, NoSQL database systems have been developed to meet a need that many application developers, and specifically web application developers needed. The initial versions coped a lot of crap about things like reliance on ‘eventual consistency’, not being ACID compliant, not supporting transactions, being insecure and losing data. Some of this is founded, some of this appears to just be completely made up. In current NoSQL versions, these concerns are all addressed and to my mind resolved (If they did actually exist that is, the data loss thing is hard to find any actual evidence on). Where they remain it is as a design choice – transactions are optional, security is something that must be configured to meet requirements (in any product) and options like sharding come with pros and cons.
So NoSQL databases have evolved to meet the objections thrown at them. The mistake made by many NoSQL advocates is thinking that SQL databases haven’t also evolved. Here’s a handful of examples of SQL Server implementing NoSQL, or non relational features:
- Polybase – introduced in SQL 2016 Polybase allows direct integrations with NoSQL databases and Hadoop data lakes.
- JSON support – from 2016 JSON can be stored and queried within SQL Server.
- XML support – this column type has been available since SQL 2005 and can serve a similar purpose to JSON support.
- Scalability – SQL Server has always offered a range of options here. Availability groups have evolved greatly and are now available in all paid SQL Server editions, and with software assurance licensing is provided for multiple replica servers as part of the primary servers license. But the only technology that offers multiple writable replicas is replication, which is still hugely flexible and…with that it has to be said…hugely complicated.
On top of this Microsoft as a company has focussed most of it’s non-relational development effort into it’s cloud products, and the integration between relational and non-relational cloud products. Non-relational cloud products from Microsoft include:
- Azure Table Storage
- Azure Blob Storage
- Azure File Storage
- Azure Cosmos DB
So even if NoSQL databases were becoming more popular, that doesn’t necessarily have much effect on Microsofts bottom line as they have plenty of non-relational offerings both inside SQL Server itself and in the wider Microsoft family.
I say even if, because the evidence that non-relational databases are cutting into the relational database market share is flimsy at best. Numbers on DB-engines.com as at December 2021 show only one NoSQL database in the top 10 databases(MongoDB at number 5).
Now let’s talk about Legacy
What’s interesting on that ranking to me is that MS Access is sitting in the top 10. If you want to talk to me about Legacy systems it’s mind boggling that MS Access is still considered one of the top 10 databases in the world. Why is that?
Well, the thing with these charts it’s it isn’t about a direct feature by feature comparison of different database systems, or a fancy cost-benefit analysis of one against the other. It’s about who uses stuff. A lot of the noise about NoSQL databases was, and still is coming from sectors which have to be very agile and dynamic. To tie it down further I would say Startups and greenfields developments. In this scenarios you have an open choice of technologies and can pick the one that is best suited to your needs. Let’s accept all the arguments and say that NoSQL databases were always the right choice for a new data project – and that everything not developed in the past 12 months is ‘Legacy’.
What do you think businesses are running on? It is a hugely expensive exercise to port the backend of an application from one database engine to another, let alone change the underlying assumptions of the database engine itself. I’ve had half a dozen calls this year from people looking to use SQL Server as a backend to Access front-ends and the performance results are horrible. It’s not just a case of cramming the data from one system into another, the underlying methodologies just don’t work – and that is between two ‘relational’ databases!
I don’t have personal experience with moving SQL data to MongoDB data, but I’ve heard some rather horrifying stories from those who have done it, or have had to pick up the pieces after it was done. I’m not saying it can’t or won’t be successful, I’m just saying that I’ve heard enough examples of where it wasn’t to think that the percentage option for a business is to invest in tweaking poor performance in a relational database engine rather than redeveloping it in a non-relational one (or in a different relational one for that matter).
So, even if the correct choice for every new development is a NoSQL database(Hands down, it’s not, but I’m conceding that to make my point) the majority of businesses are not going to redevelop their current functional IT systems which will mean that relational databases will still continue to be the dominant player in data management for a long time to come.
This is mirrored by what I see today in my customer base. Servers become more powerful and can comfortably hold larger and larger numbers of databases, but while we consolidate systems as much as possible new ones arise and because SQL Server is already dominant in the clients environment new applications are often back ended in SQL Server. Even the move to cloud computing hasn’t largely changed this paradigm. Businesses have to support their applications, and this will often swing the choice in favor of technologies where they already have existing skillsets and\or support contracts.
So don’t be alarmed SQL DBA’s. There is plenty of life left in your skillset. Even more importantly there are many exciting paths open to you to expand that skillset.