Empty silos and DBAs


The biggest issue I have seen in the data space, similar to what happened with system administration when DevOps came along, is that we have gone from having a bunch of DBAs with deep knowledge of relational database systems to having data engineers, data scientists, platform engineers and a whole raft of other people with an interest in the data but not much interest or skill in good old-fashioned data warehouse and database design. Those skills are still really valuable and can bring a lot of benefit to teams working in that area. It’s one of a few versions of the ‘empty silo’ problem that we seem to encounter more these days. We used to throw things over the wall to ‘the DBAs’, which turned them into a bottleneck. So we stopped doing that, but instead of setting up a ‘complicated subsystem’ team, we just got rid of the DBAs. And yet we know this isn’t ‘our thing’, so we still throw it over the wall, only now there is no one on the other side.

Data warehouse schema design is a specialised area, but a technique called [agile data warehouse design] makes it more approachable by breaking the analysis down to focus on the events that take place along the ‘user journey’. This makes the whole purpose of the analysis much more apparent to business users, who are often the primary consumers of the data.

Another way in which database management can be brought into the 21st century is to use [data mesh] - one goal of which is to segregate data by domain so that, again, the business users who are interested in a specific area can have input into the design of the database as it will be used in their area of focus.

One issue with the ‘empty silo’ of DBAs is that developers no longer have a team with expertise in this complex system to turn to. All too often, a database is built by the dev team, using an ORM tool or some other automated approach, without any schema design being done. Issues are then discovered during performance testing, which is often done just before go-live or, in many cases, consists of putting the application in front of users and seeing what breaks. In many cases, the performance issues are caused by missing indexes on the columns that queries filter and join on. Indexing is an essential step that a DBA would have taken care of, but it is so often missed these days.
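To make the indexing point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns and the index name are made up for illustration, and the exact wording of the query plan varies between SQLite versions, so treat this as a demonstration of the idea rather than a tuning recipe.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical table, the kind an ORM might generate without any indexes.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10_000)],
)

lookup = "SELECT total FROM orders WHERE customer_id = ?"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, (42,))

# Add an index on the column the query filters on.
conn.execute("CREATE INDEX idx_orders_customer_id ON orders (customer_id)")

# With the index, the planner can search instead of scanning.
plan_after = query_plan(conn, lookup, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

Most relational databases offer an equivalent of `EXPLAIN`, and checking the plan for your most frequent queries before go-live is the sort of habit a DBA would have brought to the team.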

My focus is usually on go-live day and ensuring an application can be successfully deployed and monitored.

One thing that is so often forgotten is what happens when you need to update the database schema. This is the biggest ‘empty silo’ problem I can think of, because the schema will inevitably need to change.

What does that actually look like?

Are you planning to use a tool like Flyway or Liquibase?

Have you used those tools before, and have you considered your scripts' impact?

Does your code review process include checking your migration scripts?
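If you have not worked with migration tooling before, the core idea is small enough to sketch. The example below is not how Flyway or Liquibase are implemented; it is a simplified illustration, using Python's `sqlite3`, of what such tools do in principle: apply numbered scripts in order and record which ones have already run. The table names, the `schema_history` tracking table and the migrations themselves are all hypothetical.

```python
import sqlite3

# Hypothetical migrations as (version, SQL) pairs. In a real project these
# would be reviewed script files kept in version control.
MIGRATIONS = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customers ADD COLUMN email TEXT"),
    (3, "CREATE INDEX idx_customers_email ON customers (email)"),
]


def applied_versions(conn):
    """Return the set of migration versions already recorded as applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_history (version INTEGER PRIMARY KEY)"
    )
    return {row[0] for row in conn.execute("SELECT version FROM schema_history")}


def migrate(conn, migrations):
    """Apply pending migrations in version order; return the versions applied."""
    done = applied_versions(conn)
    newly_applied = []
    for version, sql in sorted(migrations):
        if version in done:
            continue
        # Explicit transaction: the schema change and its history record
        # either both succeed or are both rolled back.
        conn.execute("BEGIN")
        try:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_history (version) VALUES (?)", (version,)
            )
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
        newly_applied.append(version)
    return newly_applied


# isolation_level=None puts the connection in autocommit mode so that the
# BEGIN/COMMIT statements above are the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)

first_run = migrate(conn, MIGRATIONS)   # applies every pending migration
second_run = migrate(conn, MIGRATIONS)  # nothing left to apply

print("first run applied:", first_run)
print("second run applied:", second_run)
```

Running it twice is the point: a deployment should be able to run the migration step every time and only pick up what is new. Not every database can roll back schema changes inside a transaction the way SQLite does here, which is one more reason the scripts deserve a proper review.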

So, honestly - I miss DBAs. It’s one of those times when, as an industry, we’ve been so obsessed with improving velocity that we’ve forgotten to look at how best to deliver actual working solutions (again). We need to refill some of those silos.

The data space has shifted from having DBAs with deep knowledge of relational database systems to having data engineers, data scientists, and platform engineers with less interest in data warehouse and database design. Agile data warehouse design and data mesh can help bring database management into the 21st century. Missing indexes and a lack of schema design can cause performance issues, and updating the schema can be problematic. The industry needs to focus on delivering actual working solutions and refill the empty silo left by the DBAs.
