How to troubleshoot MS SQL Server – Part 2

By | January 21, 2016

In this post we carry one from where we left off in the first part. Since we discussed what the L1 DBA (resident) should try and do when he is troubleshooting let’s look at our GP (I am continuing the analogy of how DBAs and like Doctors here too) now. A general practitioner of GP as the name suggests is a doctor who has sufficient knowledge of all the bits and pieces even if he might not be an expert. This is similar to what our L2 DBAs are. As an L2 DBA the expectation is that they are aware of all the bits and pieces in MS SQL Server and as a result have a better understanding of why a particular issue is occurring. The point to be emphasized here is that you’re not an L2 DBA if all you know if installing and Configuring SQL Server. If you find yourself relying heavily on the GUI to get things done then probably your closer to L1 than L2.

At this point the DBA needs to be very familiar with T-SQL so that we can understand what logic the developer is trying to implement and if they should be using a recursive CTE instead of a while loop or cursor. Or maybe they should stop using the scalar function and look for a set based approach. It is important for the DBA to know what the code does and the inherent limitations of the common approaches. How would you like it if the doctor treating you needs to constantly visit the fat medical book on his shelf while treating you?

In addition to knowing T-SQL the doctor also needs to know other things like age, profession, foreign travel, food habits etc. Similar to this the DBA needs to know SSRS, SSIS, a little .Net, how full text indexes, XML and Service Broker work too. Developers are going to use a lot of the features available in SQL Server to achieve the end result and since a server can host hundreds of databases there is a chance you will find one instance of the above running on your hardware. Understanding if encryption/ compression is being used is important to the DBA to size CPU for example. So know what else is running on the system and also identify external factors at the OS, distributed transactions etc. Do you know what all the little buttons and options on SSMS do?

Being a L2 DBA means the tools you use and the solutions you suggest aren’t run of the mill standard solutions that you get off Google. It typically involve tweaking server configurations, database queries, enabling trace flags or implementing hotfixes etc. This brings us to our next point Practice somewhere else first. This is why doctors practice surgeries on dead patients first. Before the DBA performs surgery he should already know what a kidney looks like and what happens to it when he pokes it with a query hint e.g.

If the problem was simple; a resident would have already fixed it, accept that you are getting problems that are either very different or aren’t being solved the usual way. Trust the L1 DBA (but also read the chart :-)) and then move on to the big stuff. Don’t waste time redoing all the steps the L1 DBA did already, at this point the issue is important enough to warrant a senior guy so treat the issue the seriousness it deserves and also be thorough with the Root cause analysis.

I can’t stress this enough Upgrade yourself along with your hardware. Every release comes out with new features, sometimes even service packs too. The L1 DBA can be forgiven for not being up to date since they are still starting out but the L2 DBA needs be aware of what the options are so that he can make a good recommendation. Should we use Filestream instead of varbinary (max), or hierarchy datatype instead of a self-join? Know the new features and what they do even if you’re not using them right now. Ignorance is not bliss when it comes to SQL Server. If doctors didn’t upgrade themselves we would still be curing a lot of illnesses with bloodletting.

Knowledge of internals is good but not essential (yet), knowing genome sequences and how DNA works might make you a good doctor but isn’t essential to doing your day to day job. However knowing that mutations in chromosomes cause cancer is important information every DBA should know. A lot of the time DBA misunderstand / misinterpret / misuse internals knowledge to their own downfall. Knowing internals is very important but there are so many nuances to it that you will almost always find some info contradicting what you have just read somewhere else on MSDN ( how many times have you heard “it depends” as the answer to your question). Before you start your internals journey pick a field e.g. nephrology, neurology, high availability, performance tuning etc. and then explore that path till its end before starting the next one.

L2 DBAs have the awesome responsibility of explaining to the end users what went wrong and why, it’s important to make sure you identify the right root cause and the advantages and disadvantages of each solution being suggested. In Insurance we often look for doctors who cost the company more, here are two of the most common reasons why:-

  • Doctors perform too many tests often indicating they have no clue what they are looking for or simply that they are blowing up the bill
  • Doctors having a track record of misdiagnosing illnesses thus causing the patient to keep coming back for more treatments

Try not to be one of these.

Leave a Reply