Monthly Archives: January 2016

How to troubleshoot MS SQL Server – Part 2

In this post we carry on from where we left off in the first part. Since we discussed what the L1 DBA (resident) should try to do when he is troubleshooting, let's look at our GP now (I am continuing the analogy of how DBAs are like doctors here too). A general practitioner or GP, as the name suggests, is a doctor who has sufficient knowledge of all the bits and pieces even if he might not be an expert. This is similar to our L2 DBAs. As an L2 DBA the expectation is that you are aware of all the bits and pieces in MS SQL Server and as a result have a better understanding of why a particular issue is occurring. The point to be emphasized here is that you're not an L2 DBA if all you know is installing and configuring SQL Server. If you find yourself relying heavily on the GUI to get things done then you're probably closer to L1 than L2.

At this point the DBA needs to be very familiar with T-SQL so that we can understand what logic the developer is trying to implement, and whether they should be using a recursive CTE instead of a WHILE loop or cursor, or perhaps drop the scalar function and look for a set-based approach. It is important for the DBA to know what the code does and the inherent limitations of the common approaches (a quick sketch of the CTE-versus-loop idea follows below). How would you like it if the doctor treating you had to constantly consult the fat medical book on his shelf?
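As a minimal sketch of what I mean, assuming a hypothetical dbo.Employees(EmployeeId, ManagerId, EmployeeName) table: instead of walking the hierarchy level by level with a WHILE loop or a cursor, a recursive CTE does it in a single set-based statement.

-- Walk an org hierarchy with a recursive CTE instead of a WHILE loop or cursor
-- (dbo.Employees is a hypothetical table used only for illustration)
WITH OrgChart AS
(
    SELECT EmployeeId, ManagerId, EmployeeName, 0 AS Lvl
    FROM dbo.Employees
    WHERE ManagerId IS NULL                                   -- anchor member: the top of the tree

    UNION ALL

    SELECT e.EmployeeId, e.ManagerId, e.EmployeeName, oc.Lvl + 1
    FROM dbo.Employees AS e
    INNER JOIN OrgChart AS oc
        ON e.ManagerId = oc.EmployeeId                        -- recursive member: one level down
)
SELECT EmployeeId, EmployeeName, Lvl
FROM OrgChart
OPTION (MAXRECURSION 100);                                    -- safety net against cyclic data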

In addition to knowing T-SQL, the doctor also needs to know other things like age, profession, foreign travel, food habits etc. Similarly, the DBA needs to know SSRS, SSIS, a little .NET, and how full-text indexes, XML and Service Broker work too. Developers are going to use a lot of the features available in SQL Server to achieve the end result, and since a server can host hundreds of databases there is a good chance you will find at least one instance of the above running on your hardware. Understanding whether encryption or compression is being used is important when sizing CPU, for example (a quick way to check is sketched below). So know what else is running on the system, and also identify external factors at the OS level, distributed transactions etc. Do you know what all the little buttons and options in SSMS do?
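For example, a couple of quick catalog queries (nothing here is specific to any particular server) tell you whether compression or TDE is in play on the instance:

-- Which tables/indexes use row or page compression?
SELECT OBJECT_SCHEMA_NAME(p.object_id) AS SchemaName,
       OBJECT_NAME(p.object_id)        AS TableName,
       p.index_id,
       p.data_compression_desc
FROM sys.partitions AS p
WHERE p.data_compression <> 0;          -- 0 = NONE

-- Which databases have TDE enabled?
SELECT name, is_encrypted
FROM sys.databases;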

Being an L2 DBA means the tools you use and the solutions you suggest aren't the run-of-the-mill standard solutions that you get off Google. They typically involve tweaking server configurations, rewriting database queries, enabling trace flags or implementing hotfixes etc. This brings us to our next point: practice somewhere else first. This is why doctors practice surgeries on cadavers first. Before the DBA performs surgery he should already know what a kidney looks like and what happens to it when he pokes it with a query hint, for example.

If the problem was simple, a resident would have already fixed it; accept that you are getting problems that are either very different or aren't being solved the usual way. Trust the L1 DBA (but also read the chart :-)) and then move on to the big stuff. Don't waste time redoing all the steps the L1 DBA already did; at this point the issue is important enough to warrant a senior guy, so treat the issue with the seriousness it deserves and be thorough with the root cause analysis.

I can't stress this enough: upgrade yourself along with your hardware. Every release comes out with new features, sometimes even service packs do too. The L1 DBA can be forgiven for not being up to date since they are still starting out, but the L2 DBA needs to be aware of what the options are so that he can make a good recommendation. Should we use FILESTREAM instead of varbinary(max), or the hierarchyid data type instead of a self-join (see the sketch below)? Know the new features and what they do even if you're not using them right now. Ignorance is not bliss when it comes to SQL Server. If doctors didn't upgrade themselves we would still be curing a lot of illnesses with bloodletting.
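As a quick sketch of the hierarchyid point (the table and values below are purely hypothetical), a single IsDescendantOf() call replaces a chain of self-joins:

CREATE TABLE dbo.OrgNodes
(
    NodeId   hierarchyid   PRIMARY KEY,
    NodeName nvarchar(100) NOT NULL
);

INSERT INTO dbo.OrgNodes (NodeId, NodeName)
VALUES (hierarchyid::GetRoot(),      N'CEO'),
       (hierarchyid::Parse('/1/'),   N'VP Sales'),
       (hierarchyid::Parse('/1/1/'), N'Sales Manager');

DECLARE @vp hierarchyid = hierarchyid::Parse('/1/');

-- All nodes under (and including) the VP, without a single self-join
SELECT NodeId.ToString() AS NodePath, NodeName
FROM dbo.OrgNodes
WHERE NodeId.IsDescendantOf(@vp) = 1;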

Knowledge of internals is good but not essential (yet); knowing genome sequences and how DNA works might make you a good doctor but isn't essential to doing your day-to-day job. However, knowing that mutations in chromosomes cause cancer is information every DBA should have. A lot of the time DBAs misunderstand / misinterpret / misuse internals knowledge to their own downfall. Knowing internals is very important, but there are so many nuances that you will almost always find some info contradicting what you just read somewhere else on MSDN (how many times have you heard "it depends" as the answer to your question?). Before you start your internals journey pick a field, e.g. nephrology, neurology, high availability, performance tuning etc., and explore that path to its end before starting the next one.

L2 DBAs have the awesome responsibility of explaining to the end users what went wrong and why, so it's important to make sure you identify the right root cause and the advantages and disadvantages of each solution being suggested. In insurance we often look for doctors who cost the company more; here are two of the most common reasons why:-

  • Doctors who perform too many tests, often indicating they have no clue what they are looking for or simply that they are padding the bill
  • Doctors with a track record of misdiagnosing illnesses, causing the patient to keep coming back for more treatment

Try not to be one of these.

SQL Cuppa – A new Skype handle and logo

Recently I tried to start a thing where every Friday database folks get together and chat over coffee or some other beverage. The intention was to share knowledge and experience within the community and to get to meet your peers. As with most first attempts it didn't go as expected, but I did learn a lot of lessons from it. This time around there is a new handle on Skype (SQLCuppa) where you can add me and we can start the conversation anytime. Also, because it's over the internet, I am hoping it will be a convenient option to get the conversations flowing. So why are we doing this? Here are a few reasons:-

  • It’s a good way to learn new stuff
  • Explore new opportunities and challenges that you might not face in your current organization- see how others are using MS SQL.
  • Because social networking is overrated – I have over 700 DBAs networked on LinkedIn, and I know 1% of them personally/professionally.
  • It’s just fun with the right people
  • Exchange ideas

[Image: SQL Cuppa logo]

Check out the logo (done using PowerPoint). If you see this logo anywhere, say at the airport, a café or even in your office, please drop by and say hello :)

Here are a few things these sessions are not about:

  • Self-promotion – in all its forms

Here is a link to the first announcement.

Auditing via the T-log

A client of mine recently wanted to find out who deleted all the records from a table being used in production. This is an issue that we have all come across at some point in our careers, usually at the stage just before we finally implement audits or triggers to prevent these things from happening in future. Now it would be great to have an audit, or to be able to find the required details within the default trace, or, if you're very lucky, in the sys.dm_exec_connections DMV. But we are almost always guaranteed to be unlucky when it comes to tracing the culprit. So we resort to the T-log. Naturally it makes sense to look at the T-log since it records every DML operation that happened, so we are bound to find something useful in it.

In this particular case we first try

SELECT * FROM sys.fn_dblog(NULL, NULL);

This undocumented DMF reads the T-log and presents the data within it in an easy-to-understand grid format. I mainly use this DMF to understand the internal workings of recovery models, backups and restores, but it can be used for pretty much any SQL Server internals investigation. The screenshot below shows an example of the DMF being used; notice that everything we need to identify who made the change, and to which table, is highlighted in red.

[Screenshot: Auditing via the T-log]
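Since the screenshot doesn't paste well as text, here is a rough sketch of the kind of filtering it shows; the table name and transaction ID below are hypothetical placeholders.

-- Find transactions that deleted rows from the table in question
SELECT [Transaction ID], COUNT(*) AS RowsDeleted
FROM sys.fn_dblog(NULL, NULL)
WHERE Operation = 'LOP_DELETE_ROWS'
  AND AllocUnitName LIKE 'dbo.MyTable%'
GROUP BY [Transaction ID];

-- Map one of those transactions back to the login that started it
SELECT [Transaction ID], [Begin Time],
       SUSER_SNAME([Transaction SID]) AS LoginName
FROM sys.fn_dblog(NULL, NULL)
WHERE Operation = 'LOP_BEGIN_XACT'
  AND [Transaction ID] = '0000:000004d2';   -- substitute an ID returned by the first query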

Make sure to map the user to the right transaction. However, as is the case with most companies, we take log backups that cause the T-log to roll over, and then we lose the data shown above. In these cases there is another DMF, sys.fn_dump_dblog, that can be used to read the data directly off the T-log backups, as the screenshots below demonstrate.
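As a rough text sketch of that call (the backup path is a hypothetical placeholder; the function takes 68 parameters, so everything after the file name is simply left at DEFAULT):

SELECT [Current LSN], Operation, [Transaction ID], [Begin Time],
       SUSER_SNAME([Transaction SID]) AS LoginName, AllocUnitName
FROM fn_dump_dblog(
    NULL, NULL, N'DISK', 1, N'C:\Backups\MyDb_log.trn',
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
WHERE Operation IN ('LOP_DELETE_ROWS', 'LOP_BEGIN_XACT');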

[Screenshot: Auditing via the T-log]

Next we delete from the table and track the changes from the backup.

[Screenshot: Auditing via the T-log]

As you can see from the above script, the DMF reads the T-log to identify the records that were modified and the user responsible for it. However, there are a few challenges to using either of the above DMFs. The first is that you need to know roughly when the problem occurred, otherwise there is just too much info returned from the DMF to troubleshoot effectively. The second is that the username returned is only useful if you're using Windows Authentication; if you're using SQL authentication then any user can log in using that account and as a result it won't be possible to pin the blame. Basically we need to complete the security circle by implementing the principle of least privilege and auditing to protect our data.

PS:- Sorry for the screenshots, but the plugin I use for WP throws one issue after another with long T-SQL code. Any suggestions for a better plugin would be greatly appreciated. Currently using Developer Formatter.

References
http://www.sqlskills.com/blogs/paul/using-fn_dblog-fn_dump_dblog-and-restoring-with-stopbeforemark-to-an-lsn/

Training Feedback 2015

 

As with the previous two years, we have finished analyzing our feedback data from all the trainings conducted over Jan 2015 – Dec 2015. Another remarkable year with amazing feedback scores for our core metrics. This year, however, the analysis was done using MS SQL Server 2016 SSRS. We have tried to compress our core data into the single SlideShare below. Since our trainers are actually consultants we have been able to share a lot of industry experience with participants and answer commonly encountered issues found in the real world. This is especially true for the MS SQL Server 2014 and MSBI trainings, as we see more companies exploring VLDBs and In-Memory OLTP systems as well as Windows Azure.

The above screenshot is an actual dashboard from SSRS.

Thanks to all our participants for the amazing feedback and we hope to improve your training experience in 2016 as well.

How to troubleshoot MS SQL Server

I am sure the title piqued the interest of most DBAs, since this is what we do every day. It's also one of the most common questions I get when I conduct trainings. Before I get into the details I want to call out that this post is not technical in nature, so don't expect any scripts or video; there are plenty of those out there anyway. Here I explain the process, or the method, I use. As in my trainings, I compare the role of the DBA to that of a doctor, and hopefully that comparison will be relatable to everybody.

Some basic assumptions are listed below

DBA = Doctor

Server = Patient

Database = organ

Developers = Concerned relatives

Errors / Offline / Shutdown = Tumors, injuries, fractures, unconsciousness etc. (a physical manifestation of the problem)

Performance Issues = Symptoms like headaches, nausea (a side effect of the actual problem)

Trace/XE/DMV = Tools used to diagnose the problem: stethoscope, MRI, X-ray etc.

Indexes, Query Hints, DBCC CHECKDB = Treatments; some are bandages, others are open-heart surgeries

By now I guess you know where I am going with this but before you put on a white coat and roam the corridors of the data center, let’s dive a little deeper.

There are different types of doctors and medicine, e.g. residents / GPs / specialists. The resident is just starting out and is akin to an L1 DBA using DMVs, sp_who2 etc. to figure out what's going on. The GP is obviously more senior but still not a specialist, so he gets to use tools like XE and Trace to figure out what's going on; unfortunately he isn't an expert in all fields and so is prone to misdiagnosing the issue. The specialist is the expert, but since SQL is very vast you won't find a lot of people who know performance tuning and security and infra management and HA and DR and development and SDLC and Agile and DevOps and cloud and .NET and… So just like in the real world, the GP needs to recommend the right specialist (heart, brain, kidney) who can then use tools like DMVs, Perfmon, RML, XE, Trace, DBCC commands etc.

Now that we have a distinction between the different types of DBA, let me give you an example of what a doctor is likely to encounter.

A patient and his concerned relatives walk into a hospital and are directed to a resident. They walk up to him and say, "Doctor, he was fine till yesterday evening and nothing has changed since then, what's wrong with him now?" How many DBAs have heard this before: "It was all working fine yesterday and we haven't made any changes, so why is the query slow?"

Here the L1 DBA needs to diagnose and suggest possible solutions; however, he has limited tools and understanding. As a result he is either not able to find anything wrong, or he misunderstands the symptoms and suggests a quick fix (aspirin) instead of fixing the actual root cause.

So what can the L1 DBA do to improve his success rate?

Ask the right questions – A doctor always starts his investigation by asking a standard set of questions that eliminate 70% (just my number; with the right questions you could get 90%) of all issues right off the bat and help narrow down the problem. If you don't ask the right questions, or if the relatives are afraid of the cost (code rework) and hide info, then you're likely to go down the wrong path and implement a temporary fix.

Know your tools properly – L1 DBAs only have to work with a very small subset of the tools available, and there is no reason it can't be mastered in a few months. Sure, you are going to use DMVs, but not all of them, so why not understand what each column means in the most common DMVs you use? An L1 DBA should know sys.dm_exec_query_stats, the missing index DMVs and the basics of execution plans so that he can identify tempdb spills, inefficient joins, out-of-date stats etc. (a quick sketch follows below).
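As a minimal sketch (none of this is specific to any one server), this is the kind of query against sys.dm_exec_query_stats an L1 DBA should be comfortable reading and tweaking:

-- Top 10 cached queries by logical reads, with the statement text they belong to
SELECT TOP (10)
       qs.total_logical_reads,
       qs.execution_count,
       qs.total_worker_time,
       SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                 ((CASE qs.statement_end_offset
                       WHEN -1 THEN DATALENGTH(st.text)
                       ELSE qs.statement_end_offset
                   END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;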

Set expectations – Concerned relatives need to know that just because they didn't make a change doesn't mean that a change hasn't occurred (for all you know the server was having an affair with an ad hoc SQL it met a few weeks back). The volume of data being inserted keeps growing, and just like people, servers age; if the hardware and software are not upgraded then naturally performance will decrease over time, and additional changes like indexes, partitioning, rewrites etc. will be required. If you want performance like the first day you got the server, then don't do any deployments or add new data to the server :-p

Monitor and learn from experience – Just like in real life, a major health issue shows a lot of warning signs before the patient finally collapses. A heart patient would probably be overweight, tire easily or complain of minor chest pains. It's not OK to say "he's always been like that", "well, he is getting old" or "it must be gas". Relatives and family physicians have been monitoring the server long enough to understand that things are not good and should implement proactive steps early on. DBAs who encounter the same issues over and over again need to ask what can be done to improve the server, and if nothing can be done then call it out so that everybody understands.

You probably noticed that I have mixed my doctors and DBA analogy quite a few times above but it hasn’t really affected the meaning or context has it? For now I am stopping at this point but as a thought experiment, how would you troubleshoot a connectivity issue?

Assuming TCP/IP = eyes, Shared Memory = touch, Named Pipes = ears and VIA = ??? :)

If the connection is not accepted it could be because one of the senses is not working, or because the server is not listening since they have earphones (firewalls) blocking the path, or you're speaking a language they don't understand, or they are ignoring you because their mom (DBA) told them not to listen (permissions) to anything you say :-p

PS: I hope you had as much fun reading this as I had writing it. Stay tuned for more…

SQL 2016 – Always encrypted

Here I try to summarize a lot of the information available on the MSDN website for the Always Encrypted feature being launched in SQL 2016. The primary concept to keep in mind here is that the application is the owner of the encryption keys and therefore the data comes to the database already encrypted.

The intention here is that when the data is stored in the database, even the DBA has no ability to decrypt and view it. Previously, with features like cell-level encryption and TDE, the DBA was the owner of the server and therefore had control over the master / symmetric / asymmetric keys, certificates etc. This meant that the DBA needed to be very trustworthy. Actually, not quite: this feature is not meant to prevent the DBA from misusing his authority so much as to improve customer confidence when moving to the cloud or outsourcing database activities. Personally I think this is a great idea, since one of the most frequent concerns my clients have is about the security of their data. With this feature, hopefully they will have more confidence in the security of their data and it will be easier to bring external vendors on board when needed. Previously the keys were stored within the database itself, which meant the DBA (internal or external) had the ability to decrypt data; if the data comes to the database already encrypted, that potential loophole is plugged. This does pose some challenges to the DBA, since performance and search behavior are now impacted, which I will discuss later.

Primarily this feature is implemented in two different ways: one is deterministic encryption and the other is randomized encryption. Most of us will recognize the first approach as behaving like hashing (the same input always produces the same encrypted value, though unlike a hash it is reversible) and the second as traditional encryption, where the same input string produces a different encrypted value each time.

Why is this important?

Well, the database still needs to perform a lot of other activities, like grouping and equality searches, on top of the encrypted data, and this becomes a challenge if you can't know the value of a column without decrypting its contents first. With deterministic encryption, since the encrypted value will always be the same, we can compare the encrypted values directly to find matching rows. This helps in cases where you want to do joins, equality searches, grouping etc.

Data that is typically unique, either by itself or when used in combination with another unencrypted column, might still prefer randomized encryption. An example of randomized encryption would be username and password: passwords need to be randomized to protect commonly used combinations, while at the same time they are almost never queried without the username, which remains SARGable. An example of deterministic encryption might be designations in government organizations, where multiple people share the same designation and you need to group people by their job titles/departments for audits etc.

So how much extra effort is this going to take?

Surprisingly little, based on what we have seen so far. A lot of the functionality is handled by the provider/driver that connects to SQL Server. The main prerequisite is that the client machine has .NET Framework 4.6 installed. There is a wizard within SSMS that helps set up everything on the server side. The changes require the application developer to modify the connection string, and there are some limitations when it comes to functionality, but most of them are minor issues.

You can find a quick post on how to set it up here.

What is the down side?

There are quite a few compatibility issues with other features within SQL Server; for a detailed list click here. But from the look of things, performance is one of the big drawbacks, since indexes are practically useless for these columns if you choose randomized encryption. For a detailed test of performance with normal encryption vs Always Encrypted, stay tuned.

References

https://msdn.microsoft.com/en-us/library/mt163865.aspx#Anchor_3

SQL 2016 – Setting up Always Encrypted

A quick step-by-step guide on setting up Always Encrypted on the server as well as the client, and a quick approach for migrating existing tables to Always Encrypted.

Right click the database > Tasks > Encrypt Columns

Press Next on the Introduction screen (BTW nice touch with the image)

On the Column Selection grid, identify the table and columns that need to be encrypted; in this example I am encrypting the Password and LastLogin columns. The encryption key is CEK_Auto by default. Then press Next.

Select the master key location; as you will see, there are options for both hosted as well as on-premise systems, and the master key is created automatically. I am not sure if this is a typical self-signed certificate. Press Next.

On the validation screen you are given the option to implement Always Encrypted immediately or to generate a PowerShell script for scheduled deployment. Note the warning below: there is potential data loss if writes are performed during this process. If you choose PowerShell, a PS1 script is saved to the location you select. Press Next.

You can review the process by visiting the log link at the bottom of the page and then press Close.

Once created, you will find the certificate listed in the certificate store of the machine, as well as a pointer to it under SSMS.

If you already have a master key created you can create a table enforcing Always Encrypted at creation time as shown below
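The screenshot doesn't reproduce well as text, so here is a rough sketch of what such a CREATE TABLE looks like, assuming a column encryption key named CEK_Auto1 (the wizard's default naming) already exists:

CREATE TABLE dbo.Users
(
    UserId    INT IDENTITY(1,1) PRIMARY KEY,
    UserName  NVARCHAR(60) NOT NULL,
    Password  NVARCHAR(128) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Auto1,
                        ENCRYPTION_TYPE = RANDOMIZED,            -- strongest, but not searchable
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NOT NULL,
    LastLogin DATETIME2
        ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Auto1,
                        ENCRYPTION_TYPE = DETERMINISTIC,         -- equality searches still possible
                        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL
);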

Note that once a table has Always Encrypted enabled you cannot perform DML against the encrypted columns via ad hoc queries executed directly in SSMS. The fact that all the data is encrypted outside of SQL Server limits these tables from interacting with most other features; to a large extent you can consider Always Encrypted mutually exclusive with other database features like In-Memory OLTP, replication etc. There is currently no documentation on how it behaves with multiple replicas in AlwaysOn.

Once Always Encrypted is set up on the server, the rest of the work is done on the client side. The first thing is to have .NET Framework 4.6 installed. The connection string for the application then needs to include the Column Encryption Setting keyword, as sketched below.
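As a sketch (the server and database names are placeholders), the only addition to an otherwise normal connection string is Column Encryption Setting=Enabled:

Server=myServer;Database=myDb;Integrated Security=true;Column Encryption Setting=Enabled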

The rest of it is just normal ADO.NET programming, where the input data needs to be passed as a SqlParameter. If you're not familiar with .NET you can still test the setup using the Import/Export Wizard within SSMS; just make sure to enable the Column Encryption Setting as shown below when selecting the destination.

The above approach can be used when migrating existing tables to Always Encrypted as well.

References

https://msdn.microsoft.com/en-us/library/mt147923.aspx

SQL 2016 – In-Memory OLTP enhancements

A bit late in the day, but I am still posting the changes to In-Memory OLTP that are proposed for SQL 2016. Why? In-Memory OLTP in SQL 2014 wasn't production ready in my opinion, but with the upcoming changes it has become more compatible with what real-world OLTP databases look like. I have arranged the list in order of what I think are the most useful changes.

Altering in-memory tables and natively compiled procedures is now supported – No surprises here; this was probably the biggest flaw in in-memory tables when they were launched. With all the DevOps and agile development happening, it seemed regressive to not allow DDL once the object had been created (a quick sketch is shown below).
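For example (assuming dbo.OrdersInMem is an existing memory-optimized table), adding an index no longer means dropping and recreating the table:

-- SQL 2016: indexes can be added to a memory-optimized table in place
ALTER TABLE dbo.OrdersInMem
    ADD INDEX ix_CustomerId NONCLUSTERED (CustomerId);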

PK/FK constraints are now supported on in-memory OLTP tables – This will help enforce proper database design instead of treating it as optional. It will also make in-memory OLTP tables more consistent with the design principles of other tables. Developers can also code better without having to add additional referential logic within the application code.

Up to 2 TB of data can now be saved in RAM – The previous recommended limit was 250 GB, and most companies would agree that this is a good amount of space and works for most tables that need to be ported. However, I always felt it should have supported at least 1 TB from the very beginning, else the overhead of managing what resides in and vacates the RAM becomes a burden for the DBA and developer. Now that 2 TB is supported out of the box we can code more simply, and that's always a good thing.

Additional query constructs added to natively compiled procedures – Natively compiled procedures were pretty limited in what they could do initially, but the changes in SQL 2016 should let developers code more easily, since a lot of the functions and features we take for granted in interpreted procedures are now supported in natively compiled procedures as well (see the sketch below). For those starting new projects on SQL 2014 this was not a serious issue, but it was a pain for migration projects, since almost every procedure needed some amount of rewrite unless it was plain CRUD. The documentation for natively compiled procedures states that the best performance gains come from procedures with complicated business logic, which is compiled into machine code and thus runs much faster.
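As a small sketch (the memory-optimized tables dbo.CustomersInMem and dbo.OrdersInMem are assumed to already exist), the LEFT JOIN and the OR in the WHERE clause below are examples of constructs that were not allowed in natively compiled procedures in SQL 2014 but are in SQL 2016:

CREATE PROCEDURE dbo.usp_CustomerOrders (@CustomerId INT)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    SELECT c.CustomerName, o.OrderId, o.Amount
    FROM dbo.CustomersInMem AS c
    LEFT JOIN dbo.OrdersInMem AS o          -- outer joins: new in 2016
        ON o.CustomerId = c.CustomerId
    WHERE c.CustomerId = @CustomerId
       OR c.IsVip = 1;                      -- OR / NOT in predicates: new in 2016
END;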

Support for any collation – In SQL 2014 any indexed string column required a BIN2 collation, but now we can have any collation. Combine this with the fact that index key columns now support NULLs, and we have a table that actually represents what most companies would see in the real world.

Multithreaded merge – In SQL 2014 the DBA was given the option of performing a merge of checkpoint files in case the system wasn't able to keep up with the amount of DML being performed. In 2016 multiple threads are used to persist the data from in-memory tables to the checkpoint files in the filegroup. This allows a much faster recovery time compared to SQL 2014; I have consistently noticed that in-memory OLTP tables hold up database recovery in most cases. Since I am currently testing on a VM I am not able to see the real effect of this yet, but I will post more detailed info once I get my hands on RTM.

For the complete list of changes see below

https://msdn.microsoft.com/en-us/library/bb510411.aspx#InMemory

 

SQL 2016 – Possible bug in temporal tables

I am not sure if this has been raised before; it seems like a very common scenario that I am sure most people will encounter, but I wasn't able to find any details on the internet or the Connect website, so here goes. While testing temporal tables I found that if a record is updated multiple times in quick succession (sub-second or sub-millisecond timeframes), the records are captured accurately within the temporal and history tables, but they are not returned accurately when queried using the temporal query syntax FOR SYSTEM_TIME BETWEEN or FROM.

It looks like temporal tables expect the end time to be datetime2(0) and as a result lose granularity at the sub-second level for updates. So if two rows are updated at effectively the same time, only the latest row is returned as part of the output (a sketch of the kind of query affected is shown below).
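As a sketch of the kind of query affected (the table name, column names and times are hypothetical), this is the temporal syntax I was using when versions created within the same second went missing from the results:

-- Expecting every version of the row that existed in this window,
-- including versions that lived for only a few milliseconds
SELECT AccountId, Balance, ValidFrom, ValidTo
FROM dbo.Accounts
    FOR SYSTEM_TIME BETWEEN '2016-01-15 10:00:00' AND '2016-01-15 10:00:01'
WHERE AccountId = 42;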

If you have encountered this issue or have an explanation for why this behavior is expected please reply to the below connect item.

I have also created a YouTube video showing this behavior in action.

Connect item

https://connect.microsoft.com/SQLServer/feedback/details/2215250/missing-data-when-using-temporal-tables

The Connect item contains a script to reproduce the issue as an attachment.