
SQL 2019: Table Variable deferred compilation

One of the features being released in SQL 2019 that has attracted some attention is table variable deferred compilation. As its name suggests, it applies to table variables and not to temp tables, CTEs, or table datatypes. More importantly, why do table variables need deferred compilation?

One of the drawbacks of table variables is that they do not have an accurate row count before query execution, and as a result their estimated row count was always 1. As you may be aware from previous blogs and videos on this site, this results in a number of situations where memory grants and execution plans are suboptimal. By delaying the compilation of a sub-plan that references a table variable, SQL Server gets a chance to capture the correct row count within the table variable and, as a result, improve the execution plan for downstream operators.

Before we can see how this looks in SQL 2019, it is a good idea to see how it used to be on SQL 2016.

The below query is executed on the exact same dataset in SQL 2016 and SQL 2019.

 

DECLARE @tablevariable TABLE
(
    customerid INT
);

INSERT INTO @tablevariable
SELECT customerid
FROM customers
WHERE IsMember = 1;

SELECT customerid, password
FROM Customers
WHERE customerid IN (SELECT customerid FROM @tablevariable);

Here is the execution plan from SQL 2016

Here is the execution plan from SQL 2019

Notice how the Sort operation from SQL 2016 has turned into a Hash Match Aggregate in SQL 2019.

Here are the execution plan details for the table variable from both SQL 2016 and SQL 2019, side by side.

SQL 2016

SQL 2019

See how the former has an Estimated Number of Rows to be read of 1 while the actual is 48874, whereas the latter has the same number of rows for estimated as well as actual. So what does this mean for performance?
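The timings shown below were captured by switching on the session statistics before executing the batch; for reference, that setup is just:

/* Report CPU time, elapsed time, and I/O counts per statement for this session */
SET STATISTICS TIME, IO ON;
GO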

Here is the output from SET STATISTICS TIME, IO ON for SQL 2016

Here is the output for SQL 2019

Naturally we can see that performance has improved, with execution time dropping from 2800 ms to 1302 ms, but more interestingly the query compile time for the latter is lower than for the former even though it is a deferred compile.

So overall this feature does improve query performance, but it still raises the question: if we had an accurate row count now, why did we still get tempdb spill warnings, as can be seen from the execution plans earlier?

Shouldn’t a more accurate row count have resulted in the correct amount of memory being granted, and more importantly, why didn’t Intelligent Query Processing detect and correct the issue in subsequent runs?
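As a starting point for that investigation, one way to see what was requested versus granted is to query the sys.dm_exec_query_memory_grants DMV from another session while the statement is running; a minimal sketch:

/* Compare requested, granted, and used memory for currently executing queries */
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       max_used_memory_kb
FROM sys.dm_exec_query_memory_grants;

When a sort or hash operator needs more workspace than the grant allows, the overflow is written to tempdb, which is exactly what the spill warnings indicate. More on this in the next blog.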

Using Time Series DAX functions in PowerBI

Recently at a training session I was asked how to implement DAX time series functions like PARALLELPERIOD and YTD in PowerBI. I always assumed that I had something written up about it, but on closer inspection found that I had missed writing about such an important piece of functionality. So here is a step-by-step explanation of how to use some of the most commonly used time series functions in PowerBI.

First you need a date column in the original dataset. For example, the below screenshot shows a table which contains data for flights in the US; the important column that I must have is highlighted in the box. Notice that the datatype is shown as datetime; however, I actually only have dates within the column, and all times are defaulted to 12:00:00.000.

Next I need to create a calendar table which meets the below criteria:

  • Must have dates that cover the range in the dataset. E.g. I usually like to have a calendar with ten years’ worth of dates in it.
  • Must have a continuous range of dates with no gaps. Even if there are no flights on Feb 25th in the original dataset, the calendar table must have a date corresponding to Feb 25th (a sketch for generating such a table follows this list).
  • The Calendar table must be related to the report data table using a relationship, as shown below in PowerBI Desktop.
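If the flight data lives in SQL Server, one straightforward way to generate such a gap-free calendar is a small T-SQL script. The sketch below assumes the table and column names (FlightDates, flightdate) used by the measures later in this post; the date range is a placeholder to adjust for your data:

/* Generate one row per date across a ten-year window, with no gaps */
DECLARE @start DATE = '2010-01-01', @end DATE = '2019-12-31';

;WITH dates AS
(
    SELECT @start AS flightdate
    UNION ALL
    SELECT DATEADD(DAY, 1, flightdate) FROM dates WHERE flightdate < @end
)
SELECT flightdate,
       YEAR(flightdate) AS CalendarYear,
       DATENAME(MONTH, flightdate) AS CalendarMonth
INTO FlightDates
FROM dates
OPTION (MAXRECURSION 0); /* a recursive CTE stops at 100 levels by default, so lift the cap */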

     

 

Next we need to identify the calendar table to PowerBI, which is done by navigating to the table within the Data view of PowerBI Desktop and clicking the Mark as Date Table button on the Modelling ribbon.

Once done, we are ready to create new measures for the calculation of PARALLELPERIOD and YTD.

Here are the formulas for your reference.

TOTALYTD

YTDFlights = TOTALYTD(SUM(ReportDataset[Flights]),FlightDates[flightdate])

 

YTDFlights is the name of the measure which we are creating; it can be called anything you like.

TOTALYTD is the name of the formula.

SUM(ReportDataset[Flights]) is the calculation we want to perform for the year to date. In this case I want to calculate the sum of all flights taking off up to the current date, starting from the beginning of the year.

FlightDates[flightdate] is how PowerBI accesses the calendar to know when the year starts and how to navigate the calendar.

PARALLELPERIOD

 

LastMonthFlights = CALCULATE(SUM(ReportDataset[Flights]),PARALLELPERIOD(FlightDates[flightdate],-1,MONTH))

 

This formula is pretty self-explanatory; literally translated it is: “Calculate the sum of flights from the ReportDataset for a period parallel to today by looking at the calendar table and stepping back by one month.”

When used within PowerView the result would look like below, after dragging and dropping the fields from the Field List on the right. One thing to pay attention to here is that the month column must be fetched from the calendar table, even if you have a month column in the original dataset too.

 

 

Tuning SQL Server – Simulating Performance issues

 

Check out the second video in our series on Tuning SQL Server. In the first video we explained the data model and covered the process by which we caused the performance issues.

In this second video of the series I show how we can call procedures using OStress.exe to simulate hundreds of users running the query concurrently, and the impact this has on the CPU. We can also use Activity Monitor to get a quick idea of whether there is an issue that we need to investigate further.
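For reference, a typical OStress invocation looks something like the line below; the server, database, and procedure names are placeholders, -n is the number of concurrent connections, and -r is the number of iterations per connection:

ostress.exe -S.\SQL2016 -E -dSalesDB -Q"EXEC dbo.usp_TopCustomers" -n200 -r10 -q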

https://www.youtube.com/watch?v=0hF0T7SF3-Y

Working with Filetable to upload documents into SQL Server

A simple script on how to set up and work with FileTables in MS SQL Server. It does the bare minimum to set up and create FileTables in SQL Server, allowing users to directly upload files into the directory and see them reflected inside SQL Server.
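Note that FILESTREAM must already be enabled at the instance level (via SQL Server Configuration Manager) before the script below will work; the T-SQL half of that prerequisite looks like this:

/* 2 = full access: T-SQL plus Win32 streaming, which FileTables require */
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO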

/* Create the database with non-transacted FILESTREAM access enabled */
CREATE DATABASE MyFileTable
WITH FILESTREAM
(
    NON_TRANSACTED_ACCESS = FULL,
    DIRECTORY_NAME = N'FileTableDir'
);
GO
/* Add a FileGroup that can be used for FILESTREAM */
ALTER DATABASE MyFileTable
ADD FILEGROUP FileTableFG
CONTAINS FILESTREAM;
GO
/* Add the folder that needs to be used for the FILESTREAM filegroup. */
ALTER DATABASE MyFileTable
ADD FILE
(
    NAME = 'FileTable_File',
    FILENAME = 'C:\Work\FileTable'
)
TO FILEGROUP FileTableFG;
GO
USE MyFileTable;
GO
/* Create a FileTable */
CREATE TABLE resumes AS FILETABLE
WITH
(
    FILETABLE_DIRECTORY = 'FileTableDir'
);
GO
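Once the script has run, the FileTable folder is exposed through the instance’s FILESTREAM share, and any files copied into it appear as rows in the table. One quick way to confirm the share path and list the uploaded documents (the column names come from the fixed FileTable schema):

/* Return the UNC path under which files can be dropped */
SELECT FileTableRootPath('dbo.resumes') AS FileTableRoot;

/* List the documents currently visible in the FileTable */
SELECT name,
       file_type,
       cached_file_size,
       file_stream.GetFileNamespacePath() AS relative_path
FROM dbo.resumes;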