Writing MDX the T-SQL way

By | September 15, 2019

 

With all the focus on data science I felt it would be a good idea to revisit a little used Query Language called MDX in SSAS ( analysis Services). Why? Because despite the usage of Hadoop and Polybase and all the enhancements in Columnstore indexes there is still a good use case for Analysis Services in today’s analytics landscape. Often the biggest challenge the developer faces (after learning how to model a Data warehouse) is how to query it. I figures I’d take a moment to try and show a side by side comparison of the different Query syntax of TSQL queries vs MDX queries. While this post is not meant to exhaustive I hope to cover the most common business cases and how we would achieve the same in TSQL. I do this so that developers can anchor the MDX query format with something they would already be familiar with.

Let’s start with something simple

The below query shows a count of facts from the fact table Normalization_index

 

The above MDX query uses a Measure from the Fact Table Normalization_index. As you can see here the syntax for MDX is slightly different than the TSQL queries in that we uses Axes to slice and Dice the data hence the use the key word Columns

 

Let’s now add a breakup by year to the above query

 

In this query we add an additional Axes to the output which in our case is a columns from the Calendar Dimension column called Month. This results in the query looking much like the rows and columns format in a RDBMS. The Default Member for a dimension is usually the All member. At this point it might make sense to explain a Dimension , member , Tuple and Set

 

A Dimension is a table containing data on which we slice and Dice the data. A Calendar is a typically the most common example of a Dimension table it contains number of Columns (called Attributes in MDX) which we use to slice the data on.

 

A Attribute is a Column within a Dimension table on which we Query

 

A Member is a unique value within the Dimension Attribute e.g. January would a member of the Attribute Month in the Dimension Calendar.

 

A tuple is the unique point f intersection of Multiple Dimension members. E.g. you might represent a Tuple as ([Calendar].[Month].[January]) this is actually a single member but it intersects with the default members for all other dimensions so it’s the point of intersection of say January with Products Dimension for All Products and Region Dimension for All Regions. (since the default member for all Dimensions is [All]).

 

A Set is Collection of Tuples and is represented by {} you can define named SETS which could be thought of as a CTE with a table of rows that have some logic applied against it. A SET might look like below

 

With set [List of Countries in Indian SubContinent]

As

{

[Region].[Country].[India],

[Region].[Country].[Sri Lanka]

[Region].[Country].[Pakistan]

[Region].[Country].[Nepal]

[Region].[Country].[Bhutan]

}

 

The above set can then be used within the query as shown below

 

Select [Measures].[Sales] on Columns ,

[List of Countries in Indian SubContinent] on rows

From Orders

 

Notice how the MDX gives the Grand Total also. This can cause the results to be incorrect if we then use the data as it is within a Reporting tool so its probably better we remove the Grand Total as shown below by using the Children function against the Attribute Month. So essentially we are saying go to the Dimension [Calendar] lookup the Attribute [Month] and fetch all its Child entities.

Here we see how we can bypass the use of the Default member [All] in the output.

 

Next we might like to see the data distributed by the Origin Airport as well

 

Here we see how adding additional columns to the Axes requires encapsulating them in a tuple () this is because we need to find the point of intersection between the states and the months and pick the total count for each point of intersection.

 

Next let’s say we want to filter the result only for the state AK, AL and AR

 

Naturally at some point you will want to start filtering rows in order to make the data more specific to a certain use case or report and there are multiple ways you can perform this operation the above is probably the most easily to learn when starting out.

 

As you can see the MDX query is a bit convoluted when written this way so we normally prefer another approach as shown below:-

Just one other way that proves useful when you will want to use the results of the above set for additional Calculations later on.

 

Now let’s try to order the results by the count of flights grouped by State and Month. Basically the month the highest number of flights for each state should be listed as the first row for each group as shown below

 

While I normally prefer to perform Order by operations in the presentation layer in this query we sort the data as it often proves useful to identify top performing regions etc. and use the output for more calculations later on downstream. Another way we can achieve the same is using the RANK Function.

 

Next we might prefer to show only the top two months for each state

 

Sometimes we might want to perform a filter to show the top N number of results for a particular dataset. Often used in retail when we want to find the best products in a particular Category etc.

 

More MDX queries to follow soon for additional types for Queries that do not have a TSQL equivalent such as the use of the Time Series Analytics functions & Navigation Functions like Parallel Period , Parent etc.

Category: Databases

About opsadmin

This post was written by Jayanth Kurup. A Microsoft SQL Server Consultant and Trainer based out of Bangalore, India. Jayanth has been working on MS SQL Server for over 15 years. He is a performance tuning and Business Intelligence expert. Having worked with companies like Microsoft, DELL, Wells Fargo, Thomson Reuters and many other fortune 100 companies. Some other technologies Jayanth works on include Microsoft Azure, PowerBI, Python and AWS. When he isn’t consulting or training, Jayanth like to travel, paint and read. He is also very active in social causes and the founder of Enabled Business Solutions. Visit his company by clicking the link in the menu or email him directly.