Monthly Archives: August 2017

Can we use in-memory oltp instead of columnstore index for DW queries

Recently when having a chat with a client I was asked if it will be a good idea to replace a view with a static in memory table so that users querying the table for a dashboard will have better performance considering the data will be in the RAM.

Bottom line the best place to do this is by using a regular table with a columnstore index however I figured this is better demonstrated in a Video considering i get asked this question quite a bit.


Scaling SQL Express with Stretch table feature


Recently I was asked by a client if the Stretch table feature will work with SQL Express and if so what the limitations are. This seemed like an interesting use case so we explored further and here is what we found.

Can SQL EXPRESS edition in SQL 2016 be used with stretch tables feature in the first place?

The answer is yes as shown in the MS documentation and the screenshot below. And yes it works with the Vanilla flavor as well.

What is the Max size of the database in SQL Express?

The max size of the database in SQL Express is 10 GB. The details are in the link mentioned previously.

Does this include data moved into Stretch table as well?

So I created a VM in Azure running SQL Express and then created a database with a single table with a char(8000) data type column and populated 1310154 rows in it which includes an identity column and a datetime column.

I then tried to use stretch table feature in SQL 2016 and move some of the data into an Azure SQL database.

Almost immediately I came across some serious drawbacks of the feature.

  • A significant drawback of the feature is that it is very expensive considering the data being stored is not part of the frequently used OLTP workload.
  • In addition when provisioning the database you don’t get to choose the database version in Azure such as basic standard etc.
  • If the remote database is unavailable even queries referencing the existing data doesn’t work
  • The remote database is not queryable directly since tables etc. do not show up in SSMS directly.
  • Performance of the queries is not so great compared to local table.
  • It is possible to delete the archive database in Azure without any checks from local instance, this will result in local queries also failing since it looks for the remote table during query execution. This is even after stopping stretch table feature and choosing to leave the data in Azure.

However I am also able to bypass the 10GB limit of the SQL express edition so that’s a plus but is it enough? I don’t think so.


Are we looking at gender pay gap incorrectly?

This is a topic that is very dear to me not because of the gender aspect of it but because I am a strong believer of Equal pay for Equal work regardless of gender, race, religion or nationality. I strongly believe there is a wage gap and that there is gender bias but I have been looking at the data myself and have found things I am not able to reconcile with what I see in the real world. This has already been analyzed by a number of Universities and Newspapers around the world. But I don’t believe things until I see them with my own eyes (more on this in another blog).

I understand that many might not want to read the full text so if you are in a hurry simply scroll to the “THE THING THAT BUGS ME THE MOST” section at the bottom. Where I explain how even when Male and Female Employees are paid the same the method used to calculate the gap will show a difference in favor of women.

The first and foremost issue is we look at pay gap as a monolith. The most common number thrown around is 80 cents to the Dollar. That however is a gross generalization. Unless humanity is a uniform mixture of equal portions of men , women , black, white , married , single , young , old , rich , poor , Christian , Muslim porridge trying to bucket people into just two doesn’t make sense. But let’s assume all things being equal (which it is not) we will still have more men than women because of how evolution works see the video below for why.


To get a meaningful understanding we need to do an apples to apples comparison. So we look at the finer details like within a specific industry and during my research I find that the most common example used is that of CEOs. But the truth is CEOs are not the best occupation to use because their salaries vary not because of their ability but the market valuation of the companies they serve. The closest relatable example I can think of was software development because this profession inherently doesn’t have any gender bias unlike Truck Drivers or Nurses. Why, because a mind is a mind and doesn’t really matter what meat suit it is using.

The below graph shows the distribution for pay ratio of Women/Men on the X axis and the population ratio for Women/men on the Y Axis. Any industry where the pay/gender details were unavailable or too low were ignored. What stands out in the below graph is the fact that software industry does a decent job of pay parity with quite a few occupations above 90% (the remaining 10% can be accounted for by overtime, travel, educational qualifications, position within organization etc.). However we see something shocking in terms of the number of women vs men in the industry. We see that in almost all cases the number of men is more than double the number of women. The root cause for this disparity in the number of Women is due to the fact the Women do not pursue technical skills as much as men. I know this to be especially true in India. Additionally women are willing to trade pay for benefits like flexible working hours, WFH etc. Women tend not to Job Hop.

Now let’s look at the service industry which is not technical , here again we see that there is a population disparity in terms of the number of women employed in each sector. Again we see that the jobs are physically demanding and as a result less attractive as a career option. But if you notice you will see that despite the rigours of the job and the obvious disparity in the population of women the pay gap is close to 85 for most sectors. Basically in these profile we see that men take high risk jobs but the pay only slightly reflects the difference in risk the rest can be accounted for under overtime etc. as mentioned above. FYI the Blue Dot top right is hotel staff where tips etc are not considered.

So at this point we have established that there are certain industries where majority of women actively try to avoid working. (Yes I said it. It’s an open secret and numbers don’t lie). But let’s look at the industries where there is no such disparity between men and women in terms of employment. Like I mentioned earlier a truck driver or construction worker is predominantly a male occupation. Just like how Teaching, Nursing, textiles etc. are dominated by females. The below graph show only those occupations where the population ratio of women to men is above 97%.

Almost immediately we can see two trends:-

  • Almost all the jobs are low paying jobs.
  • In some of these industries women actually get paid more than men.

Naturally we also see Sales agent hovering at 58% but then sales is based on commissions and the ability to sell which some people have and others don’t.


Hopefully by now you’re convinced that the problem cannot be adequately described by using 80 Cents to a Dollar. Which I feel is just marketing hype more than anything else. But what if I told you this isn’t even the root cause of my concern.

To understand this I need to first show you where I got my data from. The data for the above charts and analysis comes from the US labor department website. Kudos to them for providing such accurate data, most countries don’t have anything near as complete as this dataset. Now if you look at the link provided in the references section you will see the value for Pay provided is Median not Average.

Why does this matter?

This matters because the median means something totally different and is not an accurate way to measure pay if you’re going to treat the group as a monolith. A median takes the middle number from a set of values. Which makes sense if the distribution of number follows a Gaussian curve and we have an equal number of people on either side of the median.

This distribution implies that the majority of employees are middle management. But any HR will tell you almost every job follows a pyramid structure.

So we are looking at something like the chart below, but is not the entire part of the Gaussian curve. It should actually be only the right hand side.


Take a fictional company where everything else is equal with 16 employees and their pay structure with a total budget of 1M shown as below:-

Notice the median multiplied by the number of employees results in a shortage of 200,000 against the budget while the average doesn’t. So now that I have made this point let’s try and approximate this model to something more realistic. If you remember the first scatter plot you will see the gender ratio for software engineers is 0.25, i.e. for every 4 guys there is one girl.

If we take the above headcount and split it by this ratio (giving the advantage to women where ever possible) we get something like this. Notice I still have only 16 employees and all I have done is added gender as a slicer.

Notice that men and women are given the same salary here (there is not pay gap). However when we calculate the Median notice how Women has smaller number compare to men. How can that be when we know for a fact that the salary is the same? Well it’s because of the number of people population distribution. In fact if you divide 42857 by 50000 you get .85 which is exactly what the first scatter plot shows.

If you are arguing that there is no women entry in the Top Level then know that if I did add that the median would only have gone up. Further proving my point. So what I am trying to say is even when we give men and women the same salary we can see differences due to the way we are calculating things.


There is a pay gap between genders but it’s not as bad as we think at least until we can get more accurate data. The real problem seems to be the low percentage of women employed in high paying sectors.


I would love to be proved wrong and I have done the math multiple times to confirm, I understand I can get bashed for saying this which is why I need proof that the above stats are incorrect or I made a mistake in calculating things. Which is why I have transparently included everything I have used to arrive at this conclusion in the link below. This is data set used and it reflects the number published by the US labor department as well.


Women in STEM

Over time statistics for genders