
Which is better: COUNT(*) or COUNT(id)?

While conducting a training recently, I was asked which is better: COUNT(*) or COUNT(id). We wanted to discuss the topic further but didn't have time, so I promised to write a blog post about it.

Coming right down to it, in almost all cases COUNT(*) is better. Typically you won't be able to tell the difference on small tables, but once the tables grow large you can see the difference in both the execution plan and the time statistics. Take, for example, a table called Tickets, which I will use below to demonstrate.

The Tickets table contains 16 million rows.
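The table's schema isn't shown in the original post; a minimal sketch of a comparable heap (column names are my assumption) would be:

```sql
-- Sketch of the demo table (assumed columns): a heap with no indexes
-- and a nullable Scheduleid, which matters later in the post
create table dbo.Tickets
(
    Ticketid   int identity(1,1),
    Scheduleid int null,
    Details    varchar(100) null
);
```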

Without any indexes, executing the queries below with a clean buffer pool:
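To get a clean (cold) buffer pool before each cold run, the usual approach is something like the following; DBCC DROPCLEANBUFFERS should only be used on dev/test servers:

```sql
-- Flush dirty pages to disk, then empty the buffer pool (dev/test only)
checkpoint;
dbcc dropcleanbuffers;

-- Report the CPU and elapsed times quoted below
set statistics time on;
```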

 

select
count(Scheduleid)
from [dbo].Tickets
option (maxdop 1)

CPU time = 2777 ms, elapsed time = 8748 ms.

 

select
count(*)
from
[dbo].Tickets
option (maxdop 1)

CPU time = 1014 ms, elapsed time = 1009 ms.

 

Without any indexes, executing the same queries with a prepopulated (warm) buffer pool:

select
count(Scheduleid)
from [dbo].Tickets
option (maxdop 1)

CPU time = 2465 ms, elapsed time = 2544 ms.

 

select
count(*)
from
[dbo].Tickets
option (maxdop 1)

CPU time = 1061 ms, elapsed time = 1052 ms.

 

Now let's look at the execution plans:

select
count(Scheduleid)
from [dbo].Tickets
option (maxdop 1)

select
count(*)
from
[dbo].Tickets
option (maxdop 1)

 

Notice from the data above how CPU time is almost always the same as elapsed time when using COUNT(*), and that both are always lower than for COUNT(id).

Looking at the execution plan, you will notice that with COUNT(*) the estimated row size is always 9 bytes, whereas with a column it depends on the column's datatype. Naturally, at this point we can assume that the count is being computed in two different ways depending on the query, so the question is: how?

 

Now notice the change in behaviour once I make the Scheduleid column NOT NULL in the Tickets table.
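The change itself, as a sketch (this assumes Scheduleid currently contains no NULLs; the ALTER fails otherwise):

```sql
-- Make the column non-nullable so the engine no longer has to
-- check each value for NULL when counting
alter table dbo.Tickets
alter column Scheduleid int not null;
```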

 

select
count(Scheduleid)
from [dbo].Tickets
option (maxdop 1)

CPU time = 1030 ms, elapsed time = 1041 ms.

 

select
count(*)
from
[dbo].Tickets
option (maxdop 1)

CPU time = 1014 ms, elapsed time = 1023 ms.

 

The execution times become almost the same. The reason is that with COUNT(*), the slot count in each page header is used to determine the number of rows. But when an individual column is the expression being counted and that column is nullable, SQL Server is forced to scan the column's values to eliminate NULLs, even when the column doesn't actually contain any; only after scanning the entire column does SQL Server truly know. Hence the warning:

 

Warning: Null value is eliminated by an aggregate or other SET operation.
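A small, self-contained repro of the two behaviours (table and values are hypothetical):

```sql
-- COUNT(*) counts rows; COUNT(col) counts non-NULL values and
-- raises the NULL-elimination warning when NULLs are present
create table #t (id int null);
insert into #t (id) values (1), (null), (3);

select count(*)  from #t;   -- returns 3, no warning
select count(id) from #t;   -- returns 2, with the warning above

drop table #t;
```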

 

Hope this clears things up a bit.


 

SQL Server DEV and SSDT – MINDTREE – CHN – DEC

Some feedback from the training

What did you like most about the training?

Knowledge on SQL

Trainer's end-to-end knowledge about the tool.

Explained concepts with hands-on examples.

Came to know topics I didn't know earlier.

Trainer was able to explain the topics covered in the training.

Performance improvement using only basic techniques (e.g. indexes).

Trainer is knowledgeable and was able to provide effective answers to the trainees' queries.

Trainer was able to give a quick solution when asked about a scenario we face in a project.

Trainer was knowledgeable.

Good examples.

Answering the questions even when the question was not closely related to the actual content.

Lots of new features explained with real-time scenarios.

SQL 2014 ADMIN, DEV – Wells Fargo – JAN

Some feedback from the training

What did you like most about the training?

 

- Quite impressed by the way Jayant taught performance tuning.

- In-depth explanation of all the key concepts.

- Gave real-time examples wherever needed, which helps in understanding the complete picture.

Part 2 Buffer pool extension SQL 2014

In this second part (read the first part here) of the series, we continue testing the limits of our RAM and try to understand SQL Server's behaviour when it encounters memory pressure. To this end, we have created a table that is a little more than 2 GB in size and have set our SQL Server buffer pool to 2 GB.

Once I start fetching data from the table, notice how the RAM in the buffer pool is utilized.

Buffer Pool               Value
----------------------    ------
Database                  244050
Simulated                  44800
Target                    524288
Dirty                       8344
In IO                          0
Latched                        0
Page Life Expectancy          77
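The same numbers can be watched from T-SQL while the test runs; one way is via the Buffer Manager performance object (counter names as exposed by the DMV):

```sql
-- Buffer Manager counters, including Page Life Expectancy
select counter_name, cntr_value
from sys.dm_os_performance_counters
where object_name like '%Buffer Manager%'
  and counter_name in ('Database pages', 'Target pages', 'Page life expectancy');
```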

Also notice how, in the Resource Monitor of the OS, disk activity is now happening on tempdb as well.

This is an indication that SQL Server is running out of memory and having to store intermediate results in tempdb. A good indicator of memory pressure is the low value of Page Life Expectancy. You will also notice that the OS page file gets used significantly as a result. This shows that when SQL Server runs out of memory, it turns to the next best thing: the page file and tempdb. Both of these files reside on disk, so any read/write IO on them is subject to disk latency, which is typically around 5 to 15 ms.

Enter the game-changing SSD. Typical read/write latency is in the range of 0.3 ms to 1 ms!

I remember reading a blog a few years back where a DBA predicted that soon all companies would start storing their data on SSDs, which are excellent for random IO compared to traditional disks. With buffer pool extension, I think we are finally at the stage where databases can move to faster and more reliable storage hardware. In the next post I will explain how to set up buffer pool extension and how it improves IO performance.

Part 1 Buffer pool extension SQL 2014

In this series I am going to cover the Buffer Pool Extension feature in MS SQL Server. Before we can discuss this feature, it might be a good idea to define what the buffer pool is and why it needs an extension in the first place. The buffer pool is the part of RAM in which data pages are cached. RAM is divided into many parts when used by SQL Server: one such part is the buffer pool, used to store data; other parts include RAM allocated for CLR, XML, full-text indexes, DBCC utilities, etc. But the buffer pool is by far the largest of these, since it is the part SQL Server needs the most.

A common misconception among Windows admins is that it's a bad thing when SQL Server uses 80% of the RAM. This is counter-intuitive to what Windows admins see with other applications, which use just enough RAM to do what is needed. SQL Server, however, works differently and tries to use as much RAM as possible, since the more data in RAM, the faster the query execution and the fewer I/Os against the slower disk subsystem.

So where do you configure how much RAM is used by the buffer pool?

The min and max memory settings under the memory tab of SQL Server properties actually configure the size of only the buffer pool. This is commonly misunderstood to be the total SQL Server memory, which it is not.

In this series I am going to configure my buffer pool to use a total of 2 GB of RAM, which is sufficient to demonstrate the behaviour of SQL Server when it runs out of RAM. To be clear, I am running this on a 64-bit system, so PAE/AWE do not apply.

So I set the MIN and MAX memory settings of my SQL instance to 2048 MB, i.e. 2 GB.
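The same setting can be applied from T-SQL rather than the SSMS properties dialog, as a sketch:

```sql
-- Cap min and max server memory at 2 GB (equivalent to the memory tab)
exec sp_configure 'show advanced options', 1;
reconfigure;
exec sp_configure 'min server memory (MB)', 2048;
exec sp_configure 'max server memory (MB)', 2048;
reconfigure;
```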

Next, I create a dummy data table with a varchar(4000) column and populate it with data up to 2 GB.
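A sketch of such a table: each row carries roughly 4 KB, so about 500,000 rows come to roughly 2 GB (table name and row count are my assumptions):

```sql
-- ~4 KB per row * 500,000 rows = roughly 2 GB of data
create table dbo.DummyData (Filler varchar(4000) not null);

insert into dbo.DummyData (Filler)
select top (500000) replicate('x', 4000)
from sys.all_columns a
cross join sys.all_columns b;
```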

More in the next part of series.