Wednesday, 22 February 2012

GROUP BY Clause And Performance Factors



In this article I am trying to describe the frequently used GROUP BY Clause and the Performance factor related to it.

A short description of GROUP BY Clause

The GROUP BY clause can be used in a SELECT statement to collect data across multiple records and group the results by one or more columns. The aggregate function columns are not included in GROUP BY cluse, I mean to say the columns that used the aggregate function is not included in the GROUP BY cluse.

The HAVING cluse is used to filters the GROUP BY cluse.


SELECT   columns1, columns2, SUM(columns3)

FROM     Mytable

GROUP BY columns1, columns2

HAVING   columns1>0


Is the GROUP BY is the Performance killer?

NO the GROUP BY itself is not the performance killer.

In many cases the GROUP BY clause dramatically decreases the performance of the Query.


Here are the some points to take care:


1.     To make the performance better, use the COMPOUND INDEXES for the GROUP BY fields.

2.     Don't use unnecessary fields or Unnatural Grouping options.

3.     I personally preferred the same sequence of columns in SELECT as well as GROUP BY. For Example.

Like this:

SELECT   columns1, columns2, SUM(columns3)

FROM      Mytable

GROUP BY  columns1, columns2



 Not Like this:

SELECT    columns1, columns2, SUM(columns3)

FROM      Mytable

GROUP BY  columns2, columns1


This is a small article, but I hope it is quite informative and thanking you to provide your valuable time on it.


Posted by: MR. JOYDEEP DAS