All Forums Database
mattit 3 posts Joined 11/11
03 Apr 2012
QUALIFY ... OVER (PARTITION BY ...) phrase

I'm pretty new to TD, and thought I was at the intermediate level of SQL knowledge until one of my bosses (actually my boss' boss) helped me to write a query, where he added the statement below to root out dupes:

SELECT a.*
FROM tmp_TEMP a
QUALIFY ROW_NUMBER() OVER(PARTITION BY TransID, TransLine ORDER BY ABS(Date1-Date2)) = 1

I have searched the Web as well as the TD documentation, and I am unable to find what I'm looking for.  Can someone please provide, in layman's terms, what the three functions QUALIFY, OVER() and PARTITION BY actually do?  I've figured out tha QUALIFY has something to do with a filter, but that's about all I've discovered.

Thanks much,
-- Matt
 

ulrich 714 posts Joined 09/09
03 Apr 2012

Hi,

Qualify filters on OLAP functions - simmilar to having clause in aggregate functions.

over and partition are both part of the row_number() function

check 

http://www.google.de/url?sa=t&rct=j&q=teradata%20sql%20functions%2013.10%20doc&source=web&cd=1&ved=0CC8QFjAA&url=http%3A%2F%2Fwww.info.teradata.com%2Fedownload.cfm%3Fitemid%3D102320046&ei=t-t7T-7lGMi0hAfpqqGvDA&usg=AFQjCNETZqqPVQxIJ2vcgqVYRVTs1R-vAg

OLAP functions in general and ROW_NUMBER in particular.

 

Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles

mattit 3 posts Joined 11/11
09 Apr 2012

The description didn't help much.  IHowever, 've found a little more.  The PARTITION BY is equivalent to a GROUP BY clause.  I'm still kind of stuck on the purpose of the QUALIFY and OVER clauses, though.  The example in the doc that you linked to has this short query:

SELECT StoreID, SUM(profit) OVER (PARTITION BY StoreID)
FROM facts
QUALIFY SUM(profit) OVER (PARTITION BY StoreID) > 2;

a) Why would you use a QUALIFY clause?  What would be the result of the above query without the QUALIFY, OVER and PARTITION BY functions/clauses?

b) What does the OVER() function do?

 

I'm usually pretty good at finding the info I want, but this one has me stumped.

ulrich 714 posts Joined 09/09
10 Apr 2012

You have a query and you have questions to the syntax. Why don't you try yourself?

The OVER is part of the OLAP functions syntax as the SET in an UPDATE.

QUALIFY will limit the result set - without you would get one row per row of facts back.

Partiton will reset the sum per storeid...

 

 

Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles

ulrich 714 posts Joined 09/09
10 Apr 2012

And some extracts from the documantation

The ANSI SQL:2008 window feature provides a way to dynamically define a subset of data, or
window, in an ordered relational database table. A window is specified by the OVER() phrase,
which can include the following clauses inside the parentheses:
• PARTITION BY
• ORDER BY
• RESET WHEN
• ROWS

PARTITION BY takes a column reference list and groups the rows based on the specified
column reference list over which the ordered analytical function executes. Such a grouping is
static.

If there is no PARTITION BY phrase or RESET WHEN phrase, then the entire result set,
delivered by the FROM clause, constitutes a single partition, over which the ordered analytical
function executes.

Use of QUALIFY Clause
Rows can be eliminated by applying conditions on the new column value. The QUALIFY
clause is analogous to the HAVING clause of aggregate functions. The QUALIFY clause
eliminates rows based on the function value, returning a new value for each of the
participating rows.

 

Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles

mattit 3 posts Joined 11/11
10 Apr 2012

That's the description that didn't help much (the doc's, not yours).  With some of your earlier statements and what I found, this is sort of what I was looking for (i.e., layman's terms) ...

QUALIFY = start new filtered subset
ROW_NUMBER = self explanatory
OVER {req. by ROW_NUMBER} = start a grouping statement
PARTITION BY = equivalent to GROUP BY

Put it all together from the code I supplied

QUALIFY ROW_NUMBER() OVER(PARTITION BY TransID, TransLine ORDER BY ABS(Date1-Date2)) = 1

and you get a way to only return the top row of a set within a query.  I realize you most likely knew that, but I was trying to understand not only what the statement did, but also each clause/function of the statement.  A more simplified explanation might be:

Return the first row of a subset created by a Filter (QUALIFY) ROW_NUMBER by (OVER) grouping (PARTITION BY) <fields> with an answer set that is ORDERed BY <fields>.

Thanks for the help.

You must sign in to leave a comment.