All Forums Database
tlittell 2 posts Joined 07/09
13 Jul 2009
qualify rank() over (partition.....question

Hi everyone....This query returns the top(1) commodity and count for each of the 10 customers. This query works fine and returns the desired results except for the additional row where the cnt value is the same. Of the 2 cnt = 22 rows I would be happy with taking the first record based on commodity_desc alphabetically (CHEMICALS in this case).Results:cust_acct_nbr commodity_desc cnt11083 GIFTWARE, NOVELTIES 1132223626 CHEMICALS 613912675 COMPUTERS 965948090 ELECTRONIC PARTS 42601163 CHEMICALS 101614014 COMPUTERS 596287794 ELECTRICAL AUDIO VIDEO 24663609 HOUSEWARES 417580149 CHEMICALS 227580149 TOILETRIES 228089459 FURNITURE 18Query:sel m.cust_acct_nbr cust_acct_nbr, com.commodity_desc, count(m.claim_cycle_sequence_nbr) cntfrom claim_prod_view_db.claim_detail m, claim_prod_view_db.claim_commodity com,ui_results_db.tl434409_dmg_denied_shprs shprs ---table contains the 10 customer values---where m.commodity_cd = com.commodity_cdand m.cust_acct_nbr = shprs.fxg_shprand com.company_cd = 'FDEG'AND m.sqi_intl_dom_cd = 'D'AND m.sqi_count_flg = 'Y'AND m.ship_dup_flg = 'N'AND m.sqi_loss_damage_flg IN ('d', 'D')AND m.sqi_paid_denied_flg ='d'and m.company_cd = 'FDEG' AND m.claim_close_dt >= '2009-06-01'AND m.claim_close_dt < '2009-07-01'group by 1,2qualify rank() over (partition by cust_acct_nbr order by cnt desc) = 1I'm new to Teradata (coming from Oracle) and new to the forum and appreciate any help anyone can give. Thanks.

dnoeth 4613 posts Joined 11/04
13 Jul 2009

You do exactly the same, you would do in Oracle :-)Replace RANK with ROW_NUMBER:qualify row_number() over (partition by cust_acct_nbr order by cnt desc, commodity_desc) = 1Dieter

Dieter

tlittell 2 posts Joined 07/09
13 Jul 2009

Thanks Dieter...guess I should stop trying to over think these things.

med 2 posts Joined 01/12
21 Jan 2012

hello, i need some help please,

I have a dataset similar to the following:

Patient_ID Unit       Start_Time        End_time
1234             3FE       3/2/09 12:00     3/5/09 13:00
1234             3FE       5/5/09 13:00      5/5/09 14:00
1234             3FE       5/5/09 14:00      5/7/09 18:00
1234             ICU       5/7/09 18:00      5/16/09 5:00
1234             3FE        5/16/09 5:00      5/23/09 7:00
1234             3FE       5/23/09 7:00      5/25/09 3:00
1234             3FE        5/25/09 3:00      5/28/09 8:00

 

The result set should look like the following:

 

Patient_ID Unit       Start_Time        End_time
1234             3FE       3/2/09 13:00      5/7/09 18:00
1234             ICU       5/7/09 18:00      5/16/09 5:00
1234             3FE        5/16/09 5:00      5/28/09 8:00

 

 

Any suggestions would be appreciated.

 

 

dnoeth 4613 posts Joined 11/04
21 Jan 2012

The expected result doesn't match your example data, the first end_time should probably be 5/5/09 13:00.

There are several ways using OLAP functions, the most efficient is probably:

SELECT Patient_ID, Unit, Start_Time,
  COALESCE(MIN(prev_end_time)
           OVER (PARTITION BY patient_id, Unit
                 ORDER BY start_time
                 ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)
           ,max_end_time) AS End_time
FROM
 (
  SELECT Patient_ID, Unit, Start_Time,
    MAX(End_time)
    OVER (PARTITION BY patient_id, Unit
          ORDER BY start_time
          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_end_time,
    MAX(End_time)
    OVER (PARTITION BY patient_id, Unit) AS max_end_time
  FROM dropme
  -- find the first row after a gap
  QUALIFY start_time > prev_end_time OR prev_end_time IS NULL
 ) AS dt

If you're on TD13.10 one of the new functions to normalize periods will be faster:

WITH cte(Patient_ID, Unit, pd) AS
 (
   SELECT
      Patient_ID,
      Unit,
      PERIOD(CAST(Start_Time AS TIMESTAMP(6)), CAST(End_Time AS TIMESTAMP (6))) AS pd
   FROM dropme
 )
SELECT
   Patient_ID,
   Unit,
   BEGIN(pd) AS Start_Time,
   END(pd) AS End_Time
FROM TABLE
 (
   TD_SYSFNLIB.TD_NORMALIZE_OVERLAP_MEET(NEW VARIANT_TYPE(cte.Patient_ID, cte.Unit),
                                 cte.pd)
   RETURNS (Patient_ID INT, Unit CHAR(3), pd PERIOD(TIMESTAMP(0)))
   HASH BY cte.Patient_ID, cte.Unit
   LOCAL ORDER BY cte.Patient_ID, cte.Unit, cte.pd
 )
AS dt (Patient_ID, Unit, pd);

This query might return the count of normalized rows, too.

Dieter

Dieter

med 2 posts Joined 01/12
22 Jan 2012

thanks Dieter thats great !!

prawen 4 posts Joined 09/12
19 Sep 2012

what is the replacement of "qualify rank()" which is in the teradata to informix query.?

Kanch 3 posts Joined 04/13
21 Apr 2013

Hi Dieter , 
 
I have the similar requirement with slight change , need your Help in that -
 
My dataset is as below -
Company ID   Effective Date   Status Codes
C1                    1st March              JS
C1                     2nd March             AB
C1                     3rd March              ZB - previous row
C1                       4th March            JS
C1                     5th March              JS - current row
 
I want my resultset to return only current and preiosu record where there is change in the status , like  below
Company ID                     Current Status                Preious Status
C1                                        JS                                    ZB
can you please help me in this.
 
Thanks
Kanch

dnoeth 4613 posts Joined 11/04
22 Apr 2013

Hi Kanch,
i don't get your requirement, you should post some more data and more detailed explanation, why you need which output.
Dieter

Dieter

Kanch 3 posts Joined 04/13
22 Apr 2013

Hi Dieter ,
I am having a company record in table having different status stored  as below -
Comapny ID - company identifier
Effv Date - Date on which record is getting inserted to table
Staus Codes - Status codes linked to company identifier , which can be same or different thatn previous record.
Company ID   Effective Date   Status Codes
C1                    1st March              JS
C1                     2nd March             AB
C1                     3rd March              ZB 
C1                     4th March               JS
C1                     5th March              JS 
Here I want to return the company identifier along with
-it's currentstatus i.e. J S (5th March)
-it's previous status where there is a change i.e. Z B (3rd March).
Thanks for your Help!
 
Kanch
 

dnoeth 4613 posts Joined 11/04
22 Apr 2013

Hi Kanch,
so you don't want all the changes, just the last one plus the most current row:

SELECT *    
FROM 
 (
   SELECT * FROM tab
   QUALIFY -- get the most current row and all rows where the status changes in the next row
      COALESCE(
         MIN(status) 
         OVER (PARTITION BY companyid 
               ORDER BY effectivedate 
               ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)
         ,status||'x') <> status
 ) dt
QUALIFY -- get the current and the previous row
   RANK() 
   OVER (PARTITION BY companyid 
         ORDER BY effectivedate DESC) <= 2

Dieter

Dieter

Kanch 3 posts Joined 04/13
24 Apr 2013

Thanks Dieter :)
I am getting the desired output.
 
Thanks
Kanch

saurora 1 post Joined 10/13
28 Oct 2013

That was a nice idea for the above example, Dieter. I have a similar situation and trying to get an idea how to get the desired output shown:
Below shown is the SOURCE and the desired OUTPUT. I am looking to get only one record for each OFFICE_ID and CAL_DT where it has the populated CLOSE_DT and LY_CLOSE_DT.
Any response is greatly appreciated.
 
 

 

SOURCE

 

 

 

 

OFFICE_ID

CAL_DT

CLOSE_DT

LY_CLOSE_DT

1

100

2011-06-05

2011-06-05

?

2

100

2011-06-06

2011-06-06

?

3

100

2012-06-05

?

2011-06-05

4

100

2012-06-05

2012-06-05

?

5

100

2012-06-06

?

2011-06-06

6

100

2012-06-06

2012-06-06

?

7

100

2013-06-05

?

2012-06-05

8

100

2013-06-06

?

2012-06-06

 

 

 

 

 

 

DESIRED OUTPUT

 

 

 

 

OFFICE_ID

CAL_DT

CLOSE_DT

LY_CLOSE_DT

1

100

2011-06-05

2011-06-05

?

2

100

2011-06-06

2011-06-06

?

3

100

2012-06-05

2012-06-05

2011-06-05

5

100

2012-06-06

2012-06-06

2011-06-06

7

100

2013-06-05

?

2012-06-05

8

100

2013-06-06

?

2012-06-06

dnoeth 4613 posts Joined 11/04
28 Oct 2013

Sorry, but this is not readable/understandable.
Dieter

Dieter

Qaisar Kiani 337 posts Joined 11/05
29 Oct 2013

Use the following quslify statement
Qualify Rank(partition by OFFICE_ID, CAL_DT) =1
And filter the records based on the condition WHERE CLOSE_DT IS NOT NULL and LY_CLOSE_DT IS NOT NULL...

ficklingIan 2 posts Joined 05/13
26 Nov 2013

Hi
I have tried the above code but I dont seem to be getting the expected results. Example table
myTab(c1 varchar(10), c2 varchar(10), c3 varchar(10), sd date ed date);
example
data row1: 'A','B','C','2012-01-01','2012-01-02'
data row2: 'A','B','C','2012-01-03','2012-01-09'
data row 3 'A+','B','C','2012-01-10','2012-01-19'
data row 4 'A','B','C','2012-01-20','9999-12-31'
 
What I expected back was three rows in which data row 1 and data row2 are 'combined', however this is not the case.
The query I use is :-

Select c1
      ,c2
      ,c3
      ,sd
      ,coalesce(min(prev_end_time)
           over (partition by c1,c2,c3
                 order by sd
                 rows between 1 following and 1 following), max_end_time) as end_time
From 
     (
       Select c1
             ,c2
             ,c3
             ,sd
                   ,max(ed) 
                       over (partition by c1,c2,c3
                               order by sd
                               rows between unbounded  preceding and 1 preceding) as prev_end_time
                   ,max(ed) over (partition by c1,c2,c3) as max_end_time
        From insight.if_test1
        qualify sd > prev_end_time  or prev_end_time is null
       ) as dt

H E L P

Ian Fickling

janus79 2 posts Joined 11/12
06 Mar 2014

Hi,

I have data like the following in the database. I need the start date and end date of the first record where ID2 is matching on the next records in sequence. can somebody please help with a query to do this case. - john
input    
rnk   ID    ID2    START   DT END-DT
1      aa   XXX     1-Oct    1-Nov
2      aa   XXX    5-Oct     5-Nov
3      aa   XXX    6-Oct     6-Nov
4      aa   YYYY   1-Sep    1-Oct
5      aa   YYYY   4-Sep    4-Oct
6      aa   ZZZZ   1-Dec  1-Jan
7      aa   XXX    30-Sep  30-Oct
8      aa   XXX    1-Oct   10-Oct
output    
rnk   ID   ID2    START DT    END-DT
1      aa   XXX    1-Oct         1-Nov
2      aa   XXX    1-Oct         1-Nov
3      aa   XXX    1-Oct        1-Nov
4      aa   YYYY   1-Sep       1-Oct
5      aa  YYYY    1-Sep        1-Oct
6      aa  ZZZZ   1-Dec        1-Jan
7      aa  XXX     30-Sep      30-Oct
8      aa  XXX     30-Sep     30-Oct
 

dnoeth 4613 posts Joined 11/04
08 Mar 2014

Hi John,
some questions:
- is rnk based on the same sort as start and end date, i.e. are the dates for the first row within the group always the lowest?
- is rnk actually a unique number without gaps?
- what's your TD release?
 

Dieter

Shashank0586 2 posts Joined 11/13
13 Aug 2015

Hi Dieter,
I need your help to prepare a query for the below scenario.
My input is :-
Three columns Subscr_Id, Start_Dt, End_Dt, for a Subscr_Id = 151 we have different start_dt and End_dt as below:-

high date= 31-12-9999
 
I want my output to look like below for the same Subscr_Id:-

Shashank0586 2 posts Joined 11/13
15 Aug 2015

Hi Dieter,
I need your help to prepare a query for the below scenario.
My input is :-

Subscription ID

Assigned bundle category with classification "BUNDL"

Start Date

End Date

GSM1723456789

1

01.01.2000

14.05.2014

GSM1723456789

110

15.05.2014

01.06.2014

GSM1723456789

11

02.06.2014

high date

Three columns Subscr_Id, Start_Dt, End_Dt, for a Subscr_Id = 151 we have different start_dt and End_dt as below:-
high date= 31-12-9999
 
I want my output to look like below for the same Subscr_Id:-

Subscription Id

Monthly bundle category with classification "BUNDL_MON"

Start Date

End Date

GSM1723456789

1

01.01.2000

14.05.2014

GSM1723456789

111

15.05.2014

31.05.2014

GSM1723456789

110

01.06.2014

01.06.2014

GSM1723456789

111

02.06.2014

30.06.2014

GSM1723456789

11

01.07.2014

high date

 

data_miner1234 1 post Joined 12/15
19 Dec 2015

Hi Dieter need your help in getting the following information.

There are say 18 columns and one of them is  a date that identifies the point of time when the data was captured. Lets call it it snapshot date. 
Need to qualify top 4 latest snapshot dates every month in a year and store them in a table. 

I am currently using QUALIFY ROW_NUMBER () OVER (PARTITION BY col1, col2, ..... col18, year, month) <= 4

While this brings me close to the answer I am not completely there. Reason being for combinations of col 1 through 18 that do not exist for the 4th snapshot, it will capture the 5th snapshot date. I just want whatever data that's present for the latest 4 snapshots of a month, could you guide me?

Thanks,
Harsh
 

dnoeth 4613 posts Joined 11/04
19 Dec 2015

Hi Harsh,
you need to partiton by year/month and order by date like

QUALIFY ROW_NUMBER()
        OVER (PARTITION BY EXTRACT(YEAR FROM datecol), 
                           EXTRACT(MONTH FROM datecol)
              ORDER BY datecol DESC) <= 4

 

Dieter

jainayush007 50 posts Joined 03/11
09 Apr 2016

Hello,

 

Need help.Need to be able to aggregate travel hrs over 12 month rolling for every state-city-traveler. And further at state-city.

 

Sample Data -

 

State City Traveler YR-MNTH travel hrs

Hawaii Honolulu Traveler1 201,601 3

Hawaii Honolulu Traveler2 201,601 108

Hawaii Honolulu Traveler3 201,601 593

Hawaii Honolulu Traveler3 201,512 722

Hawaii Honolulu Traveler2 201,512 107

Hawaii Honolulu Traveler2 201,511 88

Hawaii Honolulu Traveler3 201,511 610

Hawaii Honolulu Traveler3 201,510 710

Hawaii Honolulu Traveler2 201,510 135

Hawaii Honolulu Traveler3 201,509 584

Hawaii Honolulu Traveler2 201,509 114

Hawaii Honolulu Traveler2 201,508 158

Hawaii Honolulu Traveler3 201,508 666

Hawaii Honolulu Traveler3 201,507 606

Hawaii Honolulu Traveler2 201,507 144

Hawaii Honolulu Traveler2 201,506 76

Hawaii Honolulu Traveler3 201,506 677

Hawaii Honolulu Traveler3 201,505 600

Hawaii Honolulu Traveler2 201,505 76

Hawaii Honolulu Traveler2 201,504 79

Hawaii Honolulu Traveler3 201,504 595

Hawaii Honolulu Traveler2 201,503 109

Hawaii Honolulu Traveler3 201,503 690

Hawaii Honolulu Traveler3 201,502 571

Hawaii Honolulu Traveler2 201,502 75

 

Hawaii Honolulu Traveler2 201,501 89

Hawaii Honolulu Traveler3 201,501 605

 

Expected result -

 

State City Traveler YR-MNTH cal travel hrs

Hawaii Honolulu Traveler1 201,601 3

Hawaii Honolulu Traveler2 201,601 1269

Hawaii Honolulu Traveler3 201,601 7624

Hawaii Honolulu Traveler2 201,512 1250

Hawaii Honolulu Traveler3 201,512 7636

 

Further aggregate - 

 

State City YR-MNTH cal travel hrs

Hawaii Honolulu 201,601 8896

Hawaii Honolulu 201,512 8886

 

Thanks.

dnoeth 4613 posts Joined 11/04
10 Apr 2016

If there's always a row per month you can simply use:

SUM(travelhours) 
OVER (PARTITION BY State, City, Traveler 
      ORDER BY YR-MNTH
      ROWS 11 PRECEDING)

 
And for the state/city-combination you aggregat first:

SUM(SUM(travelhours)) 
OVER (PARTITION BY State, City
      ORDER BY YR-MNTH
      ROWS 11 PRECEDING)

...
GROUP BY State, City

 

Dieter

jainayush007 50 posts Joined 03/11
12 Apr 2016

Hello. Thanks for your response. There could be more that 1 row per month which is causing the complexity. Any ideas?

You must sign in to leave a comment.