First thing you should try is to rewrite the NOT IN on NULLable columns to a NOT EXISTS:
AND NOT EXISTS ( SELECT * FROM ASBIVWDB_dv4.MT_AS_TIMECARD_RESOURCE MATR WHERE WATR.BK_AS_PROJECT_TASK_NUM = MATR.BK_AS_PROJECT_TASK_NUM AND WATR.BK_AS_PROJECT_CD = MATR. BK_AS_PROJECT_CD AND WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD = MATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD AND WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY = MATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY AND WATR.FISCAL_YEAR_WEEK_NUM_INT = MATR.FISCAL_YEAR_WEEK_NUM_INT )
This should change Explain to a single step doing an Exclusion Join on the PIs.
Dieter
Of course it's still All-AMP, could you show the new Explain and some info about the number of rows and Indexes/Statistics?
Dieter
This is already the most efficient plan you can get, a direct join using the PIs of both tables.
Btw, this Explain is the result of an EXISTS instead of NOT EXISTS.
Regarding skew, how skewed is that table?
Can you provide COUNT/MIN/AVG/MAX(CurrentPerm) for this table from dbc.TableSizeV?
HELP STATS ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE INDEX NUPI_MT_AS_TIMECARD_RESOURCE;
will tell you about the rowcount, distinct values and max rows per value. Usually a max rows per value up to a few hundred is considered acceptable, otherwise you might consider a different PI.
But changing the PI might affect other queries in a positive or negative way, so you need some analysis to find out.
Dieter
If this is the SkewFactor from Teradata Administrator it's large (the max AMP got approx. 3 times the number of rows than the average AMP), but it's small table. What's the size of the other table?
How many AMPs do you have?
I usually run a query like this to get info about skew:
SELECT
DatabaseName
,TableName
,CAST(SUM(currentperm) / 1024**3 AS DEC(9,2)) AS GB
,CAST((MAX(CurrentPerm) / AVG(CurrentPerm)) AS DEC(5,3))AS SkewFactor
,CAST((HASHAMP() + 1) * MAX(currentperm) / 1024**3 AS DEC(9,2)) AS GBSkewed
,HASHAMP() + 1 AS NumAMPs
FROM dbc.TableSizeV
WHERE DatabaseName = ...
AND TableName = ...
GROUP BY 1,2
Dieter
HELLO DIETER,
WILL YOU PLEASE SUGGEST ME ...AS MY KNOWLEDGE I WILL CONCERN THE FOLLOWING STEPS
IF TABLE CONTAIN MULTISET .. SO IT ALLOWS THE DUPLICATES
SO TRY TO AVOID THE SKENESS "SET" ON CREATION OF TABLE...
AND PARIMARY INDEX ON MULTIPLE COLUMNS..
Hi Kalyan,
this table is not highly skewed, but number of rows per PI value might be high:
select
hashrow(BK_AS_PROJECT_TASK_NUM , BK_AS_PROJECT_CD ,FISCAL_YEAR_WEEK_NUM_INT),
count(*) as cnt
from MT_AS_TIMECARD_RESOURCE
group by 1
qualify row_number() over (order by cnt desc) <= 10
Regarding skew:
SET or MULTISET will not change skew (just avoid the overhead of duplicate row checking), only a different PI can help.
Dieter
CAN ANY ONE SEND ME A QUERY... I WILL TUNE THAT QUERY AS PER MY KNOWLEDGE
REGARDS,
MAHESH
You need to consider few things like number of records in the table, how many rows are there on average per index value etc. If there are two many rows against few of the index values then the skew factor will increase causing the performance degradation.
May be you need to re-think about your index choices.
Regards,
This SQL should give you idea;
SELECT HASHAMP(HASHBUCKET(HASHROW(<<NUPI Columns))) AS "AMP #",COUNT(*) FROM <<table>> GROUP BY 1 ORDER BY 2 DESC;
Cheers!
Skewness is related to the way the data is distributed across the AMPS, and its the INDEX selection that defines the data distribution...
PARTITION doesn't have to do anything with the distribution of the data, it only defines the way the data is looked in blocks on the AMPS...
Use the query that I mentioned above
SELECT HASHAMP(HASHBUCKET(HASHROW(<<NUPI Columns))) AS "AMP #",COUNT(*) FROM <<table>> GROUP BY 1 ORDER BY 2 DESC
And try to Include/Exclude columns in the <<NUPI COLUMNS>>, and look at the results of COUNT(*) to determine the even distribution of the data... If the values get closer to each other then this means the data distribution is more even across the AMPS....
For further details, may be you can start digging into Teradata Physical Design documentation...
No, because you didn't share any valuable input!
DDLs, Explains etc...
What do you expect????
Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles
Overall the plan seems reasonable for this SQL assuming the stats infos are correct.
The PI for COMREFDB_DV1.R_FISCAL_MONTH_TO_YEAR is missing.
How many rows does this table have?
How many amps does your table have?
Is 24,261,154 rows in step 4 an accruate estimate?
Are stats collected for DV_BOOKINGS_TYPE?
How long does the query run? And what is expected SLA for this query?
Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles
The table? Which?
Can you run?
select count(*) from COMREFDB_DV1.R_FISCAL_MONTH_TO_YEAR
select count(*) from SLSORDDB_DV1.R_BOOKINGS_FMTH where DV_BOOKINGS_TYPE = 'CORE';
Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles
I don't think you can do much here.
Stats infos seems to be acruate. And with these the plan seems to be a good one.
The majority of the time seems to be consumed by the aggregation which seems also be reasonable due to the long grouping list.
If you really want to see the impact of the join you can try the following
submit
select 'when RBF.FISCAL_YEAR_MONTH_INT = ' !! trim(FISCAL_YEAR_MONTH_INT) !! ' then ' !! trim(FISCAL_YEAR_NUMBER_INT) from COMREFDB_DV1.R_FISCAL_MONTH_TO_YEAR
and paste the result into the <<<>>> section of
SELECT Case <<<>>> else null end AS FISCAL_YEAR_INT ,RBF.PRODUCT_KEY ,RBF.END_CUSTOMER_KEY ,RBF.BILL_TO_CUSTOMER_KEY ,RBF.SHIP_TO_CUSTOMER_KEY ,RBF.SOLD_TO_CUSTOMER_KEY ,RBF.PARTNER_SITE_PARTY_KEY ,RBF.SALES_TERRITORY_KEY ,RBF.SALES_REP_NUMBER ,RBF.ADJUSTMENT_TYPE_CODE ,RBF.SALES_CHANNEL_CODE ,RBF.SALES_CREDIT_TYPE_CODE ,RBF.IDE_ADJUSTMENT_CODE , RBF.ADJUSTMENT_CODE ,RBF.BKGS_MEASURE_TRANS_TYPE_CODE ,RBF.CANCELLED_FLG ,RBF.ACQUISITION_FLG ,RBF.DISTRIBUTOR_OFFSET_FLG ,RBF.CORPORATE_BOOKINGS_FLG ,RBF.CHANNEL_BOOKINGS_FLG ,RBF.OVERLAY_FLG , RBF.IC_REVENUE_FLG ,RBF.CHARGES_FLG ,RBF.SALESREP_FLG ,RBF.MISC_FLG ,RBF.SERVICE_FLG ,RBF.INTERNATIONAL_DEMO_FLG ,RBF.REPLACEMENT_DEMO_FLG ,RBF.REVENUE_FLG ,RBF.RMA_FLG ,RBF.WIPS_ORIGINATOR_ID_INT ,RBF.DV_DISCOUNT_BAND_CD ,RBF.DV_ORDER_BAND_CD ,SUM( RBF.DD_COMP_US_NET_PRICE_AMOUNT ) ,SUM( RBF.DD_COMP_US_LIST_PRICE_AMOUNT ) ,SUM( RBF.DD_COMP_US_COST_AMOUNT ) ,SUM( RBF.DD_EXTENDED_QUANTITY ) ,SUM( RBF.DD_COMP_US_HOLD_NET_PRICE_AMT ) ,SUM( RBF.DD_COMP_US_HOLD_LIST_PRICE_AMT) ,SUM( RBF.DD_COMP_US_HOLD_COST_AMOUNT ) ,SUM( RBF.DD_EXTENDED_HOLD_QUANTITY ) ,SUM( RBF.DD_COMP_US_STANDARD_PRICE_AMT ) ,SUM(RBF.TRADE_IN_AMOUNT) , RBF.DV_CA_BOOKINGS_TYPE_CD FROM SLSORDVWDB_DV1.R_BOOKINGS_FMTH RBF WHERE RBF.DV_BOOKINGS_TYPE = 'CORE' GROUP BY FISCAL_YEAR_INT ,RBF.PRODUCT_KEY ,RBF.END_CUSTOMER_KEY ,RBF.BILL_TO_CUSTOMER_KEY ,RBF.SHIP_TO_CUSTOMER_KEY ,RBF.SOLD_TO_CUSTOMER_KEY ,RBF.PARTNER_SITE_PARTY_KEY ,RBF.SALES_TERRITORY_KEY ,RBF.SALES_REP_NUMBER ,RBF.ADJUSTMENT_TYPE_CODE ,RBF.SALES_CHANNEL_CODE ,RBF.SALES_CREDIT_TYPE_CODE ,RBF.IDE_ADJUSTMENT_CODE , RBF.ADJUSTMENT_CODE ,RBF.BKGS_MEASURE_TRANS_TYPE_CODE ,RBF.CANCELLED_FLG ,RBF.ACQUISITION_FLG ,RBF.DISTRIBUTOR_OFFSET_FLG ,RBF.CORPORATE_BOOKINGS_FLG ,RBF.CHANNEL_BOOKINGS_FLG ,RBF.OVERLAY_FLG , RBF.IC_REVENUE_FLG ,RBF.CHARGES_FLG ,RBF.SALESREP_FLG ,RBF.MISC_FLG ,RBF.SERVICE_FLG ,RBF.INTERNATIONAL_DEMO_FLG ,RBF.REPLACEMENT_DEMO_FLG ,RBF.REVENUE_FLG ,RBF.RMA_FLG ,RBF.WIPS_ORIGINATOR_ID_INT ,RBF.DV_DISCOUNT_BAND_CD ,RBF.DV_ORDER_BAND_CD ,RBF.DV_CA_BOOKINGS_TYPE_CD
Check the explain and the runtimes...
If the runtime is not drastic better you won't have a chance to speed this up.
Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles
No - you didn't again provide any valuable input before dumping your questions.
And open new threads for new questions.
Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles
If information is confidential you should not share it in this forum.
Beside this I still don't know what your issue is.
Der Beginn aller Wissenschaften ist das Erstaunen, dass die Dinge sind, wie sie sind.-Aristoteles
Hi,
I have a query which is specifically user/BO query…
It is taking lot of time ..
How to improve the performance of the same..
Below is the query and index/stats information..
LOCKING tab1 FOR ACCESS
SELECT
tab2.TODAY_DATE,
tab3.CIRCLE_NAME,
COALESCE(tab4.CALL_DIRECTION_ID,'SMS_MO','SMMO','SMS_MT','SMMT','DIAMETER','GPRS',tab4.CALL_DIRECTION_ID),
COUNT(DISTINCT tab1.CP_ACCOUNT_KEY),
SUM(tab1.CALL_COUNT),
SUM(tab1.CALL_DURATION)/60
FROM
tab2,
tab3,
tab4,
tab1
WHERE
( tab4.CALL_DIRECTION_KEY=tab1.CALL_DIRECTION_KEY )
AND ( tab3.GEOGRAPHY_KEY=tab1.CIRCLE_KEY )
AND ( tab2.TIME_KEY=tab1.DAY_KEY )
AND ( tab4.CALL_DIRECTION_ID NOT IN ('TRANSIT-O','TRANSIT-I','TRANS','TRANSIT','TRA','TRAN','-901','-902','0','50','1','2','3') )
AND
tab2.TODAY_DATE BETWEEN {d '2012-07-01'} AND {d '2012-07-07'}
GROUP BY
1,
2,
3
tab1
PRIMARY INDEX ( DAY_KEY ,CIRCLE_KEY ,CALL_SCENARIO_KEY ,CALL_DIRECTION_KEY ,CALL_TYPE_KEY ,CP_ACCOUNT_KEY )
PARTITION BY RANGE_N(CAST((SUBSTR(TRIM(BOTH FROM DAY_KEY ),1 ,8 )) AS DATE FORMAT 'YYYYMMDD') BETWEEN '2012-01-01' AND '2017-12-31' EACH INTERVAL '1' DAY ,
NO RANGE, UNKNOWN)
INDEX idx1 ( DAY_KEY ,CIRCLE_KEY ,CALL_DIRECTION_KEY );
COLLECT STATS ON tab1 COLUMN (day_key)
COLLECT STATS ON tab1 COLUMN (cp_account_key);
COLLECT STATS ON tab1 COLUMN (day_key,circle_key,call_direction_key);
Thanks
Lavi



Hi ,
I have a qry which is taking more cpu and skew is more. pls suggest how to tune this query.
Explain:
Explain SELECT
WATR.BK_AS_PROJECT_TASK_NUM ,
WATR.BK_AS_PROJECT_CD ,
WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD ,
WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY ,
WATR.FISCAL_YEAR_WEEK_NUM_INT ,
WATR.AS_TASK_RES_TIMECARD_HOURS ,
WATR.AS_TASK_RES_LABOR_COST_USD_AMT ,
WATR.AS_TASK_RES_LABOR_COST_PRJ_AMT ,
WATR.AS_TASK_RES_LABOR_COST_LOC_AMT ,
WATR.AS_RSRC_LABOR_RT_USD_AMT ,
CURRENT_TIMESTAMP(0),
USER ,
CURRENT_TIMESTAMP(0),
USER
FROM WORKDB_dv4.WI_MT_AS_TIMECARD_RESOURCE WATR
WHERE DML_TYPE = 'I'
AND (WATR.BK_AS_PROJECT_TASK_NUM ,
WATR.BK_AS_PROJECT_CD ,
WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD ,
WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY ,
WATR.FISCAL_YEAR_WEEK_NUM_INT )
NOT IN
( SELECT MATR.BK_AS_PROJECT_TASK_NUM ,
MATR. BK_AS_PROJECT_CD ,
MATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD ,
MATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY ,
MATR.FISCAL_YEAR_WEEK_NUM_INT
FROM ASBIVWDB_dv4.MT_AS_TIMECARD_RESOURCE MATR)
1) First, we lock WORKDB_dv4.WATR for access, and we lock
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE for access.
2) Next, we do an all-AMPs SUM step to aggregate from WORKDB_dv4.WATR
by way of an all-rows scan with a condition of (
"WORKDB_dv4.WATR.DML_TYPE = 'I'"). Aggregate Intermediate Results
are computed globally, then placed in Spool 5.
3) We do an all-AMPs SUM step to aggregate from
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE by way of an all-rows scan with
no residual conditions. Aggregate Intermediate Results are
computed globally, then placed in Spool 7.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by
way of an all-rows scan into Spool 3 (all_amps) (compressed
columns allowed), which is duplicated on all AMPs.
2) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by
way of an all-rows scan into Spool 4 (all_amps) (compressed
columns allowed), which is duplicated on all AMPs.
5) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from WORKDB_dv4.WATR by way of an
all-rows scan with a condition of ("WORKDB_dv4.WATR.DML_TYPE
= 'I'"), which is joined to
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE by way of an all-rows scan
with no residual conditions. WORKDB_dv4.WATR and
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE are joined using an
exclusion merge join, with a join condition of (
"(WORKDB_dv4.WATR.BK_AS_PROJECT_TASK_NUM =
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_PROJECT_TASK_ NUM)
AND ((WORKDB_dv4.WATR.BK_AS_PROJECT_CD =
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_PROJECT_CD) AND
((WORKDB_dv4.WATR.BK_AS_TASK_RSRC_EXPNDTR_TYP_CD =
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_TASK_RSRC_EXP NDTR_TYP_CD)
AND ((WORKDB_dv4.WATR.BK_AS_RSRC_CISCO_WRKR_PRTY_KEY =
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_RSRC_CISCO_WR KR_PRTY_KEY)
AND (WORKDB_dv4.WATR.FISCAL_YEAR_WEEK_NUM_INT =
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.FISCAL_YEAR_WEEK_NU M_INT
))))"), and null value information in Spool 4 and Spool 3.
Skip this join step if null exists. The result goes into
Spool 2 (group_amps), which is built locally on the AMPs.
The size of Spool 2 is estimated with no confidence to be
184,396 rows (37,063,596 bytes). The estimated time for this
step is 0.99 seconds.
2) We do an all-AMPs RETRIEVE step from WORKDB_dv4.WATR by way
of an all-rows scan with a condition of (
"WORKDB_dv4.WATR.DML_TYPE = 'I'") into Spool 9 (all_amps)
(compressed columns allowed), which is built locally on the
AMPs. Then we do a SORT to order Spool 9 by the hash code of
(WORKDB_dv4.WATR.BK_AS_PROJECT_TASK_NUM), and null value
information in Spool 4 and Spool 3. Skip this retrieve step
if there is no null. The size of Spool 9 is estimated with
no confidence to be 184,396 rows (21,943,124 bytes). The
estimated time for this step is 0.33 seconds.
3) We do an all-AMPs RETRIEVE step from
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE by way of an all-rows scan
with no residual conditions into Spool 10 (all_amps)
(compressed columns allowed), which is duplicated on all AMPs.
Then we do a SORT to order Spool 10 by the hash code of (
ASBIDB_DV4.MT_AS_TIMECARD_RESOURCE.BK_AS_PROJECT_TASK_ NUM).
The result spool file will not be cached in memory, and null
value information in Spool 4 and Spool 3. Skip this retrieve
step if there is no null. The size of Spool 10 is estimated
with low confidence to be 146,184,000 rows (6,870,648,000
bytes). The estimated time for this step is 39.67 seconds.
6) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of an
all-rows scan, which is joined to Spool 10 (Last Use) by way of an
all-rows scan. Spool 9 and Spool 10 are joined using an exclusion
merge join, with a join condition of ("(BK_AS_PROJECT_TASK_NUM =
BK_AS_PROJECT_TASK_NUM) AND ((BK_AS_PROJECT_CD = BK_AS_PROJECT_CD)
AND ((BK_AS_TASK_RSRC_EXPNDTR_TYP_CD =
BK_AS_TASK_RSRC_EXPNDTR_TYP_CD) AND
((BK_AS_RSRC_CISCO_WRKR_PRTY_KEY = BK_AS_RSRC_CISCO_WRKR_PRTY_KEY)
AND (FISCAL_YEAR_WEEK_NUM_INT = FISCAL_YEAR_WEEK_NUM_INT ))))"),
and null value information in Spool 4 (Last Use) and Spool 3 (Last
Use). Skip this join step if there is no null. The result goes
into Spool 2 (group_amps), which is built locally on the AMPs.
The size of Spool 2 is estimated with no confidence to be 184,396
rows (37,063,596 bytes). The estimated time for this step is 0.99
seconds.
7) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 2 are sent back to the user as the result of
statement 1.