Can you pls give your suggestions for the below requirement-->
There is a table which would be having history of 24 months of data at any point of time.
there is a requirement that users would be querying based on a month and year only for their reports ,not on the PI of the table.
Any month would be having appx. 120 GB data (after looking at the volume).
we are planning to have column (yr_mon) in table which would be storing the month & year so that there is no need to derive the month and year during the run time in a query.
We were planning to make yr_mon as PPI of this new table but as we know that in order to make the queries efficient pi and ppi should be
used together in order for the faster retreival however in this case User would only be knowing the month and year for querying hence
there would be a redistribution in absence of PI in queries,hence not going for PPI.
Finally we are thinking of making a NUSI on (yr_mon) column for fast retreival on this column alone.
Your opinions on the above pls.. thanks in advance..
If you add a yr_mon column you probably keep your date column resulting in doubling the diskspace, 4 (date) -> 8 (date+int) bytes.
If users want to access a year they still have to write a condition like BETWEEN 201201 and 201212. If you don't want your user to deal with some basic calculation like BETWEEN DATE '2012-01-01' and DATE '2012-12-31' you might easily hide that month-begin/end calculation in a SQL-UDF: BETWEEN month_begin(201201) and month_end(201201).
We were planning to make yr_mon as PPI of this new table but as we know that in order to make the queries efficient pi and ppi should be used together in order for the faster retreival however in this case User would only be knowing the month and year for querying hence there would be a redistribution in absence of PI in queries,hence not going for PPI.
This is a confusing sentence, especially without any punctuation.
When the partitioning column is used in the WHERE-condition you get partition elimination regardless if it's part of the PI or not.
But PI-access/joins without additionally specifying the partitioning might be slower (much slower when the number of partitions is large).
"redistribution" is totally independant of that.
The selectivity of a NUSI on yr_mon is quite low, the optimizer will probably not use it.
thanks Dieter for your suggestions..
Apologies for any confusing sentence. I will reteirate it.
I mean that if we keep (date) column as a partitioned column and Primary Index (some other column) then as users has mentioned that only they would be qyerying on the date column with range of 1 month ..
then in case , access with partitioned column alone (with out PI) will not be that much efficient as it would have been with PI + partioned column.. agree?
so can we do some thing else here for performance tuning..
thanks in advance.
access with partitioned column alone (with out PI) will not be that much efficient as it would have been with PI + partioned column.. agree?
No. When only the partitioning column is used the performance is exacty the same regardless if the partitioning column is also part of the PI or not.
There's a lot of details about PPI in the "Database Design" manual, Chapter 9: Primary Indexes and NoPI Objects
If you are creating your date column as PPI , then using the column in filter condition will lead to partition elimination and will enhance the performace. You may come across situation when it is not always possible to query on PI column so thats ok. However if you have created PPI on the table then you should attempt to use PPI column in your query for better results.
PPI + PI combination to use all the time in query is not possible. So its ok if user query on some other column which is not PI but actually PPI.
Thanks dieter & Raj,
Just to elaborate more ,mentioning below the example-->
Create table Orders
(orderid number NOT NULL,
order_yr_mon integer ,
order name varchar(100),
primary index (orderid)
PARTITION BY RANGE_N(Order_year_mon BETWEEN DATE '2009-12-06' AND DATE '2020-12-31' EACH INTERVAL '1' month);
Now if users query like this
Select * from Order where order_year_mon ='201207' (with out order id)
so my question is would this query would give the partiton elimination or as efficient if in case users query with orderid and order_year_mon both?
As users are insisting that they will query with order_year_mon only. .
Your comments on above pls.
why don't you simply try it yourself and explain the query?
You would have noticed that your DDL will fail.