I have a query (Main Query) is like this. I am executing this in Toad connected to Netezza DB.
SELECT * FROM db1.schema1.Table1 WHERE (pd_num, pd_num_mtr, pd_num_prefix, sqr_num) IN (SELECT pd_num, pd_num_mtr, pd_num_prefix, max (sqr_num) sqr_num FROM db1.schema1.table1 WHERE create_date >= '01/01/2012' AND cd_operator <> 'N' GROUP BY pd_num, pd_num_mtr, pd_num_prefix)
When I execute this I get some 1 million records as my output. I further executed a query (Query2) to analyze the number of records belonging to the group as follows.
select pd_num_mtr,pd_num_prefix,count(*) from db1.schema1.table1 GROUP BY pd_num, pd_num_mtr order by count(*) desc
I get the below out put for this.
pd_num pd_num_mtr count(*) 001 15 500 002 15 200 003 30 100
Which means I have some 500 records pulled for the pd_num and pd_num_mtr combination with each of these records having an update_timestamp value. Now this needs to be modified as follows.
So among these 500 records, I need to pull only the one with maximum update_timestamp which will limit the count to only 1 record instead of 500.1 from 200 records, 1 record from 100 records with the max update timestamp value.
How can I modify the first query (main query) to acheive this? So that if the run the query2, I get the below as the output.
pd_num pd_num_mtr count(*) 001 15 1 002 15 2 003 30 3
Appreciate your help again. Thank you.
We will have to use row_number function for this. Assuming 'update_timestamp' as your timestamp column.
SELECT PD_NUM_MTR,PD_NUM_PREFIX FROM ( SELECT PD_NUM_MTR,PD_NUM_PREFIX,ROW_NUMBER() OVER (PARTITION BY PD_NUM_MTR,PD_NUM_PREFIX ORDER BY update_timestamp desc ) AS RK FROM DB1.SCHEMA1.TABLE1 ) WHERE RK=1;