Hi, Can anyone suggest me on how to delete the duplicate records from a teradata table.I should not create a new similar table,but delete from the existing table only. thanks,Nagendhran
If you're saying you cannot create a new table at all to assist you in removing dups, I don't think there is a way to do that. I would suggest 3 options:1) Insert/select into a new SET table. Then drop the old table and rename the new one to the old name.2) Insert/select into a new table grouping by (or using distinct) on all columns. Then drop the old table and rename the new one to the old name.3) Insert only the dups into a 2nd SET table (by doing a group on all columns with a HAVING COUNT(*) > 1). Then do a delete/join (joining the dups table back to the original table), followed by an insert/select from the dups table into the first (in the same unit of work if you need to). This will at least keep the existing table intact, but the dups will be temporarily deleted before one copy of them is inserted back in.
Another option that I thought of after my last post is to export the dup rows (with a GROUP BY on all columns and HAVING COUNT(*) > 1) to a file. Then, using Mload or Tpump to bring the rows back in and do a delete and insert for each row.With this method, there is no need for a 2nd table and you won't see all of the dups rows disappear from the table at the same time. The re-insert of each dup row will occur immediately after the deletion of the rows.
Export the records to a flatfile and then load using fastload. Fastload does not load the duplicate records.
Ramakrishna Vedantam, Tata consultancy services, Hyderabad
Thanks barry,ramakrishna for your suggestions.It really helped me a lot.rgds.,Nagendhran
Here is a good tool to find and delete duplicate files
[url] http://www.duplicateFilesDeleter.com [/url]
I have tried www.duplicateFilesDeleter.com and it does work.
del from yourtablename
where primary index column in
sel primary index column from yourtablename
qualify row_number()over(partition by primary index order by some column) > 1
.. it will keep one record and deletes all the other. you do not need to create any seperate table for this
Did you notice, that the two previos messages where kind of spam and the original post was four years old?
Btw, your solution will not work, but remove all duplicate rows.
How to delete duplicate Records from TD table,
here is multistep approach(more of a workaround)
create multiset table delete_duplicates(
Now insert duplicate rows.
Alter table to add new column rank1(give any name you like);
Now run the following insert and delete statement,
insert into delete_duplicates
select col1,col2,row_number() over(partition by col1,col2 order by col1,col2 )
delete from delete_duplicates where rank1 is null or rank1>1;
try the below query.
delete from <Table_Name> where <Table_Name>.rowid
not in (select max(<Table_Name>.rowid) from <Table_Name>
group by <C1>, <C2>);