Can I delete database duplicates based on multiple columns?

A while back I asked how to delete duplicate records based on a single column. The answer worked great:

delete from tbl
where id NOT in
(
select  min(id)
from tbl
group by sourceid
)

I now have a similar situation, but the definition of a duplicate record is based on multiple columns. How can I alter the SQL above to identify duplicate records where a unique record is defined by the combination of Col1 + Col2 + Col3? Would I just do something like this?

delete from tbl
where id NOT in
(
select  min(id)
from tbl
group by col1, col2, col3
)

Answers


This shows the rows you want to keep:

;WITH x AS 
(
  SELECT col1, col2, col3, rn = ROW_NUMBER() OVER 
      (PARTITION BY col1, col2, col3 ORDER BY id)
  FROM dbo.tbl
)
SELECT col1, col2, col3 FROM x WHERE rn = 1;

This shows the rows you want to delete:

;WITH x AS 
(
  SELECT col1, col2, col3, rn = ROW_NUMBER() OVER 
      (PARTITION BY col1, col2, col3 ORDER BY id)
  FROM dbo.tbl
)
SELECT col1, col2, col3 FROM x WHERE rn > 1;

And once you're happy that the above two sets are correct, the following will actually delete them:

;WITH x AS 
(
  SELECT col1, col2, col3, rn = ROW_NUMBER() OVER 
      (PARTITION BY col1, col2, col3 ORDER BY id)
  FROM dbo.tbl
)
DELETE x WHERE rn > 1;

Note that in all three queries the first 6 lines are identical; only the statement that follows the CTE changes.
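If you want a safety net around the delete, one option is to run it inside a transaction and check the result before committing. This is only a sketch using the same dbo.tbl, id and col1..col3 names as above; the cnt alias and the choice to commit or roll back by hand are my own assumptions, not part of the original answer.

```sql
-- Sketch only: run the delete in a transaction and verify before committing.
BEGIN TRANSACTION;

;WITH x AS
(
  SELECT col1, col2, col3, rn = ROW_NUMBER() OVER
      (PARTITION BY col1, col2, col3 ORDER BY id)
  FROM dbo.tbl
)
DELETE x WHERE rn > 1;

-- Any row returned here is a (col1, col2, col3) combination
-- that still occurs more than once; the result should be empty.
SELECT col1, col2, col3, COUNT(*) AS cnt
FROM dbo.tbl
GROUP BY col1, col2, col3
HAVING COUNT(*) > 1;

-- Run exactly one of these by hand, depending on the check above:
-- COMMIT TRANSACTION;
-- ROLLBACK TRANSACTION;
```

Because the ROW_NUMBER is ordered by id, the row that survives in each group is the one with the lowest id, which matches what the min(id) version in the question keeps.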


Try this one. I created a table tblA with an identity column and three data columns.

CREATE TABLE tblA
(
id int IDENTITY(1, 1),
colA int, 
colB int, 
colC int
)

And added some duplicate values.

INSERT INTO tblA VALUES (1, 2, 3)
INSERT INTO tblA VALUES (1, 2, 3)
INSERT INTO tblA VALUES (4, 5, 6)
INSERT INTO tblA VALUES (7, 8, 9)
INSERT INTO tblA VALUES (7, 8, 9)

The statement below finds the lowest id in each group of duplicates. If you turn the select into a delete on those ids, you will have your multiple-column delete working.

-- Returns the lowest id in each group of duplicated (colA, colB, colC) values.
SELECT MIN(d.id) AS id
FROM
(
-- One row per duplicated row; aantal (Dutch for "count") is the group size.
SELECT COUNT(*) AS aantal, a.colA, a.colB, a.colC
FROM tblA       a
INNER JOIN tblA b   ON b.colA = a.colA
                    AND b.colB = a.colB
                    AND b.colC = a.colC
GROUP BY a.id, a.colA, a.colB, a.colC
HAVING COUNT(*) > 1
) c
INNER JOIN tblA d ON d.colA = c.colA
                    AND d.colB = c.colB
                    AND d.colC = c.colC
GROUP BY d.colA, d.colB, d.colC
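To make the "turn the select into a delete" step concrete, here is one possible reading of it: use the select as the subquery of a DELETE. This is my interpretation rather than something the answer spells out, and it assumes a freshly created tblA holding exactly the five sample rows above, so the identity values run from 1 to 5.

```sql
-- Sketch only: one possible reading of "replace the select with a delete".
-- Deletes the lowest id of each duplicated (colA, colB, colC) group.
DELETE FROM tblA
WHERE id IN
(
    SELECT MIN(d.id)
    FROM
    (
        SELECT a.colA, a.colB, a.colC
        FROM tblA a
        INNER JOIN tblA b ON b.colA = a.colA
                         AND b.colB = a.colB
                         AND b.colC = a.colC
        GROUP BY a.id, a.colA, a.colB, a.colC
        HAVING COUNT(*) > 1
    ) c
    INNER JOIN tblA d ON d.colA = c.colA
                     AND d.colB = c.colB
                     AND d.colC = c.colC
    GROUP BY d.colA, d.colB, d.colC
);

-- With the five sample rows, ids 1 and 4 are removed,
-- leaving ids 2, 3 and 5.
SELECT id, colA, colB, colC FROM tblA ORDER BY id;
```

Two caveats with this form. It removes only one row per duplicate group on each run, so a group with three or more copies needs the statement repeated until it deletes nothing. And because the self-join compares columns with =, rows where colA, colB or colC is NULL never match each other and are not treated as duplicates, unlike with GROUP BY or PARTITION BY, which put NULLs in the same group.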
