Avoid duplicates in INSERT INTO SELECT query in SQL Server
I have the following two tables:
Table1 ---------- ID Name 1 A 2 B 3 C Table2 ---------- ID Name 1 Z
I need to insert data from Table1 to Table2. I can use the following syntax:
INSERT INTO Table2(Id, Name) SELECT Id, Name FROM Table1
However, in my case, duplicate IDs might exist in Table2 (in my case, it's just "1") and I don't want to copy that again as that would throw an error.
I can write something like this:
IF NOT EXISTS(SELECT 1 FROM Table2 WHERE Id=1) INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1 ELSE INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1 WHERE Table1.Id<>1
Is there a better way to do this without using IF - ELSE? I want to avoid two INSERT INTO-SELECT statements based on some condition.
Using NOT EXISTS:
INSERT INTO TABLE_2 (id, name) SELECT t1.id, t1.name FROM TABLE_1 t1 WHERE NOT EXISTS(SELECT id FROM TABLE_2 t2 WHERE t2.id = t1.id)
Using NOT IN:
INSERT INTO TABLE_2 (id, name) SELECT t1.id, t1.name FROM TABLE_1 t1 WHERE t1.id NOT IN (SELECT id FROM TABLE_2)
Using LEFT JOIN/IS NULL:
INSERT INTO TABLE_2 (id, name) SELECT t1.id, t1.name FROM TABLE_1 t1 LEFT JOIN TABLE_2 t2 ON t2.id = t1.id WHERE t2.id IS NULL
Of the three options, the LEFT JOIN/IS NULL is less efficient. See this link for more details.
In MySQL you can do this:
INSERT IGNORE INTO Table2(Id, Name) SELECT Id, Name FROM Table1
Does SQL Server have anything similar?
I just had a similar problem, the DISTINCT keyword works magic:
INSERT INTO Table2(Id, Name) SELECT DISTINCT Id, Name FROM Table1
Using ignore Duplicates on the unique index as suggested by IanC here was my solution for a similar issue, creating the index with the Option WITH IGNORE_DUP_KEY
In backward compatible syntax , WITH IGNORE_DUP_KEY is equivalent to WITH IGNORE_DUP_KEY = ON.
From SQL Server you can set a Unique key index on the table for (Columns that needs to be unique)
I was facing the same problem recently... Heres what worked for me in MS SQL server 2017... The primary key should be set on ID in table 2... The columns and column properties should be the same of course between both tables. This will work the first time you run the below script. The duplicate ID in table 1, will not insert...
If you run it the second time, you will get a
Violation of PRIMARY KEY constraint error
This is the code:
Insert into Table_2 Select distinct * from Table_1 where table_1.ID >1
A little off topic, but if you want to migrate the data to a new table, and the possible duplicates are in the original table, and the column possibly duplicated is not an id, a GROUP BY will do:
INSERT INTO TABLE_2 (name) SELECT t1.name FROM TABLE_1 t1 GROUP BY t1.name
A simple DELETE before the INSERT would suffice:
DELETE FROM Table2 WHERE Id = (SELECT Id FROM Table1) INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1
Switching Table1 for Table2 depending on which table's Id and name pairing you want to preserve.