Efficiently duplicate some rows in PostgreSQL table
I have PostgreSQL 9 database that uses auto-incrementing integers as primary keys. I want to duplicate some of the rows in a table (based on some filter criteria), while changing one or two values, i.e. copy all column values, except for the ID (which is auto-generated) and possibly another column.
However, I also want to get the mapping from old to new IDs. Is there a better way to do it then just querying for the rows to copy first and then inserting new rows one at a time?
Essentially I want to do something like this:
INSERT INTO my_table (col1, col2, col3) SELECT col1, 'new col2 value', col3 FROM my_table old WHERE old.some_criteria = 'something' RETURNING old.id, id;
However, this fails with ERROR: missing FROM-clause entry for table "old" and I can see why: Postgres must be doing the SELECT first and then inserting it and the RETURNING clauses only has access to the newly inserted row.
RETURNING can only refer to the columns in the final, inserted row. You cannot refer to the "OLD" id this way unless there is a column in the table to hold both it and the new id.
Try running this which should work and will show all the possible values that you can get via RETURNING:
INSERT INTO my_table (col1, col2, col3) SELECT col1, 'new col2 value', col3 FROM my_table AS old WHERE old.some_criteria = 'something' RETURNING *;
It won't get you the behavior you want, but should illustrate better how RETURNING is designed to work.
Good! I test this code, but I change this (FROM my_table AS old) in (FROM my_table) and this (WHERE old.some_criteria = 'something') in (WHERE some_criteria = 'something')
This is the final code that I use
INSERT INTO my_table (col1, col2, col3) SELECT col1, 'new col2 value', col3 FROM my_table AS old WHERE some_criteria = 'something' RETURNING *;
This can be done with the help of data-modifiying CTEs (Postgres 9.1+):
WITH sel AS ( SELECT id, col1, col3 , row_number() OVER (ORDER BY id) AS rn -- order any way you like FROM my_table WHERE some_criteria = 'something' ORDER BY id -- match order or row_number() ) , ins AS ( INSERT INTO my_table (col1, col2, col3) SELECT col1, 'new col2 value', col3 FROM sel ORDER BY id -- redundant to be sure RETURNING id ) SELECT s.id AS old_id, i.id AS new_id FROM (SELECT id, row_number() OVER (ORDER BY id) AS rn FROM ins) i JOIN sel s USING (rn);
SQL Fiddle demonstration.
This relies on the undocumented implementation detail that rows from a SELECT are inserted in the order provided (and returned in the order provided). It works in all current versions of Postgres and is not going to break. Related:
Window functions are not allowed in the RETURNING clause, so I apply row_number() in another subquery.
More explanation in this related later answer:
'old' is a reserved word, used by the rule rewrite system. [ I presume this query fragment is not part of a rule; in that case you would have phrased the question differently ]
DROP TABLE IF EXISTS tmptable; CREATE TEMPORARY TABLE tmptable as SELECT * FROM products WHERE id = 100; UPDATE tmptable SET id = sbq.id from (select max(id)+1 as id from products) as sbq; INSERT INTO products (SELECT * FROM tmptable); DROP TABLE IF EXISTS tmptable;
add another update before the insert to modify another field
UPDATE tmptable SET another = 'data';