SQL: subquery has too many columns
I'm trying to make a query with postgresql. The database contains two relations: "kingdom", which includes some english kings, and "dinasty", which contains some people from the stuart dinasty
The relation "kingdom" includes the king name and when his kingdom started and ended. The relation "dinasty" includes name, gender, birth and death.
What I'm trying to query is the king that was the oldest one when he died.
With my query I'm receiving this error at LINE 3 (NOT IN): subquery has too many columns
This is the query:
SELECT kingdom.king, dinasty.birth, dinasty.death FROM kingdom, dinasty WHERE kingdom.king = dinasty.name AND kingdom.king NOT IN ( SELECT DISTINCT R1.king, R1.birth, R1.death FROM ( SELECT DISTINCT R1.king, D1.birth, D1.death FROM kingdom AS R1, dinasty AS D1, dinasty AS D2 WHERE R1.king=D1.name ) AS R1, ( SELECT DISTINCT R1.king, D1.birth, D1.death FROM kingdom AS R1, dinasty AS D1, dinasty AS D2 WHERE R1.king=D1.name ) AS R2 WHERE R1.death-R1.birth < R2.death-R2.birth );
What is inside the NOT IN is correct.
Answers
You are projecting three columns in your subquery, but comparing a single one of them in the IN clause. Select just the required column (r1.king) for the IN in the subquery:
SELECT kingdom.king, dinasty.birth, dinasty.death FROM kingdom, dinasty WHERE kingdom.king = dinasty.name AND kingdom.king NOT IN ( SELECT DISTINCT R1.king FROM ( SELECT DISTINCT R1.king, D1.birth, D1.death FROM kingdom AS R1, dinasty AS D1, dinasty AS D2 WHERE R1.king=D1.name ) AS R1, ( SELECT DISTINCT R1.king, D1.birth, D1.death FROM kingdom AS R1, dinasty AS D1, dinasty AS D2 WHERE R1.king=D1.name ) AS R2 WHERE R1.death-R1.birth < R2.death-R2.birth );
As was answered, your column count didn't match, but there's a much simpler way to write this.
When writing queries, it's best to think them through in stages. First, you need to know how old each king was when they died:
SELECT *, death-birth AS lived_for FROM dinasty
Now that you have that, you can use DISTINCT ON to find the longest-lived king for each kingdom
SELECT DISTINCT ON( name ) name, birth, death, lived_for FROM ( SELECT *, death-birth AS lived_for FROM dinasty ) a ORDER BY name, lived_for DESC ;
The distinct on will take the first row for each distinct value, so it's important you pair it with the correct ORDER BY. First we order by the name of the dinasty, then how long the king lived for in descending order. That means the first king shown for each dinasty will be the longest-lived one, and that is the record that DISTINCT ON will keep for each dinasty.
Note that I also removed the JOIN to kindgom, but you can add that back in if needed:
SELECT k.*, oldest.* FROM ( SELECT DISTINCT ON( name ) name, birth, death, lived_for FROM ( SELECT *, death-birth AS lived_for FROM dinasty ) a ORDER BY name, lived_for DESC ) oldest JOIN kingdom k ON k.king = oldest.name ;
Finally, if you ever need to use multiple columns in a sub-select, you can use the ROW() construct:
SELECT ... FROM table_a WHERE ROW(f1, f2, f3) NOT IN (SELECT f1a, f2a, f3a FROM ... ) ;