SQL table join many-to-many very slow
Using PostgreSQL, I'm joining around 10 tables in a query, all of them being many-to-many relationships. The DB is still very small at the moment (hundreds of rows in total, all tables combined), but the query is extremely slow (1+ min processing time).
As every join multiplies the number of rows (joining 2 tables having 5 records each will yield 25 rows), the result quickly becomes huge with more than 300,000 rows. The query is in the following form :
select * from student_profile sp join student_profile_skills sps on sp.id = sps.student_profile_id join student_profile_hobby sph on sp.id = sph.student_profile_id --and other 8 similar joins where sp.id = 1;
The tables are straightforward (junction tables with 2 FKs). What's the recommended practice here ? Is it the query that has to be written in a more optimized way, or use separate queries ? Thx in advance !
Additional info :
CREATE TABLE student_profile ( id serial NOT NULL, first_name text NOT NULL, last_name text NOT NULL, country_id integer, city_id integer, faculty_id integer, university_id integer, degree_id integer, degree_year integer, created_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP, updated_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP, CONSTRAINT student_profile_pkey PRIMARY KEY (id), CONSTRAINT student_profile_country_id_fkey FOREIGN KEY (country_id) REFERENCES country (id), CONSTRAINT student_profile_city_id_fkey FOREIGN KEY (city_id) REFERENCES city (id), CONSTRAINT student_profile_faculty_id_fkey FOREIGN KEY (faculty_id) REFERENCES faculty (id), CONSTRAINT student_profile_university_id_fkey FOREIGN KEY (university_id) REFERENCES university (id), CONSTRAINT student_profile_degree_id_fkey FOREIGN KEY (degree_id) REFERENCES degree (id) ); CREATE TABLE student_profile_skill ( id serial NOT NULL, student_profile_id integer NOT NULL, skill_id integer NOT NULL, position integer NOT NULL, created_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP, updated_at timestamp with time zone NOT NULL DEFAULT CURRENT_TIMESTAMP, CONSTRAINT student_profile_skill_pkey PRIMARY KEY (id), CONSTRAINT student_profile_skill_student_profile_id_fkey FOREIGN KEY (student_profile_id) REFERENCES student_profile (id), CONSTRAINT student_profile_skill_skill_id_fkey FOREIGN KEY (skill_id) REFERENCES skill (id), CONSTRAINT student_profile_skill_unique UNIQUE (student_profile_id, skill_id), CONSTRAINT student_profile_skill_position_unique UNIQUE (student_profile_id, position) );
The recommended practice in almost all many-to-many connections is to split them up to one-to-many/many-to-one. However, you have 1-to-many tables in a 3 way join that results in an insane amount of repeating records if your students have more than 1 skill/hobby.
Example of a 3 way join where you have 1 student with 2 skills and 1 hobby:
Student Skill Hobby Smith J. Linguistic Fishing Smith J. Profiling Fishing
I would suggest 1 of 2 options:
- Only select the values you need when you need them, (because I can't imagine you needing a table with both hobbies and skills. For instance: Separate views for hobbies and skills.
- Create a script for a temporary table where you write up all skills into 1 field and all hobbies into another. (You'll probably need to use cursors to build them, so keep in mind you don't want to rebuild them often).