PostgreSQL - weird query planner behavior

Assume I have a query like this:

SELECT * 
  FROM clients c
  INNER JOIN clients_balances cb ON cb.id_clients = c.id
  LEFT JOIN clients com ON com.id = c.id_companies
  LEFT JOIN clients com_real ON com_real.id = c.id_companies_real
  LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table
  LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table
  LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms
  LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND cpgw.id_companies = c.id_companies_real)
WHERE 
  EXISTS (SELECT * FROM accounts WHERE (name LIKE 'x' OR accname LIKE 'x' OR ani LIKE 'x') AND id_clients = c.id)
  AND c."type" = '0'
  AND c."id" > 0 
ORDER BY c."name";

This query takes around 35 seconds to run when used in the production environment ("clients" has about 1 million records). However, if I take out ANY join - the query will take only about 300 ms to execute.

I've played around with the query planner settings, but to no avail.

Here are a few explain analyze outputs:

http://explain.depesz.com/s/hzy (slow - 48049.574 ms)
http://explain.depesz.com/s/FWCd (fast - 286.234 ms, rate_tables JOIN removed)
http://explain.depesz.com/s/MyRf (fast - 539.733 ms, paygw_clients_profiles JOIN removed)

It looks like in the fast case the planner starts from the EXISTS statement and has to perform join for only two rows in total. However, in the slow case it will first join all the tables and then filter by EXISTS.

What I need to do is to make this query run in a reasonable time with all seven join in place.

Postgres version is 9.3.10 on CentOS 6.3.

Thanks.

UPDATE

Rewriting the query like this:

SELECT * 
  FROM clients c
  INNER JOIN clients_balances cb ON cb.id_clients = c.id
  INNER JOIN accounts a ON a.id_clients = c.id AND (a.name = 'x' OR a.accname = 'x' OR a.ani = 'x')
  LEFT JOIN clients com ON com.id = c.id_companies
  LEFT JOIN clients com_real ON com_real.id = c.id_companies_real
  LEFT JOIN rate_tables rt_orig ON rt_orig.id = c.orig_rate_table
  LEFT JOIN rate_tables rt_term ON rt_term.id = c.term_rate_table
  LEFT JOIN payment_terms pt ON pt.id = c.id_payment_terms
  LEFT JOIN paygw_clients_profiles cpgw ON (cpgw.id_clients = c.id AND cpgw.id_companies = c.id_companies_real)
WHERE 
  c."type" = '0' AND c.id > 0 
ORDER BY c."name";

makes it run fast, however, this is not acceptable, as account filtration parameters are optional, and I still need the result if there are no matches in that table. Using "LEFT JOIN accounts" instead of "INNER JOIN accounts" kills the performance again.

Answers


As suggested by Tome Lane, I've changed the following two parameters: join_collapse_limit and from_collapse_limit to 10 instead of the default 8, and this solved the issue.


Need Your Help

C++ Priority Queue with Struct

c++ struct priority-queue

I'm using Priority Queue with Struct

Save excel as pdf, name of the file as a variables value

excel vba excel-vba

Im trying to save a selection to pdf, everything works fine except that I cant save the file with and specific name, I want the name to be report Sub and the date and time ... it works fine except...