SQL Query takes much longer when decreasing the date range

I have a query which includes a date range.

When I have the date range as "> '2013-05-01'" in the query, I get the results in 188.300 ms. However, if I change the date range to "BETWEEN '2013-05-01' AND '2013-08-01'", I get the results after 1102312.636 ms. This doesn't make sense to me at all, as the second date range covers much less data.

Here are the 2 queries and their explains underneath:

SELECT 
SUM(quantity)
FROM
transaction_master
INNER JOIN transaction_line_items ON transaction_line_items.link_guid = transaction_master.guid
AND master_code = 'AAL027PU' and item_colour = 'BE'
AND (sale_date  > '2013-05-01')
AND transaction_type = 'POSSALE'

Explain: http://explain.depesz.com/s/hPtI

SELECT 
SUM(quantity)
FROM
transaction_master
INNER JOIN transaction_line_items ON transaction_line_items.link_guid = transaction_master.guid
AND master_code = 'AAL027PU' and item_colour = 'BE'
AND sale_date BETWEEN '2013-05-01' AND '2013-08-01'
AND transaction_type = 'POSSALE'

Explain: http://explain.depesz.com/s/WN1

Thanks!

Answers


Examination of the query plans suggested bad row-count estimates (see the EXPLAIN ANALYZE sketch after this list for a quick way to compare estimated and actual row counts), due to:

  • Uneven data distributions combined with low statistics targets;

  • Outdated statistics (is autovacuum running often enough?);

  • A planner mis-estimation of how the individual conditions combine.
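
A quick way to see whether the estimates are off (a sketch only, reusing the tables and conditions from the question) is to run the slow query under EXPLAIN (ANALYZE, BUFFERS) and compare each node's estimated row count against the rows it actually produced:

EXPLAIN (ANALYZE, BUFFERS)
SELECT 
SUM(quantity)
FROM
transaction_master
INNER JOIN transaction_line_items ON transaction_line_items.link_guid = transaction_master.guid
AND master_code = 'AAL027PU' AND item_colour = 'BE'
AND sale_date BETWEEN '2013-05-01' AND '2013-08-01'
AND transaction_type = 'POSSALE';

-- Nodes where the estimated row count differs from the actual one by orders of
-- magnitude are the ones pushing the planner towards a bad plan.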

First, run ANALYZE. If you need to do this by hand, it probably means autovacuum isn't running enough, or you just recently bulk-loaded a table and autovac hasn't kicked off yet.
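
For example (a sketch, assuming the table names from the question), analyze both tables by hand and check when autovacuum last gathered statistics for them:

ANALYZE transaction_master;
ANALYZE transaction_line_items;

-- When were these tables last analyzed, manually or by autovacuum?
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('transaction_master', 'transaction_line_items');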

If that doesn't help, increase the statistics targets for the relevant columns so that ANALYZE samples more rows.
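
A sketch of how that might look; the column-to-table mapping is an assumption (e.g. sale_date on transaction_master, master_code on transaction_line_items), and 1000 is just an illustrative target (the default is 100):

-- Assumes sale_date lives on transaction_master and master_code on
-- transaction_line_items; adjust to your actual schema.
ALTER TABLE transaction_master ALTER COLUMN sale_date SET STATISTICS 1000;
ALTER TABLE transaction_line_items ALTER COLUMN master_code SET STATISTICS 1000;

-- Re-analyze so the new targets take effect.
ANALYZE transaction_master;
ANALYZE transaction_line_items;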

If you still get the same estimates, that suggests the planner is mis-estimating how the inputs will combine. That's harder to deal with; you'd need to report it to the pgsql-performance mailing list and seek advice there.


Your filter conditions are all attached to the JOIN's ON clause, and I suspect you should move them into a WHERE clause. Otherwise it looks as if the server scans all the tables for records matching the conditions instead of filtering out the required records first and then doing the join. A simple join on the foreign key alone will, of course, be much faster. So try this:

SELECT 
SUM(quantity)
FROM
transaction_master
INNER JOIN transaction_line_items ON transaction_line_items.link_guid = transaction_master.guid
WHERE
    master_code = 'AAL027PU' AND item_colour = 'BE'
    AND sale_date BETWEEN '2013-05-01' AND '2013-08-01'
    AND transaction_type = 'POSSALE'
