MySQL 500 million rows table in select query with join

I'm concerned about the performance of the query below once the tables are fully populated. So far it's under development and performs well with dummy data.

The table "adress_zoo" will contain about 500 million records once fully populated. "adress_zoo" table looks like this:

  CREATE TABLE `adress_zoo` 
     ( `adress_id` int(11) NOT NULL, `zoo_id` int(11) NOT NULL, 
     UNIQUE KEY `pk` (`adress_id`,`zoo_id`), 
     KEY `adress_id` (`adress_id`) ) 
     ENGINE=InnoDB DEFAULT CHARSET=latin1;

The other tables will contain maximum 500 records each.

The full query looks like this:

  SELECT a.* FROM jos_zoo_item AS a 
  JOIN jos_zoo_search_index AS zsi2 ON zsi2.item_id = a.id 
  WHERE a.id IN (   

     SELECT r.id FROM ( 

        SELECT zi.id AS id, Max(zi.priority) as prio 
        FROM jos_zoo_item AS zi 
        JOIN jos_zoo_search_index AS zsi ON zsi.item_id = zi.id 
        LEFT JOIN jos_zoo_tag AS zt ON zt.item_id = zi.id 
        JOIN jos_zoo_category_item AS zci ON zci.item_id = zi.id 
        **JOIN adress_zoo AS az ON az.zoo_id = zi.id** 

        WHERE 1=1 
        AND ( (zci.category_id != 0 AND ( zt.name != 'prolong' OR zt.name is NULL)) 
        OR (zci.category_id = 0 AND zt.name = 'prolong') ) 
        AND zi.type = 'telefoni' 
        AND zsi.element_id = '44d3b1fd-40f6-4fd7-9444-7e11643e2cef' 
        AND zsi.value = 'Small' 
        AND zci.category_id > 15 
        **AND az.adress_id = 5** 

        GROUP BY zci.category_id ) AS r 
  ) 

  AND a.application_id = 6 
  AND a.access IN (1,1) 
  AND a.state = 1 
  AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2012-06-07 07:51:26') 
  AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2012-06-07 07:51:26') 
  AND zsi2.element_id = '1c3cd26e-666d-4f8f-a465-b74fffb4cb14' 

  GROUP BY a.id 
  ORDER BY zsi2.value ASC

The query will usually return about 25 records.

Based on your experience, will this query perform acceptable (respond within say 3 seconds)? What can I do to optimise this?

As adviced by @Jack I ran the query with EXPLAIN and got this:

Answers


This part is an important limiter:

az.adress_id = 5

MySQL will limit the table to only those records where adress_id matches before joining it with the rest of the statement, so it will depend on how big you think that result set might be.

Btw, you have a UNIQUE(adress_id, zoo_id) and a separate INDEX. Is there a particular reason? Because the first part of a spanning key can be used by MySQL to select with as well.

What's also important is to use EXPLAIN to understand how MySQL will "attack" your query and return the results. See also: http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html


To avoid subquery you can try to rewrite your query as:

SELECT a.* FROM jos_zoo_item AS a 
  JOIN jos_zoo_search_index AS zsi2 ON zsi2.item_id = a.id 
  INNER JOIN 
  (   
     SELECT ** distinct ** r.id FROM ( 

        SELECT zi.id AS id, Max(zi.priority) as prio 
        FROM jos_zoo_item AS zi 
        JOIN jos_zoo_search_index AS zsi ON zsi.item_id = zi.id 
        LEFT JOIN jos_zoo_tag AS zt ON zt.item_id = zi.id 
        JOIN jos_zoo_category_item AS zci ON zci.item_id = zi.id 
        **JOIN adress_zoo AS az ON az.zoo_id = zi.id** 

        WHERE 1=1 
        AND ( (zci.category_id != 0 AND ( zt.name != 'prolong' OR zt.name is NULL)) 
        OR (zci.category_id = 0 AND zt.name = 'prolong') ) 
        AND zi.type = 'telefoni' 
        AND zsi.element_id = '44d3b1fd-40f6-4fd7-9444-7e11643e2cef' 
        AND zsi.value = 'Small' 
        AND zci.category_id > 15 
        **AND az.adress_id = 5** 

        GROUP BY zci.category_id ) AS r 
  ) T
    on a.id = T.id
  where 
  AND a.application_id = 6 
  AND a.access IN (1,1) 
  AND a.state = 1 
  AND (a.publish_up = '0000-00-00 00:00:00' OR a.publish_up <= '2012-06-07 07:51:26') 
  AND (a.publish_down = '0000-00-00 00:00:00' OR a.publish_down >= '2012-06-07 07:51:26') 
  AND zsi2.element_id = '1c3cd26e-666d-4f8f-a465-b74fffb4cb14' 

  GROUP BY a.id 
  ORDER BY zsi2.value ASC

This approach don't perform subquery for each candidate row. Performance may be increased only if T is calculated in few milliseconds.


Need Your Help

PHP Updates to database not working

php mysql post

I am trying to create a login system that then shows all users information in a list, I have everything working so far except I can not have the 'rank' and 'role' update on the edit page working at...

quadratic formula with scanner inputs

java input java.util.scanner output quadratic

Okay so I am a complete Java noob, and I'm trying to create a program for class that runs a quadratic equation using scanner inputs. So far what I've got is this: