MySql subselects and counts based on counts

Here is model of my table structure. Three tables.

----------------     ----------------------------     -------------------------------
|possibilities |     |realities                 |     |measurements                 |
|--------------|     |--------------------------|     |-----------------------------|
|pid| category |     |rid | pid | item | status |     |mid | rid | meas | date      |
|--------------|     |--------------------------|     |-----------------------------|
|1  | animal   |     |1   | 1   | dog  | 1 (yes)|     |1   | 1   | 3    | 2012-01-01|
|2  | vegetable|     |2   | 1   | fox  | 1      |     |2   | 3   | 2    | 2012-01-05|
|3  | mineral  |     |3   | 1   | cat  | 1      |     |3   | 1   | 13   | 2012-02-02|
----------------     |4   | 2   | apple| 2 (no) |     |4   | 3   | 24   | 2012-02-15|
                     |5   | 1   | mouse| 1      |     |5   | 2   | 5    | 2012-02-16|
                     |7   | 1   | bat  | 2      |     |6   | 6   | 4    | 2012-02-17|
                     ----------------------------     -------------------------------

What I'm after is a result that will show me a series of counts based on measurement ranges for a particular entry from the "possibilities" table where the status of the related "realities" is 1 (meaning it's currently being tracked), BUT the only measurement that is relevant is the most recent one.

Here is an example result I'm looking for using animal as the possibility.

-----------------------
| 0-9 | 10-19 | 20-29 |
|---------------------|
| 2   | 1     | 1     |
-----------------------

So, in this example, the apple row was not used to make a count because it isn't an animal, nor was bat because it's status was set to no (meaning don't measure), and only the most recent measurements were used to determine the count.

I currently have a workaround in my real world use but it doesn't follow good database normalization. In my realities table I have a current_meas column that get updated when a new measurement is taken and entered in the measurements table. Then I only need to use the first two tables and I have a single SELECT statement with a bunch of embedded SUM statements that use an IF the value is between 0-9 for example. It gives me exactly what I want, however my app has evolved to the point where this convenience has become a problem in other areas.

So, the question is, is there a more elegant way to do this in one statement? Subselects? temporary tables? Getting the counts is the heart of what the app is about.

This is a PHP5, MySQL5, JQuery 1.8 based webapp, in case that give me some more options. Thanks in advance. I love the stack and hope to help back as much as it has helped me.

Answers


Here's one approach

Create a temp table to get the recent measurements

CREATE TEMPORARY TABLE RecentMeasurements
SELECT * FROM Measurements m
INNER JOIN (SELECT max(mid) max_id,date FROM Measurements GROUP BY DATE ORDER BY DATE ) x
ON x.max_id=m.mid

then do you query:

SELECT *, your counting logic
FROM Realities
WHERE status = 1 AND pid = 1
INNER JOIN RecentMeasurements

Here is what I ended up doing based on the two answers suggested.

  1. First I created a temporary table that generates a table of realities that are based on one possibility (animals) and whose status is 1 (yes).
  2. Second I created a temporary table that generates a table of the individual realities from the first temp table, and finds the most recent measurement for each one.
  3. From this second table I do a select that gives me the breakdown of counts in ranges.

When I tried it with just one temp table the query would take 5-10 seconds per possibility. In my real-world use I currently have 30 possibilities (a script loops through each one and generates these temp tables and selects), well over 1,000 realities (600 active on any given day, 100 added per month), and over 21,000 measurements (20-30 added daily). That just wasn't working for me. So breaking it up into smaller pools to draw from reduced it to the whole report running in under 3-4 seconds.

Here is the MySQL stuff with my real-world table and column names.

//Delete the temporary tables in advance
$delete_np_prod = 'DROP TABLE IF EXISTS np_infreppool';
mysql_query($delete_np_prod) or die ("Drop NP Prod Error " . mysql_error ());
$delete_np_max = 'DROP TABLE IF EXISTS np_maxbrixes';
mysql_query($delete_np_max) or die ("Drop NP Max Error " . mysql_error ());

//Make a temporary table to hold the totes of this product at North Plains that are active
$create_np_prod_pool_statement = 'CREATE TEMPORARY TABLE np_infreppool
SELECT inf_row_id FROM infusion WHERE formid = ' . $active_formids["formid"] . ' AND location = 1 AND status = 1';
mysql_query($create_np_prod_pool_statement) or die ("Prod Error " . mysql_error ());

//Make a temporary table to hold the tote with its most recent brix value attached to it.
$create_np_maxbrix_pool_statement = 'CREATE TEMPORARY TABLE np_maxbrixes
SELECT b.inf_row_id AS inf_row_id, b.brix AS brix from brix b, np_infreppool pool WHERE b.inf_row_id = pool.inf_row_id AND b.capture_date = (SELECT max(capture_date) FROM brix WHERE inf_row_id = pool.inf_row_id )';
mysql_query($create_np_maxbrix_pool_statement) or die ("Brix Error " . mysql_error ());

//Get the counts for slected form from NP
$get_report_np = "SELECT 
        SUM(IF(brix BETWEEN 0 AND 4,1,0)) as '0-4',
        SUM(IF(brix BETWEEN 5 AND 9,1,0)) as '5-9',
        SUM(IF(brix BETWEEN 10 AND 14,1,0)) as '10-14',
        SUM(IF(brix BETWEEN 15 AND 19,1,0)) as '15-19',
        SUM(IF(brix BETWEEN 20 AND 24,1,0)) as '20-24',
        SUM(IF(brix BETWEEN 25 AND 29,1,0)) as '25-29',
        SUM(IF(brix BETWEEN 30 AND 34,1,0)) as '30-34',
        SUM(IF(brix BETWEEN 35 AND 39,1,0)) as '35-39',
        SUM(IF(brix BETWEEN 40 AND 44,1,0)) as '40-44',
        SUM(IF(brix BETWEEN 45 AND 49,1,0)) as '45-49',
        SUM(IF(brix BETWEEN 50 AND 54,1,0)) as '50-54',
        SUM(IF(brix BETWEEN 55 AND 59,1,0)) as '54-49',
        SUM(IF(brix BETWEEN 60 AND 64,1,0)) as '60-64',
        SUM(IF(brix BETWEEN 65 AND 69,1,0)) as '65-69',
        SUM(IF(brix >=70, 1, 0)) as 'Over 70'
    FROM np_maxbrixes";
$do_get_report_np = mysql_query($get_report_np);
$got_report_np = mysql_fetch_array($do_get_report_np);

UPDATE

I got it to work in a single SELECT statement without using temporary tables and it works faster. Using my sample schema above, here is how it looks.

    SELECT 
    SUM(IF(m.meas BETWEEN 0 AND 4,1,0)) as '0-4',
    SUM(IF(m.meas BETWEEN 5 AND 9,1,0)) as '5-9',
    SUM(IF(m.meas BETWEEN 10 AND 14,1,0)) as '10-14',
    SUM(IF(m.meas BETWEEN 15 AND 19,1,0)) as '15-19',
    SUM(IF(m.meas BETWEEN 20 AND 24,1,0)) as '20-24',
    SUM(IF(m.meas BETWEEN 25 AND 29,1,0)) as '25-29',
    SUM(IF(m.meas BETWEEN 30 AND 34,1,0)) as '30-34',
    SUM(IF(m.meas BETWEEN 35 AND 39,1,0)) as '35-39',
    SUM(IF(m.meas BETWEEN 40 AND 44,1,0)) as '40-44',
    SUM(IF(m.meas BETWEEN 45 AND 49,1,0)) as '45-49',
    SUM(IF(m.meas BETWEEN 50 AND 54,1,0)) as '50-54',
    SUM(IF(m.meas BETWEEN 55 AND 59,1,0)) as '54-49',
    SUM(IF(m.meas BETWEEN 60 AND 64,1,0)) as '60-64',
    SUM(IF(m.meas BETWEEN 65 AND 69,1,0)) as '65-69',
    SUM(IF(m.meas >=70, 1, 0)) as 'Over 70'
FROM measurement m, realities r 
WHERE r.status = 1 AND r.pid = " . $_GET['pid'] . " AND r.rid = m.rid AND m.date = (SELECT max(date) FROM measurements WHERE rid = r.rid)

Need Your Help

Django HttpResponse strips JSON of double escapes breaks jQuery

jquery django json

I've got an interesting problem with JSON and Django. I think I've narrowed it down to this problem. The problem is that I have a JSON object something like this: