'group by' works on MySQL, but not Oracle

I have a query that works on MySQL but doesn't work on Oracle, and I'm trying to convert. This is my table:

unique_row_id  http_session_id  page_name   page_hit_timestamp
----------------------------------------------------------------
0              123456789        index.html  2010-01-20 15:00:00
1              123456789        info.html   2010-01-20 15:00:05
2              123456789        faq.html    2010-01-20 15:00:15
3              987654321        index.html  2010-01-20 16:00:00
4              987654321        faq.html    2010-01-20 16:00:05
5              987654321        info.html   2010-01-20 16:00:15
6              111111111        index.html  2010-01-20 16:01:00
7              111111111        faq.html    2010-01-20 16:01:05
8              111111111        info.html   2010-01-20 16:01:15

The SQL is

select http_session_id, unique_row_id, page_name, page_hit_timestamp 
from page_hits 
group by http_session_id;

On MySQL, this will return 3 rows (one for each unique http_session_id).

On Oracle, I get a "ORA-00979: not a GROUP BY expression" error. I've tried playing around with distinct too, but I can't get it to work.

Just to be clear - I would like a ResultSet that contains one row per unique http_session_id. It is preferable that the unique_row_id would be the max one (e.g. 2 for http_session_id==123456789), but this is not significant.

I'm on the verge of breaking this into multiple separate sql statements (one "select distinct http_session_id", and the other to iterate through all these and select the max(unique_row_id). Any pointers would be gratefully received - I would love to avoid this!

Rgds, Kevin.

Answers


Would this work:

select max(unique_row_id), http_session_id
from page_hits
group by http_session_id

Incidentally; what does my sql return in your resultset for columsn that are included in the resultset but not in the group by clause? (page_name, page_hit_timestamp)


The reason you encounter the ORA error is because MySQL supports non-standard GROUP BY clauses, calling it a "feature". It's documented here.

The standard SQL GROUP BY clause must include ALL columns specified in the SELECT clause, that are not wrapped in aggregate functions (LIKE COUNT, MAX/MIN, etc), to be specified in the GROUP BY clause.

If you want one, unique row per http_session_id value - look at using ROW_NUMBER:

SELECT x.*
  FROM (select http_session_id, unique_row_id, page_name, page_hit_timestamp,
               ROW_NUMBER() OVER (PARTITION BY http_session_id 
                                      ORDER BY http_session_id) AS rank
          FROM page_hits) x
 WHERE x.rank = 1

I think that a GROUP BY requires a variable to have been used in a WHERE clause or aggregation function in the SQL standard?

Try using SELECT MAX(unique_row_id) GROUP BY http_session_id.


In standard SQL, if you have a GROUP BY clause, all columns that are not part of it have to be in aggregates. In MySQL, this rule was relaxed by design.

For instance, this is allowed in MySQL but not in standard SQL:

SELECT customer_id, country, SUM(amount) FROM records GROUP BY customer_id

There's one caveat: MySQL assumes you know what you're doing. If the same customer has records in multiple countries, the query will simply grab the first country in the table, disregarding all others. Furthermore, since the order of rows is undefined, and there is no ORDER BY, you might get different results each time you run the query.

In Standard SQL, you have two choices:

SELECT customer_id, country, SUM(amount) FROM records GROUP BY customer_id, country

or

SELECT customer_id, MIN(country), SUM(amount) FROM records GROUP BY customer_id

Another option in Oracle, if you want:

select DISTINCT
       FIRST_VALUE(unique_row_id)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) unique_row_id,
       http_session_id,
       FIRST_VALUE(page_name)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) page_name,
       FIRST_VALUE(page_hit_timestamp)
       OVER (PARTITION BY http_session_id
             ORDER BY unique_row_id DESC) page_hit_timestamp
from page_hits;

This will get a distinct set of http_session_id's, and for each, returns the unique_row_id, page_name and page_hit_timestamp from the row with the greatest unique_row_id for that http_session_id, e.g.:

unique_row_id  http_session_id  page_name   page_hit_timestamp
----------------------------------------------------------------
2              123456789        faq.html    2010-01-20 15:00:15
5              987654321        info.html   2010-01-20 16:00:15
8              111111111        info.html   2010-01-20 16:01:15

Need Your Help

FFMpeg: Take Certain Amount of Screenshots Between X and X?

batch-file ffmpeg screenshot

Is there any way to get ffmpeg to take X number of screenshots between X time and X time? The way I'm doing my command line code now is like this:

Closing a connection with a `with` statement

python imap with-statement imaplib

I want to have a class that represents an IMAP connection and use it with a with statement as follows: