Searching multiple columns in SQL for value vs storing in one column

I need to store lotto numbers in a SQL database. There are 8 separate numbers that make up the lotto number.

EX: 07-12-34-40-59-80-88-89

I first thought to store this data in a single row with columns named lotto1 - lotto8. This makes sense because lotto numbers are always displayed low to high, and users will need to see the stored lotto numbers. Writing a select statement would be trivial.

I also need to see how many of the lotto numbers are winners. To do this I need to search for each winning number in every single column.

EX: If the winning numbers were 01-02-03-04-05-06-07-08, I would need to check every column to see "07" is in my lotto1 column.

Checking for winners would be easier if all lotto numbers were stored in a 1 column, and they had a LottoTicketID to associate the 8 rows of data with 1 ticket. When checking for winners, I don't need to say which specific numbers/positions were matches, just how many matches there were.

SELECT LottoTicketID,Count(LottoTicketID) from LOTTOTABLE WHERE LottoNumber in (01,02,03,04,05,06,07,08) GROUP BY LottoTicketID

Storing this way would cause problems because I would need to write 7 joins just to display the numbers.

Can someone suggest a way to store this data so that I can: 1. Quickly displayed the lotto numbers/entires 2. Check for winners

Answers


The way you'll probably want to tackle this is to have two tables to store information about lottery tickets:

  • lottery_tickets: To store information about each individual lottery ticket:

    lottery_tickets
    ----------------------
    id [PK]
    first_name
    last_name
    date_given
    ...
    
  • lottery_ticket_numbers: To store that corresponding lottery ticket's numbers:

    lottery_ticket_numbers
    ----------------------
    id        [PK](FK)
    number    [PK]
    position  
    

Connect the two on the id field via a 1:N identifying relationship, and have each lottery ticket have 8 corresponding rows in the lottery_ticket_numbers table. The position field will keep track of each numbers' position (1-8) in the whole string of numbers.

What this will allow you to do is easily figure out the winning lottery tickets by JOINing the tables together and narrowing it down to which lottery tickets have ALL the matching numbers:

SELECT
    a.*,
    GROUP_CONCAT(b.number ORDER BY position SEPARATOR '-') AS lottery_number
FROM
    lottery_tickets a
INNER JOIN
    lottery_ticket_numbers b ON a.id = b.id
WHERE
    b.number IN (07,12,34,40,59,80,88,89)
GROUP BY 
    a.id
HAVING 
    COUNT(b.number) = 8

The result set will look something like this:

id   |   first_name   |   last_name   |   date_given   |   lottery_number
--------------------------------------------------------------------------------
823  |   Zane         |   Bien        |   2012-01-01   |   34-80-07-89-12-40-59-88
2321 |   Jeff         |   Clark       |   2012-01-14   |   59-07-88-40-12-34-80-89

Note:

  • If you are using MySQL, the GROUP_CONCAT() function consolidates the winning tickets' numbers in the format XX-XX-XX-XX-XX-XX-XX-XX with numbers in their original order that the person got it. If you're not using MySQL, I'm not sure if similar function exists in other DBMSs.

Now you'll probably want to store data about the actual winning lottery numbers as well. For this, you will also utilize two tables in much the same way: one for the lottery number and another for its corresponding numbers:

  • winning_numbers:

    winning_numbers
    ----------------------
    id [PK]
    date_pulled
    
  • winning_numbers_numbers:

    winning_numbers_numbers
    ----------------------
    id     [PK](FK)
    number [PK]
    

To query for the winning tickets:

SELECT
    a.*,
    GROUP_CONCAT(b.number ORDER BY position SEPARATOR '-') AS lottery_number
FROM
    lottery_tickets a
INNER JOIN
    lottery_ticket_numbers b ON a.id = b.id
WHERE
    b.number IN
    (
        SELECT number
        FROM winning_numbers_numbers
        WHERE id = <id of a particular lottery number>
    )
GROUP BY 
    a.id
HAVING 
    COUNT(b.number) = 8

Nearly the same as the previous query, except that now the number list is a result set from a subquery.

Displaying the winning lottery numbers is simple. Ex:

Retrieve all winning numbers and their pull date; displaying their number string as numbers in ascending order, and ordering by the most recent pulled number:

SELECT
    GROUP_CONCAT(b.number ORDER BY b.number SEPARATOR '-') AS winning_number,
    a.date_pulled
FROM
    winning_numbers a
INNER JOIN
    winning_numbers_numbers b ON a.id = b.id
GROUP BY 
    a.id
ORDER BY
    a.date_pulled DESC

Note that when you insert winning numbers, you won't have to worry at all about the order of the numbers which you insert. Since they are always displayed in ascending order, the ORDER BY in the GROUP_CONCAT() takes care of that for you.


Now let's take a look at what you'll have to do if you use the column-based approach (storing winning numbers and ticket numbers as single row but with eight columns housing each number):

Assuming the schema:

winning_numbers(id, date_pulled, n1, n2, n3, n4, n5, n6, n7, n8)

ticket_numbers(id, fname, lname, date_given, n1, n2, n3, n4, n5, n6, n7, n8)

Find all winning tickets for a given lottery number:

SELECT
    a.*
FROM 
    ticket_numbers a
INNER JOIN
    winning_numbers b ON
        a.n1 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        a.n2 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        a.n3 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        a.n4 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        a.n5 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        a.n6 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        a.n7 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        a.n8 IN (b.n1, b.n2, b.n3, b.n4, b.n5, b.n6, b.n7, b.n8) AND
        b.id = <winning number id>

That's one heck of a lot of INs if you were to ask me...!!!

The advantage of this approach is that you wouldn't need the MySQL-specific GROUP_CONCAT function to display the ticket numbers.


Storing this way would cause problems because I would need to write 7 joins just to display the numbers.

The only time you would need to do 7 joins to show all the numbers would be if you need them on a single row. If you're processing these outside the database (or even using a stored procedure) collecting them into a readable format is trivial.

Another option that you might have, if your database supports it, would be to put all the numbers in a single array-typed column or use some sort of multi-row analytic function. You could always just write a view on this to solve the problem once.


Why would you need to write 7 joins for displaying the numbers? You could write something like this:

SELECT LottoNumber FROM LOTTOTABLE
WHERE LottoTicketID=n
ORDER BY LottoNumber

And you could check them using the query you suggested already:

SELECT LottoTicketID, Count(LottoTicketID) 
FROM LOTTOTABLE 
WHERE LottoNumber in (01,02,03,04,05,06,07,08) 
GROUP BY LottoTicketID

You could also restrict this list to just the winners by using:

HAVING Count(LottoTicketID) = 8

You could also store the lotto numbers as TINYINT to keep the database size down.


There are a lot of advantages in storing the numbers in a single column, as you mention. IF the only remaining issue is the complexity of displaying them together, there are many possibilities for doing that cleanly and easily - and it's frequently overlooked, so the question is frequently asked.

It was dealt with well here, for several SQL flavors, as well as some platform independent options: Concatenate many rows into a single text string?


Need Your Help

Matplotlib contourplot fails when no contour exists

python numpy matplotlib scipy

I use the cookbook example frin http://wiki.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data to make contourplots. However some of my data may just contain zeros, in which case I get an

iPhone OS: Rotate just the button images, not the views

iphone objective-c user-interface

I am developing an iPad application which is basically a big drawing canvas with a couple of button at the side. (Doesn't sound very original, does it? :P)