Advantages of using two tables instead of a column with 2 different values
I'm creating a database structure. I have to store incoming and outgoing messages, and I'm wondering which is the best way to do this.
Two separate tables, or a single table with an ENUM('in', 'out') column?
Any suggestions?
If you are sending messages to users from other users, what I do is to create a sent_message table and a message_to_users table.
You probably won't want to physically delete a message at any point, so I just use status flags for that.
sent_message
------------
sent_message_id  int
from_id          int
subject          varchar(128)
body             text
status           char(1)
sent_datetime    datetime

message_to_user
---------------
message_to_user_id  int
sent_message_id     int
to_id               int
read_datetime       datetime
status              char(1)
The status of a sent_message would be s(ent) or d(eleted), and the status of a message_to_user would be a(rrived), r(ead), or d(eleted).
This method allows for easy "reply all" functionality, and saves space when sending a message to more than one user.
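To make the two-table layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for whatever database you actually use, and the helper names (send_message, reply_all_recipients) and the sample data are made up for illustration; only the table and column names come from the layout above.

```python
import sqlite3

# In-memory SQLite database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sent_message (
    sent_message_id INTEGER PRIMARY KEY,
    from_id         INTEGER NOT NULL,
    subject         VARCHAR(128),
    body            TEXT,
    status          CHAR(1) NOT NULL DEFAULT 's',   -- s(ent) or d(eleted)
    sent_datetime   DATETIME
);
CREATE TABLE message_to_user (
    message_to_user_id INTEGER PRIMARY KEY,
    sent_message_id    INTEGER NOT NULL REFERENCES sent_message(sent_message_id),
    to_id              INTEGER NOT NULL,
    read_datetime      DATETIME,
    status             CHAR(1) NOT NULL DEFAULT 'a'  -- a(rrived), r(ead) or d(eleted)
);
""")


def send_message(from_id, to_ids, subject, body):
    """Store the body once, then add one delivery row per recipient."""
    cur = conn.execute(
        "INSERT INTO sent_message (from_id, subject, body, sent_datetime) "
        "VALUES (?, ?, ?, datetime('now'))",
        (from_id, subject, body),
    )
    message_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO message_to_user (sent_message_id, to_id) VALUES (?, ?)",
        [(message_id, to_id) for to_id in to_ids],
    )
    return message_id


def reply_all_recipients(message_id, replying_user_id):
    """Original sender plus every recipient, excluding the user who replies."""
    rows = conn.execute(
        "SELECT from_id FROM sent_message WHERE sent_message_id = ? "
        "UNION "
        "SELECT to_id FROM message_to_user WHERE sent_message_id = ?",
        (message_id, message_id),
    ).fetchall()
    return sorted(r[0] for r in rows if r[0] != replying_user_id)


# Hypothetical example: user 1 writes to users 2, 3 and 4.
msg_id = send_message(1, [2, 3, 4], "Lunch?", "Noon at the usual place.")
```

The body is stored once no matter how many recipients there are, which is where the space saving comes from, and "reply all" is a single query over both tables.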
One thing that will dictate your structure is whether incoming and outgoing messages need different data stored about them. If they do, you will likely need separate tables.
Also, will you usually be requesting them separately, or will you always need both types from the same query?
In making the determination, you need to sit down and decide what data you need to store about each type and how you are going to query the data. That will end up dictating your structure. In a typical messaging situation you will likely have a great many records, and it will be of benefit to design with that volume in mind. I might even test both ways with a set of test records in the multimillions to see what impact my basic design choice had. I know people talk about not prematurely optimizing, but the basic structure of a database is very hard to change once you have millions of actual records. It is worth the time to set it up now with test records and see which of the possibilities will work best with the type of querying you need to do (don't forget to test with indexes, as they make a huge performance difference). This is not premature optimization; this is testing the likely load before committing to a poor design and being unable to refactor later when users are screaming about performance.
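As a rough illustration of that kind of test, the sketch below loads the same generated rows into both candidate designs and times one query against each. It is only a starting point under loud assumptions: it uses in-memory SQLite through Python's sqlite3 module rather than your real server, the table names (messages, incoming, outgoing) and the ROWS constant are invented, and the COUNT query is a placeholder for the queries your application will actually run.

```python
import sqlite3
import time

ROWS = 100_000  # assumption: raise this towards the millions for a realistic test


def build_single_table(conn):
    """One table with a direction column and an index on it."""
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, direction TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages (direction, body) VALUES (?, ?)",
        (("in" if i % 2 else "out", f"message {i}") for i in range(ROWS)),
    )
    conn.execute("CREATE INDEX idx_direction ON messages(direction)")


def build_two_tables(conn):
    """Separate incoming and outgoing tables holding the same rows."""
    conn.execute("CREATE TABLE incoming (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE outgoing (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO incoming (body) VALUES (?)",
        ((f"message {i}",) for i in range(ROWS) if i % 2),
    )
    conn.executemany(
        "INSERT INTO outgoing (body) VALUES (?)",
        ((f"message {i}",) for i in range(ROWS) if not i % 2),
    )


def timed(conn, sql):
    """Run a single-value query and return (value, elapsed seconds)."""
    start = time.perf_counter()
    value = conn.execute(sql).fetchone()[0]
    return value, time.perf_counter() - start


conn = sqlite3.connect(":memory:")
build_single_table(conn)
build_two_tables(conn)

single_count, single_secs = timed(
    conn, "SELECT COUNT(*) FROM messages WHERE direction = 'in'"
)
split_count, split_secs = timed(conn, "SELECT COUNT(*) FROM incoming")

print(f"single table: {single_count} rows in {single_secs:.4f}s")
print(f"two tables:   {split_count} rows in {split_secs:.4f}s")
```

Timings from a toy like this say nothing about your production engine, so treat it as a template: point it at the real database, load realistic data, and try it both with and without the indexes you plan to use.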
Since the message is the very same object, you should include box_id as a reference to a boxes table. This will let you store messages not only in the inbox/outbox, but also in trash, drafts and any other "folders" you may think of.
Otherwise you can have a many-to-many relationship and store the same message in several boxes (just like Gmail labels).
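A small sketch of the many-to-many variant, again with Python's sqlite3 module standing in for the real database. The junction table name message_boxes, the helper subjects_in_box and the sample rows are assumptions made for illustration. For the simpler one-box-per-message variant you would drop the junction table and put a box_id column directly on messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boxes (
    box_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE
);
CREATE TABLE messages (
    message_id INTEGER PRIMARY KEY,
    subject    TEXT
);
-- Junction table: one row per (message, box) pair, much like a label.
CREATE TABLE message_boxes (
    message_id INTEGER NOT NULL REFERENCES messages(message_id),
    box_id     INTEGER NOT NULL REFERENCES boxes(box_id),
    PRIMARY KEY (message_id, box_id)
);
INSERT INTO boxes (box_id, name) VALUES
    (1, 'inbox'), (2, 'outbox'), (3, 'trash'), (4, 'drafts');
INSERT INTO messages (message_id, subject) VALUES (1, 'Hello'), (2, 'Invoice');
-- Message 2 sits in two boxes at once.
INSERT INTO message_boxes (message_id, box_id) VALUES (1, 1), (2, 1), (2, 3);
""")


def subjects_in_box(box_name):
    """Subjects of all messages filed under the given box, oldest first."""
    rows = conn.execute(
        "SELECT m.subject FROM messages m "
        "JOIN message_boxes mb ON mb.message_id = m.message_id "
        "JOIN boxes b ON b.box_id = mb.box_id "
        "WHERE b.name = ? ORDER BY m.message_id",
        (box_name,),
    ).fetchall()
    return [r[0] for r in rows]
```

Adding a new "folder" is then just a new row in boxes, with no schema change.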
If 90% of the columns are the same, use one table.
TABLE messages
    id         INT
    subject    STRING
    direction  ENUM
    INDEX direction
I would suggest separate tables if they are accessed/managed by separate processes. If the same process manages both types of message, then use the same table.
One table is the best solution.
Usually any given data entity should be stored in a single table. In this case the message is your data entity.
As a side point, I would advise against the use of enums in tables. In this case each message is either incoming or outgoing, so the allowed directions should be stored in a separate table, with a constraint to ensure that the values used are valid.
Also, "direction" is probably a misnomer, and you may wish to call "in" and "out" folders, locations or boxes instead (as @Eimantas points out).
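One way to read the lookup-table advice is a foreign key from messages to a small table of allowed values. The sketch below assumes that reading and uses Python's sqlite3 module; the table name message_direction and the helper try_insert are invented for illustration. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which is a SQLite detail and not something other databases need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
-- Lookup table holding the only values a message may use.
CREATE TABLE message_direction (
    direction TEXT PRIMARY KEY
);
INSERT INTO message_direction (direction) VALUES ('in'), ('out');

CREATE TABLE messages (
    id        INTEGER PRIMARY KEY,
    subject   TEXT,
    direction TEXT NOT NULL REFERENCES message_direction(direction)
);
CREATE INDEX idx_messages_direction ON messages(direction);
""")


def try_insert(subject, direction):
    """Return True if the row was accepted, False if the constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO messages (subject, direction) VALUES (?, ?)",
                (subject, direction),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout, adding a new kind of box later is an INSERT into the lookup table, whereas changing an ENUM means altering the table definition.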
I think it depends on whether or not you would ever want to see all messages at once. If your queries are going to return either all of the incoming OR all of the outgoing messages, but NEVER all of them together, then you'll want two tables. Two tables will be the faster solution, especially if you end up with a lot of rows in each.
If it's a messaging system for messages between users of the same site/application, you could simply use one table with senderId and recipientId. The inbox is the set of messages where the user's id matches recipientId; the outbox is the set where the user's id matches senderId.
Mind you, this doesn't scale well for messages sent to multiple users at once. In that case you need a separate table, as illustrated by Matt Allen.
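The single-table idea from this answer could be sketched as follows, using Python's sqlite3 module. The column names senderId and recipientId come from the answer; the body column, the index names, the inbox/outbox helpers and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    id          INTEGER PRIMARY KEY,
    senderId    INTEGER NOT NULL,
    recipientId INTEGER NOT NULL,
    body        TEXT
);
-- Both lookups filter on a user id, so index both columns.
CREATE INDEX idx_messages_sender    ON messages(senderId);
CREATE INDEX idx_messages_recipient ON messages(recipientId);
INSERT INTO messages (senderId, recipientId, body) VALUES
    (1, 2, 'hi'), (2, 1, 'hello back'), (3, 1, 'ping');
""")


def inbox(user_id):
    """Bodies of messages received by the user, oldest first."""
    rows = conn.execute(
        "SELECT body FROM messages WHERE recipientId = ? ORDER BY id", (user_id,)
    ).fetchall()
    return [r[0] for r in rows]


def outbox(user_id):
    """Bodies of messages sent by the user, oldest first."""
    rows = conn.execute(
        "SELECT body FROM messages WHERE senderId = ? ORDER BY id", (user_id,)
    ).fetchall()
    return [r[0] for r in rows]
```

No direction column is needed at all here: the same row is "out" for the sender and "in" for the recipient.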
If the messages are really the same entity, differing only in a single attribute value, use a single table. If you want a subset available in certain routines, create single-table views to get only the inbound or outbound messages.
If the messages are different entities, and particularly if they validate against different sets of userids, you'll need two tables.
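For the single-table case, the view idea might look like the sketch below (Python's sqlite3 module again). The view names inbound_messages and outbound_messages, the subjects helper and the sample rows are made up, and a CHECK constraint stands in for MySQL's ENUM because SQLite has no ENUM type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    id        INTEGER PRIMARY KEY,
    subject   TEXT,
    -- CHECK constraint used here in place of ENUM('in', 'out')
    direction TEXT NOT NULL CHECK (direction IN ('in', 'out'))
);
CREATE VIEW inbound_messages AS
    SELECT id, subject FROM messages WHERE direction = 'in';
CREATE VIEW outbound_messages AS
    SELECT id, subject FROM messages WHERE direction = 'out';
INSERT INTO messages (subject, direction) VALUES
    ('Welcome', 'in'), ('Re: Welcome', 'out'), ('Reminder', 'in');
""")


def subjects(view_name):
    """Subjects visible through one of the two views, oldest first."""
    # The name is interpolated into the SQL, so only allow the known views.
    if view_name not in ("inbound_messages", "outbound_messages"):
        raise ValueError(view_name)
    rows = conn.execute(f"SELECT subject FROM {view_name} ORDER BY id").fetchall()
    return [r[0] for r in rows]
```

Routines that only care about one direction can query the view as if it were its own table, while everything is still stored once.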