Rails - Common fields/report data among multiple models - STI, hstore, or split tables?
I have a rails app in which I have a group of models (let's call them Events) that have some fields in common (date, title, user_id), but then I need some "subtypes". A SalesEvent might have a article_id and an amount. An InterviewEvent might have a comments field. And so on.
I know 3 business requirements I need to meet:
- in some occasions I'll want to frame the Events as a whole (i.e. "get all the Events for this user, and sort them chronologically, grouped in months")
- in other occasions I will need only the "subtypes" ("get all the articles sold by this user").
- the number of subtypes can be moderately high (still TBD, but we estimate around 20, depending on user feedback)
I'm pondering about how to structure the tables to support this model. I came out with 5 possible ways to model this, but each one has its own drawbacks.
- Option A: Separate tables - sales_events and interview_events. This would make 2) very simple, and 3) feasible, but 1) would be very cumbersome to implement.
- Option B: Single table inheritance. This would solve 1) and 2) more or less easily, but but has the issue of requiring more and more nullable fields, which doesn't play well with 3)
- Option C: Using hstore - Since we're using Postgres in production, we could use hstore - we would have a "data" field governed by a "type" string field. This would solve 1), 2) and 3), but ties us to postgresql, and we would implement a key business object in a technology we are not very familiar with. I'd rather avoid that if possible.
- Option D: events table with polymorphic link to ***_event_data. We would basically have an events table with a type and event_data_id, and then we would have sale_event_data, interview_event_data, etc. This satisfies 1) and 3) well, but 2) is a bit weak than in other approaches, since there will be lots of joins involved in linking the events with their data.
- Option E: Sale has_one :event. This does the same as Option D, except that the "link to the other" is on the "data" part. It also solves 1) and 3), and also involves some joins in 2), but it seems a bit more "clean"; there are no polymorphic associations here, just "regular" sql ones.
Right now I'm inclined to use Option E. But I'd like to know if anyone sees an obvious disadvantage on it, or a greater benefit in one of the other options, or a better option that I didn't think of.
I have used almost all your suggested options. While I would eliminate options A, B and D for the following reasons, I can't talk about C because I don't know hstore and don't use Postgres:
- Option A: Separate tables, as you said, would be very difficult to maintain. Each time you would want to change the structure of events, you'd have to do it on all the sub_events tables.
- Option B: Single table inheritance, I have used it a lot and dropped it. I felt like a big design drawback between what you see in the database and what your models look like. Lots of nil fields also.
- Option D: events table with polymorphic link to *_event_data. Polymorphic tables are not meant for that purpose. They are a way to have different type fields in a model so you could reference it without specifying the type explicitly.
Option E seems OK, but where the foreign key should be stored? Hard to tell and may lead to difficult to maintain situations.
Personally, I would go with the code I want to write, what would make using it and reading it later easier. I like things when they are more specific. And I would simply change the way I name my models so that it satisfies my needs. You have to be creative!
I would rather write something like that:
conference.event_information.users OR sales_event.settings.title OR interview.shared_information.comments OR event.interview_details.starting_at
With all that examples, I'd use classical has_many and belongs_to relationships.
I think that the whole concept of data types and inheritance can put you in situations where it does not solve problems or make things clearer. Sometimes you just need to see things a little differently.
I hope it helps.
Rails doesn't support Multiple Table Inheritance by default, but it turns out it's possible to model it pretty closely.
See this article:
Basically, it uses a module to "modify" Option D. I'm still pondering about Wawa Loo's answer, but this one is also worth considering.
EDIT: more on multiple-table inheritance: a gem called "citier" http://peterhamilton.github.com/citier/index.html
EDIT2: I ended up using multiple_table_inheritance:
But I'm not very satisfied with the results. This is probably one of those places where having the business data tightly coupled with the persistence policies (as ActiveRecord does) doesn't help very much. It does the job sufficiently well, but it is not perfect (notably, instance methods can be "inherited", but not class methods. Things like scopes have to be repeated/mixed in separatedly on each subclass).