Core Data model design — search vs relationships?
I'm familiar with Core Data basics and have dabbled with it, but I have not built any major apps. Now I need to plan one. The question is less about Core Data specifically and more about data design in general, though I will be implementing it with Core Data on the iPhone, which matters for performance.
Imagine I am making an email app, where emails are the core object. I need to provide multiple views into the email store: search by user as well as many other criteria: say, "all emails with more than two recipients", "all emails where subject is longer than X", "all emails containing word X" etc.
Some objects, like people (senders/recipients), lend themselves naturally to being modeled as first-class objects, so I could do that and just create many-to-many relations between people and emails. Other searches, such as some of the examples above, are more artificial and there is no natural way to model them. However, I am able to enumerate the searches in advance, i.e., I know beforehand what the criteria will be.
So, to do things like "emails with >2 recipients" and "emails where subject is longer than X", I think I have two strategies:
1) model these as a special "search" object, and create many-to-many relations between emails and search objects when inserting new objects into store so it is a simple join query when searching;
2) not model anything beyond the core email object and just do searches with predicates from the store at runtime.
My question is:
based on your Core Data instincts, how big is the difference between these two strategies from a performance perspective? My gut tells me #1 will always be faster, but if the difference is only 10%, I am willing to take the performance hit in order to be more flexible with #2. But if #2 will be 200% slower, I need to put more work into modeling the search object and essentially pre-generating all the search results.
I know the exact answer will depend on specifics of data, but there must be a gut feeling you have :) Let's say there are on the order of tens of thousands, but not millions, of content objects, and each record is a few paragraphs of content text with several fields of metadata.
Typically, I would recommend going with strategy two and only spending time researching and developing other techniques if you actually run into performance issues during testing. Core Data is often faster than people think, especially on the iPhone.
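As a rough illustration of strategy two, here is a minimal sketch of building the fetch requests at runtime. The entity name "Email", the to-many relationship "recipients", the string attribute "body", and the search word "invoice" are all assumptions made for the example, not names from the question, so adjust them to the real model.

```swift
import CoreData

// Hypothetical model: an "Email" entity with a to-many "recipients"
// relationship and a string attribute "body".
func makeEmailFetchRequest(matching predicate: NSPredicate) -> NSFetchRequest<NSManagedObject> {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Email")
    request.predicate = predicate
    request.fetchBatchSize = 50  // fetch rows in batches instead of all at once
    return request
}

// "All emails with more than two recipients"
let manyRecipients = makeEmailFetchRequest(
    matching: NSPredicate(format: "recipients.@count > 2"))

// "All emails containing word X" (case- and diacritic-insensitive substring match)
let containsWord = makeEmailFetchRequest(
    matching: NSPredicate(format: "body CONTAINS[cd] %@", "invoice"))
```

Each request would then be passed to the fetch method of your managed object context. I have left out the subject-length search on purpose: I am not certain every store type can evaluate a string length inside a fetch predicate, so one option is to save the length as its own integer attribute when the email is inserted and compare against that.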
However, if you are able to determine all the possible searches ahead of time, that does give you an advantage. It sounds like, as each email is created, you would check it and add it to all the appropriate "search" objects. My gut feeling is that strategy one would be significantly faster, especially at tens of thousands of email objects.
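To make the "check it at insert time" idea concrete, here is a small sketch in plain Swift with no Core Data involved. The types Email, SavedSearch and SearchIndex, the search names, and the subject-length threshold of 20 are all hypothetical and only there to show the shape of the approach.

```swift
struct Email {
    let id: Int
    let subject: String
    let recipients: [String]
    let body: String
}

// One entry per search that is known ahead of time.
struct SavedSearch {
    let name: String
    let matches: (Email) -> Bool
}

final class SearchIndex {
    private let searches: [SavedSearch]
    private var emailIDsBySearch: [String: [Int]] = [:]

    init(searches: [SavedSearch]) {
        self.searches = searches
    }

    // Called once per new email: the matching work is paid here, at insert time.
    func insert(_ email: Email) {
        for search in searches where search.matches(email) {
            emailIDsBySearch[search.name, default: []].append(email.id)
        }
    }

    // Reading a search is just a lookup of the pre-computed result.
    func emailIDs(for searchName: String) -> [Int] {
        return emailIDsBySearch[searchName] ?? []
    }
}

let index = SearchIndex(searches: [
    SavedSearch(name: "manyRecipients") { $0.recipients.count > 2 },
    SavedSearch(name: "longSubject") { $0.subject.count > 20 }
])
index.insert(Email(id: 1, subject: "Hi", recipients: ["a", "b", "c"], body: "..."))
```

In a Core Data model, the dictionary would presumably become a "search" entity with a to-many relationship to emails, so reading a search means following that relationship instead of evaluating a predicate over every email. The trade-off is that adding a new search later means re-checking all the existing emails.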