How many foreign keys is too many?

After running across this article: http://diovo.com/2008/08/are-foreign-keys-really-necessary-in-a-database-design/

It seems like a good idea to use foreign keys when designing a database. But when are you using too many?

For example, suppose I have a main table used to store a list of machinery part information that other programs make reference to with the following columns:

  1. ID
  2. Name
  3. Colour
  4. Price
  5. Measurement Units
  6. Category
  7. etc...

Should I be making tables containing a list of all possible colours, units and categories and then setting these as foreign keys to the corresponding columns in my machine part info table? At what point would the benefit of using foreign keys out weight the fact that I'm making all these extra tables and relationships?

Answers


Any attribute for which you want to be able to state, with certainty, that there are only known, valid values present in the database should be protected with a foreign key. Otherwise, you can only hope to catch invalid values in your application code and whatever interfaces are created in the future.

It is NOT a bad thing to have more tables and relationships. The only issue -- and it usually is not one -- has to do with the overhead of maintaining the indexes that are used in enforcing those relationships. Until you experience performance issues you should create a foreign key relationship for every column that "should" have one (because the values need to be validated against a list).

The performance considerations would have to be pretty dire before I would be willing to sacrifice correctness for performance.


Every Design is a compromise of competing goals, so there are very few simple answers (except the wrong ones).

I would certainly put discrete measures such as name, color, category, measure unit, etc.. in their own key tables. Variable measures (cost, number of units ,etc..) not so much, unless you have units in standard size packages (i.e. only 1, 6, 12, etc..)


The simplest way to design a database is to start with the requirements. In one classical methodology, the requirements are summarized in an ER (Entity-Relationship) model. In this model, relationships between entities are not invented, they are discovered. If they lie within the scope of the information the database is supposed to cover, then they are part of the model. Period.

From there, when you turn to database design, you already know what relationships you need. You have a few decisions to make about the structure of your tables, but almost all the foreign keys that reference a primary key are a direct consequence of the requirements.

Of course, if you are at liberty to change the requirements as you go through the design process, then you can do anything you want.


Dimensional modelling covers all points of your question well. Having too many foreign key relationships can make query performance suffer. Kimball's Group Reader is a great introduction to Dimensional Design and how to translate customer requirements to a schema.

http://en.wikipedia.org/wiki/Dimensional_modeling

The main question to ask is 'how constrained does the data need to be?' Concerning the color of machine parts, I'd assume, it would be in everyone's best interest not to have burnt ciena and camomille as color options. So, a look up table for these would be best.


Need Your Help

Will automatic return type deduction work for main?

c++ c++14

Will I be able to do the following for the main function in C++1y (C++14):

always accept webRTC webcam request

google-chrome webcam webrtc

I'm building a web application that uses the computer's webcam via webRTC.