How to design a database schema for storing text in multiple languages?

We have a PostgreSQL database. And we have several tables which need to keep certain data in several languages (the list of possible languages is thankfully system-wide defined).

For example lets start with:

create table blah (id serial, foo text, bar text);

Now, let's make it multilingual. How about:

create table blah (id serial, foo_en text, foo_de text, foo_jp text,
                              bar_en text, bar_de text, bar_jp text);

That would be good for full-text search in Postgres. Just add a tsvector column for each language.

But is it optimal? Maybe we should use another table to keep the translations? Like:

create table texts (id serial, colspec text, obj_id int, language text, data text);

Maybe, just maybe, we should use something else - something out of the SQL world? Any help is appreciated.

Answers


I think it is best if you create two tables. One for languages, one for ids and so on. first_table( id ) second_table( s_id, id_first_table, language_id, language_text)


Here's a great article the Mozilla developers put together on making their database multilingual. It's specific to CakePHP, but the info can easily be applied to other systems. Also, note that it makes SQL queries significantly more complex, which is a drawback. That will generally be true regardless of your i18n implementation, though.

  1. Part 1
  2. Part 2
  3. Part 3

Need Your Help

Sensitive data in Viewstate?

c# asp.net viewstate

I need to store sensitive data across few pages (run over https) per session.