Relational Database Normalization Issues

My website is a job seek site. There are three types of users: User, Employee or Admin.

  • User can search and apply for a job,
  • Employee can post a job, browser resumes,
  • Admin is to manage the site.

Here are all the tables that I defined.

-- Users table, users = jobseekers, containing jobseekers info  
DROP TABLE IF EXISTS users;
CREATE TABLE users (
  user_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(20) NOT NULL,
  last_name VARCHAR(40) NOT NULL,
  email VARCHAR(80) NOT NULL,
  pass CHAR(60) NOT NULL,
  user_phone VARCHAR(11) NOT NULL,
  user_address VARCHAR(250) NOT NULL,
  active CHAR(32) NULL,
  last_login_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                                     ON UPDATE CURRENT_TIMESTAMP,
  last_login_ip VARCHAR(15) NOT NULL,
  registration_time DATETIME NOT NULL,
  registration_ip VARCHAR(15) NOT NULL,
  PRIMARY KEY (user_id),
  UNIQUE KEY (email),
  INDEX login (email, pass)
) ENGINE = INNODB;


-- Employers table, containing employers info
DROP TABLE IF EXISTS employers;
CREATE TABLE employers (
  employer_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(20) NOT NULL,
  last_name VARCHAR(40) NOT NULL,
  company_name VARCHAR(80) NOT NULL,
  email VARCHAR(80) NOT NULL,
  pass CHAR(40) NOT NULL,
  employer_phone VARCHAR(11) NOT NULL,
  employer_mobile VARCHAR(11),
  employer_address VARCHAR(250) NOT NULL,
  active CHAR(32) NULL,
  last_login_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                                     ON UPDATE CURRENT_TIMESTAMP,
  last_login_ip VARCHAR(15) NOT NULL,
  registration_time DATETIME NOT NULL,
  registration_ip VARCHAR(15) NOT NULL,
  PRIMARY KEY (employer_id),
  UNIQUE KEY (email),
  INDEX login (email, pass)
) ENGINE = INNODB;


-- Administrators table, containing site administrators info
-- Note: move created_time after last_login_time, otherwise SQL error #1293
DROP TABLE IF EXISTS administrators;
CREATE TABLE administrators (
   admin_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
   first_name VARCHAR(20) NOT NULL,
   last_name VARCHAR(40) NOT NULL,
   email VARCHAR(80) NOT NULL,
   pass CHAR(40) NOT NULL,
   last_login_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                                     ON UPDATE CURRENT_TIMESTAMP,
   last_login_ip VARCHAR(15) NOT NULL,
   created_time DATETIME NOT NULL,
   PRIMARY KEY (admin_id)
) ENGINE = INNODB;


-- CVs table, containing CV info
DROP TABLE IF EXISTS cvs;
CREATE TABLE cvs (
   cv_id INT(10) UNSIGNED NOT NULL auto_increment,
   cv_name VARCHAR(60) NOT NULL,
   user_id INT UNSIGNED NOT NULL,
   description VARCHAR(80),  
   PRIMARY KEY (cv_id)
) ENGINE = INNODB;


-- Jobs table, containing job information
-- Note: must use MYISAM to support Fulltext search
DROP TABLE IF EXISTS jobs;
CREATE TABLE jobs (
   job_id INT(10) UNSIGNED NOT NULL auto_increment,
   job_title VARCHAR(30) NOT NULL,
   employer_id INT UNSIGNED NOT NULL,
   company_name VARCHAR(80) NOT NULL,
   description TEXT NOT NULL,
   town VARCHAR(30) NOT NULL,
   county VARCHAR(30) NOT NULL,
   contact_name VARCHAR(40) NOT NULL,
   contact_phone VARCHAR(11) NOT NULL,
   contact_email VARCHAR(80) NOT NULL,
   salary SMALLINT(5) UNSIGNED NOT NULL,
   confirm TINYINT(1) UNSIGNED NOT NULL default 0,
   posted_time TIMESTAMP NOT NULL,
   deadline INT(10) UNSIGNED NOT NULL,
   job_status SET('open', 'closed') NOT NULL,  
   employer_paid SET('yes', 'no') NOT NULL,
   PRIMARY KEY (job_id),
   FULLTEXT (job_title, description)
) ENGINE = MYISAM;


-- Jobs users applied
DROP TABLE IF EXISTS jobs_applied;
CREATE TABLE jobs_applied (
   jobs_applied_id INT(10) UNSIGNED NOT NULL auto_increment,
   user_id INT UNSIGNED NOT NULL,
   cv_id INT(10) UNSIGNED NOT NULL,
   cv_name VARCHAR(60) NOT NULL,
   job_id INT(10) unsigned NOT NULL,
   job_title VARCHAR(30) NOT NULL,
   company_id INT(10) unsigned NOT NULL,
   company_name VARCHAR(80) NOT NULL,
   applied_time TIMESTAMP NOT NULL,
   PRIMARY KEY (jobs_applied_id)
) ENGINE = INNODB;


-- Reports table, containing info to produce site reports
DROP TABLE IF EXISTS reports;
CREATE TABLE reports (
   report_id INT(10) UNSIGNED NOT NULL auto_increment,
   user_id INT UNSIGNED NOT NULL,
   employer_id INT UNSIGNED NOT NULL,
   job_id INT(10) unsigned NOT NULL,
   job_title VARCHAR(30) NOT NULL,
   job_posttime TIMESTAMP NOT NULL,
   content VARCHAR(250) NOT NULL,
   report_time INT(10) UNSIGNED NOT NULL,
   PRIMARY KEY (report_id)
) ENGINE = INNODB;

Some people say the tables I designed are exactly what I must not do in a relational database because they are full of duplicates. I don't get it. Anyone take a look at my design please and point out the design errors?

Answers


The biggest issue that jumps out at me (although there are a few) is that half of the Employer and Administrator fields are really just part of User/Authorization management. Keep the concept of 'Seekers' and 'Employers' and 'Administrators' separate (separate tables are fine and will actually help with FK relations), but there should be a unified 'Users/Accounts' relation that they relate to.

For instance, imagine the three "role discriminating" tables. A User can be associated with zero or all three through DRI (SQL does not inherently support distributed FKs) so business rules should be in place to ensure that right associations - but why couldn't an Employer also be a Seeker?

These tables can contain additional information as relate to the specific role (Seeker, Employer, Admin). The benefit of keeping separate tables (instead of just a discriminating role column) is that;

  1. Additional columns can be added to the refined type as required;
  2. FKs can be established along role_id, instead of needing a two-part (user_id,role).
Seekers (people looking for jobs)
---
seeker_id (PK)
user_id (FK Users, not null)

Employers
---
employer_id (PK)
user_id (FK Users, not null)

Administrators
---
admin_id (PK)
user_id (FK Users, not null)

Users/Accounts
---
- All the data that relates to login/authorization information such as username
- and password salt/hash, account contact e-mail, etc.
- You probably want to separate the authentication such as login/auth information
- and additional details, such as "last login from" or "registered from", etc.

Most "CMS" systems will already have authentication and authorization schemes in-place.


Other things that should be normalized (but strike me far less than the duplicate discussed above) are "Contact Information" and "Companies/Locations". It might also be worthwhile separating a "Job" from a "JobListing". Also, there appear to be some fields that are simply duplicated without reason:

company_id INT(10) unsigned NOT NULL,
company_name VARCHAR(80) NOT NULL,

Since the relation already has the company_id for a FK/Join, the company_name column is simply duplicate data and should be removed.


You need to normalize your database schema / design to avoid storing duplicate data.

See this article on Normalization with Examples: Description of the database normalization basics

In your particular case, you should separate out the Company, Contact, JobLocation, JobType, JobIndustry etc into their own tables and use relationships to link them.


Need Your Help

Possible to partition five elements by median with six comparisons?

algorithm sorting selection median

Given five random elements, it's possible to find the median using just six comparisons. But I have an extra requirement in that the following condition is also satisfied:

Why not resize and clear works in GotW 54?

c++ vector deque gotw

Referring to article Gotw 54 by HerbSutter, he explains about