Database normalization is the process of structuring a relational database to reduce data redundancy and improve data integrity. Normal forms are a series of guidelines that help achieve this goal by ensuring that tables are logically organized and free from undesirable characteristics such as data duplication, update anomalies, and insertion/deletion anomalies. The key normal forms are:
- First Normal Form (1NF),
- Second Normal Form (2NF)
- Third Normal Form (3NF),
- Boyce-Codd Normal Form (BCNF).
First Normal Form (1NF)
Data must be stored in atomic (indivisible) values, i.e. each column should contain values of a single type and each row must be unique
-- Table that violates 1NF
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
customer_name VARCHAR(255),
items VARCHAR(255) -- Storing multiple items as a comma-separated string
);
INSERT INTO Orders (order_id, customer_name, items) VALUES
(1, 'Alice', 'Apple,Banana'),
(2, 'Bob', 'Orange');
-- Violation: The 'items' column contains non-atomic values-- Correcting 1NF violation by normalizing the data
CREATE TABLE Orders (
order_id INT,
customer_name VARCHAR(255),
item VARCHAR(255),
PRIMARY KEY (order_id, item)
);
INSERT INTO Orders (order_id, customer_name, item) VALUES
(1, 'Alice', 'Apple'),
(1, 'Alice', 'Banana'),
(2, 'Bob', 'Orange');Second Normal Form (2NF)
Besides all the requirements of 1NF, all non-key columns must be fully dependent on the entire primary key (no partial dependencies).
-- Table that violates 2NF
CREATE TABLE OrderDetails (
order_id INT,
product_id INT,
product_name VARCHAR(255),
PRIMARY KEY (order_id, product_id)
);
-- Violation: 'product_name' depends only on 'product_id' (partial dependency)-- Normalized tables that satisfy 2NF
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
order_date DATE
);
CREATE TABLE Products (
product_id INT PRIMARY KEY,
product_name VARCHAR(255)
);
CREATE TABLE OrderProducts (
order_id INT,
product_id INT,
PRIMARY KEY (order_id, product_id),
FOREIGN KEY (order_id) REFERENCES Orders(order_id),
FOREIGN KEY (product_id) REFERENCES Products(product_id)
);Third Normal Form (3NF)
Besides all requirements of 2NF, there should be no transitive dependencies (i.e., non-key attributes must not depend on other non-key attributes).
-- Table that violates 3NF
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
department_id INT,
department_name VARCHAR(255)
);
-- Violation: 'department_name' depends on 'department_id', which depends on 'employee_id'-- Corrected 3NF structure
CREATE TABLE Departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(255)
);
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
department_id INT,
FOREIGN KEY (department_id) REFERENCES Departments(department_id)
);Boyce-Codd Normal Form (BCNF)
Besides all requirements of 3NF, for every functional dependency, the left-hand side must be a super key (all determinants must be candidate keys).
-- Table that violates BCNF
CREATE TABLE TeachingAssignments (
teacher_id INT,
subject VARCHAR(255),
department VARCHAR(255),
PRIMARY KEY (teacher_id, subject)
);
-- Violation: 'subject' determines 'department', but 'subject' is not a candidate key-- Corrected BCNF structure
CREATE TABLE Subjects (
subject VARCHAR(255) PRIMARY KEY,
department VARCHAR(255)
);
CREATE TABLE TeachingAssignments (
teacher_id INT,
subject VARCHAR(255),
PRIMARY KEY (teacher_id, subject),
FOREIGN KEY (subject) REFERENCES Subjects(subject)
);Key Differences Between 2NF, 3NF, and BCNF
| Normal Form | Dependency Type Addressed | Example Violation |
|---|---|---|
| 2NF | Partial Dependency on Composite Keys | ’ProductName’ depends only on ‘ProductID’ |
| 3NF | Transitive Dependency on Non-Key Columns | ’DepartmentName’ depends on ‘DepartmentID’ |
| BCNF | Non-Candidate Key Dependency | ’Subject’ determines ‘Department’ without being a key |