Database normalization is the process of structuring a relational database to reduce data redundancy and improve data integrity. Normal forms are a series of guidelines that help achieve this goal by ensuring that tables are logically organized and free from undesirable characteristics such as data duplication, update anomalies, and insertion/deletion anomalies. The key normal forms are:

  • First Normal Form (1NF),
  • Second Normal Form (2NF)
  • Third Normal Form (3NF),
  • Boyce-Codd Normal Form (BCNF).

First Normal Form (1NF)

Data must be stored in atomic (indivisible) values, i.e. each column should contain values of a single type and each row must be unique

-- Table that violates 1NF
CREATE TABLE Orders (
    order_id INT PRIMARY KEY,
    customer_name VARCHAR(255),
    items VARCHAR(255) -- Storing multiple items as a comma-separated string
);
 
INSERT INTO Orders (order_id, customer_name, items) VALUES
(1, 'Alice', 'Apple,Banana'),
(2, 'Bob', 'Orange');
 
-- Violation: The 'items' column contains non-atomic values
-- Correcting 1NF violation by normalizing the data
CREATE TABLE Orders (
    order_id INT,
    customer_name VARCHAR(255),
    item VARCHAR(255),
    PRIMARY KEY (order_id, item)
);
 
INSERT INTO Orders (order_id, customer_name, item) VALUES
(1, 'Alice', 'Apple'),
(1, 'Alice', 'Banana'),
(2, 'Bob', 'Orange');

Second Normal Form (2NF)

Besides all the requirements of 1NF, all non-key columns must be fully dependent on the entire primary key (no partial dependencies).

-- Table that violates 2NF
CREATE TABLE OrderDetails (
    order_id INT,
    product_id INT,
    product_name VARCHAR(255),
    PRIMARY KEY (order_id, product_id)
);
 
-- Violation: 'product_name' depends only on 'product_id' (partial dependency)
-- Normalized tables that satisfy 2NF
CREATE TABLE Orders (
    order_id INT PRIMARY KEY,
    order_date DATE
);
 
CREATE TABLE Products (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(255)
);
 
CREATE TABLE OrderProducts (
    order_id INT,
    product_id INT,
    PRIMARY KEY (order_id, product_id),
    FOREIGN KEY (order_id) REFERENCES Orders(order_id),
    FOREIGN KEY (product_id) REFERENCES Products(product_id)
);

Third Normal Form (3NF)

Besides all requirements of 2NF, there should be no transitive dependencies (i.e., non-key attributes must not depend on other non-key attributes).

-- Table that violates 3NF
CREATE TABLE Employees (
    employee_id INT PRIMARY KEY,
    department_id INT,
    department_name VARCHAR(255)
);
 
-- Violation: 'department_name' depends on 'department_id', which depends on 'employee_id'
-- Corrected 3NF structure
CREATE TABLE Departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(255)
);
 
CREATE TABLE Employees (
    employee_id INT PRIMARY KEY,
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES Departments(department_id)
);

Boyce-Codd Normal Form (BCNF)

Besides all requirements of 3NF, for every functional dependency, the left-hand side must be a super key (all determinants must be candidate keys).

-- Table that violates BCNF
CREATE TABLE TeachingAssignments (
    teacher_id INT,
    subject VARCHAR(255),
    department VARCHAR(255),
    PRIMARY KEY (teacher_id, subject)
);
 
-- Violation: 'subject' determines 'department', but 'subject' is not a candidate key
-- Corrected BCNF structure
CREATE TABLE Subjects (
    subject VARCHAR(255) PRIMARY KEY,
    department VARCHAR(255)
);
 
CREATE TABLE TeachingAssignments (
    teacher_id INT,
    subject VARCHAR(255),
    PRIMARY KEY (teacher_id, subject),
    FOREIGN KEY (subject) REFERENCES Subjects(subject)
);

Key Differences Between 2NF, 3NF, and BCNF

Normal FormDependency Type AddressedExample Violation
2NFPartial Dependency on Composite Keys’ProductName’ depends only on ‘ProductID’
3NFTransitive Dependency on Non-Key Columns’DepartmentName’ depends on ‘DepartmentID’
BCNFNon-Candidate Key Dependency’Subject’ determines ‘Department’ without being a key