MySQL Normalization

Normalization is the process of organizing a database's tables and columns to reduce data redundancy and improve data integrity. It involves dividing large tables into smaller, related tables and defining proper relationships between them. Normalization follows a series of rules called Normal Forms.

Why Normalize?

  • Eliminates redundancy — The same data is not stored in multiple places.
  • Prevents anomalies — Inserting, updating, or deleting data does not create inconsistencies.
  • Saves storage — No duplicate data means less disk usage.
  • Improves maintainability — Updating one record updates it everywhere it is referenced.

Anomalies in Unnormalized Data

Consider this single flat table:

student_id | student_name | course_name | teacher_name | teacher_phone
-----------+--------------+-------------+--------------+--------------
1          | Riya Sharma  | Maths       | Mr. Verma    | 9876543210
2          | Anil Joshi   | Maths       | Mr. Verma    | 9876543210
3          | Riya Sharma  | Science     | Ms. Pillai   | 9123456789

Problems:

  • Update anomaly — Changing Mr. Verma's phone requires updating multiple rows.
  • Insert anomaly — A new teacher cannot be added without assigning them a student.
  • Delete anomaly — Deleting the only student in Science removes Ms. Pillai's record entirely.

First Normal Form (1NF)

A table is in 1NF when:

  • Each column contains atomic (indivisible) values — no lists or multiple values in one cell.
  • Each column contains values of a single type.
  • Each row is unique (has a primary key).

Violation Example

student_id | student_name | courses
-----------+--------------+------------------
1          | Riya Sharma  | Maths, Science

1NF Compliant

student_id | student_name | course
-----------+--------------+---------
1          | Riya Sharma  | Maths
1          | Riya Sharma  | Science

Second Normal Form (2NF)

A table is in 2NF when it is in 1NF and every non-key column depends on the entire primary key, not just part of it. This applies to tables with composite primary keys.

Violation (Partial Dependency)

-- Primary key: (student_id, course_id)
student_id | course_id | student_name | course_name
-----------+-----------+--------------+------------
1          | 101       | Riya Sharma  | Maths

student_name depends only on student_id, not on course_id. This is a partial dependency — a 2NF violation.

2NF Solution

-- Students table
student_id | student_name
-----------+-------------
1          | Riya Sharma

-- Courses table
course_id | course_name
----------+------------
101       | Maths

-- Enrollments table
student_id | course_id
-----------+----------
1          | 101

Third Normal Form (3NF)

A table is in 3NF when it is in 2NF and every non-key column depends directly on the primary key — not on another non-key column. This removes transitive dependencies.

Violation (Transitive Dependency)

emp_id | emp_name    | dept_id | dept_name
-------+-------------+---------+----------
1      | Ravi Kumar  | 10      | IT

dept_name depends on dept_id, which depends on emp_id. That is a transitive dependency.

3NF Solution

-- Employees table
emp_id | emp_name   | dept_id
-------+------------+--------
1      | Ravi Kumar | 10

-- Departments table
dept_id | dept_name
--------+----------
10      | IT

Summary of Normal Forms

Normal FormRequirement
1NFAtomic values, no repeating groups, unique rows
2NF1NF + no partial dependencies on composite keys
3NF2NF + no transitive dependencies
BCNFStricter version of 3NF for edge cases

Key Points

  • Normalization removes data redundancy and prevents update, insert, and delete anomalies.
  • 1NF requires atomic values and a unique primary key per row.
  • 2NF removes partial dependencies — all columns must depend on the full primary key.
  • 3NF removes transitive dependencies — columns must depend directly on the primary key only.
  • Most production databases target 3NF; some reporting systems intentionally denormalize for read performance.

Leave a Comment

Your email address will not be published. Required fields are marked *