Normalization is a set of rules that help design relational databases efficiently by reducing redundancy and improving data integrity. These rules are organized into "normal forms".
Rule: Every column should contain atomic (indivisible) values, and each row should be unique.
CustomerId | Name | PhoneNumbers |
---|---|---|
1 | Alice | 123-456, 789-012 |
Here, PhoneNumbers
stores multiple values in a single field.
CustomerId | Name | PhoneNumber |
---|---|---|
1 | Alice | 123-456 |
1 | Alice | 789-012 |
Absolutely! Here's a clearer and slightly more real-world rewrite of Second Normal Form (2NF) with a better example and deeper explanation:
A table is in 2NF if:
- It is already in 1NF (no repeating groups, atomic values),
- Every non-key column is fully dependent on the entire primary key — not just part of it.
This rule mostly applies when you have composite primary keys (i.e., two or more columns making up the key).
Imagine we have a table that records which products are in which orders, along with product information:
OrderId | ProductId | ProductName | UnitPrice | Quantity |
---|---|---|---|---|
1001 | 2001 | Coffee | 5.99 | 2 |
1001 | 2002 | Tea | 3.49 | 1 |
Here, the composite key is (OrderId, ProductId)
— together they identify the row.
But:
ProductName
andUnitPrice
only depend onProductId
, not the full key.- This causes redundancy (if "Coffee" appears in multiple orders, its name and price are repeated).
Split the table into two:
OrderId | ProductId | Quantity |
---|---|---|
1001 | 2001 | 2 |
1001 | 2002 | 1 |
ProductId | ProductName | UnitPrice |
---|---|---|
2001 | Coffee | 5.99 |
2002 | Tea | 3.49 |
Now:
- Each non-key column in
OrderProduct
depends only on the full key (OrderId + ProductId
). - Product-related info lives in a separate table.
- Prevents duplication of product data across many orders
- Easier to update (change price once)
- Reduces potential inconsistencies
Rule: Must be in 2NF, and all columns must depend only on the key, not on other non-key columns.
CustomerId | Name | ZipCode | City |
---|---|---|---|
1 | Alice | 12345 | Oslo |
Here, City
depends on ZipCode
, not on CustomerId
.
Split into:
Customer
CustomerId | Name | ZipCode |
---|---|---|
1 | Alice | 12345 |
ZipCodeInfo
ZipCode | City |
---|---|
12345 | Oslo |
Normal Form | Fixes Problem With |
---|---|
1NF | Repeating groups / multivalued fields |
2NF | Partial dependencies (on part of a key) |
3NF | Transitive dependencies |
Normalization helps build databases that are:
- Easier to maintain
- Less redundant
- More consistent
However, denormalization (intentionally breaking these rules) is sometimes useful for performance.