Computer

Data Normalization

Data Normalization

Database normalization stands for a series of steps followed to obtain a database design so that it allows for consistent storage and efficient access of data in a relational database. In other words, data normalization is the process of efficiently organizing data in a database. The steps involved in data normalization are followed to- (1) eliminate redundant data (e.g., storing the same data in more than one table), (2) ensure data dependencies make sense. Data normalization reduces data redundancy and chances of occurring data inconsistency. The process also reduces the amount of space a database consumes and ensures that data is stored logically in its specific place.

There are a number of guidelines set to ensure if a database is normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization) through five. Most commonly in most applications normal forms such as 1NF, 2NF and 3NF are seen along with the occasional 4NF. The fifth normal form is rarely seen.

First Normal Form (1NF): First normal form (1NF) sets the very basic rules for an organized database- (a) Eliminate duplicative columns from the same table and (b) Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key). This form requires that all column values in a table are atomic. In other words, this normalization form eliminates repeating groups by putting each into a separate table and connecting them with a primary key- foreign key relationship.

Second Normal Form (2NF): Second normal form (2NF) addresses the concept of removing duplicative data- (a) Meet all the requirements of the first normal form; (b) Remove subsets of data that apply to multiple rows of a table and place them in separate tables and © Create relationships between these new tables and their predecessors through the use of foreign keys. This form requires that there are no non-trivial functional dependencies of a non-key attribute on a part of a candidate key.

Third Normal Form (3NF): Third normal form (3NF) requires to- (a) Meet all the requirements of the second normal form and (b) Remove columns that are not dependent upon the primary key. In other words, it requires that there are no non-trivial functional dependencies of non-key attributes on something else than a superset of a candidate key.

Fourth Normal Form (4NF): Fourth normal form (4NF) requires- (a) Meet all the requirements of the third normal form and (b) A relation is in 4NF if it has no multi-valued dependencies. Or it requires that there are no non-trivial multi-valued dependencies of attribute sets on something else than a superset of a candidate key.

Fifth Normal Form (5NF or PJ/NF): Fifth normal form requires that there are no non-trivial join dependencies that do not follow the key constraints.

Besides, there are also two forms such as:

Boyce-Codd Normal Form (or BCNF): This form requires that there are no non-trivial functional dependencies of attributes on something else than a superset of a candidate key.

Domain-Key Normal Form (or DK/NF): This requires that all constraints follow from the domain and the key constraints.