Please enable JavaScript.
Coggle requires JavaScript to display documents.
Logical Database Design. Normalization (The Functional Dependencies…
Logical Database Design. Normalization
Avoidance of Modification Anomalies
modification anomalies - unexpected side effects that occur when changing the contents of a table with excessive redundancies. A good database design avoids modification anomalies by eliminating excessive redundancies.
some of the problems with this design
This table has insertion anomalies. An insertion anomaly occurs when extra data beyond the desired data must be added to the database.
Remember that a row cannot exist with null values for part of its primary key.
This table has update anomalies. An update anomaly occurs when it is necessary to change multiple rows to modify only a single fact.
This table has deletion anomalies. A deletion anomaly occurs whenever deleting a row inadvertently causes other data to be deleted.
To deal with these anomalies, users may circumvent them (such as using a default primary key to insert a new course) or database programmers may write code to prevent inadvertent loss of data. A better solution is to modify the table design to remove the redundancies that cause the anomalies.
The Normalisation Process
The normalisation process assumes that you start with some informal description of all the data attributes that the database application appears to require; this is often called un-normalised data.
The set of attributes is then tested using criteria defined by each of the normalisation stages. If the data fails the criteria, there is a prescribed procedure for correcting the structure of the data; this inevitably involves the criteria od a additional tables.
The Functional Dependencies
Normalization is a process for evaluating and correcting table structures to minimize data redundancies, thereby reducing the likelihood of data anomalies. The normalization process involves assigning attributes to tables based on the concept of determination. Before outlining the normalization process, it is a good idea to review the concepts of determination and functional dependence.
Functional dependence
The attribute B is fully functionally dependent on the attribute A if each value of A determines one and only one value of B.
example
ProjectID → Title
(read as ProjectID functionally determines Title)
In this case, the attribute ProjectID is known as the determinant attribute, and the attribute Title is known as the dependent attribute.
Functional dependence (generalized definition)
Attribute A determines attribute B (that is, B is functionally dependent on A) if all (generalized definition) of the rows in the table that agree in value for attribute A also agree in value for attribute B.
Fully functional dependence (composite key)
If attribute B is functionally dependent on a composite key A but not on any subset of that composite key, the attribute B is fully functionally dependent on A.
The normalization process works one relation at a time, identifying the dependencies on that relation and normalizing the relation. normalization starts by identifying the dependencies of a given relation and progressively breaking up the relation (table) into a set of new relations (tables) based on the identified dependencies.
Two types of functional dependencies (FDs) that are of special interest in normalization are partial dependencies and transitive dependencies.
A partial dependency exists when there is a functional dependence in which the determinant is only part of the primary key (remember the assumption that there is only one candidate key).
Partial dependencies tend to he straightforward and easy to identify.
A transitive dependency exists when there are functional dependencies such that X → Y, Y → Z, and X is the primary key. In that case, the dependency X → Z is a transitive dependency because X determines the value of Z via Y. Unlike partial dependencies, transitive dependencies are more difficult to identify among a set of data.
there is an effective way to identify transitive dependencies: they occur only when a functional dependence exists among nonprime attributes.
throughout the discussion of the normalization process, the existence of a functional dependence among nonprime attributes will be considered a sign of a transitive dependency. To address the problems related to transitive dependencies, changes to the table structure are made based on the functional dependence that signals the transitive dependency’s existence. Therefore, to simplify the description of normalization, from this point forward the signaling dependency will be called the transitive dependency.
A FD diagram compactly displays the functional dependencies of a particular table. You should arrange FDs to visually group columns sharing the same determinant.
If you prefer, you can list FDs rather than arrange them in a diagram. For large collections of FDs, it is difficult to make a diagram.
The First Normal Form
The First Normal Form (1NF) prohibits nesting or repeating groups in tables. A table not in this form is unnormalized or nonnormalized.
To convert an unnormalized table into 1NF, you replace each value of a repeating group with a row. In a new row, you copy the nonrepeating columns.
A relational table must not contain repeating groups. Normalizing the table structure will reduce the data redundancies. If repeating groups do exist, they must be eliminated by making sure that each row defines a single entity. In addition, the dependencies must be identified to diagnose the normal form. Identification of the normal form lets you know where you are in the normalization process.
A relation is in the First Normal Form (1NF) if it has a fixed number of column, all of whose values are atomic.
The Second Normal Form
Conversion to the Second Normal Form (2NF) occurs only when the 1NF has a composite primary key. If the table in 1NF has a single-attribute primary key, then the table is automatically in 2NF
A relation is in the Second Normal Form if the relation is in the 1NF and each nonkey column depends on all candidate keys, not on a subset of any candidate key
To see if a table is in 2NF, you should look for FDs that violate the definition. An FD in which part of a key determines a nonkey column violates 2NF. If the key contains only one column, the table is in 2NF
To place the table into 2NF, split the original table into smaller tables that satisfy the 2NF definition. In each smaller table, the entire primary key (not part of the primary key) should determine the nonkey columns. The splitting process involves the project operator of relational algebra.
The splitting process should preserve the original table in two ways. First, the original table should be recoverable by using natural join operations on the smaller tables. Second, the FDs in the original table should be derivable from the FDs in the smaller tables. Technically, the splitting process is known as a nonloss, dependency-preserving decomposition.
The Third Normal Form
To eliminate the modification anomalies, the definition of the Third Normal Form (3NF) should be applied. A table is in the Third Normal Form if it is in 2NF and each nonkey column depends only on candidate keys, not on other nonkey columns.
An FD in which one nonkey column determines another nonkey column violates 3NF.
An equivalent way to define 3NF is that 3NF prohibits transitive dependencies
Case Study
A Logical Model of an IT Company Database