HOW TO DESIGN A DATABASE (How To Design A Data Structure (The 6 basic…
HOW TO DESIGN A DATABASE
Notes taken while studying "Murach's Oracle SQL And PL/SQL For Developers"
How To Design A Data Structure
A database system is modeled after a real_world system
The 6 basic steps for designing a data structure
: Identify the data elements
: Subdivide each element into its smallest useful components.
: Identify the tables and assign solumns
: Identify the primary and foreign keys.
: Review whether the data structure is normalized
: Identify the indexes
: a model of the database rather than the database itself
A Relational database system should model the real-world environment where it's used. It is the designers job to analyze the real-world system and them map it onto a relational database system.
In a relational database, a table typically represents an object, or
in the real world. Each column of a table is used to store an
associated with the entity, and each row represents on ** of the entity
To model a database and the relationships between its tables after a real-world system, you can use a technique called
entity-relationship (ER) modeling
: a table represented by one object.
: a column in a table that stores one item of information for the entity
: a row the stores one occurrence of the entity.
How To Identify The Data Elements
You can identify data elements in a variety of ways, including
existing systems, and
An invoice document can be used to help you identify the data elements in a real-world system.
As you identify the data elements of a system, you should begin thinking about the
that those elements are associated with. That will help you identify the tables of the DB later on.
How To Subdivide The Data Elements
If a data element contains two or more components, you should consider
subdividing the element each time you use it
The extent to which you subdivide a data element depends on how it will be used. Because it's difficult to predict all future uses for the data,
most designers subdivide data elements as much as possible
When you subdivide a data element, you
can easily rebuild it when necessary by concatenating the individual components
How To Identify The Tables And Assign Columns
After you subdivide
all of the data elements
for a database,
you should group them by the entities with which they're associated
. These entities will later become the tables of the DB, and the elements will become the columns.
If a data element relates to more than one entity, you can include it under all of the entities it relates to. Then,
when you normalize the database, you may be able to remove the duplicate elements
As you assign the elements to entities, you should
omit elements that aren't needed
, and you should
add any additional elements that ARE needed
How To Identify The Primary And Foreign Keys
Each table should have a primary key
that uniquely identifies each row.
, you should
set an existing column for the primary key
values of the primary key should seldom
, if ever,
. The values also
be short and easy to enter correctly
If a suitable column doesn't exist for a primary key, you can
create an ID column that's incremented by one for each new row as the primary key.
If two tables have a one-to-many relationship, you may need to
add a foreign key column to the table on the "many" side
foreign key column must have the same data types as the primary key column
it's related to.
If two tables have a many-to-many relationship, you'll need to
to relate them
each of the tables in the many-to-many relationship will have a one-to-many relationship with the linking table
. The linking table doesn't usually have a primary key.
If two tables have
should be related by their primary keys
. This type of relationship is
typically used to improve performance
columns with large amounts of data can be stored in a separate table
: a table that links two many-to-many tables to relate them.
How To Enforce The Relationships Between Tables
Operations that can violate referential integrity
Delete a row form the primary key table.
The foreign key table contains one or more rows related to the deleted row
Insert a row in the foreign key table
The foreign key value doesn't have a matching primary key value in the related table
Update the value of a primary key
The foreign key table contains one or more rows related to the row that's changed
Update the value of a foreign key
The new foreign key value doesn't have a matching primary key value in the related table
the relationships between tables are maintained correctly.
--That means that
a table with a foreign key doesn't have rows with foreign key values that don't have matching primary key values in the related table
In Oracle, you can
enforce referential integrity by using declarative referential integrity
by defining triggers
declarative referential integrity (DRI)
foreign key constraints
When you define foreign key constraints, you can
specify how referential integrity is enforced when a row is deleted from the primary key table
. The options are
a). to return an error,
b). to delete the related rows in the foreign key table
c), to set the foreign key values in the related rows to null
If referential integrity isn't enforced and
a row is deleted from the primary key table that has related rows in the foreign key table
, the rows in the foreign key table are
said to be
: the relationship between tables are maintained correctly.
declarative referential integrity (DRI)
: is used to enforce relationships between tables and maintain their referential integrity.
foreign key constraint
: indicates how the referential integrity between the tables is enforced
: rows in the foreign key table that no longer have a related row in the primary key table.
How Normalization Works
a formal process you can use to separate the data in a data structure into related tables
, which can cause storage and maintenance problems.
unnormalized data structure
a table can contain information about two or more entities
. It can also contain
, columns that contain repeating values, and
data that's repeated
in two or more rows.
normalized data structure
each table contains information about single entity
each piece of information is stored in exactly on place
To normalize a data structure, you
apply the normal forms in sequence
. Although there are a total of seven normal forms,
a data structure is typically considered normalized if the first three normal forms are applied
: a formal process used to separate the data in a data structure into related tables.
: redundancy that can cause storage and maintenance problems.
unnormalized data structure
: a table that contains information about two or more entities, repeating columns, columns that contain repeating values, and data that's repeated in two or more rows.
normalized data structure
: where each table contains info. about a single entity, and each piece of info. is stored in exactly on place.
: are used to eliminate problems caused by
There are 7 types.
How To Identify The Columns To Be Indexed
WHEN TO CREATE AN INDEX
When the column is a foreign key
When the column is used frequently in search conditions or joins
When the column contains a large number of distinct values
When the column is updated infrequently
automatically creates an index for a primary key
An index provides a way for a database management system to locate information more quickly
. When it uses an index,
the database management system can go directly to a specific row
rather than having to search through all the rows until it find it.
Indexes speed performance when searching and joining tables.
that include two or more columns
. You should
use this type of index when the columns in the index are updated infrequently
when the index will cover almost every search condition on the table
Because indexes must be updated each time you add, update, or delete a row,
you shouldn't create more indexes that you need
: a structure that provides for locating one or more rows dierctly.
: when a DBMS searches through the entire table, because no index exists.
: an index that consists of two or more columns.
How To Normalize A Data Structure
The 7 Normal Forms
: The value stored at the intersections of each row and column must be a scalar value, and a table must not contain any repeating columns.
: Every non-key column must depend on the entire primary key
: Every non-key column must depend only on the primary key
: A table must not have more than one
, where the primary key has a one-to-many relationship to non-key columns. This form gets rid of
misleading many-to-many relationships
: A non-key column can't be dependent on another non-key column. This prevents
, where column A depends on column C and column B depends on column C.Since both A and B depend on C, A and B should be moved into another table with C as the key.
: The data structure is split into smaller and smaller tables until all redundancy has been eliminated. If further splitting would result in tables that couldn't be joined to recreate the original table, the structure is in fifth normal format.
: Every constraint on the relationship is dependent only on key constraints and domain constraints, where a
is the set of allowable values for a column. This form prevents the insertion of any unacceptable data by enforcing constraints at the level of a relationship, rather than at the table of column level, DKNF is less a design model than an abstract "ultimate" normal form.
The Benefits Of Normalization
a. Since a normalized database has more tables than an unnormalized database, and since each table has an index on its primary key,
the database has more indexes. And that make data retrieval more efficient.
b. Since each table contains information about a single entity,
each index has fewer columns (usually one) and fewer rows. And that makes data retrieval and insert, update, and delete operations more efficient.
Each table has fewer indexes, which makes insert, update, and delete operations more efficient
Data redundancy is minimized,
simplifies maintenance and reduces storage
Each normal form assumes that the design is already in the previous normal form
A database is typically considered to be normalized if it's in third normal form (3NF).
The other four forms are not commonly used.
How To Apply The First Normal Form
For a table to be in first normal form (1NF),
its columns must not contain repeating values.
each column must contain a single, scalar value
. In addition,
the table must no contain repeating columns that represent a set of values.
A table in first normal form often has repeating values in its rows
This can be resolved by applying the second normal form
How To Apply The Second Normal Form
For a table to be in second normal form,
every non-key column must depend on the entire primary key
If a column doesn't depend on the entire key, it indicates that the table contains information for more than one entity
. This is reflected by the table's composite key.
To apply second normal form, you
move columns that don't depend on the entire primary key to another table
then establish a relationship between the two tables.
Second normal form
helps remove redundant row data,
which can save storage space, make maintenance easier, and reduce the chance of storing inconsistent data.
How To Apply The Third Normal Form
For a table to be in third normal form,
every non-key column must depend
on the primary key
If a column doesn't depend only on the primary key, it implies that the column is assigned to the wrong table or that it can be computed from other columns in the table.
A column that can be computed from other columns contains
When And How To Denormalize A Data Structure
When To Denormalize
When a column from a joined table is used repeatedly in search criteria
, you should
consider moving that column to the primary key table if it will eliminate the need for a join
If a table is updated infrequently
, you should
consider denormalizing it to improve efficiency.
Because that data remains relatively constant, you
don't have to worry about data redundancy errors once the initial data is entered and verified.
Include columns with derived values when those values are used frequently in search conditions
. If you do that,
you need to be sure that the column value is always synchronized with the value of the columns it's derived from
Data structures that are normalized to the fourth normal form and beyond
typically require more joins than tables normalized to the third normal form and can therefore be less efficient
that work with tables
that are normalized to the fourth normal form (4NF) and beyond
typically more difficult to code and debug
to some extent,
usually to the third normal form (3NF)
result in larger tables, redundant data, and reduced performance.
denormalization when necessary
. It is
better to adhere to the normal forms unless it is clear that performance will be improved by denormalizing