Please enable JavaScript.
Coggle requires JavaScript to display documents.
Chapter 3: Data Warehousing - Coggle Diagram
Chapter 3: Data Warehousing
definition
A
single, complete,
and
consistent store of data
obtained from a variety of
different sources
made available to end users in a what they can understand and use in a business context. [Barry Devlin]
A
data warehouse (Star Schema) is a collection of integrated databases designed to support a DSS.
By comparison: an
OLTP (on-line transaction processor)
or
operational system (ERD)
is
used to deal with the everyday running of one aspect of an enterprise.
OLTP systems are usually designed independently
of each
other and it is
difficult for them to share information.
Why Do We
Need Data Warehouses
?
Consolidation (unite, combine) of information resources
Improved query performance
(build based on star schema)
Separate research
and
decision support functions
from the
operational systems
Foundation for data mining, data visualization, and advanced reporting
Data Warehouse Used
for?
Knowledge discovery: 1.
Making consolidated reports
. 2.
Finding relationships and correlations
. 3.
Data mining
. 4. Examples:
Banks identifying credit risks
, Insurance companies searching for fraud,
Medical research
Data Warehouses Differ From
Operational DBMS
?
Goals
Structure
Size
Performance optimization
Technologies used
OLTP
vs.
OLAP
OLTP
(
on-line transaction (query) processing)
-
Major task of traditional relational DBMS
,
Day-to-day operations
: p
urchasing, inventory, banking,
manufacturing, payroll, registration, accounting, etc.
OLAP
(
on-line analytical processing)
-
Major task of data warehouse system,
Data analysis
and
decision making
Data Warehouse
- the
queryable source of data in the enterprise
. It is
comprised of the union of all of its constituent data marts
. FUNCTION:
Holds multiple subject areas, Holds very detailed information, Works to integrate all data sources
Data Mart - a logical subset of the complete data warehouse.
Often
veiwed as a restriction of the data warehouse to a single business process
or to a
group of related business processes
targeted toward a particular business group. FUNCTION:
Often holds only one subject area- for example, Finance, or Sales
, May
hold more summarised data
(although many hold full detail)
Operational Data Store (ODS)
– A
point of integration for operational systems
that
developed independent of each other
. Since an
ODS supports day to day operations, it needs to be continually updated
.
Technique for assembling and managing data from various sources
for the
purpose of answering business questions
. Thus making decisions that were not previous possible.
A
decision support database maintained separately
from
the organization’s operational database
A
data warehouse
is based on a
multidimensional data model
which v
iews data in the form of a data cube
data cube
is a
type of multidimensional (most popular) matrix
that
lets users explore and analyze a collection of data
from many
different perspectives
. E.g:
sales data
>
allows data to be modeled and viewed in multiple dimensions
, Part sold, Location, Customer
model
can exist in the form of:
1. Star schema
- A
fact table in the middle connected to a set of dimension tables
. Contains: 1. A large central table (fact table). 2. A set of
smaller attendant tables (dimension table), one for each dimension
.
2. Snowflakes schema
- A refinement of star schema where
some dimensional hierarchy is further splitting (normalized) into a set of smaller dimension tables,
forming a shape similar to snowflake. the snowflake structure can
reduce the effectiveness of browsing
, since more joins will be needed.
3. Fact constellations
-
Splitting the original star schema into more star
schemes each of them
describes facts on another level of dimension hierarchies