Quality-Oriented Data Warehouse Design

Design Approaches DW

Demand-driven or top-down

  • conceptual schema without considering data sources

Supply driven or bottom-up

  • conceptual design is based on data sources

Hybrid

Attribute vs. Dimension
Difference in realizing real- world scenarios

Summarizability

Conditions

Completeness

  • e.g. one customer has at least one job

Disjointness/Strictness

  • e.g. every customer has at most one job

Type compatibility

  • e.g. do not SUM account balances over time

Drill-Down Incompleteness

  • element has no entry in lower hierarchy
    (e.g. Andorra has no regions, this leads to SUM of sales are different in regions and countries
  • 1, to 0, into 1 to 1,

Non-Strictness

  • week can be in multiple months, but
    never many to many relationships inside dimensions

Roll-Up Incompleteness

  • element has no super hierarchy
  • e.g. Napkin is not assigned to super group
  • introduce category "other"

Multidimensional Normal Forms

Aims

Reasonable

Control over optional dimensions levels

Efficient physical design

1st MNF

  • formal definition via restrictions on FDs
  • Faithfulness
  • Completeness
  • Avoidance of redundancies

2nd MNF

  • addresses optional levels
  • formal definition via context-dependency for every optional level
  • conctext-sensititve summarizability
  • efficient physical design

3rd MNF

  • addresses implicit class hierarchies
  • defined via restrictions on context-dependencies
  • construction of explicit class hierarchies

DB and DW Design Process

    1. Requirements analysis and specification
    1. Conceptual design
    1. (Logical design)
    1. (Physical design)

2. Conceptual design

  • synthesize fact schema
  • synthesize hierarchies
  • add context dependencies
  • define summarizability constraints
    Resulting fact schema are in 3rd MNF