Please enable JavaScript.
Coggle requires JavaScript to display documents.
Advanced Dimensional Modelling (Examinable (5• Conformed Dimensions…
Advanced Dimensional Modelling
Examinable
1• Snowflake schemas雪花状
ER Models to Star Schemas星型
Dimension tables
‘Master data’ with hierarchies
Simple keys (create surrogate keys)
Fact table
measures
complex key
‘intersection entity
Embedded Hierarchies嵌入式等级制度(exp. customer)
Market segment hierarchy
Market Sector
Market Segment
Industry hierarchy
Industry Group
Industry sector
Industry class
Location hierarchy
country
state
city
Deriving Dimensional Models from
ER Models
Classify entities
Transaction entities (business events)
shipments, payments
payments
orders, shipments, payments
Component组件 entities
Who, what, where,
how and why data about the events
eg:customer product employee
Classification entities
eg:country state city
Define embedded hierarchies in components
High-level star schema design
Identity Star schemas required
transaction entities are related in a master-detail
structure
the master entity becomes degenerate dimension降维
eg:order, order item become order fact
Define level of summarisation
Level of granularity 粒度
Storage space / flexibility trade-off
Identify relevant dimensions
Select relevant dimensions
from component entities
Detailed fact table design
Define key
A composite key consisting of the keys of all dimension tables
Or
transaction
level key (such as a combination of orderID and OrderLineItemID) for the
lowest level of granularity
第三章的dw粒度特性
Define facts
analysed using numerical functions
Use additive facts, but may also be semi-additive
Fact Table Design
No data lost from original normalised model - master entity
attributes should be included in detailed entity
Detailed dimensional table design
Define dimensional key
May need to be generalised form underlying component key
Usually a simple numeric key
(surrogate key)
Collapse hierarchies
De-normalise hierarchies into component entities 用嵌入式等级制度
Replace codes and abbreviations by descriptive text 把code删了
2• Different types of fact tables
Semi-additive measures eg. balance amount 时间的balance不能求和
Additive measures
non-additive eg. ratios不能求和
3• Different types of facts
Snapshot (Balance, Transaction Count)
Periodic or accumulating view of business measures
Usually semi-additive measures
Factless
Event occurrence
No measures, just FKs
Transaction (Quantity, Extended Price)
Usually additive measures
Most common
4• Slowly Changing Dimensions
dimension table
underlying entity changes some important
attribute
For example the address of a customer
approaches
Overwrite覆盖 old values:History is lost;Error, history doesn’t
matter
•
Create a new dimension record
:Provides complete history;Customer address changes
• Create a current value field:Segments history between the old and the new description;Change in sales districts 销售区域的变化
Advantage: maintains and partitions history
Disadvantage: more complex than overwrite
5• Conformed Dimensions
Simple: seven plus or minus two dimensions
Understandable: user-oriented data names and values
(descriptive text rather than codes)
Complete: all relevant dimensions and facts
The appropriate level of granularity
Integrated: conformed dimensions
一致
Error proof: avoid errors in queries by using additive facts; not mixing levels of aggregation
6• Surrogate keys
Detailed Dimensional Table Design
Define dimensional key
• Usually a simple numeric key (surrogate key)通常是一个简单的数字键(代理键)
• May need to be generalised form underlying component key可能需要从底层组件键一般化