Please enable JavaScript.
Coggle requires JavaScript to display documents.
Week4 advanced dimensional modelling - Coggle Diagram
Week4 advanced dimensional modelling
star schemas
自上而下 (not normalized事实表和维度表:维度直接指的是事实表): 更多冗余数据,所以难修改维护,复杂度低,更少外键因此查询执行速度更快,花费时间更少 (适合做指标分析更适合,比如‘给定的一个客户她们的收入是什么?
Deriving Dimensional Models from ER Models
3. Detailed fact table design
Define key
A composite key consisting of the keys of all dimension tables(not
minimal in most cases)
Or transaction level key (such as combination of orderID and
OrderLineItemID) for lowest level of granulairty
Define facts
The measures that can be analysed using numerical functions
Use additive facts, but may also be semi additive
Wherever possible, additive facts should be used to prevent
errors in queries
4. Detailed dimensional table design
Define dimensional key
Usually a simple numeric key (surrogate key):
May need to be generalised form underlying component key
Collapse hierarchies折叠层次结构
De-normalise hierarchies into component entities:
Introduces transitive dependencies and therefore redundancy
2. High-level star schema design
Define level of summarisation
Level of granularity
Storage space / flexibility trade-off 存储空间/灵活性的权衡
Identify relevant dimensions
Select relevant dimensions from component entities
Identity Star schemas required
Each transaction entity is a candidate 候选
Classify entities
里到外
1. Classify entities into three types
Component entities
Who, what, where, how and why data about the events
3。
Classification entities
分类实体
Define embedded hierarchies in components
Transaction entities
(business events)
Eg. orders, shipments, payments
snowflake schema
(normalized):规范数据,也就是说数据在数据库内部是组织好,以便消除冗余,适合 比如针对特定的广告,哪些客户或者公司是在线的?
Why dimensional modelling work????
Chunking组块
each star schema is a chunk
Large amounts of information are “chunked” by humans
Human short-term memory limits
Hierarchical struturing
used to manage complexity
Each dimension in a star schema typically consists of one of more
hierarchies
provides the means to “roll up” and “drill
down” in OLAP tool
Dimensional Modelling and ER modelling
Fact table
An ‘intersection entity’ with complex key and measures
A star schema is just a restricted form of an ER model
Dimension tables
Master data’ with hierarchies
Simple keys (create surrogate keys
E-R Modeling is a model for OLTP, optimized for Operational database, namely insert, update, delete data and stressing on data relational integrity.
Dimensional Modeling is a model for OLAP, optimized for retrieving data because it's uncommon to update or delete the data