Data Analytics Best Practices
Data Requirements
(1.1 p22-35)
Data Dictionary
Data Masking
Exclusion and Inclusion Criteria
Types of Dataset
(1.1 p28-35)
Demographics
Behavioural
Temporal in nature; the timestamp is important
Product Information
Derived
Third Party
Public
Data Pre-processing
Data Cleaning
Handle noisy / erroneous data
(1.2 p10)
Binning
Curve/Line Fitting
Ensemble methods
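As a rough illustration of binning for noisy data, the sketch below smooths values by bin means. The equal-frequency bins, the function name `smooth_by_bin_means`, and the sample prices are assumptions made for illustration, not taken from the referenced material.

```python
from statistics import mean


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of `bin_size` items,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        bin_mean = mean(bin_values)
        # Every member of the bin takes the bin mean.
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


# Hypothetical sample data.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed_prices = smooth_by_bin_means(prices, bin_size=3)
```

Smaller bins keep more of the original variation, while larger bins smooth more aggressively.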
Managing outliers
(1.2 p11-13)
Handle missing data
(1.2 p5-9)
Exclusion
Data Imputation
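The two options for missing data can be sketched as follows. Mean imputation is only one possible imputation strategy and is chosen here as an assumption; the records and the helper names `exclude_missing` and `impute_mean` are hypothetical.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 46},
]


def exclude_missing(rows, field):
    """Exclusion: drop every row where `field` is missing."""
    return [row for row in rows if row[field] is not None]


def impute_mean(rows, field):
    """Imputation: fill missing `field` values with the mean of the observed ones."""
    observed = [row[field] for row in rows if row[field] is not None]
    fill = mean(observed)
    # Build new dicts so the input rows are left unchanged.
    return [{**row, field: fill if row[field] is None else row[field]} for row in rows]


excluded = exclude_missing(records, "age")
imputed = impute_mean(records, "age")
```

Exclusion shrinks the dataset, whereas imputation keeps every row at the cost of introducing estimated values.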
Removal of unwanted observations
Deduplicate
(1.2 p14)
Fixing structural errors
Data Transformation
(1.2 p23-25)
Data Normalization
(1.2 p26)
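One common form of normalization is min-max scaling to the range [0, 1]. The outline does not say which method the referenced material uses, so the choice of min-max scaling, the function name, and the sample incomes below are assumptions.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample data.
incomes = [20_000, 35_000, 50_000]
scaled = min_max_normalize(incomes)
```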
Data Integration
(1.2 p18-21)
Might result in data redundancy
Encoding Categorical Variables
(1.2 p27-30)
One-hot / Dummy
Label Encoding
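The difference between the two encodings can be sketched in plain Python. The helper names, the alphabetical category ordering, and the sample values are assumptions made for illustration.

```python
def label_encode(values):
    """Map each category to an integer code (categories sorted alphabetically)."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Produce one 0/1 indicator column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == category else 0 for category in categories] for v in values]
    return rows, categories


# Hypothetical sample data.
colours = ["red", "green", "blue", "green"]
labels, mapping = label_encode(colours)
one_hot, columns = one_hot_encode(colours)
```

Label encoding implies an order among the categories, which may mislead some models; one-hot encoding avoids that but adds one column per category.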
Feature Engineering
Feature Reduction
(1.2 p48-50)
Feature Selection
Feature Extraction
Feature Construction
(1.2 p46-47)
Data Warehouse
Subject Area and Business Query
(3.1 p32-36)
Dimensional Model
Fact Table
(3.1 p45-63)
Multi level facts
(3.1 p53-55)
One fact table for each level
Use a level indicator in the dimension table
Contains the business measures
Always joined to a dimension table
Dimension Table
Slowly Changing Dimensions (SCD)
(3.1 p72-76)
Overwrite Previous Attribute Values
Add an Additional Dimension Record
Add New Field(s)
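The three SCD approaches listed above can be sketched on a small in-memory dimension table. The table layout, the column names (`key`, `customer_id`, `city`, `is_current`, `previous_city`), and the function names are hypothetical; a real warehouse would apply the same ideas in SQL or an ETL tool.

```python
import copy

# Hypothetical customer dimension with a surrogate key.
customer_dim = [
    {"key": 1, "customer_id": "C001", "city": "Singapore", "is_current": True},
]


def scd_overwrite(dim, customer_id, new_city):
    """Overwrite the previous attribute value; history is lost."""
    rows = copy.deepcopy(dim)
    for row in rows:
        if row["customer_id"] == customer_id:
            row["city"] = new_city
    return rows


def scd_add_record(dim, customer_id, new_city):
    """Add an additional dimension record; the old row is kept but flagged as not current."""
    rows = copy.deepcopy(dim)
    next_key = max(row["key"] for row in rows) + 1
    for row in rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["is_current"] = False
    rows.append({"key": next_key, "customer_id": customer_id,
                 "city": new_city, "is_current": True})
    return rows


def scd_add_field(dim, customer_id, new_city):
    """Add a new field that holds the previous value next to the current one."""
    rows = copy.deepcopy(dim)
    for row in rows:
        if row["customer_id"] == customer_id:
            row["previous_city"] = row["city"]
            row["city"] = new_city
    return rows


overwritten = scd_overwrite(customer_dim, "C001", "Jakarta")
versioned = scd_add_record(customer_dim, "C001", "Jakarta")
extended = scd_add_field(customer_dim, "C001", "Jakarta")
```

Overwriting is simplest but discards history, adding a record preserves full history at the cost of more rows, and adding a field keeps only the immediately preceding value.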
Conformed Dimensions
(3.1 p78-79)
Analytical Hierarchies
(3.1 p81-88)
Contains the business perspective
Date-Time Dimension
(3.1 p68-70)
Exploratory Visualization
(1.2 p31-43)
Decision Engineering
(1.2 p56-
Model Deployment
(1.2 p60-66)
Model Maintenance
(1.2 p68-71)