Please enable JavaScript.
Coggle requires JavaScript to display documents.
Business Intelligence Week 1 & 2 (Steps to Designing DW/Data Mart…
Business Intelligence Week 1 & 2
Week 1: Introduction to Business Intelligence
Business Intelligence
Definition: an umbrella term that combines architectures, tools, databases, analytical tools, applications and methodologies
Objective: enable interactive access to data
Objective: enable manipulation of these data
Objective: provide business managers with the ability to conduct appropriate analysis
BI System
Objective: To facilitate closing the gap between the current performance of an organisation and its desired performance, as expressed in its mission, objectives and goals and the strategy to achieve them
Objective: To provide right information at the right time and in the right place
Objective: To help transform data, to information, to decisions and finally to action
4 Major Components
Data warehouse: cornerstone of any medium-to-large Bi system
Business analytics: tools that help keep users transform data into knowledge
Business Performance Management (BPM): emerging portfolio of applications within the BI framework that provides enterprises tools they need to better manage their operations
User interface: provides a comprehensive graphical/pictorial view of corporate performance measures, trends & exceptions
Business Analytics
Definition: broad category of applications and techniques for gathering, storing, analyzing and providing access to data to help enterprise users make better business and strategic decisions
Descriptive Analysis: use data aggregation and data mining to provide insight into the past and answer: "What has happened?"
Predictive Analytics: use statistical and forecasts techniques to understand the future and answer: "What could happen?"
Prescriptive Analytics: uses automated decision rules expert system to prescribe new ways to influence present and answer: "What should be done? "
Applications of Analytics
Customer Analytics
Descriptive: past buying patterns, customer profile
Predictive: predict if customer is likely to churn, or what a customer is likely to buy next and how much, or what promotion a customer would respond to
Prescriptive: prescribe new ways to influence sales
People Analytics
Descriptive: analyze performance review/feedback
Predictive: predict future patterns
Prescriptive: predict new ways to influence patterns
Operations Analytics
Descriptive: install sensors to collect data, reports
Predictive: predict future breakdowns
Prescriptive: prescribe service intervals
Financial Analytics
Descriptive: past data, customer profiles
Predictive: predict fraud, churn, claims
Prescriptive: prescribe different course if actions
Text Analytics
Descriptive: social media trending data
Predictive: predict result sentiment
Prescriptive: prescribe strategies/recommendations
Week 2: Data Warehouse & OLAP
Data Warehouse
Definition: collection of integrated, subject-oriented databases designed to support Decision Support Systems functions, where each unit is non-volatile and relevant to come moment in time
Characteristics
Subject-oriented
Data is organised aroundsubjects or business dimensions, such as sales, customers, orders, claims, accounts, employees etc
Integrated
Data is collected from several transactional databases and integrated in a way to provide a unified picture of each subject overtime
Data from different databases is transformed into a common schema, measurement, code, data type
Aggregated
Data stored is not transactional-level, but aggregated by products, regions, months/years, or some other business dimension
Historical
Data updated at some time interval: weekly, monthly etc
Data stored by weeks, months etc for historical comparison and trend analysis
Time variant
Data analysis includes a timestamp
Non-Volatile
Data is historical and does not change with time
Denormalized
Denormalized data is used to improve query performance
Works because historic data in the data warehouse is rarely updated
Process
Extraction: reading data from one or more databases
Transformation: converting the extracted data from its previous form into the form in which it needs to be for the data warehouse
Load: putting the converted data into the data warehouse
Types
Enterprise Data Warehouse (EDW): a large-scale data warehouse used across the enterprise for decision support
Operational Data Stores (ODS)
used as an interim staging area for a data warehouse
Contents are usually updated throughout the course of business operations
used form short-term decisions involving mission-critical applications
Data Marts
usually smaller and focuses on a particular subject or dept
Dependent data mart: a subset created directly from a data warehouse
Independent data mart: a small data warehouse designed for a strategic business unit or a department
Representation of Data - 3 Schemas
Star Schema
center of the star schema is the fact table
fact table contains measures & a foreign key for each dimension table
Snowflake Schema
logical arrangement of tables in such a way that the entity relationship resembles a snowflake
fact & dimension tables are related by foreign keys and are subject to the familiar PK/FK contraints
dimensions are normalized into multiple related tables through concept of hierarchies
foreign keys in the fact table point to the lowest level of each hierarchy
Constellation Schema
variation of the star schema
consists of more than one fact table
used in a data warehouse that supports several subjects of analysis
fact tables may share one or more dimension tables
Measures & Dimensions
Measure: property on which calculations can be made
Dimension: structure that categorizes measures in order to enable users to answer business questions
3 Types of Facts
Addictive: can be added
Semi-addictive: can be added from some dimensions but not others
Non-addictive: facts that cannot be combined to get a grand total
Dimension Hierarchy
structure made up of 2 or more levels of related dimensions
enable users to navigate to different levels of granularity within the measures of the data warehouse
Granularity: level of detail in a unit of data. The less detail the higher the level of granularity
Types of Dimensions
Type 1 Slowly Changing Dimensions
does not track changes
overwrites the previous dimension information with the current dimension information
impossible to determine state of the dimension members in the past
result in inaccurate analysis
Type 2 Slowly Changing Dimensions
track changes
4 supplementary attributes must be added to the dimension to track history of that dimension: SCD Original ID, SCD Start Date, SCD End Date, SCD Status
Type 3 Slowly Changing Dimension
do not track entire history of the dimension members
OLAP
Online Analytical Processing
most commonly used data analysis techniques in data warehouse
approach performed by end users to query the online system and to conduct analysis
operational structure is based on a concept called a cube - a multidimensional data structure that allows fast analysis capable of efficiently manipulating and analyzing data from multiple perspectives
support decision making
OLAP Activities
generating queries
requesting ad hoc reports
conducting statistical & other analysis
OLAP Operations
Slice: a subset of a multidimensional array
Dice: a slice on 2 or more than 2 dimensions
Drill Down/Up: navigating among levels of data ranging from the most summarized to the most detailed
Roll Up: computing all of the data relationships for 1 or more dimensions
Pivot: used to change the dimensional of a report or an ad hoc query-page display
Steps to Designing DW/Data Mart
Choose the Data Mart
Choose the table granularity
Identifying & Controlling the dimensions
Choosing the facts
Strong pre-calculations in the fact table
Rounding out the Dimension Tables
Choosing the duration of the database
The need to track slowly changing dimensions