Chapter 19: Correlation and Regression
Statistical methods used to identify and quantify relationships between process variables.
Fundamental Concepts
Purpose
These tools are used to determine whether two factors are related, such as an input and an output or two separate inputs
Causation Warning
A critical rule is that a strong correlation or regression result does not, by itself, indicate a causal relationship
Predictive Value
They allow teams to create equations that predict how a dependent variable will respond when changes are made to an independent variable
Data Requirements
Both variables must be quantitative (interval or ratio data, typically continuous) and cannot be qualitative labels such as names
Correlation Analysis
Linear Association
Correlation specifically measures the linear relationship between two variables
Correlation Types
Positive Correlation
One variable increases as the other increases
Negative Correlation
One variable decreases as the other increases
No Correlation
Data points are scattered randomly, and the trend line is almost horizontal
Correlation Coefficient (R)
A number between -1 and 1
Values of 1 or -1 indicate a perfect linear relationship where all data points fall exactly on the trend line
A value of 0 indicates no linear relationship
As a rule of thumb in most applications, a correlation is considered meaningful if R is 0.4 or more (or -0.4 or less); whether it is statistically significant also depends on the sample size.
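The definition of R above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function names `pearson_r` and `describe`, the oven data, and the use of 0.4 as a default cutoff are assumptions made for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)

def describe(r, threshold=0.4):
    """Label a coefficient using the 0.4 rule of thumb (an assumed cutoff)."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "weak or none"

# Hypothetical data: oven temperature (X) vs. cure time in minutes (Y)
temperature = [150, 160, 170, 180, 190]
cure_time = [42, 39, 35, 33, 30]

r = pearson_r(temperature, cure_time)
print(round(r, 3), describe(r))
```

With this made-up data, cure time falls as temperature rises, so R comes out close to -1 and the label is "negative".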
Linear Regression Analysis
The Model
Regression is used once a linear correlation is established to create a linear model for predictions
Coefficient of Determination
Calculated as the square of the correlation coefficient (R), written R², ranging from 0 to 1
It represents the proportion of variation in the dependent variable (Y) that is explained by changes in the independent variable (X)
Regression Equation
Statistical software generates an equation (y = mx + b, where m is the slope and b is the intercept) for the best fit line
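As a rough sketch of what statistical software does for a single X variable, the best fit line and the coefficient of determination can be computed with ordinary least squares. The function name `fit_line` and the sample data are assumptions made for this example, and real packages report much more (residuals, p-values, confidence intervals).

```python
def fit_line(xs, ys):
    """Return (m, b, r_squared) for the least-squares line y = m*x + b."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("cannot fit a line when one variable is constant")
    m = sxy / sxx                 # slope
    b = mean_y - m * mean_x       # intercept
    # For one X variable, R squared equals the square of the correlation coefficient
    r_squared = (sxy * sxy) / (sxx * syy)
    return m, b, r_squared

# Hypothetical data: oven temperature (X) vs. cure time in minutes (Y)
temperature = [150, 160, 170, 180, 190]
cure_time = [42, 39, 35, 33, 30]

m, b, r_squared = fit_line(temperature, cure_time)
print(f"y = {m:.3f}x + {b:.3f}, R^2 = {r_squared:.3f}")
```

For this made-up data the slope is negative (about -0.3 minutes per degree) and R² is close to 1, meaning nearly all of the variation in cure time is explained by temperature.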
Six Sigma Business Applications
DMAIC Phases
Typically utilized during the Measure, Analyze, or Improve stages
Problem Functions
Regression requires data that can be written as a function: y = f(x)
Verification
While it does not prove causation, a strong regression model helps validate assumptions arrived at through brainstorming or Fishbone diagrams.
Determining Optimum Values
Teams use the regression equation to solve for X to achieve a desired Y result
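Solving the regression equation for X is a one-line rearrangement, x = (y - b) / m. The sketch below assumes a hypothetical fitted line with slope -0.3 and intercept 86.8; the function name `solve_for_x` and those values are illustrative only.

```python
def solve_for_x(target_y, m, b):
    """Rearrange y = m*x + b to find the X that gives the desired Y."""
    if m == 0:
        raise ValueError("slope is zero, so Y does not depend on X")
    return (target_y - b) / m

# Hypothetical fitted line: cure_time = -0.3 * temperature + 86.8
slope, intercept = -0.3, 86.8

# Which temperature setting should give a 36-minute cure time?
needed_temperature = solve_for_x(36, slope, intercept)
print(round(needed_temperature, 1))
```

The result is only trustworthy inside the range of X values that were actually observed; a setting far outside the collected data is an extrapolation and should be verified with a trial run.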
Implementation Tools
Scatter Diagrams
Used for initial graphical analysis to visually identify trends before running statistics
Excel Formulas
The CORREL function, entered as =CORREL(array1, array2), is a simple way to find the correlation coefficient (R) for two ranges of values
Analysis ToolPak
The Analysis ToolPak is an Excel add-in whose Correlation and Regression tools help Lean Six Sigma teams analyze data and identify problems in a process so they can make improvements based on statistics rather than assumptions.