Exploratory Data Analysis (EDA)
support the selection of appropriate statistical tools and techniques
suggest hypotheses about the causes of observed phenomena
determine relationships among variables
maximize insight into a data set
detect outliers and anomalies
extract important variables
test underlying assumptions
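One of the aims listed above, detecting outliers, can be sketched with Tukey's fences, which flag values lying more than k interquartile ranges outside the quartiles. The function name tukey_outliers, the sample data, and the choice of the "inclusive" quartile method are assumptions made for illustration; k = 1.5 is the conventional default, not a rule stated in these notes.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying outside Tukey's fences.

    Fences are Q1 - k*IQR and Q3 + k*IQR. The quartile convention
    ("inclusive" linear interpolation) is an assumption; other
    conventions give slightly different fences.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95]
print(tukey_outliers(readings))  # [95]
```

A flagged value is a prompt for closer inspection, not proof of an error.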
Typical graphical techniques
Principal component analysis
EDA is an approach to analyzing data sets to summarize their main characteristics.
EDA was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments.
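EDA is different from initial data analysis (IDA), which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values, and transforming variables as needed. EDA encompasses IDA.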
Tukey's EDA was related to robust statistics and nonparametric statistics. Tukey promoted the use of the five-number summary of numerical data: the two extremes (minimum and maximum), the median, and the two quartiles.
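A minimal sketch of the five-number summary, using only the Python standard library. The function name five_number_summary and the sample data are hypothetical, and the "inclusive" quartile method is one of several conventions in use (Tukey's own hinges can differ slightly from these values).

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    # n=4 yields the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
# (1, 3.0, 5.0, 7.0, 9)
```

These five numbers are also what a box plot draws, which is why the summary pairs naturally with graphical EDA.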
Francis Galton emphasized order statistics and quantiles.
Arthur Lyon Bowley used precursors of the stemplot and five-number summary.
Andrew Ehrenberg articulated a philosophy of data reduction.
John W. Tukey wrote the book Exploratory Data Analysis in 1977.
Typical quantitative techniques
non-graphical or graphical
univariate or multivariate (usually bivariate)
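To make the classification above concrete, here is a small non-graphical sketch: a univariate summary of one variable and a bivariate measure (Pearson correlation) relating two variables. The function names and the sample data are hypothetical, and the choice of mean, median, standard deviation, and Pearson's r as representative techniques is an assumption, not something these notes prescribe.

```python
import math
import statistics


def univariate_summary(xs):
    """Non-graphical, univariate: describe a single variable."""
    return {
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "stdev": statistics.stdev(xs),  # sample standard deviation
    }


def pearson_r(xs, ys):
    """Non-graphical, bivariate: linear association between two variables."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples of at least 2 points")
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]
print(univariate_summary(hours)["median"])  # 3
print(pearson_r(hours, score))              # 1.0
```

The graphical counterparts would be, for example, a histogram for the univariate case and a scatter plot for the bivariate case.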