Chapter 2: Business Problems and Data Science
Data Mining Process
CRISP-DM
Business Understanding -> Data Understanding -> Data Preparation -> Modeling -> Evaluation -> Deployment
Tasks
Clustering
Co-occurrence grouping
Similarity Matching
Profiling
Regression
Link Prediction
Classification and class probability estimation
Data Reduction
Causal Modeling
Other techniques and technologies
Database Query
SQL (Structured Query Language)
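A minimal sketch of a database query, using Python's built-in sqlite3 module. The `customers` table, its columns, and its rows are hypothetical and invented for illustration; the point is only that a query retrieves records matching conditions the analyst already has in mind.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for this example.
conn.execute("CREATE TABLE customers (name TEXT, state TEXT, income INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Ann", "NY", 72000), ("Bob", "NY", 41000), ("Cy", "CA", 90000)],
)

# The query states exactly which records are wanted.
rows = conn.execute(
    "SELECT name, income FROM customers "
    "WHERE state = ? AND income > ? ORDER BY name",
    ("NY", 50000),
).fetchall()

print(rows)
conn.close()
```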
Data Warehousing
Statistics
Summary Statistics
Ex. Average income in the US: mean - $60,000; median - $48,000
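A minimal sketch of why the two averages can differ. The incomes below are made up (not real US data) and were chosen so that the results match the figures in the example: one very high income pulls the mean up, while the median stays at the middle value.

```python
from statistics import mean, median

# Hypothetical incomes, invented for illustration only.
incomes = [22_000, 30_000, 41_000, 48_000, 55_000, 64_000, 160_000]

mean_income = mean(incomes)      # sum divided by count; sensitive to the outlier
median_income = median(incomes)  # middle value of the sorted list

print(f"mean:   ${mean_income:,.0f}")
print(f"median: ${median_income:,.0f}")
```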
Statistics
The field of study (proper name)
Machine Learning and Data Mining
Regression Analysis
Unsupervised
Data mining with no specified target falls under this subset
Clustering
No guarantee that the similarities found are meaningful or useful outside the given data
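A rough sketch of clustering with no target, using a tiny one-dimensional k-means written for this note (the function name `kmeans_1d`, the starting centers, and the spending values are all assumptions for illustration). The procedure returns groups for any input, which is why the meaningfulness of those groups still has to be judged separately.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center, then re-average."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Index of the center closest to this value.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical monthly spending values; no target column is supplied.
spend = [12, 15, 14, 80, 85, 90]
centers, groups = kmeans_1d(spend, centers=[0.0, 100.0])

print(groups)
print(centers)
```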
The results of data mining
Mining the data to find patterns and build models
Using the results of data mining
Supervised
Data is collected and questions are asked with a specific target in mind
2 main subclasses
Classification
Categorical (often binary) target
Regression
Numeric target
There must be data on the target
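A minimal sketch of the two supervised subclasses on the same labeled data. The records, the field names, and the choice of a 1-nearest-neighbor lookup are assumptions made for illustration, not a method taken from these notes. Both predictors need target values in the training data; they differ only in whether the target is categorical or numeric.

```python
# Hypothetical labeled records: (tenure_months, churned, monthly_spend).
training = [
    (2, "yes", 20.0),
    (5, "yes", 25.0),
    (24, "no", 60.0),
    (36, "no", 75.0),
]

def nearest(tenure):
    """Return the training record whose tenure is closest to the query."""
    return min(training, key=lambda row: abs(row[0] - tenure))

def classify_churn(tenure):
    # Classification: the target is categorical ("yes" / "no").
    return nearest(tenure)[1]

def estimate_spend(tenure):
    # Regression: the target is numeric.
    return nearest(tenure)[2]

churn_prediction = classify_churn(4)
spend_prediction = estimate_spend(33)

print(churn_prediction)
print(spend_prediction)
```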