Business Problems
Data mining
understood process
CRISP-DM
Business Understanding
Data Understanding
Data Preparation
Modeling
Evaluation
Deployment
pattern centered
task-oriented
sub-task levels
summed solution
algorithms
few fundamentals
Classification
also probability estimation
predict population class
where they belong
usually mutually exclusive
scoring
likelihood they belong
done for each
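A minimal sketch of the difference between classification (which class an individual belongs to) and scoring (the likelihood that they belong to it). The map names no algorithm, so k-nearest neighbors is used here purely as an assumption; the customer data, the labels "stay" and "leave", and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical labeled customers: (monthly minutes, support calls) -> class label.
training = [
    ((100, 0), "stay"),
    ((120, 1), "stay"),
    ((110, 0), "stay"),
    ((300, 5), "leave"),
    ((280, 4), "leave"),
]

def nearest_labels(point, k=3):
    """Return the labels of the k training rows closest to point."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    ranked = sorted(training, key=lambda row: sq_dist(row[0], point))
    return [label for _, label in ranked[:k]]

def classify(point, k=3):
    """Classification: pick the single most common class among the neighbors."""
    return Counter(nearest_labels(point, k)).most_common(1)[0][0]

def score(point, target="leave", k=3):
    """Scoring: estimate the likelihood of the target class as a neighbor share."""
    return nearest_labels(point, k).count(target) / k

predicted_class = classify((290, 4))
leave_score = score((290, 4))
```

Classification returns one mutually exclusive label per individual, while scoring returns a number between 0 and 1 that can be ranked or thresholded.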
Regression
Value estimation
of certain variable
per individual
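A rough sketch of value estimation for the regression branch above: fit a straight line to past observations, then estimate a numeric value for a new individual. Ordinary least squares on a single variable is an assumption chosen for brevity, and the data and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: months as a customer -> services used.
months = [1, 2, 3, 4]
services = [3, 5, 7, 9]

slope, intercept = fit_line(months, services)

# Estimate the value of the variable for one new individual.
estimate = slope * 5 + intercept
```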
Clustering
group population
based on similarity
no specific grouping
find natural grouping
natural segmenting
used as input
decision-making processes
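One way the clustering branch above could look in code: no target grouping is supplied, and the groups emerge from similarity alone. One-dimensional k-means is used only as an illustrative assumption (the map names no algorithm); the values, starting centers, and function name are hypothetical.

```python
def k_means_1d(values, centers, rounds=10):
    """Group values around the nearest center, then move each center to its group mean."""
    groups = [[] for _ in centers]
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical monthly minutes; no labels are given.
minutes = [80, 95, 100, 290, 300, 310]

centers, groups = k_means_1d(minutes, centers=[0, 500])
```

The resulting segments can then be used as input to a decision-making process, for example by treating light and heavy users differently.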
Co-occurrence grouping
frequent itemset mining
association rule discovery
market-basket analysis
find associations
hot sauce and meat
new potential promotions
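A small sketch of market-basket analysis for the branch above: count how often pairs of items appear in the same basket, then measure the association between hot sauce and meat. The baskets and the function name are hypothetical, and simple pair counting stands in for a full frequent itemset miner.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (illustrative data only).
baskets = [
    {"hot sauce", "meat", "bread"},
    {"hot sauce", "meat"},
    {"meat", "bread"},
    {"hot sauce", "chips"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(baskets)

# Support: share of all baskets containing both items.
support = counts[("hot sauce", "meat")] / len(baskets)

# Confidence of "hot sauce -> meat": share of hot sauce baskets that also contain meat.
hot_sauce_baskets = sum(1 for b in baskets if "hot sauce" in b)
confidence = counts[("hot sauce", "meat")] / hot_sauce_baskets
```

Pairs with high support and confidence are candidates for new promotions.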
Link Prediction
predict connections
Facebook mutual friends
link strength prediction
Netflix movie recommendations
Data reduction
replace large dataset with smaller one
trade off
less info
more insight
only show genres
Causal modeling
what influences others
did advertisement work?
requires heavy investment
significant data needed
A/B tests
assumption driven
Similarity matching
identify similar people
similar companies (IBM)
firmographic data
product recommendations
assist other algorithms
classification
regression
clustering
Profiling
behavior description
fraud prevention
complex
phone minutes used
international
weekend/weekday
text minutes
anomaly detection
Methods
Supervised
target defined
likelihood inspired action
target-defined subclasses
classification
categorical/binary target
regression
numerical target
Unsupervised
no target
natural groups form?
Results
find patterns
use patterns
Key skills
problem decomposition
pattern recognition
common solutions
common problems
Personality infusion
creativity & intelligence
software vs analytical
Software: formulate code
analytical: formulate problem
Analytical Techniques/Technologies
Statistics
summary statistics
quick business info
mean
median
mode
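The three summary statistics above can be computed with Python's standard `statistics` module. The data below is hypothetical and only illustrates how the three measures can differ on the same values.

```python
import statistics

# Hypothetical monthly phone minutes for seven customers (illustrative values only).
minutes = [120, 95, 120, 300, 80, 120, 95]

mean_minutes = statistics.mean(minutes)      # arithmetic average, pulled up by the 300 outlier
median_minutes = statistics.median(minutes)  # middle value after sorting
mode_minutes = statistics.mode(minutes)      # most frequent value
```

Here the mean sits above the median because a single heavy user skews it, which is why the median is often the safer quick summary for skewed business data.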
analytical statistics
foundation for data science
Database Querying
SQL (Structured Query Language)
GUI (Graphical User Interface)
helps prove hunches
OLAP (On-line analytical processing)
real time
Data Warehousing
different data in one place
centralized data
speeds up access
stronger abilities
Regression Analysis
econometric analysis
statistics
extract patterns
explanatory modeling
why someone left
predictive modeling
who will leave next
Machine Learning
extract predictive models
Applied Statistics
Pattern Recognition
KDD: Knowledge Discovery and Data Mining
application
performance improvement
robotics
computer vision