machine learning workflow
model
algorithm
training data
test data
features
target variable
observations
supervised learning
regression
classification
labeled
unsupervised learning
unlabeled
clustering
patterns
garbage in garbage out
overfitting
memorisation
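The difference between training data and test data, and why memorisation is not the same as learning, can be sketched with a toy hold-out split. This is only an illustration: the `train_test_split` helper, the 80/20 ratio and the made-up observations are assumptions, not part of the original notes.

```python
import random

def train_test_split(observations, test_fraction=0.2, seed=0):
    """Shuffle observations and hold out a fraction as test data (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(observations)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Made-up labeled observations: (feature, target variable).
observations = [(x, 2 * x + 1) for x in range(10)]
train, test = train_test_split(observations)

# A "model" that only memorises the training data as a lookup table.
memorised = dict(train)

# It is perfect on data it has seen and useless on data it has not.
train_hits = sum(memorised.get(x) == y for x, y in train)
test_hits = sum(memorised.get(x) == y for x, y in test)
print(f"train: {train_hits}/{len(train)}, test: {test_hits}/{len(test)}")
```

Scoring on held-out test data is what exposes the overfit model here; scoring on the training data alone would not.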
exploratory analysis
data cleaning
feature engineering
algorithm selection
model training
data wrangling
restructure data format
preprocessing
no idea
ensembling / combine models
data exploration
overview
summary stats
visualisation
number of observations
number of features
features: numerical or categorical?
any target variable?
histograms
outliers
binary
median
frequency
numeric features
barplots
categorical features
sparse classes
can cause overfitting
box plots
median
min max
relationship between categorical and numeric features
correlations
heatmaps
positive
negative
python seaborn
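The exploration checklist above could be sketched roughly as follows. This assumes pandas is available; the column names (`sqft`, `price`, `roof`), the data, the sparse-class threshold and the `plot_overview` helper are all made up for illustration. The plotting helper assumes seaborn and matplotlib are installed and is defined but not called.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "sqft": [850, 900, 1200, 1500, 5000, 1100],
    "price": [200, 210, 300, 360, 1200, 270],
    "roof": ["shingle", "shingle", "slate", "shingle", "metal", "shingle"],
})

# Overview: number of observations and number of features.
n_obs, n_features = df.shape

# Summary stats for numeric features (count, mean, min, median as "50%", max).
numeric_summary = df.describe()

# Frequency per class of a categorical feature, which is what a bar plot shows.
class_counts = df["roof"].value_counts()

# Classes with very few observations are sparse (a threshold of 2 is an assumption).
sparse_classes = class_counts[class_counts < 2].index.tolist()

# Relationship between a categorical and a numeric feature, as a box plot would show.
median_price_by_roof = df.groupby("roof")["price"].median()

# Pairwise correlations between numeric features, the input to a heatmap.
corr = df[["sqft", "price"]].corr()

def plot_overview(frame, corr_matrix):
    """Optional plots (hypothetical helper); needs seaborn and matplotlib."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    frame.hist()  # histograms of the numeric features
    plt.figure()
    sns.countplot(data=frame, x="roof")  # bar plot of class frequencies
    plt.figure()
    sns.boxplot(data=frame, x="roof", y="price")  # numeric feature by category
    plt.figure()
    sns.heatmap(corr_matrix, annot=True, vmin=-1, vmax=1)  # correlations
    plt.show()
```

Calling `plot_overview(df, corr)` would draw the four plot types named in the notes; the 5000 sqft row is there so that an outlier shows up in the histogram and box plot.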
data cleaning
typos
inconsistent capitalisation
classes that are supposed to be under the same category
missing data
category
label as missing
numeric
fill with null or zero?
outliers
warning: innocent until proven guilty
duplicates
combine datasets
scraping
irrelevant
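A few of the cleaning steps above might look like this in pandas. It is a sketch under assumptions: the columns and values are made up, and filling missing numeric values with 0 after flagging them is only one possible answer to the "null or zero?" question, not a recommendation from the notes.

```python
import pandas as pd

# Made-up raw data for illustration only.
raw = pd.DataFrame({
    "roof": ["Shingle", "shingle", "shingel", None, "slate", "slate"],
    "sqft": [850.0, 900.0, None, 1500.0, 1100.0, 1100.0],
})

# Duplicates: the last row repeats the one before it.
clean = raw.drop_duplicates().copy()

# Typos and capitalisation: map variants onto a single class.
clean["roof"] = clean["roof"].str.lower().replace({"shingel": "shingle"})

# Missing categorical data: label it as its own class.
clean["roof"] = clean["roof"].fillna("Missing")

# Missing numeric data: flag it first, then fill (filling with 0 is an assumption).
clean["sqft_missing"] = clean["sqft"].isna().astype(int)
clean["sqft"] = clean["sqft"].fillna(0)

print(clean)
```

The indicator column keeps the fact that a value was missing, so the fill value does not silently pass for a real measurement.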
feature
domain knowledge
interaction feature
multiply
divide
add
subtract
group sparse classes
what are features?
encode categories as binary (dummy) variables
remove ids
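The feature engineering ideas above can be illustrated with a small pandas sketch. The columns (`id`, `beds`, `baths`, `siding`), the data and the sparse-class threshold are assumptions chosen for the example.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "id": [101, 102, 103, 104, 105],
    "beds": [2, 3, 3, 4, 2],
    "baths": [1, 2, 1, 3, 1],
    "siding": ["brick", "brick", "wood", "stucco", "brick"],
})

# Interaction features: combine two features arithmetically.
df["beds_x_baths"] = df["beds"] * df["baths"]
df["beds_per_bath"] = df["beds"] / df["baths"]

# Group sparse classes: classes seen fewer than 2 times become "Other"
# (the threshold is an assumption).
counts = df["siding"].value_counts()
sparse = counts[counts < 2].index
df["siding"] = df["siding"].where(~df["siding"].isin(sparse), "Other")

# Encode categories as binary (dummy) variables.
df = pd.get_dummies(df, columns=["siding"], dtype=int)

# Remove ids: an identifier carries no signal for the model.
features = df.drop(columns=["id"])

print(features)
```

Grouping the sparse classes before creating dummy variables keeps the number of binary columns small, which fits the note that sparse classes can cause overfitting.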
algorithms
linear
limited
tree
regularisation
model coefficient
absolute size of coefficients
square of coefficients
feature shrinkage
feature selection
combination of both
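The penalties listed above are conventionally written as follows. The notes do not name the methods or fix any notation, so the names (Lasso, Ridge, Elastic-Net) and the symbols are assumptions here: $L(\beta)$ is the unpenalised loss, $\beta_j$ the model coefficients, $\lambda \ge 0$ the penalty strength and $\alpha \in [0, 1]$ the mixing weight.

```latex
\begin{aligned}
\text{Lasso (L1):} \quad & \min_{\beta}\; L(\beta) + \lambda \sum_{j=1}^{p} |\beta_j| \\
\text{Ridge (L2):} \quad & \min_{\beta}\; L(\beta) + \lambda \sum_{j=1}^{p} \beta_j^{2} \\
\text{Elastic-Net:} \quad & \min_{\beta}\; L(\beta) + \lambda \Big( \alpha \sum_{j=1}^{p} |\beta_j| + (1 - \alpha) \sum_{j=1}^{p} \beta_j^{2} \Big)
\end{aligned}
```

Penalising the absolute size can push some coefficients exactly to zero, which acts as feature selection. Penalising the square shrinks coefficients toward zero without removing them, which is feature shrinkage. The combined penalty does some of each.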