Data Quality Problems, Data Preprocessing Techniques
Outliers
Inconsistent or extreme values (very low or very high)
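One common way to flag very low or very high values is the interquartile range (IQR) rule. The sketch below is an illustration only: the diagram does not name a detection method, and the function name `find_outliers_iqr`, the 1.5 multiplier, and the sample readings are all assumptions.

```python
from statistics import quantiles

def find_outliers_iqr(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles.

    Assumption: the IQR rule with k = 1.5 is just one conventional
    choice; the source does not prescribe a detection method.
    """
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical readings with one very high and one very low value.
readings = [10, 12, 11, 13, 12, 11, 95, 10, 12, -40]
outliers = find_outliers_iqr(readings)
print(outliers)
```

Whether a flagged value is an error or a genuine extreme still has to be judged against the domain.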
Data Preprocessing Techniques
2.2 Data Cleaning
Filling Missing Values
Use a global constant (e.g. zero)
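A minimal sketch of filling with a global constant, assuming missing entries are represented as `None`. The function name `fill_with_constant` and the sample data are hypothetical.

```python
from typing import Optional, Sequence

def fill_with_constant(values: Sequence[Optional[float]],
                       constant: float = 0.0) -> list[float]:
    """Replace every missing entry (None) with the same constant."""
    return [constant if v is None else v for v in values]

# Hypothetical column with two missing entries.
ages = [23.0, None, 31.0, None, 45.0]
filled = fill_with_constant(ages)
print(filled)  # [23.0, 0.0, 31.0, 0.0, 45.0]
```

The constant is simple to apply, but it can distort statistics such as the mean, so it is worth checking that the chosen value cannot be mistaken for a real measurement.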
2.3 Data Reduction
Dimensionality Reduction
Benefits:
- eliminate irrelevant features and noise.
- reduce time and space.
- easily visualize data.
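As one simple illustration of eliminating irrelevant features, the sketch below drops columns whose values never vary. This is a basic feature filter chosen for brevity, not a method named in the diagram. The function name `drop_constant_features`, the threshold, and the data are assumptions.

```python
from statistics import pvariance

def drop_constant_features(rows, threshold=1e-9):
    """Keep only columns whose population variance exceeds threshold.

    Assumption: a (near-)constant column carries no information for
    telling records apart, so it is treated as irrelevant here.
    """
    columns = list(zip(*rows))  # transpose rows into columns
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, keep

# Hypothetical data: the middle column is identical in every row.
rows = [
    [1.0, 5.0, 0.2],
    [2.0, 5.0, 0.4],
    [3.0, 5.0, 0.1],
]
reduced, kept = drop_constant_features(rows)
print(kept)     # [0, 2]
print(reduced)  # [[1.0, 0.2], [2.0, 0.4], [3.0, 0.1]]
```

With fewer columns, later steps need less time and space, and two remaining features can be plotted directly.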
Numerosity Reduction
Non-parametric methods
Sampling: represents the data by a smaller random sample.
Sampling is effective if:
- the sample approximately preserves the property of interest in the original data.
- using the sample works almost as well as using the entire data set.
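The idea can be sketched with simple random sampling without replacement, taking the mean as the property of interest. The population, the sample size, the seed, and the name `simple_random_sample` are all assumptions made for the example.

```python
import random
from statistics import mean

def simple_random_sample(data, k, seed=None):
    """Draw k distinct items uniformly at random, without replacement."""
    rng = random.Random(seed)  # private generator so the draw is repeatable
    return rng.sample(data, k)

# Hypothetical population of 10,000 values, reduced to 500.
population = list(range(1, 10_001))
sample = simple_random_sample(population, k=500, seed=42)

# Compare the property of interest (here, the mean) on both sets.
print(len(sample), mean(population), mean(sample))
```

If the two means are close, the sample can stand in for the full data set for analyses that depend on that property.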
2.4 Data Transformation
Smoothing
Includes binning, regression, and clustering.
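Of the three techniques, binning is the easiest to sketch. The example below assumes equal-frequency bins and smoothing by bin means, which is only one variant (bin medians or bin boundaries are alternatives). The function name `smooth_by_bin_means` and the price data are illustrative.

```python
from statistics import mean

def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size items,
    and replace each value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed

# Illustrative prices: three bins of three values each.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_bin_means(prices, bin_size=3)
print(smoothed)  # bin means are 9, 22 and 29
```

Larger bins smooth more aggressively, at the cost of hiding more local detail.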