Data Preprocessing
-
Data Integration
-
Sources may include multiple databases, data cubes, or flat files
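As a rough illustration of integration, the sketch below merges records from two hypothetical sources: a database table represented as a list of dicts, and a flat file given as CSV text. The field names (`customer_id`, `name`, `city`), the sample rows, and the `integrate` helper are assumptions made for this example, not part of the original notes.

```python
import csv
import io

# Hypothetical sources: a "database" table as a list of dicts
# and a flat file as CSV text.
db_rows = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
]
flat_file = "customer_id,city\n1,Lisbon\n3,Oslo\n"


def integrate(db_rows, csv_text, key="customer_id"):
    """Merge records from both sources into one record per key value."""
    merged = {row[key]: dict(row) for row in db_rows}
    for row in csv.DictReader(io.StringIO(csv_text)):
        row[key] = int(row[key])  # CSV values are strings; align the key type
        # Create a record if the key is new, then add the flat-file fields.
        merged.setdefault(row[key], {key: row[key]}).update(row)
    return [merged[k] for k in sorted(merged)]


integrated = integrate(db_rows, flat_file)
```

Records that appear in only one source are kept with whatever fields that source provides; a real integration step would also need to reconcile conflicting values and differing schemas.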
-
Data Cleaning
Definition
Handling data that are incomplete, noisy, or inconsistent
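A minimal sketch of the three problems named above, using made-up temperature records. The plausible range, the mean-fill strategy, and the `UNIT_ALIASES` table are assumptions chosen for illustration; other strategies (dropping records, median fill, smoothing) are equally valid.

```python
from statistics import mean

# Hypothetical raw records: None marks a missing value, 950.0 is a likely
# noisy reading, and the unit labels are spelled inconsistently.
raw = [
    {"temp": 21.0, "unit": "C"},
    {"temp": None, "unit": "celsius"},
    {"temp": 23.0, "unit": "c"},
    {"temp": 950.0, "unit": "C"},
    {"temp": 22.0, "unit": "Celsius"},
]

# Assumed mapping from inconsistent spellings to one canonical label.
UNIT_ALIASES = {"c": "C", "celsius": "C"}


def clean(records, low=-50.0, high=60.0):
    """Return cleaned copies of the records (the range is an assumption)."""
    # Noisy: treat readings outside the plausible range as missing.
    temps = [
        r["temp"] if r["temp"] is not None and low <= r["temp"] <= high else None
        for r in records
    ]
    # Incomplete: fill missing readings with the mean of the valid ones.
    fill = mean(t for t in temps if t is not None)
    cleaned = []
    for r, t in zip(records, temps):
        # Inconsistent: map every unit spelling to the canonical label.
        unit = UNIT_ALIASES[r["unit"].strip().lower()]
        cleaned.append({"temp": fill if t is None else t, "unit": unit})
    return cleaned


cleaned = clean(raw)
```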
-
Data Transformation
Definition
-
A function that maps the entire set of values of a given attribute to a new set of replacement values (each old value can be identified with one of the new values)
-
Aggregation: summarization, data cube construction
-
Normalization: values are scaled to fall within a small, specified range
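The two transformations listed above can be sketched as follows. Aggregation is shown as a per-group sum, and normalization as min-max scaling, which is one common way to map values into a specified range. The function names and the sales data are hypothetical.

```python
from collections import defaultdict


def aggregate_by(records, key, value):
    """Aggregation: summarize records as one total per group."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r[value]
    return dict(totals)


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Normalization: map each old value to a new one in [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # All values are identical; map them to the lower bound.
        return [new_min for _ in values]
    return [
        (v - old_min) / (old_max - old_min) * (new_max - new_min) + new_min
        for v in values
    ]


# Hypothetical sales records.
sales = [
    {"region": "north", "amount": 100.0},
    {"region": "south", "amount": 350.0},
    {"region": "north", "amount": 100.0},
    {"region": "east", "amount": 500.0},
]

totals = aggregate_by(sales, "region", "amount")
normalized = min_max_normalize(list(totals.values()))
```

Each old total corresponds to exactly one new value, which matches the definition of a transformation as a mapping from old values to replacement values.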
-
Data Reduction
-
Why?
E.g., a database or data warehouse may store terabytes of data, so complex data analysis may take a very long time to run on the complete data set.
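One possible way to act on this motivation is simple random sampling without replacement, sketched below. The notes do not name a specific reduction technique, so the choice of sampling, the `sample_rows` helper, and the 1% fraction are all assumptions for illustration.

```python
import random


def sample_rows(rows, fraction, seed=0):
    """Return a simple random sample (without replacement) of the rows."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not rows:
        return []
    k = max(1, round(len(rows) * fraction))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(rows, k)


# Stand-in for a large data set; real data would come from a database.
full = list(range(100_000))
reduced = sample_rows(full, 0.01)
```

Analysis then runs on `reduced` instead of `full`, trading some accuracy for a much shorter running time.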