Data Mining
Major Issues in Data Mining
Efficiency and Scalability
Diversity of data types
User Interaction
Data mining and society
Mining Methodology
GETTING TO KNOW YOUR DATA
Attribute Types:
Binary
Symmetric binary
Asymmetric binary
Ordinal
Nominal
Numeric
Interval-Scaled
Ratio-Scaled
Discrete Attribute
integer variables
Continuous Attribute
floating-point variables
Basic Statistical Description of Data
Measuring the central tendency
Mode
Median
Midrange
Mean
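The four central-tendency measures listed above can be computed with the Python standard library alone. This is a minimal sketch: the `central_tendency` helper and the `salaries` sample are hypothetical names and values chosen for illustration, not part of the original map.

```python
from statistics import mean, median, multimode


def central_tendency(values):
    """Return the mean, median, mode(s), and midrange of a numeric sample."""
    return {
        "mean": mean(values),
        "median": median(values),
        # multimode returns every most-frequent value, so a bimodal
        # sample yields two entries instead of raising or picking one.
        "modes": multimode(values),
        # Midrange: the average of the smallest and largest values.
        "midrange": (min(values) + max(values)) / 2,
    }


# Hypothetical sample (illustrative values only).
salaries = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]
summary = central_tendency(salaries)
print(summary)
```

Note that the single large value (110) pulls the midrange well above the median, which is one reason the median is often preferred for skewed data.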
Measuring the dispersion of data
Variance (σ²)
Standard Deviation (σ)
Range
Quantiles
Percentiles: 100-quantiles
Quartiles: 4-quantiles
Interquartile range (IQR): Q3 - Q1
Five-Number Summary
Boxplot
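The dispersion measures above can be sketched in a few lines of standard-library Python. Two assumptions are not in the original map and are labeled in the code: quartiles are computed with the `inclusive` interpolation method of `statistics.quantiles` (other methods give slightly different quartiles), and boxplot outliers are flagged with the common 1.5 × IQR convention. The `dispersion` helper and the sample values are hypothetical.

```python
from statistics import pstdev, pvariance, quantiles


def dispersion(values):
    """Return common dispersion measures for a numeric sample."""
    data = sorted(values)
    # Assumption: "inclusive" interpolation; quartile conventions vary.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: the common 1.5 * IQR rule for boxplot whiskers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "variance": pvariance(data),   # population variance, sigma squared
        "std_dev": pstdev(data),       # population standard deviation, sigma
        "range": data[-1] - data[0],
        "iqr": iqr,
        # Five-number summary: minimum, Q1, median, Q3, maximum.
        "five_number": (data[0], q1, q2, q3, data[-1]),
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Hypothetical sample (illustrative values only).
salaries = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]
result = dispersion(salaries)
print(result)
```

A boxplot draws exactly the five-number summary, with any values in `outliers` plotted as individual points beyond the whiskers.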
Graphic displays of basic statistical descriptions of data
univariate distributions
Quantile-Quantile (q-q) Plot
Histogram
Quantile Plot
bivariate distributions
Scatter Plot
Data Visualization
Visualization methods:
Pixel-oriented
Geometric projection
Visualizing complex data and relations
Multi-Dimensional View of Data Mining
Knowledge view (Data mining functions)
What Kinds of Patterns Can Be Mined?
Data Mining Functionalities:
Class/concept description
Data characterization
Summarization
Data discrimination
Comparison
Mining frequent patterns, associations, and correlations
Frequent itemsets
Frequent substructures
Frequent subsequences
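To make the idea of a frequent itemset concrete, the sketch below counts the support of every itemset up to a given size by brute force and keeps those that meet a minimum support threshold. This is an illustrative enumeration only, not an efficient algorithm such as Apriori or FP-growth; the `frequent_itemsets` function, its parameters, and the `baskets` data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Map each frequent itemset (a sorted tuple) to its relative support."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every sub-itemset of this transaction up to max_size items.
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    total = len(transactions)
    # Support = fraction of transactions that contain the itemset.
    return {
        itemset: count / total
        for itemset, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical market-basket transactions (illustrative only).
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "bread"},
]
frequent = frequent_itemsets(baskets, min_support=0.5)
print(frequent)
```

Brute-force enumeration grows exponentially with `max_size`, which is why practical miners prune candidates whose subsets are already known to be infrequent.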
Classification and regression for predictive analysis
Cluster analysis
Outlier analysis
Application view
Method view
Data view
What Kinds of Data Can Be Mined?
Database Data (Relational database)
Data Warehouses
Transactional database
Advanced datasets
Multimedia database
Text databases
Graphs, social networks data