Introduction
What it is?
Knowledge discovery in databases (KDD) refers to the efficient process of searching through large volumes of raw data in databases to find potentially useful information that is implicitly embedded in the data. Data mining is an integral step of KDD that discovers hidden patterns from an input data set.
- Non-trivial, implicit information: not the raw data, nor the result of a simple data summary
The Background
Computerisation of operations in commercial, governmental and scientific organisations has resulted in large volumes of operational data, e.g.
- Itemised telephone bills
- Bank statements
- Supermarket transactions
- Share prices
- Scientific experimental data sets
- Published web pages
- CCTV video footage
Consequences:
Many commercial database management systems (DBMSs) are not equipped with data comprehension and analysis tools.
We may be data rich, but information poor.
Data Mining Objectives
Classification
Using existing data to form a classification model, then using the model to assign an appropriate class label to a data record (e.g. safe vs. risky)
Estimation
Similar to classification, but assigns a value to an output variable of a data record (e.g. estimated house value)
Prediction
Similar to classification and estimation, but more concerned with the future outcome of the output variable (e.g. tomorrow’s weather)
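The difference between classification and estimation can be sketched with a toy nearest-neighbour model. This is only an illustration under stated assumptions: the records, the attribute choices, and the helper names `distance`, `classify` and `estimate` are all invented here, and nearest-neighbour is just one of many possible modelling techniques.

```python
# Toy records, invented for illustration only.
# (income in thousands, existing debt in thousands) -> known class label
labelled = [
    ((60, 5), "safe"),
    ((55, 10), "safe"),
    ((20, 30), "risky"),
    ((25, 40), "risky"),
]

# (floor area in square metres, bedrooms) -> known price in thousands
priced = [
    ((50, 1), 150.0),
    ((80, 2), 220.0),
    ((120, 4), 340.0),
]


def distance(a, b):
    """Squared Euclidean distance between two numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(record, training):
    """Classification: return the class label of the nearest known record."""
    _, label = min(training, key=lambda pair: distance(pair[0], record))
    return label


def estimate(record, training, k=2):
    """Estimation: average the output value of the k nearest known records."""
    nearest = sorted(training, key=lambda pair: distance(pair[0], record))[:k]
    return sum(value for _, value in nearest) / len(nearest)


print(classify((22, 35), labelled))  # a class label
print(estimate((90, 3), priced))     # a numeric value
```

Both functions build on existing data in the same way; only the kind of output differs (a label versus a number). Prediction would use the same machinery on records whose output lies in the future.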
Data, Information and Knowledge
Data (D)
- Isolated factual recording of separate objects and events
- Enables the recording of seen events
Information (I)
- Facts in a meaningful context, represented by relationships between isolated data items
- Enables responding to seen events
Knowledge (K)
- Verified, known information that is incorporated into the business process
- Enables the anticipation of unseen events
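One way to picture the three levels is with a handful of supermarket baskets. The sketch below is an assumption-laden illustration, not a prescribed method: the baskets, the 0.6 threshold, and the `suggest` helper are all hypothetical.

```python
# Data: isolated records of separate events (invented baskets).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "jam"},
    {"milk"},
]

# Information: a relationship between data items, here the fraction of
# bread baskets that also contain butter.
with_bread = [b for b in baskets if "bread" in b]
confidence = sum("butter" in b for b in with_bread) / len(with_bread)


# Knowledge: the relationship, once checked against a threshold the business
# accepts (0.6 is an arbitrary assumption), becomes a rule applied to baskets
# that have not been seen before.
def suggest(basket, threshold=0.6):
    """Return items to suggest for a new basket, based on the verified rule."""
    if "bread" in basket and "butter" not in basket and confidence >= threshold:
        return ["butter"]
    return []


print(round(confidence, 2))
print(suggest({"bread", "eggs"}))
```

The baskets only record what happened, the confidence figure describes a relationship among them, and the rule inside `suggest` is what lets the business act on a basket it has not yet seen.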