Week 5 - Classification & Prediction (5. Attribute Selection Method
…
Week 5.3.2 Classification & Prediction: SVM, Lazy Learners & Variants
Week 5.5 Semantic Web
3. The Web Ontology Language
- The Web Ontology Language (OWL) is a modelling language (like UML) that is logical rather than specifically diagrammatic. It is expressed as logical axioms over which deductive inference is supported.
3.1 OWL Preliminaries
3.2 OWL Classes
3.3 Relating Classes and Properties
3.4 OWL Individuals
5. Semantic Web Mining
5.1 OWL Learning: Example
5.2 Refinement-based Classification
5.3 Summary
- RDF captures structure, categorical and numerical features.
- OWL captures highly expressive domain knowledge over RDF.
2. Classification
- Classification builds models that describe interesting classes of data;
- Supervised learning refers to classification in which the training data (observations, measurements, etc.) is accompanied by labels indicating the class of each observation, and new data is classified based on the training set;
- In unsupervised learning there are no class labels in the training data, and the learning algorithm must find interesting classes or groupings with which to classify new data. This is commonly called clustering: a machine learning method that, given no pre-labelled training examples, automatically classifies or groups the input data;
- So classification can also be defined as supervised learning of categorical variables;
2.2 Evaluation
Learning algorithms (or learners) that build models for classification and prediction are generally evaluated on the following criteria:
- Accuracy;
- Speed and complexity;
- Scalability;
- Robustness;
- Interpretability;
2.1 Construct and Evaluate
- Step 1: Training phase or learning step: Build a model from the labelled training set.
- Step 2: Use the model to classify unseen objects.
Data classification process:
a) Learning: Training data is analysed by a classification algorithm. Here, the class label attribute is loan_decision, and the learned model or classifier is represented in the form of classification rules.
b) Classification: Test data is used to estimate the accuracy of the classification rules. If the accuracy is considered acceptable, the rules can be applied to the classification of new, unlabelled data tuples.
The goal of statistical classification is to determine, from certain features of known samples, which known class a new sample belongs to.
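The two-step process can be sketched in a few lines of Python. This is a minimal illustration, not the method from the lecture: the rule learner simply maps each value of one attribute to its majority class, and the `income` attribute, the `risky`/`safe` labels, and all data rows are hypothetical. Only the `loan_decision` label name comes from the notes.

```python
from collections import Counter, defaultdict

def learn_rules(rows, attribute, label):
    """Step 1 (learning): map each attribute value to its majority class."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attribute]][row[label]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def classify(rules, row, attribute, default):
    """Step 2 (classification): apply the learned rules to one tuple."""
    return rules.get(row[attribute], default)

def accuracy(rules, rows, attribute, label, default):
    """Fraction of labelled test tuples that the rules classify correctly."""
    correct = sum(classify(rules, r, attribute, default) == r[label] for r in rows)
    return correct / len(rows)

# Hypothetical labelled data; "income" and the class values are assumptions.
training = [
    {"income": "low", "loan_decision": "risky"},
    {"income": "low", "loan_decision": "risky"},
    {"income": "high", "loan_decision": "safe"},
    {"income": "high", "loan_decision": "safe"},
    {"income": "medium", "loan_decision": "safe"},
]
test = [
    {"income": "low", "loan_decision": "risky"},
    {"income": "high", "loan_decision": "safe"},
    {"income": "medium", "loan_decision": "risky"},
    {"income": "high", "loan_decision": "safe"},
]

rules = learn_rules(training, "income", "loan_decision")
test_accuracy = accuracy(rules, test, "income", "loan_decision", default="safe")
print(rules)
print(test_accuracy)
```

The test set is kept separate from the training set so that the accuracy estimate reflects how the rules behave on tuples the learner has not seen.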
4. Basic, greedy, decision tree algorithm
- Greedy: it makes each decision by optimising only the next step and never backtracks to reconsider.
- Nodes are partitioned into sub-nodes based on selected attributes.
- Partitioning stops when:
- All samples for a given node belong to the same class; or
- There are no remaining attributes for further partitioning, in which case majority voting is used to classify the leaf; or
- There are no samples left.
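The greedy procedure and its three stopping conditions can be sketched as follows. This is a rough illustration under stated assumptions: the notes do not say which attribute selection measure is used, so information gain is assumed here; labelling an empty branch with the parent's majority class is also an assumption; and the `student`, `income`, and `buys` names and all data rows are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_attribute(rows, attributes, label):
    """Assumed selection measure: highest information gain."""
    base = entropy([r[label] for r in rows])
    def gain(attr):
        remainder = 0.0
        for value in {r[attr] for r in rows}:
            subset = [r[label] for r in rows if r[attr] == value]
            remainder += len(subset) / len(rows) * entropy(subset)
        return base - remainder
    return max(attributes, key=gain)

def build_tree(rows, attributes, label, domains, parent_majority=None):
    if not rows:                        # stop: no samples left
        return parent_majority          # assumption: inherit the parent's majority class
    labels = [r[label] for r in rows]
    if len(set(labels)) == 1:           # stop: all samples belong to the same class
        return labels[0]
    if not attributes:                  # stop: no attributes remain, so majority vote
        return majority(labels)
    attr = best_attribute(rows, attributes, label)   # greedy choice, never revisited
    remaining = [a for a in attributes if a != attr]
    node = {"attribute": attr, "branches": {}}
    for value in domains[attr]:
        subset = [r for r in rows if r[attr] == value]
        node["branches"][value] = build_tree(
            subset, remaining, label, domains, majority(labels))
    return node

def predict(tree, row):
    while isinstance(tree, dict):
        tree = tree["branches"][row[tree["attribute"]]]
    return tree

# Hypothetical training data.
rows = [
    {"student": "yes", "income": "high", "buys": "yes"},
    {"student": "yes", "income": "low", "buys": "yes"},
    {"student": "no", "income": "high", "buys": "no"},
    {"student": "no", "income": "high", "buys": "yes"},
    {"student": "no", "income": "high", "buys": "yes"},
    {"student": "no", "income": "low", "buys": "no"},
    {"student": "no", "income": "low", "buys": "no"},
]
domains = {"student": ["yes", "no"], "income": ["high", "low", "medium"]}
tree = build_tree(rows, ["student", "income"], "buys", domains)
print(tree)
```

In this data each stopping condition fires at least once: the `student = yes` branch is pure, the `income = high` branch under `student = no` runs out of attributes and falls back to majority voting, and the `income = medium` branch has no samples.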