COMP307 - Coggle Diagram
Probability Rules
Probability Terms
Unconditional/Prior probability (e.g. P(Cause)):
Degree of belief in a proposition in the absence of any other information.
Conditional/Posterior probability (e.g. P(Cause|Effect)):
Degree of belief in a proposition given some further information (evidence), i.e. the posterior combines the prior with the likelihood of the new evidence.
Domain
The set of values that a variable can take, e.g. {rainy, sunny, cloudy}.
Bayes Rule
Naive Bayes Classifier: applies Bayes' rule with multiple feature variables, assuming the features are conditionally independent given the class.
Zero Occurrence (Laplace smoothing): when initializing the probability table, add 1 to every count so that unseen cases do not get zero probability.
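Bayes' rule, P(C|E) = P(E|C)P(C)/P(E), plus the independence assumption gives a counting-based classifier. A minimal sketch in pure Python; the weather-style dataset and label names below are invented for illustration:

```python
from collections import defaultdict

def train_nb(examples, labels):
    """Count class occurrences and (class, feature index, value) occurrences."""
    class_counts = defaultdict(int)
    feat_counts = defaultdict(int)  # (label, i, value) -> count
    for x, y in zip(examples, labels):
        class_counts[y] += 1
        for i, v in enumerate(x):
            feat_counts[(y, i, v)] += 1
    return class_counts, feat_counts

def predict_nb(x, class_counts, feat_counts, domains):
    """Pick argmax_y P(y) * product_i P(x_i | y), with Laplace smoothing."""
    total = sum(class_counts.values())
    best, best_score = None, -1.0
    for y, cy in class_counts.items():
        score = cy / total  # prior P(y)
        for i, v in enumerate(x):
            # Laplace: add 1 to the count, add |domain| to the denominator
            score *= (feat_counts[(y, i, v)] + 1) / (cy + len(domains[i]))
        if score > best_score:
            best, best_score = y, score
    return best

# Invented toy data: (outlook, temperature) -> play/stay
examples = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "cool"), ("rainy", "mild")]
labels = ["play", "play", "stay", "stay"]
domains = [{"sunny", "rainy"}, {"hot", "mild", "cool"}]
```

Because of the +1 in every numerator, an unseen combination like ("sunny", class "stay") still gets a small nonzero probability rather than zeroing out the whole product.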
Bayesian Networks
Bayesian Network Semantics:
- A node for each variable.
- Directed edges showing the direct dependencies between nodes (indirect relations follow from chains of edges).
- A conditional probability table (CPT) for each node.
Simplified CPT:
Omit the last possible value of each distribution, as it can be derived from the others; the CPT size is then the number of free parameters.
I.e. P(!a) can be omitted when P(a) is stored, since P(!a) = 1 - P(a).
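The free-parameter count can be computed node by node: each node contributes (|domain| - 1) parameters for every combination of its parents' values. A quick sketch, using the standard Burglary/Earthquake/Alarm shape as an assumed example network:

```python
def cpt_size(domain_sizes, node_parents):
    """Total free parameters of a BN: for each node X, add
    (|dom(X)| - 1) * product of its parents' domain sizes."""
    total = 0
    for node, parents in node_parents.items():
        rows = 1
        for p in parents:
            rows *= domain_sizes[p]  # one CPT row per parent-value combination
        total += (domain_sizes[node] - 1) * rows
    return total

# Burglary and Earthquake are both parents of Alarm; all Boolean.
sizes = {"B": 2, "E": 2, "A": 2}
parents = {"B": [], "E": [], "A": ["B", "E"]}
# B contributes 1, E contributes 1, A contributes 1 * (2*2) = 4 -> total 6
```

A bad node ordering adds extra parents, which multiplies the row count, which is why compact orderings matter.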
Advantages of Bayesian Networks:
- Model encodes dependencies among all variables, readily handling situations where data entries are missing.
- Can be used to learn causal relationships.
- Bayesian statistical methods used along with networks can be used in ways to avoid data overfitting.
Building a BN:
- Choose the set of relevant variables that describe the domain
- Choose an order for the variables
- While there are variables left
- Add the next variable X
- Add connections to X from the minimal set of nodes (parents) already in the network such that the conditional independence property is satisfied
- Define the CPT for X
Goal: Build the BN with the smallest CPT size possible by choosing a good node ordering (compactness)
Inference
The basic task for any probabilistic inference system is to compute the posterior probability distribution for a set of query nodes, given values for some evidence nodes:
What is P(Burglary=true), if we know that (Alarm=true)?
Inference by Enumeration Example
- How likely is the Flu given a positive Thermometer reading (with hidden node Temperature)?
- P(Flu|therm) = α * P(Flu, therm), where α is the normalization constant
- Sum out the hidden variable: P(Flu, therm) = P(Flu, temp, therm) + P(Flu, !temp, therm)
- Expand each joint term using the product (chain) rule
- Normalization:
- P(flu|therm) = P(flu, therm) / (P(flu, therm) + P(!flu, therm))
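The enumeration above can be run directly with small CPTs. The probabilities below are invented for illustration (they are not from the course), assuming the chain Flu -> Temperature -> Thermometer:

```python
# Invented CPTs for the chain flu -> temp -> therm
P_flu = 0.1
P_temp = {True: 0.9, False: 0.2}    # P(temp=true | flu)
P_therm = {True: 0.95, False: 0.1}  # P(therm=true | temp)

def joint(flu, temp, therm):
    """Chain rule: P(flu, temp, therm) = P(flu) P(temp|flu) P(therm|temp)."""
    p = P_flu if flu else 1 - P_flu
    p *= P_temp[flu] if temp else 1 - P_temp[flu]
    p *= P_therm[temp] if therm else 1 - P_therm[temp]
    return p

def posterior_flu_given_therm():
    # Sum out the hidden variable temp, then normalize over flu.
    p_flu_therm = joint(True, True, True) + joint(True, False, True)
    p_notflu_therm = joint(False, True, True) + joint(False, False, True)
    return p_flu_therm / (p_flu_therm + p_notflu_therm)
```

The division in the last line is exactly the normalization step from the notes: it plays the role of α without ever computing P(therm) separately.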
Other Topics
Support Vector Machines (SVM)
- Works by learning a hyperplane that separates the two classes (binary classification)
- Builds on simple concepts from the Simple Perceptron
- With enough dimensions (e.g. via kernels), any dataset can be made linearly separable
- Can also be used for regression
- Good at avoiding over-fitting
- Works well on smaller datasets
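The perceptron idea these notes mention can be sketched in a few lines; a full SVM adds margin maximization and kernels on top of this. The AND-style dataset below is just an illustration:

```python
def train_perceptron(data, labels, epochs=20, lr=0.1):
    """Learn weights w and bias b for a linear decision boundary
    (a hyperplane): predict +1 if w.x + b > 0, else -1."""
    w = [0.0] * len(data[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # update weights only on mistakes
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Linearly separable AND-style data: positive only when both inputs are 1
data = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
```

Because the data is linearly separable, the perceptron converges to some separating hyperplane; unlike an SVM, it makes no attempt to pick the one with the largest margin.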
Knowledge Based Systems (KBS or Expert Systems)
- Encodes facts and heuristics of human expertise as rules:
- If <description of situation>
then <consequence>
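The if-then rule format can be executed with a tiny forward-chaining loop; the rules and facts below are invented for illustration:

```python
# Each rule: (set of condition facts, consequence fact)
rules = [
    ({"has_fever", "has_cough"}, "suspect_flu"),
    ({"suspect_flu"}, "recommend_rest"),
]

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose conditions are all known facts,
    adding its consequence, until nothing new can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, consequence in rules:
            if conditions <= facts and consequence not in facts:
                facts.add(consequence)
                changed = True
    return facts
```

Note how the second rule fires off the conclusion of the first: chaining rules like this is what lets an expert system derive consequences several steps removed from the observed facts.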
Natural Language Processing
- Morphological processing: Find each word's root form and category, e.g. handling tenses, plural nouns, etc.
- Syntax processing: Parse sequence of words into proper structure.
- Semantic processing: Assign meanings to syntax structures.
- Discourse processing: Interpret sentence in context of discourse. E.g. Detecting sarcasm.
- Pragmatic analysis: Reinterpret structure to determine what was meant.
Data Mining and Knowledge Discovery in Databases (KDD)
- The process of extracting knowledge from large amounts of data (Big Data)
Big Data
- Collecting large amounts of data
- This data can be analysed to support better decision making
Volume (size, e.g. TBs or GBs of data)
Deep Learning
- Multiple layers of nonlinear processing units
- Learns feature representations & transformations in each layer
- Sufficiently complex models for difficult problems
- Requires a large number of training instances and substantial training
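"Multiple layers of nonlinear processing units" means each layer applies a linear map followed by a nonlinearity; a minimal forward pass (the weights are arbitrary illustrative numbers, and no training is shown):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer: a linear combination then a nonlinear activation per unit."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x, layers):
    for weights, biases in layers:
        x = layer(x, weights, biases)  # each layer re-represents its input
    return x

# Two hidden units feeding one output unit (arbitrary weights)
net = [
    ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
```

Without the sigmoid the stacked layers would collapse into one linear map; the nonlinearity at each layer is what makes depth add expressive power.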
Planning And Scheduling
Classical Planning
- Deterministic
- Static
- Finite
- Fully observable
- Restricted goals
- Implicit time
Scheduling
Dispatching Rule
- Considers only the earliest applicable actions
- Intuitive
- Better than FSS in dynamic scheduling, because of FSS's search cost.
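A dispatching rule picks the next job from the currently waiting ones using a simple priority rather than searching the whole schedule space. A sketch of the shortest-processing-time (SPT) rule on a single machine; the job data is invented:

```python
def spt_schedule(jobs):
    """Dispatch jobs one at a time, always picking the waiting job with the
    shortest processing time (the SPT dispatching rule)."""
    waiting = dict(jobs)  # job name -> processing time
    order, clock = [], 0
    while waiting:
        name = min(waiting, key=waiting.get)  # cheapest applicable action now
        clock += waiting.pop(name)
        order.append((name, clock))  # record (job, completion time)
    return order

# Invented jobs: name -> processing time
jobs = [("A", 5), ("B", 2), ("C", 8), ("D", 1)]
```

Each decision is local and cheap, which is why rules like this scale to dynamic settings where jobs keep arriving; the trade-off is that no global optimality is guaranteed.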
Routing
Further improvements can be made using genetic algorithms, tabu search, etc.