NLP

Classify

Extract

Summarize

Extractive

Abstractive

Zipf's law

Bag-of-Words (BOW model)

It is just count vectorization: each document is represented by the counts of its words, ignoring word order.

Milk is good and not expensive
Milk is expensive and not good
The BOW model treats both sentences as carrying the same information, since they contain exactly the same words.
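A minimal sketch of this limitation in Python, using plain collections.Counter as a stand-in for a count vectorizer and the two example sentences above:

from collections import Counter

def bow(sentence):
    # Count each lower-cased token; word order is discarded entirely.
    return Counter(sentence.lower().split())

a = bow("Milk is good and not expensive")
b = bow("Milk is expensive and not good")

# Both sentences contain the same words with the same counts,
# so their bag-of-words representations are identical.
print(a == b)  # True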

Sequence Modeling

n-grams

Hidden Markov Model

Conditional Random fields

Conventional Neural Nets

Why Probability

Bayes rule:
\( P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \)


where P(A) is the prior probability of A, P(B) is the prior probability of B, P(A|B) is the posterior probability of A given B, and P(B|A) is the likelihood of B given A.
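A small worked example with made-up numbers: suppose P(spam) = 0.2, P("free" | spam) = 0.5, and P("free") = 0.15. Then

\( P(\text{spam} \mid \text{"free"}) = \frac{P(\text{"free"} \mid \text{spam})\,P(\text{spam})}{P(\text{"free"})} = \frac{0.5 \times 0.2}{0.15} \approx 0.67 \)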

p(the lady is beautiful) > p(beautiful the is lady)

\( p(w_i) = \frac{C(w_i)}{\sum_{w\in Vocab}C(w) } \)
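A minimal Python sketch of this unigram maximum-likelihood estimate (the toy corpus is an assumption for illustration):

from collections import Counter

corpus = "the lady is beautiful the lady is kind".split()  # toy corpus
counts = Counter(corpus)
total = sum(counts.values())  # sum of C(w) over the vocabulary

# p(w_i) = C(w_i) / sum of C(w) over the vocabulary
unigram_prob = {w: c / total for w, c in counts.items()}
print(unigram_prob["the"])  # 2/8 = 0.25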

Perplexity score
Perplexity is used to measure how confused the model is by the given text. It is the inverse probability of the test set normalized by the number of words, so its value is always at least 1. The lower the perplexity score, the better the model.
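In symbols, the standard definition for a test sequence of N words is

\( PP(W) = P(w_1 w_2 \dots w_N)^{-\frac{1}{N}} = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_1 \dots w_{i-1})}} \)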

Divide the data into 3 standard sections

Training

Heldout

Testing

Smoothing

Smoothing

Backoff

Class-based models

Laplace smoothing

Add-k smoothing
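A sketch of add-k smoothing for unigram counts (Laplace smoothing is the special case k = 1); the toy counts and the pretend unseen word are assumptions for illustration:

from collections import Counter

def add_k_prob(word, counts, vocab_size, k=1.0):
    # P_add-k(w) = (C(w) + k) / (N + k * |V|)
    total = sum(counts.values())
    return (counts[word] + k) / (total + k * vocab_size)

counts = Counter("the lady is beautiful the lady is kind".split())
vocab_size = len(counts) + 1  # pretend one word ("ugly") was never seen
print(add_k_prob("the", counts, vocab_size))   # seen word: 3/14
print(add_k_prob("ugly", counts, vocab_size))  # unseen word still gets mass: 1/14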

Interpolation
Mixes n-gram models of different (lower) orders, e.g. 4-gram, trigram & unigram
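A minimal sketch of simple linear interpolation over trigram, bigram, and unigram estimates; the lambda weights and component probabilities are illustrative, and in practice the weights are tuned on held-out data (e.g. with a method like Nelder–Mead):

def interpolated_prob(p_trigram, p_bigram, p_unigram,
                      lambdas=(0.6, 0.3, 0.1)):
    # Weights must sum to 1 so the result is still a probability.
    l3, l2, l1 = lambdas
    return l3 * p_trigram + l2 * p_bigram + l1 * p_unigram

# Example with made-up component probabilities for one word in context.
print(interpolated_prob(p_trigram=0.02, p_bigram=0.10, p_unigram=0.25))
# 0.6*0.02 + 0.3*0.10 + 0.1*0.25 = 0.067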

Kneser-Ney Smoothing
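For reference, the interpolated Kneser-Ney estimate for a bigram, with absolute discount d, has the standard form

\( P_{KN}(w_i \mid w_{i-1}) = \frac{\max(C(w_{i-1} w_i) - d,\ 0)}{C(w_{i-1})} + \lambda(w_{i-1})\, P_{\text{continuation}}(w_i) \)

where \( P_{\text{continuation}}(w_i) \) is proportional to the number of distinct contexts \( w_i \) has appeared in, and \( \lambda(w_{i-1}) \) is a normalizing weight.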

Nelder–Mead method

Splitting dataset

Training

Heldout

Testing

to allow hyperparameters to be experimented with

Discriminative models

Mutual Information

Information Gain

Entropy

amount of uncertainty in a distribution
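A minimal Python sketch of entropy as the amount of uncertainty in a distribution (the example distributions are assumptions for illustration):

import math

def entropy(probs):
    # H(p) = -sum_x p(x) * log2 p(x), measured in bits.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit: maximally uncertain coin flip
print(entropy([0.9, 0.1]))  # ~0.47 bits: mostly predictable
print(entropy([1.0]))       # 0.0 bits: no uncertainty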

Logistic Regression


\( \sigma(z) = \frac{1}{1 + e^{-z}} \)

Loss function: cross-entropy

Optimization algorithm: gradient descent
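A minimal NumPy sketch tying these three pieces together (sigmoid, cross-entropy loss, gradient descent); the toy data, learning rate, and iteration count are assumptions for illustration:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary classification data: 4 examples, 2 features.
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [3.0, 0.0]])
y = np.array([0, 0, 1, 1])

w = np.zeros(2)
b = 0.0
lr = 0.5

for _ in range(1000):
    p = sigmoid(X @ w + b)  # predicted probabilities
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))  # cross-entropy (for monitoring)
    grad_w = X.T @ (p - y) / len(y)  # gradient of the loss w.r.t. w
    grad_b = np.mean(p - y)
    w -= lr * grad_w                 # gradient descent step
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))  # probabilities close to [0, 0, 1, 1]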

Support Vector Machine (SVM)


Sequence Tagging

POS Tagging

Named Entity tagging/Named Entity Recognition (NER):

Dialogue Act tagging

noun, verb, pronoun, preposition, adjective, adverb, conjunction, article
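A quick POS-tagging sketch with NLTK (assumes nltk is installed and its tokenizer/tagger resources have been downloaded; the example sentence is illustrative):

import nltk

# One-time downloads of the tokenizer and tagger models
# (resource names may vary across NLTK versions).
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

sentence = "John ate a vegetarian pizza in London"
tokens = nltk.word_tokenize(sentence)
print(nltk.pos_tag(tokens))
# e.g. [('John', 'NNP'), ('ate', 'VBD'), ('a', 'DT'), ('vegetarian', 'JJ'), ...]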

Semantics

First Order Logic Semantics

Logical symbols

Non-logical symbols

Quantifiers

e.g. John, Mary, Vegetarian, Food

A model consists of the following elements

  1. Domain: a set of individuals/symbols
  2. Properties: properties that hold of individuals in the domain
  3. Relations: relations that hold among individuals in the domain
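An illustrative first-order formula using the example symbols above, plus a hypothetical two-place relation Eats:

\( \forall x\, (Vegetarian(x) \Rightarrow \exists y\, (Food(y) \wedge Eats(x, y))) \)

i.e. "every vegetarian eats some food". Here Vegetarian and Food are one-place properties, Eats is a relation, and John and Mary are constants denoting individuals in the domain.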

Higher Order Logic

The Lambda Notation
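A standard example of how lambda notation builds meanings compositionally (the predicate Loves is illustrative): \( \lambda x.\, Loves(x, Mary) \) denotes the property of loving Mary; applying it to the constant John and beta-reducing gives \( Loves(John, Mary) \).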