Automatic Speech Recognition
Introduction
Process of converting an acoustic speech signal into a sequence of written words
Process
Load audio
Load list of words to be recognized
Analyse the audio
Find distinct sound characteristics
Separate the speech & background noise
Compare sound input to model based on word list
Return set of probable matches
Final answer = match with the highest probability
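
The comparison and ranking steps above can be sketched in a few lines. This is only a toy illustration, not how a production recognizer works: it assumes the audio has already been reduced to a small feature vector, and the names `rank_matches`, `word_templates`, and `input_features`, as well as all the numbers, are hypothetical.

```python
import math

def rank_matches(input_features, word_templates):
    """Score the input against every word in the list.

    Returns (word, probability) pairs sorted from most to least probable.
    """
    scores = {}
    for word, template in word_templates.items():
        # Negative squared Euclidean distance: closer templates score higher.
        scores[word] = -sum((a - b) ** 2 for a, b in zip(input_features, template))

    # Softmax turns the raw scores into probabilities that sum to 1.
    max_score = max(scores.values())
    exp_scores = {w: math.exp(s - max_score) for w, s in scores.items()}
    total = sum(exp_scores.values())
    return sorted(
        ((w, e / total) for w, e in exp_scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

# Hypothetical word list, each word with a made-up reference feature vector.
word_templates = {
    "yes": [1.0, 0.0, 0.2],
    "no": [0.0, 1.0, 0.1],
    "stop": [0.5, 0.5, 0.9],
}

# Hypothetical features extracted from the analysed audio.
input_features = [0.9, 0.1, 0.3]

matches = rank_matches(input_features, word_templates)  # set of probable matches
best_word, best_prob = matches[0]                       # final answer
```

Real systems replace the distance measure with acoustic and language models, but the shape is the same: score every candidate, rank them, and return the most probable one.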
Data Preparation: Training
Raw data
Speech corpus
Transcription
Pronunciation model
Text corpus for **G2P (grapheme-to-phoneme) & language model**
Used by the decoder when converting speech into text
Modelling
Language
To constrain the search during the decoding process
Limiting the number of possible words that need to be considered
Goal
Come up with a representation that provides the probability of a word
Language Model
A database that stores information from the modelling
Constraint
Absolutely
By enumerating some small subset of possible expansions
Has an associated grammar that is compiled down into a graph (parse tree)
Probabilistically
By computing a likelihood score for each possible successor word
Trained from a corpus
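
One common way to realise the probabilistic constraint is a bigram model, which estimates the probability of each successor word from counts in a text corpus. The sketch below is a minimal example under that assumption; the mind map does not say which model is used, and the function names and the three-sentence corpus are hypothetical.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev_word, next_word in zip(tokens, tokens[1:]):
            counts[prev_word][next_word] += 1
    return counts

def successor_probabilities(counts, prev_word):
    """Return P(next word | prev_word) as relative frequencies (no smoothing)."""
    successors = counts.get(prev_word, Counter())
    total = sum(successors.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in successors.items()}

# Hypothetical training corpus.
corpus = [
    "turn the light on",
    "turn the light off",
    "turn the radio on",
]

model = train_bigram_model(corpus)
probs = successor_probabilities(model, "the")
# "light" follows "the" twice and "radio" once, so "light" gets 2/3 and "radio" 1/3.
```

During decoding, words that never follow the previous word in the corpus receive no probability here, which is how the model limits the words that need to be considered. Practical systems add smoothing so that unseen word pairs are not ruled out entirely.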