Accents and Languages Planning
Accents
Data
Accented data set (buy one?)
Common Voice
existing Scottish/English DB
Combilex lexicon
TTS
limited voices
How do we use untranscribed data?
unsupervised pre-training followed by supervised training
get more transcribed
need more volume
100 hours per accent (too low for specific accent models)
200+ for good models
send more through pipegood?
Transcriber team not great at identifying or transcribing accents
voice cloning, convert our data
Simon briefly experimented with this and it didn't work
Which ones?
survey
clustering?
speaker classification for accents
What samples are we bad at and why?
infrastructure
augmenting base model vs accent specific models
experiment
LM and AM
LM more important with specific models
pre-processing pipeline to remove accent?
Simon says we can't do fMLLR in real time: it needs to know end of sentence, speaker info, etc.
autoencoder to remove accent "noise"
pronunciation model improvement
map to British pronunciations
different phoneme set
need to re-map OOVs
i-vector training
accented samples and do training
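A rough sketch of how the data-volume notes above (100 hours per accent is too low for accent-specific models, 200+ for good models) could be turned into a quick audit of what we have. The function name, the label strings, and the "borderline" band between the two thresholds are assumptions for illustration; the notes only give the two numbers.

```python
from typing import Dict

TOO_LOW_HOURS = 100.0  # from the notes: too low for accent-specific models
GOOD_HOURS = 200.0     # from the notes: 200+ for good models


def data_readiness(hours_by_accent: Dict[str, float]) -> Dict[str, str]:
    """Label each accent by how many transcribed hours it has."""
    labels = {}
    for accent, hours in hours_by_accent.items():
        if hours >= GOOD_HOURS:
            labels[accent] = "good"
        elif hours > TOO_LOW_HOURS:
            # Assumption: the notes do not say how to treat this range.
            labels[accent] = "borderline"
        else:
            labels[accent] = "too_low"
    return labels
```

Calling it with made-up figures such as `data_readiness({"scottish": 250, "irish": 150, "indian": 100})` would flag which accents still need more transcribed audio before an accent-specific model is worth trying.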
languages
Which ones?
How do we know which model to use?
locale-based for V1
User setting
Japanese
Start identifying languages
UK English
Internal build
build a new language
new punctuator
new training data
subword units
audio data
lexicon/lexicographers
transcribers
new textproc, metrics
we have to support it going forward
DATA!
use contractors to help accelerate data processing
External APIs
do they exist for the languages we want to pursue?
would we be able to train on our domain/audio?
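A minimal sketch of the V1 model-selection idea from the languages branch (pick by locale, let a user setting override it). The model names, the locale table, and the fallback to a "base" model are all hypothetical placeholders, not existing identifiers.

```python
from typing import Optional

# Hypothetical locale-to-model table; real model names are not in the notes.
MODELS_BY_LOCALE = {
    "en-GB": "uk_english",
    "ja-JP": "japanese",
}
DEFAULT_MODEL = "base"  # assumption: fall back to the existing base model


def choose_model(locale: Optional[str], user_setting: Optional[str] = None) -> str:
    """Pick a model: an explicit user setting wins, then locale, then the default."""
    if user_setting in MODELS_BY_LOCALE.values():
        return user_setting
    if locale:
        # Normalize forms like "en_gb" to "en-GB" before the lookup.
        lang, _, region = locale.replace("_", "-").partition("-")
        key = f"{lang.lower()}-{region.upper()}" if region else lang.lower()
        if key in MODELS_BY_LOCALE:
            return MODELS_BY_LOCALE[key]
    return DEFAULT_MODEL
```

For example, `choose_model("en_GB")` resolves to the UK English entry, while `choose_model("en-GB", user_setting="japanese")` honours the user's choice. Automatic language identification would slot in as another step before the default.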