(Diega Chatbot) Recipes for building an open-domain chatbot
1 Introduction
This is an improvement on previous works because of its novel use of:
Blending Skills
Generation Strategies
This chatbot is limited as it does not solve the problem of...
Open domain conversation
2 Model Architectures
Retriever
Poly-encoder from (Humeau et al., 2019)
Generative
GPT: generative pre-trained transformer
Retrieve-Refine
3 Training Objectives
Ranking for Retrieval
Cross-entropy loss over the candidate logits, scoring the gold response against negative candidates
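The ranking objective above can be sketched in pure Python: the retriever scores each candidate, and the loss is the cross-entropy of the gold candidate under a softmax over those scores. The function name and the example scores are illustrative, not from the paper's code.

```python
import math

def retrieval_cross_entropy(scores, correct_idx):
    """Cross-entropy loss over candidate scores (logits).

    scores[correct_idx] is the retriever's score for the gold response;
    the other entries score negative candidates (e.g. other responses
    in the training batch).
    """
    max_s = max(scores)  # subtract the max for numerical stability
    log_z = max_s + math.log(sum(math.exp(s - max_s) for s in scores))
    # negative log-probability of the gold candidate under the softmax
    return log_z - scores[correct_idx]

# loss shrinks as the gold candidate's score rises above the negatives
loss = retrieval_cross_entropy([2.0, 0.5, -1.0], correct_idx=0)
```

In practice the negatives are the other gold responses in the same batch, which makes the softmax cheap to compute.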
Likelihood Training For Generation
Use Maximum Likelihood Estimation (MLE)
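Concretely, MLE training minimizes the negative log-likelihood of the gold response, token by token. A minimal sketch (the function name is illustrative; a real model would supply the per-token probabilities):

```python
import math

def sequence_nll(gold_token_probs):
    """MLE objective for generation: negative log-likelihood of the gold
    sequence, summed over the model's probability of each gold token
    given the preceding context.
    """
    return -sum(math.log(p) for p in gold_token_probs)

# e.g. the model assigns the gold tokens probabilities 0.5, 0.8, 0.9
loss = sequence_nll([0.5, 0.8, 0.9])
```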
α-blending for Retrieve and Refine
An α hyperparameter controls how often the gold label response replaces the retrieved response during training
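A minimal sketch of that blending step, assuming the retrieve-and-refine setup where a candidate response is appended to the dialogue context before generation. The function name and separator token are illustrative, not from the paper's code.

```python
import random

def build_generator_input(context, retrieved, gold, alpha, rng):
    """Retrieve-and-refine training input: append a candidate response to
    the context. With probability alpha the gold label replaces the
    retrieved response, so the generator learns both to use and to
    ignore the appended candidate rather than parroting it.
    """
    appended = gold if rng.random() < alpha else retrieved
    return context + " " + appended + " [SEP]"
```

At alpha = 0 the generator only ever sees (possibly irrelevant) retrievals; at alpha = 1 it is effectively trained to copy, which is why an intermediate value is tuned.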
Unlikelihood training for generation
A sum of a likelihood term and an unlikelihood term, where the likelihood term pushes up the probability of the gold label tokens and the unlikelihood term pushes down the probability of negative candidate tokens (e.g. over-represented n-grams)
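A per-token sketch of that combined objective. The signature is illustrative (a real implementation works on logits over the whole vocabulary); the structure — NLL of the gold token plus a penalty on the negative candidates — is the unlikelihood idea.

```python
import math

def unlikelihood_loss(probs, gold_token, negative_tokens, alpha=1.0):
    """Token-level likelihood + unlikelihood loss.

    probs: dict mapping token -> model probability at this step.
    The likelihood term raises the gold token's probability; the
    unlikelihood term lowers the probability of each token in the
    negative candidate set (e.g. tokens the model over-uses).
    """
    likelihood = -math.log(probs[gold_token])
    unlikelihood = -sum(math.log(1.0 - probs[t]) for t in negative_tokens)
    return likelihood + alpha * unlikelihood
```

With an empty negative set this reduces to plain MLE, which is why it can be mixed into ordinary training.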
4 Decoding
Beam Search
Sampling
Response Length
Minimum Length
Predictive Length
Subsequent Blocking
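Two of the decoding constraints above can be sketched directly: forcing a minimum response length by masking the end token, and subsequent (n-gram) blocking, which discards hypotheses that repeat an n-gram. Both are simplified standalone sketches; names are illustrative.

```python
def mask_eos(logits, generated_so_far, min_length, eos_id):
    """Hard minimum-length constraint: disallow the end-of-sequence
    token until at least min_length tokens have been generated,
    a standard way to force longer, less terse responses.
    """
    if generated_so_far < min_length:
        logits = list(logits)
        logits[eos_id] = float("-inf")
    return logits

def repeats_ngram(tokens, n=3):
    """Subsequent-blocking check: True if the token sequence contains
    any n-gram twice. During beam search, hypotheses failing this
    check are dropped to reduce repetition.
    """
    seen = set()
    for i in range(len(tokens) - n + 1):
        ngram = tuple(tokens[i:i + n])
        if ngram in seen:
            return True
        seen.add(ngram)
    return False
```

The same repeat check can also be run against the input context, so the model does not parrot the human's own words back.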
5 Training Details
Pre-train the ranking models
Uses the Fairseq toolkit
8 Evaluation Methods
Self-chat ACUTE-Eval
Models are used for both sides of the conversation, instead of human–model chat
ACUTE-Eval
Two questions for evaluation
Engagingness question: “Who would you prefer to talk to for a long conversation?”
Humanness question: “Which speaker sounds more human?”
6 Training Data
Pre-training
Pushshift.io Reddit
Fine-tuning using the following tasks:
ConvAI2
Focuses on personality and engaging with the other speaker
Empathetic Dialogues
Focuses on empathy
Wizard of Wikipedia
Focuses on knowledge
Blended Skill Talk
A mix of Wizard of Wikipedia, Empathetic Dialogues, and ConvAI2
7 Safety Characteristics
There are still unresolved issues: the model might learn biased or offensive dialogue from its training data
9 Related works
Lots of references to other work
10 Results and Analysis
Automatic Evaluations
These evaluations cannot be relied on by themselves to assess performance
Self Chat evaluations
Which retriever or generator strategy should be used?
Retrieve-and-Refine works best
Self-chat evaluations are preferable to the automatic evaluations
11 Released code and models
12 Discussion
Limitations:
I. Contradict or repeat themselves on occasion
II. Tend to repeat the same phrases in separate conversations
III. Hallucinate knowledge (generate statements not grounded in anything)
(Made worse by longer conversations, because the model was trained on short ones)
The bot is not well behaved
We expect bots to have more integrity than the average human (or even to be faultless), but they have much less understanding of what they are saying than humans do
With a given decoding scheme, human evaluation scores and perplexity are correlated
But lower perplexity does not by itself imply better human evaluation