Effective Feature Integration for Automated Short Answer Scoring
Sakaguchi et al. 2015
Features
Reference based
BLEU
word2vec cosine
word2vec alignment
WordNet alignment
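As a rough illustration of the "word2vec cosine" feature, the sketch below averages word vectors for the response and for the reference answer and takes their cosine. The averaging step, the zero fallback for out-of-vocabulary input, and the `embeddings` dictionary (a stand-in for a trained word2vec model) are assumptions; these notes do not record how the paper composes the vectors.

```python
import math

def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding (assumed composition)."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def w2v_cosine(response_tokens, reference_tokens, embeddings):
    """Dense reference-based feature: similarity of response to reference answer."""
    r = mean_vector(response_tokens, embeddings)
    ref = mean_vector(reference_tokens, embeddings)
    if r is None or ref is None:
        return 0.0  # assumed fallback when no token is in the vocabulary
    return cosine(r, ref)

# Hypothetical 2-dimensional embeddings, for illustration only.
toy_embeddings = {"boy": [1.0, 0.0], "girl": [0.0, 1.0]}
score = w2v_cosine(["the", "boy"], ["boy"], toy_embeddings)
```

Each reference-based metric yields one continuous value per response, which is why these features are dense, in contrast to the sparse response-based indicators.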
Response based
sparse binary indicators of linguistic features
binned response length (e.g. the length-7 feature fires when the response contains 128-255 characters)
word n-grams from n = 1 to 2
character n-grams from n = 2 to 5
syntactic dependencies in the form of Parent-Label-Child (e.g. boy-det-the for “the boy”)
semantic roles in PropBank style (e.g. say.01-A0-boy for “(the) boy said”)
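A minimal sketch of three of the response-based indicators listed above: the length bin, word n-grams, and character n-grams. It assumes the length bin is floor(log2(length)), which is consistent with the length-7 example (128-255 characters) but is not stated in these notes. The lowercasing, whitespace tokenization, and feature-name prefixes are also assumptions. Dependency and semantic-role features need a parser and are left out.

```python
def length_bin(response):
    """Binned length feature; floor(log2(n)) puts 128..255 characters in bin 7."""
    n = len(response)
    return f"length-{n.bit_length() - 1}" if n > 0 else "length-empty"

def word_ngrams(tokens, n_min=1, n_max=2):
    """Word n-grams for n = n_min..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def char_ngrams(text, n_min=2, n_max=5):
    """Character n-grams for n = n_min..n_max."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]

def response_features(response):
    """Set of sparse binary indicators; a feature is 'on' if it is in the set."""
    text = response.lower()   # assumed normalization
    tokens = text.split()     # assumed whitespace tokenization
    feats = {length_bin(response)}
    feats.update("w=" + g for g in word_ngrams(tokens))
    feats.update("c=" + g for g in char_ngrams(text))
    return feats

feats = response_features("The boy")
```

Representing a response as a set of feature names maps directly onto a sparse binary vector once the names are indexed over the training data.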
Model (supervised)
Stacking
Level 1) SVR with sparse response-based features
Level 2) SVR with dense reference-based features + the "response-based prediction" feature from Level 1 (a single continuous value)
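The two-level setup can be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the linear kernel, the default hyperparameters, and the use of out-of-fold predictions to train Level 2 are choices made here, and `fit_stacked` / `predict_stacked` are hypothetical helper names.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVR

def fit_stacked(X_sparse, X_dense, y, folds=5):
    """Train Level 1 on response-based features, Level 2 on dense features + Level 1 output.

    X_sparse may be a dense array or a scipy sparse matrix; SVR accepts both.
    """
    level1 = SVR(kernel="linear")  # assumed kernel
    # Out-of-fold predictions, so Level 2 is not trained on scores that
    # Level 1 produced for examples it had already seen (an assumption).
    oof = cross_val_predict(level1, X_sparse, y, cv=folds)
    level1.fit(X_sparse, y)

    level2 = SVR(kernel="linear")
    # The Level 1 prediction enters Level 2 as one extra continuous column.
    level2.fit(np.column_stack([X_dense, oof]), y)
    return level1, level2

def predict_stacked(level1, level2, X_sparse, X_dense):
    """Score new responses with the stacked model."""
    p1 = level1.predict(X_sparse)
    return level2.predict(np.column_stack([X_dense, p1]))
```

Collapsing the sparse features into a single continuous value means Level 2 only has to weight a handful of dense inputs, which fits the observation in the results that stacking helps most when training data is limited.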
Results
Each similarity metric on its own does not always improve performance much over the baseline (i.e., the response length bin features)
Incorporating all the similarity features gave a substantial gain on all 4 questions, but response-based features still performed better
The performance of all models increased as training data grew
the models with response-based features outperform those with just reference-based features
the stacked model tended to outperform the other models for cases where the number of training examples was very limited :warning:
stacking enables learning better feature weights than a simple combination when the feature set contains a mixture of sparse as well as dense features, particularly for smaller data sizes
Room for improvement
To explore a more sophisticated model where the regression models in different layers are trained simultaneously by back-propagating the error of the upper layer, as in neural networks
To explore using AdaBoost?