Automatic Sarcasm Detection: A Survey APPROACHES (RULE-BASED (Veale and…

- - - - design pattern-based features that indicate the presence of discriminative patterns extracted from a large sarcasm-labeled corpus. To let the classifiers spot generalized patterns, these pattern-based features take real values based on three situations: exact match, partial overlap and no match.
    - - use sentiment lexicon-based features. In addition, pragmatic features like emoticons and user mentions are also used.
    - - introduce features related to ambiguity, unexpectedness, emotional scenario, etc. Ambiguity features cover structural, morpho-syntactic, semantic ambiguity, while unexpectedness features measure semantic relatedness.
    - - use a set of patterns, specifically positive verbs and negative situation phrases, as features for a classifier (in addition to a rule-based classifier).
    - - introduce bigrams and trigrams as features.
    - - explore skip-gram and character n-gram-based features.
    - - include seven sets of features. Some of these are maximum/minimum/gap of intensity of adjectives and adverbs, max/min/average number of synonyms and synsets for words in the target text, etc. Apart from a subset of these,
    - - use frequency and rarity of words as indicators.
    - - incorporate ellipsis, hyperbole and imbalance in their set of features.
    - - use features corresponding to the linguistic theory of incongruity. The features are classified into two sets: implicit and explicit incongruity-based features.
    - - use word-shape and pointedness features given in the form of 24 classes.
    - - use extensions of words, number of flips and readability features, in addition to others.
    - - present features that measure semantic relatedness between words using WordNet-based similarity.
    - - introduce POS sequences and semantic imbalance as features. Since they also experiment with Chinese datasets, they use language-typical features such as the use of homophony and the use of honorifics.
    - - conduct additional experiments with human annotators in which they record the annotators' eye movements. Based on these eye movements, they design a set of gaze-based features such as average fixation duration, regression count and skip count. In addition, they use complex gaze-based features based on saliency graphs, which connect the words in a sentence with edges representing saccades between the words.
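As a rough illustration of the skip-gram and character n-gram features mentioned in this branch, the sketch below counts both kinds of units for a single text. It is a minimal sketch under stated assumptions, not the implementation of any surveyed paper: the function names `char_ngrams` and `skip_bigrams`, the whitespace tokenization and the `max_skip` parameter are all hypothetical choices made for the example.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def skip_bigrams(tokens, max_skip=2):
    """Count ordered token pairs with at most max_skip tokens skipped between them.

    With max_skip=0 this reduces to ordinary bigrams.
    """
    counts = Counter()
    for i, left in enumerate(tokens):
        # The partner token may sit up to max_skip + 1 positions to the right.
        last = min(i + max_skip + 2, len(tokens))
        for j in range(i + 1, last):
            counts[(left, tokens[j])] += 1
    return counts


# Assumed toy input; whitespace tokenization is a simplification.
tokens = "oh how lovely".split()
char_features = char_ngrams("great", n=3)
skip_features = skip_bigrams(tokens, max_skip=1)
```

The resulting counters can be turned into sparse feature vectors for whichever classifier is used; how the surveyed work weights or filters these counts is not stated here.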
  - - - use SVM with SMO and logistic regression. A chi-squared test is used to identify discriminating features.
    - - use Naive Bayes and SVM. They also show the Jaccard similarity between labels and features.
    - - compare rule-based techniques with an SVM-based classifier.
    - - use the balanced winnow algorithm in order to determine high-ranking features.
    - - use Naive Bayes and decision trees for multiple pairs of labels among irony, humor, politics and education.
    - - use binary logistic regression.
    - - use SVM-HMM in order to incorporate the sequential nature of output labels in a conversation.
    - - compare several classification approaches, including bagging and boosting, and show results on five datasets.
    - - experimentally validate that, for conversational data, sequence labeling algorithms perform better than classification algorithms. They use SVM-HMM and SEARN as the sequence labeling algorithms.
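The chi-squared test and the Jaccard similarity between labels and features mentioned in this branch can be sketched for the simplest case. The example below assumes binary features (present or absent in a text) and binary labels (sarcastic or not); the surveyed papers may compute these statistics differently, and the function names `contingency`, `chi_squared` and `jaccard` are hypothetical.

```python
def contingency(feature, labels):
    """Return 2x2 counts (a, b, c, d) for a binary feature against binary labels.

    a: feature present, sarcastic      b: feature present, not sarcastic
    c: feature absent, sarcastic       d: feature absent, not sarcastic
    """
    a = b = c = d = 0
    for f, y in zip(feature, labels):
        if f and y:
            a += 1
        elif f:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_squared(feature, labels):
    """Pearson chi-squared statistic for a 2x2 table (no continuity correction)."""
    a, b, c, d = contingency(feature, labels)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A constant feature or constant label carries no signal.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def jaccard(feature, labels):
    """Jaccard similarity between texts having the feature and texts labeled sarcastic."""
    a, b, c, _ = contingency(feature, labels)
    union = a + b + c
    return a / union if union else 0.0


# Assumed toy data: four texts, the first two labeled sarcastic.
labels = [1, 1, 0, 0]
perfect = [1, 1, 0, 0]      # fires exactly on the sarcastic texts
uninformative = [1, 0, 1, 0]  # fires equally often in both classes
```

Ranking features by either score and keeping the top ones is one common way to use such statistics for feature selection; the cut-off would be a tuning choice.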