Detecting Sarcasm in Text: An Obvious Solution to a Trivial Problem
Information
Recent advances in natural language sentence generation research have seen increasing interest in measuring the negativity and positivity of the sentiment of words and phrases.
However, the accuracy and robustness of results are often degraded by untruthful, sarcastic sentiments, and this problem is often left untreated.
Sarcasm detection is an important process for filtering noisy data (in this case, sarcastic sentences) out of the training inputs used for natural language sentence generation.
Design a machine learning algorithm for sarcasm detection in text, leveraging and improving upon the work of Mathieu Cliche of www.thesarcasmdetector.com.
Reference
Analysis of social media has attracted much interest in NLP research over the past decade (Ptáček et al., 2014).
Dataset and Features
Baseline Model Description
Support vector machine (SVM) as implemented by the LinearSVC class from scikit-learn, a popular open-source machine learning library in Python.
Aside from a value of 0.1 for the penalty parameter C, all other configuration options are left at their defaults.
Features are extracted from the raw Twitter data to create training examples that are fed into the SVM to create a hypothesis model.
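As a minimal sketch of this setup, assuming the standard scikit-learn API (`X_train`, `y_train`, and `X_test` are placeholder names for the extracted feature matrix and sarcasm labels):

```python
# Baseline classifier sketch: LinearSVC with C=0.1, defaults otherwise.
from sklearn.svm import LinearSVC

clf = LinearSVC(C=0.1)             # only C deviates from the defaults
clf.fit(X_train, y_train)          # train on the extracted tweet features
predictions = clf.predict(X_test)  # predict sarcastic vs. non-sarcastic
```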
The tweets were collected over a span of several months in 2014. The sanitization process included removing all hashtags, non-ASCII characters, and http links.
In addition, each tweet is tokenized, stemmed, and lowercased using the Python NLTK library.
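A sketch of this preprocessing, assuming NLTK's word tokenizer and Porter stemmer as stand-ins for the authors' exact choices:

```python
import re
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

def sanitize(tweet):
    tweet = re.sub(r'http\S+', '', tweet)             # strip http links
    tweet = re.sub(r'#\w+', '', tweet)                # strip hashtags
    tweet = tweet.encode('ascii', 'ignore').decode()  # drop non-ASCII characters
    # tokenize, lowercase, and stem each remaining token
    return [stemmer.stem(tok.lower()) for tok in word_tokenize(tweet)]
```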
Baseline Features
For each tweet, features that are hypothesized to be crucial to sarcasm detection are extracted. The features fall broadly into five categories: n-grams, sentiments, parts of speech, capitalization, and topics.
N-grams
Individual tokens (i.e., unigrams) and bigrams are placed into a binary feature dictionary.
Bigrams are extracted using the same NLTK library and are defined as pairs of words that typically go together, such as "artificial intelligence" or "peanut butter".
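A simplified sketch of the binary n-gram features, treating every adjacent token pair as a bigram rather than filtering to true collocations as the description suggests:

```python
from nltk import bigrams

def ngram_features(tokens):
    # presence/absence dictionary over unigrams and bigrams
    features = {'unigram_' + tok: True for tok in tokens}
    for w1, w2 in bigrams(tokens):
        features['bigram_' + w1 + '_' + w2] = True
    return features
```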
Sentiments
Each tweet is split into two parts and, separately, into three parts.
Sentiment scores are calculated using two libraries (SentiWordNet and TextBlob)
Positive and negative sentiment scores are collected for the overall tweet as well as for each individual part. Furthermore, the contrast between the parts is inserted into the features.
SentiWordNet
TextBlob
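A sketch of the contrast idea using TextBlob's single polarity score as a stand-in for separate positive/negative scores; splitting at the midpoint is an assumption about how the two parts are formed:

```python
from textblob import TextBlob

def sentiment_features(text):
    words = text.split()
    mid = len(words) // 2
    first = TextBlob(' '.join(words[:mid])).sentiment.polarity
    second = TextBlob(' '.join(words[mid:])).sentiment.polarity
    return {
        'whole': TextBlob(text).sentiment.polarity,
        'first_half': first,
        'second_half': second,
        'contrast': abs(first - second),  # large when sentiment flips mid-tweet
    }
```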
Parts of Speech
The parts of speech in each tweet are counted and inserted into the features.
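A sketch using NLTK's default tagger, with one count feature per part-of-speech tag observed:

```python
from collections import Counter
import nltk

def pos_features(tokens):
    tags = [tag for _, tag in nltk.pos_tag(tokens)]
    return {'pos_' + tag: count for tag, count in Counter(tags).items()}
```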
Capitalizations
A binary flag indicating whether the tweet contains at least four tokens that begin with a capital letter is inserted into the features.
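A sketch of this flag, assuming it is computed on the raw tokens before lowercasing:

```python
def capitalization_feature(raw_tokens):
    capitalized = sum(1 for tok in raw_tokens if tok[:1].isupper())
    return {'many_capitalized': capitalized >= 4}
```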
Topics
The Python library gensim, which implements topic modeling using latent Dirichlet allocation (LDA), is used to learn the topics.
The collection of topics for each tweet is then inserted into the features.
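A sketch of the topic features with gensim; the number of topics and the use of per-topic probabilities as feature values are assumptions:

```python
from gensim import corpora, models

# tokenized_tweets: list of token lists from the preprocessing step
dictionary = corpora.Dictionary(tokenized_tweets)
corpus = [dictionary.doc2bow(toks) for toks in tokenized_tweets]
lda = models.LdaModel(corpus, num_topics=20, id2word=dictionary)

def topic_features(tokens):
    bow = dictionary.doc2bow(tokens)
    return {'topic_%d' % t: p for t, p in lda.get_document_topics(bow)}
```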
Methods and Analysis
Analysis of Baseline Model
Initial analysis of the baseline model quickly reveals that the testing error far exceeds the training error. The large gap between training and testing error suggests that the model is suffering from high variance, i.e., overfitting.
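The diagnosis itself is a simple comparison, sketched here with scikit-learn's accuracy metric and the baseline clf from above:

```python
from sklearn.metrics import accuracy_score

train_acc = accuracy_score(y_train, clf.predict(X_train))
test_acc = accuracy_score(y_test, clf.predict(X_test))
print('train accuracy %.3f, test accuracy %.3f' % (train_acc, test_acc))
# a large train/test gap indicates high variance (overfitting)
```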
Model Improvement Methods
Naive Bayes
One-Class SVM
Gaussian Kernel
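These three candidates correspond to stock scikit-learn models; a minimal sketch with illustrative (assumed) hyperparameters:

```python
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import OneClassSVM, SVC

nb = MultinomialNB()                       # Naive Bayes over binary/count features
ocsvm = OneClassSVM(kernel='rbf', nu=0.5)  # fit on a single class, e.g. ocsvm.fit(X_sarcastic)
rbf_svm = SVC(kernel='rbf', C=0.1)         # SVM with a Gaussian (RBF) kernel
```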
Targeted Areas for Improvement
We should also test whether adding sentiments improves our classification by a significant margin. While it is true that some sarcastic sentences contain words with negative sentiments alongside words with positive sentiments, many other sentences do not have this property. Thus, adding this feature might not be useful.
Moreover, non-sarcastic sentences can still have both positive and negative sentiments.
The high testing error of the model at hand implies that we are fitting noise. This could be caused by the high dimensionality of the feature space; another possibility is that some features are not relevant for detecting sarcasm.
In both cases, we think it is important to reduce the dimension of the feature space and keep only relevant features. For instance, the benefit of adding features such as bigrams, sentiments, and topics is not clear; bigrams might have much the same effect as unigrams.
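One way to act on this, sketched with scikit-learn's chi-squared feature selection (k=1000 is an illustrative assumption; chi2 requires non-negative features, which the binary n-gram features satisfy):

```python
from sklearn.feature_selection import SelectKBest, chi2

selector = SelectKBest(chi2, k=1000)
X_train_small = selector.fit_transform(X_train, y_train)  # keep the k most relevant features
X_test_small = selector.transform(X_test)
```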
We also think that computing the sentiments of each training example is slow: the sentiment of each word must be looked up in a dictionary, which takes a lot of time.
We also want to investigate the topics that are added. Topic modeling using LDA might return words similar to the unigrams of the training example, in which case we end up with redundant information.
However, we think that categorizing the training examples into a set of topics can be useful in a different way than it is currently used. Instead of adding topics as a separate feature, we might split our classifier into n classifiers, where n is the number of topics in the training set. In other words, we build one classifier per topic, as sketched below.
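A sketch of this per-topic idea, reusing the dictionary and lda objects from the topic-feature sketch; the routing rule (dominant topic) and the training_examples triples are assumptions:

```python
from collections import defaultdict
from sklearn.svm import LinearSVC

def dominant_topic(tokens):
    bow = dictionary.doc2bow(tokens)
    topics = lda.get_document_topics(bow, minimum_probability=0.0)
    return max(topics, key=lambda tp: tp[1])[0]  # most probable topic id

# training_examples: hypothetical (feature_vector, label, tokens) triples
grouped = defaultdict(list)
for vector, label, tokens in training_examples:
    grouped[dominant_topic(tokens)].append((vector, label))

classifiers = {}
for topic, examples in grouped.items():
    X, y = zip(*examples)
    classifiers[topic] = LinearSVC(C=0.1).fit(X, y)  # one classifier per topic
```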
Meta
Author
Chun-Che Peng
Mohammad Lakis
Jan Wei Pan
Year
2015
Goals
Result
Both Naive Bayes and the one-class SVM misclassify most of the sarcastic data.
We also see that accuracy depends greatly on a mixture of feature types. Unigrams and bigrams alone are insufficient for designing an accurate classifier; when combined with other feature types such as topic modeling, accuracy increases greatly.
We found more questions than answers, but that in and of itself is a small step in the right direction.