IEEE Computational Intelligence Magazine - November 2019 - 79

dealing with tweets is the sheer volume of the
data, which calls for Big Data solutions. To
address this issue, our approach uses
Apache Spark and its Machine Learning
Library (MLlib).
The remainder of this paper is organized as follows: Section 2 describes
state-of-the-art work within the domain
of irony, sarcasm, and figurative language
detection. Section 3 gives a brief introduction to the methodologies used by
the proposed approach for extracting
semantic features. Section 4 presents the
proposed framework. Section 5 introduces the datasets used for the tests and how
they were obtained and processed,
while Section 6 presents the experimental setup and the results obtained. Finally,
Section 7 concludes the paper and discusses future directions.
2. Related Work

With the adoption of social networks,
users post their comments, opinions, and
emotions online. This trend creates new
challenges [11], [12] and opportunities for
analyzing such text in order to detect sentiments and emotions [1]-[4], [13].
2.1. Hybrid Approaches

These approaches combine statistical
methods with knowledge
bases. The work in [1] is one of the first to
target the problem of SA using statistical
approaches on top of pre-processed textual data. Later, the rapid growth of
Semantic Web techniques greatly
affected SA methods: by taking
semantics into account, they improved on
the results of classical statistical
approaches [14]. For example, in [2], [3],
[15] the authors proposed an approach
based on the neo-Davidsonian assumption that events and situations are the
primary entities for contextualizing
opinions. This made it possible to distinguish the
holders, main topics, and sub-topics of
an opinion by employing a machine
reader tool that leverages NLP and
Knowledge Representation components
jointly with cognitively-inspired frames.
Another example that leverages frame
semantics and lexical resources for
SA was published in [16], where the
authors employed frame semantics and
conceptual information (BabelNet synsets) detected by Framester to extract
semantic features from social media for
polarity detection, showing a remarkable
improvement in F-measure when using
semantics. This paper extends that study to the problem of figurative language
detection in social media.
The above-mentioned studies that employ semantics use Sentic Computing [17]
techniques to bridge the gap between
statistical NLP and linguistics, common-sense
reasoning, and affective computing; furthermore, they enable the analysis of text not only at the document, page, or
paragraph level, but also at the sentence,
clause, and concept level.

Figurative language such as irony complicates opinion
mining, attributing a negative meaning to positive
statements. This work shows that semantic frames may
be important clues in determining the subjectivity of a
text, important for figurative language detection.
2.2. Methods for Irony Detection

Irony and the theory of its use in
human interaction have been discussed
in detail in [18], which treats irony as a sophisticated and complex
mode of communication and describes
irony markers and the motivations that
speakers have for using irony. Usually, the reasons for using irony lie
in its social and rhetorical functions,
whereas the function of irony markers
is to make its processing simpler.
Several attempts to address this problem have been made; in particular,
a challenge on the polarity
detection of tweets containing figurative language was presented at the
prestigious SemEval 2015 workshop.
Fifteen teams participated in the task, and a
total of 35 runs were submitted.
The best reported system achieved a
score of 0.758 using the cosine similarity measure and a score of 2.117
using the mean squared error (MSE). Across systems, scores ranged from 0.059 to 0.758 for
cosine similarity (higher is better) and from 11.274 to
2.117 for MSE (lower is better).
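
To make the two evaluation measures concrete, the following minimal sketch computes the cosine similarity and the MSE between a vector of predicted polarity scores and a vector of gold scores. The function names and the toy score values are assumptions introduced here for illustration only; they are not taken from the task's official scorer or data.

```python
import math


def cosine_similarity(predicted, gold):
    """Cosine of the angle between two equal-length score vectors."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("score vectors must be non-empty and of equal length")
    dot = sum(p * g for p, g in zip(predicted, gold))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_g = math.sqrt(sum(g * g for g in gold))
    if norm_p == 0.0 or norm_g == 0.0:
        # Assumption: an all-zero vector is treated as having no similarity.
        return 0.0
    return dot / (norm_p * norm_g)


def mean_squared_error(predicted, gold):
    """Average squared difference between predicted and gold scores."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("score vectors must be non-empty and of equal length")
    return sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical polarity scores for four tweets (illustrative values only).
gold = [-3.0, -2.0, 1.0, -4.0]
predicted = [-2.0, -2.0, 0.0, -5.0]

cos = cosine_similarity(predicted, gold)
mse = mean_squared_error(predicted, gold)
print(f"cosine similarity: {cos:.3f}")
print(f"MSE: {mse:.3f}")
```

Cosine similarity rewards predictions whose overall direction agrees with the gold scores regardless of scale, whereas MSE penalizes the magnitude of each individual error, which is why the two measures can rank systems differently.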

A similar challenge on irony polarity
detection was proposed for the Italian language at SENTIPOLC, indicating a growing interest in irony detection
in the international NLP community.
Similar challenges that do not directly involve
an irony detection task, but in which irony
detection may prove useful, have
also been organized for French (DEFT2015)
and Spanish (TASS2015).
Note that the use of figurative language can be specific to each
language. The authors of [19] investigated the
automatic detection of irony and humor
in social networks. They proposed a rich
set of features for text interpretation and
representation to train classification procedures. Decision trees were employed,
using as features the lexical and semantic information that characterizes each word rather
than the words themselves. The
features took into account frequency,
written/spoken differences, sentiments,
ambiguity, intensity, and synonymy.
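
The idea of describing a tweet by properties of its words, rather than by the words themselves, can be sketched as follows. This is only an illustration under stated assumptions: the three toy lexicons, their values, the tokenization, and the aggregation functions (mean, maximum, and max-min gap) are hypothetical and do not reproduce the feature set of [19]. The resulting feature dictionary is the kind of input a decision tree learner could be trained on.

```python
from statistics import mean

# Hypothetical toy lexicons (illustrative values, not real resources).
WORD_FREQUENCY = {"i": 1.0, "just": 0.9, "love": 0.8, "mondays": 0.3}
WORD_SENTIMENT = {"love": 0.9, "mondays": -0.4}
WORD_SENSES = {"i": 1, "just": 8, "love": 10, "mondays": 1}  # ambiguity proxy

PROPERTY_NAMES = ("frequency", "sentiment", "ambiguity")


def word_properties(word):
    """Look up the properties of one word; unknown words get neutral defaults."""
    return {
        "frequency": WORD_FREQUENCY.get(word, 0.0),
        "sentiment": WORD_SENTIMENT.get(word, 0.0),
        "ambiguity": WORD_SENSES.get(word, 1),
    }


def tweet_features(text):
    """Aggregate word-level properties into a fixed-size feature dictionary."""
    # Naive tokenization: split on whitespace and strip common punctuation.
    words = [w.strip(".,!?").lower() for w in text.split()]
    props = [word_properties(w) for w in words if w]
    features = {}
    for name in PROPERTY_NAMES:
        values = [p[name] for p in props] or [0.0]
        features[f"{name}_mean"] = mean(values)
        features[f"{name}_max"] = max(values)
        # A large gap between extremes may hint at contrast within the tweet.
        features[f"{name}_gap"] = max(values) - min(values)
    return features


features = tweet_features("I just love Mondays!")
for key, value in sorted(features.items()):
    print(key, value)
```

Because the features summarize properties such as sentiment spread and ambiguity, two tweets with entirely different vocabularies can still receive similar representations.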
Other methods that attempt to detect irony and humor automatically are discussed in [20], [21], where the authors
identify figurative uses of language. In
particular, the authors considered features that represent different types of patterns in a text, such as ambiguity,
polarity, unexpectedness, and emotional
content; these represent low- and high-level
properties of figurative language
based on formal linguistic elements. The patterns were evaluated on a corpus of
50k tweets. The research showed that
all the features together provide a useful
linguistic inventory for detecting humor
and irony at the textual level.
ContextuAl SarCasm DEtector
(CASCADE) [22] performs context and





