IEEE Computational Intelligence Magazine - August 2019 - 38
[31]. These dense clusters were considered
social media events. In another work, a
system named DYNDENS was developed which quantified the magnitude of
change based on updates in edge weights.
The system incrementally computed
dense subgraphs to detect event stories
[32]. DYNDENS is efficient and scalable
to rapidly evolving datasets. Although the
detection methods proposed in [31], [32]
remain efficient despite rapid changes in
microblog streams, they fail to detect
single-entity events.
Traditional event detection methods
are not designed to process and detect
events efficiently from such dynamic
data, particularly when the data stream is
noisy and consists of diverse events. In
addition, most of the state-of-the-art
approaches depend on highly weighted
and frequent patterns to detect events
[23], [25]-[27]. These approaches overlook
the way bursty patterns dominate
small events in the data.
The proposed approach differs from
existing approaches because it highlights
dominating patterns at an early stage in
the text stream and handles post-event
effects by suppressing those patterns in
the subsequent time interval, which provides an opportunity to discover new
emerging events. Figure 5 visualizes the
pre-event, event, and post-event graphs to
show the characteristics of the proposed
approach. Instead of focusing on burstiness, we considered the change in temporal
frequency with respect to time, which
we named displaced temporal frequency.
It captured the change in the frequencies of words appearing in the text stream at
an early stage and later suppressed their
burstiness to highlight other topics.
These characteristics are an inherent
part of the proposed approach and
lead to better performance in the
event detection process.
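
The idea of tracking the change in word frequency between time intervals, rather than raw burstiness, can be illustrated with a minimal sketch. It assumes that displaced temporal frequency is simply the difference between a word's count in the current interval and its count in the previous one; the exact definition used by the proposed approach may differ. The function names and the toy stream are hypothetical and serve only as illustration.

```python
from collections import Counter


def word_frequencies(documents):
    """Count how often each word occurs in one time interval."""
    counts = Counter()
    for doc in documents:
        counts.update(doc.lower().split())
    return counts


def displaced_frequency(previous, current):
    """Per-word change in frequency between two consecutive intervals.

    Assumption: the change is a plain difference of counts.
    """
    words = set(previous) | set(current)
    return {w: current.get(w, 0) - previous.get(w, 0) for w in words}


# Toy stream: one list of short messages per time interval (hypothetical data).
intervals = [
    ["quiet match so far", "quiet first half"],
    ["goal chelsea goal", "what a goal"],
    ["goal replay", "red card shown", "red card for liverpool"],
]

freqs = [word_frequencies(docs) for docs in intervals]
changes = [displaced_frequency(a, b) for a, b in zip(freqs, freqs[1:])]
```

In this toy stream, "goal" shows a large positive change when it first emerges and a negative change in the following interval, even though it is still being mentioned. Under the stated assumption, this is what allows newer words such as "red" and "card" to stand out after a dominant event.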
5. Conclusion
In this paper, we presented a novel, sensitive and efficient Weighted Dynamic
Heartbeat Graph (WDHG) method to
detect events from a text stream. The
text stream was systematically transformed into a series of temporal graphs.
These graphs inherited temporal frequencies
and co-occurrence relationships of the words appearing in the text
stream. Each graph was further used to
extract a heartbeat score using two features: growth factor and aggregated centrality. A rule-based classifier labeled the
graphs as event candidates. Multiple
event candidates were merged together
to extract a list of ranked topics. For the
performance evaluation of the proposed
approach, three benchmarks (FA Cup,
Super Tuesday, and the US Election)
were used. The quantitative evaluation
showed that the proposed approach outperformed the state-of-the-art methods.
The empirical evaluation showed that
the proposed approach is computationally efficient and scalable. In the future,
we plan to explore user participation
and social network based features, as
well as test the proposed approach on
live text streams.
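
The transformation of a text stream into a series of temporal graphs, as summarized above, can be sketched roughly as follows. This is a simplified illustration, not the WDHG implementation. It assumes that each time interval yields one graph, that node weights are word frequencies, and that edge weights count the messages in which two words co-occur. The function name and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_graph(documents):
    """Build one co-occurrence graph for a single time interval.

    Returns (nodes, edges): nodes maps word -> frequency, edges maps an
    alphabetically ordered word pair -> number of messages containing both.
    """
    nodes = Counter()
    edges = Counter()
    for doc in documents:
        words = doc.lower().split()
        nodes.update(words)
        # Each unordered pair of distinct words is counted once per message.
        for u, v in combinations(sorted(set(words)), 2):
            edges[(u, v)] += 1
    return nodes, edges


# Toy stream: one list of short messages per time interval (hypothetical data).
stream = [
    ["goal chelsea", "goal chelsea drogba"],
    ["red card", "red card liverpool"],
]

# One graph per interval gives the series of temporal graphs.
graphs = [build_graph(docs) for docs in stream]
```

Scoring each graph (for example, with a growth factor and an aggregated centrality) and labeling event candidates would operate on this series; those steps are omitted here because their formulas are not restated in this section.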
