Computational Intelligence - February 2014 - 38

Table 1 Details of the cybercrime corpora and latent topics.

                               Twitter   Online Forums     Total
# cybercrime messages           28,114          25,695    53,809
# cybercrime sentences          87,153         131,045   218,198
# messages with two users        5,112           3,294     8,406
# annotated messages             4,283           2,830     7,113
# collaborative messages         1,045             972     2,017
# transactional messages           314             677       991
# ambiguous messages             2,924           1,181     4,105
# selected topics
# transactional concepts
# collaborative concepts

cybercriminal relationship is undefined. The relationship classification
threshold of $0.2 \times 10^{-8}$ was empirically established based
on a subset of our cybercrime message corpus. Alternatively,
the proposed inferential language model can perform a ranking-based classification of the relationship labels of messages.
In that case, messages are ranked according to their probabilities of having a specific relationship label, so a
binary classification threshold is not needed.
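
The two classification modes described above can be contrasted in a minimal sketch. It assumes each message already carries a per-label probability from the language model; the function names, the message ids, and the probability values are hypothetical, and treating a message as undefined when no label clears the threshold is our reading of the text rather than a confirmed detail of the system.

```python
from typing import Dict, List, Optional, Tuple

REL_THRESHOLD = 0.2e-8  # threshold value quoted in the text


def classify_by_threshold(label_probs: Dict[str, float],
                          threshold: float = REL_THRESHOLD) -> Optional[str]:
    """Return the most probable label if it clears the threshold.

    Returning None stands for an undefined relationship (an assumption).
    """
    label, prob = max(label_probs.items(), key=lambda kv: kv[1])
    return label if prob > threshold else None


def rank_messages(scores: Dict[str, Dict[str, float]],
                  label: str) -> List[Tuple[str, float]]:
    """Rank message ids by their probability for one label, highest first."""
    ranked = ((mid, probs.get(label, 0.0)) for mid, probs in scores.items())
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical model outputs for three messages.
scores = {
    "m1": {"transactional": 3.0e-8, "collaborative": 1.0e-9},
    "m2": {"transactional": 5.0e-10, "collaborative": 1.5e-9},
    "m3": {"transactional": 1.0e-9, "collaborative": 4.0e-8},
}

thresholded = {mid: classify_by_threshold(p) for mid, p in scores.items()}
ranking = rank_messages(scores, "transactional")
```

In the ranking mode no cut-off is applied; a downstream user would simply inspect the top of the list for the label of interest.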
After relationship classification, a frequency count $Rel_j(v_x, v_y)$
for a specific type of relationship $j \in \{\text{transactional}, \text{collaborative}\}$
is computed for each pair of cybercriminals
$(v_x, v_y)$. These frequency values are then subject to a linear normalization $Rel_{nor} = \frac{Rel - Rel_{min}}{Rel_{max} - Rel_{min}}$ to
produce the final relationship scores for all the pairs. If a pair $(v_x, v_y)$
has both a transactional relationship and a collaborative relationship
at the same time (i.e., $Rel_{tran}(v_x, v_y) > 0$ and $Rel_{coll}(v_x, v_y) > 0$), our
system simply assigns the more specific transactional relationship to
the pair. Finally, a cybercriminal network is composed based on the
identified relationships among all the valid pairs pertaining to a
specific period.
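
The counting, normalization, and tie-breaking steps above can be sketched as follows. This is an illustration under stated assumptions, not the system's implementation: pairs are treated as unordered, normalization is applied separately per relationship type, and a degenerate case where all counts are equal is mapped to 1.0. All function names and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def count_relationships(labeled: Iterable[Tuple[str, str, str]]) -> Dict[str, Counter]:
    """Count, per relationship type, how often each pair was labeled."""
    counts = {"transactional": Counter(), "collaborative": Counter()}
    for u, v, label in labeled:
        if label in counts:
            # Assumption: (u, v) and (v, u) denote the same pair.
            counts[label][tuple(sorted((u, v)))] += 1
    return counts


def min_max_normalize(freqs: Dict[Pair, int]) -> Dict[Pair, float]:
    """Apply Rel_nor = (Rel - Rel_min) / (Rel_max - Rel_min)."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    if hi == lo:
        # Assumption: the source does not say how to handle equal counts.
        return {pair: 1.0 for pair in freqs}
    return {pair: (f - lo) / (hi - lo) for pair, f in freqs.items()}


def build_network(labeled: Iterable[Tuple[str, str, str]]) -> Dict[Pair, Tuple[str, float]]:
    """Return one labeled, scored edge per pair; transactional wins ties."""
    counts = count_relationships(labeled)
    scores = {j: min_max_normalize(c) for j, c in counts.items()}
    edges = {}
    for pair in set(counts["transactional"]) | set(counts["collaborative"]):
        label = "transactional" if counts["transactional"][pair] > 0 else "collaborative"
        edges[pair] = (label, scores[label][pair])
    return edges


# Hypothetical classified messages: (user, user, relationship label).
messages = [
    ("a", "b", "transactional"), ("b", "a", "collaborative"),
    ("a", "b", "transactional"), ("c", "d", "transactional"),
    ("a", "c", "collaborative"), ("a", "c", "collaborative"),
    ("a", "c", "collaborative"), ("b", "c", "collaborative"),
]
edges = build_network(messages)
```

Note that min-max normalization maps the least frequent pair of each type to a score of 0, so a score of 0 still denotes an observed relationship.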
4. System Evaluation

To the best of our knowledge, a benchmark data set for the
evaluation of cybercriminal network mining algorithms is not
yet available at the time of this writing. To evaluate the effectiveness of the proposed cybercriminal network mining
method, we first needed to retrieve cybercrime-related messages from online social media. For the construction of our
evaluation corpora, we made use of two kinds of social media
sources, namely microblogs and online forums. We accessed
the largest microblog service, Twitter, and a dozen online
forums (e.g., hacktalk.net, blackhatworld.com, and pastebin.com)
to develop two cybercrime-related corpora. We manually
identified a list of 35 well-known cybercriminals as the seed
users. Then, we used these seed users as the starting points to
perform breadth-first crawling to retrieve the messages posted
by other cybercrime suspects. As for Twitter, we retrieved
the relevant tweets via a publicly available API called Topsy (http://topsy.com/).
For instance, for the well-known cybercriminal group Anonymous (also known as FawkesSecurity on Twitter), which claimed to
be responsible for a series of cyber-attacks against HSBC in
October 2012, our crawler first identified all the followers of
this account, and then invoked an internal filter to extract all
cybercrime-related tweets from different followers or friends.
For online forums, our crawler identified the related user
accounts that appeared in the same thread of messages or
were directly embedded in the message contents. A total of 28,114
cybercrime-related messages covering the period from January
2009 to December 2012 were retrieved from Twitter in January 2013. In addition, a total of 25,695 cybercrime-related messages of the same period were retrieved from various Internet
forums. To distinguish cybercrime-related messages from ordinary online chatting, our internal filter simply utilized a list of
21 common cybercrime keywords to determine the nature of
each conversational message. These keywords were provided by
a group of six cyber-security experts who were employees
of a cyber-security consulting firm.
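
The crawling procedure described above, breadth-first expansion from seed users followed by a keyword filter, can be sketched roughly as below. The sketch runs over in-memory dictionaries instead of a live API; the keyword list, the depth limit, the user names, and the messages are all hypothetical, since the article lists neither the 21 keywords nor the crawl depth.

```python
import re
from collections import deque
from typing import Dict, List, Set, Tuple

# Hypothetical stand-ins for the experts' 21 keywords.
KEYWORDS: Set[str] = {"ddos", "botnet", "exploit"}


def is_cybercrime_related(text: str, keywords: Set[str] = KEYWORDS) -> bool:
    """Keep a message if any of its lowercase tokens is a listed keyword."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    return not tokens.isdisjoint(keywords)


def crawl(seeds: List[str],
          neighbors: Dict[str, List[str]],
          posts: Dict[str, List[str]],
          max_depth: int = 2) -> List[Tuple[str, str]]:
    """Breadth-first crawl from the seed users, collecting filtered messages."""
    visited = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    collected = []
    while queue:
        user, depth = queue.popleft()
        for message in posts.get(user, []):
            if is_cybercrime_related(message):
                collected.append((user, message))
        if depth < max_depth:  # assumed stopping rule
            for nxt in neighbors.get(user, []):
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append((nxt, depth + 1))
    return collected


# Hypothetical follower graph and posts.
neighbors = {"seed": ["u1", "u2"], "u1": ["u3"], "u2": [], "u3": ["u4"]}
posts = {
    "seed": ["New DDoS tool released"],
    "u1": ["lunch was great"],
    "u2": ["selling botnet access"],
    "u3": ["fresh exploit for sale"],
    "u4": ["botnet rental"],
}
collected = crawl(["seed"], neighbors, posts, max_depth=2)
```

With a depth limit of 2, user u4 is never visited, which illustrates how the crawl boundary shapes the resulting corpus.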
For each cybercrime message corpus, a subset of messages
with at least two cybercriminals mentioned in each message was
manually inspected and annotated by a group of three cyber-security experts so as to determine the specific cybercriminal relationship captured in the message. For the experiments reported in this
paper, we focus on only two types of cybercriminal relationships,
namely the transactional relationship and the collaborative relationship. A
transactional relationship refers to the buying or selling of cyber-attack
tools between two parties, whereas a collaborative relationship
simply implies the sharing of information or tools between
cybercriminals without any monetary exchange
between the two parties. A message was annotated with a relationship label only if all three experts agreed on the specific type of relationship captured in that message. The
average inter-rater agreement of six annotators as measured by
Cohen's Kappa is $\kappa = 0.75$, which indicates a relatively consistent and reliable expert judgment for the construction of our
evaluation corpora. The details of the cybercrime message corpora applied to our experiments are given in Table 1. Common
performance evaluation measures such as Precision (P), Recall
(R), F-measure (F), and Accuracy (A) were applied to our experiments. Moreover, the Receiver Operating Characteristic (ROC)
curve (Hand and Till, 2001) was also adopted to assess the performance of all systems such that the results are independent of any
particular classification threshold value chosen.
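
As a rough illustration of the annotation rule and the evaluation measures named above, the sketch below computes a unanimous label, Cohen's Kappa for one pair of annotators, and Precision, Recall, F-measure, and Accuracy for one positive class. The function names and the toy labels ("T" for transactional, "C" for collaborative) are hypothetical, and the article does not state how the pairwise Kappa values were averaged.

```python
from typing import Dict, Optional, Sequence


def unanimous_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label only if every annotator chose it, else None."""
    return votes[0] if len(set(votes)) == 1 else None


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's Kappa for two annotators labeling the same items.

    Undefined (division by zero) when expected agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    expected = sum((list(a).count(l) / n) * (list(b).count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)


def binary_metrics(gold: Sequence[str], pred: Sequence[str],
                   positive: str) -> Dict[str, float]:
    """Precision, recall, F-measure, and accuracy for one positive class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    tn = len(gold) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": (tp + tn) / len(gold)}


# Toy annotations from two hypothetical annotators.
ann_a = ["T", "T", "C", "C", "T", "C", "T", "C"]
ann_b = ["T", "C", "C", "C", "T", "C", "C", "C"]
kappa = cohen_kappa(ann_a, ann_b)

# Toy gold labels and system predictions.
gold = ["T", "T", "C", "C", "T", "C"]
pred = ["T", "C", "C", "T", "T", "C"]
metrics = binary_metrics(gold, pred, positive="T")
```

These measures depend on a fixed decision threshold, which is why a threshold-independent view such as the ROC curve is a useful complement.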
Apart from the context-sensitive LDA (CSLDA) experimental system that is underpinned by context-sensitive Gibbs
sampling for latent cybercriminal relationship mining, we also
implemented several baseline systems to perform a comparative evaluation. Other probabilistic generative models such as
Partially Labeled Dirichlet Allocation (PLDA) (Ramage et al., 2011)
and Latent Dirichlet Allocation (LDA) (Blei et al., 2003) were
adopted as baseline systems. For both of these baseline systems,
a classical Gibbs sampler (Geman and Geman, 1984) was
employed. PLDA requires human-assigned tags for documents.
Unfortunately, unlike Web pages, these tags are not normally
