Computational Intelligence - August 2015 - 27

actual, experimentally confirmed inactive compounds (true negative samples). As a result, using a dataset consisting of only experimentally confirmed samples has the great flaw of distorting the class ratio of the samples (in real life there are many more inactive compounds than active ones). To overcome this issue, for each protein we consider a dataset consisting of ligands (compounds) of experimentally confirmed biological activity (positive samples) and negative samples generated using the DUD methodology [18] to obtain a more realistic positive-negative ratio. This phase of the experiment was carried out under the supervision of experts in the field of computational chemistry from the Institute of Pharmacology of the Polish Academy of Sciences to ensure correctness from the perspective of the drug design process. The exact number of particular samples generated for each protein is summarized in Table 1.

For each such dataset we investigated 5 popular fingerprints ($\varphi$ embeddings) generated using the PaDEL descriptors software [17]: EstateFP, ExtFP, MACCSFP, PubchemFP and SubFP. Such a procedure results in 40 distinct datasets (8 proteins, 5 fingerprints each) with binary labeling.

We use GMean (the geometric mean of accuracy over positive and negative samples; see footnote 8) as an evaluation metric, due to its balanced nature and its use in previous works regarding Weighted ELM [32]. We also include results obtained with Balanced Accuracy (BAC; see footnote 9), used in previous works regarding compound activity prediction [43]. One could obviously analyze many other classification quality metrics; however, this would require a change in the weighting schemes in all the balanced methods (WELM, T-WELM, SVM with class weighting, wRF, and wAdaBoost), which is beyond the scope of this work.

Experiments were performed in the repeated 10-fold cross-validation manner with a fixed maximum evaluation time of 10 minutes per training. Each model was tuned in terms of hyperparameters; thus, for ELMs, we tuned the number of hidden neurons ($h = 250, 500, 1000, 1500, 2000$) and the regularization parameter $C = 10^{-1}, \ldots, 10^{9}$; for SVM with the RBF kernel, we tuned $\gamma = 10^{-10}, \ldots, 10^{0}$ and the regularization parameter $C = 10^{-1}, \ldots, 10^{9}$; and for Tanimoto SVM we tuned the kernel sampling size ($h = 250, 500, 1000, 1500, 2000$) and the regularization parameter $C = 10^{-4}, \ldots, 10^{3}$. Biases and weights of the hidden ELM neurons (in both the sigmoid and RBF cases) are drawn from the uniform distribution on $[0, 1]$. Random Forest and AdaBoost use decision trees as base estimators (with a tuned number of trees/estimators $t = 10, 50, 100, 200, 500$), and RF used the Gini index for node splitting.

Table 2 summarizes the results of all considered models for GMean, BAC scores and time used for training. First, it is quite natural that the imbalanced ELM completely failed for data with highly skewed positive/negative distributions. It is quite interesting that the ELM using sigmoid activation functions also achieved very weak generalization scores in the experiments. Only the use of SubFP or EstateFP resulted in a GMean above 90%.

Weak results for the ELM were probably caused by the high dimension and sparsity of the remaining fingerprints, for which scalar-product-based methods are not well suited. This is further supported by the fact that once the hidden neurons are selected from the training set instead of being completely random, scores increase significantly (although they remain much worse than those for other methods). The RBF-based ELM behaved very well, with the exception of surprisingly bad results using the ExtFP representation; this was possibly also due to the high dimension of such data, which would require a more careful choice of the distributions from which biases are drawn (e.g., methods based on Random Projections [44]). There are some existing heuristics for the choice of such values; in the case of SVM, it is common practice to choose a $\gamma$ parameter such as $1/d$ (the heuristic used in libSVM) or to use some statistics of the data distribution [45]. Possibly, something similar could be used for seeding the distribution for RBF-based ELMs.

The proposed method achieved the best results on most of the datasets. The results are comparable to those obtained by SVM with an RBF kernel (however, T-WELM is still slightly better), while they are significantly better than all of the other ELM-based models as well as RF and AdaBoost. In the case of our method, selecting W from the training set led to a slight increase in overall scores, but more importantly, such an approach has the advantage of removing the possibly tunable parameter of the distribution from which we draw parameters, which simplifies working with a model. The results are also more stable in the sense of the standard deviation of the GMean (and BAC) between folds than the results of the

Tanimoto ELM with N hidden neurons can perfectly learn any set of N binary vectors using only sparse, binary weights.

Table 1 Comparison of datasets used during evaluation. $n_k$ denotes the size of class $k$, $d$ is the representation dimension ($\varphi: C \to \mathbb{R}^d$), and $\bar{m} = \frac{1}{|C|} \sum_{c \in C} \|\varphi(c)\|_1$ is the mean sparsity of a particular fingerprint among all used compounds $C$.

protein          n_{+1}    n_{-1}
5HT2A            1835      9019
5HT2C            1210      97459
5HT6             1490      135723
5HT7             704       56653
M1               759       27604
H1               635       48608
HIV integrase    101       6395
HIV protease     3155      21163

fingerprint      d         \bar{m}
EstateFP         79        10
ExtFP            1024      282
MACCSFP          166       52
PubchemFP        881       151
SubFP            307       15

8  $\mathrm{GMEAN}(TP, FP, TN, FN) = \sqrt{\frac{TP}{TP + FN} \cdot \frac{TN}{TN + FP}}$
9  $\mathrm{BAC}(TP, FP, TN, FN) = \frac{1}{2} \left( \frac{TP}{TP + FN} + \frac{TN}{TN + FP} \right)$
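
To make the GMean and BAC metrics used for evaluation concrete, the following minimal Python sketch computes both from confusion-matrix counts and compares them with plain accuracy. The function names and the example counts (100 positives, 10000 negatives) are hypothetical and chosen only for illustration; they are not taken from the experiments.

```python
import math


def gmean(tp, fp, tn, fn):
    """Geometric mean of accuracy on positive and negative samples."""
    tpr = tp / (tp + fn)  # accuracy on the positive class
    tnr = tn / (tn + fp)  # accuracy on the negative class
    return math.sqrt(tpr * tnr)


def bac(tp, fp, tn, fn):
    """Balanced accuracy: arithmetic mean of the per-class accuracies."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return 0.5 * (tpr + tnr)


def accuracy(tp, fp, tn, fn):
    """Plain accuracy, shown only for comparison."""
    return (tp + tn) / (tp + fp + tn + fn)


if __name__ == "__main__":
    # Hypothetical classifier that labels every compound as inactive
    # on an illustrative set of 100 actives and 10000 inactives.
    counts = dict(tp=0, fp=0, tn=10000, fn=100)
    print("accuracy:", round(accuracy(**counts), 3))  # close to 1
    print("GMean:   ", gmean(**counts))               # 0.0
    print("BAC:     ", bac(**counts))                 # 0.5
```

On such skewed data, a model that ignores the positive class still reaches an accuracy near 1, while GMean drops to 0 and BAC to 0.5, which is why class-balanced metrics are the more informative choice here.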

August 2015 | IEEE Computational Intelligence Magazine



