IEEE Computational Intelligence Magazine - May 2021 - 54

B. Pre- and Post-Processing

C. Feature Selection

1)	Cross-Validation: To ensure that our models generalize, we
used 5-fold cross-validation for both model design and
final predictions. We considered both static and
dynamic partitioning of the data. To maintain a
homogeneous experimental setting, we used a
fixed partitioning so that the models created in the
various experimental stages would
be comparable. This static partition was applied
across all the variables. We also used
dynamic partitioning when evaluating new models, feature
selection algorithms, and techniques before the
final comparison of performance, which was done on
the statically partitioned sets. Imputation was also
done with the static 5-fold partitioning, imputing
data per fold to avoid overfitting due to
data leakage at all levels.
2)	Outliers: We performed an outlier analysis in which
out-of-range errors in the sensing streams were analyzed for
extraction issues (the most common case), which led to integrity
checks and script validation starting from the raw sensing streams.
We checked for sleep-time measurements in invalid
ranges (fixed by recomputing sleep and awake times),
negative commute times (due to wrongly placed beacons),
and various other anomalies. Error sources included typos in
the enrollment process, reassignment of devices from
dropped participants to newly enrolled participants, and
several edge conditions in the enrollment/ingestion
process.
3)	Data Range Transformation: We systematically verified
that the predicted values do not fall outside the prescribed
ground-truth ranges. Code corrections were applied
to properly bound prediction results.
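
The per-fold imputation of item 1 and the range bounding of item 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the use of mean imputation, the seed, and the example range are all hypothetical.

```python
import random
from statistics import mean


def static_folds(n_samples, k=5, seed=0):
    """Assign sample indices to k folds once; a fixed seed keeps the split static."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]


def impute_per_fold(values, folds):
    """Return one imputed copy of `values` per held-out fold.

    Missing entries (None) are filled with the mean of the observed values
    in the training folds only, so the held-out fold never influences the fill.
    Mean imputation is an assumption; the text does not name the method.
    """
    imputed = []
    for held_out in range(len(folds)):
        train_vals = [values[i]
                      for f, fold in enumerate(folds) if f != held_out
                      for i in fold if values[i] is not None]
        fill = mean(train_vals)  # raises StatisticsError if nothing is observed
        imputed.append([fill if v is None else v for v in values])
    return imputed


def clip_prediction(pred, low, high):
    """Bound a prediction to the prescribed ground-truth range [low, high]."""
    return max(low, min(high, pred))
```

Reusing the output of `static_folds` across all experiments is what keeps the resulting models comparable; a dynamic partition would simply call it with a different seed each time.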

[Figure 2 plot omitted; labeled signal: Heart Rate]

FIGURE 2 Discretization, Stabilization, Regularization. Blue continuous lines = the time series; red continuous lines = mean value of the
series; the dashed curved arrow = example of a sequential long-term
dependency that could be lost without the use of HONs; blue thick
arrows = set of higher order dependencies considered by HON.



We sequentially explored various combinations of features to
identify a set of predictive features per construct. The result was
a curated subset of features. We first describe the social media
feature selection process. It is worth noting that, for privacy
purposes, raw social media data is neither shared nor processed.
We only used features reformatted
to remove personally identifiable information (PII).
The relevant social media features were selected using principal component analysis (PCA) to identify the top 200 latent
features for predicting the ground-truth variables; we aimed
to capture complex behaviors latent in the data that are not
directly observable in the raw signals. As detailed in Section IV,
we used psycholinguistic features, n-grams, sentiment, and social
engagement, which led to 5075 features per individual, and
applied PCA to reduce dimensionality to 100 features that
account for multicollinearities.
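
To make the dimensionality reduction concrete, the following is a small, dependency-free sketch of PCA that extracts the top k principal directions by power iteration with deflation and projects the data onto them. It is an illustration only: the function names, the iteration count, and the seed are assumptions, and a real pipeline with thousands of features would more likely rely on an established numerical library than on this explicit covariance matrix.

```python
import math
import random


def _matvec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def pca_fit(rows, k, iters=200, seed=0):
    """Return (feature means, top-k principal directions) for `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d); only practical for small d.
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _step in range(iters):  # power iteration toward the top eigenvector
            w = _matvec(cov, v)
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:
                break
            v = [c / norm for c in w]
        eigval = sum(a * b for a, b in zip(v, _matvec(cov, v)))
        components.append(v)
        # Deflation: remove the found direction so the next one can emerge.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return means, components


def pca_transform(rows, means, components):
    """Project rows onto the principal directions (the latent features)."""
    return [[sum((r[j] - means[j]) * c[j] for j in range(len(means)))
             for c in components] for r in rows]
```

To stay consistent with the per-fold treatment described earlier, `pca_fit` would be called on training rows only and `pca_transform` applied to held-out rows with the same means and components.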
The features from other sources were treated under the
same selection policy to define the set of models (components).
This involved five stages of selection, in addition to the feature
pre-selection and social media selection. First, features were
selected based on correlations per fold during cross-validation:
the selected features are those that overlapped across folds,
and for each fold only the training folds were used to compute
correlations, to avoid leakage across folds. Second, features
were selected by the individual candidate models. Third, a
selection was done on the overall final training by the best
model. Fourth, a subset of latent features was mapped using
PCA for specific feature sets. Lastly, we ranked the models on
predictive performance and chose the best model.
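
The first selection stage (correlations computed on training folds only, keeping the features that overlap across folds) can be sketched as below. This is a hedged illustration: the Pearson correlation, the threshold value, and the function names are assumptions, since the text does not specify the correlation measure or the cutoff.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_stable_features(features, target, folds, threshold=0.3):
    """Keep features whose |correlation| passes the threshold in every split.

    features: dict mapping feature name -> list of values
    target:   list of ground-truth values
    folds:    list of lists of sample indices (the static partition)
    The threshold of 0.3 is a hypothetical default.
    """
    selected = None
    for held_out in range(len(folds)):
        # Only the training folds contribute, so the held-out fold cannot leak.
        train_idx = [i for f, fold in enumerate(folds) if f != held_out
                     for i in fold]
        y = [target[i] for i in train_idx]
        keep = {name for name, col in features.items()
                if abs(pearson([col[i] for i in train_idx], y)) >= threshold}
        selected = keep if selected is None else selected & keep
    return selected or set()
```

Intersecting the per-split sets is one straightforward reading of "overlapped across folds"; a softer variant would keep features selected in a majority of splits.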
D. Higher Order Networks (HON) of Temporal Data

Most real sequential data does not fulfill the Markov property
[41]. HONs are powerful tools that allow us to overcome this
challenge by representing higher order dependencies.
Non-Markovian patterns provide unique information about the problem under study. For this reason, we use a HON algorithm to
provide a multi-scale representation of sequential data on a
per-feature basis (e.g., heart rate and sensor-measured stress).
When extracting features from sequential data, conventional methods (e.g., a Markov model) might lose information about
state transitions because they assume that the next state
depends only on the current state. To address this limitation, we used a HON method to build a sufficient representation by
exploring higher order dependencies in sequential data. Building
the HON model consists of the following steps.
We applied discretization to the time series as shown in
Figure 2. The discretization step works as a pattern recognition
technique that identifies regularities in the time series, which are
grouped to remove high-frequency components. Since a network representation of the time series (e.g., heart rate) is not
directly available, the discretized raw data can be used
to construct a network. We divided time into equal-size
(half-hour) time slots. x_i is the state in the i-th time slot, i.e., the mean
value of the heart rate during the corresponding slot.
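
The slot-mean discretization described above, and the reason higher order context matters, can be illustrated with the sketch below. It is not the HON algorithm used in this work: it only computes half-hour slot means, maps them to discrete states with a hypothetical bin width, and counts fixed-order transitions so that first-order and second-order views can be compared. All function names and the bin width are assumptions.

```python
from collections import Counter, defaultdict
from statistics import mean

SLOT_SECONDS = 30 * 60  # half-hour slots, as in the text


def slot_means(samples):
    """samples: iterable of (timestamp_seconds, value).

    Returns {slot_index: mean value in that slot}, i.e. x_i for slot i.
    """
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[int(t // SLOT_SECONDS)].append(v)
    return {i: mean(vs) for i, vs in sorted(buckets.items())}


def to_states(values, bin_width=10):
    """Map slot means to discrete states; the bin width is hypothetical."""
    return [int(v // bin_width) for v in values]


def transition_counts(states, order):
    """Count next-state occurrences given the previous `order` states."""
    counts = defaultdict(Counter)
    for i in range(order, len(states)):
        context = tuple(states[i - order:i])
        counts[context][states[i]] += 1
    return counts


# A toy sequence in which the state after "B" depends on what preceded "B".
toy = list("ABCDBEABCDBE")
first_order = transition_counts(toy, 1)   # after B: C and E are equally likely
second_order = transition_counts(toy, 2)  # after (A, B): always C
```

In the toy sequence, a first-order model cannot tell whether C or E follows B, whereas conditioning on two previous states removes the ambiguity; this is the kind of dependency a first-order Markov representation loses.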

