IEEE Systems, Man and Cybernetics Magazine - April 2023 - 40
result and the format of the result. The dataset and scripts
are available at the following link: https://github.com/dipuk0506/UQ-Data.
Datasets with a standard split between training and test
subsets often become a platform for machine learning
(ML) and deep learning (DL) developers to test the effectiveness
of novel prediction methods. With such a dataset,
researchers can evaluate their methods with less
computation [1]. Such datasets are very common in
computer vision and natural language
processing tasks [2], [3]. However, few datasets exist with
training and test subsets for applying ML and DL
methods with UQ in regression [4], [5]. UQ is rapidly gaining
popularity because of the demand for it in ML and DL methods [6],
[7], [8], [9]. In most previous UQ studies, researchers
split the dataset randomly [10], [11]. The performance
of the trained NN varies with the data
split [12], [13]. Researchers therefore often split the data randomly
many times, train an NN for each split, and report
the average performance [14]. That process is computationally
expensive [15]. A single training run on fixed training and validation
subsets requires less computation.
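The saving comes from fixing the split once, so that every method is trained and evaluated on the same subsets. A minimal sketch of such a reproducible split is given below; the function name `fixed_split`, the 70/15/15 fractions, and the seed are assumptions for illustration and are not taken from the published datasets.

```python
import numpy as np

def fixed_split(n_samples, frac_train=0.70, frac_val=0.15, seed=0):
    """Return index arrays for one reproducible train/validation/test split."""
    rng = np.random.default_rng(seed)       # fixed seed: same split on every run
    order = rng.permutation(n_samples)      # shuffle the sample indices once
    n_train = int(round(frac_train * n_samples))
    n_val = int(round(frac_val * n_samples))
    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]      # remaining samples form the test set
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = fixed_split(1000)
```

With a fixed split, one training run per method is enough for a like-for-like comparison, whereas repeated random splitting multiplies the training cost by the number of splits.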
Researchers usually test a proposed model on several
datasets to show the nature of their model [16]. One image
classification method can be good at distinguishing natural
images, another at distinguishing handwritten digits, and
yet another at distinguishing medical images.
Similarly, in regression problems,
researchers verify their methods on economic, social,
engineering, and other datasets. Some datasets have higher
uncertainty than others [17]. One model can perform outstandingly
on a deterministic dataset, whereas another
may perform outstandingly on a highly uncertain
dataset [18], [19]. In a synthetic dataset, we provide the exact
level of uncertainty and the exact relation between inputs and outputs.
Therefore, researchers can observe the performance
of their models for different input-output relations and
different natures of uncertainty. In this regard, we provide several
synthetic datasets for numeric UQ in regression. According
to our literature search, this is the first study introducing
synthetic datasets with training, validation, and test
splits for UQ.
Datasets and Their Characteristics
Researchers are continually developing new models
for existing and novel applications [20]. The
developed models are often applied to real-world
applications [21], [22]; monitoring [13]; and performance
evaluations [23]. However, real-world quantities have
uncertainties arising from numerous known and
unknown sources [24].
Many real-world quantities have both a random and
a deterministic component [25]. The following equation represents
such an uncertain quantity:
\[ t_i = y_i + e_i \tag{1} \]
where t_i is the target, y_i is the true regression mean, and
e_i is the aleatoric uncertainty for the ith sample, with i ∈ ℕ and
t, y, e ∈ ℝ. Therefore, we prepare the proposed datasets with
deterministic equations and pseudorandom variables.
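The decomposition in (1) can be illustrated with a short sketch. The sine mean, the input range, the noise scale, and the seed below are illustrative assumptions, not values from the proposed datasets; only the structure, a deterministic part plus a pseudorandom part, follows the text.

```python
import numpy as np

rng = np.random.default_rng(42)            # pseudorandom generator, fixed seed

x = np.linspace(0.0, 10.0, 500)            # hypothetical inputs
y = np.sin(x)                              # deterministic part y_i (assumed form)
e = 0.1 * (rng.uniform(0.0, 1.0, x.shape) - 0.5)   # random part e_i (assumed scale)
t = y + e                                  # target t_i = y_i + e_i, as in (1)

# In a synthetic dataset the noise is known, so t - y recovers e.
residual = t - y
```

Because both y and e are stored at generation time, a UQ method can be scored against the true regression mean and the true noise level rather than against estimates of them.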
Datasets With Different Input-Output Relations:
Dataset-1 to Dataset-5
Future researchers can investigate the strength of their
models in terms of different input-output relations. One
model can be good for one input-output relationship,
whereas another model can be good for a different
one. Therefore, we generate the first five datasets
with different input-output relations. The following five
equations, respectively, represent the relationship between
the inputs and the output in Dataset-1 to Dataset-5:
[Equations (2)-(6): input-output relations for Dataset-1 to Dataset-5, respectively. The typeset equations are not legible in this copy; the recoverable terms are the inputs X1, X2, and X3, the noise magnitude M_e, the random value r(0, 1), sine terms, and the modulus, round(·), and sign(·) operations.]
where y is the output; X1, X2, and X3 are the first, second, and
third inputs; M_e is the noise magnitude; and r(0, 1) is any
random value between zero and one inclusive. |·| represents
the modulus, round(·) represents the rounding of a
number, and sign(·) represents the sign of a number.
The random numbers are generated from a uniform distribution
over the range [0, 1].
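The operations named above have direct NumPy counterparts, sketched below. Two details are assumptions rather than statements from the article: the modulus is taken to mean the absolute value, and NumPy's conventions are used as is, that is, `uniform` samples the half-open interval [0, 1) and `np.round` rounds halves to the nearest even integer.

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.uniform(0.0, 1.0, size=100_000)    # r(0, 1): uniform samples in [0, 1)

values = np.array([-2.5, -0.4, 0.0, 0.4, 1.5])
modulus = np.abs(values)                   # modulus, assumed to be the absolute value
rounded = np.round(values)                 # rounding; halves go to the even integer
signs = np.sign(values)                    # sign: -1, 0, or +1
```

A uniform r(0, 1) has a mean of 0.5, so a noise term built from it is symmetric about zero only after it is centered.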
Dataset-1 has a single-variable input and a single-variable
output with homoscedastic uncertainty. The noise
magnitude (M_e) for this dataset is 0.1, representing lower
uncertainty. This dataset has symmetric uncertainty, and
there exists a trigonometric relationship between input
and output.
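A dataset with these properties can be sketched as follows. Only the noise magnitude of 0.1 comes from the text; the sine relation, the input range, and the centering of the noise by subtracting 0.5 are assumptions chosen to match the description (trigonometric, homoscedastic, symmetric), and the sketch is not claimed to reproduce (2).

```python
import numpy as np

M_E = 0.1                                  # noise magnitude stated for Dataset-1

def make_homoscedastic(n_samples, seed=0):
    """Single input, single output, same noise spread for every input value."""
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(0.0, 4.0 * np.pi, n_samples)           # assumed input range
    y = np.sin(x1)                                           # assumed trigonometric mean
    noise = M_E * (rng.uniform(0.0, 1.0, n_samples) - 0.5)   # centered uniform noise
    return x1, y, noise, y + noise

x1, y_true, noise, target = make_homoscedastic(2000)
```

Under these assumptions the noise lies within plus or minus M_E / 2 everywhere, and its spread does not depend on X1, which is what homoscedastic means here.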
Dataset-2 has two inputs and one output. The first
variable (X1) determines the predictive portion. The
second variable (X2) determines the direction and magnitude
of the noise. The uncertainty is heteroscedastic
and changes over X2. As the random number generator
provides a uniform distribution over the range, the uncertainty
is symmetric. The noise magnitude (M_e) for this
dataset is 0.5, representing medium uncertainty. There
exists a trigonometric relationship between inputs and
output. Figure 1(a) and (b), respectively, show the
relationship between target (T) and input (X1) on Dataset-1
and Dataset-2.
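The heteroscedastic case can be sketched in the same way. Only the noise magnitude of 0.5 and the roles of X1 and X2 come from the text; the sine mean, the signed scaling factor sin(X2), the input ranges, and the centered noise are assumptions for illustration, and the sketch is not claimed to reproduce (3).

```python
import numpy as np

M_E = 0.5                                  # noise magnitude stated for Dataset-2

def make_heteroscedastic(n_samples, seed=0):
    """Two inputs: x1 sets the mean, x2 scales the noise."""
    rng = np.random.default_rng(seed)
    x1 = rng.uniform(0.0, 4.0 * np.pi, n_samples)   # assumed input ranges
    x2 = rng.uniform(0.0, 4.0 * np.pi, n_samples)
    y = np.sin(x1)                                   # predictive part depends on x1 only
    scale = np.sin(x2)                               # assumed signed factor set by x2
    noise = M_E * scale * (rng.uniform(0.0, 1.0, n_samples) - 0.5)
    return x1, x2, y, noise, y + noise

x1, x2, y_true, noise, target = make_heteroscedastic(5000)
```

In this sketch the noise spread at a given sample is proportional to |sin(X2)|, so a model that predicts a constant interval width will be too wide where sin(X2) is near zero and too narrow where it is near one in magnitude.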
Dataset-3 has three inputs and one output. The third
input is multiplied by zero (0) in (4) to show no relationship
between X3 and y. The first variable (X1) determines the
predictive portion. The second variable (X2) determines