The phosphatidylinositol signaling pathway plays an important
role in the growth, survival and metabolism of cancer cells, and
targeting this pathway has the potential to lead to treatments for
colorectal cancer [66], [67]. VEGF and ErbB may be valid therapeutic targets for patients with colorectal cancer [68]-[71].
IV. Discussion and Conclusion

DNA methylation may be significantly associated with complex diseases, and many genomic regions are differentially
methylated in various cancers compared with normal samples.
In this study, we presented a method to identify combinatorial
effects of DNA methylation at multiple sites. From a systematic
perspective, the relationship between DNA methylation
regions and a specific disease is learned by the presented probabilistic evolutionary learning method. The fitness value of a
DNA methylation module measures the level of its response
to the cancer. From a computational view, our method can address
problems with a large number of features by identifying modules that
are both compact and have high coverage of cancer-related genes.
Applying our method to breast cancer and colorectal cancer
data produced by high-throughput technologies, we detected
cancer-related modules that were confirmed by the literature
and functional enrichment analysis. Interestingly, we observed
that the selected regions were located around genes that are
significantly enriched in cancer-related gene set categories,
which provided evidence that the identified modules in our
study are biologically meaningful.
Moreover, in the results for the array-based dataset, good accuracy
could be obtained with a very small number of random features. However, the specificity was very low in the experiments
with random features. These results suggest that our method can
produce well-balanced classification performance even on a
highly imbalanced dataset, whereas conventional classifiers tend
not to work well under class imbalance. In the second
experiment, which used the NGS-based dataset with a large number of
features and a small sample size, our method also found informative DNA methylation sites with good classification performance,
whereas the decision tree, which requires each value to be
discretized, showed relatively lower performance.
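The gap between accuracy and specificity on imbalanced data can be illustrated with a small sketch. The class counts and predictions below are hypothetical and are not taken from the datasets in this study; the helper names `confusion_counts` and `summarize` are likewise illustrative. The sketch only shows how a classifier that labels every sample as tumor can reach high accuracy while its specificity is zero.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = tumor, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarize(y_true, y_pred):
    """Return accuracy, sensitivity, specificity, and balanced accuracy."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical imbalanced cohort: 90 tumor samples and 10 normal samples.
y_true = [1] * 90 + [0] * 10
# A degenerate classifier that predicts "tumor" for every sample.
y_pred = [1] * 100

metrics = summarize(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the accuracy is high only because the majority class dominates, which is why sensitivity and specificity are worth reporting together when comparing classifiers on imbalanced methylation data.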
Studies on DNA methylation could reveal the process of
tumorigenesis as well as identify biomarkers. Our approach,
which identifies multiple DNA methylation sites that might be
epigenetically regulated, could provide a useful strategy to detect
the epigenetic association related to cancer. By applying our
method to array- and NGS-based data, we showed that it is
applicable to a variety of data types and various disease contexts.
Moreover, recent studies suggest a complex relationship between
genetic variation and DNA methylation. Systems genetics and
epigenetics approaches are required to examine these relationships. Although our framework is based on DNA methylation
profile datasets, it could be used to identify the combinatorial
association of various factors, including gene expression levels,
microRNAs, copy number variations, genetic variations, and
environmental factors. The integration of a variety of data would
provide the basis for new hypotheses and experimental
approaches in the model of a complex disease. Moreover, the systematic identification of causal factors and modules would provide insights into mechanisms underlying complex diseases and
help to develop efficient therapies or effective drugs.
In summary, we presented a method for searching for higher-order interactions of DNA methylation sites using a probabilistic
evolutionary learning method. We also examined the potential
for the combined effects of various sites on the genome. The
results suggested that alterations of DNA methylation at
multiple sites affect cancer. Similar to genome-wide association
studies, our approach provided an opportunity to capture the
complex and multifactorial relationships among DNA methylation sites and to find new factors for future study. Therefore, our
approach would facilitate a comprehensive analysis of genome-wide DNA methylation datasets and aid the interpretation of
the effects of DNA methylation at multiple sites.

This work was supported by the National Research Foundation of
Korea (NRF) grant funded by the Ministry of Science and ICT,
Republic of Korea (grant no. NRF-2015R1C1A1A01053824,
NRF-2018R1C1B6005304, NRF-2016R1D1A1B03935676,
and NRF-2018R1D1A1B07050393).

