TY - JOUR
T1 - Benefiting from the intrinsic role of epigenetics to predict patterns of CTCF binding
AU - Villaman, Camilo
AU - Pollastri, Gianluca
AU - Saez, Mauricio
AU - Martin, Alberto J.M.
N1 - Funding Information:
This work was funded by FONDECYT Regular Project [1181089] and Centro Ciencia & Vida, FB210008, Financiamiento Basal para Centros Científicos y Tecnológicos de Excelencia de ANID. Powered@NLHPC: this research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02).
Publisher Copyright:
© 2023 The Authors
PY - 2023/1
Y1 - 2023/1
AB - Motivation: One of the most relevant mechanisms in determining chromatin structure is the formation of structural loops, which are also related to the conservation of chromatin states. Many of these loops are stabilized by CCCTC-binding factor (CTCF) proteins at their base. Despite the relevance of chromatin structure and the key role of CTCF, the role of the epigenetic factors that regulate CTCF binding, and thus the formation of structural loops in chromatin, is not thoroughly understood. Results: Here we describe a CTCF binding predictor based on Random Forest that employs different epigenetic data and genomic features. Importantly, because Random Forests can determine the relevance of features for the prediction, our approach also shows how the different types of descriptors impact the binding of CTCF, confirming previous knowledge on the relevance of chromatin accessibility and DNA methylation, and also demonstrating the effect of epigenetic modifications on the activity of CTCF. We compared our approach against other predictors and found improved performance in terms of areas under the PR and ROC curves (PRAUC and ROCAUC), outperforming current state-of-the-art methods.
KW - Binding Prediction
KW - CTCF
KW - Histone Marks
KW - Random Forests
UR - http://www.scopus.com/inward/record.url?scp=85160015780&partnerID=8YFLogxK
U2 - 10.1016/j.csbj.2023.05.012
DO - 10.1016/j.csbj.2023.05.012
M3 - Article
AN - SCOPUS:85160015780
SN - 2001-0370
VL - 21
SP - 3024
EP - 3031
JO - Computational and Structural Biotechnology Journal
JF - Computational and Structural Biotechnology Journal
ER -