Abstract
Motivation: One of the most relevant mechanisms in determining chromatin structure is the formation of structural loops, which are also related to the conservation of chromatin states. Many of these loops are stabilized by CCCTC-binding factor (CTCF) proteins at their base. Despite the relevance of chromatin structure and the key role of CTCF, the role of the epigenetic factors that regulate CTCF binding, and thus the formation of structural loops in chromatin, is not thoroughly understood.

Results: Here we describe a CTCF binding predictor based on Random Forest that employs different epigenetic data and genomic features. Importantly, because Random Forests can determine the relevance of each feature for the prediction, our approach also shows how the different types of descriptors impact the binding of CTCF, confirming previous knowledge on the relevance of chromatin accessibility and DNA methylation while also demonstrating the effect of epigenetic modifications on the activity of CTCF. We compared our approach against other predictors and found improved performance in terms of the areas under the precision-recall (PR) and receiver operating characteristic (ROC) curves (PRAUC and ROCAUC), outperforming current state-of-the-art methods.
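The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `average_precision` are hypothetical, the labels and scores are made-up toy values rather than results from the paper, and average precision is used here as one common step-wise estimate of the area under the PR curve (the abstract does not say which estimator the authors used). The sketch also assumes no tied scores in the PR computation.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive is scored above a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the PR curve: the mean of the precision values
    measured at the rank of each positive example (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    if hits == 0:
        raise ValueError("need at least one positive example")
    return precision_sum / hits


# Toy example: 1 = site bound by CTCF, 0 = unbound (invented values).
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

print("ROCAUC:", roc_auc(toy_labels, toy_scores))
print("PRAUC (average precision):", average_precision(toy_labels, toy_scores))
```

Reporting both metrics is a common choice for binding-site prediction, since PRAUC is more sensitive than ROCAUC to performance on the positive class when bound sites are much rarer than unbound ones.
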
Original language | English |
---|---|
Pages (from-to) | 3024-3031 |
Number of pages | 8 |
Journal | Computational and Structural Biotechnology Journal |
Volume | 21 |
State | Published - 2023 |
Bibliographical note
Funding Information: This work was funded by FONDECYT Regular Project [1181089] and Centro Ciencia & Vida, FB210008, Financiamiento Basal para Centros Científicos y Tecnológicos de Excelencia de ANID. Powered@NLHPC: this research was partially supported by the supercomputing infrastructure of the NLHPC (ECM-02).
Publisher Copyright:
© 2023 The Authors
ASJC Scopus subject areas
- Biotechnology
- Biophysics
- Structural Biology
- Biochemistry
- Genetics
- Computer Science Applications