Identifying clusters of cis-regulatory elements underpinning TAD structures and lineage-specific regulatory networks [METHOD]

Cellular identity relies on cell-type–specific gene expression controlled at the transcriptional level by cis-regulatory elements (CREs). CREs are unevenly distributed across the genome, giving rise to individual CREs and clusters of CREs (COREs). Technical and biological features hinder CORE identification. We addressed these issues by developing an unsupervised machine learning approach termed clustering of genomic regions analysis method (CREAM). CREAM automates CORE detection from chromatin accessibility profiles that are enriched in CREs strongly bound by master transcription regulators, proximal to highly expressed and essential genes, and discriminating cell identity. Although COREs share similarities with super-enhancers, we highlight differences in terms of the genomic distribution and structure of these cis-regulatory units. We further show the enhanced value of COREs over super-enhancers to identify master transcription regulators, highly expressed and essential genes defining cell identity. COREs enrich at topologically associated domain (TAD) boundaries. They are also preferentially bound by the chromatin looping factors CTCF and cohesin, in contrast to super-enhancers, forming clusters of CTCF and cohesin binding regions and defining homotypic clusters of transcription regulator binding regions (HCTs). Finally, we show the clinical utility of CREAM to identify COREs across chromatin accessibility profiles to stratify more than 400 tumor samples according t...
Source: Genome Research - Category: Genetics & Stem Cells - Tags: METHOD - Source Type: research
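
The abstract describes COREs as clusters of neighbouring CREs detected from chromatin accessibility profiles, but it does not spell out how CREAM does this. The sketch below is not CREAM's algorithm. It is only a minimal illustration of grouping accessible regions into clusters, assuming a fixed distance cutoff. The function name `cluster_regions`, the parameters `max_gap` and `min_size`, and the example coordinates are all hypothetical.

```python
from collections import defaultdict


def cluster_regions(regions, max_gap, min_size=2):
    """Group regions on the same chromosome separated by at most max_gap bp.

    regions: iterable of (chrom, start, end) tuples, e.g. accessibility peaks.
    Returns a list of clusters; each cluster is a list of regions.
    NOTE: a fixed cutoff is an assumption made for illustration only.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((chrom, start, end))

    clusters = []
    for chrom in sorted(by_chrom):
        current = []        # regions in the cluster being built
        current_end = None  # right-most coordinate seen in that cluster
        for region in sorted(by_chrom[chrom], key=lambda r: (r[1], r[2])):
            # A gap larger than max_gap closes the running cluster.
            if current and region[1] - current_end > max_gap:
                if len(current) >= min_size:
                    clusters.append(current)
                current = []
            if not current:
                current_end = region[2]
            else:
                current_end = max(current_end, region[2])
            current.append(region)
        # Flush the last cluster on this chromosome.
        if len(current) >= min_size:
            clusters.append(current)
    return clusters


# Made-up coordinates, for illustration only.
peaks = [
    ("chr1", 100, 200),
    ("chr1", 250, 300),
    ("chr1", 5000, 5100),
    ("chr2", 10, 20),
    ("chr2", 30, 40),
    ("chr2", 45, 60),
]
clusters = cluster_regions(peaks, max_gap=100)
for cluster in clusters:
    print(cluster[0][0], cluster[0][1], cluster[-1][2], len(cluster))
```

With these made-up peaks, the isolated region on chr1 is left as an individual element while nearby regions are merged into clusters. A hand-picked cutoff like `max_gap` is exactly the kind of manual parameter that an unsupervised method would aim to avoid, so treat this only as a way to picture the input and output.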