Genomic origin, fragmentomics, and transcriptional correlation of long cell-free DNA molecules in human plasma [RESEARCH]

Recent studies have revealed an unexplored population of long cell-free DNA (cfDNA) molecules in human plasma using long-read sequencing technologies. However, the biological properties of long cfDNA molecules (>500 bp) remain largely unknown. To address this gap, we investigated the origins of long cfDNA molecules from different genomic elements. Analysis of plasma cfDNA using long-read sequencing revealed an uneven distribution of long molecules across the genome. Long cfDNA molecules were overrepresented in euchromatic regions of the genome, in sharp contrast to short DNA molecules. The abundance of long molecules correlated more strongly with mRNA gene expression levels than did the abundance of short molecules (Pearson's r = 0.71 versus -0.14). Moreover, long and short molecules demonstrated distinct fragmentation patterns surrounding CpG sites. Leveraging the cleavage preferences surrounding CpG sites, the combined cleavage ratios of long and short molecules could differentiate patients with hepatocellular carcinoma (HCC) from non-HCC subjects (AUC = 0.87). We further investigated knockout mice in which selected nuclease genes had been inactivated, in comparison with wild-type mice. The proportion of long molecules originating from transcription start sites was lower in Dffb-deficient mice but higher in Dnase1l3-deficient mice, compared with wild-type mice. This work thus provides new insights into the biological properties and potential clinical applic...
Source: Genome Research - Category: Genetics & Stem Cells - Tags: RESEARCH - Source Type: research
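
The correlation reported in the abstract (abundance of long molecules versus mRNA expression) can be illustrated with a minimal sketch. This is not the authors' pipeline: the helper names `long_fraction` and `pearson_r`, the gene labels, and all numbers below are hypothetical and chosen only for illustration. The only detail taken from the abstract is the >500 bp definition of a long molecule.

```python
import math

# The abstract defines long cfDNA molecules as > 500 bp.
LONG_THRESHOLD_BP = 500


def long_fraction(fragment_lengths, threshold=LONG_THRESHOLD_BP):
    """Return the fraction of fragments strictly longer than `threshold` bp."""
    if not fragment_lengths:
        raise ValueError("fragment_lengths must not be empty")
    n_long = sum(1 for length in fragment_lengths if length > threshold)
    return n_long / len(fragment_lengths)


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: fragment lengths (bp) overlapping three made-up genes,
# and made-up expression values for the same genes. Not data from the study.
toy_fragments = {
    "geneA": [160, 170, 320, 800, 1200],
    "geneB": [150, 165, 166, 340, 900],
    "geneC": [140, 160, 167, 170, 180],
}
toy_expression = {"geneA": 8.0, "geneB": 5.0, "geneC": 1.0}

genes = sorted(toy_fragments)
fractions = [long_fraction(toy_fragments[g]) for g in genes]
expression = [toy_expression[g] for g in genes]
r = pearson_r(fractions, expression)
print(f"long-fragment fractions: {fractions}")
print(f"Pearson's r (toy data): {r:.2f}")
```

In a real analysis the per-gene fragment lists would come from aligned long-read data and the expression values from a matched transcriptome; how the study normalized abundance is not stated in the abstract, so the simple fraction used here is an assumption.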