GRCz11 – the latest zebrafish reference genome assembly

After 2.5 years of assembly curation, the GRC is proud to present the new zebrafish reference genome assembly, GRCz11. This latest assembly has been refined by the addition of nearly 1000 finished clone sequences and by the resolution of more than 400 assembly issues. This resulted in a significant reduction in the number of scaffolds (3399 to 1905) and an increase in scaffold N50 (2.18 Mb to 7.5 Mb), whilst the overall genome size was not affected. Figure 1 shows an overview of contig and scaffold N50s over time, indicating the advance in assembly curation.

Figure 1: Contig vs. scaffold N50s for zebrafish reference genome assemblies. Release dates: Zv7: 2008, Zv8: 2009, Zv9: 2010, GRCz10: 2014, GRCz11: 2017.

Alignments of 16133 RefSeq sequences showed a further improvement over past assemblies: only 31 sequences remain not found (down from 34), 105 transcripts are still split between locations (down from 205) and only 441 exhibit less than 95% CDS coverage (down from 566). Figure 2 shows an example of an improved region, correcting the representation of two genes.

Figure 2: gEVAL screenshot of the supt4h1 gene (red arrow) in GRCz10 (top) and GRCz11 (bottom). In GRCz10, the supt4h1 gene on chromosome 5 is incomplete, missing its first exon, and surrounded by a truncated duplicated copy of rnf150b (blue arrow). In GRCz11, the supt4h1 gene is complete and neighbouring the hsf5 gene, as seen in other vertebrates, whereas the rnf150b gene is now complete and located as a single copy on chromosome 23....
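For readers less familiar with the scaffold N50 quoted above: N50 is the length of the sequence at which, going from longest to shortest, the running total first reaches half of the combined length. The short Python sketch below illustrates that definition only. The function name `n50` and the toy lengths are made up for illustration; they are not taken from GRCz10 or GRCz11, and this is not the GRC's own tooling.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half = sum(ordered) / 2                  # half of the total assembled length
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            # This sequence is the one that carries the running total past 50%.
            return length


# Toy scaffold lengths (hypothetical, arbitrary units).
before = [80, 70, 50, 40, 30, 20, 10]
# Same total length, but three scaffolds (50 + 40 + 30) joined into one.
after = [120, 80, 70, 20, 10]

print(len(before), n50(before))  # 7 scaffolds, N50 = 70
print(len(after), n50(after))    # 5 scaffolds, N50 = 80
```

Joining scaffolds leaves the total length unchanged while lowering the scaffold count and raising the N50, which is the same pattern reported for GRCz11.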
Source: GenomeRef - Category: Genetics & Stem Cells Source Type: blogs