Description | Cancer genomes contain thousands of somatic point mutations, chromosome copy alterations and more complex structural variants, which contribute to tumour growth and therapy response. Whole-genome sequencing is a well-established approach for somatic variant identification, but its broad application comes with complications, particularly in assessing the quality of proposed calls. To address this issue, we present CNAqc, a quantitative framework for quality control of somatic mutations and allele-specific copy numbers, in both clonal and subclonal settings, while accounting for variations in tumour purity, as commonly seen in bulk sampling. We test the model via extensive simulations, validate it using low-pass single-cell data, and apply it to 2778 single-sample PCAWG whole genomes, 10 in-house multi-region whole genomes and 48 TCGA whole exomes. CNAqc is compatible with common bioinformatic pipelines and designed to support automated parameterization processes, which are crucial in the era of large-scale whole-genome sequencing. |
---|