Detection of mosaic chromosomal alterations in children with severe developmental disorders recruited to the DDD study

Purpose Structural mosaicism has been previously implicated in developmental disorders. We aimed to identify rare mosaic chromosomal alterations (MCAs) in probands with severe undiagnosed developmental disorders. Methods We identified MCAs in genotyping array data from 12,530 probands in the Deciphering Developmental Disorders study using mosaic chromosome alterations caller (MoChA). Results We found 61 MCAs in 57 probands, many of these were tissue specific. In 23 of 26 (88.5%) cases for which the MCA was detected in saliva in which blood was also available for analysis, the MCA could not be detected in blood. The MCAs included 20 polysomies, comprising either 1 arm of a chromosome or a whole chromosome, for which we were able to show the timing of the error (25% mitosis, 40% meiosis I, and 35% meiosis II). Only 2 of 57 (3.5%) of the probands in whom we found MCAs had another likely genetic diagnosis identified by exome sequencing, despite an overall diagnostic yield of ∼40% across the cohort. Conclusion Our results show that identification of MCAs provides candidate diagnoses for previously undiagnosed patients with developmental disorders, potentially explaining ∼0.45% of cases in the Deciphering Developmental Disorders study. Nearly 90% of these MCAs would have remained undetected by analyzing DNA from blood and no other tissue.


INTRODUCTION
Genetic mosaicism is the presence of two or more genetically distinct lineages of cells in one individual, arising from post-zygotic mutations. Mosaic variation can consist of single nucleotide variants (SNVs) and indels or it may involve larger stretches of the genome, including copy number variants (CNVs) and aneuploidies. Mosaicism has been associated with diseases including neurodevelopmental disorders [1][2][3][4] . The clinical consequences of mosaicism vary according to the nature of the mosaic event, the stage of development at which this event occurs, and the tissue types in which this event is present 5 .
Very large chromosomal abnormalities, such as complete autosomal aneuploidy, are generally incompatible with life, with the exception of trisomy 21. However, mosaic aneuploidies are better tolerated and have been identified in many autosomes including chromosomes 7,8,9,14,16,17,19 and 22 [6][7][8] . Moreover, children with mosaic trisomies of chromosomes 13 and 18 live for much longer than those with constitutive trisomies, with 80% and 70% of patients with mosaic trisomy 13 and 18 respectively surviving for at least a year compared to 8% with non-mosaic trisomies 9,10 . Additionally, mosaic uniparental disomy (UPD, where two copies of one chromosome are inherited from one parent) has also been associated with several developmental disorders 5,11-14 .
Mosaic chromosomal alterations (MCAs) can be detected in SNP genotyping array data by identifying differences from the expected log R ratio (LRR) and B-allele frequency (BAF).
LRR gives a measure of the intensity at any given position on the array and deviations from the expected LRR indicate an abnormal copy number. BAF is a normalized measure of the intensity ratio of two alleles (A and B), such that a BAF of 1 or 0 indicates the complete . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Patient cohort
A total of 13,612 probands with developmental disorders, and their parents, were recruited to the DDD study by all 24 Regional Genetics Services in the UK and Republic of Ireland.
Blood-extracted DNA and/or saliva samples were collected from all probands, and where possible saliva samples were collected from parents. SNP genotyping array data was generated for 12,530 of the probands in this study. Probands were systematically phenotyped by consultant clinical geneticists using the Human Phenotype Ontology (HPO) 22 and a structured questionnaire in DECIPHER (www.deciphergenomics.org) 23 .

Mosaic chromosomal alteration detection from array
Samples from 1,465 probands were genotyped on the Illumina HumanOmniExpress chip, and samples from 11,065 probands were genotyped on the Illumina HumanCoreExome chip.
Intensity data was converted into VCF format including BAF and LRR using gtc2vcf 24 .
Mosaic chromosomal alterations were detected using MoChA 19,20 . The output was filtered to remove samples with: BAF phase concordance across phased heterozygous sites underlying the call of >0.51, calls <100kbp, calls with a LOD score of <10 for the model based on BAF and genotype phase, calls flagged by MoChA as likely germline CNVs, and calls with an estimated cell fraction of >50%. More stringent filters were subsequently applied to identify rare MCAs of likely clinical significance: events which occur in more than 1% of the cohort . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint were removed, events overlapping CNVs previously identified in the cohort were removed, and events which were <1Mb in length were removed unless they overlapped genes known to cause developmental disorders (https://www.ebi.ac.uk/gene2phenotype) 25 . All MCAs remaining after these filters were manually reviewed to evaluate data quality; events with low deviation in BAF, events in regions where the genotyping array had sparse SNPs and events in samples with noisy data were removed.
. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Tissue specificity was observed for the majority of clinically relevant MCAs
A total of 42 MCAs were detected in saliva from 38 probands (Supplementary table 1). For nine of the probands where a total of 11 MCAs were found in saliva, we also had genotyping array data from blood; the MCA was also detected in only three of these. In an additional 14 . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint probands where15 MCAs were detected in saliva, we also had WES and/or array CGH (aCGH) data from blood; there was no evidence of the MCA in any of these. For the remaining 16 MCAs from 15 probands, no other tissue type was available for testing. A total of 22 MCAs were detected in blood from 22 probands. There was SNP array data available from saliva in three of these, and in all three cases the MCA was also detected in saliva. In one additional case, although WES data was available from saliva, this MCA was a mosaic UPD and we are currently unable to detect events of this nature in WES data. For the remaining 18 MCAs detected in blood, no other tissue was available. In the three cases where the MCAs were detected in both saliva and blood, the cell fractions differed by up to 2.3-fold depending on the tissue tested. The mosaic deletion in ID 259003 and the mosaic UPD in ID 274396 were both detected in higher levels in saliva than in blood (ID 259003: saliva 53% blood 32%, ID 274396: saliva 41%, blood 18%), in contrast the mosaic loss-ofheterozygosity in ID 257978 is observed at a higher level in blood (30%) than in saliva (24%) (Supplementary table 1).
The origin of a trisomy can be determined by examination of the B-allele frequency pattern 6 .
The absence of a third haplotype indicates that five of these events (three in chromosome 12 p-arm, one in chromosome 8 and one in chromosome X) have a mitotic origin. Eight events (one in chromosome 5 p-arm, one in each of chromosomes 9, 13, 14, 18 and 20, and two in chromosome 21) have a BAF pattern consistent with occurrence in meiosis I, where three haplotypes are observed near to the centromere. A pattern consistent with occurrence in . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 7, 2022. MoChA is unable to distinguish between mosaic trisomies, mosaic tetrasomies or other mosaic polysomies. The six mosaic polysomies involving chromosome 12p are likely to be Pallister-Killian syndrome, where an isochromosome comprising two copies of chromosome 12p is present 30 . We also identify a case which is likely to be mosaic tetrasomy 5p. Only five cases of mosaic tetrasomy 5p, where an isochromosome consisting of two copies of the p arm of chromosome is present, have been reported to date 31 . In addition a case of an isochromosome consisting of two partial copies of 5p has been reported 32 . A small number of live-born cases of mosaic isochromosome 18p have been reported in the literature 33-35 , we identify a likely mosaic tetrasomy 18p.
Mosaic trisomy can occur by meiotic non-disjunction in the oocyte or sperm followed by trisomy rescue, or by mitotic nondisjunction at a later stage of development. Using genotyping array data, we were able to distinguish between mosaic polysomies occurring via nondisjunction at mitosis or meiosis, and 15/20 trisomies detected (75%) were meiotic in origin. The timing of the event has implications for counselling families, as some women have a higher rate of meiotic nondisjunction and therefore a greater recurrence risk 36,37 . This estimate is somewhat higher than Conlin et al., who found that 10/20 (50%) of trisomies had a meiotic origin 6 , and may reflect ascertainment differences between the cohorts.
The mosaic monosomies we detected all arose by mitotic nondisjunction, rather than monosomy rescue, which would result in homozygosity. This finding has important implications for recurrence risk, as for the former this is negligible whereas the latter raises the potential for gonadal mosaicism. We were unable to detect monosomy rescue as the method used is phase-based and therefore cannot detect events in runs of homozygosity 20 ; however Conlin et al. also only reported mitotic events 6 . Similarly, our study can only detect . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint mosaic UPD arising from trisomy rescue and resulting in heterodisomy as any events arising from monosomy rescue will result in isodisomy and lack heterozygous regions. Two of the MCAs described here, a mosaic monosomy and a mosaic UPD, are in chromosome 7 and include the SAMD9 gene. In both patients pathogenic SAMD9 variants have previously been reported. Loss of chromosome 7 and UPD of 7q have previously been described in patients with MIRAGE syndrome, this is believed to be an adaptation to the growth-suppressing effect of the SAMD9 variants 38,39 .
We found MoChA to be a highly effective tool for detecting clinically relevant MCAs.
Smaller subsets of the DDD cohort have previously been analysed for MCAs using alternative methods. Previously published analysis of structural mosaicism in SNP arrays from 1,303 DDD probands using MAD and triPOD described MCAs in 12 probands 3 . However, neither MAD or triPOD detected all 12 of these events and it was shown that a combination of algorithms was necessary to maximise diagnostic yield. We tested 11 of these probands, and found nine of the previously reported events. MoChA identifies events found by MAD and missed by triPOD and vice versa. One of the events missing in our filtered MoChA dataset was a genome-wide paternal UPD. This event was found by MoChA, however the sample was removed by the MoChA default filters designed to exclude samples which are either contaminated or low quality DNA based on high phased BAF autocorrelation. The second event which is not found by MoChA was a UPD of chromosome 14 present in around two-thirds of cells; it is not clear why this event was not found by MoChA, however no mosaic UPDs with a cell fraction of >0.4 were detected. Additionally, a duplication in chromosome 17 not previously identified using MAD and triPOD was detected using MoChA. These results show that using more than one tool will increase the number of . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 7, 2022. Importantly, 23/26 (88.5%) of MCAs detected from saliva where blood was also available for testing could not be detected in blood-derived DNA. This result contrasts with our previous observation that mosaic de novo SNVs were observed at similar variant allele fractions in both blood and saliva 40 , and may suggest stronger negative selection against MCAs within blood lineages. One limitation of our study is that we only have data from two tissues, blood and saliva. While study of saliva yields more mosaic events than blood, mutations occurring later in embryonic development are likely to be present in a narrower range of tissues and we may therefore miss potentially diagnostic events by not having more tissue types available to study. Nonetheless, our observations highlight the importance of testing saliva (or other tissues) where possible to avoid missing mosaic structural events CytoGenomics laboratory had a higher yield than our study (30/2019, 1.5%) 6 . Our results add . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint to this body of literature, but are likely to be an underestimate of the true diagnostic yield from MCAs in developmental disorders, due to under-ascertainment in the DDD study of cases who would have been previously diagnosed using prior clinical genetic testing (such as karyotyping and microarray analysis).
Our results show that rare mosaic chromosomal alterations are an important source of diagnoses in severe developmental disorders. The meiotic or mitotic origin of the mutation can often be determined through careful analysis of genotyping array data and has important implications for recurrence risk. This work suggests that routinely analysing SNP genotyping array data could provide potential diagnoses that are currently difficult to detect via WES, and that diagnostic yield will be increased by the analysis of saliva samples. We recommend that clinical teams consider the use of saliva-derived DNA for SNP array analysis for the investigation of neurodevelopmental disorders to complement genome-wide sequencing using blood-derived DNA. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.
. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Figure 2
The distribution of MCAs in the genome. Each bar represents one event; deletions are shown in orange, duplications in green and loss of heterozygosity in blue. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review)
The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint Figure 3 Origin of mosaic trisomies. A) A mosaic trisomy that has occurred during mitosis. B) A mosaic trisomy that has occurred during meiosis I. C) A mosaic trisomy that has occurred during meiosis II.
. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint A B C . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint

Supplementary figure 3
Complex MCAs A) A deletion (red) in chromosome 13 flanked by copy number neutral loss-of-heterozygosity (cyan). B) Chromosome 1 duplication (red) followed by UPD (cyan) of the majority of the q-arm. C) Duplication in chromosome 7 and polysomy in chromosome X comprising both mosaic (red) and non-mosaic (cyan) regions in the same patient. a dup duplication, del deletion, upd uniparental disomy, upd (pat) paternal uniparental disomy b w whole chromosome, p p-arm, q q-arm c nd not done . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 7, 2022. ; https://doi.org/10.1101/2022.03.28.22273024 doi: medRxiv preprint