Description

This annotation shows regions detected as putative copy number polymorphisms (CNP) and sites of detected intermediate-sized structural variation (ISV). The CNPs and ISVs were determined by various methods, displayed in individual subtracks within the annotation:

Display Conventions and Configuration

CNP and ISV regions are indicated by solid blocks that are color-coded to indicate the type of variation detected:

Sharp subtrack

On the details pages for elements in this subtrack, the table shows value/threshold data for each individual in the population. "Value" is defined as the log2 ratio of fluorescence intensity of test versus reference DNA. "Threshold" is defined as 2 standard deviations from the mean log2 ratio of all autosomal clones per hybridization. The "Disease Percent" value reflects the percent of the BAC that lies within a "rearrangement hotspot", as defined in Sharp et al. (2005) (the rationale used to choose BACs for the array construction). A rearrangement hotspot is defined by the presence of flanking intrachromosomal duplications >10 kb in length with >95% similarity and separated by 50 kb - 10 Mb of intervening sequence.
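
The value/threshold definitions above can be illustrated with a minimal Python sketch. It assumes the threshold is applied symmetrically around the mean and uses the sample standard deviation; the function names and the gain/loss labels are hypothetical and are not taken from Sharp et al. (2005).

```python
import math
import statistics


def log2_ratio(test_intensity, reference_intensity):
    """Log2 ratio of test versus reference fluorescence intensity ("Value")."""
    return math.log2(test_intensity / reference_intensity)


def call_clones(autosomal_values, n_sd=2.0):
    """Flag clones lying more than n_sd standard deviations from the mean.

    autosomal_values maps a clone name to its log2 ratio in one hybridization.
    Returns (lower_threshold, upper_threshold, calls), where calls maps each
    clone to "gain", "loss" or "none".
    """
    values = list(autosomal_values.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD: an assumption of this sketch
    lower = mean - n_sd * sd
    upper = mean + n_sd * sd
    calls = {}
    for clone, value in autosomal_values.items():
        if value > upper:
            calls[clone] = "gain"
        elif value < lower:
            calls[clone] = "loss"
        else:
            calls[clone] = "none"
    return lower, upper, calls
```

Because the mean and standard deviation are recomputed for each hybridization, the same log2 ratio can fall inside the threshold on one array and outside it on another.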

Tuzun subtrack

Items are labeled using the following naming convention:

Conrad subtrack

The method used to identify these deletions approximates the breakpoints of each event; therefore, a set of minimal and maximal endpoints is associated with each deletion. Thick lines delineate the minimally deleted region; thin lines delineate the maximally deleted region.
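
One way to represent a deletion with both sets of endpoints is a BED line in which chromStart/chromEnd span the maximally deleted region and thickStart/thickEnd span the minimally deleted region. The sketch below assumes that encoding; the `DeletionCall` class and its field names are hypothetical, and the actual storage format of this subtrack is not described here.

```python
from dataclasses import dataclass


@dataclass
class DeletionCall:
    chrom: str
    max_start: int  # start of the maximally deleted region (0-based)
    max_end: int    # end of the maximally deleted region
    min_start: int  # start of the minimally deleted region
    min_end: int    # end of the minimally deleted region
    name: str = "deletion"

    def __post_init__(self):
        # The minimal region must be nested inside the maximal region.
        if not (self.max_start <= self.min_start <= self.min_end <= self.max_end):
            raise ValueError("minimal region must lie within maximal region")

    def to_bed(self):
        """Return a tab-separated BED line with eight fields.

        Fields: chrom, chromStart, chromEnd, name, score, strand,
        thickStart, thickEnd. The thick part is the minimal region.
        """
        fields = [self.chrom, self.max_start, self.max_end, self.name,
                  0, ".", self.min_start, self.min_end]
        return "\t".join(str(field) for field in fields)
```

For example, `DeletionCall("chr1", 1000, 5000, 2000, 4000, "del1").to_bed()` yields a line whose thick portion covers positions 2000 to 4000 and whose thin portion extends to 1000 and 5000.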

Methods

Sharp BAC microarray analysis

All hybridizations were performed in duplicate incorporating a dye-reversal using a custom array consisting of 2,194 end-sequence or FISH-confirmed BACs, targeted to regions of the genome flanked by segmental duplications. The false positive rate was estimated at ~3 clones per 4,000 tested.

Iafrate BAC microarray analysis

All hybridizations were performed in duplicate incorporating a dye-reversal using proprietary 1 Mb GenomeChip V1.2 Human BAC Arrays consisting of 2,632 BAC clones (Spectral Genomics, Houston, TX). The false positive rate was estimated at ~1 clone per 5,264 tested.

Further information is available from the Database of Genome Variants website.

Sebat ROMA

Following digestion with BglII or HindIII, genomic DNA was hybridized to a custom array consisting of 85,000 oligonucleotide probes. The probes were selected to be free of common repeats and have unique homology within the human genome. The average resolution of the array was ~35 kb; however, only intervals in which three consecutive probes showed concordant signals were scored as CNPs. All hybridizations were performed in duplicate incorporating a dye-reversal, with the false positive rate estimated to be ~6%.
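
The three-consecutive-probe rule can be sketched as a scan for runs of concordant calls. This is only an illustration: it assumes each probe has already been reduced to a call of +1 (gain), -1 (loss) or 0 (no change), which is not how Sebat et al. (2004) describe their signal processing, and the function name is hypothetical.

```python
def find_concordant_runs(probes, min_probes=3):
    """Find runs of consecutive probes that share the same non-zero call.

    probes is a list of (position, call) tuples sorted by position, where
    call is +1 (gain), -1 (loss) or 0 (no change).
    Returns a list of (start_position, end_position, call, n_probes).
    """
    intervals = []
    run_start = 0
    for i in range(1, len(probes) + 1):
        # A run ends at the end of the list or when the call changes.
        run_ended = i == len(probes) or probes[i][1] != probes[run_start][1]
        if run_ended:
            call = probes[run_start][1]
            length = i - run_start
            if call != 0 and length >= min_probes:
                intervals.append(
                    (probes[run_start][0], probes[i - 1][0], call, length)
                )
            run_start = i
    return intervals
```

Requiring several concordant probes trades resolution for specificity: a single aberrant probe is ignored, so the smallest interval that can be scored is wider than the average probe spacing.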

Note that CNP intervals, as detailed by Sebat et al. (2004), were converted from the April 2003 human genome assembly (NCBI Build 33) to the July 2003 assembly (NCBI Build 34) using the UCSC liftOver tool.

Tuzun fosmid mapping

Paired-end sequences from a human fosmid DNA library were mapped to the assembly. The average resolution of this technique was ~8 kb, and the variants detected included 56 sites of inversion not detectable by the array-based approaches. However, because of the physical constraints of fosmid insert size, this technique was unable to detect insertions greater than 40 kb in size.
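
The reasoning behind paired-end mapping can be sketched as a classification of each fosmid by the distance and orientation of its mapped ends. The expected insert size, the tolerance and the category names below are assumptions chosen for illustration; they are not the criteria used by Tuzun et al. (2005).

```python
EXPECTED_INSERT = 40_000  # assumed typical fosmid insert size, in bases
TOLERANCE = 8_000         # hypothetical allowance around the expected size


def classify_pair(left_start, right_end, left_strand, right_strand,
                  expected=EXPECTED_INSERT, tolerance=TOLERANCE):
    """Classify one fosmid from where its two end sequences map.

    left_start and right_end are the outer coordinates of the two mapped
    ends on the assembly; the strands are "+" or "-".
    """
    # Ends of an unrearranged insert map to opposite strands.
    if left_strand == right_strand:
        return "inversion"
    span = right_end - left_start
    if span > expected + tolerance:
        # The ends are farther apart on the assembly than the insert is
        # long: the fosmid source lacks sequence present in the assembly.
        return "deletion"
    if span < expected - tolerance:
        # The ends are closer together than the insert is long: the fosmid
        # carries sequence absent from the assembly. An insertion of size s
        # shrinks the span to about expected - s, so insertions approaching
        # the insert size cannot be spanned by a single fosmid.
        return "insertion"
    return "concordant"
```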

McCarroll genotype analysis

A segregating deletion can leave "footprints" in SNP genotype data, including apparent deviations from Mendelian inheritance, apparent deviations from Hardy-Weinberg equilibrium and null genotypes. Using these clues to discover true variants is challenging, however, because the vast majority of such observations represent technical artifacts and genotyping errors.

To determine whether a subset of "failed" SNP genotyping assays in the HapMap data might reflect structural variation, the authors examined whether such failures were physically clustered in a manner that is specific to individuals. Consistent with this hypothesis, the rate of Mendelian-inconsistent genotypes was elevated near other Mendelian-inconsistent genotypes in the same individual but was unrelated to Mendelian inconsistencies in other individuals.

The authors systematically looked for regions of the genome in which the same failure profile appeared repeatedly at nearby markers in a manner that was statistically unexpected based on chance. A set of statistical thresholds was tailored to each mode of failure, genotyping center and genotyping platform used in the project. The same procedure could readily apply to dense SNP data from any platform or study.
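
As a rough illustration of what "statistically unexpected based on chance" can mean, the sketch below computes a binomial tail probability for the number of failures seen among nearby markers in one individual. The independence model, the significance cutoff and the function names are assumptions of this sketch; the thresholds actually used by McCarroll et al. (2006) were tailored to each failure mode, genotyping center and platform and are not reproduced here.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of at least k failures among n independent markers,
    each failing with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_unexpected_cluster(n_markers, n_failures, failure_rate, alpha=1e-6):
    """Return True if the failure count is unlikely under the chance model.

    alpha is a hypothetical cutoff; in practice it would need to account
    for the number of windows and individuals tested.
    """
    return binomial_tail(n_markers, n_failures, failure_rate) < alpha
```

With a background failure rate of 1%, eight failures among ten neighboring markers is far less likely than one failure among ten, which is the kind of contrast a clustering test exploits.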

Conrad genotype analysis

SNPs in regions that are hemizygous for a deletion are generally miscalled as homozygous for the allele that is present. Hence, when a deletion is transmitted from parent to child, the genotypes at SNPs within the deletion region will often appear to violate the rules of Mendelian transmission. The authors developed a simple algorithm for scanning trio data for unusual runs of consecutive SNPs that, in a single family, have genotype configurations consistent with the presence of a deletion.
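
The idea of scanning a trio for runs of deletion-consistent SNPs can be sketched as follows. This is a simplified illustration, not the authors' algorithm: the three status categories, the `min_informative` parameter and the choice to report the span between the first and last informative SNP are all assumptions.

```python
def snp_status(parent, child, other_parent):
    """Classify one SNP for a deletion transmitted from parent to child.

    Genotypes are two-character strings such as "AA", "AB" or "BB".
    """
    parent_hom = parent[0] == parent[1]
    child_hom = child[0] == child[1]
    if not (parent_hom and child_hom):
        return "incompatible"  # a heterozygous call implies two copies
    if child[0] == parent[0]:
        return "compatible"    # consistent with a deletion, but uninformative
    if child[0] in other_parent:
        return "informative"   # child appears to lack the parent's allele
    return "incompatible"      # not explained by a transmitted deletion


def find_deletion_runs(trio_genotypes, min_informative=2):
    """Scan SNPs sorted by position for runs consistent with a deletion.

    trio_genotypes is a list of (position, parent, child, other_parent).
    Returns (first_informative_pos, last_informative_pos, n_informative)
    for each run not interrupted by an incompatible SNP.
    """
    runs = []
    informative = []
    for pos, parent, child, other in trio_genotypes:
        status = snp_status(parent, child, other)
        if status == "informative":
            informative.append(pos)
        elif status == "incompatible":
            if len(informative) >= min_informative:
                runs.append((informative[0], informative[-1], len(informative)))
            informative = []
    if len(informative) >= min_informative:
        runs.append((informative[0], informative[-1], len(informative)))
    return runs
```

In this sketch the informative SNPs bound a region that must be deleted, while the nearest incompatible SNPs on either side bound the region that could be deleted, which parallels the minimal and maximal endpoints reported for each deletion.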

Hinds haploid hybridization analysis

Approximately 600 Mb of genomic DNA from 24 unrelated individuals was obtained from the Polymorphism Discovery Resource. Haploid hybridization was used to identify genomic intervals showing a reduced hybridization signal in comparison to the reference assembly. PCR amplification was performed on 215 candidate deletions, and the 100 deletions that were unambiguously confirmed were selected.

Validation

McCarroll genotype analysis

Four methods of validation were used: fluorescent in situ hybridization (FISH), two-color fluorescence intensity measurements, PCR amplification and quantitative PCR.

The authors performed fluorescent in situ hybridization (FISH) for five candidate deletions large enough to span available FISH probes. In all five cases, FISH assays confirmed the deletions in the predicted individuals.

The authors examined two-color allele-specific fluorescence data from SNP genotyping assays from a data subset available at the Broad Institute, looking for a reduction in fluorescence intensity in individuals predicted to carry a deletion. At most SNPs in the genome, fluorescence intensity measurements cluster into two or three discrete groups corresponding to homozygous and heterozygous genotypes. At 15 of 17 candidate deletion loci, fluorescence intensity data for one or more SNPs clustered into additional groups that corresponded to the predicted deletion genotypes.

The authors used PCR amplification to query 60 loci for which the pattern of genotypes suggested multiple individuals with homozygous deletions. Variants were considered confirmed if the pattern of amplification success and failure matched prediction across a set of 12-24 individuals. The authors confirmed 51 of 60 candidate variants by this criterion.

The authors performed quantitative PCR in all 269 HapMap DNA samples for 11 candidate deletions that overlapped the coding exons of genes and that were discovered in many individuals. At 10/11 loci, the authors observed three discrete clusters, identifying individuals with zero, one and two gene copies. All 60 trios displayed Mendelian inheritance for the ten deletions, as well as Hardy-Weinberg equilibrium in all four populations surveyed, and transmission rates close to 50%. This suggests that the deletions behave as a stable, heritable genetic polymorphism.
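
A Hardy-Weinberg check on the three copy-number clusters can be sketched with a standard chi-square goodness-of-fit calculation. This is a generic textbook test shown for illustration; the function name is hypothetical, and the source does not state which test the authors used.

```python
def hardy_weinberg_chi2(n_zero, n_one, n_two):
    """Chi-square statistic for Hardy-Weinberg equilibrium at a deletion.

    n_zero, n_one and n_two are the numbers of unrelated individuals with
    zero, one and two gene copies. Assumes at least one individual.
    Returns (deletion_allele_frequency, expected_counts, chi2).
    """
    n = n_zero + n_one + n_two
    q = (2 * n_zero + n_one) / (2 * n)  # frequency of the deletion allele
    p = 1 - q                           # frequency of the intact allele
    expected = (n * q * q, 2 * n * p * q, n * p * p)
    observed = (n_zero, n_one, n_two)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return q, expected, chi2
```

With one degree of freedom, a statistic above about 3.84 indicates departure from equilibrium at the 5% level. Such a test is normally applied to unrelated individuals within a single population, for example the parents of the trios.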

Conrad genotype analysis

The authors first tested 12 predicted deletions using quantitative PCR. For all 12 deletions they observed DNA concentrations consistent with transmission of a deletion from parent to child.

To provide more extensive validation by comparative genome hybridization (CGH), the authors designed a custom oligonucleotide microarray comprising 380,000 probes that tile across all 134 candidate deletions identified in nine HapMap offspring (8 YRI and 1 CEU). The results of this CGH analysis indicate that the majority (about 85%) of candidate deletions detected by the method are real.

References

Conrad, D., Andrews, T.D., Carter, N.P., Hurles, M.E., Pritchard, J.K. A high-resolution survey of deletion polymorphism in the human genome. Nature Genet 38(1), 75-81 (2006).

Hinds, D., Kloek, A.P., Jen, M., Chen, X., Frazer, K.A. Common deletions and SNPs are in linkage disequilibrium in the human genome. Nature Genet 38(1), 82-85 (2006).

Iafrate, J.A., Feuk, L., Rivera, M.N., Listewnik, M.L., Donahoe, P.K., Qi, Y., Scherer, S.W., Lee, C. Detection of large-scale variation in the human genome. Nature Genet 36(9), 949-51 (2004).

McCarroll, S.A., Hadnott, T.N., Perry, G.H., Sabeti, P.C., Zody, M.C., Barrett, J.C., Dallaire, S., Gabriel, S., Lee, C., Daly, M.J., Altshuler, D.M. Common deletion polymorphisms in the human genome. Nature Genet 38(1), 86-92 (2006).

Sebat, J., Lakshmi, B., Troge, J., Alexander, J., Young, J., Lundin, P., Maner, S., Massa, H., Walker, M., Chi, M. et al. Large-scale copy number polymorphism in the human genome. Science 305(5683), 525-8 (2004).

Sharp, A.J., Locke, D.P., McGrath, S.D., Cheng, Z., Bailey, J.A., Samonte, R.V., Pertz, L.M., Clark, R.A., Schwartz, S., Segraves, R. et al. Segmental duplications and copy number variation in the human genome. Am J Hum Genet 77(1), 78-88 (2005).

Tuzun, E., Sharp, A.J., Bailey, J.A., Kaul, R., Morrison, V.A., Pertz, L.M., Haugen, E., Hayden, H., Albertson, D., Pinkel, D. et al. Fine-scale structural variation of the human genome. Nature Genet 37(7), 727-32 (2005).