Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples
Date
2012-01-11
Author
Mullen, Michael P.
Creevey, Christopher J.
Berry, Donagh P.
McCabe, Matt S.
Magee, David A.
Howard, Dawn J.
Killeen, Aideen P.
Park, Stephen D.
McGettigan, Paul A.
Lucy, Matt C.
MacHugh, David E.
Waters, Sinead M.
Abstract
Background: The central role of the somatotrophic axis in animal post-natal growth, development and fertility is
well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an
attractive goal. However, large sample numbers are a prerequisite for the identification of genetic variants
underlying complex traits and, although technologies are improving rapidly, high-throughput sequencing of large
numbers of complete individual genomes remains prohibitively expensive. Therefore, the aim of this study was to use
a pooled DNA approach coupled with target enrichment and high-throughput sequencing to identify
polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis in
150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility.
Results: In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes.
Nineteen percent (n = 952) of variants were located within 5’ and 3’ UTRs. Seventy-two percent (n = 3,612) were
intronic and 9% (n = 464) were exonic, including 65 indels and 236 SNPs resulting in non-synonymous
substitutions (NSS). Significant (P < 0.01) mean allele frequency differentials between the low and high fertility
groups were observed for 720 SNPs (58 NSS). Allele frequencies for 43 of the SNPs were also determined by
genotyping the 150 individual animals (Sequenom® MassARRAY). No significant differences (P > 0.1) were observed
between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total).
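The counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative and are not taken from the study, and only figures stated in the Results are used.

```python
# Variant counts as reported in the Results paragraph.
snps = 4135
indels = 893
total = snps + indels

# Counts per genomic region, as reported (names are illustrative).
by_region = {"UTR": 952, "intronic": 3612, "exonic": 464}

# The regional counts should partition the full set of variants.
assert sum(by_region.values()) == total

# Share of each region, rounded to whole percent as in the text.
shares = {region: round(100 * n / total) for region, n in by_region.items()}

# 43 SNPs compared between methods in each of the two pools.
validation_tests = 43 * 2

print(total, shares, validation_tests)
```

The regional counts sum to the total number of variants, and the rounded shares match the percentages quoted in the text.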
Conclusions: The results of the current study support previous findings that DNA sample pooling combined with
high-throughput sequencing is a viable strategy for polymorphism discovery and allele frequency estimation.
Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and
related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and
fertility. We have identified a large number of variants segregating at significantly different frequencies between
cattle groups divergent for calving interval; this set plausibly harbours causative variants contributing to heritable
variation. To our knowledge, this is the first report describing sequencing of targeted genomic regions in any
livestock species using groups with divergent phenotypes for an economically important trait.