Refining the molecular genetic tools
Molecular genetic techniques have evolved rapidly over the last three decades. Refining them is normally the work of universities and research laboratories. Sometimes, however, genetic markers need to be identified for specific applied projects, as was the case for red foxes. We also had to test the reliability of a particularly large data set. The problems posed by large genetic data sets are relatively new, because technological advances in genetic screening, together with decreasing costs, have only recently made it possible to compile them. We faced these problems whilst analysing the population structure and gene flow of domestic horses from across the world.
- Wandeler, P., Funk, S.M., 2006. Short microsatellite DNA markers for the red fox (Vulpes vulpes). Molecular Ecology Notes 6, 98–100.
Seven short microsatellite loci (< 165 bp) were characterized for red foxes for the amplification of degraded DNA extracted from historical samples. Following polymerase chain reaction (PCR) using primers developed in the domestic dog, red fox‐specific primers were designed within the flanking regions. The number of detected alleles ranged from six to 15 and the expected heterozygosities ranged from 0.67 to 0.92. No deviations from Hardy–Weinberg equilibrium were detected for any of the markers.
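As an illustration of the summary statistics quoted above, the following minimal Python sketch computes expected and observed heterozygosity for a single locus. It is not the code used in the study: the function names and the genotype data (allele sizes in bp) are hypothetical, and the small-sample correction applied is Nei's commonly used unbiased estimator, which the abstract does not specify.

```python
from collections import Counter


def expected_heterozygosity(genotypes, unbiased=True):
    """Expected heterozygosity at one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    unbiased:  apply the 2N / (2N - 1) small-sample correction (an assumption;
               the abstract does not state which estimator was used).
    """
    alleles = [allele for genotype in genotypes for allele in genotype]
    n_copies = len(alleles)  # 2N gene copies for N diploid individuals
    counts = Counter(alleles)
    # 1 minus the sum of squared allele frequencies
    h = 1.0 - sum((c / n_copies) ** 2 for c in counts.values())
    if unbiased and n_copies > 1:
        h *= n_copies / (n_copies - 1)
    return h


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


# Hypothetical genotypes for one short microsatellite locus (sizes in bp).
example_genotypes = [(120, 124), (120, 120), (124, 128), (128, 128)]

print(expected_heterozygosity(example_genotypes, unbiased=False))  # 0.65625
print(expected_heterozygosity(example_genotypes))                  # 0.75
print(observed_heterozygosity(example_genotypes))                  # 0.5
```

A test for Hardy–Weinberg equilibrium compares observed genotype counts with those expected from the allele frequencies. With as many alleles per locus as reported here, this is usually done with an exact or permutation test in dedicated software rather than by hand.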
- Funk, S.M., Guedaoura, S., Juras, R., Raziq, A., Landolsi, F., Luís, C., Martínez, A.M., Musa Mayaki, A., Mujica, F., Oom, M. do M., Ouragh, L., Stranger, Y.-M., Vega-Pla, J.L., Cothran, E.G., 2020. Major inconsistencies of inferred population genetic structure estimated in a large set of domestic horse breeds using microsatellites. Ecology and Evolution 10, 4261–4279.
STRUCTURE remains the most applied software aimed at recovering the true, but unknown, population structure from microsatellite or other genetic markers. About 30% of STRUCTURE‐based studies could not be reproduced (Molecular Ecology, 21, 2012, 4925).
Here we use a large set of data from 2,323 horses from 93 domestic breeds plus the Przewalski horse, typed at 15 microsatellites, to evaluate how program settings impact the estimation of the optimal number of population clusters Kopt that best describe the observed data. Domestic horses are suited as a test case as there is extensive background knowledge on the history of many breeds and extensive phylogenetic analyses.
Different methods based on different genetic assumptions and statistical procedures (DAPC, FLOCK, PCoA, and STRUCTURE with different run scenarios) all revealed general, broad‐scale breed relationships that largely reflect known breed histories but diverged in how they characterized small‐scale patterns. STRUCTURE failed to consistently identify Kopt using the most widespread approach, the ΔK method, despite very large numbers of MCMC iterations (3,000,000) and replicates (100). The interpretation of breed structure over increasing numbers of K, without assuming a Kopt, was consistent with known breed histories. The over‐reliance on Kopt should be replaced by a qualitative description of clustering over increasing K, which is scientifically more honest and has the advantage of being much faster and less computer intensive, as lower numbers of MCMC iterations and repetitions suffice for stable results.
Very large data sets are highly challenging for cluster analyses, especially when populations with complex genetic histories are investigated.
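For readers unfamiliar with the ΔK method mentioned in the abstract above, the following minimal Python sketch shows how it is commonly computed from replicate STRUCTURE runs, following the definition of Evanno et al. (2005): the absolute second-order rate of change of the mean log-likelihood Ln P(D|K), divided by the standard deviation of Ln P(D|K) across replicates. It is not the code used in the study. The function name and the log-likelihood values are hypothetical, and the sketch assumes the second difference is taken on replicate means.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta K per K from replicate log-likelihoods.

    lnp_by_k: dict mapping K -> list of Ln P(D|K) values, one per replicate
              run (at least two replicates per K).
    Returns a dict K -> Delta K for every K that has both neighbours K-1 and
    K+1 and a non-zero standard deviation. The smallest and largest K tested
    can never be evaluated.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second difference undefined at the edges
        if sds[k] == 0:
            continue  # avoid division by zero
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sds[k]
    return result


# Hypothetical Ln P(D|K) values from three replicate runs per K.
example_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-880.0, -884.0, -876.0],
    4: [-870.0, -875.0, -865.0],
}

dk = delta_k(example_runs)
print(dk)                    # {2: 80.0, 3: 2.5}
print(max(dk, key=dk.get))   # 2
```

Because ΔK divides by the between-replicate standard deviation, the K it selects can shift when replicate runs at a given K disagree, which is one reason to describe the clustering over a range of K rather than report a single Kopt.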