Document Type



Motivation: Population allele frequencies are correlated when populations have a shared history or when they exchange genes. Unfortunately, most models for allele frequency and inference about population structure ignore this correlation. Recent analytical results show that among populations, correlations can be very high, which could affect estimates of population genetic structure. In this study, we propose a mixture beta model to characterize the allele frequency distribution among populations. This formulation incorporates the correlation among populations as well as extending the model to data with different clusters of populations. Results: Using simulated data, we show that in general, the mixture model provides a good approximation of the among-population allele frequency distribution and a good estimate of correlation among populations. Results from fitting the mixture model to a dataset of genotypes at 377 autosomal microsatellite loci from human populations indicate high correlation among populations, which may not be appropriate to neglect. Traditional measures of population structure tend to over-estimate the amount of genetic differentiation when correlation is neglected. Inference is performed in a Bayesian framework.


Reprinted from Bioinformatics, Vol. 21, Issue 8, April 15 2005, pp. 1516-1529 This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Bioinformatics following peer review. The definitive publisher-authenticated version, from Bioinformatics, Vol. 21, Issue 8, April 15 2005, pp. 1516-1529 is available online at: