Summary of pp. 516-517 of Haldane's "The Cost of Natural Selection"
Let $p_n$ be the frequency of an allele A in generation $n$, and $q_n$ the frequency of the alternative allele a at the same locus in a diploid organism, so that $p_n + q_n = 1$.
Such an organism has three possible genotypes at this locus: AA, Aa, and aa. The expected frequencies of these
genotypes, based on the frequencies of the two alleles (A and a), are:
AA: $p_n^2$
Aa: $2 p_n q_n$
aa: $q_n^2$
Using the usual relative fitnesses, $(1 - s)$ individuals with
the aa genotype survive for every AA individual, and $(1 - sh)$ individuals
with the Aa genotype survive for each AA individual. The factor $h$,
with $0 \le h \le 1$, is the dominance factor: if $h = 1$, the a allele
is dominant; if $h = 0$, the A allele is dominant; and for
intermediate values of $h$ we have incomplete dominance. If the frequencies
of the three genotypes (AA, Aa, and aa) before selection are $p_n^2$, $2 p_n q_n$, and $q_n^2$,
and their relative fitnesses are $1$, $1 - sh$, and $1 - s$, then after a round of selection the
relative frequencies of the three genotypes will be:
AA: $\frac{p_n^2}{1 - 2sh\,p_n q_n - s\,q_n^2}$
Aa: $\frac{(1 - sh) \cdot 2 p_n q_n}{1 - 2sh\,p_n q_n - s\,q_n^2}$
aa: $\frac{(1 - s)\,q_n^2}{1 - 2sh\,p_n q_n - s\,q_n^2}$
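The three post-selection frequencies can be sketched in a few lines of Python. The function name and the example values of q, s, and h below are illustrative assumptions, not taken from Haldane's paper; the formulas are the ones listed above.

```python
def genotype_freqs_after_selection(q, s, h):
    """Return the (AA, Aa, aa) frequencies after one round of selection.

    q is the frequency of allele a, s the selection coefficient,
    and h the dominance factor (0 <= h <= 1).
    """
    p = 1.0 - q  # frequency of allele A
    # Expected genotype frequencies weighted by relative fitness 1, 1 - sh, 1 - s
    w_AA = p * p
    w_Aa = (1.0 - s * h) * 2.0 * p * q
    w_aa = (1.0 - s) * q * q
    # The denominator that normalises the three terms
    denominator = 1.0 - 2.0 * s * h * p * q - s * q * q
    return w_AA / denominator, w_Aa / denominator, w_aa / denominator


# Illustrative values only
freqs = genotype_freqs_after_selection(q=0.9, s=0.01, h=0.5)
print(freqs, sum(freqs))
```

Because the denominator equals the sum of the three numerators, the returned frequencies add up to 1.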
In his calculations of the substitution cost, Haldane chose to ignore
the denominator $(1 - 2sh\,p_n q_n - s\,q_n^2)$
for small $s$. It can be shown that ignoring the denominator introduces
a maximal error of $s$ in one generation, since the denominator never falls
below $1 - s$ when $0 \le h \le 1$. If the selection coefficient
is 0.01, the error will be less than 1% per generation when $q$ is large
(nearly 1.0), and it will go down as $q$ is reduced. Haldane felt this could
be ignored, but others have disagreed with him on this point.
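As a rough numerical check of how far the neglected denominator strays from 1, the sketch below scans a grid of q and h values. The function name `mean_fitness_denominator`, the grid, and the value s = 0.01 are assumptions chosen for illustration.

```python
def mean_fitness_denominator(q, s, h):
    """Denominator 1 - 2sh*p*q - s*q^2 of the post-selection frequencies."""
    p = 1.0 - q
    return 1.0 - 2.0 * s * h * p * q - s * q * q


s = 0.01  # illustrative selection coefficient
q_grid = [i / 100 for i in range(101)]  # q from 0.00 to 1.00
h_values = (0.0, 0.5, 1.0)

# Largest amount by which the denominator falls short of 1 on the grid
worst = max(
    1.0 - mean_fitness_denominator(q, s, h)
    for q in q_grid
    for h in h_values
)
print(worst)
```

On this grid the shortfall is largest at q = 1, where it equals s up to floating-point rounding, and it shrinks as q falls.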
Under these conditions, the fraction of selective deaths due to
natural selection in a single generation is given approximately by:
$d_n = 2sh\,p_n q_n + s\,q_n^2$
This is because (ignoring the denominator) the frequency of the Aa genotype
is reduced by $2sh\,p_n q_n$ and the frequency of the
aa genotype is reduced by $s\,q_n^2$. These reductions
are the fraction of selective deaths required for the change.
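A minimal sketch of the per-generation death fraction follows. The function name and the sample values are hypothetical; only the formula comes from the text.

```python
def selective_deaths(q, s, h):
    """Approximate fraction of selective deaths in one generation.

    Sums the reduction in Aa (2sh*p*q) and the reduction in aa (s*q^2),
    ignoring the denominator as described above.
    """
    p = 1.0 - q
    return 2.0 * s * h * p * q + s * q * q


# With q = 0.9, s = 0.01, h = 0.5 the two terms are 0.0009 and 0.0081
print(selective_deaths(q=0.9, s=0.01, h=0.5))
```

When q is 1 the heterozygote term vanishes and the death fraction is simply s.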
If $q_{n+1}$ is defined as the frequency of the a allele
after a round of selection,
$q_{n+1} = \frac{1}{2} \cdot \frac{(1 - sh) \cdot 2 p_n q_n}{1 - 2sh\,p_n q_n - s\,q_n^2} + \frac{(1 - s)\,q_n^2}{1 - 2sh\,p_n q_n - s\,q_n^2}$
$= \frac{(1 - sh)\,p_n q_n + (1 - s)\,q_n^2}{1 - 2sh\,p_n q_n - s\,q_n^2}$
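The recursion for the a allele can be written as a one-step update. This is a sketch under the definitions above; `next_q` and the example parameters are illustrative names and values, not Haldane's.

```python
def next_q(q, s, h):
    """Exact frequency of allele a after one round of selection."""
    p = 1.0 - q
    # Half of the surviving Aa alleles plus all of the surviving aa alleles
    numerator = (1.0 - s * h) * p * q + (1.0 - s) * q * q
    denominator = 1.0 - 2.0 * s * h * p * q - s * q * q
    return numerator / denominator


# Follow the a allele for a few generations (illustrative values)
q = 0.9
for generation in range(5):
    q = next_q(q, s=0.01, h=0.5)
    print(generation + 1, q)
```

Because a is the allele selected against, q falls a little each generation, and q = 0 and q = 1 are fixed points of the update.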
The change in $q$ due to a single generation of selection ($\Delta q_n$)
is given by:
$\Delta q_n = q_{n+1} - q_n$
$\Delta q_n = \frac{-s\,p_n q_n\,[h(p_n - q_n) + q_n]}{1 - 2sh\,p_n q_n - s\,q_n^2}$
$\Delta q_n = -s\,p_n q_n\,[h + (1 - 2h)\,q_n]$ approximately (ignoring the denominator). Follow this
link for details: Derivation of
$\Delta q_n$ for a Diploid
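The intermediate algebra can be reconstructed from the definitions given earlier. The steps below are a worked sketch, not a copy of the linked derivation; they assume only that $p_n + q_n = 1$ and that selection acts with fitnesses $1$, $1 - sh$, and $1 - s$.

```latex
\begin{align*}
\Delta q_n &= q_{n+1} - q_n
  = \frac{(1 - sh)\,p_n q_n + (1 - s)\,q_n^2
          - q_n\,(1 - 2sh\,p_n q_n - s\,q_n^2)}
         {1 - 2sh\,p_n q_n - s\,q_n^2} \\[4pt]
\text{numerator} &= q_n\,(p_n + q_n - 1)
  - s\,q_n \left[ h\,p_n + q_n - 2h\,p_n q_n - q_n^2 \right] \\
  &= -s\,q_n \left[ h\,p_n (1 - 2 q_n) + q_n (1 - q_n) \right]
     && \text{since } p_n + q_n = 1 \\
  &= -s\,p_n q_n \left[ h\,(1 - 2 q_n) + q_n \right]
     && \text{since } 1 - q_n = p_n \\
  &= -s\,p_n q_n \left[ h + (1 - 2h)\,q_n \right]
\end{align*}
```

Writing $1 - 2q_n$ as $p_n - q_n$ gives the form with $h(p_n - q_n) + q_n$ in the bracket, and replacing the denominator by 1 gives the approximate expression.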
Therefore, $D$, the total of the deaths (as a fraction of the population
size) over the course of a substitution, is given by:
$D = \sum_{n=0}^{\infty} [2sh\,p_n q_n + s\,q_n^2]$
(This is the summation from $n = 0$ to infinity of $2sh\,p_n q_n + s\,q_n^2$.)
After placing the constant $s$ term outside the summation, we have:
$D = s \sum_{n=0}^{\infty} [2h\,p_n q_n + q_n^2]$
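The sum can be approximated numerically by iterating the approximate recursion for q (the one that ignores the denominator) and accumulating the bracketed term. This is a sketch under stated assumptions: the function name, the stopping tolerance, the generation cap, and the starting values are all illustrative choices, and the infinite sum is truncated once q becomes negligible.

```python
import math


def total_selective_deaths(q0, s, h, tol=1e-12, max_generations=1_000_000):
    """Approximate D = s * sum over n of [2h*p_n*q_n + q_n^2].

    q is advanced with the approximate change
    -s*p*q*[h + (1 - 2h)*q]; the sum stops when q < tol
    or after max_generations steps (both are assumptions of this sketch).
    """
    q = q0
    bracket_sum = 0.0
    for _ in range(max_generations):
        if q < tol:
            break
        p = 1.0 - q
        bracket_sum += 2.0 * h * p * q + q * q
        q += -s * p * q * (h + (1.0 - 2.0 * h) * q)
    return s * bracket_sum


# Illustrative run: favoured allele A starts at frequency p0 = 0.001
p0 = 0.001
D = total_selective_deaths(q0=1.0 - p0, s=0.01, h=0.5)
print(D, 2.0 * math.log(1.0 / p0))
```

For h = 0.5 the bracketed term reduces to q, and treating the recursion as continuous gives a total of about 2 ln(1/p0), which the printed values can be compared against. For h near 0 the tail of the sum decays slowly, so the generation cap rather than the tolerance may end the loop.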