# [EM] CONFIRMATION SAMPLE SIZE

Joe Weinstein jweins123 at hotmail.com
Thu Nov 20 00:11:01 PST 2003

CONFIRMATION SAMPLE SIZE (was Re: Re: Touch Screen Voting Machines)

THE  QUESTION.  In EM message 12737, Wed. 19 Nov 03, Ken Johnson asked:

“Suppose you have a two-candidate election with 10,000,000 voters, and the
computer says that candidate A beats candidate B 51% to 49%. How many
randomly-selected ballots would you need ... to confirm the election result
with 99.99% confidence?”

Ken asked the question for two different cases: one where you not only
inspect each sampled ballot but also ‘correlate’ it to a database, and the
other where you don’t correlate (because presumably you can’t or needn’t).

I will treat both cases together.  As I understand the matter (or maybe
don't), the required sample-size N should be the same in both cases;  the
cases differ only in the kind of validation demanded, to ensure that an
object in the sample may be accepted as an authentic individual of the
population being sampled.  (Here in each case the population comprises 10
million ballots.)

For a discussion and solution, we need a few definitions.  Let P be the
proportion of ballots for A, and NTOT the total number of ballots cast: here
NTOT = 10 million.  Let N be the size (to be chosen) of the random sample of
ballots, and let p be the sample’s proportion of ballots for A.

The answer N to Ken’s question depends on what, precisely, one means by ‘the
election result’ - and also (though I won't go explicitly very far into the
issue here) by  ‘confirm with 99.99% confidence’.  Namely, either of two
hypotheses (i.e. conditions) may be deemed ‘the result’ of interest here.
The hypothesis ‘P=0.51’ expresses the computer’s finding, whereas the
hypothesis ‘P>0.50’ expresses the operative result that A wins.

Remember that unless the sample is total and exhaustive, the sample data
cannot unshakeably confirm a given value or restricted range of values for
P.  Also - with very few exceptions- a reasonably small sample cannot
unshakeably reject or invalidate a value or range either.  (One of the few
exceptions is the value P=0, in case the sample does contain one or more
ballots for A.)

What a sample CAN do is give data which may show that one given value of P -
e.g. P=0.51 - is more likely (maybe much more likely) than another such
value - e.g. P=0.49 or P=0.50.  In other words, sample data can help us
decide between two hypotheses H0 and H1 (statisticians typically call these
the ‘null hypothesis’ and the ‘alternative hypothesis’), each of the form
‘P=X’.

THE QUESTION, INTERPRETED, IN PRINCIPLE AND IN PRACTICE.  I therefore
interpret the question in principle as asking each of a host of questions of
the following form, one for each value X less than or equal to 0.50.

Please bear briefly with this theoretical host.  In a short while, for the
practical solution, the host of questions will reduce to a single question,
for the case X=0.50.  Namely (and I will not show this, but it can readily
be shown), a value N big enough for that case will work for any other case.

Consider the following two hypotheses:

H0: P=0.51 and H1(X): P=X (where X is some given value at most 0.50).

H0 says that the computer is accurate.  H1(X) says that the computer is
inaccurate and worse: it is operationally wrong, in that A actually loses,
holding only a (given) minority X of the votes.

The question for given X then asks us to choose sample size N large enough
so that:

If H0 is in fact true, then with probability at least 99.99% the resulting
sample data will be more likely under H0 than under H1(X); and
If H1(X) is in fact true, then with probability at least 99.99% the
resulting sample data will be more likely under H1(X) than under H0.

So, in effect, the sample size will allow us with high (99.99%) ‘confidence’
to ‘confirm’  whichever of H0, H1(X) is true, at least vs. the other
hypothesis.

(Comment: if M is any desired multiplier >1, and if in the displayed
requirement statements you want to replace the condition ‘will be more
likely’ by the condition ‘will be at least M times more likely’, you can
still satisfy the statements, by a suitable larger choice of N.  My solution
below will not go further into this, but it could readily be modified to do
so.)
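For readers who want to check where the likelihood comparison tips over, here is a quick Python sketch of my own (not needed for the argument): with X=0.50, ‘the sample data are more likely under H0 than under H1(0.50)’ reduces to the sample proportion p exceeding a threshold that sits almost exactly at the midpoint 0.505.

```python
from math import log

# Binomial log-likelihood comparison between H0: P=0.51 and H1: P=0.50,
# for a sample of N ballots of which k are for candidate A.
# The data favor H0 exactly when
#   k*log(0.51) + (N-k)*log(0.49) > k*log(0.50) + (N-k)*log(0.50),
# i.e. when the sample proportion k/N exceeds the threshold below.
threshold = log(0.50 / 0.49) / (log(0.51 / 0.50) + log(0.50 / 0.49))
print(threshold)  # just a hair above 0.505, the midpoint of 0.50 and 0.51
```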

In other words, on the question (defined for each nonnegative X up to 0.50)
as to which of two possibilities is really more credible - the computer
being right, or A losing with P=X - sample size N will be large enough to
enable us almost surely to find out.

As stated above, we need only choose N large enough for the case X=0.50.

SOLUTION  CALCULATIONS.  Given that in fact P=X, and with N of sufficient
size, the observed proportion p of pro-A ballots, considered over all
possible size-N samples, will follow nearly a bell-curve (i.e. Gaussian
normal) distribution, with mean X and standard deviation S = SQRT
(X(1-X)/N).  Here, we use instances X=51% and X=50%.  In both instances for
practical purposes S = (1/2)*SQRT (1/N). (This is exact for X=0.50; S is
very slightly less when X=0.51 .)
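For concreteness, a small Python sketch of the two standard deviations, at a trial sample size of the order derived later in this post (the helper name is mine, for illustration):

```python
from math import sqrt

N = 152_100  # a trial sample size of the order derived later in this post

def stddev(X, N):
    """Standard deviation of the sample proportion p when the true proportion is X."""
    return sqrt(X * (1 - X) / N)

s_50 = stddev(0.50, N)   # exactly (1/2)*sqrt(1/N)
s_51 = stddev(0.51, N)   # very slightly smaller, as noted above
print(s_50, s_51)
```

The two values agree to within a couple of parts in ten thousand, which is why the single formula S = (1/2)*SQRT(1/N) serves for both instances.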

For X=0.51 the idea is to take N so large that 99.99% of the distribution of
p will be closer to 0.51 than to 0.50.  The same N will work for X=0.50:
then 99.99% of the distribution will be closer to 0.50 than to 0.51.  For
both cases at once, it suffices to ensure that N is big enough to make a
suitable multiple zS of the standard deviation S smaller than
(1/2)(0.51-0.50) = 0.005.
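This criterion can be checked numerically with the standard library alone (a sketch of mine, using the sample size derived below; the normal-CDF helper is built from the complementary error function): given P=0.51, the chance that p falls on the wrong side of the midpoint 0.505 should come out below 0.01%.

```python
from math import sqrt, erfc

def normal_cdf(x):
    """CDF of the standard normal distribution, via the complementary error function."""
    return 0.5 * erfc(-x / sqrt(2))

N = 152_100                                  # sample size derived later in this post
S = sqrt(0.51 * 0.49 / N)                    # std. dev. of p when truly P = 0.51
wrong_side = normal_cdf((0.505 - 0.51) / S)  # P(p closer to 0.50 than to 0.51)
print(wrong_side)                            # below 1e-4: confidence above 99.99%
```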

Recall S = (1/2)*SQRT(1/N).  So we need N big enough to make z*SQRT(1/N)
< 0.01, or equivalently so that:

N>(10^4)*z^2.

Here, z^2 denotes z-squared, and z is defined as that number (or any
convenient larger value) for which in a standard bell-curve distribution
(which has mean 0 and standard deviation 1) 99.99% of the distribution mass
(i.e. area under the bell curve) is less than z.

To find z, one may resort to any usual table of standard bell-curve tail
areas, or use an equivalent computation routine.  (Temporarily lacking both,
I first used an inequality readily derived from the analytic definition of
the standard bell curve: namely that the tail area under the curve beyond z
is less than - but nearly - (1/z) times the height of the curve at z.)
The result is that it suffices to take z=3.9 or any larger value, and then
from the above displayed inequality N may be taken to be any convenient
value > 152,100.
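With the standard library’s NormalDist (Python 3.8+), the exact 99.99% point, the tail-area bound, and the resulting bound on N can all be computed directly (a sketch of mine; note that the hand bound z=3.9 is slightly more conservative than the exact point):

```python
from math import exp, sqrt, pi
from statistics import NormalDist

z_exact = NormalDist().inv_cdf(0.9999)   # exact 99.99% point, about 3.72
z = 3.9                                  # the convenient larger value from the text

# The hand bound used above: the tail area beyond z is below
# (1/z) times the height of the standard bell curve at z.
phi = exp(-z * z / 2) / sqrt(2 * pi)     # standard normal density at z = 3.9
print(phi / z)                           # below 1e-4, so z = 3.9 indeed suffices

print(1e4 * z ** 2)                      # the resulting bound: N > 152,100
```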

Actually N may be taken slightly smaller, because in the above formula for S
we omitted the factor SQRT(1-f), where f is the finite-population ‘sampling
fraction’ N/NTOT.  Here, to a first cut, f is about 150,000 / 10 million, or
0.015, so that a refined calculation allows us to take N approximately
(1-0.015)*152,100, i.e. just under 150,000.
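A sketch of this refinement (illustrative; a single correction pass, as above):

```python
NTOT = 10_000_000
N0 = 152_100               # first-cut sample size, ignoring the correction
f = N0 / NTOT              # sampling fraction, about 0.015

# S carries an extra factor sqrt(1-f), so the bound N > (10^4)*z^2
# shrinks by the factor (1-f).
N_refined = (1 - f) * N0
print(round(N_refined))    # just under 150,000
```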

PROVOCATIVE COMMENT.  This sample size N=150,000 in effect will with high
confidence confirm any majority margin of 51-49 or better, regardless of
whether that margin has been ‘counted’ by a computer or ‘predicted’ by a
prior poll or simply happens quietly to be the voters' mass preference.

One may argue that if the true margin is closer than 51-49, there is no
effective ‘mandate’ anyhow from the total electorate.  Hence, why not simply
use this sample size N=150,000, randomly choose N electors, and go with
their decision?  At least you then have only 150,000 ballots to authenticate
and count, not millions.  Also, mass ‘campaigning’ could at one and the same
time be both far cheaper and yet more directly and meaningfully involve the
electors.

More in this vein another time.  I still plan in a later posting to show
that in fact average voter power, counting  both sampled and unsampled
voters,  is typically enhanced, not reduced, by replacing a mass electorate
by a much smaller random sample.

Joe Weinstein
Long Beach CA USA
