[EM] 0-info approval voting, repeated polling, and adjusting priors

Tue Aug 2 12:47:00 PDT 2005

Dear folks!

Forest's suggestion to perform an approval poll and use it to determine
a default lottery for use in the actual election made me think about
what happens in approval polls in the first place, especially when there
are repeated polls.

In the following thoughts, I assume that (i) there is a sequence of
approval polls for some fixed set of voters and candidates, (ii) voters
answer the poll using 0-info approval strategy, and (iii) after each
poll all voters adjust their priors in some way to the poll's result.
The main question is, of course, whether such a process will eventually
converge to some kind of equilibrium.

Recall that 0-info approval strategy means the following. Each voter  i
 assigns to each candidate  x  some cardinal utility  u(x,i).  For the
poll at time  t,  each voter  i  assigns to each candidate  x  some
prior winning probability  p(x,i,t),  and then  i  approves of  x  if
and only if  u(x,i)  is larger than the expected utility  E(i,t)  (= the
sum of  p(y,i,t)*u(y,i)  over all candidates  y).

The crucial point is how voters adjust their priors over time. I will
study two different natural scenarios here, one of which results in
guaranteed convergence, while the other may lead to cyclic behaviour.

I.
Let us first assume that voters only use the information about the
winner and winning approval of the last poll to adjust their priors. In
other words, they ignore all information about the success of the
non-winning candidates relative to each other. This means they
essentially adjust the winner's prior probability and leave the
conditional prior distribution of the rest as is. Formally, this means
that the adjusted priors before (t) and after (t+1) the poll fulfil the
equations

   p(x,i,t+1)/p(y,i,t+1) = p(x,i,t)/p(y,i,t)

for all candidates  x,y  which both differ from  w(t),  the winner at
time  t.  (In case there is an approval tie  t,  let us assume that the
tie is broken using the previous poll's data.)

If this is the case, then no matter what the precise adjustment
technique of the different voters is, the winner can change only a
finite number of times and will eventually become constant!

To see this, note that the expected utility after the poll is a convex
combination of the expected utility before the poll and the utility of
the winner at time  t:

  E(i,t+1) = lambda*E(i,t) + (1-lambda)*u(w(t),i)

with lambda>0.
This implies that  E(i,t+1)<u(w(t),i)  if and only if  E(i,t)<u(w(t),i).
 Therefore, a voter  i  approves of  w(t)  at time  t+1  if and only if
she approved of  w(t)  at time  t.  In other words, the approval score
of the winner  w(t)  in the next poll at time  t+1  is the same as at
time  t.  Hence whenever the winner changes, the new winner has a larger
approval score than the old one. Obviously, this can happen only a
finite number of times, thus the winner is eventually constant and no
infinite cycles are possible.

II.
Let us now assume that all voters instead use the following adjustment
procedure:

  p(x,i,t+1) = (1-alpha)*p(x,i,t) + alpha*a(x,t)/s(t)

where  alpha  is some constant,  a(x,t)  is  x's  approval score at time
 t  and  s(t)  is the total approval score at time  t.  That is, the
priors are moved a fixed amount towards the relative approval scores of
the last poll.

Although this seems natural, too, it need not converge but can easily
produce cycles. For example, assume that  alpha=2/3  and that there are
3 voters  i,j,k  and 3 candidates  x,y,z  with utilities

         x   y   z
     i  16   7   0
     j   0  16   7
     k   7   0  16

and common initial priors

  p(x,1)=11/26,  p(y,1)=8/26,  p(z,1)=7/26.

Then for the first poll, the expected utilities are

  E(i,1) = (11*16+8*7)/26 = 232/26 > 7,
  E(j,1) =  (8*16+7*7)/26 = 177/26 < 7,
  E(k,1) = (11*7+7*16)/26 = 189/26 > 7.

This results in the 0-info strategy approval

         x   y   z
     i   +   -   -
     j   -   +   +
     k   -   -   +
  score  1   1   2  (sum 4)

That is,  z  wins the first poll, giving rise to the adjusted priors

  p(x,2) = 1/3*11/26 + 2/3*1/4 =  8/26,
  p(y,2) =  1/3*8/26 + 2/3*1/4 =  7/26,
  p(z,2) =  1/3*7/26 + 2/3*1/2 = 11/26.

This is a cyclic permutation of the original priors, hence is is easy to
see that the subsequent winners are  y,x,z,y,x,z,y,x, and ever so on
without converging to a constant winner.

What if we assume a slower adjustment, putting  alpha=1/2,  for example?
Then the adjusted priors are

  p(x,2) = 1/2*11/26 + 1/2*1/4 = 35/104,
  p(y,2) =  1/2*8/26 + 1/2*1/4 = 29/104,
  p(z,2) =  1/2*7/26 + 1/2*1/2 = 40/104.

Then for the second poll, the expected utilities are now all larger than 7:

  E(i,2) = (35*16+29*7)/104 = 7.336... > 7,
  E(j,2) = (29*16+40*7)/104 = 7.153... > 7,
  E(k,2) = (35*7+40*16)/104 = 8.509... > 7.

This results in the 0-info strategy approval

         x   y   z
     i   +   -   -
     j   -   +   -
     k   -   -   +
  score  1   1   1  (sum 3)

>From this point on, the adjusted priors get closer and closer to 1/3 and
the polls only repeat the above result. This suggests the conjecture
that  alpha  can always be chosen small enough to ensure convergence.

Any other ideas how one might adjust one's 0-info priors using poll data?

Jobst