[EM] Re: Multi-Winner Approval Strategy

Tue Feb 3 17:20:06 PST 2004

For 0-info, I am almost convinced that "Vote for the above-mean
candidates" is the strategy to use.  I'll start from the beginning.  (Note 
that what I'll be doing won't be exactly how Weber presents his 
calculations inpublications.)

In order to work out what a voter should put on an Approval ballot, Weber 
used a technique where he pitted each candidate head-to-head with the 
other candidates.  In order to work out the result of each head-to-head, 
he used the following equation, which works out the "Weber score" (my 
terminology, not his) of how one candidate does against another.

Wij = Pi * Pj * (Ui - Uj)

Wij is Candidate i's Weber score against candidate j
Pi is the probability/chance of candidate Pi winning the Approval vote
Pj is the probability/chance of candidate Pj winning the Approval vote
Ui is the voter's utility of candidate i
Uj is the voter's utility of candidate j

Since this is a 0-info race, the field is wide open.  Nobody knows who is 
going to win.  This means that the probabilities (Pi and Pj) are the same 
for all candidates.  As no division or multiplication will be done with 
this formula, the Pi * Pj can be "divided away" from all the 
head-to-heads.  This leaves behind:

Wij = Ui - Uj

Next I'll bring in six candidates (A to F) who I'll mark out of 10.  10 is 
given to the best candidate, 0 to the worst.

A = 10
B = 5
C = 4
D = 3
E = 2
F = 0

OK.  So those are the utilities of the candidates sorted it out.

Now I can pit each candidate against each other in order to work out the 
Weber scores.  This is shown below in the matrix below (which needs a 
fixed width font to see properly):

    i = A        i = B       i = C       i = D       i = E        i = F
A[ 0 = 10-10] [-5 = 5-10] [-4 = 4-10] [-3 = 3-10] [-2 = 2-10] [-10 = 0-10]
B[ 5 = 10- 5] [ 0 = 5- 5] [-1 = 4- 5] [-2 = 3- 5] [-3 = 2- 5] [ -5 = 0- 5]
C[ 6 = 10- 4] [ 1 = 5- 4] [ 0 = 4- 4] [-1 = 3- 4] [-2 = 2- 4] [ -4 = 0- 4]
D[ 7 = 10- 3] [ 2 = 5- 3] [ 1 = 4- 3] [ 0 = 3- 3] [-1 = 2- 3] [ -3 = 0- 3]
E[ 8 = 10- 2] [ 3 = 5- 2] [ 2 = 4- 2] [ 1 = 3- 2] [ 0 = 2- 2] [ -2 = 0- 2]
F[10 = 10- 0] [ 5 = 5- 0] [ 4 = 4- 0] [ 3 = 3- 0] [ 2 = 2- 0] [  0 = 0- 0]

The above is basically a pairwise matrix.  Each element in the matrix 
contains the equation Wij = Ui - Uj.

(I suppose the Weber equation and pairwise matrix could be used to analyze 
Condorcet stuff, like the Condorcet Criteria.  I suppose an IRV Weber 
equation could even be used to analyze IRV.  But I expect using Weber 
would be fiddly and I get the feeling that the structure produced won't be 
a matrix.  Anyway, back to Approval...)

Weber then adds up each column to find out how good each candidate is in 
comparison with one another.  This results in the following Weber vector 
(again my terminology, not Weber's)...

     A            B           C           D            E           F
[36 = 60-24] [ 6 =30-24] [ 0 =24-24] [-6 =18-24] [-12 = 12-24] [-24 =0-24]

Weber represents an Approval ballot using the following "vector":

A    B    C    D    E    F
[vA] [vB] [vC] [vD] [vE] [vF]

Each element of the vector is either a 1 (a Yes vote) or 0 (No vote).  
Weber then merges the above two vectors together by multiplying together 
the columns of each vector.  This results in the following...

A         B         C         D         E          F
[36 * vA] [ 6 * vB] [ 0 * vC] [-6 * vD] [-12 * vF] [-24 * vF]

Values vA to vF are then selected so that when all of the elements of the 
merged vector are added up, the total is as high as possible (i.e. is a 
maximum).

As positive values (i.e. values greater than 0) have been given to 
candidates A and B in the Weber vector, vA and vB should be 1 in order for 
the values to contribute as much as each value can to the total.

As negative values (i.e. values less than 0) have been given to candidates 
E and F in the Weber vector, vE and vF should be 0 in order for the 
negative values to contribute nothing to the total.

Candidate C is a special case.  Candidate C's value is 0 in the Weber 
vector.  This means that any number multiplied by this value 0 is going to 
be 0.  Therefore, vC can be either 0 or 1.  It does not matter.

Therefore, the best ballot "vectors" are...

A   B   C   D   E   F
[1] [1] [0] [0] [0] [0]

...or...

A   B   C   D   E   F
[1] [1] [1] [0] [0] [0]

And these define how it is best to vote in the 0-info case on an Approval 
ballot.

The reason why candidate C is a special because C's utility is exactly the 
same as the average utility of the candidates is (10 + 5 + 4 + 3 + 2 + 0) 
/ 6 = 4.  Candidates A and B have above average utilities, and the best 
ballot vectors show that I should vote for them.  Candidates D, E and F 
have below average utilities, and the best ballot vectors show that I 
should not vote for them.

Notice that all the calculations that have been done only use utilities. 
 Utilities remain the same for all the candidates, regardless of whether 
top 2, 3, 4, etc... are elected because they are the "marks out of 10" I 
personally gave to the candidates.

Also note that in the Approval ballot, voting is for each candidate 
individually.  It is not for a slate of candidates.  For each candidate, 
Weber attempts to make the best independent use of its score total.

>From all of this, it seems to me that regardless of whether the top 2, 3, 
4, etc... are elected, in the 0-info Approval case, it is best to vote for 
the candidates that are above average.

Initially, I was not satisfied enough with this and went on to imagine 
comparing all the combinations of candidates in a slate using Weber's 
equation instead of candidates, similar to what Mike Ossipoff did.  I more 
or less came to the same conclusion as I did above except with a more 
vague argument.

Say there is an Approval vote where to top 3 are elected.  I'll take two 
slates of candidates AEF and BEF, and try to put them head-to-head.  In 
order to do this, the sum of the utilities of the candidates AEF are added 
up and put head-to-head against the utilities of candidates BEF, which 
were also added up.

(While reading about PAV, I noticed that Joe Weinstein posted a message in 
Jan/Feb 2001(?) saying that for a slate of candidates, the utility of a 
slate is not the same of the utilities of candidates in the slate all 
added up.  However, in order to keep things simple, I'll assume this is 
not the case.)

Notice that both slates are the same apart from the fact that one slate 
has candidate A while the other has candidate B.  Therefore, when applying 
Weber's equation to the slates, the candidates ELF on the left cancels out 
the EF on the right.  All the equation would do is then A - B, which is 
the same as just comparing two individual candidates rather than two 
slates.

At this point, I was quite satisfied that from this I could conclude that 
it is best to vote for the candidates that are above average in 0-info 
Approval.  Though, with hindsight, I think it is quite a flimsy argument.

However, I could go on and compare different slates of candidates that 
involves the candidate I want to find out how I should vote for:

AEF - BCD
:
:
ACD - BEF

Notice above the EF and the CD.  As I am adding/subtracting when I turn 
the above into a one line ballot vector, the EFs and CDs cancel each other 
out.  And I can go on.

:
:
ADE - BCF
:
:
ACF - BDE
:
:

Here, the same thing has happened.  However, this time it is with CF and 
DE.  Admittedly, I don't think this is as good as the very first argument 
I made.

Unless someone can come up with something "cunning", I think full-info 
Approval strategy is nasty to work out.  I initially thought of finding 
the probability of candidates coming 1st and from this get the probability 
of coming 2nd, then 3rd, then 4th etc...

But I then thought about Weber's equation.

Wij = Pi * Pj * (Ui - Uj)

Pi is the probability that candidate i wins.  Pj is the probability that 
candidate j wins.  Therefore, Pi * Pj is the probability that candidates i 
and j both win (i.e. there is a tie).

Why not push this a bit further?  As Mike Ossipoff mentioned, the 
probability of a slate of candidates winning needs to be calculated.  
Again, I'll take the candidates we had before and say that there is an 
Approval vote where the top 3 are elected.

I'll compare two slates head-to-head, candidates acf and candidates bde.  
To work out the Weber score by using Weber's equation, I'll use Pi as the 
probability that acf win and Pj as the probability that bde wins.  These 
are easy to work out.

Pi = Pa * Pc * Pf
Pj = Pb * Pd * Pe

The utilities are also easy to work out by adding up the utilities of the 
candidates in each slate.  Now these values can be plugged into Weber's 
equation to get a Weber score for this head-to-head.

That was the easy bit.  The hard bit is that I THINK this needs to be done 
for every combination of slates.  In this case it is 120 times, if I've 
got that correct.

Say the top 200 of an Approval vote will be used to fill in 200 seats.  
The number of combinations here is extremely massive.  Also, there would 
at least be literally hundreds of candidate probabilities to use.

Referring to posts a while back when there was a discussion about having 
very simple voting methods so that hand counting would be easy for a 
multi-seat election.  Though just plain top-N Approval seems susceptible 
to a very large faction of voters voting for their own block of 
candidates, causing them to be virtually guaranteed to be elected, I don't 
think it would do too badly.

What I picture in my head is the (stereotypical) left-right wing spectrum. 
 In one situation, there would be just a single normal distribution 
histogram in the centre.  When a seat is allocated, a square of the 
histogram randomly flashes to represent where in the spectrum the 
candidate stands.  As it is a normal distribution histogram, the flashes 
tend to be concentrated in the middle of the spectrum.

This could be an advantage.  As most of the candidates tend to be near the 
centre, therefore there will be less disagreements.  Therefore legislation 
could be passed with less argument and therefore more speed.  To 
compensate for the "narrowness" of thought of the people elected, 
elections could held more often.

On the other hand, there may be a country where two normal distribution 
curves would be best in representing who is elected.  One curve is on the 
left of the spectrum while the other is on the right.

I must admit, I don't know how good these mental pictures are.

Thanks,
Gervase.