[EM] Condorcet Flavored PR Methods

Thu Nov 7 14:27:19 PST 2002

Adam, thanks for your interest and comments.  I'll try to answer your
questions below.

On Thu, 7 Nov 2002, Adam Tarr wrote:

> Forest, I finally got around to reading this series of posts.  It's very
> interesting stuff and you've obviously made a lot of progress on this.  A
> few comments:
>
> - I'd imagine you're aware of this, but this approach passes the "sanity
> check" of reducing to a regular pairwise matrix when the size of the
> "circles" is only one candidate.  If A>B on a ballot, then kA = .5, kB = 0,
> jA = 0, jB = .5, kA+jB=1, kB+jA=0, so 1/1=1 votes for A, and 1/0 -> 0 votes
> for B.  So, only one method need be defined for single and multi-winner,
> just as for IRV->STV or Approval->PAV.
>

Yes, I considered the "true generalization" feature to be essential.

> - The one thing that bugged me about this approach was the decision to use
> the medians of the symmetric differences as the "exemplar" of each
> set.  (By exemplar, I mean the two candidates that produce the "M" and "m"
> candidates.)  This seemed sort of arbitrary, even if it also seems
> reasonable.  You note that it passes the "sanity check" of matching PAV when
> all of A is preferred to all of B, but this would also be true if you just
> picked the best candidate in A-B and B-A as the exemplars.  Obviously, you also
> get the same results in your second example where only one candidate
> differs in each set.
>

I experimented with various possibilities, and so much water has gone
under the bridge since then I don't remember them all.  One was to just
use the means of sets A and B instead of A-B and B-A.  This would still
preserve symmetry and pass the "sanity checks," but in the case of large
overlap of A and B would water down the proportionality.

> But on the other hand, you do get different results in your
> A1>B1>B2>B3>B4>A2>A3>A4 example; choosing the exemplar as the best
> candidate in A-B and B-A makes you give 1/4 more votes to the A set, rather
> than 1/2 + 1/3 + 1/4 more votes to the B set.  Seems like your approach is
> a clear winner here.
>
> Was this choice of exemplar the only one that gave you the reverse-symmetry
> property?  It seems to be that way but I'm not sure.  If that's the case
> then I'm sold on the merits of this choice.
>
> - There may be one small numerical issue to tackle with the medians,
> though.  You say, "Let m1 and m2 be the respective medians of the sets
> (A-B) and (B-A)."  Later you say, "When M or m falls right on a member of A
> or B (rather than in the space between two candidates), then such a member
> adds 1/2 to the count..."  But what about this case:
>
> A1>B1>B2>B3>A2>B4>A3>A4
>
> mB is between B2 and B3.  But mA is between A2 and A3... which falls on B4.
>
> A1>B1>B2>M>B3>A2>B4=m>A3>A4
>
> kA = 1
> kB = 2
> jA = 2
> jB = 1/2
>
> kB + jA = 4
> kA + jB = 1.5
>
> Essentially, since one median falls between candidates, and one falls
> exactly on a candidate, all the 1/2's don't add up in a tidy manner.  So
> the number of votes to cast for the A slate is the sum of 1/n for n ranging
> from one to 1.5, which doesn't really make sense.  Now, this can be worked
> around by using logarithms, and we know that this should be a number around
> 1.3 or so.  Is this the right way to approach this, or am I missing something?
>
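
As a cross-check on the counts quoted above, here is a minimal Python sketch
that reproduces them.  It rests on assumptions read off the example rather
than on stated definitions: candidates are numbered 1, 2, ... from the top of
the ballot, each median is the ordinary median of those positions (so it can
land between two candidates or on a candidate of the other set), M is the
higher-ranked of the two medians and m the lower, k counts members above M,
j counts members below m, and a member sitting exactly on a median adds 1/2.
The name median_counts is made up for this illustration.

```python
from statistics import median


def median_counts(ranking, set_a, set_b):
    """Return (kA, kB, jA, jB) for one ballot listed best-first.

    Assumes both A-B and B-A are non-empty and every member is ranked.
    """
    # Position 1 is the top of the ballot (an assumed convention).
    pos = {cand: i for i, cand in enumerate(ranking, start=1)}
    only_a = [c for c in ranking if c in set_a and c not in set_b]  # A-B
    only_b = [c for c in ranking if c in set_b and c not in set_a]  # B-A

    med_a = median(pos[c] for c in only_a)
    med_b = median(pos[c] for c in only_b)
    upper = min(med_a, med_b)  # "M": the median nearer the top (assumed)
    lower = max(med_a, med_b)  # "m": the median nearer the bottom (assumed)

    def above(cands):
        # 1 for each member above M, 1/2 for a member exactly on M.
        return sum(1.0 if pos[c] < upper else 0.5 if pos[c] == upper else 0.0
                   for c in cands)

    def below(cands):
        # 1 for each member below m, 1/2 for a member exactly on m.
        return sum(1.0 if pos[c] > lower else 0.5 if pos[c] == lower else 0.0
                   for c in cands)

    return above(only_a), above(only_b), below(only_a), below(only_b)


ballot = ["A1", "B1", "B2", "B3", "A2", "B4", "A3", "A4"]
slate_a = {"A1", "A2", "A3", "A4"}
slate_b = {"B1", "B2", "B3", "B4"}
kA, kB, jA, jB = median_counts(ballot, slate_a, slate_b)
print(kA, kB, jA, jB)
print(kB + jA, kA + jB)
```

Under these assumptions the sketch gives kA = 1, kB = 2, jA = 2, jB = 1/2 for
the ballot above, and kA = jB = 1/2, kB = jA = 0 for the one-candidate case
A>B.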

You didn't miss anything. I didn't want to get too complicated in that
posting.  In a previous posting, a few weeks earlier when I was still
groping around for the best M and m, I suggested using logarithms to
approximate sums of the form  1+...+1/n ,  but only because that is
simpler than a more precise formula:  For each whole number n,

                    The sum from k=1 to n of 1/k

                       is precisely equal to

             The integral from x=0 to 1 of (1-x^n)/(1-x) dx

So this integral is the natural way to extend the sum to half-integral
values of n.

For example, when n=1.5 the formula yields the precise value

                   (8/3) - ln(4)

which is (to ten significant figures)  1.280372306.
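
The integral can be checked numerically.  The following is only a rough
sketch, not part of the method: it uses a plain midpoint rule, and the
function name harmonic_integral and the step count are arbitrary choices
made for this illustration.

```python
import math


def harmonic_integral(n, steps=200_000):
    """Midpoint-rule estimate of the integral of (1 - x**n)/(1 - x) on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoints never hit x = 1, so no division by zero
        total += (1.0 - x ** n) / (1.0 - x)
    return total * h


# Whole n: should be close to the ordinary sum 1 + 1/2 + 1/3 + 1/4.
print(harmonic_integral(4), 1 + 1 / 2 + 1 / 3 + 1 / 4)

# Half-integral n: should be close to 8/3 - ln(4).
print(harmonic_integral(1.5), 8 / 3 - math.log(4))
```

Both printed pairs should agree to several decimal places, which is all the
midpoint rule is being asked to show here.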

This problem won't happen when the set A-B (hence also B-A) has an odd
number of members, since then both M and m will fall exactly on candidate
positions.

Another way to get around the problem is to double the values before
summing, so that both become whole numbers.  This won't affect the margin
too much, but it does water down the "PR-ness" of the method slightly when
the number of candidates is small, which (unfortunately) is precisely the
case that is computationally feasible.

In your example, doubling gives  2*4=8 and 2*1.5=3,  so the new margin is

                1/4 + 1/5 + 1/6 + 1/7 + 1/8 = 743/840 ,

                    or approximately  .88

compared with the result without doubling the numbers

         1 + 1/2 + 1/3 + 1/4  -  (8/3 - ln(4)) =  ln(4) - 7/12 ,

                    or approximately  .80
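
Both margins can be reproduced with a few lines of Python.  This is a sketch
under one assumption: the helper harmonic (a name invented here) handles only
whole and half-integral arguments, using the value 2 - ln(4) at n = 1/2 that
the integral formula gives, and then adding 1/x for each further step.

```python
import math
from fractions import Fraction


def harmonic(n):
    """1 + 1/2 + ... + 1/n, extended to half-integral n via the integral formula."""
    twice = round(2 * n)
    if twice % 2 == 0:
        # Whole number: the ordinary finite sum.
        return sum(1.0 / k for k in range(1, twice // 2 + 1))
    # Half-integer: start from H(1/2) = 2 - ln(4), then add 1/(k + 1/2).
    m = (twice - 1) // 2
    return 2.0 - math.log(4.0) + sum(2.0 / (2 * k + 1) for k in range(1, m + 1))


margin_plain = harmonic(4) - harmonic(1.5)   # without doubling
margin_doubled = harmonic(8) - harmonic(3)   # after doubling 4 and 1.5

# Exact value of the doubled margin, 1/4 + 1/5 + 1/6 + 1/7 + 1/8.
exact_doubled = sum(Fraction(1, k) for k in range(4, 9))

print(margin_plain, math.log(4) - 7 / 12)
print(margin_doubled, exact_doubled)
```

The first line should show two matching values near .80, and the second a
value near .88 next to the fraction 743/840.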

If we were using simple logarithms the margins with and without doubling
would be exactly equal:

           log(2*x) - log(2*y) =  log(x) - log(y)

But the appropriate log approximation is more like

                        1 + ... + 1/n

          is approximately equal, up to an additive constant, to

                         log(1+2*n)
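
To see how this logarithmic approximation behaves on the example, here is a
small Python sketch.  It assumes the natural log and the standard refinement
1 + ... + 1/n ~ ln(n + 1/2) + gamma, where gamma is Euler's constant; that
is the same as ln(1 + 2n) shifted by the constant gamma - ln(2), which
cancels in any margin.  The names harmonic_approx and approx_margin are
invented for this illustration.

```python
import math

GAMMA = 0.5772156649015329  # Euler's constant


def harmonic_approx(n):
    """Approximate 1 + ... + 1/n by ln(1 + 2n) plus the constant gamma - ln(2)."""
    return math.log(1 + 2 * n) + GAMMA - math.log(2)


def approx_margin(big, small):
    """Approximate margin; the additive constant cancels in the difference."""
    return harmonic_approx(big) - harmonic_approx(small)


plain = approx_margin(4, 1.5)    # equals ln(9/4)
doubled = approx_margin(8, 3)    # equals ln(17/7)

print(plain, math.log(4) - 7 / 12)   # approximation vs. exact, no doubling
print(doubled, 743 / 840)            # approximation vs. exact, doubled
```

Unlike a plain log(x), this form does change when the arguments are doubled,
and on this example it tracks both exact margins to within about .01.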

Well, enough said on that.

Forest
