[EM] Three Category Symmetric MJ

Forest Simmons forest.simmons21 at gmail.com
Wed Apr 27 18:10:38 PDT 2022


A brief summary: The three judgment categories are Good, Middle, and Bad,
meaning Satisfactory to Excellent, Mediocre, and Unsuitable (for whatever
reasons, including inability to elicit an opinion from the voter, so
blank=Bad).

Candidates marked Good on more than half of the ballots are judged to be in
the Good category. Candidates marked Bad on more than half of the ballots
are judged to be in the Bad category. All other candidates, including those
marked Mediocre on more than half of the ballots, are judged to be in the
Middle category.

It is desirable for election purposes to establish a finish order that
respects these judgment categories ... with all candidates judged Good,
ahead of the the other candidates, and all candidates judged Bad behind the
other candidates.

Within the Good category, candidate k finishes ahead of candidate j if the
number of ballots g(k) on which k is marked Good exceeds the corresponding
number g(j) for candidate j. If this rule needs further resolution because
j and k are marked good on the same number of ballots, then k is ahead of j
in the finish orde if k is marked Bad on fewer ballots than j is so marked.

Similarly, within the Bad category, k finishes ahead of j if k is marked
Bad on more ballots than k is, and when j and k are marked Bad on the same
number of ballots k is ahead of j if k is marked Good on more ballots than
j is.

So far this is all clear and logical. But how to establish a logical,
consistent finish order within the Middle category is not so obvious ...
until we have a good graphical representation of the three categories and
how they fit together.

How they fit together is important because (among other reasons) a small
change in the number of ballots can easily bump a borderline candidate from
one category into another. If k goes from (barely) Good to Middle, it
should enter the Middle category at the upper end of the finish order
within the Middle category. Otherwise, the method has no hope of passing
any reasonable form of the Participation criterion.

The graphical representation will make this continuity requirement clear.
One definition of "topology" is the science of continuity. Without respect
for the relevant topology, it isn't possible to have the continuity needed
for the Participation compliance alluded to above.

So here's the picture: Let N be the total number of ballots submitted. For
each candidate k, let g(k), b(k), and m(k) be the number of ballots on
which k is marked Good, Bad, or Mediocre. For graphical purposes let the
three dimensional Cartesian coordinates x,  y, and z be defined as
x=b(k)/N, y=g(k)/N, and z=m(k)/N, so that (no matter the candidate k ...
hence suppression of k in the notation) we have the constraint equation
x+y+z=1.

The graph of this equation is the convex hull of the three points (1,0,0),
(0,1,0), and (0, 0, 1), which are the vertices of an equilateral triangle.

It is convenient to project this graph vertically onto the x,y plane, the
planar region R given by the planar inequality x+y<=1. To understand this
picture rewrite the constraint equation as x+y=1-z, which reveals z in the
role of "slack variable" for the planar inequality.

The intersection of R with the inequality y>50%, is a right triangle
representing the set of candidates in the Good category:
{k | g(k)>50% of N}. The set of candidates tied for y=constant, is a
horizontal line segment in the Good triangle.  In other words, the "level
curves" or "contour lines" of the equations y=c, for .5<c<1-x fill up the
Good candidate region with horizontal level curves.

Similarly the vertical segments given by x=c for .5<x<1-y fill up the bad
region.

To fill up the Middle region we need a family of line segments that
smoothly transition from the vertical segment V forming the leftmost
boundary of the Bad region to the horizontal segment H at the lower
boundary of the Good region.

The only way to accomplish this is to rotate the segment V  clockwise 90
degrees about the point (.5, .5).

At time t let V(t) make a counter clockwise angle with the diagonal y=x of
t degrees, t goes from -45 to +45 degrees. Let (x, y) be a point on the
segment V(t). Then the angle t between the diagonal y=x and the segment
connecting V(t) is 45 degrees minus the angle theta of V(t) with the
negative X axis. We have tan(theta) = slope of V(t), which equals (.5
-y)/(.5 -x) since both (x,y) and (.5, .5) are on V(t).

So t = 45 deg - arctan((.5 -y)/(.5 -x))

Taking the tangent of both sides of this equation, and making use of the
formula for the tangent of a difference, we get

tan(t) as (tan 45deg - tan theta) divided by (1 +tan 45deg * tan theta).

Since tan 45 deg equals 1, and tan theta equals the slope  (.5 -y)/(.5-x),
we get

tan(t)= (1 - slope)/(1+slope)

Substituting slope = (.5 -y)/(.5-x), and simplifying algebraically, we
get...

tan(t)=(y-x)/[2( 1-x-y)].

Recognizing (1 - x -y) as the slack variable z, we see that tan(t) is just
(y-x)/(2z). In terms of g, b, and m,
tan(t)= .5 (g(k)-b(k)/m(k) ....
which shows where the mysterious formula for the finish order within the
Middle category came from in my previous message.

In particular, tan 45deg = 1, which is the same as (.5 -x)/(2(1-x-.5))
given by the formula, and tan(-45 deg) is -1, which is the same as (y-x)/(2z),
for any point on V=V(0), where x=.5, and z=1-.5 -y, for any y between 0
and. 5.

It all checks out and fits together at the boundaries of the three regions.

That's it for now!

-Forest







El mar., 26 de abr. de 2022 7:20 p. m., Forest Simmons <
forest.simmons21 at gmail.com> escribió:

> Good Work!
>
> I have come to the same conclusion about MJ being much closer to IIA than
> Range.
>
> So I've been trying to improve MJ to make it more symmetrical, satisfy
> some kind of participation, improve tie breaking all around, etc.
>
> I have a good 3-slot version that is as decisive as possible for a reverse
> symmetry method.
>
> I am reminded of our flurry of 3-slot methods from twenty years ago. Back
> then the EM list was very sure that the best proposals were Condorcet and
> Approval, but not at all sure which would be most viable. We thought a
> majoritarian 3-slot method would be plenty simple, easy to count, and have
> room to distinguish Roosevelt, Stalin, and Hitler. Three slot Bucklin was a
> popular suggestion with various tie breaking rules, but no version was
> symmetrical, or any better than (the still unheard of) MJ with three
> judgment categories.
>
> As I have recently come to understand, our lack of design success (i.e.
> inability to get reverse symmetry into three slot Bucklin) was due to
> ignorance of the underlying topology.
>
> To make a long story short, I will finish this message by cutting straight
> to the chase, saving the full explanations for tomorrow:
>
> Suppose the three categories are Good, Middle, and Bad. Good means a
> mixture of desirable qualities from competent to excellent. Bad means
> incompetent or otherwise unsuitable. Middle includes some good qualities
> but not unalloyed with baser metals ...  which is different from "no
> opinion" the same way that "average" is different from "no basis for a
> grade." To avoid "dark horse" problems, we will count blank as Bad. (I'm
> sure some philosopher or lawyer could explain why a candidate able to
> generate neither appreciable support nor opposition, would be unsuitable.)
>
> For each candidate k, let g(k), m(k), & b(k) be the number of ballots on
> which candidate k is categorized as Good, Middle, or Bad, respectively.
>
> If some candidate is judged to be Good on more than half of the ballots,
> then among the candidates tied for the greatest g(k) value, elect the one
> categorized as Bad on the fewest ballots.
>
> In other words (and symbols) ....
>
> elect argmin{b(j)| j is in argmax[g(k)]}.
>
> If (in the other extreme) every candidate is judged to be Bad on more than
> half of the ballots,  then from among the candidates tied for the least
> b(k) value, elect the one categorized as Good on the most ballots. In
> symbols ...
>
> elect argmax{g(j)| j is in argmin[b(k)]}.
>
> If neither of the two above cases obtains, then elect from among the
> candidates in the set argmax[(g(k)-b(k))/m(k)] the candidate j categorized
> as Bad on the fewest ballots or the one categorized as Good on the most
> ballots, depending on whether or not g(j) is greater than b(j). In symbols
> ... elect
>
> argmax{T(j)| candidate  j is among the tied candidates, i.e. j is a member
> of argmax[(g(k)-b(k))/m(k)]},
>
> where the tie breaker function T to be maximized is given by ...
>
> T(j) = min(b(j),g(j))sign(b(j)-g(j))
>
> [Here we have made use of the fact that b(j) is minimized when its
> opposite -b(j) is maximized.]
>
> In my next message I will unfold all of these mysteries into plain view!
>
> At least now you have the complete 3-slot reverse symmetry compliant MJ
> recipe safely in the EM cloud.
>
> -Forest
>
>
>
>
>
> El mar., 26 de abr. de 2022 2:13 a. m., Kristofer Munsterhjelm <
> km_elmet at t-online.de> escribió:
>
>> On 26.04.2022 00:51, Forest Simmons wrote:
>> >
>> >
>> > El lun., 25 de abr. de 2022 4:25 a. m., Kristofer Munsterhjelm
>> > <km_elmet at t-online.de <mailto:km_elmet at t-online.de>> escribió:
>> >>
>> >>     So the M-W strategy is: let
>> >>             v_i be the strategic rating we want to find
>> >>             u_i be the public utility of candidate i
>> >>             p_ij be the voter's perceived probability that i and j will
>> >>     be tied.
>> >>
>> >
>> > I could be wrong but I think it should be "tied for winning."
>>
>> You're right. I was looking at the paper just now, and it says:
>>
>> "For each pair of candidates i and j, the /pivot probability/ p is the
>> probability (perceived by a voter) of the event that candidates i and j
>> will be tied for first place in the election."
>>
>> I imagine you could refine it a little by letting the p.p. be
>> parameterized by the vote to submit. E.g. if it's Range voting and i's
>> score minus j's score is 1, then you could flip the win from i to j by
>> voting j 10 and i 0. But this would complicate the strategy a lot at
>> (probably) only very slight benefit.
>>
>> > It is interesting that this strategy can actually result in non-solid
>> > approval coalitions on ballots ... sometimes it requires you to approve
>> > X while leaving unapproved some candidate Y rated above X on the same
>> > ballot ... i.e. insincere strategy.
>> >
>> > Furthermore, if estimates of both the utilities u_i and u_j, as well as
>> > of the probabilities p_ij in question were known with a high degree of
>> > precision, you might get away with those insincere gaps in the approval
>> > order.
>> >
>> > These facts reflect the fragility (anti-robustness) of the winning tie
>> > probability based strategy.
>>
>> Yes, I think Warren observed something similar: under imperfect
>> information, the optimal Range/Approval strategy might have you
>> approving of X and not Y even though you rank Y ahead of X. Under
>> perfect information, there's always some kind of cutoff where you
>> approve everybody above it and don't everybody below it.
>>
>> > Nevertheless, your result is highly relevant because it shows that on a
>> > fundamental level there is a meaningful, experimental way of defining
>> > individual utilities that are just as good as the theoretical utilities
>> > invoked as a basis for Approval strategy.
>>
>> I keep harping on the problem of Range and Approval to fail "de facto
>> IIA" despite passing it de jure, and I suspect it's related to this. If
>> we can't standardize a and b, then if the method behaves differently
>> when given u_i and up_i values, then you can get strange behavior. So
>> the guidelines about how to vote (mean utility, etc) are just
>> preprocessing steps that make your ballot expression no longer depend on
>> what a and b are. Then it's much more honest to attach these guidelines
>> to the method itself so it does so for the voter, so that voters don't
>> have to care about what society's a and b values are supposed to be, and
>> so that the method doesn't get away with sweeping de-facto failures
>> under the rug.
>>
>> At least MJ recognizes this and says "the only way we're going to get
>> IIA is if we have a and b values that are close enough to commensurable
>> that the problem doesn't occur". And then the point of using grades
>> instead of scores, and using order statistics, is to make the whole
>> process relatively insensitive to what a and b are, so that (hopefully)
>> a common grade standard can be established.
>>
>> > It is equally true for the not as sensitive strategy of approving the
>> > candidates k with above expectation utilities:
>> > u_k >sum P_i u_i,
>> > based on estimates of (non tie based) winning probabilities P_i, which
>> > are still sketchy because of rampant misinformation, not to mention
>> > intentional disinformation.
>>
>> Those are zero-info strategies, and indeed, they're also insensitive to
>> a and b.
>>
>> SARVO tries to get around the fragility/chaos problem by averaging over
>> a lot of vote orders. But it's somewhat of a hack; it's not particularly
>> elegant, and it fails scale invariance. Perhaps better is finding a
>> voting equilibrium where the mixed strategy is so that the distribution
>> of the M-W votes are stable, and then electing the candidate with the
>> highest expected score. I haven't read the M-W paper in detail, though,
>> so I don't know if finding this equilibrium is easy.
>>
>> (Another possibility, inspired by counterfactual regret minimization, is
>> to do M-W strategy by every voter, and then once everybdoy has submitted
>> a vote, pulling one of the voters from the list and having him readjust
>> his strategic ballot. Keep doing so over a long enough timeline and the
>> average of scores should converge to an equilibrium.)
>>
>> For the zero-info strategies, I tried to figure out what the optimum
>> zero info strategy is for Lp cumulative voting. I didn't get all the way
>> there, but this is what I figured:
>>
>> Under zero information, p_ij is equal for all pairs, and is (I think)
>> 1/n^2. So the objective for a zero-info voter is to maximize
>>         SUM i=1..n v_i R_i
>> with R_i = SUM i != j: 1/(n^2) (u_i - u_j).
>>
>> We also have the constraint that SUM i=1..n |v_i|^p = 1 (due to Lp
>> normalization).
>>
>> So to use a Lagrangian:
>>         max SUM i=1..n R_i v_i + lambda (1 - SUM i=1..n |v_i|^p)
>> i.e.
>>         max SUM i=1..n (R_i v_i - lambda |v_i|^p) + lambda
>>
>> Now do a hack and use v_i^p instead because it's easier to differentiate
>> (might not be sound?), and let's consider one particular v, say v_1.
>>
>> The derivative wrt v_1 is
>>         v_1 = ( -R_1/(lambda*p) )^(1/(p-1))
>> and wrt lambda
>>         sum i=1..n: v_i^p = 1.
>>
>> So what that means is that the optimum is at
>>         v_i = (R_i/k)^(1/(p-1))
>> where k is a constant set so that the pth powers of the voting variables
>> sum to one. (I.e. lambda is set so that -lambda p = k, because the
>> derivative wrt lambda itself places no constraint on lambda.)
>>
>> In particular, for max norm (Range), the calculation involves an 1/infty
>> norm, i.e. 0 norm, so that the scores only depend on the sign values of
>> the R variables. I don't *quite* get the right result here (it seems to
>> indicate the optimum vote would be +1 or -1 for every candidate), which
>> I think is because I turned |v_i| into v_i above.
>>
>> For ordinary cumulative voting (l1-cumulative), all R_i are raised to
>> some power that's approaching infinity. So as this power approaches
>> infinity, the k term grows to satisfy the constraint that the pth power
>> sums must be 1. This means that everything except the v_i corresponding
>> to the greatest R_i will approach zero, whereas the remaining one
>> approaches one. So the best zero-info strategy is to give max score to
>> your favorite and nobody else.
>>
>> For the quadratic norm, v_i = R_i/k, so only here is the zero info vote
>> directly proportional to R_i.
>>
>> And R_i - R_j is proportional to u_i - u_j with the same constant of
>> proportionality throughout, because:
>>         R_i - R_j = 1/(n^2) (SUM i!=k (u_i - u_k) - SUM j!=k (u_j - u_k))
>>                   = 1/(n^2) ( (n-1) u_i - SUM k: (u_k) + u_i - (n-1) u_j
>> + SUM k:
>> (u_k) - u_j)
>>                   = 1/(n^2) (n (u_i - u_j))
>>                   = 1/n (u_i - u_j)
>>
>> Hence for quadratic voting, so are the optimal zero info scores v_i.
>> Looking at R_i - R_j removes the b factor, which is probably why I can't
>> show that R_i is proportional to u_i directly.
>>
>> Again, it's not entirely sound but it indicates the general direction.
>> Do improve my calculations if you can, as they're very rough.
>>
>> (The problem with quadratic voting is that it isn't cloneproof. I
>> suspect that only Range itself is, because for every other p-norm >= 1,
>> you can imagine a two-candidate election where A gets 1+epsilon points,
>> B gets 1, then clone A to make A lose, if you just make epsilon small
>> enough.)
>>
>> -km
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.electorama.com/pipermail/election-methods-electorama.com/attachments/20220427/54c3241e/attachment-0001.html>


More information about the Election-Methods mailing list