# [EM] Three Category Symmetric MJ

Forest Simmons forest.simmons21 at gmail.com
Thu Apr 28 18:03:38 PDT 2022

```Andy,

Great dissertation!

I look forward to spending more time with it.

I started thinking about the three category version because there is an
important possible application in the Universal Domain context of ranked
choice ballots where equal ranks and truncations are allowed: instead of
Approval Sorted Margins, do MJ Sorted Margins ...using three category
reverse symmetry MJ (with our helical ramp continuity function), where the
three categories Good, Bad, and Middle, respectively, are simply Too,
Bottom, and Middle ballot listings. Of course, Top ranked on a ballot means
not outranked on the ballot. Bottom on the ballot means "does not outrank
any candidate" on the ballot, and Middle means "out ranks at least one
candidate and is outranked by at least one, as well."

This reverse symmetry basis for MJ Sorted Margins allows that Sorted
Margins method to satisfy Reverse Symmetry in a way that Implicit Approval
Sorted Margins does not.

And the (extended) tie breaking rule is so decisive it does not suffer from
the potential indecisiveness of Implicit Approval.

The one case where the tie breaking rule is not decisive... where two
candidates j and k have Top(k)=Bot(k) and Too(j)=Bot(j) ... is extremely
rare.  In that case I think the tie should be broken in favor of the
candidate whose common Too, Bot value puts it closest to the midpoint (25%,
25%) of the segment connecting points (0,0) and (50%, 50%)  representing
luke warm and controversial candidates, respectively.

If the j and k judgments are the same distance from the midpoint, then that
gives us an opportunity to use the pairwise defeat information inherent in
the ranked choice ballots.

If that is the only use of the pairwise preferences (stopping short of the
Sorted Margins application), then  how much incentive could there be for
burial or compromise?

Best Wishes,

Forest

El jue., 28 de abr. de 2022 12:51 p. m., Andy Jennings <
elections at jenningsstory.com> escribió:

> Forest,
>
> I like this rule. My thesis has a section about MJ tie-breaking rules and
> I came to the same conclusion:
> http://ajennings.net/dissertation.pdf
> pages 24-31
>
> ~ Andy
>
> On Thu, Apr 28, 2022 at 10:56 AM Forest Simmons <
> forest.simmons21 at gmail.com> wrote:
>
>> contourplot (.5(y-x)/(1-x-y) for x=0... .5, y=0... .5)
>>
>> Paste the above command into Wolfram Alpha to get a contour plot of the
>> Middle region.
>>
>> El mié., 27 de abr. de 2022 6:10 p. m., Forest Simmons <
>> forest.simmons21 at gmail.com> escribió:
>>
>>> A brief summary: The three judgment categories are Good, Middle, and
>>> Bad, meaning Satisfactory to Excellent, Mediocre, and Unsuitable (for
>>> whatever reasons, including inability to elicit an opinion from the voter,
>>>
>>> Candidates marked Good on more than half of the ballots are judged to be
>>> in the Good category. Candidates marked Bad on more than half of the
>>> ballots are judged to be in the Bad category. All other candidates,
>>> including those marked Mediocre on more than half of the ballots, are
>>> judged to be in the Middle category.
>>>
>>> It is desirable for election purposes to establish a finish order that
>>> respects these judgment categories ... with all candidates judged Good,
>>> ahead of the the other candidates, and all candidates judged Bad behind the
>>> other candidates.
>>>
>>> Within the Good category, candidate k finishes ahead of candidate j if
>>> the number of ballots g(k) on which k is marked Good exceeds the
>>> corresponding number g(j) for candidate j. If this rule needs further
>>> resolution because j and k are marked good on the same number of ballots,
>>> then k is ahead of j in the finish orde if k is marked Bad on fewer ballots
>>> than j is so marked.
>>>
>>> Similarly, within the Bad category, k finishes ahead of j if k is marked
>>> Bad on more ballots than k is, and when j and k are marked Bad on the same
>>> number of ballots k is ahead of j if k is marked Good on more ballots than
>>> j is.
>>>
>>> So far this is all clear and logical. But how to establish a logical,
>>> consistent finish order within the Middle category is not so obvious ...
>>> until we have a good graphical representation of the three categories and
>>> how they fit together.
>>>
>>> How they fit together is important because (among other reasons) a small
>>> change in the number of ballots can easily bump a borderline candidate from
>>> one category into another. If k goes from (barely) Good to Middle, it
>>> should enter the Middle category at the upper end of the finish order
>>> within the Middle category. Otherwise, the method has no hope of passing
>>> any reasonable form of the Participation criterion.
>>>
>>> The graphical representation will make this continuity requirement
>>> clear. One definition of "topology" is the science of continuity. Without
>>> respect for the relevant topology, it isn't possible to have the continuity
>>> needed for the Participation compliance alluded to above.
>>>
>>> So here's the picture: Let N be the total number of ballots submitted.
>>> For each candidate k, let g(k), b(k), and m(k) be the number of ballots on
>>> which k is marked Good, Bad, or Mediocre. For graphical purposes let the
>>> three dimensional Cartesian coordinates x,  y, and z be defined as
>>> x=b(k)/N, y=g(k)/N, and z=m(k)/N, so that (no matter the candidate k ...
>>> hence suppression of k in the notation) we have the constraint equation
>>> x+y+z=1.
>>>
>>> The graph of this equation is the convex hull of the three points
>>> (1,0,0), (0,1,0), and (0, 0, 1), which are the vertices of an equilateral
>>> triangle.
>>>
>>> It is convenient to project this graph vertically onto the x,y plane,
>>> the planar region R given by the planar inequality x+y<=1. To understand
>>> this picture rewrite the constraint equation as x+y=1-z, which reveals z in
>>> the role of "slack variable" for the planar inequality.
>>>
>>> The intersection of R with the inequality y>50%, is a right triangle
>>> representing the set of candidates in the Good category:
>>> {k | g(k)>50% of N}. The set of candidates tied for y=constant, is a
>>> horizontal line segment in the Good triangle.  In other words, the "level
>>> curves" or "contour lines" of the equations y=c, for .5<c<1-x fill up the
>>> Good candidate region with horizontal level curves.
>>>
>>> Similarly the vertical segments given by x=c for .5<x<1-y fill up the
>>>
>>> To fill up the Middle region we need a family of line segments that
>>> smoothly transition from the vertical segment V forming the leftmost
>>> boundary of the Bad region to the horizontal segment H at the lower
>>> boundary of the Good region.
>>>
>>> The only way to accomplish this is to rotate the segment V  clockwise 90
>>> degrees about the point (.5, .5).
>>>
>>> At time t let V(t) make a counter clockwise angle with the diagonal y=x
>>> of t degrees, t goes from -45 to +45 degrees. Let (x, y) be a point on the
>>> segment V(t). Then the angle t between the diagonal y=x and the segment
>>> connecting V(t) is 45 degrees minus the angle theta of V(t) with the
>>> negative X axis. We have tan(theta) = slope of V(t), which equals (.5
>>> -y)/(.5 -x) since both (x,y) and (.5, .5) are on V(t).
>>>
>>> So t = 45 deg - arctan((.5 -y)/(.5 -x))
>>>
>>> Taking the tangent of both sides of this equation, and making use of the
>>> formula for the tangent of a difference, we get
>>>
>>> tan(t) as (tan 45deg - tan theta) divided by (1 +tan 45deg * tan theta).
>>>
>>> Since tan 45 deg equals 1, and tan theta equals the slope  (.5
>>> -y)/(.5-x), we get
>>>
>>> tan(t)= (1 - slope)/(1+slope)
>>>
>>> Substituting slope = (.5 -y)/(.5-x), and simplifying algebraically, we
>>> get...
>>>
>>> tan(t)=(y-x)/[2( 1-x-y)].
>>>
>>> Recognizing (1 - x -y) as the slack variable z, we see that tan(t) is
>>> just
>>> (y-x)/(2z). In terms of g, b, and m,
>>> tan(t)= .5 (g(k)-b(k)/m(k) ....
>>> which shows where the mysterious formula for the finish order within the
>>> Middle category came from in my previous message.
>>>
>>> In particular, tan 45deg = 1, which is the same as (.5 -x)/(2(1-x-.5))
>>> given by the formula, and tan(-45 deg) is -1, which is the same as (y-x)/(2z),
>>> for any point on V=V(0), where x=.5, and z=1-.5 -y, for any y between 0
>>> and. 5.
>>>
>>> It all checks out and fits together at the boundaries of the three
>>> regions.
>>>
>>> That's it for now!
>>>
>>> -Forest
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> El mar., 26 de abr. de 2022 7:20 p. m., Forest Simmons <
>>> forest.simmons21 at gmail.com> escribió:
>>>
>>>> Good Work!
>>>>
>>>> I have come to the same conclusion about MJ being much closer to IIA
>>>> than Range.
>>>>
>>>> So I've been trying to improve MJ to make it more symmetrical, satisfy
>>>> some kind of participation, improve tie breaking all around, etc.
>>>>
>>>> I have a good 3-slot version that is as decisive as possible for a
>>>> reverse symmetry method.
>>>>
>>>> I am reminded of our flurry of 3-slot methods from twenty years ago.
>>>> Back then the EM list was very sure that the best proposals were Condorcet
>>>> and Approval, but not at all sure which would be most viable. We thought a
>>>> majoritarian 3-slot method would be plenty simple, easy to count, and have
>>>> room to distinguish Roosevelt, Stalin, and Hitler. Three slot Bucklin was a
>>>> popular suggestion with various tie breaking rules, but no version was
>>>> symmetrical, or any better than (the still unheard of) MJ with three
>>>> judgment categories.
>>>>
>>>> As I have recently come to understand, our lack of design success (i.e.
>>>> inability to get reverse symmetry into three slot Bucklin) was due to
>>>> ignorance of the underlying topology.
>>>>
>>>> To make a long story short, I will finish this message by cutting
>>>> straight to the chase, saving the full explanations for tomorrow:
>>>>
>>>> Suppose the three categories are Good, Middle, and Bad. Good means a
>>>> mixture of desirable qualities from competent to excellent. Bad means
>>>> incompetent or otherwise unsuitable. Middle includes some good qualities
>>>> but not unalloyed with baser metals ...  which is different from "no
>>>> opinion" the same way that "average" is different from "no basis for a
>>>> grade." To avoid "dark horse" problems, we will count blank as Bad. (I'm
>>>> sure some philosopher or lawyer could explain why a candidate able to
>>>> generate neither appreciable support nor opposition, would be unsuitable.)
>>>>
>>>> For each candidate k, let g(k), m(k), & b(k) be the number of ballots
>>>> on which candidate k is categorized as Good, Middle, or Bad, respectively.
>>>>
>>>> If some candidate is judged to be Good on more than half of the
>>>> ballots, then among the candidates tied for the greatest g(k) value, elect
>>>> the one categorized as Bad on the fewest ballots.
>>>>
>>>> In other words (and symbols) ....
>>>>
>>>> elect argmin{b(j)| j is in argmax[g(k)]}.
>>>>
>>>> If (in the other extreme) every candidate is judged to be Bad on more
>>>> than half of the ballots,  then from among the candidates tied for the
>>>> least b(k) value, elect the one categorized as Good on the most ballots. In
>>>> symbols ...
>>>>
>>>> elect argmax{g(j)| j is in argmin[b(k)]}.
>>>>
>>>> If neither of the two above cases obtains, then elect from among the
>>>> candidates in the set argmax[(g(k)-b(k))/m(k)] the candidate j categorized
>>>> as Bad on the fewest ballots or the one categorized as Good on the most
>>>> ballots, depending on whether or not g(j) is greater than b(j). In symbols
>>>> ... elect
>>>>
>>>> argmax{T(j)| candidate  j is among the tied candidates, i.e. j is a
>>>> member of argmax[(g(k)-b(k))/m(k)]},
>>>>
>>>> where the tie breaker function T to be maximized is given by ...
>>>>
>>>> T(j) = min(b(j),g(j))sign(b(j)-g(j))
>>>>
>>>> [Here we have made use of the fact that b(j) is minimized when its
>>>> opposite -b(j) is maximized.]
>>>>
>>>> In my next message I will unfold all of these mysteries into plain view!
>>>>
>>>> At least now you have the complete 3-slot reverse symmetry compliant MJ
>>>> recipe safely in the EM cloud.
>>>>
>>>> -Forest
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> El mar., 26 de abr. de 2022 2:13 a. m., Kristofer Munsterhjelm <
>>>> km_elmet at t-online.de> escribió:
>>>>
>>>>> On 26.04.2022 00:51, Forest Simmons wrote:
>>>>> >
>>>>> >
>>>>> > El lun., 25 de abr. de 2022 4:25 a. m., Kristofer Munsterhjelm
>>>>> > <km_elmet at t-online.de <mailto:km_elmet at t-online.de>> escribió:
>>>>> >>
>>>>> >>     So the M-W strategy is: let
>>>>> >>             v_i be the strategic rating we want to find
>>>>> >>             u_i be the public utility of candidate i
>>>>> >>             p_ij be the voter's perceived probability that i and j
>>>>> will
>>>>> >>     be tied.
>>>>> >>
>>>>> >
>>>>> > I could be wrong but I think it should be "tied for winning."
>>>>>
>>>>> You're right. I was looking at the paper just now, and it says:
>>>>>
>>>>> "For each pair of candidates i and j, the /pivot probability/ p is the
>>>>> probability (perceived by a voter) of the event that candidates i and j
>>>>> will be tied for first place in the election."
>>>>>
>>>>> I imagine you could refine it a little by letting the p.p. be
>>>>> parameterized by the vote to submit. E.g. if it's Range voting and i's
>>>>> score minus j's score is 1, then you could flip the win from i to j by
>>>>> voting j 10 and i 0. But this would complicate the strategy a lot at
>>>>> (probably) only very slight benefit.
>>>>>
>>>>> > It is interesting that this strategy can actually result in non-solid
>>>>> > approval coalitions on ballots ... sometimes it requires you to
>>>>> approve
>>>>> > X while leaving unapproved some candidate Y rated above X on the same
>>>>> > ballot ... i.e. insincere strategy.
>>>>> >
>>>>> > Furthermore, if estimates of both the utilities u_i and u_j, as well
>>>>> as
>>>>> > of the probabilities p_ij in question were known with a high degree
>>>>> of
>>>>> > precision, you might get away with those insincere gaps in the
>>>>> approval
>>>>> > order.
>>>>> >
>>>>> > These facts reflect the fragility (anti-robustness) of the winning
>>>>> tie
>>>>> > probability based strategy.
>>>>>
>>>>> Yes, I think Warren observed something similar: under imperfect
>>>>> information, the optimal Range/Approval strategy might have you
>>>>> approving of X and not Y even though you rank Y ahead of X. Under
>>>>> perfect information, there's always some kind of cutoff where you
>>>>> approve everybody above it and don't everybody below it.
>>>>>
>>>>> > Nevertheless, your result is highly relevant because it shows that
>>>>> on a
>>>>> > fundamental level there is a meaningful, experimental way of defining
>>>>> > individual utilities that are just as good as the theoretical
>>>>> utilities
>>>>> > invoked as a basis for Approval strategy.
>>>>>
>>>>> I keep harping on the problem of Range and Approval to fail "de facto
>>>>> IIA" despite passing it de jure, and I suspect it's related to this. If
>>>>> we can't standardize a and b, then if the method behaves differently
>>>>> when given u_i and up_i values, then you can get strange behavior. So
>>>>> the guidelines about how to vote (mean utility, etc) are just
>>>>> preprocessing steps that make your ballot expression no longer depend
>>>>> on
>>>>> what a and b are. Then it's much more honest to attach these guidelines
>>>>> to the method itself so it does so for the voter, so that voters don't
>>>>> have to care about what society's a and b values are supposed to be,
>>>>> and
>>>>> so that the method doesn't get away with sweeping de-facto failures
>>>>> under the rug.
>>>>>
>>>>> At least MJ recognizes this and says "the only way we're going to get
>>>>> IIA is if we have a and b values that are close enough to commensurable
>>>>> that the problem doesn't occur". And then the point of using grades
>>>>> instead of scores, and using order statistics, is to make the whole
>>>>> process relatively insensitive to what a and b are, so that (hopefully)
>>>>> a common grade standard can be established.
>>>>>
>>>>> > It is equally true for the not as sensitive strategy of approving the
>>>>> > candidates k with above expectation utilities:
>>>>> > u_k >sum P_i u_i,
>>>>> > based on estimates of (non tie based) winning probabilities P_i,
>>>>> which
>>>>> > are still sketchy because of rampant misinformation, not to mention
>>>>> > intentional disinformation.
>>>>>
>>>>> Those are zero-info strategies, and indeed, they're also insensitive to
>>>>> a and b.
>>>>>
>>>>> SARVO tries to get around the fragility/chaos problem by averaging over
>>>>> a lot of vote orders. But it's somewhat of a hack; it's not
>>>>> particularly
>>>>> elegant, and it fails scale invariance. Perhaps better is finding a
>>>>> voting equilibrium where the mixed strategy is so that the distribution
>>>>> of the M-W votes are stable, and then electing the candidate with the
>>>>> highest expected score. I haven't read the M-W paper in detail, though,
>>>>> so I don't know if finding this equilibrium is easy.
>>>>>
>>>>> (Another possibility, inspired by counterfactual regret minimization,
>>>>> is
>>>>> to do M-W strategy by every voter, and then once everybdoy has
>>>>> submitted
>>>>> a vote, pulling one of the voters from the list and having him readjust
>>>>> his strategic ballot. Keep doing so over a long enough timeline and the
>>>>> average of scores should converge to an equilibrium.)
>>>>>
>>>>> For the zero-info strategies, I tried to figure out what the optimum
>>>>> zero info strategy is for Lp cumulative voting. I didn't get all the
>>>>> way
>>>>> there, but this is what I figured:
>>>>>
>>>>> Under zero information, p_ij is equal for all pairs, and is (I think)
>>>>> 1/n^2. So the objective for a zero-info voter is to maximize
>>>>>         SUM i=1..n v_i R_i
>>>>> with R_i = SUM i != j: 1/(n^2) (u_i - u_j).
>>>>>
>>>>> We also have the constraint that SUM i=1..n |v_i|^p = 1 (due to Lp
>>>>> normalization).
>>>>>
>>>>> So to use a Lagrangian:
>>>>>         max SUM i=1..n R_i v_i + lambda (1 - SUM i=1..n |v_i|^p)
>>>>> i.e.
>>>>>         max SUM i=1..n (R_i v_i - lambda |v_i|^p) + lambda
>>>>>
>>>>> Now do a hack and use v_i^p instead because it's easier to
>>>>> differentiate
>>>>> (might not be sound?), and let's consider one particular v, say v_1.
>>>>>
>>>>> The derivative wrt v_1 is
>>>>>         v_1 = ( -R_1/(lambda*p) )^(1/(p-1))
>>>>> and wrt lambda
>>>>>         sum i=1..n: v_i^p = 1.
>>>>>
>>>>> So what that means is that the optimum is at
>>>>>         v_i = (R_i/k)^(1/(p-1))
>>>>> where k is a constant set so that the pth powers of the voting
>>>>> variables
>>>>> sum to one. (I.e. lambda is set so that -lambda p = k, because the
>>>>> derivative wrt lambda itself places no constraint on lambda.)
>>>>>
>>>>> In particular, for max norm (Range), the calculation involves an
>>>>> 1/infty
>>>>> norm, i.e. 0 norm, so that the scores only depend on the sign values of
>>>>> the R variables. I don't *quite* get the right result here (it seems to
>>>>> indicate the optimum vote would be +1 or -1 for every candidate), which
>>>>> I think is because I turned |v_i| into v_i above.
>>>>>
>>>>> For ordinary cumulative voting (l1-cumulative), all R_i are raised to
>>>>> some power that's approaching infinity. So as this power approaches
>>>>> infinity, the k term grows to satisfy the constraint that the pth power
>>>>> sums must be 1. This means that everything except the v_i corresponding
>>>>> to the greatest R_i will approach zero, whereas the remaining one
>>>>> approaches one. So the best zero-info strategy is to give max score to
>>>>> your favorite and nobody else.
>>>>>
>>>>> For the quadratic norm, v_i = R_i/k, so only here is the zero info vote
>>>>> directly proportional to R_i.
>>>>>
>>>>> And R_i - R_j is proportional to u_i - u_j with the same constant of
>>>>> proportionality throughout, because:
>>>>>         R_i - R_j = 1/(n^2) (SUM i!=k (u_i - u_k) - SUM j!=k (u_j -
>>>>> u_k))
>>>>>                   = 1/(n^2) ( (n-1) u_i - SUM k: (u_k) + u_i - (n-1)
>>>>> u_j + SUM k:
>>>>> (u_k) - u_j)
>>>>>                   = 1/(n^2) (n (u_i - u_j))
>>>>>                   = 1/n (u_i - u_j)
>>>>>
>>>>> Hence for quadratic voting, so are the optimal zero info scores v_i.
>>>>> Looking at R_i - R_j removes the b factor, which is probably why I
>>>>> can't
>>>>> show that R_i is proportional to u_i directly.
>>>>>
>>>>> Again, it's not entirely sound but it indicates the general direction.
>>>>> Do improve my calculations if you can, as they're very rough.
>>>>>
>>>>> (The problem with quadratic voting is that it isn't cloneproof. I
>>>>> suspect that only Range itself is, because for every other p-norm >= 1,
>>>>> you can imagine a two-candidate election where A gets 1+epsilon points,
>>>>> B gets 1, then clone A to make A lose, if you just make epsilon small
>>>>> enough.)
>>>>>
>>>>> -km
>>>>>
>>>> ----
>> Election-Methods mailing list - see https://electorama.com/em for list
>> info
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.electorama.com/pipermail/election-methods-electorama.com/attachments/20220428/dea8d807/attachment-0001.html>
```