[EM] Three Category Symmetric MJ

Andy Dienes andydienes at gmail.com
Thu Apr 28 14:00:34 PDT 2022


I believe this is also the "Usual Judgement"  Usual judgment - Wikipedia
<https://en.wikipedia.org/wiki/Usual_judgment>

On Thu, Apr 28, 2022 at 3:51 PM Andy Jennings <elections at jenningsstory.com>
wrote:

> Forest,
>
> I like this rule. My thesis has a section about MJ tie-breaking rules and
> I came to the same conclusion:
> http://ajennings.net/dissertation.pdf
> pages 24-31
>
> ~ Andy
>
> On Thu, Apr 28, 2022 at 10:56 AM Forest Simmons <
> forest.simmons21 at gmail.com> wrote:
>
>> contourplot (.5(y-x)/(1-x-y) for x=0... .5, y=0... .5)
>>
>> Paste the above command into Wolfram Alpha to get a contour plot of the
>> Middle region.
>>
>> El mié., 27 de abr. de 2022 6:10 p. m., Forest Simmons <
>> forest.simmons21 at gmail.com> escribió:
>>
>>> A brief summary: The three judgment categories are Good, Middle, and
>>> Bad, meaning Satisfactory to Excellent, Mediocre, and Unsuitable (for
>>> whatever reasons, including inability to elicit an opinion from the voter,
>>> so blank=Bad).
>>>
>>> Candidates marked Good on more than half of the ballots are judged to be
>>> in the Good category. Candidates marked Bad on more than half of the
>>> ballots are judged to be in the Bad category. All other candidates,
>>> including those marked Mediocre on more than half of the ballots, are
>>> judged to be in the Middle category.
>>>
>>> It is desirable for election purposes to establish a finish order that
>>> respects these judgment categories ... with all candidates judged Good,
>>> ahead of the the other candidates, and all candidates judged Bad behind the
>>> other candidates.
>>>
>>> Within the Good category, candidate k finishes ahead of candidate j if
>>> the number of ballots g(k) on which k is marked Good exceeds the
>>> corresponding number g(j) for candidate j. If this rule needs further
>>> resolution because j and k are marked good on the same number of ballots,
>>> then k is ahead of j in the finish orde if k is marked Bad on fewer ballots
>>> than j is so marked.
>>>
>>> Similarly, within the Bad category, k finishes ahead of j if k is marked
>>> Bad on more ballots than k is, and when j and k are marked Bad on the same
>>> number of ballots k is ahead of j if k is marked Good on more ballots than
>>> j is.
>>>
>>> So far this is all clear and logical. But how to establish a logical,
>>> consistent finish order within the Middle category is not so obvious ...
>>> until we have a good graphical representation of the three categories and
>>> how they fit together.
>>>
>>> How they fit together is important because (among other reasons) a small
>>> change in the number of ballots can easily bump a borderline candidate from
>>> one category into another. If k goes from (barely) Good to Middle, it
>>> should enter the Middle category at the upper end of the finish order
>>> within the Middle category. Otherwise, the method has no hope of passing
>>> any reasonable form of the Participation criterion.
>>>
>>> The graphical representation will make this continuity requirement
>>> clear. One definition of "topology" is the science of continuity. Without
>>> respect for the relevant topology, it isn't possible to have the continuity
>>> needed for the Participation compliance alluded to above.
>>>
>>> So here's the picture: Let N be the total number of ballots submitted.
>>> For each candidate k, let g(k), b(k), and m(k) be the number of ballots on
>>> which k is marked Good, Bad, or Mediocre. For graphical purposes let the
>>> three dimensional Cartesian coordinates x,  y, and z be defined as
>>> x=b(k)/N, y=g(k)/N, and z=m(k)/N, so that (no matter the candidate k ...
>>> hence suppression of k in the notation) we have the constraint equation
>>> x+y+z=1.
>>>
>>> The graph of this equation is the convex hull of the three points
>>> (1,0,0), (0,1,0), and (0, 0, 1), which are the vertices of an equilateral
>>> triangle.
>>>
>>> It is convenient to project this graph vertically onto the x,y plane,
>>> the planar region R given by the planar inequality x+y<=1. To understand
>>> this picture rewrite the constraint equation as x+y=1-z, which reveals z in
>>> the role of "slack variable" for the planar inequality.
>>>
>>> The intersection of R with the inequality y>50%, is a right triangle
>>> representing the set of candidates in the Good category:
>>> {k | g(k)>50% of N}. The set of candidates tied for y=constant, is a
>>> horizontal line segment in the Good triangle.  In other words, the "level
>>> curves" or "contour lines" of the equations y=c, for .5<c<1-x fill up the
>>> Good candidate region with horizontal level curves.
>>>
>>> Similarly the vertical segments given by x=c for .5<x<1-y fill up the
>>> bad region.
>>>
>>> To fill up the Middle region we need a family of line segments that
>>> smoothly transition from the vertical segment V forming the leftmost
>>> boundary of the Bad region to the horizontal segment H at the lower
>>> boundary of the Good region.
>>>
>>> The only way to accomplish this is to rotate the segment V  clockwise 90
>>> degrees about the point (.5, .5).
>>>
>>> At time t let V(t) make a counter clockwise angle with the diagonal y=x
>>> of t degrees, t goes from -45 to +45 degrees. Let (x, y) be a point on the
>>> segment V(t). Then the angle t between the diagonal y=x and the segment
>>> connecting V(t) is 45 degrees minus the angle theta of V(t) with the
>>> negative X axis. We have tan(theta) = slope of V(t), which equals (.5
>>> -y)/(.5 -x) since both (x,y) and (.5, .5) are on V(t).
>>>
>>> So t = 45 deg - arctan((.5 -y)/(.5 -x))
>>>
>>> Taking the tangent of both sides of this equation, and making use of the
>>> formula for the tangent of a difference, we get
>>>
>>> tan(t) as (tan 45deg - tan theta) divided by (1 +tan 45deg * tan theta).
>>>
>>> Since tan 45 deg equals 1, and tan theta equals the slope  (.5
>>> -y)/(.5-x), we get
>>>
>>> tan(t)= (1 - slope)/(1+slope)
>>>
>>> Substituting slope = (.5 -y)/(.5-x), and simplifying algebraically, we
>>> get...
>>>
>>> tan(t)=(y-x)/[2( 1-x-y)].
>>>
>>> Recognizing (1 - x -y) as the slack variable z, we see that tan(t) is
>>> just
>>> (y-x)/(2z). In terms of g, b, and m,
>>> tan(t)= .5 (g(k)-b(k)/m(k) ....
>>> which shows where the mysterious formula for the finish order within the
>>> Middle category came from in my previous message.
>>>
>>> In particular, tan 45deg = 1, which is the same as (.5 -x)/(2(1-x-.5))
>>> given by the formula, and tan(-45 deg) is -1, which is the same as (y-x)/(2z),
>>> for any point on V=V(0), where x=.5, and z=1-.5 -y, for any y between 0
>>> and. 5.
>>>
>>> It all checks out and fits together at the boundaries of the three
>>> regions.
>>>
>>> That's it for now!
>>>
>>> -Forest
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> El mar., 26 de abr. de 2022 7:20 p. m., Forest Simmons <
>>> forest.simmons21 at gmail.com> escribió:
>>>
>>>> Good Work!
>>>>
>>>> I have come to the same conclusion about MJ being much closer to IIA
>>>> than Range.
>>>>
>>>> So I've been trying to improve MJ to make it more symmetrical, satisfy
>>>> some kind of participation, improve tie breaking all around, etc.
>>>>
>>>> I have a good 3-slot version that is as decisive as possible for a
>>>> reverse symmetry method.
>>>>
>>>> I am reminded of our flurry of 3-slot methods from twenty years ago.
>>>> Back then the EM list was very sure that the best proposals were Condorcet
>>>> and Approval, but not at all sure which would be most viable. We thought a
>>>> majoritarian 3-slot method would be plenty simple, easy to count, and have
>>>> room to distinguish Roosevelt, Stalin, and Hitler. Three slot Bucklin was a
>>>> popular suggestion with various tie breaking rules, but no version was
>>>> symmetrical, or any better than (the still unheard of) MJ with three
>>>> judgment categories.
>>>>
>>>> As I have recently come to understand, our lack of design success (i.e.
>>>> inability to get reverse symmetry into three slot Bucklin) was due to
>>>> ignorance of the underlying topology.
>>>>
>>>> To make a long story short, I will finish this message by cutting
>>>> straight to the chase, saving the full explanations for tomorrow:
>>>>
>>>> Suppose the three categories are Good, Middle, and Bad. Good means a
>>>> mixture of desirable qualities from competent to excellent. Bad means
>>>> incompetent or otherwise unsuitable. Middle includes some good qualities
>>>> but not unalloyed with baser metals ...  which is different from "no
>>>> opinion" the same way that "average" is different from "no basis for a
>>>> grade." To avoid "dark horse" problems, we will count blank as Bad. (I'm
>>>> sure some philosopher or lawyer could explain why a candidate able to
>>>> generate neither appreciable support nor opposition, would be unsuitable.)
>>>>
>>>> For each candidate k, let g(k), m(k), & b(k) be the number of ballots
>>>> on which candidate k is categorized as Good, Middle, or Bad, respectively.
>>>>
>>>> If some candidate is judged to be Good on more than half of the
>>>> ballots, then among the candidates tied for the greatest g(k) value, elect
>>>> the one categorized as Bad on the fewest ballots.
>>>>
>>>> In other words (and symbols) ....
>>>>
>>>> elect argmin{b(j)| j is in argmax[g(k)]}.
>>>>
>>>> If (in the other extreme) every candidate is judged to be Bad on more
>>>> than half of the ballots,  then from among the candidates tied for the
>>>> least b(k) value, elect the one categorized as Good on the most ballots. In
>>>> symbols ...
>>>>
>>>> elect argmax{g(j)| j is in argmin[b(k)]}.
>>>>
>>>> If neither of the two above cases obtains, then elect from among the
>>>> candidates in the set argmax[(g(k)-b(k))/m(k)] the candidate j categorized
>>>> as Bad on the fewest ballots or the one categorized as Good on the most
>>>> ballots, depending on whether or not g(j) is greater than b(j). In symbols
>>>> ... elect
>>>>
>>>> argmax{T(j)| candidate  j is among the tied candidates, i.e. j is a
>>>> member of argmax[(g(k)-b(k))/m(k)]},
>>>>
>>>> where the tie breaker function T to be maximized is given by ...
>>>>
>>>> T(j) = min(b(j),g(j))sign(b(j)-g(j))
>>>>
>>>> [Here we have made use of the fact that b(j) is minimized when its
>>>> opposite -b(j) is maximized.]
>>>>
>>>> In my next message I will unfold all of these mysteries into plain view!
>>>>
>>>> At least now you have the complete 3-slot reverse symmetry compliant MJ
>>>> recipe safely in the EM cloud.
>>>>
>>>> -Forest
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> El mar., 26 de abr. de 2022 2:13 a. m., Kristofer Munsterhjelm <
>>>> km_elmet at t-online.de> escribió:
>>>>
>>>>> On 26.04.2022 00:51, Forest Simmons wrote:
>>>>> >
>>>>> >
>>>>> > El lun., 25 de abr. de 2022 4:25 a. m., Kristofer Munsterhjelm
>>>>> > <km_elmet at t-online.de <mailto:km_elmet at t-online.de>> escribió:
>>>>> >>
>>>>> >>     So the M-W strategy is: let
>>>>> >>             v_i be the strategic rating we want to find
>>>>> >>             u_i be the public utility of candidate i
>>>>> >>             p_ij be the voter's perceived probability that i and j
>>>>> will
>>>>> >>     be tied.
>>>>> >>
>>>>> >
>>>>> > I could be wrong but I think it should be "tied for winning."
>>>>>
>>>>> You're right. I was looking at the paper just now, and it says:
>>>>>
>>>>> "For each pair of candidates i and j, the /pivot probability/ p is the
>>>>> probability (perceived by a voter) of the event that candidates i and j
>>>>> will be tied for first place in the election."
>>>>>
>>>>> I imagine you could refine it a little by letting the p.p. be
>>>>> parameterized by the vote to submit. E.g. if it's Range voting and i's
>>>>> score minus j's score is 1, then you could flip the win from i to j by
>>>>> voting j 10 and i 0. But this would complicate the strategy a lot at
>>>>> (probably) only very slight benefit.
>>>>>
>>>>> > It is interesting that this strategy can actually result in non-solid
>>>>> > approval coalitions on ballots ... sometimes it requires you to
>>>>> approve
>>>>> > X while leaving unapproved some candidate Y rated above X on the same
>>>>> > ballot ... i.e. insincere strategy.
>>>>> >
>>>>> > Furthermore, if estimates of both the utilities u_i and u_j, as well
>>>>> as
>>>>> > of the probabilities p_ij in question were known with a high degree
>>>>> of
>>>>> > precision, you might get away with those insincere gaps in the
>>>>> approval
>>>>> > order.
>>>>> >
>>>>> > These facts reflect the fragility (anti-robustness) of the winning
>>>>> tie
>>>>> > probability based strategy.
>>>>>
>>>>> Yes, I think Warren observed something similar: under imperfect
>>>>> information, the optimal Range/Approval strategy might have you
>>>>> approving of X and not Y even though you rank Y ahead of X. Under
>>>>> perfect information, there's always some kind of cutoff where you
>>>>> approve everybody above it and don't everybody below it.
>>>>>
>>>>> > Nevertheless, your result is highly relevant because it shows that
>>>>> on a
>>>>> > fundamental level there is a meaningful, experimental way of defining
>>>>> > individual utilities that are just as good as the theoretical
>>>>> utilities
>>>>> > invoked as a basis for Approval strategy.
>>>>>
>>>>> I keep harping on the problem of Range and Approval to fail "de facto
>>>>> IIA" despite passing it de jure, and I suspect it's related to this. If
>>>>> we can't standardize a and b, then if the method behaves differently
>>>>> when given u_i and up_i values, then you can get strange behavior. So
>>>>> the guidelines about how to vote (mean utility, etc) are just
>>>>> preprocessing steps that make your ballot expression no longer depend
>>>>> on
>>>>> what a and b are. Then it's much more honest to attach these guidelines
>>>>> to the method itself so it does so for the voter, so that voters don't
>>>>> have to care about what society's a and b values are supposed to be,
>>>>> and
>>>>> so that the method doesn't get away with sweeping de-facto failures
>>>>> under the rug.
>>>>>
>>>>> At least MJ recognizes this and says "the only way we're going to get
>>>>> IIA is if we have a and b values that are close enough to commensurable
>>>>> that the problem doesn't occur". And then the point of using grades
>>>>> instead of scores, and using order statistics, is to make the whole
>>>>> process relatively insensitive to what a and b are, so that (hopefully)
>>>>> a common grade standard can be established.
>>>>>
>>>>> > It is equally true for the not as sensitive strategy of approving the
>>>>> > candidates k with above expectation utilities:
>>>>> > u_k >sum P_i u_i,
>>>>> > based on estimates of (non tie based) winning probabilities P_i,
>>>>> which
>>>>> > are still sketchy because of rampant misinformation, not to mention
>>>>> > intentional disinformation.
>>>>>
>>>>> Those are zero-info strategies, and indeed, they're also insensitive to
>>>>> a and b.
>>>>>
>>>>> SARVO tries to get around the fragility/chaos problem by averaging over
>>>>> a lot of vote orders. But it's somewhat of a hack; it's not
>>>>> particularly
>>>>> elegant, and it fails scale invariance. Perhaps better is finding a
>>>>> voting equilibrium where the mixed strategy is so that the distribution
>>>>> of the M-W votes are stable, and then electing the candidate with the
>>>>> highest expected score. I haven't read the M-W paper in detail, though,
>>>>> so I don't know if finding this equilibrium is easy.
>>>>>
>>>>> (Another possibility, inspired by counterfactual regret minimization,
>>>>> is
>>>>> to do M-W strategy by every voter, and then once everybdoy has
>>>>> submitted
>>>>> a vote, pulling one of the voters from the list and having him readjust
>>>>> his strategic ballot. Keep doing so over a long enough timeline and the
>>>>> average of scores should converge to an equilibrium.)
>>>>>
>>>>> For the zero-info strategies, I tried to figure out what the optimum
>>>>> zero info strategy is for Lp cumulative voting. I didn't get all the
>>>>> way
>>>>> there, but this is what I figured:
>>>>>
>>>>> Under zero information, p_ij is equal for all pairs, and is (I think)
>>>>> 1/n^2. So the objective for a zero-info voter is to maximize
>>>>>         SUM i=1..n v_i R_i
>>>>> with R_i = SUM i != j: 1/(n^2) (u_i - u_j).
>>>>>
>>>>> We also have the constraint that SUM i=1..n |v_i|^p = 1 (due to Lp
>>>>> normalization).
>>>>>
>>>>> So to use a Lagrangian:
>>>>>         max SUM i=1..n R_i v_i + lambda (1 - SUM i=1..n |v_i|^p)
>>>>> i.e.
>>>>>         max SUM i=1..n (R_i v_i - lambda |v_i|^p) + lambda
>>>>>
>>>>> Now do a hack and use v_i^p instead because it's easier to
>>>>> differentiate
>>>>> (might not be sound?), and let's consider one particular v, say v_1.
>>>>>
>>>>> The derivative wrt v_1 is
>>>>>         v_1 = ( -R_1/(lambda*p) )^(1/(p-1))
>>>>> and wrt lambda
>>>>>         sum i=1..n: v_i^p = 1.
>>>>>
>>>>> So what that means is that the optimum is at
>>>>>         v_i = (R_i/k)^(1/(p-1))
>>>>> where k is a constant set so that the pth powers of the voting
>>>>> variables
>>>>> sum to one. (I.e. lambda is set so that -lambda p = k, because the
>>>>> derivative wrt lambda itself places no constraint on lambda.)
>>>>>
>>>>> In particular, for max norm (Range), the calculation involves an
>>>>> 1/infty
>>>>> norm, i.e. 0 norm, so that the scores only depend on the sign values of
>>>>> the R variables. I don't *quite* get the right result here (it seems to
>>>>> indicate the optimum vote would be +1 or -1 for every candidate), which
>>>>> I think is because I turned |v_i| into v_i above.
>>>>>
>>>>> For ordinary cumulative voting (l1-cumulative), all R_i are raised to
>>>>> some power that's approaching infinity. So as this power approaches
>>>>> infinity, the k term grows to satisfy the constraint that the pth power
>>>>> sums must be 1. This means that everything except the v_i corresponding
>>>>> to the greatest R_i will approach zero, whereas the remaining one
>>>>> approaches one. So the best zero-info strategy is to give max score to
>>>>> your favorite and nobody else.
>>>>>
>>>>> For the quadratic norm, v_i = R_i/k, so only here is the zero info vote
>>>>> directly proportional to R_i.
>>>>>
>>>>> And R_i - R_j is proportional to u_i - u_j with the same constant of
>>>>> proportionality throughout, because:
>>>>>         R_i - R_j = 1/(n^2) (SUM i!=k (u_i - u_k) - SUM j!=k (u_j -
>>>>> u_k))
>>>>>                   = 1/(n^2) ( (n-1) u_i - SUM k: (u_k) + u_i - (n-1)
>>>>> u_j + SUM k:
>>>>> (u_k) - u_j)
>>>>>                   = 1/(n^2) (n (u_i - u_j))
>>>>>                   = 1/n (u_i - u_j)
>>>>>
>>>>> Hence for quadratic voting, so are the optimal zero info scores v_i.
>>>>> Looking at R_i - R_j removes the b factor, which is probably why I
>>>>> can't
>>>>> show that R_i is proportional to u_i directly.
>>>>>
>>>>> Again, it's not entirely sound but it indicates the general direction.
>>>>> Do improve my calculations if you can, as they're very rough.
>>>>>
>>>>> (The problem with quadratic voting is that it isn't cloneproof. I
>>>>> suspect that only Range itself is, because for every other p-norm >= 1,
>>>>> you can imagine a two-candidate election where A gets 1+epsilon points,
>>>>> B gets 1, then clone A to make A lose, if you just make epsilon small
>>>>> enough.)
>>>>>
>>>>> -km
>>>>>
>>>> ----
>> Election-Methods mailing list - see https://electorama.com/em for list
>> info
>>
> ----
> Election-Methods mailing list - see https://electorama.com/em for list
> info
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.electorama.com/pipermail/election-methods-electorama.com/attachments/20220428/811666ef/attachment-0001.html>


More information about the Election-Methods mailing list