[EM] Exploring a uniform truncation strategy with two frontrunners

Thu Feb 10 17:51:39 PST 2022

Hello,

In simulations I usually have the ballots of each faction generated randomly. It
occurred to me that it would be interesting to compare methods when at least one
particular strategy is always used: that of ranking strictly and sincerely,
except for truncating randomly between two perceived frontrunners, such that one
is ranked above the other and the other is ranked over no one. I speculate that
many voters would use this strategy regardless of method.
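
A minimal Python sketch of how such a ballot could be produced, under one reading of the strategy: the cut point is drawn uniformly from the positions between the two frontrunners, so the preferred frontrunner is always kept and the other is always dropped. The function name, the uniform draw, and the list representation of a ranking are assumptions for illustration, not the code used for the simulations.

```python
import random

def truncate_between_frontrunners(ranking, frontrunners, rng=random):
    """Return the voted (truncated) ballot for one sincere strict ranking.

    ranking: list of candidates, most preferred first.
    frontrunners: the two perceived frontrunners, in any order.
    """
    a, b = frontrunners
    hi, lo = sorted((ranking.index(a), ranking.index(b)))
    # Keep at least everything down to the preferred frontrunner
    # (hi + 1 candidates) and at most everything above the other
    # frontrunner (lo candidates).
    # Assumption: the cut is uniform over that range.
    keep = rng.randint(hi + 1, lo)
    return ranking[:keep]

# Sincere order C > A > D > B with frontrunners A and B:
# the voted ballot is either C > A or C > A > D.
ballot = truncate_between_frontrunners(["C", "A", "D", "B"], ("A", "B"),
                                       random.Random(1))
```

In either outcome the ballot ranks A over B and ranks B over no one, which is the property the strategy requires.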

Such a study requires a way to determine who the two frontrunners are in a given
election. Typically in simulations of polling I will just use the exact same
method as the election itself, which is then simply the final round of polling
with no special distinction. That would be a problem here as we can't use the
proposed strategy without first knowing frontrunners (at least tentative ones).
Additionally, a method may not obviously suggest two best candidates.

So instead I want to imagine a world where the frontrunners are identified
independently of the method to be used for the election. Certain "poll methods"
are defined to return two winners (unordered), and as input they will receive
all the sincere, strict rankings. I'll name these methods with an * prefix to
differentiate polling methods from election methods.

*IRV - return the last two candidates remaining after doing IRV eliminations.
*FPP - return the top two FPP candidates.
*River - return the River winner (which will include any Condorcet winner),
along with the candidate with the greatest pairwise opposition (PO) against the
River winner.
*BordaBest - return the top two sincere Borda candidates.
*BordaWorst - return the bottom two Borda candidates. This is basically a worst
case scenario, where the frontrunners should be as wrong as possible.
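
The four simpler poll methods can be sketched as follows. This is only an illustration under stated assumptions: blocs are represented as (weight, ranking) pairs with complete strict rankings, ties are broken by candidate listing order (the post does not say how ties are handled), and all function names are hypothetical. *River is left out because it needs a full River implementation.

```python
def fpp_poll(blocs, candidates):
    """*FPP: the two candidates with the most first preferences."""
    tally = {c: 0 for c in candidates}
    for weight, ranking in blocs:
        tally[ranking[0]] += weight
    ordered = sorted(candidates, key=lambda c: -tally[c])  # stable sort
    return frozenset(ordered[:2])

def irv_poll(blocs, candidates):
    """*IRV: the last two candidates left after IRV eliminations."""
    remaining = list(candidates)
    while len(remaining) > 2:
        tally = {c: 0 for c in remaining}
        for weight, ranking in blocs:
            top = next(c for c in ranking if c in tally)
            tally[top] += weight
        remaining.remove(min(remaining, key=lambda c: tally[c]))
    return frozenset(remaining)

def borda_scores(blocs, candidates):
    """Borda count: n-1 points for first place down to 0 for last."""
    n = len(candidates)
    scores = {c: 0 for c in candidates}
    for weight, ranking in blocs:
        for pos, c in enumerate(ranking):
            scores[c] += weight * (n - 1 - pos)
    return scores

def borda_best_poll(blocs, candidates):
    scores = borda_scores(blocs, candidates)
    return frozenset(sorted(candidates, key=lambda c: -scores[c])[:2])

def borda_worst_poll(blocs, candidates):
    scores = borda_scores(blocs, candidates)
    return frozenset(sorted(candidates, key=lambda c: scores[c])[:2])

# Illustrative (made-up) electorate, not taken from the simulations.
cands = ["A", "B", "C", "D"]
blocs = [(30, "ABCD"), (25, "DCBA"), (24, "CDBA"), (21, "BCAD")]
```

On this example the poll methods disagree: *FPP returns {A, D}, *IRV returns {A, C} (B's votes transfer to C, which then eliminates D), *BordaBest returns {B, C}, and *BordaWorst returns {A, D}.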

It seems that a little care is needed when defining a method. With IRV and FPP,
we know that the top two candidates by those methods are probably not clones,
and could probably be rivals. With River, I don't know this, so for the *River
poll method it seemed better to identify the candidate most able to beat the
winner pairwise. With *BordaBest, or really any single-winner method, I could
have done that as well.
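
To illustrate the "greatest pairwise opposition against the winner" step, here is a small sketch. It does not implement River; it only finds a Condorcet winner when one exists (which River would elect) and then picks the challenger. The helper names, the (weight, ranking) representation, and the tie-break by listing order are assumptions.

```python
def pairwise_opposition(blocs, x, y):
    """Total weight of voters who rank x above y."""
    return sum(w for w, r in blocs if r.index(x) < r.index(y))

def condorcet_winner(blocs, candidates):
    """Return the candidate who beats every other pairwise, or None."""
    for c in candidates:
        if all(pairwise_opposition(blocs, c, d) >
               pairwise_opposition(blocs, d, c)
               for d in candidates if d != c):
            return c
    return None

def strongest_challenger(blocs, candidates, winner):
    """Candidate with the greatest pairwise opposition against winner."""
    rivals = [c for c in candidates if c != winner]
    return max(rivals, key=lambda c: pairwise_opposition(blocs, c, winner))

# Illustrative (made-up) electorate.
cands = ["A", "B", "C", "D"]
blocs = [(30, "ABCD"), (25, "DCBA"), (24, "CBDA"), (21, "BCAD")]
cw = condorcet_winner(blocs, cands)
pair = {cw, strongest_challenger(blocs, cands, cw)}
```

Here B is the Condorcet winner and C has the most opposition against B (49 of 100), so the poll would report {B, C} as the frontrunners.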

The elections are all 4-candidate 5-bloc. Each bloc has a random size which is
unchanged from poll to election (which is a bit unusual as it means the poll is
perfect). The preference orders are random. Only 1000 trials per polling method
have been run. In all, 62 rank methods were tested, some not yet proposed; 27 of
them are Condorcet methods.
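
A rough sketch of how one such trial's sincere electorate might be generated. The post does not state the distribution of bloc sizes, so the uniform draw here is an assumption, as are the function name and the (weight, ranking) layout.

```python
import random

def random_election(n_candidates=4, n_blocs=5, rng=None):
    """Generate one sincere electorate: blocs of (weight, strict ranking)."""
    rng = rng or random.Random()
    candidates = [chr(ord("A") + i) for i in range(n_candidates)]
    blocs = []
    for _ in range(n_blocs):
        weight = rng.random()      # assumption: size uniform on [0, 1)
        ranking = candidates[:]
        rng.shuffle(ranking)       # uniformly random preference order
        blocs.append((weight, tuple(ranking)))
    return candidates, blocs

candidates, blocs = random_election(rng=random.Random(2022))
```

The same blocs would be passed first to the poll method (to get the frontrunners) and then, after truncation, to each election method, which matches the "perfect poll" setup described above.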

It may or may not be interesting to see how, on average, the election methods'
metrics change when a different polling method is used. I am not sure that any
analysis can suggest which polling method is the most "true to life" in general.
It could be hard even to say what situation we would like to be true. A bad
statistic on average is irrelevant when we will only actually use one election
method.

One interesting question may be, under each scenario, what is the correlation of
each metric with the sincere Condorcet efficiency (i.e. the rate of electing the
sincere CW when there is one)?
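
For concreteness, the correlations below can be read as being computed across the election methods: one (metric, sincere Condorcet efficiency) pair per method, for a given polling method. The post does not say which coefficient is used; this sketch assumes Pearson's r, and the numbers in the example are made-up placeholders, not simulation results.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Placeholder values, one entry per hypothetical election method.
metric_rate = [0.10, 0.20, 0.30, 0.40]
sincere_cw_efficiency = [0.90, 0.80, 0.70, 0.60]
r = pearson(metric_rate, sincere_cw_efficiency)
```

With these placeholder values the relationship is exactly linear and decreasing, so r comes out at -1 (up to floating-point rounding).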

We can define utility by generating a random number on which the sincere
rankings are based, and then check how often each method elects the utility
maximizer. Across the five polling methods the correlation between this rate and
the sincere CW election rate averages .907, ranging from .855 (*River) to .937
(*FPP).
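
One way this could look in code, as a sketch only: each bloc gets a utility per candidate, its sincere ranking is the candidates sorted by that utility, and the utility maximizer is the candidate with the highest weighted total. Summing weighted utilities, and the names used, are assumptions; the post only says that the rankings are based on random numbers.

```python
def sincere_ranking(utilities):
    """Strict ranking implied by one bloc's utilities, best first."""
    return sorted(utilities, key=lambda c: -utilities[c])

def utility_maximizer(blocs_with_utils):
    """blocs_with_utils: list of (weight, {candidate: utility})."""
    totals = {}
    for weight, utils in blocs_with_utils:
        for c, u in utils.items():
            totals[c] = totals.get(c, 0.0) + weight * u
    return max(totals, key=totals.get)

# Fixed utilities for illustration; a simulation would draw them randomly.
blocs_with_utils = [
    (3, {"A": 0.9, "B": 0.5, "C": 0.1}),
    (2, {"A": 0.0, "B": 0.8, "C": 0.7}),
]
best = utility_maximizer(blocs_with_utils)
```

In this example A has the most first preferences, but B has the highest total utility (3.1 against 2.7 for A), so B is the utility maximizer.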

The correlation with the elected candidate's implicit approval score (i.e. just
adding up all votes with above-bottom rankings) averages .746. For *BordaBest it
is .962, and for *BordaWorst it's only .196. The truncation strategy under
*BordaWorst no doubt means that too many candidates are getting approved to the
point that high approval isn't indicative of a sincere CW.
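
A small sketch of the implicit approval count on truncated ballots. It assumes each voted ballot is a (weight, ranked candidates) pair and that every ranked candidate counts as above-bottom, which holds under the uniform truncation strategy because at least one frontrunner is always left unranked. The function name is hypothetical.

```python
def implicit_approval(ballots, candidates):
    """Total weight of ballots that rank each candidate at all."""
    scores = {c: 0 for c in candidates}
    for weight, ranked in ballots:
        for c in ranked:
            scores[c] += weight
    return scores

# Illustrative (made-up) truncated ballots.
ballots = [(40, ["A"]), (35, ["B", "A"]), (25, ["C", "B"])]
scores = implicit_approval(ballots, ["A", "B", "C"])
```

Here A is approved on 75 ballots, B on 60, and C on 25.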

The correlation with *voted* Condorcet efficiency is interesting. For *FPP and
*IRV it averages .888. But for *River it's only .574 and for *BordaBest it's
.472. So I guess that when the polling method is effective at "proposing" the
sincere CW as a frontrunner, then it's less important for the method to actually
*be* a Condorcet method.

In fact for each polling method there are a few methods that outperform any
Condorcet method at sincere CW efficiency, although which methods they are does
vary. The only constant is MAMPO. (Which is: Count implicit approval. Elect
purely with that unless there are multiple candidates with majority approval, in
which case elect the one of these with the best MMPO score, i.e. min max the
pairwise opposition.)
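
The MAMPO description above can be sketched roughly as follows. Assumptions: "majority" means more than half of all ballots, opposition is measured from every other candidate (not only the majority-approved ones), a ranked candidate counts as voted above an unranked one, and ties go to the earlier-listed candidate. This is an illustration, not a reference implementation.

```python
def mampo_winner(ballots, candidates):
    """ballots: list of (weight, ranked candidates), possibly truncated."""
    total = sum(w for w, _ in ballots)
    approval = {c: sum(w for w, r in ballots if c in r) for c in candidates}
    majority = [c for c in candidates if 2 * approval[c] > total]
    if len(majority) < 2:
        # Zero or one majority-approved candidate: plain approval winner.
        return max(candidates, key=lambda c: approval[c])

    def opposition(y, x):
        """Weight of ballots voting y strictly above x."""
        return sum(w for w, r in ballots
                   if y in r and (x not in r or r.index(y) < r.index(x)))

    def max_opposition(x):
        return max(opposition(y, x) for y in candidates if y != x)

    # Several majority-approved candidates: min-max pairwise opposition.
    return min(majority, key=max_opposition)

# Illustrative (made-up) truncated ballots.
ballots = [(40, ["A"]), (35, ["B", "A"]), (25, ["C", "B"])]
winner = mampo_winner(ballots, ["A", "B", "C"])
```

In this example both A (75) and B (60) have majority approval. A's worst opposition is 60 (from B) while B's is 40 (from A), so B is elected even though A has more approval.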

Next a curious thing is truncation incentive (meaning, beyond the truncation
that is part of the uniform strategy): Other than *BordaWorst (at -.340) the
correlation is positive, with *BordaBest and *River averaging .664. Note that a
scenario produced by additional truncation wouldn't be incorporated in the stats
for sincere CW efficiency (unless the same scenario happened to be generated
independently as its own scenario, which requires that scenario to be legal
under the uniform truncation strategy). So we're not directly saying that
truncation improves the efficiency, but that methods that would make someone
want to truncate are doing better.

Another metric is the frequency with which methods elect a candidate other than
the two frontrunners. The correlations are -.291 (*BordaBest), -.065 (*River),
.285 (*IRV), .455 (*FPP), and a whopping .993 for *BordaWorst. Clearly if the
frontrunners were chosen badly, electing someone else should help sincere
Condorcet efficiency.

Burial incentive has a fairly consistent correlation of around -.238, except
under *BordaWorst (.139). More interesting is DMT burial incentive specifically,
where the correlation is within about +/- .100 except for *BordaWorst, where
somehow it is .439!
I couldn't currently say why an incentive to thwart a voted CW would facilitate
electing the sincere CW. One might guess that due to the truncation strategy, the
voted CWs in some cases could be particularly bad, so that burial could improve
the result.

A high first preference count of the elected candidate has a mostly negative
correlation, ranging from -.292 to -.610 (*River), except under *BordaWorst
(.436).

Mono-add-top and Mono-raise incidents are both negatively correlated with
sincere Condorcet efficiency. The most extreme is *IRV where Mono-raise failures
have a -.728 correlation. (Around half of the methods satisfy Mono-raise.)

Plurality failures are also uniformly negatively correlated. The range is from
-.564 (*River) to -.772 (*FPP). (But only about 12 of the methods ever violate
Plurality.)

Same with compromise incentive. The correlation range is -.620 to -.786. (There
is no equal-ranking under the uniform truncation strategy, so I don't separately
measure compression strategy, nor the weak FBC.) Compromise could mean that the
sincere CW is getting "lost" beneath higher-ranked preferences.

Finally, for now, the elected candidate having a full majority pairwise loss is
quite negatively correlated with sincere Condorcet efficiency, ranging from
-.858 (*River) to -.984 (*BordaWorst). This probably makes sense since there
should always be a majority win for the frontrunner who defeats the other, and
that other could not be a sincere CW.
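
A sketch of how this check might be written, assuming that a "full majority pairwise loss" means some other candidate is voted strictly above the winner on more than half of all ballots (counting a ranked candidate as above an unranked one). The function name and ballot layout are hypothetical.

```python
def has_majority_pairwise_loss(ballots, candidates, winner):
    """True if some candidate is voted above winner on a majority of ballots."""
    total = sum(w for w, _ in ballots)
    for y in candidates:
        if y == winner:
            continue
        against = sum(
            w for w, r in ballots
            if y in r and (winner not in r or r.index(y) < r.index(winner))
        )
        if 2 * against > total:
            return True
    return False

# Illustrative (made-up) truncated ballots.
ballots = [(40, ["A"]), (35, ["B", "A"]), (25, ["C", "B"])]
a_loses = has_majority_pairwise_loss(ballots, ["A", "B", "C"], "A")
b_loses = has_majority_pairwise_loss(ballots, ["A", "B", "C"], "B")
```

Here B is voted above A on 60 of 100 ballots, so electing A would count as an incident, while electing B would not.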

Naturally since we assume the only strategy being used by anyone is truncation,
a method that possibly relies on its low truncation incentive (e.g. as a selling
point) may not shine here. For example, C//IRV and Benham did badly except when
*IRV was the polling method.

(A few such examples incidentally may suggest that a method could perform better
when paired with a matching polling method. I'm not too sure why that would be;
perhaps following the same logic twice, assuming the logic is reasonable, is
better for sincere Condorcet efficiency than if you change the philosophy
mid-way.)

After this I will look at a different case, where the sincere preferences
conform to distances between candidates and blocs on a 1D spectrum.

Kevin