[EM] Which branch of mathematics does voting theory belong to?

Thu Dec 2 09:05:54 PST 2021

To judge from the literature I would suppose that voting theory was part 
of first-order logic or of graph theory, but it seems clear that the 
right answer is Bayesian decision theory.

This is easiest to see under a spatial model, but I think it’s perfectly 
general. We have vague prior information about the attributes of voters 
and candidates (e.g. their positions in space) and about voter behaviour 
(how they will cast their ballots in the light of these attributes). We 
can condition this prior information on the observations contained in a 
set of ballots and thereby compute the posterior probability of any 
desired function of the voters' and candidates' attributes. Our aim is 
to identify the most representative candidate, where the degree of 
representativeness can be expressed through a loss function, and where, 
therefore, we will seek to identify the candidate whose posterior 
expected loss is the least. The prior knowledge of voters' positions can 
be thought of as a distribution of distributions, e.g. the voters come 
from a mixture of three identical and equally weighted circular 
Gaussians whose means come from a further Gaussian.
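
In symbols, the decision rule described above might be written as follows. 
The notation is an assumption introduced here for illustration, not taken 
from any of the papers mentioned: theta stands for all the unknown 
attributes of voters and candidates, B for the observed ballots, and 
L(c, theta) for the loss incurred by electing candidate c.

```latex
% Posterior over the unknowns, by Bayes' theorem
p(\theta \mid B) \;\propto\; p(B \mid \theta)\, p(\theta)

% Elect the candidate with the least posterior expected loss
\hat{c} \;=\; \arg\min_{c}\; \mathbb{E}\bigl[L(c,\theta) \mid B\bigr]
        \;=\; \arg\min_{c} \int L(c,\theta)\, p(\theta \mid B)\, d\theta
```

Here p(theta) carries the vague prior information (including the 
distribution of distributions for voter positions) and p(B | theta) is the 
model of voter behaviour.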

This is a constructive approach which in principle might be used to 
determine the winner of an election. We'd just need to integrate out all 
the unknowns to find the expected losses of the candidates. This was in 
essence the approach adopted by Good and Tideman in 1971 but not pursued 
further. The same view of voting theory underlies the empirical 
evaluations which have taken place subsequently: elections are sampled 
under a vague prior, and the results are assessed under an appropriate 
loss function. The only feature which conceals the decision-theoretic 
basis is the persistent use of the term 'utility' where 'loss' would be 
more correct.
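
The kind of empirical evaluation described above can be sketched in a few 
lines of Python. This is only a minimal illustration under assumptions of 
my own choosing: the prior is the three-Gaussian mixture mentioned earlier 
with arbitrary variances, candidates are drawn from the same outer 
Gaussian, voters vote sincerely for the nearest candidate, the loss is mean 
squared distance, and plurality stands in for whatever method is being 
assessed. All function names are hypothetical.

```python
import random

def sample_election(rng, n_voters=200, n_candidates=3, dim=2):
    """Draw one election from an assumed hierarchical prior."""
    # Means of three equally weighted circular Gaussians,
    # themselves drawn from a further Gaussian.
    centres = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(3)]
    voters = []
    for _ in range(n_voters):
        centre = rng.choice(centres)  # equal mixture weights
        voters.append([rng.gauss(m, 0.5) for m in centre])
    # Assumption: candidates come from the same outer Gaussian.
    candidates = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(n_candidates)]
    return voters, candidates

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_loss(candidate, voters):
    """Squared-distance loss of a candidate, averaged over voters."""
    return sum(sq_dist(candidate, v) for v in voters) / len(voters)

def plurality_winner(voters, candidates):
    """Each voter sincerely supports the nearest candidate."""
    counts = [0] * len(candidates)
    for v in voters:
        nearest = min(range(len(candidates)),
                      key=lambda j: sq_dist(v, candidates[j]))
        counts[nearest] += 1
    return max(range(len(candidates)), key=lambda j: counts[j])

def average_regret(method, n_elections=200, seed=0):
    """Mean excess loss of the method's winner over the best candidate."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_elections):
        voters, candidates = sample_election(rng)
        losses = [mean_loss(c, voters) for c in candidates]
        total += losses[method(voters, candidates)] - min(losses)
    return total / n_elections

if __name__ == "__main__":
    print(average_regret(plurality_winner))
```

The quantity reported is a regret, i.e. the winner's loss minus the loss of 
the best available candidate, so a method which always elected the 
loss-minimising candidate would score zero. Other methods can be compared 
by passing a different function in place of plurality_winner.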

Unfortunately the constructive approach seems to be numerically 
intractable in cases of interest. If the number of voters were small, the 
observations would provide probabilistic information which could be 
integrated under the prior in the normal way. But as the number of 
voters increases, the information becomes increasingly deterministic - 
it degenerates to a set of equations. Two cases therefore arise. 
Either the equations fully determine the parameters of the voter 
distribution, in which case the prior almost drops out of the 
calculation; or the equations constrain the voter parameters to a curved 
manifold over which only the prior remains to be integrated.

The former case was encountered by Good and Tideman, which is why their 
paper ended up as Bayesianism without the prior. Unfortunately their 
model (a single Gaussian) is too simple to be of interest, given the 
optimality of Condorcet methods under it.

I say that the prior 'almost' drops out of the calculation because Good 
and Tideman's parameters have a degree of freedom which is independent 
of the information in the ballots. This lies in the radial distance of 
the three candidates from the centre of the circle whose circumference 
they lie on. In general we may have prior information about this 
distance, and it may affect the candidates' losses, so it seems a 
suitable case for Bayesian treatment. But Good and Tideman adopt a 
squared-distance loss function, and under this loss function (and this 
function alone, I suspect) the radial distance is immaterial to the 
identity of the optimal candidate. (The authors claim that the same 
result applies to any loss function which depends solely on distance, 
but I believe this to be an error.) Thus Good - of all people - made the 
prior drop out completely.

It would be more interesting to adopt a Gaussian mixture prior as 
sketched above. Even then we could perform nothing more than a 
computational thought experiment. A truly realistic model would have to 
allow for an arbitrary number of Gaussians in a space of any dimension, 
and incorporate valence and random effects, and would need to allow for 
tactical voting. It's hard to imagine any useful solution being obtainable.

But if voting theory is an insoluble problem in Bayesian decision 
theory, then any voting method we encounter must be essentially ad hoc, 
even if it draws on bomb-proof reasoning from another branch of 
mathematics. At least we have the comfort of knowing that there is a 
rigorous method of evaluating the solutions which are proposed to us.

CJC