[EM] Which branch of mathematics does voting theory belong to?

Thu Dec 2 11:13:04 PST 2021

A voting distribution is a statistic. Representation is served by averages:
FAB STV: Four Averages Binomial Single Transferable Vote.

https://www.smashwords.com/books/view/806030

Profile page:
https://www.smashwords.com/profile/view/democracyscience

Richard Lung.   

On 2 Dec 2021, at 5:05 pm, Colin Champion <colin.champion at routemaster.app> wrote:

To judge from the literature I would suppose that voting theory was part of first-order logic or of graph theory, but it seems clear that the right answer is Bayesian decision theory.

This is easiest to see under a spatial model, but I think it’s perfectly general. We have vague prior information about the attributes of voters and candidates (e.g. their positions in space) and about voter behaviour (how they will cast their ballots in the light of these attributes). We can condition this prior information on the observations contained in a set of ballots and thereby compute the posterior probability of any desired function of the voters' and candidates' attributes. Our aim is to identify the most representative candidate, where the degree of representativeness can be expressed through a loss function, and where, therefore, we will seek to identify the candidate whose posterior expected loss is the least. The prior knowledge of voters' positions can be thought of as a distribution of distributions, e.g. the voters come from a mixture of three identical and equally weighted circular Gaussians whose means come from a further Gaussian.
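
A minimal sketch of the "distribution of distributions" described above, assuming two dimensions and illustrative standard deviations (the names sample_voters, prior_sd and component_sd are hypothetical, not from the message). It covers only the generative prior and a squared-distance loss evaluated with the voter positions known; it does not attempt the conditioning on ballots.

```python
import random

def sample_voters(n_voters, n_components=3, prior_sd=1.0,
                  component_sd=0.5, dim=2, rng=None):
    """Draw voters from a mixture of identical circular Gaussians
    whose means are themselves drawn from a further Gaussian."""
    rng = rng or random.Random()
    # Hyper-level draw: one mean per mixture component.
    means = [[rng.gauss(0.0, prior_sd) for _ in range(dim)]
             for _ in range(n_components)]
    voters = []
    for _ in range(n_voters):
        mean = rng.choice(means)  # components are equally weighted
        voters.append(tuple(rng.gauss(m, component_sd) for m in mean))
    return means, voters

def expected_loss(candidate, voters):
    """Mean squared distance from a candidate to the voters
    (one possible loss function, assumed here for illustration)."""
    total = sum(sum((v - c) ** 2 for v, c in zip(voter, candidate))
                for voter in voters)
    return total / len(voters)
```

With the voters known, the most representative of a set of candidates under this loss is simply the one that minimises expected_loss; the hard part discussed in the message is that the voters are not known and must be integrated out.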

This is a constructive approach which in principle might be used to determine the winner of an election. We'd just need to integrate out all the unknowns to find the expected losses of the candidates. This was in essence the approach adopted by Good and Tideman in 1971 but not pursued further. The same view of voting theory underlies the empirical evaluations which have taken place subsequently: elections are sampled under a vague prior, and the results are assessed under an appropriate loss function. The only feature which conceals the decision-theoretic basis is the persistent use of the term 'utility' where 'loss' would be more correct.
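
The empirical evaluations mentioned above can be sketched roughly as follows: sample elections under a prior, apply a voting method, and score the winner under a loss function. Everything specific here is an assumption for illustration: a single standard Gaussian for both voters and candidates, sincere plurality voting, squared-distance loss, and the hypothetical function names.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sample_election(rng, n_voters=101, n_candidates=3, dim=2):
    # Assumed prior: voters and candidates from a standard circular Gaussian.
    voters = [tuple(rng.gauss(0, 1) for _ in range(dim))
              for _ in range(n_voters)]
    candidates = [tuple(rng.gauss(0, 1) for _ in range(dim))
                  for _ in range(n_candidates)]
    return voters, candidates

def plurality_winner(voters, candidates):
    # Sincere voting: each voter supports the nearest candidate.
    tallies = [0] * len(candidates)
    for v in voters:
        nearest = min(range(len(candidates)),
                      key=lambda i: sq_dist(v, candidates[i]))
        tallies[nearest] += 1
    # Ties are broken in favour of the lowest index.
    return max(range(len(candidates)), key=lambda i: tallies[i])

def mean_loss(voters, candidate):
    return sum(sq_dist(v, candidate) for v in voters) / len(voters)

def mean_regret(method, n_elections=200, seed=0):
    """Average excess loss of the method's winner over the best candidate."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_elections):
        voters, candidates = sample_election(rng)
        losses = [mean_loss(voters, c) for c in candidates]
        total += losses[method(voters, candidates)] - min(losses)
    return total / n_elections
```

Calling mean_regret(plurality_winner) gives an estimate of how much loss the method incurs relative to the loss-minimising candidate; any other method with the same (voters, candidates) signature can be scored the same way.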

Unfortunately the constructive approach seems to be numerically intractable in cases of interest. If the number of voters were small, the observations would provide probabilistic information which could be integrated under the prior in the normal way. But as the number of voters increases, the information becomes increasingly deterministic - it degenerates to a set of equations. And therefore two cases arise. Either the equations fully determine the parameters of the voter distribution, in which case the prior almost drops out of the calculation; or the equations constrain the voter parameters to a curved manifold in which only the prior remains to be integrated.

The former case was encountered by Good and Tideman, which is why their paper ended up as Bayesianism without the prior. Unfortunately their model (a single Gaussian) is too simple to be of interest, given the optimality of Condorcet methods under it.

I say that the prior 'almost' drops out of the calculation because Good and Tideman's parameters have a degree of freedom which is independent of the information in the ballots. This lies in the radial distance of the three candidates from the centre of the circle whose circumference they lie on. In general we may have prior information about this distance, and it may affect the candidates' losses, so it seems a suitable case for Bayesian treatment. But Good and Tideman adopt a squared-distance loss function, and under this loss function (and this function alone, I suspect) the radial distance is immaterial to the identity of the optimal candidate. (The authors claim that the same result applies to any loss function which depends solely on distance, but I believe this to be an error.) Thus Good - of all people - made the prior drop out completely.
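
One standard identity shows why squared-distance loss is a special case. For voter positions v with mean \mu and covariance \Sigma, and a candidate at c (symbols introduced here for illustration), the expected loss splits into a term independent of the candidate and the squared distance to the voter mean. Whether this is the mechanism behind the radial-distance result in Good and Tideman's model is not worked out here.

```latex
\mathbb{E}\,\lVert v - c \rVert^2
  = \mathbb{E}\,\lVert v - \mu \rVert^2 + \lVert \mu - c \rVert^2
  = \operatorname{tr}(\Sigma) + \lVert c - \mu \rVert^2
```

The cross term vanishes because \mathbb{E}[v - \mu] = 0, so under this loss the ranking of candidates depends only on their distance from the voter mean. No such decomposition holds for a general function of distance.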

It would be more interesting to adopt a Gaussian mixture prior as sketched above. Even then we could perform nothing more than a computational thought experiment. A truly realistic model would have to allow for an arbitrary number of Gaussians in a space of any dimension, and incorporate valence and random effects, and would need to allow for tactical voting. It's hard to imagine any useful solution being obtainable.

But if voting theory is an insoluble problem in Bayesian decision theory, then any voting method we encounter must be essentially ad hoc, even if it draws on bomb-proof reasoning from another branch of mathematics. At least we have the comfort of knowing that there is a rigorous method of evaluating the solutions which are proposed to us.

CJC
----
Election-Methods mailing list - see https://electorama.com/em for list info