[EM] more thoughts on how to use the Voronoi cell idea for PR

Mon Dec 7 11:19:19 PST 2015

Dear Friends,

Suppose that we have a method for partitioning the ballots into Voronoi
cells relative to any subset of candidates, and that we are to elect 17
candidates (out of many more of them) to an assembly.

What numerical measure of "goodness" could we use to order the subsets of
size 17?

If W is a set of size 17, do the following steps:

(1) Partition the ballots into 17 Voronoi cells, v_1, v_2, ...,  supporting
the 17 respective candidates, c_1, c_2, ... [plus possibly one more
category of ballots that truncate all of the members of W].

(2) Let a_i be the average approval of candidate c_i within the cell v_i .

(3)  Let m = min {(a_i)(#v_i)| i < 18}

Then m is the measure of goodness that we are seeking.

[In case of tied values of m, look at the next-to-smallest products of a
and #v, etc. (by "lexicographical order" of the sorted multi-sets of
products).]
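
A minimal Python sketch of steps (1)-(3) and the tie-break, under
assumptions not stated above: each ballot is a dict mapping candidate
names to scores in [0, 1], a missing candidate counts as truncated, and a
ballot falls in the cell of the member of W it scores highest (ties go to
the candidate listed first in W).  The names voronoi_cells and
goodness_key are hypothetical.

```python
from typing import Dict, List, Sequence

Ballot = Dict[str, float]  # candidate -> score; a missing key means truncated


def voronoi_cells(ballots: Sequence[Ballot], W: Sequence[str]) -> Dict[str, List[Ballot]]:
    """Step (1): assign each ballot to its highest-scored member of W.

    Assumed rule: ties go to the candidate listed first in W. Ballots that
    truncate every member of W are left out of all cells.
    """
    cells: Dict[str, List[Ballot]] = {c: [] for c in W}
    for b in ballots:
        rated = [c for c in W if c in b]
        if not rated:
            continue  # this ballot truncates all of W
        best = max(rated, key=lambda c: b[c])
        cells[best].append(b)
    return cells


def goodness_key(ballots: Sequence[Ballot], W: Sequence[str]) -> List[float]:
    """Steps (2)-(3): the sorted products a_i * #v_i.

    The first entry is m. Comparing two keys as Python lists applies the
    lexicographical tie-break on the sorted products.
    """
    cells = voronoi_cells(ballots, W)
    products = []
    for c in W:
        cell = cells[c]
        a = sum(b[c] for b in cell) / len(cell) if cell else 0.0
        products.append(a * len(cell))
    return sorted(products)
```

With this sketch, max(candidate_sets, key=lambda W: goodness_key(ballots, W))
would pick the best set from a given collection of candidate sets.
Enumerating all sets of size 17 is only feasible when the candidate pool
is small.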

There are other possibilities for the definition of a_i in (2):  for
example, (2') let a_i be the worst rating of candidate c_i among the
ballots in the top seventeenth quantile of support for candidate c_i .

We could adapt (2') to rankings if (2'') we allowed equal rankings and
counted them according to the "equal rankings whole" convention.

If we adopted (2') or (2'') we would not need to know the precise ballots
in the respective Voronoi cells.  All we would need would be a good
estimate of the cardinality of those cells, and the quantiles determined
from sorting all of the ballots in order of support of the respective
candidates (i.e. 17 separate sortings, one for each candidate).
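
One possible reading of (2'), sketched in Python.  The assumptions here
are mine: "top seventeenth quantile" is taken to mean the 1/17 of all
ballots that rate the candidate highest, a truncated candidate counts as
a rating of 0, and the function name quantile_rating is hypothetical.

```python
import math
from typing import Dict, Sequence

Ballot = Dict[str, float]  # candidate -> rating; a missing key means truncated


def quantile_rating(ballots: Sequence[Ballot], candidate: str, seats: int = 17) -> float:
    """Worst rating of `candidate` among the top 1/seats of ballots.

    Ballots are sorted by their rating of `candidate`, highest first;
    truncation is treated as a rating of 0.0 (an assumption).
    """
    ratings = sorted((b.get(candidate, 0.0) for b in ballots), reverse=True)
    if not ratings:
        return 0.0
    k = max(1, math.ceil(len(ratings) / seats))  # size of the top quantile
    return ratings[k - 1]
```

Each candidate needs only one sort of the ballots, which matches the
"17 separate sortings" noted above.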

The sizes of the respective random ballot probabilities restricted to W
are good estimates of the relative cardinalities of the Voronoi cells.  If
we multiply these probabilities by the number of ballots that do not
truncate every member of W, we get a good estimate of the cardinalities
that we need.
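
A small Python sketch of this estimate, assuming score-style ballots
(dicts mapping candidates to numbers, with a missing key meaning
truncated) and assuming that a ballot with several equal favorites in W
splits its weight evenly among them.  The function names are
hypothetical.

```python
from typing import Dict, Sequence, Tuple

Ballot = Dict[str, float]  # candidate -> score; a missing key means truncated


def random_ballot_probs(ballots: Sequence[Ballot], W: Sequence[str]) -> Tuple[Dict[str, float], int]:
    """Random-ballot winning probabilities restricted to W.

    Returns the probabilities and the number of ballots that do not
    truncate every member of W. Equal favorites split one ballot's weight
    evenly (an assumed convention).
    """
    weight = {c: 0.0 for c in W}
    n_active = 0
    for b in ballots:
        rated = [c for c in W if c in b]
        if not rated:
            continue  # truncates every member of W
        n_active += 1
        top = max(b[c] for c in rated)
        favorites = [c for c in rated if b[c] == top]
        for c in favorites:
            weight[c] += 1.0 / len(favorites)
    probs = {c: (weight[c] / n_active if n_active else 0.0) for c in W}
    return probs, n_active


def estimated_cell_sizes(ballots: Sequence[Ballot], W: Sequence[str]) -> Dict[str, float]:
    """Probability times the number of non-truncating ballots, per candidate."""
    probs, n_active = random_ballot_probs(ballots, W)
    return {c: p * n_active for c, p in probs.items()}
```

The estimated sizes always sum to the number of ballots that rate at
least one member of W.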

It doesn't matter too much if those random ballot probabilities are based
on ranked preferences, approvals (implicit or otherwise), or explicit
scores.

What do yawl think?

My Best,

Forest

On Mon, Dec 7, 2015 at 10:12 AM, Forest Simmons <fsimmons at pcc.edu> wrote:

> Kristofer,
>
> I'm glad you asked this question.  It shows keen insight and a desire to
> explore the possibilities of this idea.
>
> Let's call the space in question, "issue space." Then suppose that the
> following condition is satisfied:  If a voter is closer to candidate X than
> to candidate Y in issue space, then the voter will rank candidate X above
> candidate Y (or else truncate both of them).
>
> Under this condition the ballots that express a preference for X over any
> other candidate are the ballots of the voters that are closer to X than to
> any other candidate.
>
> So if this condition holds, we don't need to know the precise distances
> themselves in order to calculate the number of voters (or ballots) in the
> respective Voronoi cells.
>
> However, this condition involves several assumptions, including sincere
> rankings.  It could be to our advantage to use a more subtle way of
> estimating the number of voters/ballots in the respective Voronoi cells,
> based on various possible metrics of distances between ballots.
>
> In the case of range ballots, the L1 distance, or the L2 distance, both of
> which reduce to the Hamming distance in the case of Approval ballots,
> naturally suggest themselves.  However, neither of these is very
> satisfactory, because a large clone set will exaggerate the influence of
> that set on the metric.
>
> There are ways to eliminate this clone dependence defect.  One way is to
> do a "Singular Value Decomposition" of the matrix whose rows are the score
> vectors from the ballots, and then use the singular vectors corresponding to
> the significant singular values as a basis for the issue space, etc.
>
> If you are interested, I could tell you about my idea of a "Poor Man's
> SVD" that is computationally easier than the standard SVD, and even better
> adapted to the task of elimination of clone dependence in this context.
>
> It turns out that with either of these methods it is relatively easy to
> pinpoint the positions of the respective candidates in issue space
> independent of their publicly announced personal score ballots.  This is
> important for keeping insincere candidate posturing from manipulating the
> results.
>
> Furthermore, if this approach is taken, you can supplement (or even almost
> entirely replace) the ballots with questionnaires on all of the issues, in
> order to find the distances needed for defining the Voronoi cells.  Again
> the distances in question need to be decloned.  But that is precisely the
> problem solved by SVD analysis in the context of taxonomy, face
> recognition, etc: in calculating the "distance" between faces, for example,
> you take as many measurements of distances between different features of
> the face as you desire.  Some of these measurements will be highly
> correlated.  We can think of these highly correlated measurements as
> "clone" measurements.  The gold standard for "de-cloning" these sets of
> measurements in this context is the SVD.
>
> If the Voronoi cell members are calculated on the basis of both the
> expressed preferences, and some subtle metric, then the degree of
> concordance between the two results will be a measure of the sincerity of
> the preference ballots.
>
> I'm sure I have generated more questions than I have answered, but I think
> they are mostly questions worth looking into.
>
> My Best,
>
> Forest
>
>
> On Sun, Dec 6, 2015 at 1:22 AM, Kristofer Munsterhjelm <
> km_elmet at t-online.de> wrote:
>
>> On 12/06/2015 02:48 AM, Forest Simmons wrote:
>> > The idea of converting lottery methods into PR methods has been around
>> > for a long time.  The obvious idea that has been tried over and over in
>> > some form or another is to run the lottery on the entire set of ballots
>> > with the entire set of candidates, and then elect the set of w
>> > candidates with the greatest winning probabilities.  Since that doesn't
>> > work very well, either we have resorted to allowing the winners to carry
>> > weights with  them into the assembly, or we have gone back to sequential
>> > methods with Droop quotas, etc.
>> >
>> > I think a much better idea (that seems to have been entirely
>> > over-looked) is to not run the lottery on the entire set of candidates,
>> > but to run it on all of the subsets of size w, and choose the most
>> > satisfactory of these subsets.
>> >
>> > There are three things that would make a subset satisfactory:
>> >
>> > (1)  The  candidates in the set should not have too much difference in
>> > the probabilities assigned by the lottery to their subset.  In other
>> > words the least probability should be as large as possible.  In other
>> > words, the entropy of the probability distribution should be as large as
>> > possible.
>> >
>> > (2)  The candidates should have as high average ratings among their
>> > supporters as possible.
>> >
>> > (3)   There should be few if any ballots in the set that truncate the
>> > entire subset.
>> >
>> > My suggestions are an attempt to incorporate all of these ideals into a
>> > way of comparing subsets of the appropriate size.
>> >
>> > Here's a simple example based only on ranked preference ballots that
>> > illustrates the main idea:
>> >
>> > Suppose that we want to compare two subsets W and W' of the requisite
>> > size w.
>> >
>> > Let b_i be the number of ballots that rank candidate c_i  of the set W
>> > above every other candidate of W.  These numbers are the sizes of the
>> > Voronoi (or Dirichlet) regions for the respective candidates of W
>>
>> In what space are these Voronoi regions embedded? (I assume you mean the
>> Voronoi region for X is the region of points that are closer to X than
>> they are to any other candidate point.)
>>
>
>