[EM] High precision Yee diagrams
Kristofer Munsterhjelm
km-elmet at broadpark.no
Sat Jan 22 14:44:16 PST 2011
Ordinary Yee diagrams are constructed by, for each point, producing a
Gaussian distribution centered on that point, and have each sample of
the Gaussian vote on candidates in order of preferences. This approach
will always contain some noise, since the sample size is finite.
Some times, having this noise is useful, since it lets us see how the
voting method acts in the presence of actual noise. Other times, though,
it might be useful to see how the voting method acts when there is no
noise, to understand what instances of strange behavior, such as
monotonicity oddities with IRV, don't disappear as you add more and
more voters who vote in the same manner.
Thus, here's a suggestion for how to make high precision Yee diagrams.
The primary idea is that instead of sampling from a Gaussian, we can
simply determine the volume of the Gaussian within a certain region -
the region of voters that all have the same ranking.
This method would have two steps. In the first step, we determine the
regions (convex polygons) where every voter inside the region submits
the same rating. This step has to be done only once per Yee diagram. In
the second step, we center the Gaussian on a given point (the point
corresponding to the pixel whose color is to be determined), calculate
the volume of that Gaussian within each same-voter region, and use each
region's volume as the weighting for the region's associated ranking.
The second step can be done quite simply. Solve the double integral for
a Gaussian (with input variable center and standard deviation) with
respect to a region of the shape of a right angle triangle where two of
the legs are parallel to the axes. Then, to determine the volume over an
arbitrary triangle, rotate the axes so that the longer line is parallel
to one of the axes, and split the triangle into two right angle
triangles of the above form. To determine the volume over an arbitrary
convex polygon, just triangulate it and add up the triangle integrals.
This leaves the first step. At first glance, that seems to be
prohibitive. If we have n candidates, there are n! possible ways to rank
them, and so one would think that the combinatorial explosion would make
the algorithm completely impractical. I thought so, too, but it seems
that dimensionality significantly limits the explosion. For instance, if
there are regions belonging to candidates A, B, and C, so that the
distance from a given point in A to the closest point in C is always
greater than the distance from that point to the closest point in B,
then there can be no "A > C > B" voters in the A region (or at all, for
that matter).
Though I think it's possible to contrive examples where the number of
distinct ballots grow superpolynomially with respect to the number of
candidates, I think such examples will be unlikely in practice. I have
no proof either way, though, but if any of you have, go ahead and
mention it :-)
The first step is recursive. Initially, the whole region is assigned to
"*" (i.e. no preference), and is defined by a bounding box. For each
region, we construct the Voronoi diagram of all candidates but the ones
already specified (which in the initial case is nobody, so we build the
whole Voronoi diagram). This divides the space into a number of smaller
regions, to which the same division can be applied: e.g. we find the *
region to contain an A region, and the A region to contain a B and C
region (turning the A region into an "A > B" region and an "A > C"
region), and so on. We end the recursion when all candidates have been
added to every ranking, and are left with a data structure of the form:
A > B > C: (vertex coordinates of the convex polygon for all voters
voting this ranking)
A > C > B: (vertex coords for "A > C > B")
B > A > C: (vertex coords for "B > A > C")
C > A > B: (vertex coords for "C > A > B")
so that actually determining how many (up to scaling) vote for each
ranking is as simple as getting the integral of the polygon with respect
to a Gaussian centered on the pixel in question.
-
That should work. It would take some time programming, but it should
work. It wouldn't work for cardinal methods, though, since the rating
should in that case somehow reflect the distance between each Gaussian
sample and the candidates, and since this doesn't use sampling at all,
there is no sample to get the distance of.
It might be possible to define mean distances and get ratings from that
(e.g. center of gravity or somesuch) but only for cardinal systems where
one ballot that gives rating x for candidate A along with another that
gives rating y for candidate A is equal to one ballot that gives (x+y) to A.
If there's anything I've missed, or if you have any comments, go ahead!
More information about the Election-Methods
mailing list