[EM] A computationally feasible method (algorithmic redistricting)

Wed Sep 3 12:35:20 PDT 2008

Raph Frank wrote:
> On Tue, Sep 2, 2008 at 11:00 PM, Kristofer Munsterhjelm
> <km-elmet at broadpark.no> wrote:
>> The reasonable thing to use would be Euclidean distance, since that makes
>> sense, given the geometric nature of the districting problem. If you want to
>> be even more accurate, you can use great circle distance instead to reflect
>> that the districts are on the (near-)spherical Earth, but at the scales
>> involved, the difference would be slight (and so it would be another
>> "dotting the i-s" refinement :-)
> 
> My splitline software used a sphere (and gnomonic projection for the
> lines).  However, I think it wasn't necessary, but wanted it to be as
> accurate as possible.
> 
> OTOH, I think that it might be better to define the map as a 2d map
> using Mercator projection.  The problem with the sphere is that each
> point is a real number.
> 
> If the data is presented as (longitude, latitude), then the numbers
> that are input into the algorithm are rational numbers as they are
> given with a certain number of digits after the decimal point.  This
> allows an exact solution to be found.  However, with reals, different
> precisions could give different answers.  Also, if there is a tie, it
> may not be possible to determine that one occurs.  This is less of a
> problem if a specific algorithm for determining the result is used.
> 
> I think splitline can be solved exactly if the map is 2-d and all
> coordinates in the input data are rational numbers.

If you use a Mercator projected map, you're just hiding the 
quantization. All maps have some distortion, and since the map 
projection uses trigonometric functions, you can just use the Haversine 
distance directly. If you need the precision and an exact measurement of 
error, you could use a rational number class with sine and cosine 
approximation tables (or Taylor series), but I think real error like the 
Earth not being perfectly spherical will get you before the rounding 
errors do.

> Back to to the Voronoi diagrams, I think you may have misunderstood
> what I meant.
> 
> The issue here is that ordinarily, it doesn't matter what power you use.
> 
> If you colour each pixel the colour of the nearest point, then you get
> a standard Voronoi  diagram.
> 
> Likewise, if you say that you colour each pixel the colour of the
> point with the lowest (distance)^2, then you get the same diagram.
> 
> If you square all the distances, then the lowest distance is still the lowest.
> 
> All cells will have straight lines as their edges.
> 
> However, if I say that you should colour the pixel the colour of the
> nearest point, but that the distance to point 1 is to be decreased by
> 100km, then you would expect that the cell with point 1 as its centre
> would be increased in size as some pixels which went to other cells,
> would now go to point 1's cell (as its distance has been decreased).
> 
> If you look at the results, then you would see that cell 1 no longer
> has straight lines as its boundary.
> 
> Now, if instead, you assign the pixels to the cell with the nearest
> distance squared and apply an offset to point 1, then you still
> maintain straight line edges.
> 
> Also, it has the nice feature that you can work out the square distance as
> 
> (x0-x1)^2 + (y0-y1)^2 + Cn
> 
> You save a square root which takes a long time to calculate.

I see. I thought you were talking about how to calculate the distance in 
the first place. Since squaring and square roots are monotonic, if you 
have squared distance = (x0-x1)^2 + (y0-y1)^2, picking the maximum or 
minimum distance would be the same as picking the maximum or minimum 
squared distance. Weighting would differ, of course, as you note.

Power diagrams on Euclidean distance are still convex polyhedra, though, 
to my knowledge.

> That last one might be ideal for districting.  It would allow a city
> to be a circle which surrounded by a rural district.

The surrounding district would score pretty badly on the all-pairs 
distance measure, and also on a similar convexity measure (given as the 
probability of the line between two random points being entirely inside 
the district, or where it is outside of the district, being so only over 
water or outside of the state).

> For example, if you had a State with 2 cities and 3 seats, it might be
> best to split the state into 2 circle districts centred on the cities
> and 1 rural district which is everyone else.
> 
> Most other methods can't handle having one district as an island
> contained in another.

If it's best, the earlier measures are not adequate to discover it.

>> Also note that it's possible to find the borders of the Voronoi cells (for
>> the Euclidean metric, at least) much quicker than doing a nearest-neighbor
>> search on every single pixel. Quantization brought on by the varying sizes
>> of census blocks may complicate matters, though.
> 
> Yeah, most of the automatic methods assume that the 'population' is
> uniform density.
> 
> What is nice about the reweighted version is that you can expand and
> contract a region without having to move the points.
> 
> If you increase a region's weight, it gets smaller.

Within limits, of course. If you have districts A, B, and C, next to 
each other, then increasing the size of B will make it "eat into" both A 
and C. Thus altering the size of B (to transfer population from A to B, 
say) would also impact C. Trying to correct this by adjusting C's weight 
could in turn impact a D district (and perhaps A as well, if D borders E 
which borders... which borders Z which borders A).

>> Your reference to a Cd variable to get population proportionality is
>> interesting. I think part of the problem that you're trying to fix here
>> comes from that clustering (such as by Lloyd's algorithm) optimizes for
>> energy instead of mass. Or in other words, it finds districts so that the
>> average distance for each voter to the closest center is minimized (energy)
>> rather than to find districts that have equal proportions of people within
>> their borders (analogous to equal mass, if each voter is a point of unit
>> mass).
> 
> Maybe both the Cd and the position of each point could be moved.
> 
> I think for (nearly) any given set of points, it is possible to create
> a set of Cd values that will give near equal population.
> 
> The algorithm could be something like:
> 
> 1) Pick initial points (some rule)
> 2) Determine set of Cds to give equal populations
> 3) Move points towards geographic centres of districts
> 4) goto 2) until points have converged
> 
> This probably doesn't always converge, so a method would need to be
> created which always converges.

I did just that today with a toy program that "redistricts" the entire 
world. It does seem to converge, but it's not good enough (to use for 
redistricting purposes) if the courts demand +/- 1 proportionality of 
the districts, as the fryer paper states (footnote 12).

>> Perhaps the randomness complaint could be diminished by having a default
>>  map drawn according to some rules set in law. The redistricting law could
>> refer to that rule, where the rule is very simple - perhaps splitline? Then
>> a state commission or similar could draw the default map so that one is
>> ensured that the results won't be too bad if the redistricting fails. As
>> long as the rule is neutral, the state can't rig the default map, either.
> 
> Perhaps, they would have to beat the default map by some measure.
> 
> One option would be to say that at least 80% of the residents assigned
> to a given district must also be assigned to that district after the
> commission has refined the map to take into account whatever
> aesthetics are appropriate.

The commission would be more robotic than that. The idea I considered 
was to have the default map drawn in some simple manner - splitline 
would work. Then that map becomes the reference for all future maps in 
the "contest". Any map that doesn't strictly improve all relevant 
measures with respect to the default map is excluded from consideration. 
That way, the people can be sure that whatever happens, the map will be 
at least no worse than the default.

The maximum change limitation would make sense, since it'd keep the 
districts from wildly oscillating, but if you set the limit too high, 
the districting process may get locked into a local optimum.