[EM] "An impractical suggestion for redistricting" just got even less practical

Fri Nov 22 05:13:46 PST 2024

So, lately I've been implementing the hard capacitated k-means linear 
programming algorithm that I suggested could be used for redistricting.

I was surprised when my program would produce non-contiguous regions 
contrary to how I had earlier argued, but I think I understand why.

Suppose that we have a region with a high density point - say a city - 
somewhere near the center. Let district A surround the city and district 
B be close to it, but to the west (say). A is full, B would be full if 
it could assign the city to itself. But B is separated from the city by 
A, and B can't assign the area between itself and the city without going 
over capacity once the city is included. Then it might be the case that, 
though B could get around this by giving up more territory to A, the 
somewhat populous territory of B is far away from A; so it pays more to 
just create a discontinuity and assign the city to B directly.

This is kind of a bummer because I was hoping to find an unambiguous 
best contiguous/compact solution for any given center using only linear 
programming.

I think I can enforce districts to be connected by adding a constraint 
set of the type:
	for each district d:
		for each point a, and each neighboring point b further
		 away from d than a is:
			assign[d, a] >= assign[d, b]

But this could still create spindly non-convex regions or "Winston's 
hiccup" type artifacts.

The most straightforward way to enforce compactness is to use the 
definition of convexity:
	for all points a, b, c that form a line:
		for every district center d:
			assign[d, b] >= min(assign[d, a], assign[d, c])

but that can't be done with linear programming alone. Enforcing a 
greater than constraint against a minimum is only possible if the 
minimum term is in the objective function; and that would drive all 
assignments to 1/n.

Janáček suggests a pairwise way to check for compactness, by seeing if 
two points assigned to different districts could have their assignments 
swapped and both distances to the closest centers improved:
	for all points a, b:
		for all district centers d_1, d_2:
			if a is assigned to d_1 and b to d_2:
				dist(a, d_1) + dist(b, d_2) <= dist(a, d_2) + dist(b, d_1)

(https://www.tandfonline.com/doi/pdf/10.3846/1648-4142.2009.24.274-282)

But I don't think this can be expressed in linear terms either, when we 
don't know in advance what is going to be assigned to each district. And 
the number of constraints required would seem to be much too high for 
nonlinear (integer) programming to be feasible for reasonable point sizes.

So far, I've had some luck first solving a problem where population 
constraints aren't absolute, but instead the objective function gets a 
penalty for exceeding it; and then taking the best centers from 
repeatedly solving that problem and applying these to a problem with 
absolute population constraints.

I can give some examples later, once I've got rendering working, if any 
of you are interested. Elegant, though, it is not.

-km