[EM] Algorithmic redistricting: practical or not?
Kristofer Munsterhjelm
km-elmet at munsterhjelm.no
Tue Feb 4 13:39:56 PST 2025
So I got myself entangled in redistricting back in November, after
writing a post here about how solving a related problem (that I thought
was the same as districting) could be done through linear programming,
if the centers were given.
Some months and lots of code later, I thought I'd post about what I
found out.
The summary, first:
- Bad news: the linear program I gave doesn't ensure contiguity or
convexity, and my attempts to force it were too expensive.
- Good news: the linear program isn't useless and can be used as a
component in a branch-and-bound-like approach. Once it has found a good
choice of centers, a local search can be used to find a convex solution.
- All of this is much harder than just using proportional
representation, so it's mostly an intellectual exercise.
To recap, the linear program is:
For centers i = 1...n, with centerx_i, centery_i being the (x,y)
coordinates of center i
minimize the sum over centers i, and points x, y inside the state:
(1) assign[i, x, y] * (pop(x, y) + eps) * dist^2(centerx_i,
centery_i, x, y)
subject to
(2) for all i, x, y: 0 <= assign[i, x, y] <= 1
(3) for all i: (sum over x, y: assign[i, x, y] * pop(x, y)) - tpop/n = 0
(4) for all x, y: (sum over i: assign[i, x, y]) = 1
(1) defines the problem: minimize expected squared distance from a
random inhabitant to their assigned center. assign[i, x, y] is a free
variable that is one if the point (x,y) is assigned to center i and
zero otherwise, and pop(x, y) is the number of people who reside
closer to (x,y) than to any other point. The epsilon value eps ensures
that even a point with no population is assigned to its closest
center.[1]
(2) defines the range of the assign values.
(3) implements the population constraint that every district has equal
population.
(4) requires that every point be assigned to some district.
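As a concrete illustration, here's a small sketch of this LP in Python
using scipy's HiGHS interface. The grid, populations, and center
coordinates are made-up toy values, not real data:

```python
# Toy sketch of the assignment LP with fixed centers.
import numpy as np
from scipy.optimize import linprog

# A 4x4 grid of points with random populations, and n=2 fixed centers.
rng = np.random.default_rng(1)
points = np.array([(x, y) for x in range(4) for y in range(4)], dtype=float)
pop = rng.integers(1, 100, size=len(points)).astype(float)
centers = np.array([(0.5, 0.5), (2.5, 2.5)])
n, m = len(centers), len(points)
eps = 1e-6

# (1) objective: assign[i,p] * (pop(p)+eps) * squared distance to center i
d2 = ((points[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
c = ((pop + eps) * d2).ravel()          # variable order: (i, p) row-major

# (3) equal population per center, (4) every point fully assigned
A_eq = np.zeros((n + m, n * m))
b_eq = np.zeros(n + m)
for i in range(n):                      # population constraints
    A_eq[i, i * m:(i + 1) * m] = pop
    b_eq[i] = pop.sum() / n
for p in range(m):                      # assignment constraints
    A_eq[n + p, p::m] = 1.0
    b_eq[n + p] = 1.0

# (2) bounds 0 <= assign <= 1 are handled by linprog's bounds argument.
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1), method="highs")
assign = res.x.reshape(n, m)
print(assign @ pop)                     # per-district populations, equal
```

Note that assign is allowed to be fractional here; that's exactly why
the LP is cheap to solve, and also why nothing forces the resulting
districts to be contiguous.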
The linear program does solve the problem as specified. However, the
resulting districts aren't contiguous.
You'd think they would be: if district A is assigned a point that lies
inside district B, shouldn't we be able to improve the solution for
both by swapping this assignment with points closer to A? The problem
is the population constraint. Sometimes it's cheaper to skip points,
because otherwise a bunch of other points would have to change to keep
the populations equal, and the sum of these changes could be worse.
So instead you need additional constraints, and these are expensive.[2]
That's the bad news.
The good news is that it's actually not as bad as you'd think, because
the linear program I gave provides a good lower bound for a contiguous
solution. Furthermore, with integer programming, it's possible to add a
large number of candidate points and have the solver choose only some of
these as centers. For instance, we may pick 150 candidate centers with
something like k-means++, and then use an integer program to say it
should choose 8 of them, like this:
minimize the sum over centers i, and points x, y inside the state:
assign[i, x, y] * (pop(x, y) + eps) * dist^2(centerx_i, centery_i, x, y)
subject to
(2b) for all i, x, y: 0 <= assign[i, x, y] <= chosen[i]
for all i: (sum over x, y: assign[i, x, y] * pop(x, y)) - tpop/k *
chosen[i] = 0
for all x, y: (sum over i: assign[i, x, y]) = 1
(5) sum over i: chosen[i] = k
free variables: assign[i, x, y] continuous, chosen[i] binary
(2b) says: if chosen[i] is zero, then nobody can be assigned to center
i. Otherwise, points may be assigned to center i.
(5) forces exactly k districts to be chosen. The population constraint
(3) has also been updated accordingly: if chosen[i] is zero, then
population must be zero.
(Thanks to Joshua Boehme for this idea.)
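A toy sketch of this integer program, again with made-up data: it uses
scipy.optimize.milp with the chosen[i] variables marked integral, and
the candidate centers are just four corners rather than k-means++
picks.

```python
# Toy sketch of the center-selection MILP.
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

rng = np.random.default_rng(2)
points = np.array([(x, y) for x in range(3) for y in range(3)], dtype=float)
pop = rng.integers(1, 100, size=len(points)).astype(float)
cands = np.array([(0., 0.), (0., 2.), (2., 0.), (2., 2.)])  # candidates
n, m, k = len(cands), len(points), 2
eps, tpop = 1e-6, pop.sum()

d2 = ((points[None, :, :] - cands[:, None, :]) ** 2).sum(axis=2)
# Variables: assign[i,p] (continuous), then chosen[i] (binary).
c = np.concatenate([((pop + eps) * d2).ravel(), np.zeros(n)])

rows = []
# (2b) assign[i,p] - chosen[i] <= 0
for i in range(n):
    for p in range(m):
        r = np.zeros(n * m + n); r[i * m + p] = 1; r[n * m + i] = -1
        rows.append((r, -np.inf, 0.0))
# (3, updated) sum_p assign[i,p]*pop(p) - (tpop/k)*chosen[i] = 0
for i in range(n):
    r = np.zeros(n * m + n)
    r[i * m:(i + 1) * m] = pop; r[n * m + i] = -tpop / k
    rows.append((r, 0.0, 0.0))
# (4) each point fully assigned
for p in range(m):
    r = np.zeros(n * m + n); r[p:n * m:m] = 1
    rows.append((r, 1.0, 1.0))
# (5) exactly k centers chosen
r = np.zeros(n * m + n); r[n * m:] = 1
rows.append((r, float(k), float(k)))

A = np.array([row for row, _, _ in rows])
lb = np.array([l for _, l, _ in rows])
ub = np.array([u for _, _, u in rows])
integrality = np.concatenate([np.zeros(n * m), np.ones(n)])

res = milp(c, constraints=LinearConstraint(A, lb, ub),
           integrality=integrality, bounds=Bounds(0, 1))
chosen = res.x[n * m:].round().astype(int)
print(chosen, res.fun)   # which candidates were picked, and the lower bound
```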
When run, the program will find the best k centers and give, as its
objective value, a lower bound on the sum of squared distances that
we're actually interested in. If this is worse than the current record,
we just discard the solution as it can't be better anyway; otherwise, we
check it more closely with contiguity constraints, which will return the
true objective value.
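The resulting pruning loop might be sketched like this, where
solve_relaxed and solve_contiguous are hypothetical stand-ins for the
cheap lower-bound program and the expensive contiguity-aware
evaluation:

```python
# Sketch of branch-and-bound-style pruning using the LP lower bound.
def search(candidate_center_sets, solve_relaxed, solve_contiguous):
    record, best = float("inf"), None
    for centers in candidate_center_sets:
        bound = solve_relaxed(centers)       # cheap relaxed lower bound
        if bound >= record:
            continue                         # can't beat the record: prune
        value, solution = solve_contiguous(centers)  # true objective
        if value < record:
            record, best = value, solution
    return record, best
```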
This seems to work reasonably well in theory, but the datasets are much
too large to run contiguity-constrained programs on. Instead I use a
method from Janáček and Gábrišová[2], which relies on the following
property: suppose we start with some given centers,
(centerx_i, centery_i) for i = 1..k
Then there exist weights w_i so that the "ordinary" k-means clustering
(Voronoi clustering) objective
minimize sum over i, x, y: assign[i, x, y] * (pop(x, y) + eps) *
(dist^2(centerx_i, centery_i, x, y) + w_i)
will ensure that the districts have equal populations. (This makes
intuitive sense: if w_i is very large, district i is penalized and
shrinks relative to the other districts.) And this "uncapacitated" or
unconstrained problem can be easily solved by checking each point's
value for each district and picking the best.
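In code, this unconstrained weighted assignment is just a per-point
argmin. A minimal sketch with illustrative data:

```python
# Additively weighted Voronoi assignment: each point goes to the
# center minimizing squared distance plus that center's weight w_i.
import numpy as np

def weighted_assignment(points, centers, w):
    """Return, for each point, the index of its best center under
    dist^2(center, point) + w[center]."""
    d2 = ((points[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
    return np.argmin(d2 + w[:, None], axis=0)

points = np.array([(x, y) for x in range(5) for y in range(5)], dtype=float)
centers = np.array([(1., 2.), (3., 2.)])
# With equal weights this is plain Voronoi/k-means assignment:
print(weighted_assignment(points, centers, np.zeros(2)))
# A large w_0 penalizes center 0 and shrinks its district:
print(weighted_assignment(points, centers, np.array([5.0, 0.0])))
```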
So I ended up with a two-step procedure: use integer programming to find
reasonably good centers from a sufficiently large, randomly selected
candidate pool, and then use a local search to update the weights w_i
until the districts had nearly equal populations, thus enforcing
contiguity and convexity.[3][4]
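A hedged sketch of such a weight search, using a simple proportional
update rule of my own choosing (not necessarily the scheme the post or
the Janáček and Gábrišová paper uses), on illustrative random data:

```python
# Local search on the weights w_i: raise the weight of over-populated
# districts and lower it for under-populated ones until the weighted
# Voronoi assignment is nearly population-balanced.
import numpy as np

def balance_weights(points, pop, centers, iters=2000, step=0.05):
    k = len(centers)
    d2 = ((points[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
    w = np.zeros(k)
    target = pop.sum() / k
    for _ in range(iters):
        labels = np.argmin(d2 + w[:, None], axis=0)
        dist_pop = np.bincount(labels, weights=pop, minlength=k)
        if np.abs(dist_pop - target).max() / target < 1e-3:
            break
        # Proportional update: my own simple choice of rule.
        w += step * (dist_pop - target) / target
    return w, labels

rng = np.random.default_rng(3)
points = rng.uniform(0, 10, size=(400, 2))
pop = rng.integers(1, 50, size=400).astype(float)
centers = np.array([(2., 5.), (6., 5.)])   # deliberately off-balance
w, labels = balance_weights(points, pop, centers)
sizes = np.bincount(labels, weights=pop, minlength=2)
print(sizes, pop.sum() / 2)
```

Exact equality isn't reachable this way, since census units are
atomic, but the imbalance shrinks to roughly the size of a single
boundary unit's population.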
In practice, there were a bunch of additional problems due to census
district shapes, non-Euclidean geometry, and how to handle large amounts
of data quickly. But I won't go into those here; perhaps later, as this
post is long enough.
-km
[1] I made a mistake when I stated this back at the end of October:
the objective value should be weighted by population, not just sum of
squared distances. Otherwise every district wants to be exactly the same
size as the others, which leads them all to carve inwards to cut shares
of cities. See the unweighted objective picture here:
https://github.com/kristomu/redistricter/tree/master/undesired_outcomes
[2] See JANÁČEK, Jaroslav; GÁBRIŠOVÁ, Lýdia. A two-phase method for the
capacitated facility problem of compact customer sub-sets. Transport,
2009, 24.4: 274-282.
[3] As far as I understand, it's possible to use a postprocessing phase
to force exact equality. The Janáček and Gábrišová paper gives more detail.
In my tests for Colorado, the local search converged to around 0.02%
difference in population between the most populated district and the
least, so I didn't implement it.
[4] There may be better ways to find the weights than local search;
see, e.g., SPANN, Andrew; KANE, Daniel; GULOTTA, Dan. Electoral
redistricting with moment of inertia and diminishing halves models.
The UMAP Journal, 2007,
https://rangevoting.org/SpannGulottaKaneMCM2007.pdf, particularly
pages 9 and 10.