[EM] "An impractical suggestion for redistricting", redux

Wed Oct 23 14:03:38 PDT 2024

In June 2022, I wrote a post about a way to optimally assign people to 
districts as long as the district centers are known beforehand.

Let the "region" be the area to be districted.
Then for notation let centerx_i and centery_i be the ith district's 
center x and y coordinates respectively; let pop[x, y] be the population 
at coordinates (x,y), and let tpop be the total population. The 
algorithm is to essentially solve the following linear problem:

minimize sum over coordinates x,y in the region:
	sum over districts i = 1..n:
(1)		assign[i, x, y] * dist^2(centerx_i, centery_i, x, y)

subject to
(2)	for all i, x, y: 0 <= assign[i, x, y] <= 1
(3)	for all i: (sum over x, y: assign[i, x, y] * pop[x, y]) = tpop/n
(4)	for all x, y: (sum over i: assign[i, x, y]) = 1

(1) says: minimize the total squared distances between each point x,y 
and the assigned district's center. assign[i, x, y] is a free variable.

(2) says: a point can at worst be 0% claimed by a district, at most 100%.

(3) says: each point has to be assigned to *some* district.

According to the paper I referenced in 2022, 
https://arxiv.org/pdf/1901.04628.pdf, if pop[x,y] is binary for every 
x,y, then the assignment weights will also be binary, i.e. every point 
definitely belongs to one of the districts.

I then concluded that for practical regions (states), the number of 
points would be so large as to make this impractical.

But I just had a thought: it seems that if we work on a coarser scale 
(e.g. each point being a rectangle), then the solution would contain 
definite points as well as mixed points; mixed points being ones where 
assign[i, x, y] is neither zero or one, and definite points being binary 
assign[i, x, y] points.

Then any point definitely assigned to a district, and surrounded by 
points also definitely assigned to that district, is known to belong to 
that district. (This because if the true boundary between district A and 
B cuts through a square, then some adjacent square next to it, closer to 
A, would belong at least partially to A, and an adjacent square closer 
to B would belong at least partially to B.)

So we should be able to run this program on a coarse map, exclude large 
fractions of that map, then refine to a higher level of detail along the 
boundaries, rinse, and repeat. The total size of the linear programs 
should then, hopefully, be manageable.

To find a reasonable districting, we could try with just random centers. 
Doing a k-means update would be pretty tough for general distances, but 
I imagine it could be done for the Euclidean distance.

This is mostly for fun, since, as I also said back in 2022, algorithmic 
districting is probably an answer to the wrong question, with PR being 
the proper way to do it.

Maybe I should test it one day - just for fun. I would have to dig up 
maps and population distributions for the states of the US, though.

-km