[EM] Deterministic Districting

Fri Jan 7 14:43:48 PST 2005

Hi all,

Our Man Arnold in his State of California address pushed for a whole 
bunch of reforms, including putting redistricting in the hands of a 
non-partisan judges panel.

As you can imagine there's a huge hue and cry. Mostly from partisans 
who fear accountability, but some from thoughtful citizens worried 
about replacing "Gerrymandering" with "Arniemandering."  After all, can 
you even trust judges (or those who appoint them) to be totally 
unbiased?  Even if Arnold's a good guy, should we trust his successor?

This brings us back to the question of automated redistricting. We've 
often discussed how the 'fairest' algorithm would use a measure such as 
  "minimizing lanes of traffic cut by the circumference" by combining 
census tracts while ensuring equal-population districts.   Its easy to 
get an approximate answer from this criteria using various statistical 
means, but is there any computationally feasible way to get a 
*deterministic" answer that is at least near-optimal?  Otherwise, I 
fear that people will either create pseudo-statistical results, or 
challenge truly random results due to their fear of (random) biases.

Interestingly, I think one could phrase it in terms similar to a voting 
problem, at least given equal-sized districts.  Say we have N 
individuals in a playground, and they need to form M teams.   Each 
individual uses (honest) cardinal ratings to describe which of their 
neighbors they'd like on the same team as them (corresponding to shared 
lanes of traffic).  What is the fairest way to create equal-sized teams 
given those rankings -- without requring an N! M! computation!

My initial guess would be to use a "greedy" algorithm to coalesce 
individual census blocks into districts by first "healing" the edges of 
'widest' connectivity, stopping when you reach the maximum district 
size.  However, that would (in general) result in isolated fragments of 
sub-par size, meaning we'd need another pass to 'steal' sections back 
for the starved districts.  But would that ever terminate? In fact, no 
matter how you do it, no purely local algorithm can guarantee global 
closure, right?

Another option would be to start with M seeds which "repel" so they are 
maximally dispersed through out the state, then let them grow 
organically (smallest first). That would probably minimize variance, 
but not eliminate it.   I can't think of anything better than 
"simulated annealing" (statistical methods) to 'bubble' districts 
around until they're equal, but that gets back to the random factor.   
I suppose it then becomes how 'random' can something be and still be 
acceptable to voters?

I suppose a key question is how precise do we need to be.   We 
obviously can't be more precise than a single tract, which could be 
8,000 people (out of 440K), which is 2%.    But we probably don't want 
to be much more divergent than than

The fallback position is of course 'brute-force', but I think that's 
been tried and failed because it didn't respect 'natural communities'.

http://www.westmiller.com/fairvote2k/in_law.htm

Any other thoughts?  I can't help but think that some sort of 
'four-color' theorem might be relevant, but I'm darned if I know how...

-- Ernie P.

http://www.census.gov/td/stf3/append_a.html#CENSUS%20TRACT
Census tracts usually have between 2,500 and 8,000 persons.
(call it 5000/block, +- 50%)

http://www.ss.ca.gov/elections/ror/reg_stats_10_18_04.pdf
http://quickfacts.census.gov/qfd/states/06000.html
California has about 22 million eligible voters out of 35 million 
residents

http://www.assembly.ca.gov/acs/acsframeset7text.htm
There are 80 assembly districts,
or one per 275K voters or 440K residents.
Thus, around 80+-20 blocks per district, so pretty fine-grained.

http://www.fairvote.org/redistricting/reports/remanual/ca.htm
http://swdb.berkeley.edu/index.html
http://www.westmiller.com/fairvote2k/in_refer.htm