[EM] A computationally feasible method (algorithmic redistricting)
Brian Olson
bql at bolson.org
Wed Sep 3 05:57:19 PDT 2008
On Sep 3, 2008, at 5:32 AM, Raph Frank wrote:
> On 9/3/08, Brian Olson <bql at bolson.org> wrote:
>> Anyway, what my pseudo-voronoi solver does is this:
>>
>> Initialization:
>> • Assign district centers to random points (block
>> coordinates).
>
>> • Assign census blocks to nearest district center.
>>
>> Repeat until done:
>> • Move each district center towards the center of population
>> of the blocks actually assigned to it.
>
> Is this guaranteed to improve the district quality?
>
> In fact, do you only make an update if that improves quality by your
> measure?
I think it's not guaranteed to be an improvement because the
simultaneous update of all the other district centers and weights
could wind up creating overshoots or other destructive interference. I
always make the update. It may actually not be terribly well thought
out beyond 'seems like an intuitively good idea' and 'seems to work
pretty well'.
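In sketch form, the center-move step looks something like this (Python
for illustration rather than my actual code; whether to jump all the
way to the centroid or only part way is a tunable detail, so the
'step' fraction here is illustrative):

    # Sketch of the center-move step: shift each district center toward
    # the population-weighted centroid of its currently assigned blocks.
    # step=1.0 would jump straight to the centroid.
    def move_centers(centers, blocks, assignment, step=0.5):
        n = len(centers)
        sums = [[0.0, 0.0, 0.0] for _ in range(n)]  # sum of x*pop, y*pop, pop
        for (x, y, pop), d in zip(blocks, assignment):
            sums[d][0] += x * pop
            sums[d][1] += y * pop
            sums[d][2] += pop
        new_centers = []
        for (cx, cy), (sx, sy, p) in zip(centers, sums):
            if p > 0:
                gx, gy = sx / p, sy / p  # population centroid
                new_centers.append((cx + step * (gx - cx),
                                    cy + step * (gy - cy)))
            else:
                new_centers.append((cx, cy))  # empty district: leave it
        return new_centers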
>
>> • Adjust district weight, nudging weight up or down if there
>> are too few or too many people assigned to it.
>
> Do you do this until it completely converges, or just 1 nudge?
One nudge per cycle through all the "Repeat until done" stages. 10,000
to 20,000 cycles are needed to settle, depending on the state.
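A sketch of the nudge step (the 1.01/0.99 factors are the ones I
describe below; the direction follows from the distance-times-weight
cost, where a larger weight shrinks a district):

    # Sketch of one weight nudge per cycle.  With cost = d^2 * weight,
    # raising a district's weight sheds blocks; lowering it attracts them.
    def nudge_weights(weights, district_pops, target_pop):
        for d, pop in enumerate(district_pops):
            if pop > target_pop:
                weights[d] *= 1.01   # overpopulated: push blocks away
            elif pop < target_pop:
                weights[d] *= 0.99   # underpopulated: pull blocks in
        return weights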
>> • (optional) Nudge district centers away from underpopulated
>> district centers and towards overpopulated district centers.
>> • For each census block, assign census block to the district
>> with
>> the lowest ((distance to block from district center) * weight)
>
> You would get straight edges if you used
>
> lowest( (distance^2) - weight )
>
> This also saves you having to do a square-root, though I guess your
> method could be implemented as
>
> lowest( (distance^2)*(weight)^2 )
>
> This would speed up the calculations as there would be no need for
> finding the square root of the distance, while still giving the same
> result.
I checked my code and I'm not doing the expensive square root. It's
not quite the second though, it's actually:
((dx*dx + dy*dy) * weight)
The weight gets nudged by multiplying by 1.01 or 0.99. Whether the
weight is squared, and exactly how it's nudged, is probably just an
efficiency issue and not a correctness issue; it should still converge
to the weight it needs, just maybe along some suboptimal curve.
I think a multiplicative weight makes more sense: no chance of
underflow, and it might scale better between districts with different
densities.
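For the record, the two formulas differ geometrically: with
(distance^2) - weight, equating two districts' costs cancels the
squared terms and leaves a straight-line boundary (a power diagram),
while (distance^2) * weight gives circular-arc boundaries (a
multiplicatively weighted Voronoi diagram). Either way no square root
is needed in the inner loop, roughly like this sketch (illustrative,
not my actual code):

    # Sketch of the block-assignment step using squared distance;
    # each block goes to the district minimizing d^2 * weight.
    def assign_blocks(blocks, centers, weights):
        assignment = []
        for (bx, by, _pop) in blocks:
            best_d, best_cost = -1, float('inf')
            for d, (cx, cy) in enumerate(centers):
                dx, dy = bx - cx, by - cy
                cost = (dx * dx + dy * dy) * weights[d]
                if cost < best_cost:
                    best_d, best_cost = d, cost
            assignment.append(best_d)
        return assignment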
>
>> • Fixup district contiguity (because straight-line-nearest
>> can jump
>> across water and other gaps in odd ways)
>
> I think water counts as part of a district, so technically that isn't
> a breach of contiguity.
>
> What could be an issue is if a State has a concave boundary.
>
> Btw, do you parse the block geography file to work out which blocks
> border each other?
Yes, the block shape file determines adjacency. They handily label
every line segment with which block is on either side of it, so I can
pretty quickly filter that to keep a list of all the pairings I've seen.
Of course I do also wind up going back and rendering that geometry to
an actual map.
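Roughly, the filtering and the contiguity check it enables look like
this (the (left_block, right_block) pairs here stand in for the
labeled segments in the real shape file):

    from collections import defaultdict, deque

    # Build block adjacency from segments labeled with the block on each
    # side, then count the connected pieces of one district; more than
    # one piece means a contiguity fixup is needed.  (Sketch only.)
    def build_adjacency(segments):
        neighbors = defaultdict(set)
        for left, right in segments:
            if left is not None and right is not None and left != right:
                neighbors[left].add(right)
                neighbors[right].add(left)
        return neighbors

    def district_pieces(district_blocks, neighbors):
        unvisited = set(district_blocks)
        pieces = []
        while unvisited:
            queue = deque([unvisited.pop()])
            piece = set(queue)
            while queue:  # BFS within this district's blocks
                b = queue.popleft()
                for nb in neighbors[b]:
                    if nb in unvisited:
                        unvisited.discard(nb)
                        piece.add(nb)
                        queue.append(nb)
            pieces.append(piece)
        return pieces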
>> (That's now enshrined here:
>> http://code.google.com/p/redistricter/wiki/NearestNeighborSolver
>> )
>>
>> Oh, and when it's done (an hour or two for most states), start it
>> over
>> again with a different random starting state because it may have
>> gotten
>> stuck in a local optimum.
>
> My splitline software was able to do all the States in one evening.
> That is the benefit of a non-search algorithm.
I can get decent mappings for 90% of the states in a couple of days. CA
and TX are tough: lots of data and lots of districts to balance.
There are some tunable parameters in the search process that I should
try adjusting specifically for solving those states.
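The restart loop itself is trivial; in this sketch, solve() and its
lower-is-better quality score are stand-in names, not my program's
real interface:

    # Sketch of the restart strategy: rerun the solver from different
    # random starting states and keep the best map found.
    def best_of_restarts(state_data, tries=10):
        best_map, best_score = None, float('inf')
        for seed in range(tries):
            district_map, score = solve(state_data, seed=seed)
            if score < best_score:
                best_map, best_score = district_map, score
        return best_map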
Brian Olson
http://bolson.org/