[EM] Re: Compactness
Ernest Prabhakar
drernie at mac.com
Tue Jan 20 19:35:02 PST 2004
Hi Joe,
On Jan 20, 2004, at 5:30 PM, Joe Weinstein wrote:
> However, after posting the message I started wondering whether I got
> carried away on my point (3) as to supposed advantage of DC over BC in
> computation. So Ive since tried to subject raw intuition to a bit of
> precision and reality check. Having done so to a point, perhaps not
> enough, I must revise what I implied, and go halfway to your position.
Thanks for being big enough to revise your views based on the evidence
(not something to be taken for granted in this forum :-). I think I'm
starting to appreciate the virtue of your position, as well.
> Very likely, in your context, BC is easier to handle than DC, or
> anyhow is no harder to handle. It sounds like you have a good
> algorithm, and for good reasons. ....
> In this latter case especially, it would pay to pre-process the
> specification of plan P into convenient supplementary tables. In
> particular, one could prepare a list L of 5-tuples, one for each of
> the at most 3T boundaries. Each 5-tuple would list not only the three
> data in the boundarys triple, but also the two atoms current
> district IDs. Then, a small change in plan P would require but small
> changes in L, and an easy pass through L would readily sum the
> boundary values for different-district atom pairs - or, even faster,
> would get the complementary sum of all values for same-district pairs.
Actually, I think there's a really easy trick to make it easy to
calculate boundary changes. To find the effect of transferring node A
from District X to District Y, we can just calculate A(Y) - A(X) (that
is, A's edge-weight to nodes in Y vs. A's edge-weight to other nodes in
X). With an object-oriented approach, both node A and node B would
carry the edge-weight from A to B, so its easy to look it up.
> On the other hand (details below) there seems to be just about equally
> good news from the DC side too, at least if DC uses squared distance.
> ... But there is good news from the DC side too, if the compactness
> criterion be to minimize the average SQUARED distance between
> same-district atom pairs. This criterion amounts to minimizing
> average within-district atom-centroid squared distance, alias average
> district (moment of) inertia. It turns out that one can readily
> account for small changes in districts by storing and updating for
> each district the X- and Y- components both of its centroid and of its
> inertia. (Thanks to Pythagoras, the X- and Y- inertias sum to the
> total inertia.) For those who want a named theorem behind the
> technique, its the Parallel Axis Theorem. Years ago I saw a
> programmers writeup of the one-dimensional analog: for the sake of
> scratch-memory conservation, he calculated the mean and variance of a
> sequence of read-in numbers, by updating simultaneously the running
> mean and running variance.
That's a very good point. The use of moment of inertia as the measure
of compactness (if I understand you correctly) seems like a good
metric. So, we can treat each atom as if its entire population (w) was
concentrated at a single point p = (x,y) (ideally the centroid, though
if its a small enough region the geographical center will do fine).
Thus, for each district we maintain these vector sums:
P = sum(p) # centroid
W = sum(w) # mass
PW = sum(p*w) # moment
P2W = sum(p^2 * w) # variance
I believe this gives us:
C (center of mass) = PW / W
so the moment of inertia M would be:
M = sum (w * (p-C)^2) = sum (w*(p^2 - 2*p*C + C^2))
= P2W - 2 W * P *C + W*C^2
Is that pretty much what you had in mind? That implies that if adding
node A (p_A, w_A) to District Y doesn't significantly change the mass W
or Center of Mass C, then:
dM_Y = (P2W' - P2W') - 2 (P'-P) *C * W = w*p^2 - 2 p * C * W
Does that seem right?
The only thing I'm not sure about is how to define a simple
standard-deviation of M to measure the impact of a given change (like
sqrt(N) would for a population N), but hopefully I can figure something
out.
> Anyhow, Ernie, I do not doubt that you have an excellent algorithm for
> updating the BC effect of small changes in the districting plan, and I
> thank you for your corrective!
Thank you. That seems simple enough; I'll see if I can add those
calculations into my program. Ultimately I'd like it to be a runtime
option which to use, since the 'right' measure is ultimately more of a
political one than a technical one.
Cheers,
- Ernie P.
More information about the Election-Methods
mailing list