[EM] Spatial models -- Polytopes vs Sampling

Mon Jan 31 19:35:58 PST 2022

On Mon, Jan 31, 2022 at 3:32 PM Kristofer Munsterhjelm <km_elmet at t-online.de>
wrote:

> > I don't yet know whether computing volumes is actually faster than
> > method (1) for comparable accuracy. But if it is, then my method (3)
> > could be the best of both worlds:
> >
> > - The superior speed vs accuracy trade-off geometry (2).
> > - The flexibility and realism of a dedicated voter-density function (1).
>
> I'd imagine it would depend on how easy it is to do high dimensional
> Gaussian integration. There may be more efficient methods than ordinary
> MC (e.g. quasi-MC,
>
> https://web.maths.unsw.edu.au/~josefdick/preprints/DKS2013_Acta_Num_Version.pdf
> )
> or there may be special-case Gaussian integrals over simplices, and then
> the polytope can be decomposed to simplices (like the approach for
> determining the exact area) and the Gaussian integral applied to them.
>

So I started reading up on quasi-MC. I haven't done this before so my
initial thoughts might be incorrect, but I think I've found two things:

1) It looks easy to do. There are libraries (e.g. scipy.stats.qmc) that
just give you a sequence of numbers that you can use as a drop-in
replacement for your random number generator in your MC integrator.

2) But I see a lot of warnings saying that QMC constructions are designed
for special values of n, like powers of 2 or large primes, and if I
sub-sample or pick the wrong number of points I might ruin all their
properties. This is potentially a huge issue because integrating over the
polytope consists of throwing away all the points that aren't inside the
polytope.

Right now I'm just hoping that one of the QMC number generators is more
forgiving. Sobol seems to be the worst offender, while the Halton sequence
looks like it might be more forgiving:

https://docs.scipy.org/doc/scipy/reference/reference/generated/scipy.stats.qmc.Halton.html#scipy.stats.qmc.Halton

The doc says:

-------------
Alternatively, you can skip some points like `sampler.fast_forward(5)` ...
-------------

Presumably that means that the Halton sequence isn't very finicky about
exactly which points get included in the integral. As luck would have it,
the Halton sequence is apparently superior for "low-dimensional" problems N
<= 6 (Morokoff & Caflisch 1995), which is the dimensionality I most care
about anyway. Honestly, the N=2 and N=3 problem is the most interesting for
real world elections.

Cheers,
-- 
Dr. Daniel Carrera
Postdoctoral Research Associate
Iowa State University
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.electorama.com/pipermail/election-methods-electorama.com/attachments/20220131/64e7de70/attachment.html>