[EM] Spatial models -- Polytopes vs Sampling

Thu Feb 3 14:33:52 PST 2022

On Thu, Feb 3, 2022 at 2:15 PM Kristofer Munsterhjelm <km_elmet at t-online.de>
wrote:

> Great!
>
> I was browsing the scipy.stats.qmc manual and noticed it has a third
> method, Latin hypercube, explicitly designed for cubes. Would this
> method be applicable to your problem, if you use the function g that's
> zero outside of the simplex and Gaussian inside it?

I tested it, and it didn't do as well as Sobol. The Latin Hypercube was a
little faster than the Halton sequence but a little less accurate. I think
Sobol has a huge advantage because (according to Wikipedia) there is a fast
implementation based on low-level bitwise operations and that's just hard
to beat with regular arithmetic.
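
A minimal sketch of this kind of comparison, using the three engines in scipy.stats.qmc. The integrand g, the choice of the standard simplex (x >= 0, sum(x) <= 1), the Gaussian centre and width, and the sample size are all illustrative assumptions, not the setup actually used in the test described above.

```python
import numpy as np
from scipy.stats import qmc


def g(x, mean, sigma):
    """Isotropic Gaussian density inside the standard simplex, zero outside."""
    d = x.shape[1]
    inside = np.all(x >= 0.0, axis=1) & (x.sum(axis=1) <= 1.0)
    r2 = np.sum((x - mean) ** 2, axis=1)
    pdf = np.exp(-0.5 * r2 / sigma**2) / (2.0 * np.pi * sigma**2) ** (d / 2)
    return np.where(inside, pdf, 0.0)


def compare_samplers(d=2, m=12, sigma=0.2, seed=0):
    """Estimate the integral of g over the unit cube with 2**m points per sampler."""
    mean = np.full(d, 1.0 / (d + 1))  # hypothetical choice: centroid of the simplex
    samplers = {
        "Sobol": qmc.Sobol(d=d, scramble=True, seed=seed),
        "Halton": qmc.Halton(d=d, scramble=True, seed=seed),
        "LatinHypercube": qmc.LatinHypercube(d=d, seed=seed),
    }
    # The cube has volume 1, so the mean of g over the points estimates the integral.
    return {name: float(g(s.random(2**m), mean, sigma).mean())
            for name, s in samplers.items()}


estimates = compare_samplers()
for name, value in estimates.items():
    print(f"{name:>15}: {value:.4f}")
```

Timing each call to s.random() separately would be one way to look at the speed difference between the engines. The accuracy ranking will depend on the integrand and on the dimension.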

> Does the proportion
> of the cube occupied by the simplex vanish too quickly as d increases?
>

I ran a test. It's certainly an issue, but not quite as bad as I had
expected. First, the library can easily detect the polytopes that are
completely empty. For the rest, I ran 10 sample elections with candidates
uniformly random in the unit cube. The proportion of the cube occupied by
the polytope seems to approach 10% as I get closer to 8 dims.

N_candidates = N_dim + 1

2 Dims: v = 0.535417 +/- 0.236171
3 Dims: v = 0.256944 +/- 0.134119
4 Dims: v = 0.156250 +/- 0.065763
5 Dims: v = 0.134810 +/- 0.035892
6 Dims: v = 0.125693 +/- 0.009282

That said, higher dimensions do get more expensive for other reasons ---
like the fact that you have (N_dim+1)! polytopes. I was planning to run the
simulation till N_dim=8 but I've been waiting for N_dim=7 to finish and it
just doesn't want to finish.
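
To put numbers on the (N_dim+1)! growth, here is a small sketch. It assumes one polytope per strict ranking of the N_dim + 1 candidates, which is how the count above reads; the function name n_polytopes is made up for the illustration.

```python
from math import factorial


def n_polytopes(n_dim):
    """Number of strict rankings of N_dim + 1 candidates (assumed one polytope each)."""
    n_candidates = n_dim + 1
    return factorial(n_candidates)


for n_dim in range(2, 9):
    print(f"{n_dim} dims: {n_polytopes(n_dim):>6} polytopes")
```

Going from N_dim=6 to N_dim=7 multiplies the count by 8, and going from 7 to 8 multiplies it by another 9, before accounting for the cost of each individual integral.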

> I would also imagine that you could reduce the dimension by one by using
> a standard 1D Gaussian integral over the last dimension as long as you
> can do line-simplex intersections to determine what line you should
> integrate over. But perhaps the general covariance problem you mentioned
> earlier would make this impractical - that it would be rather difficult
> to line up the Gaussian integral with that line in the remaining dimension.
>

Yeah. If we assume that the Gaussian is fully symmetric (which feels a bit
restrictive) you could imagine drawing radial lines and figuring out where
they cross the polytope. The details could be a bit complicated. I have no
idea if that would be faster, but I could try something like that. I can't
work on this idea right now, but I didn't want to dismiss it.
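
For reference, a rough sketch of how the radial-line idea could look for a fully symmetric Gaussian N(mu, sigma^2 I). It assumes the polytope is available as a set of half-spaces A x <= b, which may not match how the library stores it, and it draws the directions at random rather than from a low-discrepancy sequence. The function name and all parameters are hypothetical.

```python
import math
import numpy as np
from scipy.stats import chi


def gaussian_mass_in_polytope(A, b, mu, sigma, n_dirs=20000, seed=0):
    """Estimate P(X in {x : A x <= b}) for X ~ N(mu, sigma^2 I) using radial lines."""
    A, b, mu = np.asarray(A, float), np.asarray(b, float), np.asarray(mu, float)
    d = mu.size
    rng = np.random.default_rng(seed)

    # Uniform random directions on the unit sphere.
    u = rng.standard_normal((n_dirs, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)

    # Along x = mu + t*u, constraint i reads (a_i . u) * t <= b_i - a_i . mu.
    slope = u @ A.T          # shape (n_dirs, n_constraints)
    slack = b - A @ mu       # shape (n_constraints,)
    with np.errstate(divide="ignore", invalid="ignore"):
        t = slack / slope
    t_hi = np.where(slope > 0, t, np.inf).min(axis=1)    # where the ray exits
    t_lo = np.where(slope < 0, t, -np.inf).max(axis=1)   # where the ray enters
    t_lo = np.maximum(t_lo, 0.0)                         # the ray starts at mu
    # Directions exactly parallel to a face (slope == 0) have probability zero
    # and are not handled specially in this sketch.

    # The radius |X - mu| / sigma follows a chi distribution with d degrees of
    # freedom, independently of the direction, so each ray contributes the chi
    # probability of its segment inside the polytope.
    hit = t_hi > t_lo
    mass = np.zeros(n_dirs)
    mass[hit] = chi.cdf(t_hi[hit] / sigma, df=d) - chi.cdf(t_lo[hit] / sigma, df=d)
    return float(mass.mean())


# Sanity check on a square centred on the Gaussian, where the answer is known.
A_square = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b_square = [1, 1, 1, 1]
estimate = gaussian_mass_in_polytope(A_square, b_square, mu=[0, 0], sigma=1.0)
exact = math.erf(1 / math.sqrt(2)) ** 2
print(f"estimate = {estimate:.4f}, exact = {exact:.4f}")
```

The radial integral is done analytically, so only the d-1 angular dimensions are sampled. Whether that beats Sobol sampling of the full cube is an open question here. The sketch does not extend to a general covariance without first transforming the polytope.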

> On a related note, I was reading James Green-Armytage's paper about
> strategic voting: http://jamesgreenarmytage.com/strategy-utility.pdf. On
> page 21, he states that an 8D spatial model is a good fit to the
> political poll model, while 1D is not quite as good. He doesn't mention
> intermediate dimensionality models, but it may provide a reason for
> supporting high dimension spatial models (as long as the fit keeps
> improving even when going from say, 7D to 8D). It does, I think, provide
> pretty good evidence that there's little need for going beyond 8D, at
> least.
>

It's good to have a maximum. I hope we can get away with fewer than 8
dimensions. Regardless of the true political dimensions of the electorate,
if you only have N<8 parties, those parties will lie in an
(N-1)-dimensional subspace. That's why I suspect that we can get away with
modelling far fewer than 8 dimensions, as long as we keep in mind that we are
only modelling the subspace of political positions that is actually spanned
by the candidates. The US is an especially pathological example, where
apparently your views on
LGBT rights somehow dictate your views on sex education, AR-15s, tax law,
climate change, vaccines, and the Israel-Palestine conflict.
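
The subspace argument above can be checked numerically. The sketch below assumes 4 parties placed at random in an 8D space and voters who rank parties by Euclidean distance; all names and numbers are illustrative. Because a voter's offset perpendicular to the parties' subspace adds the same amount to every squared distance, projecting voters onto that subspace should leave their rankings unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n_parties, n_dims, n_voters = 4, 8, 100

parties = rng.uniform(size=(n_parties, n_dims))
voters = rng.uniform(size=(n_voters, n_dims))

# Offsets from the first party span the affine subspace containing all parties.
offsets = parties[1:] - parties[0]
rank = np.linalg.matrix_rank(offsets)
print("dimension of the subspace spanned by the parties:", rank)

# Orthonormal basis of that subspace, and coordinates of everyone within it.
basis, _ = np.linalg.qr(offsets.T)            # shape (n_dims, n_parties - 1)
parties_low = (parties - parties[0]) @ basis
voters_low = (voters - parties[0]) @ basis    # orthogonal projection of voters


def rankings(v, p):
    """Each voter's ordering of parties, nearest first."""
    dist = np.linalg.norm(v[:, None, :] - p[None, :, :], axis=2)
    return np.argsort(dist, axis=1)


same = np.array_equal(rankings(voters, parties), rankings(voters_low, parties_low))
print("rankings unchanged after projection:", same)
```

This only shows that rankings survive the projection. The distribution of voters within the subspace is the projection of the full distribution, which remains Gaussian if the full one is.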

Cheers,
-- 
Dr. Daniel Carrera
Postdoctoral Research Associate
Iowa State University