Coulomb rhymes with variance reduction

Monte Carlo
point processes

Rémi Bardenet


November 1, 2023

… Well, it does rhyme if you read the title aloud with a French accent, hon hon hon.

To paraphrase Nicolas’s previous post, say I want to approximate the integral \[ I(f) := \int_{S} f(u) du, \] where \(S\) is a compact subset of \(\mathbb{R}^d\), which I will assume for simplicity to have unit volume. I could use plain old Monte Carlo with \(N\) nodes, \[ \hat{I}(f) = \frac 1 N \sum_{i=1}^N f(U_i), \quad U_i \sim \mathrm{U}(S). \tag{1}\] However, an i.i.d. uniform sample of quadrature nodes \(U_1, \dots, U_N\) will intuitively leave “holes”; see Figure 1 (a). In words, given a realization of the nodes, it is possible to insert a few large balls in \(S\) that do not contain any \(U_i\). These holes may make us miss some large variations of \(f\). Intuitively, part of the variance of the Monte Carlo estimator in Equation 1 could be removed if we managed to fill these holes, using some of the nodes that got lumped together by chance.
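For concreteness, here is a minimal numpy sketch of the crude Monte Carlo estimator in Equation 1, taking \(S = [0,1]^2\); the test function and all names are my own illustrations, not code from the paper.

```python
import numpy as np

def crude_mc(f, d, N, rng):
    """Crude Monte Carlo estimate of I(f) over S = [0, 1]^d (unit volume)."""
    U = rng.uniform(size=(N, d))  # i.i.d. uniform quadrature nodes in S
    return f(U).mean()            # (1/N) * sum_i f(U_i)

# Example: f(x, y) = sin(2*pi*x) * y integrates to 0 over [0, 1]^2.
rng = np.random.default_rng(0)
f = lambda u: np.sin(2 * np.pi * u[:, 0]) * u[:, 1]
estimate = crude_mc(f, d=2, N=10_000, rng=rng)
```

Plotting such a sample is the quickest way to see the “holes” and “lumps” of Figure 1 (a) for yourself.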

Many sampling algorithms, such as randomized quasi-Monte Carlo, impose similar space-filling constraints, yielding a random sample with guarantees of “well-spreadedness”. In the paper I describe in this post, Diala Hawat and her two advisors (Raphaël Lachièze-Rey and myself) obtained variance reduction by explicitly trying to fill the holes left by a realization of \(U_1, \dots, U_N\). In the remainder of the post, I will describe Diala’s main theoretical result.

(a) A Poisson sample

(b) The same sample after repulsion

Figure 1: Note how the repelled sample has fewer visible “holes” and “lumps”. The details of how we implemented the repulsion are interesting in themselves, and can be found in the paper and the associated code.

The basic intuition is to imagine the quadrature nodes \(U_1, \dots, U_N\) as electrons. In physics, electrons (like all charged particles) are subject to the Coulomb force. The Coulomb force exerted by one electron on another points away from the first electron, with a magnitude inversely proportional to the \((d-1)\)th power of the Euclidean distance between the two. As a result, electrons tend to repel each other, and electrons close to you push you away harder than electrons on the other side of the support of \(f\). This is the behaviour we would like to emulate, so that our quadrature nodes avoid lumping together and instead go and fill the holes, where no particle causes any repulsion.

If we solved the differential equation implementing Coulomb’s repulsion on our \(N\) i.i.d. nodes, however, the points would rapidly leave the support of \(f\) and “go to infinity”, making the pairwise distances between nodes as large as possible. One way to avoid this undesired behaviour is to consider an “infinite” uniform Monte Carlo sample in \(\mathbb{R}^d\), so that, wherever an electron looks, there are infinitely many electrons preventing it from escaping. To make the situation comparable with our initial \(N\)-point estimator in Equation 1, we also require that there are roughly \(N\) points inside the region \(S\) where we integrate \(f\). Formally, we consider a homogeneous Poisson point process \(\mathcal{P}\) of intensity \(\rho = N/V\) in \(\mathbb{R}^d\), where \(V\) is the volume of \(S\), and the modified Monte Carlo estimator \[ \tilde{I}(f) = \frac{1}{N} \sum_{x\in S\cap\mathcal{P}} f(x). \] This estimator is very similar to the \(N\)-point crude Monte Carlo estimator \(\hat{I}(f)\), except that the number of evaluations of \(f\) in the sum is now Poisson-distributed, with mean and variance \(N\). What we have gained is that we can now apply the Coulomb force to the points of \(\mathcal{P}\), and hope that, both before and after repulsion, about \(N\) points remain in our integration domain \(S\).

Proving this remains technically thorny, however. For starters, for \(x\) in \(\mathbb{R}^d\), the series defining the Coulomb force exerted on \(x\) by a collection \(C\) of points in \(\mathbb{R}^d\), namely \[ F_C(x) = \sum_{y\in C, y\neq x} \frac{x-y}{\Vert x-y\Vert^{d}}, \] is not absolutely convergent, so that the order of summation matters. However, it was observed as early as 1943 that, if you sum the terms by increasing distance to the reference point \(x\), and \(C=\mathcal{P}\) is a homogeneous Poisson point process, then the (random) series \(F_\mathcal{P}(x)\) converges almost surely.
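As an illustration (my own sketch, not the paper’s implementation), here is how one can sample a homogeneous Poisson point process on a finite box and evaluate the Coulomb force \(F_C(x)\), summing the terms by increasing distance to \(x\). Note that truncating the infinite process to a finite window is an approximation.

```python
import numpy as np

def poisson_process(rho, low, high, d, rng):
    """Homogeneous Poisson point process of intensity rho on the box [low, high]^d."""
    n = rng.poisson(rho * (high - low) ** d)  # Poisson-distributed point count
    return rng.uniform(low, high, size=(n, d))

def coulomb_force(x, points, d):
    """Coulomb force on x, summing contributions by increasing distance to x."""
    diff = x - points
    dist = np.linalg.norm(diff, axis=1)
    keep = dist > 1e-12                       # drop x itself if it is in points
    diff, dist = diff[keep], dist[keep]
    order = np.argsort(dist)                  # nearest-first: summation order matters
    return (diff[order] / dist[order, None] ** d).sum(axis=0)

rng = np.random.default_rng(1)
P = poisson_process(rho=100.0, low=-1.0, high=1.0, d=2, rng=rng)
F = coulomb_force(np.zeros(2), P, d=2)
```

On a finite point set the sum is of course absolutely convergent, so the nearest-first ordering only matters once one lets the window grow; it is kept here to mirror the construction in the text.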
Interested readers are referred to a classical paper by Chatterjee, Peled, Peres, and Romik (2010) on the gravitational allocation of Poisson points, one of the inspirations behind Diala’s work.

Putting (important) technical issues aside, we are ready to state the main result of our paper. We prove that, for \(\epsilon\in(-1,1)\), the repelled Poisson point process \[ \Pi_\epsilon\mathcal{P} = \{ x+\epsilon F_{\mathcal{P}}(x), \quad x\in\mathcal{P} \} \] is well-defined, and has on average \(N\) points in \(S\). Moreover, \[ \check{I}(f) = \frac{1}{N} \sum_{x\in S\cap \Pi_\epsilon\mathcal{P}} f(x) \] is an unbiased estimator of \(I(f)\). Finally, if \(f\) is \(C^2\), then for \(\epsilon>0\) small enough, the variance of \(\check{I}(f)\) is lower than that of \(\tilde{I}(f)\). To sum up, for any \(C^2\) integrand, we can in principle reduce the variance of our Monte Carlo estimator by slightly repelling the quadrature nodes away from each other. This is it: by breaking lumps and filling holes in a postprocessing step, we obtain variance reduction over crude Monte Carlo. The proof is not trivial, and relies on the superharmonicity of the potential behind the Coulomb force.
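To make the construction concrete, here is a toy numpy sketch of the repelled estimator \(\check{I}(f)\) on \(S=[0,1]^2\), with the Poisson process truncated to a finite box. To mimic the by-increasing-distance summation, each point only interacts with neighbours in the largest ball centred at it that fits inside the box (the uniform background in such a ball cancels by symmetry); the window size, step size \(\epsilon\), and this truncation are illustrative choices of mine, not those of the paper.

```python
import numpy as np

def repelled_mc(f, N, eps, rng, d=2, L=2.0):
    """One-step repelled-Poisson estimate of the integral of f over S = [0, 1]^d.

    The infinite Poisson process (intensity rho = N, unit-volume S) is
    truncated to the box [-L, 1 + L]^d; this truncation is an approximation.
    """
    side = 1.0 + 2.0 * L
    n = rng.poisson(N * side ** d)              # Poisson-distributed point count
    P = rng.uniform(-L, 1.0 + L, size=(n, d))
    diff = P[:, None, :] - P[None, :, :]        # diff[i, j] = P[i] - P[j]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)              # no self-interaction
    # Each point only sees neighbours within its distance to the box boundary,
    # so that the truncated sum has no spurious outward drift.
    R = np.minimum(P + L, 1.0 + L - P).min(axis=1)
    mask = dist < R[:, None]
    forces = (diff / dist[:, :, None] ** d * mask[:, :, None]).sum(axis=1)
    Q = P + eps * forces                        # repelled configuration
    inside = np.all((Q >= 0.0) & (Q <= 1.0), axis=1)
    return f(Q[inside]).sum() / N

rng = np.random.default_rng(2)
f = lambda u: np.cos(2 * np.pi * u[:, 0])       # integrates to 0 over S
estimate = repelled_mc(f, N=64, eps=0.001, rng=rng)
```

Demonstrating actual variance reduction requires many replications and a careful choice of \(\epsilon\); the step size recommended in the paper and the exact implementation of the force differ from this toy version.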

Let me close with two further pointers to the paper. First, we discuss a particular value of the “step size” parameter \(\epsilon\), which has a closed form that is easy to implement and reliably led to variance reduction across our experiments. Second, while our theoretical results so far only cover the Poisson case, we also show experiments on stationary point processes other than Poisson, which confirm that variance reduction is achieved across point processes with varying second-order structure. In Monte Carlo terms, and being very optimistic, some sort of repulsion might become a standard postprocessing step in the future, to reduce the variance of one’s estimator, independently of the law of the nodes (Markov chain, thinned PDMP, you name it).