Constrained Sampling#

The importance of using specialized proposal distributions for constrained sample problems can be illustrated with a simple example. Consider a box \([0, 1e5]^d\) for some dimension \(d\) and assume the sampler starts close to the origin. In this case, a classical isotropical Gaussian proposal distribution will have approximately \(\frac{2^d - 1}{2^d}\) of its density located outside the constrained region. Thus, the probability to generate a proposal which lies inside the constrained region is geometrically distributed and has probability \(2^{-d}\). It is easy to see, that this probability will vanish very quickly, hence, leaving the Markov chain stuck for a long time. More precisely, it will remain stuck for \(2^d\) moves in expectation. In contrast, the Hit-and-Run algorithm will never generate samples outside the constrained region. The effect of this can be seen by monitoring the proposed states of a Gaussian and a Hit-and-Run algorithm for the same problem as dimension increases.

[1]:
import hopsy
import numpy as np
import matplotlib.pyplot as plt
[2]:
dims = np.array(range(1, 14))
Problem = lambda dim: hopsy.add_box_constraints(
    hopsy.Problem(-np.identity(dim), [0]*dim), 0, 1e3)

proposals = [
    hopsy.GaussianHitAndRunProposal,
    hopsy.GaussianProposal,
]

accrates = np.zeros((len(dims), len(proposals)))

n = 1000

for i, dim in enumerate(dims):
    for k, Proposal in enumerate(proposals):
        problem = Problem(dim)
        proposal = Proposal(problem, starting_point=[.001]*dim)
        rng = hopsy.RandomNumberGenerator(seed=42)
        accrate = 0

        for _ in range(n):
            state = proposal.propose(rng)
            accrate += 1 if (problem.A @ state <= problem.b).all() else 0

        accrates[i, k] = 1.*accrate / n

Set parameter Username
Academic license - for non-commercial use only - expires 2023-11-30
[3]:
analytical = 1./2**dims

plt.figure(dpi=300)
plt.plot(dims, accrates, label=['Hit-and-Run', 'Gaussian'])
plt.plot(dims, analytical, linestyle='dashed', color='gray')

plt.xlabel(r'dimension $d$')
plt.ylabel(r'acceptance rate $\alpha$')

plt.legend()

plt.show()
../_images/notebooks_ConstrainedSampling_3_0.png
[4]:
dims = np.array(range(1, 14))
Problem = lambda dim: hopsy.add_box_constraints(
    hopsy.Problem(-np.identity(dim), [0]*dim, hopsy.Gaussian(mean=[0]*dim)), 0, 1e3)

proposals = [
    hopsy.GaussianHitAndRunProposal,
    hopsy.GaussianProposal,
]

accrates = np.zeros((len(dims), len(proposals)))

for i, dim in enumerate(dims):
    for k, Proposal in enumerate(proposals):
        problem = Problem(dim)
        mc = hopsy.MarkovChain(problem, Proposal, starting_point=[.001]*dim)
        rng = hopsy.RandomNumberGenerator(seed=42)

        accrate, _ = hopsy.sample(mc, rng, 10*2**dim)

        accrates[i, k] = accrate[0]

[5]:
analytical = 1./2**dims

plt.figure(dpi=300)
plt.plot(dims, accrates, label=['Hit-and-Run', 'Gaussian'])
plt.plot(dims, analytical, linestyle='dashed', color='gray')

plt.xlabel(r'dimension $d$')
plt.ylabel(r'acceptance rate $\alpha$')

plt.legend()

plt.show()
../_images/notebooks_ConstrainedSampling_5_0.png
[ ]: