{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Parallel Tempering with MPI" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Parallel tempering is a technique for improving the efficiency of MCMC, especially in the case of multi-modal target distributions. The core idea is to build a sequence of distributions starting with an easy-to-sample reference (e.g. the prior) and ending with the difficult-to-sample target (e.g. the posterior). Given the target $\\pi$ and the reference $\\pi_0$, this sequence is constructed using a \"temperature\" parameter $0 < \\beta < 1$:\n", "$$\\pi_\\beta = \\pi_0^{1-\\beta} \\cdot \\pi^\\beta$$\n", "Here, for $\\beta = 0$ (also referred to as \"cold\" chain) the distribution is equal to the target, for $\\beta = 1$ (also referred to as \"hot\" chain) it is equal to the reference. By running multiple chains with an ascending sequence of temperatures $\\beta_0 = 0, \\beta_1, \\dots, \\beta_n = 1$ and allowing theses chains to pass states based on Metropolis-Hastings, the simpler properties of hotter distributions improve the exploration of the state space and can result in faster convergence of the cold chain to the target distribution.\n", "\n", "*hopsy* implements parallel tempering. This notebook illustrates it by sampling a mixture of Gaussians that has two distinct modes. Depending on the starting point, vanilla MCMC approaches have trouble capturing the multi-modality. This is because once the chain has found a mode, Metropolis-Hastings proposal are very unlikely to propose high-density points in the other mode. With parallel tempering, the hotter chains are less effected by this and can better sample broadly. By passing these states on to the colder chains, other modes can be explored." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import hopsy\n", "import multiprocessing\n", "import ipyparallel as ipp\n", "\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The sampling code with all imports must be defined within a file or function like here. The code is later called by an MPI process, enabling communication of the tempered chains in *HOPS*. Notice, that for the definition of the Markov chain, a synchronized `RandomNumberGenerator` (must have the same seed for each process!) and an exchange probability must be given.\n", "\n", "Important: For technical reasons it is not possible to use multiple chains from each MPI process. A different way of generating replica chains for computing statistics is shown below. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def run_tempered_chain(n_samples: int, chain_idx: int):\n", " import hopsy\n", " import numpy as np\n", " from mpi4py import MPI\n", "\n", " class GaussianMixture:\n", " def __init__(self, mu1, mu2):\n", " epsilon = 0.05\n", " cov = epsilon * np.eye(2, 2)\n", " self.model1 = hopsy.Gaussian(mean=mu1, covariance=cov)\n", " self.model2 = hopsy.Gaussian(mean=mu2, covariance=cov)\n", "\n", " def compute_negative_log_likelihood(self, x):\n", " return -np.log(\n", " np.exp(-self.model1.compute_negative_log_likelihood(x))\n", " + np.exp(-self.model2.compute_negative_log_likelihood(x))\n", " )\n", " \n", " A = np.array([[1, 0], [0, 1], [-1, 0], [0, -1]])\n", " b = np.array([1, 1, 1, 1])\n", "\n", " model = GaussianMixture(np.ones(2).reshape(2, 1), -np.ones(2).reshape(2, 1))\n", " problem = hopsy.Problem(A, b, model)\n", "\n", " comm = MPI.COMM_WORLD\n", " rank = comm.Get_rank()\n", "\n", " syncRng = hopsy.RandomNumberGenerator(42)\n", " mc = hopsy.MarkovChain(\n", " proposal=hopsy.GaussianCoordinateHitAndRunProposal,\n", " problem=problem,\n", " parallel_tempering_sync_rng=syncRng,\n", " exchange_attempt_probability=0.15,\n", " starting_point=0.9 * np.ones(2))\n", " mc.proposal.stepsize = 0.25\n", "\n", " rng = hopsy.RandomNumberGenerator(rank + chain_idx + 11)\n", "\n", " acc_rate, samples = hopsy.sample(markov_chains=mc, rngs=rng, n_samples=n_samples)\n", " return acc_rate, samples, rank" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function implements running MPI processes from within notebooks using `ipyparallel`. An different strategy would be to place your code in a Python file and invoking the processes using `mpiexec` from the command line." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def run_tempered_chains(n_samples: int, n_temps: int, chain_idx: int):\n", " with ipp.Cluster(engines='mpi', n=n_temps) as rc:\n", " view = rc.broadcast_view()\n", " result = view.apply_sync(run_tempered_chain, n_samples, chain_idx)\n", " result = sorted(result, key=lambda x: x[2])\n", "\n", " return [r[1] for r in result]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because this approach is hard to debug, it is generally a good idea to run a single instance of tempered chains for a short time before running large numbers of samples." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Starting 5 engines with \n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "eab8159d166941b4876eec905042b5c7", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/5 [00:00\n", "Starting 5 engines with \n", "Starting 5 engines with \n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "afd31492153c41619b7bab68e7415bd5", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/5 [00:00\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ca10fd4dcdd941c1982a6b94ede97669", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/5 [00:00" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig, axs = plt.subplots(1, n_replicas, sharey=True, figsize=(20, 5))\n", "for i in range(n_replicas):\n", " ax = axs[i]\n", " ax.set_title(f\"replica {i+1}\")\n", " for j in range(n_temps):\n", " ax.hist(result[i][j][:, :, 0].flatten(), density=True, alpha=0.245, label=\"$\\\\beta_\"+str(j)+\"$\", bins=100)\n", "plt.tight_layout()\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4 }