Drawing random numbers from a multivariate Gaussian distribution

I like software libraries. A software library means that someone has already done some of the heavy lifting for me. A software library means that I won’t have to reinvent the wheel. It means that I might not have to spend my afternoon fixing bugs and stringently validating my own code*.

It also means that I don’t necessarily have a good idea of what the computer is doing “behind the scenes”; that I’m potentially unaware of the clever (or not) algorithm that is being implemented, and what it’s limitations (or bugs) are.

For a long time now, I’ve been blithely accepting that numpy.random.multivariate_normal applies some sensible, but mysterious, procedure in order to generate random variables from a multivariate Gaussian (normal) distribution. And it does – I just never tried to unravel said mystery. So, prompted by Dag Sverre, and courtesy of Wikipedia,  I present a Python snippet that draws multivariate Gaussians given (a) some way of generating a set of standard univariate Gaussian random numbers, and (b) a Cholesky decomposition routine. In all, it’s about 10% faster than the NumPy routine, which applies a bunch of validity tests that I don’t.

import numpy as np

# Define mean, covariance matrix, and no. of samples required
mean = ... # some array of length N
cov = ... # some positive-definite NxN matrix
Ndraws = 1000

# Do factorisation (can store this and use it again later)
L = np.linalg.cholesky(cov)

# Get 3*Ndraws Gaussian random variables (mean=0, variance=1)
norm = np.random.normal(size=Ndraws*3).reshape(3, Ndraws)

# Construct final set of random numbers (with correct mean)
rand = mean + np.dot(L, norm)

Of course, this now introduces two new mysteries: How do you efficiently implement Cholesky decomposition, and how do you generate univariate Gaussian random numbers? Those are exercises for another day, I think.

* It depends on how sloppy I’m feeling; of course, you should validate everything that matters, regardless of who wrote it.

About Phil Bull

I'm a theoretical cosmologist, currently working as a NASA NPP fellow at JPL/Caltech in Pasadena, CA. My research focuses on the effects of inhomogeneities on the evolution of the Universe and how we measure it. I'm also keen on stochastic processes, scientific computing, the philosophy of science, and open source stuff. View all posts by Phil Bull

3 responses to “Drawing random numbers from a multivariate Gaussian distribution

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: