I like software libraries. A software library means that someone has already done some of the heavy lifting for me. A software library means that I won’t have to reinvent the wheel. It means that I *might* not have to spend my afternoon fixing bugs and stringently validating my own code*.

It also means that I don’t necessarily have a good idea of what the computer is doing “behind the scenes”; that I’m potentially unaware of the clever (or not) algorithm that is being implemented, and what it’s limitations (or bugs) are.

For a long time now, I’ve been blithely accepting that numpy.random.multivariate_normal applies some sensible, but mysterious, procedure in order to generate random variables from a multivariate Gaussian (normal) distribution. And it does – I just never tried to unravel said mystery. So, prompted by Dag Sverre, and courtesy of Wikipedia, I present a Python snippet that draws multivariate Gaussians given (a) some way of generating a set of standard univariate Gaussian random numbers, and (b) a Cholesky decomposition routine. In all, it’s about 10% faster than the NumPy routine, which applies a bunch of validity tests that I don’t.

`import numpy as np`

```
```# Define mean, covariance matrix, and no. of samples required

mean = ... # some array of length N

cov = ... # some positive-definite NxN matrix

Ndraws = 1000

# Do factorisation (can store this and use it again later)

L = np.linalg.cholesky(cov)

# Get 3*Ndraws Gaussian random variables (mean=0, variance=1)

norm = np.random.normal(size=Ndraws*3).reshape(3, Ndraws)

`# Construct final set of random numbers (with correct mean)`

rand = mean + np.dot(L, norm)

Of course, this now introduces two new mysteries: How do you efficiently implement Cholesky decomposition, and how do you generate univariate Gaussian random numbers? Those are exercises for another day, I think.

* It depends on how sloppy I’m feeling; of course, you should validate everything that matters, regardless of who wrote it.

### Like this:

Like Loading...

*Related*

## About Phil Bull

I'm a theoretical cosmologist, currently working as a NASA NPP fellow at JPL/Caltech in Pasadena, CA. My research focuses on the effects of inhomogeneities on the evolution of the Universe and how we measure it. I'm also keen on stochastic processes, scientific computing, the philosophy of science, and open source stuff.

View all posts by Phil Bull
March 2nd, 2013 at 7:36 pm

Hi, can this method be used for generating artifical samples from the original data space for Hopkins statistics.

March 2nd, 2013 at 8:56 pm

I don’t see why not. It should work fine for anything that needs a sample of Gaussian-distributed random variables.

March 3rd, 2013 at 9:56 am

Great thanks.