Tag Archives: numpy

Making Numpy arrays read-only

The CMB analysis framework that we’re writing at the moment needs to cache certain data (e.g. sky maps, maps of noise properties etc.) for quick access during various stages of the analysis. It’s important that other parts of the program don’t alter the cached information directly, since it could fall out of sync with the current state of the MCMC chain. To protect the cache, then, we need to make certain arrays read-only.

An easy way of doing this for Numpy arrays is to use the setflags() method on the array. The flag for modifying read/write access is, unsurprisingly, write, so to make an array called arr read-only, you would simply call arr.setflags(write=False). If you subsequently try to modify the array, it raises a ValueError, like so:

>>> a = np.arange(6)
>>> a.setflags(write=False)
>>> a[4] = 6
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: assignment destination is read-only

Advertisements

Recent numpy/matplotlib finds: Log plotting and map projections

With a change of research focus comes a change of tools. I’m fiddling with clusters of galaxies at the moment (GR stuff has been put slightly to one side for a little while), so I’ve been re-educating myself in the art of handling catalogues. I discovered a handful of neat numpy/matplotlib routines in the process, which I thought I’d share:

  • Map projections: You can project plots differently by using the projection keyword argument for subplot, e.g. subplot(111, projection="mollweide")
  • Logarithmic sampling: If you need to sample some function on a logarithmic interval (e.g. a cluster pressure profile), use numpy’s logspace function instead of the usual linspace.
  • Split log axes: Sometimes you want log axes for a plot that includes both positive and negative values. Rather than messing around with taking the absolute value of the negative numbers and then changing the line style, you can use SymmetricalLogScale.

NumPy Tip: The index of the array element nearest to some value

Courtesy of Stack Overflow, here’s a little one liner that I use surprisingly frequently. Suppose that you’ve been using an ordered array of numbers arr to do some interpolation and, now that you have the interpolated value v, you want to return only the slice of the array that has arr[i] < v. To do this, you need to find the array index of the element that has the value closest to v, like so:

idx = (np.abs(arr - v)).argmin()

Simple. This finds the deviation between each array element and the value, takes its absolute value, and then returns the index of the element with the smallest value in the resulting array. This, of course, is the number that is closest to v. You can then slice arr using idx, as you would with any other list or array.


NumPy tip: Turn on error reporting when debugging

It’s been a relatively varied week so far. We got back (and discussed) the referee report for our recent paper on proving homogeneity with a generalised kinematic Sunyaev-Zel’dovich effect, I sent off a couple of comments on a nice draft paper by some collaborators of mine, and we’ve had a busy couple of schools outreach evenings. I’m also now a trained STEM ambassador. But most of my time this week has been sunk into getting a couple of NumPy/SciPy-based codes working properly.

I’ll keep it short and sweet: If you’re writing a numerical code in Python (integrating an ODE, or solving some equation with a root finder, or whatever), you might like to turn on floating point error reporting while you debug. Normally, NumPy has the good grace to suppress floating point errors – who wants to write an error handler for every time you have 0/0 in some plotting code? – but there are plenty of applications where you want to catch NaNs and infs as soon as they happen rather than waiting until the end of the calculation to inspect the result. Yeah, I’m looking at you, scipy.integrate.odeint(). Well, it turns out that NumPy has a nice and easy convenience function to control floating point error reporting: numpy.seterr(). Call it at the very top of your code, like this:

numpy.seterr(all='raise')

That’s all you need. Now, when your code comes across an invalid floating point number, it’ll raise an error ASAP and you can have fun debugging the problem. Don’t forget to remove it when you’ve fixed everything up.

As usual, I can’t believe how long it took me to find this.