Book Review: How Linux Works 2

I received a review copy of How Linux Works 2, by Brian Ward, from the lovely folks at No Starch Press (my publisher) late last year. Inexcusably, it’s taken me until now to put together a proper review; here it is, with profuse apologies for the delay!

How Linux Works 2 is a very nice technical read. I’ve been a user and administrator of Linux systems for over a decade now, and can safely say I learned a lot of new stuff from both angles. Newer users will probably get even more from it – although absolute beginners with less of a technical bent might be better off looking elsewhere.

The book fills something of a niche; it’s not a standard manual-type offering, nor is it a technical system reference. It’s more impressionistic than either of those, written as a sort of overview of the organisation and concepts that go into a generic Linux system, although with specific details scattered throughout that really get into the nuts and bolts of things. If you’re looking for “how-to”-type instructions, you’re unlikely to find everything you need here, and it isn’t a comprehensive reference guide either. But if you’re technically-minded and want to understand the essentials of how most Linux distros work in considerable (but not absolute) depth, with a bit of getting your hands dirty, then it’s a great book to have on your shelf.

Various technical concepts are covered ably and concisely, and I was left with a much better feeling for more mysterious Linux components – like the networking subsystem – than I had before. There are practical details here as well though, and you’ll find brief, high-level overviews of a number of useful commands and utilities that are sufficient to give a flavour of what they’re good for without getting too caught up in the (often idiosyncratic) specifics of their usage.

That said, the author does sometimes slip into “how-to” mode, giving more details about how to use certain tools. While this is fine in moderation, the choice of digression is sometimes unusual – for example, file sharing with Samba is awarded a whole six pages (and ten subsections) of usage specifics, while the arguably more fundamental CUPS printing subsystem has to make do with fewer than two pages. The discussion of SSH is also quite limited, despite the importance of this tool from both the user’s and administrator’s perspective, and desktop environments probably could have done with a bit more than a brief single-chapter overview. Still, this book really isn’t intended as a manual, and the author has done well not to stray too far in this direction.

A common difficulty for Linux books is the great deal of variation between distros. Authors often struggle with where to draw the line between complete (but superficial) distro-agnostic generality and more useful, but audience-limiting, distro specifics. How Linux Works 2 succeeds admirably in walking this tightrope, providing sufficient detail to be useful to users of more or less any Linux system without repeatedly dropping into tiresome list-like “distro by distro” discussions. The approach isn’t always successful – the proliferation of init systems in modern distros has necessitated a long and somewhat dull enumeration of three of the most common options, for example – but HLW2 does much better at handling this than most books I’ve seen. The upshot is that the writing is fluid and interesting for the most part, without too many of the “painful but necessary” digressions that plague technical writing.

Overall, this book is an enjoyable and informative read for anyone interested in, well, how Linux works! You’ll get an essential understanding of what’s going on under the hood without getting bogged down in minutiae – making this a very refreshing (and wholly recommended) addition to the Linux literature.

You can find a sample chapter and table of contents/index on the No Starch website.


Reducing the padding in matplotlib figures

By default, matplotlib figures have a lot of padding around them. This is quite annoying when producing plots for publication, as you end up with a load of useless whitespace that you have to compensate for by screwing around with the figure sizing and position (especially in LaTeX articles).

Fortunately, matplotlib has a neat convenience function that removes all of this unnecessary whitespace in one fell swoop. It’s as easy as this:

import matplotlib.pyplot as plt
plt.tight_layout()

See the figures below for a comparison. This function also takes a ‘pad’ argument that lets you fine-tune the padding manually.
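To get a feel for what the pad argument does, here is a minimal sketch that compares the position of the axes before and after the call. The Agg backend, the toy data and the pad value of 0.5 are all assumptions made for the sake of the example; they are not taken from the plots shown in this post.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # toy data, made up for this example
ax.set_xlabel("x")
ax.set_ylabel("y")

# Axes position (in figure fractions) with the default padding
before = ax.get_position()

# pad is given as a fraction of the font size; smaller values mean a tighter fit
fig.tight_layout(pad=0.5)
after = ax.get_position()

print("before:", before.bounds)
print("after: ", after.bounds)
```

The call should come after all of the plotting commands (labels, titles and so on), since it works out the margins from whatever is on the figure at that point.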

Oh, how I wish I’d known about this three years ago.

Figure: Matplotlib figure with normal padding.

Figure: Matplotlib figure with minimal padding.


Making Numpy arrays read-only

The CMB analysis framework that we’re writing at the moment needs to cache certain data (e.g. sky maps, maps of noise properties etc.) for quick access during various stages of the analysis. It’s important that other parts of the program don’t alter the cached information directly, since it could fall out of sync with the current state of the MCMC chain. To protect the cache, then, we need to make certain arrays read-only.

An easy way of doing this for Numpy arrays is to use the setflags() method on the array. The flag for modifying read/write access is, unsurprisingly, write, so to make an array called arr read-only, you would simply call arr.setflags(write=False). If you subsequently try to modify the array, it raises a ValueError, like so:

>>> import numpy as np
>>> a = np.arange(6)
>>> a.setflags(write=False)
>>> a[4] = 6
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: assignment destination is read-only
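As a rough sketch of how this might be used to protect a cache, the toy class below stores a read-only copy of each array it is given. The MapCache class, its put() and get() methods and the "sky_map" key are all hypothetical names invented for this example; this is not the code from our framework.

```python
import numpy as np

class MapCache:
    """Toy cache that hands out read-only arrays (hypothetical, for illustration)."""

    def __init__(self):
        self._store = {}

    def put(self, name, arr):
        # Copy first, so later changes to the caller's array cannot leak into the cache
        frozen = np.array(arr, copy=True)
        frozen.setflags(write=False)
        self._store[name] = frozen

    def get(self, name):
        return self._store[name]

cache = MapCache()
sky = np.zeros(4)
cache.put("sky_map", sky)
sky[0] = 1.0  # only the caller's own array changes

cached = cache.get("sky_map")
try:
    cached[0] = 5.0
    modified = True
except ValueError:
    modified = False  # the assignment was refused

print(cached, modified)
```

Bear in mind that this guards against accidents rather than determined tampering: since the cached copy owns its data, other code could still call setflags(write=True) on it.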


Book review: Think Like a Programmer, by V. Anton Spraul

My publisher, No Starch Press, sent me a review copy of another of their books a few months ago. Regrettably I’ve been a bit slow in getting around to properly reading it, but here, finally, is my review of Think Like a Programmer, by V. Anton Spraul.

Front cover of Think Like a Programmer

Programming is as much an art as it is a science. When you’re starting out as a programmer, there’s a big mess of concepts and rules to get into your head before you can do anything much more complicated than printing out a shopping list on-screen – things like differences between different types of variable, and how pointers work. Even if you’re dealing with a language that hides all of these icky details, like Python, you’re still going to find yourself spending most of your time learning about specific structures like for loops or class declarations.

Most books for the newbie programmer focus on these mechanical details, generally with specific application to only one programming language. And quite rightly so, in my opinion; after all, it’s only by learning this stuff that you’re ever going to be able to do anything interesting. And so it’s only when you start programming regularly, or try to do anything more substantial, that the “artistic” side of programming starts to become really important.

With Think Like a Programmer, Spraul has his sights firmly set on the people who’ve already put the time in with the initial “science” bit. They have a decent amount of experience with one, maybe two, programming languages, and have progressed beyond textbook exercises or copy-pasting together code snippets to actually writing their own functioning code from scratch. But it’s at this point that a lot of people get stuck, with some of them never moving far beyond this predominantly mechanical understanding. Without a firm push in the right direction, habits can develop, experience reinforces them, and the art of programming never blooms.

It’s a philosophy, not a cookbook

Think Like a Programmer is a 233-page push in the right direction. It’s not a pattern book, with endless lists, block diagrams, and flowcharts for deciding when to use one tried-and-tested program structure over another. It’s not a cookbook, with myriad clever, practical examples to use as “inspiration”. Nor is it an “advanced programming” textbook, with detailed treatises on using the obscure, God-level features of whatever programming language you’re most concerned with. No. While it does have elements of all of these things here and there (they’re useful, after all), it’s much more concerned with your attitude to programming – how you approach solving problems with code. Your coding philosophy.

It would be easy to churn out a book on this subject that goes little farther than chastising you into adopting good coding style (how to indent blocks? where to add comments? how to name variables?), and pointing at a few useful patterns that relative newcomers often don’t know about. But Spraul has gone far beyond this. He starts off with a general discussion on how to solve problems, using examples of the type you might find somewhere near the back of a newspaper. The puzzles he walked through were really fun, and it felt good to go from blindly fiddling around with them to successfully applying the strategies he suggested. The lesson he’s teaching here is to sit down, think about the problem, and start using powerful general-purpose approaches like reducing it into smaller sub-problems, being systematic, and so forth. And, hey, whaddaya know? It works. Thinking works!

It’s a journey, not a destination

Things progress from there into the programming domain. In each successive chapter, Spraul builds on the discussion that has gone before, introducing new, generic, problem-solving approaches and combining them with the methods discussed previously to solve progressively harder example problems. The examples are followed through in excellent explanatory detail, with an emphasis on the structure and logic behind the code rather than the particular language features that are used (though all the examples are in C++, which has its fair share of relatively opaque syntax). He also strongly encourages you to try the numerous exercises at the end of each chapter to cement what you’ve learned, an approach that, while it may sound dry and textbooky, really does help you to get to grips with things.

There are chapters on general programming/problem-solving approaches, plus more specialised ones on using important tools like arrays, classes, pointers, and recursion to your advantage (the recursion one is available to download for free). The discussion is kept general – there’s little in here about specific optimisation or debugging tricks, and rather more of an emphasis on just writing good code, regardless of the specific application you might have in mind. As such, if all you’re after is quick tips for making your code run better, you’re not going to get that much out of the book. If you’re willing to sit down and patiently follow it through, however, you’ll find that what it’s teaching you is an enlightened approach to programming – essentially, how to become a Good programmer with a capital-G. And in the long run, that’s what’s going to make the difference between scraping by, writing cobbled-together solutions that just about work, and outputting truly nice, effective ones.

That troublesome audience

This is all well and good – noble, even – but I can’t help but wonder if the book will get through to its intended audience. If a novice programmer picks this up, there’s a good chance that they’ll struggle with the choice of language. While it’s a sensible pick for showing off the concepts the author is interested in, C++ isn’t the easiest language in the world, and Spraul isn’t afraid of using some of its more obscure syntax with only the briefest of explanations. For someone with experience only of Python, for example, the one-page overview of how pointers work in C++ isn’t going to leave them much wiser on the subject. There’s absolutely nothing to help the non-C++-using reader get set up with compilers and IDEs either, which could cause some serious headaches for those actually wanting to play with the examples themselves. As a result, those familiar with C++ will find the book a considerably easier ride than those who are not, which is a shame.

The purpose of the book is also a bit more subtle than your average (the blurb describes it as a “one-of-a-kind text”). You know what you’re getting with a cookbook, whereas the benefits brought by Think Like a Programmer are somewhat less tangible. The readers who’ll get the most out of this are the patient, motivated learners, whereas those who’re looking for shortcuts and quick fixes to “becoming a better programmer” will likely find it frustrating. Ultimately, I guess that’s fine – you can only help those who will be helped – but I suspect this sort of presentation would be more effective for a broader range of people in the context of a taught course rather than self-study.

Verdict

The book is well-written, with tons of excellent advice and solid, well-thought-out examples. If you’re willing to devote some time to studying the material (perhaps, depending on your background, with a C++ reference in hand), you’ll soon find yourself equipped with an impressive array of problem-solving strategies and, maybe, a new outlook on programming. Recommended.


Calling the WMAP likelihood code from C/C++

If you’re interested in cosmological parameters, chances are you’ll want to include WMAP CMB constraints in your parameter estimation code at some point. Happily, the WMAP team have made their likelihood code publicly available, so it’s relatively simple to add these constraints to your own MCMC software (or whatever else you’re using). Less happily, the likelihood code is written in Fortran 90, so users of more modern languages will need to do a bit of fiddling to get it to play ball with their code.

Joe Zuntz was kind enough to share a little C wrapper that he wrote for the WMAP likelihood code. It’s pretty easy to follow, although you should of course read the documentation for the likelihood code to see how to use it properly. First of all, compile the original Fortran WMAP likelihood code. Then, compile this wrapper function as a static library (libwmapwrapper) as follows:

gfortran -O2 -c WMAP_likelihood_wrapper.F90
ar rc libwmapwrapper.a WMAP_likelihood_wrapper.o

You can call the wrapper function from your C code by linking to that library. Joe has written a test implementation that shows how it works. To compile this, you’ll need to make sure you’re linking against everything the WMAP code needs, including a BLAS/LAPACK library; the following should work on Mac OS X (using veclib):

gcc wmap_test.c -std=c99 -L. -lwmap -lwmapwrapper -lcfitsio -framework veclib -lgfortran -o test_wmap

(N.B. Joe’s code is written for the WMAP 7-year release, so you may need to change a couple of numbers to get it working with the 9-year release.)


Customising contour plots in matplotlib

So, you need to include a contour plot in some publication of yours, little one? There are two things that you must learn. But beware! The first will raise your spirits, while the second will quicken your descent into madness. These facts I address to you, should you stand to read them: (1) matplotlib has a really customisable contour plot implementation; (2) the convenience functions are few, and the necessary keyword arguments are confusing.

I’m prepping a bunch of contour plots for a publication in a quick letter at the moment, and want to improve their legibility. This involves things like making all of the lines thicker, increasing the font size of the labels (both on the axes and on the contours themselves), and changing the contour scaling. Some of these modifications have proven more difficult than others with matplotlib, so I thought I’d jot down a couple of examples here for reference. (The matplotlib contour examples are useful, but don’t cover everything.)

Tick size

First of all, let’s change the width and length of the ticks on each axis, and the size of the font of the labels for each tick.

import pylab as P

MP_LINEWIDTH = 2.4
MP_TICKSIZE = 10.

P.rc('axes', linewidth=MP_LINEWIDTH)

P.subplot(111)
for tick in P.gca().xaxis.get_major_ticks():
    tick.label1.set_fontsize(20.)  # size of the tick label text
    tick.tick1line.set_markeredgewidth(MP_LINEWIDTH)  # tick thickness
    tick.tick2line.set_markeredgewidth(MP_LINEWIDTH)
    tick.tick1line.set_markersize(0.5*MP_TICKSIZE)  # tick length
    tick.tick2line.set_markersize(0.5*MP_TICKSIZE)

The call to P.rc() changes the thickness of all of the axis lines on the plot (i.e. it changes the global properties, and isn’t restricted to just one plot). The other stuff could probably be done using calls to rc(), but that’s an exercise for another time.

Plot of Compton y-distortion.

An example of a customised contour plot in matplotlib.

The loop is over all major ticks on the x axis of the current subplot. (The call to gca() gets the current set of plot axes, but of course you could use the xaxis.get_major_ticks() method from any previously-defined axes object.) The object tick1line is the tick mark drawn on the x axis at the bottom of the plot, and tick2line is its counterpart at the top.

Inside the loop, I’m setting the font size of the tick label (the number that appears below each tick), the width of the tick (using the markeredgewidth property), and the length of the tick (using markersize). You can also do things like hiding certain ticks, hiding certain labels (especially useful if you want to remove the tick label at the origin, because it overlaps with the label for the other axis there), changing the appearance of gridlines (somewhat unintuitively), and so on. There’s a list of all the tick-related objects that you can change here.

This will only change the tick styles for the major ticks on the x axis. To cover all of the ticks on the plot, you’ll need to loop through all the minor ticks using P.gca().xaxis.get_minor_ticks() too, and then do both major and minor ticks for the y axis (using P.gca().yaxis) as well.
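Putting that together, the loops over both axes and both kinds of tick could be wrapped up in a small helper, sketched below. The function name style_all_ticks, its default values, the toy data and the choice to draw minor ticks at half the length of major ones are all assumptions of mine. Minor ticks are switched off by default on linear axes, hence the call to minorticks_on().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt

def style_all_ticks(ax, width=2.4, length=10.0, labelsize=20.0):
    """Style major and minor ticks on both axes (hypothetical helper)."""
    for axis in (ax.xaxis, ax.yaxis):
        # Minor ticks get half the length of major ticks (an arbitrary choice)
        groups = ((axis.get_major_ticks(), 1.0), (axis.get_minor_ticks(), 0.5))
        for ticks, scale in groups:
            for tick in ticks:
                tick.label1.set_fontsize(labelsize)
                for line in (tick.tick1line, tick.tick2line):
                    line.set_markeredgewidth(width)       # tick thickness
                    line.set_markersize(scale * length)   # tick length

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])  # toy data, made up for this example
ax.minorticks_on()
style_all_ticks(ax)
```

Because this styles the tick objects that exist when it is called, it is probably safest to call it once the axis limits are final.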

Changing which values contours are drawn at, and how they are labelled

To change where the contours are drawn, you need to change the contour locator. This is a keyword argument to contour(), and requires a ticker object. There’s a list of built-in tickers here. In the example below, I wanted a log scaling for my plots, so I used LogLocator().

It’s also useful to be able to change the labels that are added to the contours. For this, you need to use the clabel() function (which controls contour labels) and manipulate the formatter, using the fmt keyword. A list of formatters is given here, with more documentation on them (including formatter-specific keyword arguments for further customisation) given further down that page. Particularly useful formatters include:

  • LogFormatterMathtext (which adds LaTeX mathmode labels, especially useful for numbers with exponents)
  • FormatStrFormatter (which lets you use a C-like format string to determine how significant figures, signs, etc. are displayed)
  • FuncFormatter (which lets you define a custom function to handle string formatting)

Setting the font size of the contour label is as easy as changing the fontsize keyword of clabel().

Finally, a subtlety of contour() is its use of the linewidths keyword, rather than linewidth (as with other pylab functions). It works exactly the same otherwise.

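from matplotlib import ticker
ctr = P.contour(X, Y, Z, locator=ticker.LogLocator(), colors='k', linewidths=3.0)
fmt = ticker.LogFormatterMathtext()
fmt.create_dummy_axis()  # a standalone formatter has no axis attached; newer matplotlib may need one
P.clabel(ctr, inline=1, fontsize=20., fmt=fmt)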
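For something you can run from start to finish, here is a self-contained sketch along the same lines. The data are synthetic (a Gaussian bump on a small positive floor, chosen only so that Z is strictly positive and spans several decades), and I’ve used the pyplot object interface rather than pylab. The call to create_dummy_axis() follows matplotlib’s own contour label demo, which does the same before handing a LogFormatterMathtext to clabel().

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import ticker

# Synthetic, strictly positive data spanning several decades (made up for this example)
x = np.linspace(-3.0, 3.0, 200)
y = np.linspace(-3.0, 3.0, 200)
X, Y = np.meshgrid(x, y)
Z = 1e3 * np.exp(-(X**2 + Y**2)) + 1e-3

fig, ax = plt.subplots()

# Log-spaced contour levels, drawn as thick black lines
ctr = ax.contour(X, Y, Z, locator=ticker.LogLocator(), colors='k', linewidths=3.0)

# Label the contours with mathtext powers of ten
fmt = ticker.LogFormatterMathtext()
fmt.create_dummy_axis()  # the formatter is not attached to a real axis here
labels = ax.clabel(ctr, inline=True, fontsize=20.0, fmt=fmt)

print("contour levels:", ctr.levels)
```

Printing ctr.levels is a quick way to check which values the locator actually picked before fiddling with the labels.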


Update: Floating-point exception handling on Mac OS X

A few months ago, I was having a few problems porting some C++ code over to Mac OS X because of some non-standard floating-point exception handling functions that are present in glibc on Linux. Well, it just so happened that a colleague of mine, Rich Booth, recently ran into the same problem, only from a different angle. He wanted to keep track of floating point exceptions in a simulation code of his, but found that he couldn’t do this on his Mac.

After a bit of digging around, we found a portable implementation of floating point exception handling that would happily run on both Linux and Mac OS X. It was written by David N. Williams in 2009, and includes some good documentation on the implementation in the comments. There’s also an example program right at the end of the file, so you can test it out right away.

The code should be simple enough to figure out pretty quickly, but Rich split out a header file anyway, just to make everyone’s lives that bit easier. You can find a tarball with Rich’s modifications here.