Rich Booth gave us an introduction to using OpenMP to parallelise our code. It turns out to be surprisingly easy – all you need to do is add a specially-formed directive (a `#pragma` in C, a structured comment in Fortran) here and there, and OpenMP does the rest. At its most basic, OpenMP takes a serial code (code meant to be run on a single processor) and splits the work between multiple threads, which can be run on multiple processors. This (hopefully) speeds up code execution by sharing the load. Rich gave examples for OpenMP in C and Fortran, but there are other parallelisation tools out there for other languages, like the multiprocessing module in Python. It was also mentioned that Matlab has excellent multi-processing capabilities, which tend to be quite easy to use.
Building on this introduction, Rich showed us some basic examples of how for loops can be parallelised. He also discussed concepts like:
- Scheduling: how the workload should be split up between different threads to make things as fast as possible (different scheduling strategies are available)
- Scope: which variables should be private to a thread, which should be shared between all threads, and how to specify which is which
- Preventing race conditions: ensuring that shared variables are updated in a coherent way by individual threads, e.g. by using atomic operations
- Functions: bundling code into a function which is then called from inside the parallelised block of code, to make it easier to keep track of private and shared variables
Other topics that were touched on include:
- The difficulty of debugging parallelised code (the best way is to avoid having to debug at all, by following a few good practices)
- The fact that OpenMP-enabled code can be run in “serial mode” by compiling with a standard compiler instead (the compiler will just throw up a few spurious warnings when it gets to the OpenMP bits)
- That it’s best to only use OpenMP to parallelise a few well-defined tasks, like computationally-intensive for loops, rather than trying to parallelise everything (which can take a long time to code properly)
All in all, OpenMP looks like a nice, relatively stable way of speeding up operations that can be parallelised. It’s a lot simpler to implement than I thought, and doesn’t seem to require wholesale code rewrites or anything as serious as that. There’s quite a bit of introductory tutorial material on the web, along with a few blogs dedicated to parallel programming. As usual, Wikipedia is helpful on the concepts.