BICEP2: Impact on cosmological parameters

Antony Lewis and Shaun Hotchkiss are posting/tweeting some preliminary CosmoMC results for Planck + BICEP2. Here’s a brief list of what they’ve put out so far:

Check out Shaun’s Twitter feed for the latest, plus some initial analysis.

[Update: Antony has now posted a PDF with several tables of joint constraints from Planck + BICEP2.]


How solid is the BICEP2 B-mode result?

Phew! An exciting day indeed, so I’ll jot down a few notes to recap what happened.

The BICEP2/Keck experiments detected B-modes at large angular scales in the polarisation of the CMB. They released two papers and some data online just as the announcement was made, which you can find here. Not all of the data, mind, but it’s plenty to go on for now.

Their interpretation of the data is that they detect a bump at low-ell that is characteristic of primordial B-modes generated by inflation. If true, this is super exciting, as it gives us a (sort of, but not really) direct detection of gravitational waves, and opens up a new window on the very early Universe (and hence extremely high energy scales). People are even saying it’s a probe of quantum gravity, which I guess is sort of true. Furthermore, they find a best-fit value of the tensor-to-scalar ratio of r = 0.20 +0.07/-0.05, which is a significantly higher value than many inflationary theorists would have expected, but which counts as a very firm detection of r. This will surely shake up the inflation people in the coming months.

Null tests

There do appear to be some issues with the data – as there always are for any experiment – but it’s not clear how important they are. In particular, Hans Kristian Eriksen points out that their null tests look a bit fishy at first glance. Check out the blue points in the plot to the right. These are the null test for the BB power spectrum, which you’d expect to be consistent with zero if everything is hunky dory. And they are! The problem is that they look too consistent – absolutely all of the errorbars overlap with zero. You’d naively expect about a third of the points to have their errorbars not overlapping with zero, since they represent a 68% confidence level – on average, 32% of samples should lie more than one errorbar away. This isn’t the case.

BB power spectrum from BICEP2/Keck, annotated with a line that suggests they haven't detected the "bump" characteristic of inflationary B-modes

What does this mean? Well, maybe they don’t have a perfect handle on their noise levels. If they overestimate the noise, the errorbars are larger than they should be, and the null tests look more consistent than they really are. This could hide Bad Things, like systematics. (I’m certainly not saying they purposefully inflated their errorbars, by the way; just that something doesn’t quite add up with them. This happens very commonly in cosmology.) But hey, maybe this is a relatively minor issue.
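To make the errorbar argument a bit more concrete, here is a toy simulation (my own sketch, not anything from the BICEP2 analysis). It draws pure-noise "null test" points and counts how often they land more than one quoted errorbar from zero, first with correct errorbars and then with errorbars inflated by 50%. The number of points (9), the inflation factor, and the assumption that the points are independent and Gaussian are all illustrative choices.

```python
import random

def fraction_outside(n_points, true_sigma, quoted_sigma, n_trials=20000, seed=1):
    """Fraction of null-test points lying more than one quoted errorbar from zero."""
    rng = random.Random(seed)
    outside = 0
    for _ in range(n_trials * n_points):
        if abs(rng.gauss(0.0, true_sigma)) > quoted_sigma:
            outside += 1
    return outside / (n_trials * n_points)

n_points = 9  # assumed number of points in the null spectrum, for illustration only

# Errorbars that match the true noise level
correct = fraction_outside(n_points, true_sigma=1.0, quoted_sigma=1.0)

# Errorbars overestimated by 50% (hypothetical inflation factor)
inflated = fraction_outside(n_points, true_sigma=1.0, quoted_sigma=1.5)

# Chance that *every* point sits within one errorbar, if the errorbars are
# right and the points are independent
p_all_inside = 0.6827 ** n_points

print(correct, inflated, p_all_inside)
```

With correct errorbars roughly a third of the points stray past one errorbar, and the chance of all nine behaving themselves is only a few percent. Inflating the errorbars makes the "too consistent" picture much more likely. Real bandpowers are correlated, so treat the numbers as a rough guide only.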

You also see this in Table I of the results paper [pdf], where they quote the results of their “jackknife” tests. The idea behind jackknife tests is explained reasonably well here (Section 7) – you cut up your data into two roughly equal halves that should have the same signal, but might be subject to different systematics, and check to see if they’re statistically consistent with one another. If not, you’ve identified a systematic that you need to deal with.

The consistency test normally involves subtracting one sub-set from the other, and checking if the result significantly differs from zero. For example, you might imagine splitting the data depending on what time of year it was taken: Spring/Summer vs. Autumn/Winter, for example. If the two are inconsistent, then you’re seeing some sort of seasonal variation, which is most likely a systematic that you didn’t account for rather than a real time-dependence in your data…
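A minimal sketch of that split-and-difference idea, using made-up data: two halves share the same toy signal, the difference is compared against zero with a chi-squared statistic, and a constant offset added to one half plays the role of a seasonal systematic. The signal shape, noise level and offset are all invented for illustration.

```python
import math
import random

def jackknife_chi2(half_a, half_b, sigma):
    """Chi-squared of the difference between two data halves against zero.

    Each half has independent Gaussian noise `sigma` per point, so the
    difference has variance 2 * sigma**2.
    """
    var_diff = 2.0 * sigma ** 2
    return sum((a - b) ** 2 / var_diff for a, b in zip(half_a, half_b))

rng = random.Random(42)
signal = [math.sin(0.3 * i) for i in range(50)]  # toy signal common to both halves
sigma = 0.1

spring = [s + rng.gauss(0.0, sigma) for s in signal]
autumn = [s + rng.gauss(0.0, sigma) for s in signal]
autumn_biased = [x + 0.2 for x in autumn]  # toy seasonal systematic: constant offset

# The signal cancels in the difference, so only noise (and systematics) remain.
chi2_clean = jackknife_chi2(spring, autumn, sigma)
chi2_biased = jackknife_chi2(spring, autumn_biased, sigma)

print(chi2_clean, chi2_biased)
```

For the clean split the statistic should scatter around the number of points (50 here), while the biased split comes out much larger. In a real analysis the variance of the difference would come from simulations rather than a single known sigma.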

Anyway, Table I quotes the results of a whole battery of jackknife tests. Great. But things are still a bit fishy. Why do three of the tests have a probability to exceed (PTE) of 0.000, for example? (Up to rounding error, this actually means p < 0.0005.) What are the odds of that happening? PTE’s should be uniformly distributed. For the 14 x 12 jackknife tests that have been used, the odds of getting three results with p < 0.0005 from a uniform distribution are a bit slim – you could maybe get away with one, but not three. So there’s maybe some inconsistency here. It could be the data, it could be to do with the simulations they’ve used to calculate the PTE’s, I don’t know. Or maybe I’ve missed something. But the problem gets worse if you think the errorbars are overestimated; shrinking the errorbars will shrink the width of the simulated distribution, and the observed value will look less and less consistent – the PTE’s will fall across the board.

[Update: Christopher Sheehy comments below that the three null test failures were apparently just typos. The BICEP2 team have updated the paper on arXiv, and now there's only one PTE < 0.0005 in the table.]
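The "one, maybe; three, no" estimate can be checked with a quick binomial calculation. This sketch assumes the 14 x 12 tests are independent, which they certainly aren't in practice, so it is only a rough guide.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k))

n_tests = 14 * 12   # number of jackknife tests quoted in the text
p_low = 0.0005      # chance that a single uniform PTE rounds to 0.000

p_one_or_more = prob_at_least(1, n_tests, p_low)
p_three_or_more = prob_at_least(3, n_tests, p_low)

print(p_one_or_more, p_three_or_more)
```

Under these assumptions, at least one PTE below 0.0005 turns up roughly 8% of the time, which is unremarkable. Three or more is of order one in ten thousand.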

(Quick note: the PTE, as I understand it in this context, is the probability that a value drawn from their simulations will be greater than the observed value. So a PTE of 0.9 means that there’s a 90% chance a randomly-chosen simulated value will be greater than the observed value, which would be good here – it means the observed value is well within what they expect from simulations, so it would be consistent with no systematic effect being present. Low PTE’s are bad, since they mean the observed value is less consistent with your expectations. You should normally expect to see some low PTE’s, however, and the number of very low PTE’s that can be tolerated depends on how many tests you did. More tests means you expect more low PTE’s.)
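In code, that definition of the PTE is just a counting exercise. The sketch below uses chi-squared statistics of pure-noise draws as a stand-in for the simulations; the number of degrees of freedom (9), the number of simulations (2000) and the "observed" values are arbitrary choices of mine, not BICEP2's.

```python
import random

def pte(observed, simulated):
    """Fraction of simulated statistics that exceed the observed one."""
    return sum(1 for s in simulated if s > observed) / len(simulated)

rng = random.Random(0)
n_dof = 9  # assumed number of points per test, for illustration only

# Toy stand-in for the simulations: chi-squared statistics of pure-noise spectra
sims = [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_dof)) for _ in range(2000)]

pte_typical = pte(9.0, sims)    # observed value in the bulk of the simulations
pte_outlier = pte(40.0, sims)   # observed value far out in the tail

print(pte_typical, pte_outlier)
```

Note that with N simulations the PTE can only be resolved in steps of 1/N, so a very small PTE estimated this way is really an upper limit.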

Excess/additive systematics

So that’s the blue points in the plot above. Now on to the black points and red/green lines. If you squint a bit, and ignore the dashed red lines, you can convince yourself that a straight line would fit the points quite well (green line; my addition). The point is that BICEP2 don’t clearly see the bump feature (the “smoking gun” of the inflationary B-modes) in their data; they just see an excess amplitude at low ell. Could something else cause this excess?

Constraints on tensor-to-scalar ratio, r, versus lensing amplitude

Imagine if there were no primordial B-modes, and you only had the lensing spectrum, which is the solid red line. If the lensing amplitude was increased, could you make it fit? Probably not; the lensing spectrum drops off too quickly at low-ell, so it would be difficult to capture the first two data points while staying consistent with everything at higher ell just by changing the amplitude. The BICEP2 team have tried this trick, in fact (see the plot on the right), and even allowing the lensing amplitude to vary by quite a large factor isn’t enough to explain the low ell power. So it still looks like a detection of non-zero r.

There’s also the issue of an excess at higher ell in the BICEP2-only results, as shown in the first plot, above (it seems to go away in the BICEP2 x Keck preliminary results). You could maybe imagine an additive systematic in the BB power spectrum that shifts a lensing-only BB spectrum upwards (roughly the green line). This would fit the data quite well, without any primordial contribution, although whether such an additive systematic is plausible or not I don’t know.

Other non-inflation stuff (primordial magnetic fields or some such, who knows) might explain the low-ell power too. All I’m saying here is that while the primordial B-mode seems to fit extremely well, the unique “bump” shape isn’t clearly detected, so maybe there are other explanations too. We’ll need to wait and see if anything else works.

Foregrounds

I’ve heard some minor grumbling about foreground subtraction, which I’ll only mention briefly. Polarised galactic dust seems to be the main worry, and they’re arguably not using the most fantastically realistic dust maps, although as they correctly point out it will probably have to wait until the next Planck release until something better is available. Their Fig. 6 shows the contribution to the polarisation that they’d expect from a bunch of foreground models, all of which are dwarfed by the detected signal. The implication is that foregrounds aren’t a major issue, but of course this statement is only as good as the models. Maybe Planck will see more polarised emission at the BICEP2 pointing than expected? We’ll have to wait and see, although it seems like a bit of a stretch to flag this up as a major concern.

Also, if I’m interpreting their paper correctly, it seems that they just subtract off the foreground templates with fixed amplitude, rather than fitting the amplitude (and propagating through the errors associated with doing this). Hey, this is what Commander is for. But I doubt that accounting for this would blow up their errorbars too much. Foreground subtraction does shift their best-fit value of r down to more like r=0.16, though, which is slightly less jarring than a full r=0.2. It doesn’t get rid of the detection, though.
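To illustrate what "fitting the amplitude and propagating the errors" means, here is a deliberately simple least-squares sketch. It is not what Commander does (that is a far more sophisticated sampler), and the template shape, noise level and true amplitude are all hypothetical. It also ignores the CMB signal itself, which a real fit would have to model jointly.

```python
import random

def fit_template_amplitude(data, template, sigma):
    """Least-squares amplitude of `template` in `data`, with its 1-sigma error.

    Assumes independent Gaussian noise of standard deviation `sigma` per point.
    """
    tt = sum(t * t for t in template)
    amp = sum(d * t for d, t in zip(data, template)) / tt
    amp_err = sigma / tt ** 0.5
    return amp, amp_err

rng = random.Random(3)
template = [1.0 / (1 + i) for i in range(20)]   # toy foreground shape (hypothetical)
true_amp, sigma = 2.0, 0.05
data = [true_amp * t + rng.gauss(0.0, sigma) for t in template]

amp, amp_err = fit_template_amplitude(data, template, sigma)

# Subtract the fitted foreground; amp_err then feeds into the final errorbars
cleaned = [d - amp * t for d, t in zip(data, template)]

print(amp, amp_err)
```

Subtracting a template with a fixed amplitude amounts to setting amp_err to zero, which is why fitting the amplitude can only widen the final errorbars.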

[Edit: Clive Dickinson left an insightful comment on the foreground issue below.]

Overall picture

The overall picture is that this is a serious result, which looks pretty good, but isn’t entirely free of holes. My gut feeling is that the claimed detection significance, and best-fit value of r, will go down with further analysis. I’d be surprised to see the detection go away entirely, though, unless they found a whopping systematic. We’ll have a better idea what’s going on when the Keck analysis has been completed and, after that, when the Planck polarisation data is released towards the end of the year.

So all I can say is, congratulations BICEP2!

(Thanks to Hans Kristian Eriksen and Yashar Akrami for useful discussions over the course of the day. Any errors in the above are my own, of course.)


B-modes? r=0.2 B-modes.

That is the question.

So it’s an exciting day, today! The cosmology community is a-buzz with rumours that BICEP2/Keck will announce the first detection of primordial B-mode polarisation of the CMB in a press conference this afternoon (technical presentation starting 3.45pm CET). Shaun Hotchkiss has a great summary of some of the coverage in the blogosphere, and the Guardian even jumped the gun a bit and posted an article on Friday.

The rumours are by now pretty convincing – there was some minor speculation that they might be announcing the discovery of an exo-moon instead (boooring), but it’s now been confirmed that the press conference is definitely about the CMB. So, some (brief) talking points, then:

  • The rumour is r=0.2 (where r is the tensor-to-scalar ratio). This is significantly higher than most people expect, and it seems to be in conflict with the Planck inflation mega-plot from early 2013. Not necessarily, though, if you allow the scalar spectral index to run. So which inflationary models would this kill? Will the poor Encyclopaedia Inflationaris guys have to redo everything?
  • Are primordial gravitational waves really a “smoking gun” for inflation? Here’s a relatively vitriolic review by Grishchuk for the entertainment of the contrarian in your life.
  • Foregrounds! Systematics! Here in Oslo, Hans Kristian Eriksen has a 1000 NOK bet running that the accepted value of r will be much lower (inconsistent with r=0.2 at 3 sigma) in 5 years’ time. This polarisation business is tricky, you know. There are also rumblings that the Southern Hole, which is the region of low galactic dust emission that BICEP2 is apparently looking through, might have higher-than-expected polarised foregrounds after all. Does this matter?
  • If it turns out that we need n_run to be non-zero for things to make sense, then we’ve added another parameter to the oh-so-simple, six-parameters-is-all-you-need standard model. I can almost hear the floodgates opening.
  • Of course, everyone should play BICEP2 Bingo during the press conference (by Joe Zuntz).

Here’s hoping for a fireworks-filled afternoon!


Reducing the padding in matplotlib figures

By default, matplotlib figures have a lot of padding around them. This is quite annoying when producing plots for publication, as you end up with a load of useless whitespace that you have to compensate for by screwing around with the figure sizing and position (especially in LaTeX articles).

Fortunately, matplotlib has a neat convenience function that removes all of this unnecessary whitespace in one fell swoop. It’s as easy as this:

plt.tight_layout()

See the figures below for a comparison. This function also takes a ‘pad’ argument that lets you fine-tune the padding manually.
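Here is a slightly fuller sketch showing the pad argument in use. The data, figure size and pad value are arbitrary, and the Agg backend is only selected so that the snippet runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
ax.set_xlabel("x")
ax.set_ylabel("y")

# pad is in units of the font size; the default is 1.08, smaller values trim more
fig.tight_layout(pad=0.3)

# Save to an in-memory buffer here; pass a filename instead to write a file
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)
```

Call tight_layout() after all of the labels and titles have been set, since it works out the padding from whatever is on the figure at the time.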

Oh, how I wish I’d known about this three years ago.

Matplotlib figure with normal padding

Matplotlib figure with minimal padding


Scientific Pilgrimages

One of the perks of being a scientist is that you end up travelling a hell of a lot, generally to places with universities and other sites of scientific interest (but not necessarily anything else of interest, as is sometimes the case…) This provides ample opportunity to do a bit of science sightseeing.

100" Hooker telescope, Mount Wilson Observatory

Over the past couple of months I’ve been in Pasadena, which is ripe for this sort of thing. I’ve been splitting my time between Caltech (the place of Feynman and Gell-Mann) and JPL (the place of rockets and twitchy security procedure); I can see Mount Wilson from my apartment; and there’s even the Cheesecake Factory, the (alleged) site of much of the action in the Big Bang Theory. A couple of highlights have been seeing the Hooker telescope — the very place where modern observational cosmology was founded — and visiting the Carnegie Observatories, or “Santa Barbara Street” as it was known back in the day when Hubble, and then Sandage, were knocking around. It’s even nice to just walk the same streets as Feynman.

Some of these visits feel like pilgrimages. Of the substantial number of places I’ve been able to visit over the past few years, a handful have inspired a genuine sense of awe, as if I were standing on hallowed ground. I think this comes from a feeling of being connected to the events and people that so significantly shaped our understanding of the Universe; to feel a small part of the momentous things I’ve been reading about in textbooks and popular articles for so long.  Standing on the spot where someone first split the atom, or first understood that the Universe had a beginning, separated from them by only a few decades — well, that’s as close as I can ever get.

I like the idea of being a part of a scientific tradition. Science is one of the more noble projects of humanity, and I feel pretty honoured to be able to participate in it — the same great adventure that occupied the Einsteins and Newtons of this world. Earning a place in said tradition isn’t easy, but at least I’m making some progress: my Erdos number is 5!


Making Numpy arrays read-only

The CMB analysis framework that we’re writing at the moment needs to cache certain data (e.g. sky maps, maps of noise properties etc.) for quick access during various stages of the analysis. It’s important that other parts of the program don’t alter the cached information directly, since it could fall out of sync with the current state of the MCMC chain. To protect the cache, then, we need to make certain arrays read-only.

An easy way of doing this for Numpy arrays is to use the setflags() method on the array. The flag for modifying read/write access is, unsurprisingly, write, so to make an array called arr read-only, you would simply call arr.setflags(write=False). If you subsequently try to modify the array, it raises a ValueError, like so:

>>> import numpy as np
>>> a = np.arange(6)
>>> a.setflags(write=False)
>>> a[4] = 6
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: assignment destination is read-only
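For the caching use case, the same flag can be wrapped up in a small helper. This is only a sketch of the idea, not our actual framework code: the ReadOnlyCache class and the "sky_map" key are hypothetical names made up for the example.

```python
import numpy as np

class ReadOnlyCache:
    """Toy cache that hands out read-only arrays (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    def put(self, name, arr):
        # Take a private copy, so the caller's own array stays writeable
        frozen = np.array(arr, copy=True)
        frozen.setflags(write=False)
        self._store[name] = frozen

    def get(self, name):
        return self._store[name]

cache = ReadOnlyCache()
cache.put("sky_map", np.zeros(4))

cached = cache.get("sky_map")
try:
    cached[0] = 1.0            # attempt to modify the cached array in place
    write_blocked = False
except ValueError:
    write_blocked = True       # NumPy refuses, as in the session above

scratch = cached.copy()        # an explicit copy is writeable again
scratch[0] = 1.0
```

Anything that needs to modify the data has to take an explicit copy first, which makes accidental changes to the cache much harder. The flag is a safeguard rather than a lock, though: code holding the original array can still switch it back with setflags(write=True).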


Planck results: Boring or Anomalous?

The Planck Collaboration released a 29-paper avalanche on the cosmology community just over a week ago, comprising their first set of CMB results (all of the papers are listed here). Along with a lot of people in the field (I imagine…), I’ve been letting the implications of this data release sink in for a few days and, on the face of it, the results are a little boring: no detection of non-Gaussianity, no extra neutrino species, “robust support for the standard, six parameter LCDM model of cosmology” — standard, standard, standard. OK, so the best-fit parameters are a bit different to the WMAP 9-year ones, but that hardly seems like front-page news. A few other anomalies were flagged up in the Planck papers too, but none of the cosmologists I’ve spoken to so far have seemed particularly excited about them. Raised eyebrows, murmurs of “that’s interesting”, but no suggestion that we’re looking at anything earth-shattering.

But hey; they’re worth a look, right? These are the “anomalies” that have piqued my interest, and why:

CMB/SZ σ8 discrepancy: The value of σ8 measured from the CMB is in tension with the value measured from the SZ cluster sample. The Planck Collaboration “attribute this tension to uncertainties in cluster physics”, but an interesting note by Macaulay, Wehus and Eriksen suggests that the SZ value is actually pretty consistent with a number of independent redshift space distortion (RSD) measurements from galaxy redshift surveys (see figure, right). This is interesting: If Planck are right about the reason for the CMB/SZ discrepancy, then perhaps there is a potentially important systematic bias in the RSD analyses that has been left unaccounted for. And if they’re wrong, and the value of σ8 from multiple probes of large-scale structure is genuinely different from the CMB value, then maybe we’re looking at some interesting physics (e.g. a spatial variation in the power spectrum amplitude, or an interesting dark energy equation of state).

f * sigma_8 from Macaulay et al. The dashed line is the theory curve for the Planck CMB best-fit parameters, and is clearly discrepant with the RSD data.

Hemispherical power asymmetry: The Planck results confirm the existence of a hemispherical power asymmetry that was previously seen by WMAP. This is the result that the amplitude of the angular power spectrum appears to have a dipolar variation across the sky. A letter by Liang Dai and others considers a number of physical models for this asymmetry, including a spatial variation in the initial scalar spectral index, an isocurvature mode, and an inhomogeneous reionisation optical depth. If some of these explanations turn out to be true, we might need to seriously revise our picture of what the Universe looks like on the largest scales (which would be cool).

H0 discrepancy: The best-fit value of the local Hubble rate found with Planck is significantly lower than the Riess et al. measurement from local distance measures. I’ve been noticing for a while that H0 estimates from WMAP+BAO analyses were tending to come in lower than the Riess value, but the new value from Planck has smaller error bars, and so the discrepancy is more significant. Again, the simple answer is that someone has a systematic error that they need to deal with, but maybe there could be some neat physics here too. Valerio Marra and others point out that one should expect a local measurement of H0 to differ from the background value as a result of cosmic variance, although they expect a smaller deviation than the observed one. Tim Clifton and I have previously suggested that you’d expect such a discrepancy if backreaction effects were important too.

Bulk flow: The official Planck analysis failed to detect a bulk flow using the kSZ effect from galaxy clusters, which is ostensibly bad news for previous claims of an anomalously large bulk flow detection with WMAP. A subsequent paper by Atrio-Barandela claims that the statistical errors have been overestimated in the Planck analysis, and that a significant detection would be found if they were corrected. A large bulk flow seems to be quite difficult to explain within the standard LambdaCDM framework, although I think this result needs more work to be convincing.

