I co-chaired a series of parallel sessions on radio surveys at the 2015 UK National Astronomy Meeting in Llandudno earlier this month. It was a fun session, with lots of nice talks. We’ve now made the talk slides available online – take a look!
A large fraction of my time over the last 18 months has been spent working out parts of the cosmology science case for the Square Kilometre Array, a gigantic new radio telescope that will be built (mostly) across South Africa and Australia over the coming decade. It’s been in the works since the early ’90s and – after surviving the compulsory planning, political wrangling, and cost-cutting phases that all Big Science projects are subjected to – will soon be moving to the stage where metal is actually put into the ground. (Well, soon-ish – the first phase of construction is due for completion in 2023.)
A detailed science case for the SKA was developed around a decade ago, but of course a lot has changed since then. There was a conference in Sicily around this time last year where preliminary updates on all sorts of scientific possibilities were presented, which were then fleshed out into more detailed chapters for the conference proceedings. While a lot of the chapters were put on arXiv in January, it’s good to see that all of them have now been published (online, for free). This is, effectively, the new SKA science book, and it’s interesting to see how it’s grown since its first incarnation.
My contribution has mostly been the stuff on using surveys of neutral hydrogen (HI) to constrain cosmological parameters. I think it’s fair to say that most cosmologists haven’t paid too much attention to the SKA in recent years, apart from those working on the Epoch of Reionisation. This is presumably because it all seemed a bit futuristic; the headline “billion galaxy” spectroscopic redshift survey – one of the original motivations for the SKA – requires Phase 2 of the array, which isn’t due to enter operation until closer to 2030. Other (smaller) large-scale structure experiments will return interesting data long before this.
We’ve recently realised that we can do a lot of competitive cosmology with Phase 1 though, using a couple of different survey methods. One option is to perform a continuum survey [pdf], which can be used to detect extremely large numbers of galaxies, albeit without the ability to measure their redshifts. HI spectroscopic galaxy surveys rely on detecting the redshifted 21cm line in the frequency spectrum of a galaxy, which requires narrow frequency channels (and thus high sensitivity/long integration times). This is time-consuming, and Phase 1 of the SKA simply isn’t sensitive enough to detect a large enough number of galaxies in this way in a reasonable amount of time.
Radio galaxy spectra also exhibit a broad, relatively smooth continuum, however, which can be integrated over a wide frequency range, thus enabling the array to see many more (and fainter) galaxies for a given survey time. Redshift information can’t be extracted, as there are no features in the spectra whose shift can be measured, meaning that one essentially sees a 2D map of the galaxies, instead of the full 3D distribution. This loss of information is felt acutely for some purposes – precise constraints on the equation of state of dark energy, w(z), can’t be achieved, for example. But other questions – like whether the matter distribution violates statistical isotropy [pdf], or whether the initial conditions of the Universe were non-Gaussian – can be answered using this technique. The performance of SKA1 in these domains will be highly competitive.
Another option is to perform an intensity mapping survey. This gets around the sensitivity issue by detecting the integrated HI emission from many galaxies over a comparatively large patch of the sky. Redshift information is retained – the redshifted 21cm line is still the cause of the emission – but angular resolution is sacrificed, so that individual galaxies cannot be detected. The resulting maps are of the large-scale matter distribution as traced by the HI distribution. Since the large-scale information is what cosmologists are usually looking for (for example, the baryon acoustic scale, which is used to measure cosmological distances, is something like 10,000 times the size of an individual galaxy), the loss of small angular scales is not so severe, and so this technique can be used to precisely measure quantities like w(z). We explored the relative performance of intensity mapping surveys in a paper last year, and found that, while not quite as good as its spectroscopic galaxy survey contemporaries like Euclid, SKA1 will still be able to put strong (and useful!) constraints on dark energy and other cosmological parameters. This is contingent on solving a number of sticky problems to do with foreground contamination and instrumental effects, however.
The thing I’m probably most excited about is the possibility of measuring the matter distribution on extremely large scales, though. This will let us study perturbation modes of order the cosmological horizon at relatively late times (redshifts below ~3), where a bunch of neat relativistic effects kick in. These can be used to test fundamental physics in exciting new ways – we can get new handles on inflation, dark energy, and the nature of gravity using them. With collaborators, I recently put out two papers on this topic – one more general forecast paper, where we look at the detectability of these effects with various survey techniques, and another where we tried to figure out how these effects would change if the theory of gravity were something other than General Relativity. To see these modes, you need an extremely large survey, over a wide redshift range and survey area – and this is just what the SKA will be able to provide, in Phase 1 as well as Phase 2. While it turns out that a photometric galaxy survey with LSST (also a prospect for ~2030) will give the best constraints on the parameters we considered, an intensity mapping survey with SKA1 isn’t far behind, and can happen much sooner.
Cool stuff, no?
In March of this year, immediately following the jubilation surrounding the BICEP2 results, the Daily Mail published a bizarre opinion piece on two scientists who were interviewed about the experiment on BBC’s Newsnight programme. The gist of the article was that the Beeb was cynically polishing its “political correctness” credentials by inviting the scientists onto the programme, because they were both non-white and non-male. More details about the debacle can be found in this Guardian article.
Now, I’m not much of a Daily Mail fan at the best of times, but this struck me as particularly egregious; not only were their facts wrong and their tone borderline racist and sexist (in my opinion, at least), but they also seemed to be mistaking science for some sort of all-white, all-boys club that women and people of other ethnic groups have no right to involve themselves with. This is damaging to all of us in science, not just those who were personally attacked – so I complained.
I just received word back on my complaint, which was sent to the Press Complaints Commission in the UK, who have the job of (sort of) regulating the press. Their response is reproduced below in full; my allegation of factual inaccuracy was upheld, but they declined to act on the allegation of inappropriate racial/gender commentary because I wasn’t one of the parties being discussed.
Commission’s decision in the case of
A man [me] v Daily Mail
The complainant expressed concern about an article which he considered to have been inaccurate and discriminatory, in breach of Clauses 1 (Accuracy) and 12 (Discrimination) of the Editors’ Code of Practice. The article was a comment piece, in which the columnist had critically noted Newsnight’s selection of “two women….to comment on [a] report about (white, male) American scientists who’ve detected the origins of the universe”.
Under the terms of Clause 1 (i) of the Code, newspapers must take care not to publish inaccurate information, and under Clause 1 (ii) a significant inaccuracy or misleading statement must be corrected promptly, and with “due prominence”.
The newspaper explained that its columnist’s focus on gender and ethnicity was designed to be nothing more than a “cheeky reference” to the BBC’s alleged political correctness. In the columnist’s view, the selection of Dr Maggie Aderin-Pocock and Dr Hiranya Peiris to comment on the BICEP2 (Background Imaging of Cosmic Extragalactic Polarisation) study was another such example of this institutional approach.
The complainant, however, noted the BICEP2 team were, in fact, a diverse, multi-ethnic, multi-national group which included women, something which the newspaper accepted. Furthermore, he said that white, male scientists had been interviewed on Newsnight as well, which undermined the columnist’s claim that Dr Maggie Aderin-Pocock and Dr Hiranya Peiris had been specifically selected. The suggestion that the BICEP2 team were all white and male was a basic error of fact and one which appropriate checks could have helped to prevent. There had been a clear failure to take care not to publish inaccurate information, and a corresponding breach of Clause 1 (i) of the Code.
The newspaper took a number of measures to address the situation: the managing editor wrote to both Dr Aderin-Pocock and Dr Peiris; a letter criticising the columnist’s argument was published the following day; its columnist later explicitly noted both scientists expertise, and competence to comment on the study; and, a correction was published promptly in the newspaper Corrections & clarifications column which acknowledged that the BICEP2 study was “conducted by a diverse team of astronomers from around the world”, and which “apologis[ed] for any suggestion to the contrary”. The latter measure was sufficient to meet the newspaper’s obligation under Clause 1 (ii) of the Code, to correct significantly misleading information.
The columnist’s suggestion that Dr Aderin-Pocock and Dr Peiris were specifically selected for the Newsnight programme because of “political correctness” was clearly presented as his own comment and conjecture which, under Clause 1 (iii) and the principle of freedom of expression, he was entitled to share with readers. There was, therefore, no breach of the Code in publishing that suggestion. However, the subsequent correction of the factual inaccuracy regarding the BICEP2 team and the acknowledgment of both experts’ expertise will have allowed readers to assess the suggestion in a new light.
Under Clause 12 (Discrimination) (ii) of the Code, “details of an individual’s race, colour, religion, sexual orientation, physical or mental illness or disability must be avoided unless genuinely relevant to the story”. The complainant’s concerns under this Clause were twofold; he believed that the references to the gender and ethnic background of both Dr Aderin-Pocock and Dr Peiris, and the BICEP2 team members, were irrelevant in a column about a scientific study. While the terms of Clause 12 (ii) do not cover irrelevant references to gender, the Commission would need to have received a complaint from a member, or members of the BICEP2 team, or Dr Aderin-Pocock or Dr Peiris in order to consider the complaint about under this Clause. In the absence of any such complaints, the Commission could not comment further.
- Constraints on scalar-tensor ratio vs. scalar and tensor spectral indices (with and without running)
- Running of scalar spectral index (pretty inconsistent with zero!), with and without varying lensing amplitude
- Sigma8, including n_run
Check out Shaun’s Twitter feed for the latest, plus some initial analysis.
[Update: Antony has now posted a PDF with several tables of joint constraints from Planck + BICEP2.]
Phew! An exciting day indeed, so I’ll jot down a few notes to recap what happened.
The BICEP2/Keck experiments detected B-modes at large angular scales in the polarisation of the CMB. They released two papers and some data online just as the announcement was made, which you can find here. Not all of the data, mind, but it’s plenty to go on for now.
Their interpretation of the data is that they detect a bump at low-ell that is characteristic of primordial B-modes generated by inflation. If true, this is super exciting, as it gives us a (sort of, but not really) direct detection of gravitational waves, and opens up a new window on the very early Universe (and hence extremely high energy scales). People are even saying it’s a probe of quantum gravity, which I guess is sort of true. Furthermore, they find a best-fit value of the scalar-tensor ratio of r = 0.20 +0.07/-0.05, which is a significantly higher value than many inflationary theorists would have expected, but which counts as a very firm detection of r. This will surely shake up the inflation people in coming months.
There do appear to be some issues with the data – as there always are for any experiment – but it’s not clear how important they are. In particular, Hans Kristian Eriksen points out that their null tests look a bit fishy at first glance. Check out the blue points in the plot to the right. These are the null test for the BB power spectrum, which you’d expect to be consistent with zero if everything is hunky dory. And they are! The problem is that they look too consistent – absolutely all of the errorbars overlap with zero. You’d naively expect about a third of the points to have their errorbars not overlapping with zero, since they represent a 68% confidence level – on average, 32% of samples should lie more than one errorbar away. This isn’t the case.
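The "about a third" argument is easy to check with a toy simulation. This is only a sketch under simple assumptions (independent, Gaussian-distributed null-test points, and a made-up number of bandpowers), not a reproduction of anything the BICEP2 team did. It counts how often a 1-sigma errorbar fails to overlap zero, first with the correct errorbars and then with errorbars inflated by an arbitrary factor of 1.5.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

n_points = 100_000   # many toy null-test points, just to get stable fractions
sigma_true = 1.0     # true noise level (arbitrary units)

# Null-test values: pure noise scattered around zero
null_values = rng.normal(0.0, sigma_true, size=n_points)

def frac_missing_zero(values, sigma_quoted):
    """Fraction of points whose 1-sigma errorbar does not overlap zero."""
    return float(np.mean(np.abs(values) > sigma_quoted))

# Correctly estimated errorbars vs. errorbars overestimated by 50%
frac_correct = frac_missing_zero(null_values, sigma_true)
frac_inflated = frac_missing_zero(null_values, 1.5 * sigma_true)

# Chance that every one of n_bins independent points overlaps zero
n_bins = 9  # hypothetical number of bandpowers, not taken from the paper
p_within = math.erf(1.0 / math.sqrt(2.0))  # P(|x| < 1 sigma) for a Gaussian
p_all_overlap = p_within ** n_bins

print(f"Fraction missing zero (correct errorbars):  {frac_correct:.3f}")
print(f"Fraction missing zero (inflated errorbars): {frac_inflated:.3f}")
print(f"P(all {n_bins} points overlap zero):            {p_all_overlap:.3f}")
```

With correct errorbars the fraction comes out close to the 32% quoted above, and it drops well below a third once the errorbars are inflated. Real bandpowers are correlated to some degree, so the independence assumption is the weakest part of this sketch.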
What does this mean? Well, maybe they don’t have a perfect handle on their noise levels. If they overestimate the noise, the errorbars are larger than they should be, and the null tests look more consistent than they really are. This could hide Bad Things, like systematics. (I’m certainly not saying they purposefully inflated their errorbars, by the way; just that something doesn’t quite add up with them. This happens very commonly in cosmology.) But hey, maybe this is a relatively minor issue.
You also see this in Table I of the results paper [pdf], where they quote the results of their “jackknife” tests. The idea behind jackknife tests is explained reasonably well here (Section 7) – you cut up your data into two roughly equal halves that should have the same signal, but might be subject to different systematics, and check to see if they’re statistically consistent with one another. If not, you’ve identified a systematic that you need to deal with.
The consistency test normally involves subtracting one sub-set from the other, and checking if the result significantly differs from zero. For example, you might imagine splitting the data depending on what time of year it was taken: Spring/Summer vs. Autumn/Winter, for example. If the two are inconsistent, then you’re seeing some sort of seasonal variation, which is most likely a systematic that you didn’t account for rather than a real time-dependence in your data…
Anyway, Table I quotes the results of a whole battery of jackknife tests. Great. But things are still a bit fishy. Why do three of the tests have a probability to exceed (PTE) of 0.000, for example? (Up to rounding error, this actually means p < 0.0005.) What are the odds of that happening? PTE’s should be uniformly distributed. For the 14 x 12 jackknife tests that have been used, the odds of getting three results with p < 0.0005 when drawing from a uniform distribution are a bit slim – you could maybe get away with one, but not three. So there’s maybe some inconsistency here. It could be the data, it could be to do with the simulations they’ve used to calculate the PTE’s, I don’t know. Or maybe I’ve missed something. But the problem gets worse if you think the errorbars are overestimated; shrinking the errorbars will shrink the width of the simulated distribution, and the observed value will look less and less consistent – the PTE’s will fall across the board.
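To put rough numbers on "you could maybe get away with one, but not three", here is a quick binomial estimate. It assumes the 14 x 12 tests are independent and that each PTE is uniform on [0, 1], so each test has a 0.0005 chance of rounding to 0.000. The real tests share data and are certainly not fully independent, so treat this as a back-of-the-envelope sketch only.

```python
from math import comb

n_tests = 14 * 12   # number of jackknife tests quoted above
p_low = 0.0005      # chance that a uniform PTE rounds to 0.000

def prob_at_least(k, n, p):
    """Binomial probability of k or more successes in n independent trials."""
    p_fewer = sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k))
    return 1.0 - p_fewer

p_one_or_more = prob_at_least(1, n_tests, p_low)
p_three_or_more = prob_at_least(3, n_tests, p_low)

print(f"P(at least 1 PTE < 0.0005): {p_one_or_more:.3g}")
print(f"P(at least 3 PTE < 0.0005): {p_three_or_more:.3g}")
```

Under these assumptions, a single very low PTE turns up in several percent of cases, while three or more is orders of magnitude rarer.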
[Update: Christopher Sheehy comments below that the three null test failures were apparently just typos. The BICEP2 team have updated the paper on arXiv, and now there’s only one PTE < 0.0005 in the table.]
(Quick note: the PTE, as I understand it in this context, is the probability that a value drawn from their simulations will be greater than the observed value. So a PTE of 0.9 means that there’s a 90% chance a randomly-chosen simulated value will be greater than the observed value, which would be good here – it means the observed value is well within what they expect from simulations, so it would be consistent with no systematic effect being present. Low PTE’s are bad, since they mean the observed value is less consistent with your expectations. You should normally expect to see some low PTE’s, however, and the number of very low PTE’s that can be tolerated depends on how many tests you did. More tests means you expect more low PTE’s.)
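Here is a toy version of the whole jackknife-plus-PTE procedure, to make the logic concrete. Everything in it is made up for illustration: the signal, the noise level, the number of bandpowers and simulations, and the chi-squared statistic are my own choices, not the BICEP2 pipeline. Two halves of the data share the same signal, the difference is compared with zero, and the PTE is the fraction of noise-only simulations that give a larger chi-squared than the observed one.

```python
import numpy as np

rng = np.random.default_rng(1)

n_bins = 9          # hypothetical number of bandpowers
n_sims = 5000       # hypothetical number of simulations
noise_sigma = 1.0   # hypothetical noise per bandpower in each half

# Toy signal that both halves of the data should share
signal = np.linspace(1.0, 3.0, n_bins)

# Noise on the difference of two independent halves
sigma_diff = np.sqrt(2.0) * noise_sigma

def jackknife_chi2(half_a, half_b):
    """Chi-squared of the half-difference against zero."""
    diff = half_a - half_b
    return np.sum((diff / sigma_diff) ** 2, axis=-1)

def pte(chi2_obs, chi2_sims):
    """Fraction of simulated values that exceed the observed value."""
    return float(np.mean(chi2_sims > chi2_obs))

# Simulated jackknives: same signal, independent noise, no systematics
sim_a = signal + rng.normal(0.0, noise_sigma, size=(n_sims, n_bins))
sim_b = signal + rng.normal(0.0, noise_sigma, size=(n_sims, n_bins))
chi2_sims = jackknife_chi2(sim_a, sim_b)

# "Observed" data without any systematic
obs_a = signal + rng.normal(0.0, noise_sigma, size=n_bins)
obs_b = signal + rng.normal(0.0, noise_sigma, size=n_bins)
pte_clean = pte(jackknife_chi2(obs_a, obs_b), chi2_sims)

# Same data, but one half picks up a large additive systematic
obs_b_sys = obs_b + 5.0 * noise_sigma
pte_sys = pte(jackknife_chi2(obs_a, obs_b_sys), chi2_sims)

print(f"PTE without systematic: {pte_clean:.3f}")
print(f"PTE with systematic:    {pte_sys:.3f}")
```

The signal cancels in the difference, so only the noise and any half-dependent systematic are left. The clean case gives some PTE between 0 and 1, while the contaminated half pushes the PTE towards zero.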
So that’s the blue points in the plot above. Now on to the black points and red/green lines. If you squint a bit, and ignore the dashed red lines, you can convince yourself that a straight line would fit the points quite well (green line; my addition). The point is that BICEP2 don’t clearly see the bump feature (the “smoking gun” of the inflationary B-modes) in their data; they just see an excess amplitude at low ell. Could something else cause this excess?
Imagine if there were no primordial B-modes, and you only had the lensing spectrum, which is the solid red line. If the lensing amplitude was increased, could you make it fit? Probably not; the lensing spectrum drops off too quickly at low-ell, so it would be difficult to capture the first two data points while staying consistent with everything at higher ell just by changing the amplitude. The BICEP2 team have tried this trick, in fact (see the plot on the right), and even allowing the lensing amplitude to vary by quite a large factor isn’t enough to explain the low ell power. So it still looks like a detection of non-zero r.
There’s also the issue of an excess at higher ell in the BICEP2-only results, as shown in the first plot, above (it seems to go away in the BICEP2 x Keck preliminary results). You could maybe imagine an additive systematic in the BB power spectrum that shifts a lensing-only BB spectrum upwards (roughly the green line). This would fit the data quite well, without any primordial contribution, although whether such an additive systematic is plausible or not I don’t know.
Other non-inflation stuff (primordial magnetic fields or some such, who knows) might explain the low-ell power too. All I’m saying here is that while the primordial B-mode seems to fit extremely well, the unique “bump” shape isn’t clearly detected, so maybe there are other explanations too. We’ll need to wait and see if anything else works.
I’ve heard some minor grumbling about foreground subtraction, which I’ll only mention briefly. Polarised galactic dust seems to be the main worry, and they’re arguably not using the most fantastically realistic dust maps, although as they correctly point out we will probably have to wait until the next Planck release before something better is available. Their Fig. 6 shows the contribution to the polarisation that they’d expect from a bunch of foreground models, all of which are dwarfed by the detected signal. The implication is that foregrounds aren’t a major issue, but of course this statement is only as good as the models. Maybe Planck will see more polarised emission at the BICEP2 pointing than expected? We’ll have to wait and see, although it seems like a bit of a stretch to flag this up as a major concern.
Also, if I’m interpreting their paper correctly, it seems that they just subtract off the foreground templates with fixed amplitude, rather than fitting the amplitude (and propagating through the errors associated with doing this). Hey, this is what Commander is for. But I doubt that accounting for this would blow up their errorbars too much. Foreground subtraction does shift their best-fit value of r down to more like r=0.16, though, which is slightly less jarring than a full r=0.2. It doesn’t get rid of the detection, though.
The overall picture is that this is a serious result, which looks pretty good, but isn’t entirely free of holes. My gut feeling is that the claimed detection significance, and best-fit value of r, will go down with further analysis. I’d be surprised to see the detection go away entirely, though, unless they found a whopping systematic. We’ll have a better idea what’s going on when the Keck analysis has been completed and, after that, when the Planck polarisation data is released towards the end of the year.
So all I can say is, congratulations BICEP2!
(Thanks to Hans Kristian Eriksen and Yashar Akrami for useful discussions over the course of the day. Any errors in the above are my own, of course.)
That is the question.
So it’s an exciting day, today! The cosmology community is a-buzz with rumours that BICEP2/Keck will announce the first detection of primordial B-mode polarisation of the CMB in a press conference this afternoon (technical presentation starting 3.45pm CET). Shaun Hotchkiss has a great summary of some of the coverage in the blogosphere, and the Guardian even jumped the gun a bit and posted an article on Friday.
The rumours are by now pretty convincing – there was some minor speculation that they might be announcing the discovery of an exo-moon instead (boooring), but it’s now been confirmed that the press conference is definitely about the CMB. So, some (brief) talking points, then:
- The rumour is r=0.2 (where r is the scalar-tensor ratio). This is significantly higher than most people expect, and it seems to be in conflict with the Planck inflation mega-plot from early 2013. Not necessarily, though, if you allow the scalar spectral index to run. So which inflationary models would this kill? Will the poor Encyclopaedia Inflationaris guys have to redo everything?
- Are primordial gravitational waves really a “smoking gun” for inflation? Here’s a relatively vitriolic review by Grishchuk for the entertainment of the contrarian in your life.
- Foregrounds! Systematics! Here in Oslo, Hans Kristian Eriksen has a 1000 NOK bet running that the accepted value of r will be much lower (inconsistent with r=0.2 at 3 sigma) in 5 years’ time. This polarisation business is tricky, you know. There are also rumblings that the Southern Hole, which is the region of low galactic dust emission that BICEP2 is apparently looking through, might have higher-than-expected polarised foregrounds after all. Does this matter?
- If it turns out that we need n_run to be non-zero for things to make sense, then we’ve added another parameter to the oh-so-simple, six-parameters-is-all-you-need standard model. I can almost hear the floodgates opening.
- Of course, everyone should play BICEP2 Bingo during the press conference (by Joe Zuntz).
Here’s hoping for a fireworks-filled afternoon!
By default, matplotlib figures have a lot of padding around them. This is quite annoying when producing plots for publication, as you end up with a load of useless whitespace that you have to compensate for by screwing around with the figure sizing and position (especially in LaTeX articles).
Fortunately, matplotlib has a neat convenience function that removes all of this unnecessary whitespace in one fell swoop. It’s as easy as this:
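A minimal sketch, assuming the function in question is `tight_layout` (which matplotlib provides, and which accepts a `pad` argument). The data, labels, and output filename are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs headless
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data for the example plot
x = np.linspace(0.0, 2.0 * np.pi, 200)

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(x, np.sin(x))
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")

# Shrink the margins so the axes and labels just fit inside the figure
plt.tight_layout()

# Placeholder filename; use whatever format your LaTeX setup prefers
fig.savefig("tight_layout_example.pdf")
```

Call it after all labels and titles have been set, since the margins are computed from whatever is on the figure at that point.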
See the figures below for a comparison. This function also takes a ‘pad’ argument that lets you fine-tune the padding manually.
Oh, how I wish I’d known about this three years ago.