Advances in Observational Cosmology
the breadth of recent developments, such as precision cosmology and the concordance
cosmological model, inflation, gravitational lensing and shear, the extragalactic far-infrared
and X-ray backgrounds, downsizing and baryon wiggles. Forthcoming major facilities are
covered, including radio, X-ray, submm-wave and gravitational wave astronomy. Suggestions
for further reading provide accessible and approachable jumping-off points for students aiming
to further their studies. Produced by Open University academics and drawing on decades of
Open University experience in supported open learning, the book is completely self-contained
with numerous exercises (with full solutions provided). Designed to be worked through
sequentially by a self-guided student, it also includes clearly identified key facts and equations
as well as informative chapter summaries.
Stephen Serjeant is a Reader in Cosmology at The Open University. He led the extragalactic
science case of the SCUBA-2 All Sky Survey, and co-led the active galaxies science theme
of the ATLAS Key Project on the Herschel Space Observatory. Stephen also coordinates the
science faculty’s broadcasting at The Open University and is the lead science academic for the
BBC1 science show Bang Goes the Theory.
Cover image: A multi-wavelength, false-colour view of the M82 galaxy. X-ray data recorded
by Chandra appears in blue; infrared light recorded by Spitzer appears in red; Hubble’s
observations of hydrogen emission appear in orange, and the bluest visible light appears in
yellow-green. Copyright: NASA/JPL-Caltech/STScI/CXC/UofA/ESA/AURA/JHU
Author:
Stephen Serjeant
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo
Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
In association with THE OPEN UNIVERSITY
The Open University, Walton Hall, Milton Keynes MK7 6AA, UK
Published in the United States of America by Cambridge University Press, New York.
[Link]
Information on this title: [Link]/9780521157155
First published 2010.
Copyright © The Open University 2010.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, transmitted or utilised in any form
or by any means, electronic, mechanical, photocopying, recording or otherwise, without written permission from the publisher or a
licence from the Copyright Licensing Agency Ltd. Details of such licences (for reprographic reproduction) may be obtained from the
Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS; website [Link]
Open University course materials may also be made available in electronic formats for use by students of the University. All rights,
including copyright and related rights and database rights, in electronic course materials and their contents are owned by or licensed to
The Open University, or otherwise used by The Open University as permitted by applicable law. In using electronic course materials
and their contents you agree that your use will be solely for the purposes of following an Open University course of study or otherwise
as licensed by The Open University or its assigns.
Except as permitted above you undertake not to copy, store in any medium (including electronic storage or use in a website), distribute,
transmit or retransmit, broadcast, modify or show in public such electronic materials in whole or in part without the prior written
consent of The Open University or in accordance with the Copyright, Designs and Patents Act 1988.
Edited and designed by The Open University.
Typeset by The Open University.
Printed and bound in the United Kingdom by Latimer Trend and Company Ltd, Plymouth.
This book forms part of an Open University course S383 The Relativistic Universe. Details of this and other Open University courses
can be obtained from the Student Registration and Enquiry Service, The Open University, PO Box 197, Milton Keynes MK7 6BJ,
United Kingdom: tel. +44 (0)845 300 60 90, email general-enquiries@[Link]
[Link]
British Library Cataloguing in Publication Data available on request.
Library of Congress Cataloguing in Publication Data available on request.
ISBN 978-0-521-19231-6 Hardback
ISBN 978-0-521-15715-5 Paperback
Additional resources for this publication at [Link]/9780521157155
Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites
referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents
OBSERVATIONAL COSMOLOGY
Introduction 9
Chapter 1 Space and time 11
Introduction 11
1.1 Olbers’ paradox 11
1.2 Olbers’ paradox in a different way 12
1.3 Metrics: the Universe in a nutshell 13
1.4 Redshift and time dilation 19
1.5 Cosmological parameters 21
1.6 The age of the Universe 25
1.7 The flatness problem 27
1.8 Distance in a warped spacetime 28
1.9 The edge of the observable Universe 31
1.10 Measuring distances and volumes 32
1.11 The fate of the Universe 35
Epilogue 282
Appendix A 283
Appendix B 285
Solutions 291
Acknowledgements 312
Index 318
Introduction
I have gathered a posie of other men’s flowers, and nothing but the thread
that binds them is my own.
Montaigne
Observational cosmology is in a tremendously exciting time of rapid discovery.
Cosmology can be enriching and enjoyable at this level no matter what your
aims are, but my guiding principle for the topics in this book has been: what
would I ideally like a person finishing an undergraduate degree and starting a
PhD in observational cosmology to know? What would represent a balanced
undergraduate introduction that I would like them to have had?
Throughout this book, I’ve tried to give readers enough grounding to appreciate
the current topics in this enormously active and exciting field, and to give some
sense of the gaps — and in some cases chasms — in our understanding. I haven’t
forgotten that the step up to third-level undergraduate study can be difficult and
daunting, so I’ve included further reading sections. Some of the items in these
lists will take you to more leisured introductions and backgrounds to some of the
material that we shall cover. Nevertheless, this book is intended to be fully
self-contained.
I’ve also given some jumping-off points if readers want to go into more
depth. You’ll find these mostly in the further reading sections, but also
in some footnotes and figure captions. There are some references to journal
articles, such as ‘Hughes et al., 1998, Nature, 394, 241’. The first number is
a volume number, and the second is a page number. Many of these can
currently be read online, either in preprint form or as published papers, at
[Link] [Link]. Some of the further reading is
most easily found on the internet, but internet addresses are transitory so I’ve tried
to keep these to a minimum. References to arXiv or astro-ph reference numbers
are to the preprint server, currently at [Link] or various worldwide
mirrors. Entering the article identification in the search there usually results in
the paper. Sometimes the further reading section will point to more advanced
material, beyond the normal scope of an undergraduate degree. I’ve chosen to do
this partly in order to ease the transition from undergraduate to postgraduate level
for those of you who are on that track. The online abstract service also has the
facility to list later papers that have cited any given paper, so it’s very useful for
literature reviews. At every stage, each level is a big step up from the previous one
and the transition can be difficult. I don’t intend this book to be a postgraduate
textbook, but if I can ease the transition to that level, then all to the good.
Inevitably the selection of topics betrays my own biases and interests, and there
are undoubtedly many exciting areas not covered. But the biggest problem is that
this is a fast-paced field with lots of exciting and rapid developments. Some future
advances are foreseeable, such as gravitational wave astronomy or the Square
Kilometre Array, and I can give tasters for what these fabulous new facilities
promise, so this book should keep its relevance for a few years at least. As I write
this, the Herschel and Planck satellites are waiting to be launched in French
Guiana. However, the ‘unknown unknowns’ I can do nothing about. This is the
mixed blessing of writing a book during the golden age of cosmology.
Finally, I would like to thank David Broadhurst, Mattia Negrello, Andrew Norton,
Robert Lambourne, Jim Hague and Carolyn Crawford for their critical readings of
early drafts of this book. Any errors that I somehow managed to sneak through
their careful ministrations are down to me alone. I would also like to thank the
editors and artists at The Open University for turning my scribbles into something
beautiful.
Chapter 1 Space and time
God does not care about our mathematical difficulties. He integrates
empirically.
Albert Einstein
Introduction
How did the Universe begin? How big is the observable Universe? Why is the
night sky dark? What will the Universe be like in the year one trillion? What is
the ultimate fate of the Universe? This chapter will answer these questions and
more, and give you the tools that you need for understanding modern precision
cosmology.
Although it’s not necessary for you to have met special relativity and the
Robertson–Walker metric before, you may find that we take these subjects at a
fast pace in this chapter if these are new topics to you. If so, you may find
Appendix B on special relativity helpful, or you might try a more comprehensive
introduction to expanding spacetime metrics such as that in Robert Lambourne’s
Relativity, Gravitation and Cosmology (see the further reading section).
called a power law, with a in this case being the power law index. Here, dN/dS
is a power law function of S, with a power law index of −5/2.
Now, we’ve assumed that all the stars are identical, but suppose instead that
there are several types of star, each with a different luminosity Li and number
density ρi, with i = 1, 2, 3, . . .. Each type of star will have its own number counts
dNi/dS = ki S^(−5/2), where ki is some constant specific to type i. The total
number counts will still obey a −5/2 power law:

dN/dS = Σi dNi/dS = Σi ki S^(−5/2) = S^(−5/2) Σi ki ∝ S^(−5/2).
So any homogeneous, isotropic population of stars produces a −5/2 power law
for number counts. But this leads to a profound problem: the total flux of stars
brighter than S0 is
Stotal = ∫ from S0 to ∞ of S (dN/dS) dS ∝ ∫ from S0 to ∞ of S^(−3/2) dS ∝ S0^(−1/2),

which diverges as S0 → 0: including ever fainter stars makes the total sky brightness grow without limit.
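The scale of the problem can be made concrete with a short numerical sketch (mine, not from the book; the helper name is my own). Taking dN/dS = S^(−5/2) with the proportionality constant set to 1, the flux from all stars brighter than S0 is 2 S0^(−1/2), which grows without bound as the survey limit S0 → 0.

```python
# A sketch (not from the book). With dN/dS = S**-2.5 (proportionality
# constant set to 1), the flux from stars brighter than S0 is
#   integral from S0 to infinity of S * (dN/dS) dS
#   = integral of S**-1.5 dS = 2 * S0**-0.5,
# which grows without bound as the faint limit S0 shrinks to zero.

def flux_brighter_than(S0):
    """Total flux from sources brighter than S0, for dN/dS = S**-2.5."""
    return 2.0 * S0**-0.5

for S0 in (1e-2, 1e-4, 1e-6):
    print(S0, flux_brighter_than(S0))
```

Pushing the survey limit 100 times fainter multiplies the summed sky brightness by 10, with no convergence in sight.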
Exercise 1.1 First, we’ve argued that a homogeneous, isotropic Universe gives
you a sky as bright as the Sun. Next, we’ve argued that the sky is infinitely bright
in a homogeneous, isotropic Universe. They can’t both be true, and it’s not a
mistake in the algebra, so what’s different in our assumptions? ■
The night sky is a long way from being as bright as the Sun, and is certainly not
infinitely bright. So what is the answer to Olbers’ profound question? It’s not that
the Universe is opaque — in fact, as we shall see later in this book, the Universe is
surprisingly transparent at optical wavelengths. It’s also not that stars have finite
lifetimes, because that doesn’t stop lines of sight ending on a star eventually and
inevitably.
Part of the answer is that the Universe is only finitely old. Edgar Allan Poe was
the first to point out this solution, in his 1848 book Eureka: a Prose Poem. But
another part of the answer is that we don’t live in a static, flat space. Rather, we
live in a curved, expanding spacetime, which we shall meet in the next section.
1.3 Metrics: the Universe in a nutshell
Solution
Without loss of generality, we can choose coordinates in which the observer
is moving along the x-axis, so δy′ = δz′ = 0. Now v = dx′/dt′, and from
δs′ = δs we have (c δt)² = (c δt′)² − (δx′)². If we divide by (c δt′)², we
find

c²(δt)²/(c²(δt′)²) = 1 − (δx′)²/(c²(δt′)²) = 1 − (dx′)²/(c²(dt′)²),

so

(δt/δt′)² = 1 − (dx′/(c dt′))² = 1 − v²/c²,

hence

δt′ = γ(v) δt.

For the second part, δs = c δτ, where τ is the proper time, i.e. the time
measured by the watch. Here, we have δτ = δt, so the total interval is
0.5c metres, or 0.5 light-seconds. When the watch is being whirled around,
τ is in an accelerating frame, but it’s always true that δs = c δτ, so the total
interval is 0.5 light-seconds.
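The result δt′ = γ(v) δt is easy to check numerically. A minimal sketch (mine, not from the book; the function name is my own):

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """Lorentz factor γ(v) = 1/sqrt(1 - v²/c²)."""
    return 1.0 / math.sqrt(1.0 - (v / C)**2)

# A clock ticking once per second in its own frame (δt = 1 s) is seen
# to tick every δt′ = γ(v) δt seconds by an observer moving at 0.6c.
v = 0.6 * C
print(round(gamma(v), 6))  # 1.25, since 1/sqrt(1 - 0.36) = 1.25
```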
Solution
The spacetime interval between any two points on a light ray is zero. The
connection to causality is best illustrated in the lightcone diagram shown in
Figure 1.4. An event at the origin can send a message at light speed or
slower to any event in the future lightcone. Similarly, any event in its past
lightcone could have affected it. The spacetime intervals between the origin
and these events are time-like intervals, (δs)² > 0. Events outside the
lightcone cannot affect, or be affected by, the event at the origin. The
spacetime intervals between the origin and these events are space-like, i.e.
(δs)² < 0. Points on the lightcone have exactly zero spacetime interval
between one another. The δs = 0 intervals are sometimes referred to as null.
Exercise 1.2 The highest-energy cosmic rays have energies of 10²⁰ eV and
above. Most cosmic rays are protons, with rest masses of 938.28 MeV/c². The
diameter of our Galaxy is roughly 100 000 light-years. Calculate how long it
would take to cross the Galaxy, according to the highest-energy cosmic rays.
(Hint: You don’t need to know the conversion between eV and joules, nor do you
need the conversion between light-years and metres.) ■
Figure 1.4 The lightcone in special relativity. The point at position
(δx, c δt) has c δt > δx, so (c δt)² − (δx)² > 0, implying that (δs)² > 0,
meaning that the invariant interval between that point and the origin is
time-like.

Figure 1.5 The position of a point can be specified in terms of Cartesian
coordinates (x, y, z) or in terms of spherical coordinates (r, θ, φ).
To describe an expanding Universe, we could modify the metric by multiplying
the spatial parts with a time-dependent expansion factor:

ds² = c² dt² − R²(t) (dr² + r² dθ² + r² sin²θ dφ²),

where R(t) is called the scale factor of the Universe. A schematic representation
of this is shown in Figure 1.6.

Figure 1.6 A cubical volume of the Universe. The length of the side l expands
with the scale factor of the Universe, so l(t2) = (R(t2)/R(t1)) l(t1).

In fact, the most general homogeneous, isotropic metric is

ds² = c² dt² − R²(t) [dr²/(1 − kr²) + r² dθ² + r² sin²θ dφ²], (1.6)

where the constant k determines whether the Universe is spatially flat (k = 0),
spherical (k = +1) or hyperbolic (k = −1). (We use only these three values of k
because other values can be found by rescaling R and r; for example, if k = −3,
then the substitutions r′ = r√3 and R′ = R/√3 give an equation of the same
form as Equation 1.6 for k = −1.) Figure 1.7 illustrates some two-dimensional
surfaces in which k = +1, 0 or −1, to give you some intuition (if not actually a
visualization) of the three-dimensional counterparts. You might reasonably
object that these two-dimensional representations oversimplify the situation. In
physics (and general relativity especially) it’s often easier to describe something
mathematically than it is to visualize it; physics makes tremendous demands on
the imagination. Perhaps the human brain doesn’t have the cognitive machinery to
be able to visualize curved expanding spacetimes.
Figure 1.7 Curved surfaces may have geodesics that start parallel, but don’t remain parallel. Also, the angles of
a triangle need not add up to 180◦ , nor is the circumference of a circle necessarily 2π times the radius. The
spherical model has k = +1, the flat model has k = 0 and the saddle-shaped model has k = −1.
Note that Equation 1.6 defines a preferred reference frame, in which the expansion
is isotropic. As we shall see, this is well-supported by observations both of the
large-scale structure of the galaxy distribution, and of the cosmic microwave
background. Nevertheless, is it possible to conceive of a universe consistent with
Einstein’s field equations in which there are no preferred reference frames? One
possibility is a fractal structure, and we shall meet this in later chapters when
discussing inflation.
The field equations of Einstein’s general relativity determine both k and R(t).
These equations can be shown to yield

(dR/dt)² = Ṙ² = 8πG(ρm + ρr)R²/3 − kc² + Λc²R²/3, (1.7)

d²R/dt² = R̈ = −(4πG/3)(ρm + ρr + 3p/c²)R + Λc²R/3, (1.8)

where ρm is the average density of the matter in the Universe, ρr is an
equivalent matter density for radiation (derived using E = mc²), G is Newton’s
gravitational constant, p is the pressure of the matter and radiation, and R is a
function of time, R = R(t), though we drop the function notation for clarity and
brevity. These equations are known as the Friedmann equations. Both the
densities and p also vary with time. Λ is known as the cosmological constant,
and features in Einstein’s field equations for general relativity. Physically, it
represents an in-built tendency of space to expand (or, for Λ < 0, contract). Some
special cases are fairly simple: for example, if k = Λ = 0, then R(t) ∝ t^(2/3) in a
matter-dominated universe, or R(t) ∝ t^(1/2) in a radiation-dominated universe.

(Marginal note: Proving these equations would take us a long way outside the
scope of this book into what is usually graduate-level physics, but if you wish to
pursue this rewarding path you might try, for example, Relativity, Gravitation and
Cosmology by R. Lambourne for an advanced undergraduate-level introduction,
or General Relativity: An Introduction for Physicists by M.P. Hobson, G.P.
Efstathiou and A.N. Lasenby.)
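The matter-dominated special case can be checked numerically. A sketch (mine, not from the book; the helper name is my own): with k = Λ = 0 and ρm ∝ R⁻³, Equation 1.7 reduces to dR/dt ∝ R^(−1/2), and a simple Euler integration recovers R(t) ∝ t^(2/3).

```python
import math

# Numerical sketch (not from the book): for k = Λ = 0 and matter
# domination, ρm ∝ R**-3, so Equation 1.7 reduces to dR/dt = A * R**-0.5.
# The constant A only fixes the units, so we set A = 1.

def scale_factor(t_end, dt=1e-5, R0=1e-4):
    """Euler integration of dR/dt = 1/sqrt(R) from a tiny initial R."""
    R, t = R0, 0.0
    while t < t_end:
        R += dt / math.sqrt(R)
        t += dt
    return R

# The analytic solution is R ∝ t**(2/3): doubling t should multiply R
# by 2**(2/3), i.e. the logarithmic slope should be 2/3.
R1, R2 = scale_factor(1.0), scale_factor(2.0)
slope = math.log(R2 / R1) / math.log(2.0)
print(round(slope, 3))  # close to 2/3 ≈ 0.667
```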
We can derive Equation 1.8 from Equation 1.7 by differentiation. This will give us
a term involving d(ρm + ρr )/dt, but we could treat a part of the Universe as a box
of gas, and use the conservation of energy to show that the change in energy
density equals the p dV work, i.e. d((ρm + ρr)c²R³) = −p d(R³). Therefore
d((ρm + ρr)c²R³)/dt = −p d(R³)/dt.
Exercise 1.3 Derive Equation 1.8 from Equation 1.7, using the conservation
of energy. ■
(The issue of p dV work is slightly more subtle in general relativity, since it’s
not immediately clear what the work is done against, but the full relativistic
calculation gives the same result.)
In the next few sections, we shall explore some of the surprising aspects of this
expanding spacetime model, before returning to Olbers’ profound paradox in the
next chapter.
expansion of the Universe. The second photon will arrive (R0 /R1 ) δt later than
the first. This implies that distant clocks in the Robertson–Walker universe appear
time-dilated by a factor R0 /R1 , sometimes called cosmological time dilation.
We’ll find it useful to define the dimensionless scale factor a as

a = R1/R0, (1.9)
so a = 1 today and a < 1 in the past.
A similar argument applies to the photons themselves. Treating them this time as
waves, the distance between two peaks of a light wave will be expanded by the
same factor R0 /R1 , i.e. the wavelength is longer, and the light is shifted to the
red. We define redshift (symbol z) using
1 + z = R0/R1 = 1/a = λobserved/λemitted, (1.10)

where λobserved is the observed wavelength of the photon, and λemitted is the
original photon wavelength when the light was emitted. Sometimes this is written
as

z = (λobserved − λemitted)/λemitted. (1.11)
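As a worked example (mine, not the book’s; the function names are my own): applying Equation 1.10 to the Hα line of the z = 2.225 galaxy shown in Figure 1.10 puts the line well into the near-infrared.

```python
def observed_wavelength(lambda_emitted_nm, z):
    """Equation 1.10 rearranged: λobserved = (1 + z) * λemitted."""
    return (1.0 + z) * lambda_emitted_nm

def redshift(lambda_observed_nm, lambda_emitted_nm):
    """Equation 1.11: z = (λobserved − λemitted) / λemitted."""
    return lambda_observed_nm / lambda_emitted_nm - 1.0

# Hα (rest wavelength 656.3 nm) from the z = 2.225 galaxy of Figure 1.10:
lam = observed_wavelength(656.3, 2.225)
print(round(lam, 1))                   # 2116.6 nm, in the near-infrared
print(round(redshift(lam, 656.3), 3))  # recovers z = 2.225
```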
A high redshift means that there has been a big increase in the expansion factor
since the light was emitted. Redshift is sometimes misleadingly referred to as
‘recession’, since a receding object would have a Doppler shift to the red. Indeed,
galaxies are not stationary relative to each other, but have relative velocities
that can reach 1000 km s⁻¹. In astronomy these are usually known as peculiar velocities,
and these will indeed contribute both blue and red Doppler shifts. However,
cosmological redshift swamps these effects at distances beyond about 100 Mpc,
and you should not confuse Doppler shifts with the redshift from cosmological
expansion. The distance between us and a distant galaxy is getting bigger because
of the expansion of the Universe, but this is a physically distinct situation to a
galaxy moving away in a flat, non-expanding spacetime.
One alternative to the Robertson–Walker model is the ‘tired light’ universe,
proposed by Fritz Zwicky in 1929. In this model, redshift is due to photons
gradually losing energy during their passage through the universe, due to some
interaction with intervening matter. There are many observations that are difficult
to reproduce in this model, but in particular, the experimental detection of
cosmological time dilation has made this interpretation untenable. Figure 1.9
shows the decay times of supernovae as a function of redshift, which show exactly
the (1 + z) time dilation predicted by theory.
But to measure redshifts, we need to know λemitted . This can be done using atomic
or molecular transitions that occur at particular quantized energies, and so involve
the emission or absorption of photons with particular quantized wavelengths. If
we can identify the transition in the distant object, we know λemitted , provided that
atoms behaved in the same way in the early Universe.
But did they? And if not, how could we tell? It turns out that many characteristic
atomic and molecular transitions can easily be recognized at high redshifts (see,
for example, Figure 1.10), so any differences must be fairly subtle. If the strength
of the electromagnetic interaction were different at early cosmic epochs, this
would change the atomic fine structure constant α = e²/(4πε₀ℏc) ≈ 1/137. The
fractional difference in wavelength (δλ/λ) between a pair of relativistic fine
structure lines is proportional to α², so changes in α should lead to wavelength
shifts between some emission lines in distant cosmological objects.
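As a quick check of this definition (my sketch, not from the book), evaluating α from the 2018 CODATA constants recovers the familiar 1/137:

```python
import math

# 2018 CODATA values (SI units)
e    = 1.602176634e-19   # elementary charge, C
eps0 = 8.8541878128e-12  # vacuum permittivity, F/m
hbar = 1.054571817e-34   # reduced Planck constant, J s
c    = 299792458.0       # speed of light, m/s

# fine structure constant: α = e²/(4π ε₀ ℏ c)
alpha = e**2 / (4.0 * math.pi * eps0 * hbar * c)
print(round(1.0 / alpha, 3))  # ≈ 137.036
```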
Figure 1.9 Supernova decay rates in the nearby Universe (top), and in the high-redshift Universe (bottom). The
dashed line shows the tired light prediction of no time dilation, and the red line shows the (1 + z)⁻¹ time dilation
expected in the Robertson–Walker metric. The high-redshift data strongly support the expanding universe model;
the measured variation is (1 + z)^(−0.97±0.10).
So far, comparisons of the atomic and molecular transitions in the early Universe
with laboratory experiments have not yielded any uncontested evidence for α
being any different in the early Universe; some claimed detections of changes in
α have not been corroborated by other experiments, and it is clear that the
experiments are both very difficult and prone to systematic errors. In terrestrial
laboratories, α̇/α = (−2.6 ± 3.9) × 10⁻¹⁶ per year, i.e. consistent with no
change. However, it remains possible that ongoing cosmological experiments will
make a ground-breaking detection of a change in α. Some speculative theories
allow for possible changes in α, such as supersymmetry or M-theory. However,
these theories do not (or don’t yet) predict the specific variations in α with
redshift that some groups have claimed, in any unique and unforced way.
Figure 1.10 Spectra of objects in the local Universe and in the high-redshift Universe, showing many of the
same characteristic spectral features. The y-axes in all the spectra are relative flux. The top left panel is a planetary
nebula in our Galaxy, M57. The top right panel is an H II region in the Virgo cluster of galaxies. The bottom left
panel shows the spectrum of a star-forming galaxy at a redshift of z = 0.612 (the ⊕ symbol marks absorption from
the Earth’s atmosphere), and the bottom right panel shows another star-forming galaxy at a redshift of z = 2.225.
In all cases there is the characteristic [O III] emission line doublet at a rest-frame wavelength of 495.9, 500.7 nm, as
well as other emission lines such as Hα 656.3 nm, Hβ 486.1 nm, [N II] 654.8, 658.4 nm. The bottom right panel
has the emission lines redshifted into the near-infrared range, in which only certain regions of the spectrum are
available, for reasons of atmospheric transparency.
1.5 Cosmological parameters
i.e. the apparent recession velocity v is proportional to distance Y, but recall the
warnings in Section 1.4. Sometimes this apparent flow is called the Hubble flow.
Beware: H is frequently (but misleadingly) known as the Hubble constant
(Hubble constant is a fair description of H0 , however). Although it’s virtually
constant over our lifetimes, it certainly isn’t constant over the history of the
Universe. In some sense, H0 is a measure of the current expansion rate of the
Universe, and it has the value 72 ± 3 km s−1 Mpc−1 , or about 2 × 10−18 s−1 . This
is also sometimes written as H0 = 100h km s−1 Mpc−1 , with h = 0.72 ± 0.03.
This may seem deliberately obtuse, but the Hubble parameter is so fundamental
that it affects many other cosmological measurements, so some observational
cosmologists opt to quote their results in terms of h.
If we divide Equation 1.7 by R², we obtain

H² = (Ṙ/R)² = 8πG(ρm + ρr)/3 + Λc²/3 − kc²/R². (1.14)
The terms on the right-hand side drive the expansion of the Universe. It’s common
in cosmology to define their fractional contributions:
Ωm = 8πGρm/(3H²), (1.15)

Ωr = 8πGρr/(3H²), (1.16)

ΩΛ = Λc²/(3H²), (1.17)

Ωk = −kc²/(R²H²), (1.18)
where the subscript ‘m’ stands for ‘matter’ and the subscript ‘r’ stands for
‘radiation’. Equation 1.14 then implies that
Ωm + Ωr + ΩΛ + Ωk = 1. (1.19)
Matter densities are often expressed relative to this critical density. For example,
the baryon density of the Universe, ρb , is sometimes written as
Ωb = ρb/ρcrit. (1.22)

The matter density of the Universe can be expressed as

ρm = Ωm ρcrit = 1.8789 × 10⁻²⁶ Ωm h² kg m⁻³ = 2.7752 × 10¹¹ Ωm h² M⊙ Mpc⁻³, (1.23)

where 1 M⊙ is the mass of the Sun.
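The numerical coefficient in Equation 1.23 is just the critical density ρcrit = 3H²/(8πG) evaluated at H = 100 km s⁻¹ Mpc⁻¹ (i.e. h = 1). A sketch (mine, not from the book) confirming it:

```python
import math

G   = 6.674e-11   # Newton's gravitational constant, m^3 kg^-1 s^-2
Mpc = 3.0857e22   # one megaparsec in metres

# H = 100 h km/s/Mpc with h = 1, converted to SI units (s^-1)
H = 100e3 / Mpc

# critical density: rho_crit = 3 H^2 / (8 pi G)
rho_crit = 3.0 * H**2 / (8.0 * math.pi * G)
print(rho_crit)  # ≈ 1.88e-26 kg/m^3, matching the coefficient in Eq. 1.23
```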
We shall see in later chapters that most of the matter content of the Universe
is dark matter that neither absorbs nor emits light. Dark matter appears to
interact only (or very nearly only) through gravitation, and its only observational
consequences so far have been via its gravitational effects.
The current experimental values from the WMAP satellite (which we shall meet
later) are
Ωm,0 h² = 0.1326 ± 0.0063, (1.24)
ΩΛ,0 = 0.742 ± 0.030, (1.25)
Ωb,0 h² = (2.273 ± 0.062) × 10⁻². (1.26)
We’ll show in Chapter 2 that the contribution from radiation and neutrinos gives
1.6 The age of the Universe
The value of Ωr,0 is therefore negligible, and we’ll usually assume that it’s zero
in this book. Note that WMAP doesn’t constrain Ωm,0 on its own, but rather
constrains the product of Ωm,0 with the Hubble parameter squared.
from which we can easily find dt/dz. Although admittedly pretty ghastly, this
equation does have the advantage of using only present-day observable quantities,
and we’ll be referring to it several times in this book.
In general, dt/dz can’t be integrated analytically, so t(z) can be calculated only
by numerically integrating dt/dz. The time to z = ∞ is the age of the Universe,
and this is shown in Figure 1.12. Equation 1.34 can also be integrated to give the
time taken for light to reach us from redshift z. This is known as the lookback
time.
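A minimal sketch of that numerical integration (mine, not from the book; the function name is my own): substituting a = 1/(1 + z) turns the age integral into t0 H0 = ∫₀¹ √a da/√(Ωm,0 + Ωk,0 a + ΩΛ,0 a³), with Ωr,0 neglected.

```python
import math

def age_times_H0(omega_m, omega_lambda, n=200000):
    """Dimensionless age t0*H0 = integral over a in (0,1) of
    sqrt(a) / sqrt(Ωm + Ωk*a + ΩΛ*a³) da, where Ωk = 1 − Ωm − ΩΛ
    and radiation is neglected; from dt/dz with a = 1/(1+z)."""
    omega_k = 1.0 - omega_m - omega_lambda
    total, da = 0.0, 1.0 / n
    for i in range(n):
        a = (i + 0.5) * da  # midpoint rule
        total += math.sqrt(a) / math.sqrt(
            omega_m + omega_k * a + omega_lambda * a**3) * da
    return total

print(round(age_times_H0(1.0, 0.0), 4))    # Einstein–de Sitter: 2/3
print(round(age_times_H0(0.27, 0.73), 4))  # close to 1, as in Figure 1.12
```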
Figure 1.12 The age of the Universe times the Hubble parameter, for several
cosmological models. Also shown is the spatial geometry (open, flat and closed)
and whether the present-day expansion of the Universe would be accelerating or
decelerating.
How does this estimate of the age of the Universe compare to the ages of the
oldest objects in the Universe? There is now a well-developed theory for main
sequence stellar evolution that can be used to find the ages of stars. Particularly
useful are globular clusters (e.g. Figure 1.13), which are some of the oldest
gravitationally-bound objects in the Universe. The stars that comprise any given
globular cluster are believed to have formed at about the same time (though the
ages of globular clusters vary). More luminous stars spend less time on the main
sequence in the colour–magnitude diagram, so if one can find the luminosity of
the stars in a globular cluster that are just leaving the main sequence, one can infer
an age for the globular cluster.
The oldest known globular cluster appears to be 12.7 ± 0.7 Gyr old. In the 1990s
it was recognized that globular cluster ages are an important constraint on the age
of the Universe, and therefore on the cosmological parameters that control the
geometry and fate of the Universe (Figure 1.12). But as we shall see in the next
section, there seemed to be very good reasons to expect that we live in a Universe
with Ωm,0 = 1 and Λ = 0, which as you will see turns out to be significantly
younger, so these stars appeared to be older than the Universe.
Exercise 1.5 Using Equation 1.35, show that the age of the Universe in an
Ωm = 1, Λ = 0 model is t0 = 2/(3H0 ), and evaluate the age in Gyr for the value
of H0 in Section 1.5. A spacetime that expands in this way is sometimes called
the Einstein–de Sitter model. ■
Figure 1.14 shows how the Ω parameters depend on redshift, for various
cosmological models. If Ωm = 1 and ΩΛ = 0, then they keep these values
throughout the history of the Universe. This would remove the need to explain the
cosmological fine-tuning in the Ω parameters, and led to the expectation among
at least some astronomers that the likely values are Ωm,0 = 1 and ΩΛ,0 = 0.
However, there is now good experimental evidence to reject this particular
cosmological model, so we are left with the problem: what caused the fine-tuning
in the early Universe?
1.8 Distance in a warped spacetime
Figure 1.16 A simulation of the expanding Universe as seen in proper coordinates and
in comoving coordinates. The panels show a simulation of a segment of the Universe at
redshifts of z = 5.7 (left), z = 1.4 (centre) and z = 0 (right). The upper panels are shown
in proper coordinates, while the lower panels are shown in comoving coordinates. A white
bar shows a comoving length of 125/h Mpc. Note also the gradual increase in large-scale
structure in this simulation with time, which we shall return to in later chapters.
To calculate the comoving distance to a cosmological object, we can use the
Robertson–Walker metric, Equation 1.6. We want to know the radial distance
between us and a distant object, at a fixed coordinate time t = t0 (i.e. the present),
perhaps imagining a tape measure stretched between us and it, which we read at
the time t = t0 . Therefore dt = 0 and dθ = dφ = 0. The remaining non-zero
terms of Equation 1.6 are
\[ ds^2 = \frac{-R^2(t_0)\,dr^2}{1-kr^2}. \]
This is −1 times the square of a spatial separation. We define the comoving
distance dcomoving via
\[ \mathrm{d}d_{\rm comoving} = \frac{R(t_0)\,dr}{\sqrt{1-kr^2}} = \frac{R_0\,dr}{\sqrt{1-kr^2}} \tag{1.36} \]
(with apologies for the profusion of the letter d) so that
\[ d_{\rm comoving} = R_0 \int_0^r \frac{dr'}{\sqrt{1-kr'^2}}. \tag{1.37} \]
Chapter 1 Space and time
1.9 The edge of the observable Universe
thus
\[ d_{\rm comoving} = \frac{c}{H_0}\int_0^z \frac{dz'}{\sqrt{(1+z')^2(1+z'\,\Omega_{m,0}) - z'(2+z')\,\Omega_{\Lambda,0}}}. \tag{1.44} \]
(We have integrated the previous differential from z to 0, but used its minus sign
to swap the limits and obtain a positive integral.) There are a few special cases
where this integral comes out as a relatively simple expression, such as when
Λ = 0 and Ωm = 1:
\[ d_{\rm comoving} = \frac{2c}{H_0}\left[1-(1+z)^{-1/2}\right] \quad \text{only for } \Omega_m = 1,\ \Lambda = 0. \tag{1.45} \]
guise (Equation 1.13), but warned that the left-hand side is not a recession
velocity, but rather an apparent recession. Here, you see our reason for this
warning! An object moving through a flat, unexpanding space has a maximum
speed of c, but an expanding spacetime is a very different physical situation,
and the maximum cosmological apparent ‘recession’ speed in our Universe is
currently about 3.53c.
1.10 Measuring distances and volumes
Figure 1.18 The variation of the angular diameter distance with redshift, plotted as H0 dA/c, for various cosmological models (curves labelled B, D and E).
Again, the Robertson–Walker metric holds another surprise: the angular diameter
distance dA has a maximum value, as you can see in Figure 1.18. What does this
mean? In flat unexpanding space, objects always appear smaller when they are
placed further away, but in the Robertson–Walker spacetime, an object that is
placed further away can appear larger. This is partly because the Universe was
smaller when the light was emitted, so the object was then nearer to us. It’s partly
also to do with the geometry of the space. For example, light rays emitted on a
two-dimensional spherical surface at the South pole will initially diverge, but
will eventually converge again as they approach the North pole. A spherical
unexpanding space would therefore still have a maximum angular diameter
distance.
We shall see how these distances affect Olbers’ paradox later in this book. In the
meantime, the following exercise will give you a clue to how Olbers’ paradox is
resolved in a Robertson–Walker universe.
Exercise 1.6 How does surface brightness (flux per square degree) vary with
redshift? ■
In observational astronomy we rarely measure the total luminosities of distant
objects; instead, we tend to measure the redshift and the flux in a particular
wavelength interval Δλobs . Two effects change the observed flux: first, the
observed wavelength interval Δλobs corresponds to a smaller wavelength interval
in the emitted frame, because Δλobs = (1 + z)Δλem ; second, the distant object
may emit different amounts of light at rest wavelengths of λem and λobs . This
latter effect is known as the K-correction for historical reasons, and we shall
meet it later in this book. If the underlying spectrum is a ‘power law’, i.e. if the
flux per unit frequency is Sν ∝ ν^{−α}, then a useful expression for the luminosity is
\[ \frac{L_\nu}{10^{26}\,\mathrm{W\,Hz^{-1}\,sr^{-1}}} = \frac{S_\nu}{10^{-26}\,\mathrm{W\,Hz^{-1}\,m^{-2}}} \left(\frac{d_M}{3241\,\mathrm{Mpc}}\right)^2 (1+z)^{1+\alpha}. \tag{1.51} \]
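Equation 1.51 is simple to evaluate once the observables are expressed in the quoted units. A sketch (the function name and example numbers are ours, chosen purely for illustration):

```python
def l_nu_1e26(s_nu_1e26, d_m_mpc, z, alpha):
    """Equation 1.51: L_nu in units of 1e26 W/Hz/sr, given S_nu in units
    of 1e-26 W/Hz/m^2, the proper motion distance d_M in Mpc, and a
    power-law spectral index alpha (S_nu proportional to nu**-alpha)."""
    return s_nu_1e26 * (d_m_mpc / 3241.0)**2 * (1 + z)**(1 + alpha)

# A hypothetical source with S_nu = 1e-26 W/Hz/m^2 at z = 1 and d_M = 3241 Mpc,
# so the distance factor is unity and only the (1+z) factor remains:
print(l_nu_1e26(1.0, 3241.0, 1.0, 0.8))   # 2**1.8, i.e. about 3.48
```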
Finally, cosmologists often use the term comoving volume to describe volumes
with the expansion factor divided out (Figure 1.16). We shall use this many
times throughout this book. Imagine a patch of sky with an angular area δΩ
(in units, for example, of square degrees). We can convert this to a proper
area at any redshift using the angular diameter distance: δA = dA² δΩ. Now
imagine that we are observing a slab at redshift z with a proper area δA and
proper thickness R(1 − kr²)^{−1/2} dr. The proper volume will therefore be
dVproper = δA × (1 − kr²)^{−1/2} R dr, or
\[ dV_{\rm proper} = d_A^2(z)\,\delta\Omega \times \frac{R\,dr}{\sqrt{1-kr^2}}. \tag{1.52} \]
Now the comoving volume is just dVcomoving = (1 + z)³ × dVproper, so
\[ dV_{\rm comoving} = d_A^2(z)\,\delta\Omega\,\frac{R\,dr}{\sqrt{1-kr^2}} \times (1+z)^3. \tag{1.53} \]
There are many ways of integrating this, but one approach is to express it in terms
of the proper motion distance dM = R0 r. (Recall that in a flat universe this is
equivalent to the comoving distance.) Then
\[ dV_{\rm comoving} = \frac{d_M^2(z)}{\sqrt{1+\Omega_{k,0}H_0^2\,d_M^2/c^2}}\,d(d_M)\,\delta\Omega. \tag{1.54} \]
1.11 The fate of the Universe
(1.55)
where DH = c/H0 is sometimes known as the Hubble distance. Also, for
reference, the proper motion distance can be expressed as
\[
d_M = \begin{cases}
\dfrac{D_H}{|\Omega_{k,0}|^{1/2}}\,\sin\!\left[|\Omega_{k,0}|^{1/2}\displaystyle\int_0^z \left[(1+z')^2(1+\Omega_{m,0}z') - z'(2+z')\Omega_{\Lambda,0}\right]^{-1/2} dz'\right] & \text{if } k = +1, \\[2ex]
d_{\rm comoving} & \text{if } k = 0, \\[2ex]
\dfrac{D_H}{|\Omega_{k,0}|^{1/2}}\,\sinh\!\left[|\Omega_{k,0}|^{1/2}\displaystyle\int_0^z \left[(1+z')^2(1+\Omega_{m,0}z') - z'(2+z')\Omega_{\Lambda,0}\right]^{-1/2} dz'\right] & \text{if } k = -1.
\end{cases} \tag{1.56}
\]
Equation 1.44 gives the expression for dcomoving . Remember that proper motion
distance dM is equal to comoving distance dcomoving only when k = 0.
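Equation 1.56 can be turned into a short routine. A sketch in units of the Hubble distance DH (the function name is ours; note that Ωk,0 > 0 corresponds to k = −1 and Ωk,0 < 0 to k = +1):

```python
import numpy as np

def dm_over_dh(z, om0, ol0, n=200_000):
    """Proper motion distance d_M / D_H, following Equation 1.56."""
    ok0 = 1.0 - om0 - ol0
    zp = np.linspace(0.0, z, n)
    f = ((1 + zp)**2 * (1 + om0 * zp) - zp * (2 + zp) * ol0)**-0.5
    # comoving distance in units of D_H (Equation 1.44), by trapezoidal rule
    chi = float(np.sum(0.5 * (f[1:] + f[:-1]) * (zp[1] - zp[0])))
    if abs(ok0) < 1e-12:
        return chi                      # k = 0: d_M equals d_comoving
    s = abs(ok0)**0.5
    if ok0 > 0:
        return np.sinh(s * chi) / s     # k = -1 (open)
    return np.sin(s * chi) / s          # k = +1 (closed)

# The flat Omega_m = 1 case must reproduce Equation 1.45:
assert abs(dm_over_dh(1.0, 1.0, 0.0) - 2 * (1 - 2**-0.5)) < 1e-6
```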
Figure 1.19 The variation of the comoving volume derivative dV/dz with redshift, for the various cosmological models in Figure 1.18 (curves A to E), and for an angular area on the sky of δΩ. The vertical axis shows (H0/c)³ (dV/dz)/δΩ, from 0.001 to 10; the horizontal axis shows redshift z, from 0.1 to 100.
This differential equation can be solved numerically, and the predicted fate
of the Universe is shown in Figure 1.20 as a function of Ωm,0 and ΩΛ,0 . The
cosmological parameters in Section 1.5 are very clearly in the regime of
expanding forever.
Figure 1.20 The predicted fate of the Universe as a function of Ωm,0 and ΩΛ,0. Lines divide models that expand forever from those that recollapse eventually, accelerating from decelerating models, and closed from flat and open geometries; at high ΩΛ,0 there is a region with no Big Bang.
What will this be like? The Universe will become increasingly sparse, as the
matter density decreases and Ωm tends to zero. The cosmological constant will
then dominate, and the Universe will tend to the ΩΛ = 1 model. Equation 1.14
will reduce to
\[ H^2 = \frac{\Lambda c^2}{3} \]
so the Hubble constant will, finally, be truly a constant. The expansion will be
exponential, as you can see from substituting H = Ṙ/R into the equation above:
\[ \frac{1}{R}\frac{dR}{dt} = \sqrt{\frac{\Lambda c^2}{3}}, \]
which has the solution R ∝ e^{ct√(Λ/3)}. This is sometimes known as de Sitter
spacetime.
This exponential expansion has a curious consequence. Regions of the Universe
that were once in causal contact eventually lose contact with each other, as the
rapidly expanding space makes it impossible for even light signals to pass between
them. To show this, imagine a light signal being sent out in this universe. How far
can it get? The light signal will be in an exponentially expanding universe, so in a
sense it will go infinitely far, but if we normalize our distances by the scale factor,
then we’ll see that the light signal reaches only a finite comoving region.
Light signals still satisfy Equation 1.39, i.e. R(t) dr = c dt (note that k = 0
because Ωk = 1 − Ωm − ΩΛ = 1 − 0 − 1 = 0), but this time R(t) is very
different. If we set R(t) = R1 at a time t = t1 , then
\[ \frac{R(t)}{R(t_1)} = \frac{R(t)}{R_1} = \frac{e^{ct\sqrt{\Lambda/3}}}{e^{ct_1\sqrt{\Lambda/3}}} = e^{c(t-t_1)\sqrt{\Lambda/3}}. \]
If we treat R1 r as our choice of comoving distance, then we have
\[ R_1\,dr = c\,e^{-c(t-t_1)\sqrt{\Lambda/3}}\,dt \]
so
\[ R_1 r = \int_{t_1}^{\infty} c\,e^{-c(t-t_1)\sqrt{\Lambda/3}}\,dt = \frac{c}{c\sqrt{\Lambda/3}} = \sqrt{\frac{3}{\Lambda}} = \frac{c}{H}. \]
So as time tends to infinity, the light signal penetrates a comoving distance of only
R₁r = √(3/Λ). Objects beyond a comoving distance of √(3/Λ) cannot be seen,
because the intervening space is expanding so quickly that even a light signal
cannot cross it. But H is constant and the expansion rate is unchanging, so this is
true at any time.
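The horizon integral above is easy to verify numerically. A sketch in units where c = 1 and Λ = 3, so the expected comoving distance √(3/Λ) is exactly 1:

```python
import numpy as np

# Integrate c * exp(-c (t - t1) sqrt(Lambda/3)) from t1 out to (effectively)
# infinity. With c = 1 and Lambda = 3 the integrand is exp(-t), and the
# light signal should reach a comoving distance of exactly 1.
t = np.linspace(0.0, 40.0, 400_000)   # 40 e-foldings is effectively infinite
f = np.exp(-t)
r = float(np.sum(0.5 * (f[1:] + f[:-1]) * (t[1] - t[0])))   # trapezoidal rule
assert abs(r - 1.0) < 1e-6
```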
What this would look like is a fixed horizon around you at a
distance of √(3/Λ), with neighbouring galaxies being accelerated away from you
towards this horizon and out of your observable Universe, which is gradually
being emptied out. However, you would never see a galaxy cross this horizon: its
redshift would get larger as it approached the horizon, and if you could watch a
clock in that galaxy, the time dilation of that clock would grow ever larger. If t₂ is the
coordinate time when the galaxy reaches the horizon, then you would see the
clock slow as it approached t₂, but it would never quite reach t₂ from your
point of view. However, from the galaxy's point of view, the passage of time is
unaffected. There, they would see your clocks running slowly, as you passed out
of their observable Universe.
You may recognize this redshifting and time dilation from descriptions of objects
falling into the event horizon of a black hole (which you will also meet in
Chapter 6). Indeed, the horizon at √(3/Λ) is a cosmological event horizon. The
Universe in the far future will look like a black hole, but inside-out.
Exercise 1.7 How big, in megaparsecs and in metres, will the cosmological
event horizon be? You will need the cosmological parameters in Section 1.5. How
does this compare to the current size of the observable Universe? ■
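A sketch of Exercise 1.7, assuming H0 = 70 km s⁻¹ Mpc⁻¹ and ΩΛ,0 = 0.7 for illustration (the actual Section 1.5 values are not reproduced in this excerpt). Since ΩΛ = Λc²/(3H²), the horizon √(3/Λ) equals c/(H0√ΩΛ,0):

```python
C_KM_S = 2.998e5     # speed of light in km/s
MPC_M = 3.0857e22    # metres per megaparsec

def event_horizon_mpc(h0=70.0, ol0=0.7):
    """Cosmological event horizon sqrt(3/Lambda) = c / (H0 sqrt(Omega_Lambda,0)),
    in Mpc, for assumed (illustrative) cosmological parameters."""
    return C_KM_S / (h0 * ol0**0.5)

d_mpc = event_horizon_mpc()
print(f"{d_mpc:.0f} Mpc = {d_mpc * MPC_M:.2e} m")   # ~5100 Mpc, ~1.6e26 m
```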
How far ahead can we look? When Ralph Alpher and George Gamow realized
that the early Universe was hot and dense enough for nuclear reactions, and
calculated the amounts of heavy elements produced, they were condemned by
some physicists for their rashness. What grounds do we have, the critics argued,
for believing that the same physical theories applied three minutes after the
Big Bang? The Universe provided the rebuttal: the predictions of primordial
nucleosynthesis have been very extensively confirmed, as we shall see in later
chapters. Nevertheless, the words of warning from these critics should still ring in
our ears as we extrapolate to the distant future.
At the moment all the baryons in the Universe are either involved in the cycle of
star birth and death, or could potentially take part. However, by about the year
one trillion (10¹² years), the Universe will be too sparse to support more star
formation. At that point, baryons will either be in degenerate matter (in white
dwarfs or neutron stars), or be locked up in brown dwarfs, or have fallen into black
holes, or just be atoms or molecules too sparsely distributed to form new stars.
Looking further ahead, we eventually reach the epoch of possible proton decay.
In the standard model of particle physics, the proton is stable and does not decay.
However, some ‘grand unified theories’ in particle physics predict eventual
proton decay. The best current limit on the proton half-life t1/2 comes from the
Super-Kamiokande experiment in Japan, which found t1/2 > 10³⁵ years. Perhaps
10³⁵ years is the furthest ahead that one might venture to predict the contents of
the Universe. But who would be around to contradict you if you got it wrong?
Summary of Chapter 1
1. In a flat Euclidean space, the number counts of a homogeneous and isotropic
distribution of objects vary as dN/dS ∝ S^{−5/2}, but the total flux diverges.
2. In special relativity, lengths and times are not observer-independent, but the
relativistic interval s is invariant.
3. δs = 0 always for light rays, and δs = c δτ always for massive particles,
where τ is proper time.
4. Any homogeneous, isotropic expanding Universe consistent with special
relativity can be described by the Robertson–Walker metric
\[ ds^2 = c^2\,dt^2 - R^2(t)\left[\frac{dr^2}{1-kr^2} + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\right]. \tag{Eqn 1.6} \]
5. Cosmological redshift z, given by
\[ 1+z = \frac{R_0}{R_1} = \frac{1}{a} = \frac{\lambda_{\rm observed}}{\lambda_{\rm emitted}}, \tag{Eqn 1.10} \]
is caused by the expansion of the Universe, not by the Doppler effect.
Random galaxy motions (known as proper motions) can contribute
additional red or blue shifts from the relativistic Doppler effect.
6. Nevertheless, if one regards cosmological redshift as an apparent recession
velocity, then the apparent velocity is proportional to distance from the
observer, with the constant of proportionality known as the Hubble
parameter.
7. The contributions to the energy density of the Universe from matter and the
cosmological constant are denoted as Ωm and ΩΛ , respectively, and are
defined by
\[ \Omega_m = \frac{8\pi G\rho_m}{3H^2}, \tag{Eqn 1.15} \]
\[ \Omega_\Lambda = \frac{\Lambda c^2}{3H^2}. \tag{Eqn 1.17} \]
These determine the age and fate of the Universe.
8. Neglecting radiation, if Ωm + ΩΛ = 1 at any time, then this is true at all
times. Also, if either Ωm or ΩΛ is zero, then this is also true at all times. In
all other situations, there is a fine-tuning problem in the early Universe for
the values of Ωm and ΩΛ .
Further reading
• For a more leisurely introduction to the Robertson–Walker metric, see
Lambourne, R., 2010, Relativity, Gravitation and Cosmology, Cambridge
University Press.
• For a useful review of distance measures in cosmology (though pre-dating dark
energy, which we shall meet in later chapters, and the observation that
ΩΛ ≃ 0.7), see Carroll, S.M., Press, W.H. and Turner, E.L., 1992, ‘The
cosmological constant’, Annual Review of Astronomy and Astrophysics,
30, 499.
Chapter 2 The cosmic microwave background
I would rather live in a world where my life is surrounded by mystery than
live in a world so small that my mind could comprehend it.
Harry Emerson Fosdick
Introduction
We effectively answer Olbers’ paradox in this chapter, with the first and most
famous cosmic background light. You will also find out how quantitative
cosmology is done using this background, which has already resulted in two
Nobel prizes.
2.1 The discovery of the cosmic microwave background
Figure 2.1 The intensity of the CMB radiation (in J s⁻¹ m⁻² sr⁻¹ Hz⁻¹), measured by various techniques, including ground-based and balloon-borne experiments, as a function of wavelength (from 300 cm to 0.03 cm). Both the x-axis and y-axis are plotted logarithmically.
Universe. From a detailed statistical mechanical calculation² (we shall spare you
the details), the fraction of ionized gas x near z = 1000 comes out as
\[ x(z) \simeq 2.4\times10^{-3}\,\frac{\sqrt{\Omega_{m,0}h^2}}{\Omega_{b,0}h^2}\left(\frac{z}{1000}\right)^{12.75}, \tag{2.1} \]
i.e. it depends on the density of baryons relative to the critical density, Ωb,0 , and
on Ωm,0 . We’ll see later how the CMB gives estimates of Ωm,0 and Ωb,0 . (The
physical process of electrons binding with protons to make hydrogen is known as
‘recombining’ but this is a misnomer here, because they are in fact combining for
the first time.)
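Equation 2.1 is quick to evaluate for illustrative densities (Ωm,0h² = 0.14 and Ωb,0h² = 0.022 are our assumed values here, not taken from the book's Section 1.5):

```python
def ionized_fraction(z, omh2=0.14, obh2=0.022):
    """Equation 2.1: x(z) ~ 2.4e-3 sqrt(Omega_m,0 h^2)/(Omega_b,0 h^2)
    times (z/1000)**12.75, for assumed (illustrative) densities."""
    return 2.4e-3 * omh2**0.5 / obh2 * (z / 1000.0)**12.75

# The very steep z-dependence makes the gas switch from ionized to
# neutral over a narrow redshift interval around z ~ 1000:
print(ionized_fraction(1100), ionized_fraction(1000), ionized_fraction(900))
```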
² Jones, B.J.T. and Wyse, R.F.G., 1985, Astronomy and Astrophysics, 149, 144.
cities. They also found pigeons roosting in the antenna and needed to remove
what Penzias later called a ‘white dielectric material’. Although they trapped
the pigeons and released them thirty miles away, the pigeons kept returning;
reluctantly, the birds were eventually shot. The anomalous isotropic noise source
remained, and neither Penzias nor Wilson could find its source.
Unknown to both, a rival group in Princeton, led by Robert Dicke, had just
predicted an isotropic CMB from the Big Bang theory. Dicke’s group were
planning an experiment to detect it. Once Dicke heard of Penzias and Wilson’s
anomalous noise, Dicke spoke on the telephone to the Bell Laboratories group,
after which he announced to his team: ‘Boys, we’ve been scooped’.
The result was that two back-to-back papers in the Astrophysical Journal
announced the discovery to the world: Penzias and Wilson published the
discovery itself, while Dicke’s group published the theoretical explanation.
Penzias and Wilson co-won the 1978 Nobel Prize in Physics for their discovery
(along with Pyotr Kapitsa for a different discovery).
The story has further twists. It turned out that the CMB prediction was implicit in
calculations published in 1948 by George Gamow, Ralph Alpher and Robert
Herman. Furthermore, in 1941 Andrew McKellar measured the ‘effective
temperature of interstellar space’ in a careful experiment using spectroscopy to
derive the typical energy level excitations of molecules in the interstellar medium
(we shall see how in Section 2.2 and in later chapters). He found this to be about
2.3 K, but the work pre-dated the Big Bang predictions and neither he nor any
reader appreciated its significance at the time. The currently accepted value of the
CMB temperature is 2.725 ± 0.001 K.
Figure 2.2 All-sky maps of the CMB temperature. The top image is scaled from 0 K to 4 K and looks very uniform. The next image has a much smaller scaling, with a range of just 3.353 mK. The pattern is the dipole that results from the Doppler shift due to the Earth's motion relative to the cosmic rest frame (Section 2.11). After accounting for this motion and further restricting the temperature range to just 18 µK, we see a large band due to our Galaxy and the background primordial fluctuations where the Galactic foreground does not outshine them. These three images were taken with the COBE satellite; the final image is higher-resolution data from the WMAP satellite.

Exercise 2.2 Stefan's law can be shown to imply that the energy density is
4σT⁴/c. Use this to show that the present-day CMB radiation energy density of
the Universe is Ωr,0 h² = 2.5 × 10⁻⁵. (The value of Stefan's constant σ is
5.67 × 10⁻⁸ W m⁻² K⁻⁴.) ■

There is also an additional contribution from light neutrinos to Ωr of about
68%, so the total relativistic energy density is slightly larger (Equation 1.27 in
Chapter 1) with a value Ωr,0 h² = 4.2 × 10⁻⁵.

The CMB is also remarkably uniform (Figure 2.2); we'll show that this presents a
very serious cosmological problem. If we increase the contrast level, we first find
a characteristic hot and cold pattern (Figure 2.2). We'll show in Section 2.11 that
this is caused by our motion relative to the cosmic rest frame. Correcting for
this motion, we find a strong signal from our Galaxy (the horizontal band in
Figure 2.2), and apart from this we find that the CMB has intrinsic fluctuations at
the level of microkelvins. There is no single clear physical theory for the level of
these fluctuations, though we'll see in Section 2.8 what we know about how they
may have been generated.

2.2 The CMB temperature as a function of redshift

Why is the CMB such a perfect black body? Shouldn't the redshifting of the
photons distort the spectrum? It turns out that black body radiation has the
the relative strengths of emission lines in high-redshift objects to constrain the gas
temperatures, and so place upper limits on the CMB temperature.
One recent measurement of this temperature at a redshift of z = 2.418 37 yielded
T = 9.15 ± 0.32 K using carbon monoxide rotational modes, and the CMB has
been argued to dominate the CO excitation in this system. This beautifully
confirms the predicted CMB temperature of 9.315 ± 0.007 K at that redshift.
(Srianand, R. et al., 2008, Astronomy & Astrophysics, 482, L39–42.)
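The quoted prediction is just the scaling T(z) = T0(1 + z), applied at that redshift; a quick check:

```python
T0 = 2.725            # present-day CMB temperature in K
z = 2.41837
T_pred = T0 * (1 + z)  # CMB temperature scales as (1 + z)
print(f"{T_pred:.3f} K")   # 9.315 K, matching the quoted prediction
```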
One subtle question often asked about the CMB temperature is: where did the
photons’ energy go? The emitted energy of any photon would have been hνemitted,
but the energy received is hνemitted/(1 + z).
Could it be a gravitational redshift? A gravitational field can certainly redshift
photons. A light signal sent from the surface of the Earth will be received in space
with a slightly redder wavelength, because energy has been lost by climbing up
the gravitational potential well. However, this cannot be the case in the expanding
Universe, because it is homogeneous and isotropic.
An analogy is sometimes made to the adiabatic expansion of a photon gas. If you
have a photon gas contained in a box, and expand the box by a factor (1 + z) (so
the volume increases by (1 + z)³), the energy density will decrease by a factor
(1 + z)⁻⁴ and the temperature by (1 + z)⁻¹, as we have argued above. However,
in this case the photon gas does p dV work against the sides of the box. In the
cosmological case, though, the issue of the p dV work is much more subtle.
Unfortunately, this issue takes us into very deep waters. In general relativity, the
separate concepts of energy conservation and momentum conservation are
replaced by zero derivatives of the ‘energy–momentum tensor’. The short answer
is that ‘energy per unit volume’ becomes dependent on the reference frame, so
‘energy’ on its own cannot be said to be conserved, although a more general
conservation law does apply (see the further reading section).
To see why ‘energy per unit volume’ is dependent on the reference frame, imagine
that the Universe is filled with a pressureless gas of particles, and for simplicity
assume just the flat Minkowski metric of special relativity. Suppose that the
particles each have a rest mass m and that their number density is n particles per
unit volume. The energy density will therefore be mc² × n. If we now make
a Lorentz transformation to a reference frame that's moving relative to the
first, we can see that the mass will be increased by a factor of γ to γm, while
Lorentz contraction of the volumes will result in an increase in n by the same
factor, to γn. The moving observer will therefore see an energy density of
γmc² × γn = γ²mc²n. Therefore the energy density isn't an invariant scalar,
because different observers see different energy densities. It also can’t be a
component of a four-vector, because when you Lorentz transform a four-vector
you get only one γ factor. So ‘energy density’ has to be part of a different sort of
mathematical object. This object is a tensor; we shall meet more examples of
tensors later in this book.
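The γ² argument can be checked by an explicit matrix computation: boost the energy–momentum tensor of pressureless dust and inspect the time–time component. A sketch in units with c = 1 (the numbers are illustrative):

```python
import numpy as np

rho = 1.0
T = np.diag([rho, 0.0, 0.0, 0.0])    # pressureless dust: only T^00 = rho c^2

v = 0.6                              # boost speed (c = 1)
g = 1.0 / np.sqrt(1.0 - v**2)        # Lorentz factor gamma = 1.25
L = np.array([[g, -g * v, 0.0, 0.0],
              [-g * v, g, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]]) # Lorentz boost along x

T_boosted = L @ T @ L.T              # a rank-2 tensor picks up two factors of L
# the boosted energy density is gamma^2 times the rest-frame value:
assert abs(T_boosted[0, 0] - g**2 * rho) < 1e-12
```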
Exercise 2.3 Show that the redshift of matter–radiation equality zeq , when the
energy densities of matter and radiation (including neutrinos) were comparable, is
given by
\[ 1+z_{\rm eq} = 23\,800\,\Omega_{m,0}h^2 \left(\frac{T_{\rm CMB,0}}{2.725\,\mathrm{K}}\right)^{-4}, \tag{2.4} \]
where TCMB,0 is the present-day CMB temperature. Evaluate zeq . ■
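Equation 2.4 is quick to evaluate; a sketch assuming Ωm,0h² = 0.14 for illustration (use the book's Section 1.5 values in the exercise itself):

```python
def z_eq(omh2=0.14, t_cmb=2.725):
    """Equation 2.4: redshift of matter-radiation equality, for an assumed
    (illustrative) value of Omega_m,0 h^2."""
    return 23_800 * omh2 * (t_cmb / 2.725)**-4 - 1

print(f"z_eq = {z_eq():.0f}")   # ~3300 for these assumed parameters
```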
2.3 Why is the CMB a black body?
Exercise 2.4 In Section 1.6, we showed that a universe with Ωm = 1 and all
other density parameters zero obeys a = R/R₀ ∝ t^{2/3}. Make a similar analysis
for a universe in which Ωr = 1 and the other density parameters are negligible,
and show that in this case a = R/R₀ ∝ t^{1/2}. ■
and partly because they struggle to account for other observations (microwave
background and its anisotropies, the evolution of large-scale structure, Big Bang
nucleosynthesis, etc.) without appearing contrived to many. For these reasons we
won’t discuss these models further in this book.
2.4 Baryogenesis
One of the deep unsolved problems in fundamental physics is why our Universe
has more matter than antimatter. In the standard model of particle physics,
baryon number is strictly conserved. (A baryon such as a proton or a neutron has
a baryon number of B = +1, while an antibaryon such as an antiproton or
antineutron has a baryon number of B = −1.) Baryons are created and destroyed
in baryon–antibaryon pairs, conserving baryon number, yet somehow we find
ourselves in a Universe composed almost entirely of matter. We could adopt the
position that it’s somehow set in the initial conditions, as is also argued for
‘explaining’ the expansion of the Universe, but in both cases this arguably evades
the question. One might suppose that there are distant galaxies made of antimatter
rather than matter, but the intergalactic medium is not empty and there should be
very clear observable signatures from ongoing annihilation along the boundary
between the matter and antimatter regions.
While the Universe was hot enough that kT was above the proton rest mass
energy, protons and antiprotons would have been being created and destroyed
from particles colliding at their thermal velocities. As the Universe expanded and
cooled, this baryon–antibaryon creation eventually ceased and the protons
and antiprotons were free to annihilate, and it turns out that the collision rates
were high enough for this to happen very efficiently. What we see now as a
Universe with mainly matter (rather than antimatter) is in fact the relic of a subtle
asymmetry in the early baryon versus antibaryon numbers, of the order of
1 + 10⁻⁹ protons for every antiproton. What generated this initial imbalance?
Why is there any matter left in the Universe?
This on its own tells us that there must be new physics beyond the standard model
of particle physics. What sort of theory could explain baryosynthesis? The answer
may come from so-called grand unified theories (GUTs) that unify three of the
four fundamental forces: the strong nuclear force, the weak nuclear force and
electromagnetism. These forces appear distinct with different strengths, but their
strengths are predicted in GUTs to converge at energies of ∼10¹⁵ GeV. GUT
reactions above these energies could be the source of the present-day baryon
asymmetry in the Universe. However, it’s by no means certain that this is the
correct answer. An epoch of inflation (which we shall meet in Section 2.8) would
erase any pre-existing baryon asymmetry. Inflation is thought to be triggered by a
GUT-scale phase transition (see Section 2.8), and once inflation has finished, it
leaves the Universe at a lower temperature than the GUT scale. This would leave
us with no baryon asymmetry, unless generated during the processes that end
inflation.
There must also have been a primordial lepton asymmetry, otherwise the excess
number of protons over antiprotons would have left the Universe with an overall
electric charge. GUTs view baryons and leptons as different states of one common
species of particle, and in many GUTs, B − L is conserved (L being the lepton
2.5 The entropy per baryon
number, e.g. +1 for e− and νe, −1 for e+ and ν̄e), so this lepton asymmetry would
be a natural consequence of the baryon asymmetry. The present-day lepton
asymmetry would now reside in the cosmic neutrino background, which we shall
meet in Section 2.6.
much rarer relative to photons, compared to the Sun’s core. Nuclear reactions at
that time were therefore slow. However, it turns out that the nuclear reaction rates
were significant earlier on, at higher temperatures of around 10⁹ K.
It’s an extraordinary intellectual triumph of the Big Bang theory that it’s possible
to calculate the nuclear reaction rates in the early Universe and estimate the
abundances of nuclei and particles from this primordial nucleosynthesis. These
estimates are in fairly good agreement with observations, as we shall see in this
section. The key concept is freeze-out, which we shall meet first in the context of
protons and neutrons.
In thermal equilibrium, the relative numbers of protons and neutrons, np and nn
respectively, will be related through a Boltzmann distribution:
\[ \frac{n_n}{n_p} = \exp\left(\frac{-\Delta m\,c^2}{kT}\right) \simeq \exp\left(\frac{-10^{10.176}\,\mathrm{K}}{T}\right), \tag{2.7} \]
where Δm ≃ 1.29 MeV/c² is the mass difference between a neutron and a
proton. (Strictly speaking this is true only when the protons and neutrons are
non-relativistic, but this is the case in the following discussion.) Protons
and neutrons will be converting between each other through the reactions
p + e− ⇌ n + νe and p + ν̄e ⇌ n + e+. The rate v of either reaction
can be calculated through the theory of the weak nuclear force, and in the
high-temperature limit it turns out that the reaction rates are the same and have a
very strong temperature-dependence:
\[ v = \left(\frac{10^{10.135}\,\mathrm{K}}{T}\right)^{-5}\ \mathrm{s^{-1}}. \tag{2.8} \]
Meanwhile, the ambient temperature is changing as the Universe expands, varying
as (1 + z) (see Section 2.2). The early Universe was approximately spatially flat
(because Ωk ≃ 0 in the early Universe — see Chapter 1) and radiation-dominated
(also Chapter 1), from which we find that the scale factor satisfies R ∝ t^{1/2}
(Exercise 2.4). From this it follows that the temperature varies with time as
t ∝ T^{−2}, and putting in the constants of proportionality gives
\[ t = \left(\frac{10^{10.125}\,\mathrm{K}}{T}\right)^{2}\ \mathrm{s}. \tag{2.9} \]
As a result, the reaction timescale (1/v) for p + e− ⇌ n + νe will quite suddenly
become longer than the age of the Universe at a time t = 1/v. After this point the
neutron–proton reactions cease and the neutron–proton ratio (Equation 2.7) is
frozen out at the equilibrium value that it had at that time. The time t = 1/v
corresponds to a temperature of
\[ \left(\frac{10^{10.125}\,\mathrm{K}}{T}\right)^{2} = \left(\frac{10^{10.135}\,\mathrm{K}}{T}\right)^{5}, \tag{2.10} \]
which we can rearrange to find T = 10^{10.142} K. Plugging this into Equation 2.7,
we find that the relic neutron–proton ratio must be
\[ \frac{n_n}{n_p} \simeq \exp\left(\frac{-10^{10.176}\,\mathrm{K}}{10^{10.142}\,\mathrm{K}}\right) \simeq 0.34. \tag{2.11} \]
At this time, the Universe was only about one second old (Equation 2.9).
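The freeze-out estimate above can be reproduced in a few lines (the variable names are ours):

```python
import numpy as np

# Setting the reaction timescale 1/v (Eqn 2.8) equal to the age t (Eqn 2.9)
# gives (10**10.125 / T)**2 = (10**10.135 / T)**5, i.e. Equation 2.10,
# so T**3 = 10**(5*10.135 - 2*10.125).
T_freeze = 10 ** ((5 * 10.135 - 2 * 10.125) / 3)   # ~10**10.142 K

ratio = np.exp(-10**10.176 / T_freeze)   # Equation 2.7 at freeze-out
age = (10**10.125 / T_freeze)**2         # Equation 2.9, in seconds

assert abs(np.log10(T_freeze) - 10.142) < 1e-3
assert abs(ratio - 0.34) < 0.01          # Equation 2.11
assert 0.8 < age < 1.1                   # "only about one second old"
```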
2.6 Primordial nucleosynthesis: a thousand seconds that shaped the Universe
There are some slight corrections to this, and more detailed calculations
give nn/np ≃ 1/7 for the present-day relic abundance of neutrons and
protons. For example, we’ve assumed an instantaneous transition. Also,
the temperature-dependence of the rate will be slightly modified at lower
temperatures, because Equation 2.8 is the high-T limiting case. Another potential
complication is the fact that other reactions will be happening at the same time as,
for example, p + e− ⇌ n + νe. For instance, atomic nuclei will form from the
protons and neutrons. While the ambient thermal energy kT of particles is much
larger than nuclear binding energies, these nuclei will quickly be destroyed again,
so we can ignore these reactions for now.
We’ve gone carefully through this freeze-out process because it’s one of the
key physical principles in the nuclear reactions of the primordial fireball. For
example, the electron–positron annihilation creates neutrinos through the weak
interaction reaction e− + e+ ⇌ νe + ν̄e (as well as annihilating through the
electromagnetic interaction e− + e+ ⇌ γ + γ). Neutrino production freezes out
at a temperature of approximately 10^{10.5} K, corresponding to a cosmic time of
about 0.18 seconds, leaving most of the electrons and positrons to annihilate to
make more photons, which happens at a temperature of T ≃ me c²/k ≃ 10^{9.77} K,
corresponding to a time (Equation 2.9) of about five seconds, i.e. shortly after the
neutron–proton freeze-out. Neutrinos interact only very weakly with other matter,
and their freeze-out happened before the epoch of the CMB; there should be a
cosmic neutrino background from much earlier cosmic epochs than the CMB.
Detailed calculations of the relic abundances of protons and neutrons take
into account the ongoing changes in the neutrino population, though we won’t
discuss this in this book. There are formidable experimental challenges to the
direct detection of the primordial neutrino background, though the presence
of these neutrinos can be inferred indirectly from the structures in the CMB.
(Nevertheless, neutrinos from astrophysical sources have been detected, most
famously from the supernova SN 1987A.)
Neutrons aren’t stable but instead decay with a mean lifetime of τ = 885.7 ± 0.8 s.
Why are there any left? Why isn’t the Universe pure hydrogen? Luckily for us,
the temperatures soon dropped enough to allow the formation of atomic nuclei. To
see why, compare the binding energy of the first nuclide heavier than hydrogen
(deuteron, 2.225 MeV) with the electron rest mass energy (0.511 MeV) and the
neutron–proton mass difference (1.3 MeV). The Universe was still only a few
seconds old.
A note on terminology
Children are taught at school that nuclear reactions are not combustion, so
it’s not correct to refer to nuclear reactions as ‘burning’. This is quite right.
However, at this level it’s usually felt that there is no danger of confusing
this with combustion, so the technical literature makes free use of the verb
‘to burn’ and related words. For example, the early Universe is sometimes
referred to as the primordial fireball. I once heard a supernova described in a
seminar as ‘like a forest fire, but the trees can run away’.
The nuclear reactions in the next 1000 seconds or so shaped the baryonic content
of the Universe. To calculate the final mix of elements left at the end of these
nuclear reactions, you need to consider all the different reactions. We won’t do
this here (see the further reading section for more details), but the main processes
are summarized in Table 2.1 and Figure 2.3.
Figure 2.3 Predicted light element abundances during the primordial fireball: the mass fractions of ¹₁H, n, ²₁H, ³₁H, ³₂He, ⁴₂He, ⁷₃Li and ⁷₄Be, plotted from 10⁻¹⁴ to 1 against time from 10 to 10⁴ seconds. Many of the reactions involved in changing these abundances are listed in Table 2.1.
[Figure: number abundance relative to hydrogen (from 1 down to 10⁻¹⁰), with curves for ⁴He, ²H, ³He + ²H and ⁷Li.]
Figure 2.4 The light element abundances relative to ¹H, as a function of the present-day baryon density times h².
The vertical area is the constraint on the baryon density from WMAP. The curves show the predictions from
nucleosynthesis calculations, while the horizontal boxes show the observational constraints. There is a broadly
consistent picture apart from the ⁷Li abundance, but this element can be destroyed in stars so is difficult to measure.
The best constraints on the deuterium abundance have come from the absorption
lines in neutral hydrogen clouds. Here, the light from a background source
(such as a quasar) passes through a foreground dense neutral hydrogen clump.
The abundance of deuterium is low, but if that clump is sufficiently dense,
2.7 The need for new physics
hints. For example, where did the matter–antimatter asymmetry of the Universe
come from (Section 2.4)? Also, what caused the initial inhomogeneities in the
Universe? If the Universe were perfectly homogeneous, it would have stayed
homogeneous and no stars or galaxies would have formed. Something must have
given the Universe its initial density perturbations. Also, what triggered the initial
expansion? This is often written off as part of the initial conditions, but isn’t
that evading the question? We’ve also met the fine-tuning problems for the
cosmological density parameters Ωm and ΩΛ in Chapter 1 (Section 1.7), known as
the flatness problem.
There is another very fundamental problem posed by the uniformity of the
microwave background itself, known as the horizon problem. Suppose that you
are at some place in the early Universe, arbitrarily close to the time of the Big
Bang. You send a photon out. Neglecting the opacity of the Universe, how far will
that photon travel? Anything further could not have been in causal contact with
you since the Big Bang, so the distance that the photon travels sets the size of the
causally-connected region.
We start from the Robertson–Walker metric (Equation 1.6) with the approximation
of a spatially-flat universe (Figure 1.11) so k = 0. We can set the origin to where
the photon starts so the light ray is radial, so dθ = dφ = 0. Also, all light rays
have ds = 0, so we have that R(t) dr = c dt, where R(t) is the scale factor of the
Universe, as in Chapter 1. The proper distance travelled by the photon will
therefore be
r = ∫₀ᵗ c dt / R(t) .  (2.12)
In the early radiation-dominated Universe, R ∝ t^{1/2} (Exercise 2.4), so this integral
converges to a finite value. In the later matter-dominated Universe (before ΩΛ
became significant), we had R ∝ t^{2/3}, for which the integral again converges. The size of this
causally-connected region is known as the particle horizon. Note that this is
different to the event horizon (Section 1.11). To calculate the event horizon, you’d
integrate from t to infinity, not from 0 to t.
We saw in Section 1.9 that Equation 1.44 implies that the size of the comoving
distance to z = ∞ in an Ωm = 1, Λ = 0 universe is 2c/H0 . The proper distance
to the particle horizon must therefore be 2c/H(z), where H(z) is the Hubble
parameter at redshift z. (If we had assumed a radiation-dominated universe, this
would come out as c/H(z), which is a factor of two smaller.) The value of H(z)
at the time of recombination comes out as 18 200 × H0 using Equation 1.33 and
the values of the density parameters in Section 1.5. The particle horizon size then
comes out as 2c/H = 2c/(H0 × 18 200) = (2c/H0) × 5.5 × 10⁻⁵, or about
0.46 Mpc for an H0 of 72 km s⁻¹ Mpc⁻¹.
The horizon problem is that 0.46 Mpc on the CMB is very small: by
numerically integrating Equation 1.44, we find that the comoving distance
to z = 1090 (the redshift of CMB) is 14 189 Mpc, so the angular diameter
distance dA = dcomoving/(1 + z) = 14 189/1091 ≈ 13 Mpc. Using Eqn 1.47,
the angular size of the particle horizon at the time of recombination is
θ = D/dA = 0.46/13 radians ≈ 2 degrees (slightly less if we take into account
the early radiation-dominated phase). We’ve just shown that objects further apart
than this distance could not have been in causal contact, so how is it that parts of
the CMB sky more distant than two degrees ever managed to look so similar?
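The arithmetic of this estimate can be reproduced in a few lines, plugging in the values quoted above (H(z) ≈ 18 200 H0 at recombination and a comoving distance of 14 189 Mpc to z = 1090) rather than integrating Equation 1.44 directly; a sketch:

```python
import math

c = 2.998e5                    # speed of light / km s^-1
H0 = 72.0                      # Hubble constant / km s^-1 Mpc^-1
hubble_distance = c / H0       # c/H0 ~ 4164 Mpc

H_ratio = 18_200               # H at recombination, in units of H0 (from the text)
horizon = 2 * hubble_distance / H_ratio   # particle horizon / Mpc

d_comoving = 14_189            # comoving distance to z = 1090 / Mpc (from the text)
d_A = d_comoving / (1 + 1090)  # angular diameter distance / Mpc

theta = horizon / d_A          # angular size in radians
print(f"horizon ~ {horizon:.2f} Mpc, theta ~ {math.degrees(theta):.1f} deg")
```

This reproduces the numbers in the text: a 0.46 Mpc horizon subtending about two degrees on the CMB sky.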
There is also a problem that arises from almost all grand unified theories (GUTs)
that seek to unify three of the four fundamental forces (electromagnetism, strong
nuclear force, weak nuclear force). As the Universe expanded and cooled, the
GUT field (whatever it was) would settle into particular configurations. This
is rather like ferromagnetism, where below a critical temperature (the ‘Curie
temperature’) the magnetic moments of molecules align with those of their
neighbours into magnetic domains. In the cosmological case these domains can
have various sorts of boundaries, including a monopole state where the local field
points radially away from a particular point. Macroscopically this would look like
a magnetic monopole. GUTs predict about one monopole per horizon size at the
time when the Universe was at the critical GUT temperature, but as this was very
early in the Universe, the horizon size was small. Therefore the present-day
Universe should have many magnetic monopoles — so many, in fact, that they
would dominate the energy density of the Universe. Why do we not see them in
the Universe? This is known as the monopole problem.
Perhaps the solution to all these problems is at the Planck epoch. We currently
have no consistent theory that unifies quantum mechanics and general relativity.
Where should we expect such a theory to be needed? Presumably the theory
would need to use !, G and c, so we can use these to derive a characteristic length,
mass and time:
m_Pl = √(ℏc/G) ≈ 10¹⁹ GeV/c²,  (2.13)

r_Pl = √(ℏG/c³) ≈ 10⁻³⁵ m,  (2.14)

t_Pl = √(ℏG/c⁵) ≈ 10⁻⁴³ s.  (2.15)
These are known as the Planck mass, Planck length and Planck time,
respectively. When a mass, length or time interval under consideration is of
the order of the Planck scales, we should expect an unknown theory of
quantum gravity to be needed. Clearly the initial singularity at t = 0 in the
Robertson–Walker metric is an example, as is the singularity at the centre of a
black hole (Chapter 6). Note that the GUT energy scale of 10¹⁵ GeV is a factor of
10⁴ from the Planck scale. While 10⁴ might be considered a large factor, the
current temperature of the CMB of 2.7 K is equivalent to about 2 × 10⁻⁴ eV, i.e. a
factor of about 10²⁸ from the GUT epoch.
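Equations 2.13–2.15 are easy to verify numerically from the SI values of ℏ, G and c; a quick sketch:

```python
import math

hbar = 1.0546e-34   # J s
G = 6.674e-11       # m^3 kg^-1 s^-2
c = 2.998e8         # m s^-1

m_pl = math.sqrt(hbar * c / G)     # Planck mass / kg
r_pl = math.sqrt(hbar * G / c**3)  # Planck length / m
t_pl = math.sqrt(hbar * G / c**5)  # Planck time / s

GeV = 1.602e-10                    # 1 GeV in joules
m_pl_GeV = m_pl * c**2 / GeV       # Planck mass as an energy in GeV
print(f"m_Pl ~ {m_pl_GeV:.2e} GeV/c^2, r_Pl ~ {r_pl:.1e} m, t_Pl ~ {t_pl:.1e} s")
```

This gives m_Pl ≈ 1.2 × 10¹⁹ GeV/c², r_Pl ≈ 1.6 × 10⁻³⁵ m and t_Pl ≈ 5.4 × 10⁻⁴⁴ s, consistent with the order-of-magnitude values above.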
Exercise 2.5 Show on dimensional grounds that the only characteristic
timescale involving ℏ, G and c is proportional to √(ℏG/c⁵). ■
It may be that in order to solve all these problems (monopole, flatness, horizon,
baryon asymmetry, density perturbations, initial expansion) we need the unknown
theory of quantum gravity at the earliest times in the Universe (t ≈ t_Pl). However,
there has been a proposal to solve these problems at the later GUT epoch in the
Universe’s history when the characteristic temperature was around the GUT scale
of approximately 10¹⁵ GeV, in which GUT-scale physics triggers a very rapid
phase of expansion known as inflation. Before describing what triggers this
phase, we’ll first look at how this solves some of these problems.
We’ve shown that the particle horizon size is r = ∫₀ᵗ c dt′/R(t′). If we want
this to diverge, we’ll need an expansion rate R(t) much faster than the t^{1/2}
Exercise 2.6 Show that the inflation condition that α > 1 is equivalent to the
scale factor accelerating, i.e. d²R/dt² > 0. ■
Inflation can also explain why the Universe is so close to being spatially flat (the
flatness problem). From thermodynamics, an adiabatic expansion of a gas with an
equation of state parameter w satisfies p ∝ V^{−(1+w)}, where p is the pressure and
V is the volume. If the rest mass density is negligible, then ρ ∝ V^{−(1+w)} too. In
this case we can write ρ ∝ R^{−3(1+w)}, where R is the scale factor. (As in the
photon gas in Section 2.2, we’re neglecting the issue of p dV work, but this
equation turns out to be true in the fully general relativistic case.) The key to
solving the flatness problem is in Equation 1.7 from Chapter 1, which we’ll
reproduce here:
(dR/dt)² = 8πGρR²/3 − kc² + Λc²R²/3 .  (Eqn 1.7)
Again, we’ll neglect the Λ term because it makes a negligible contribution
to the dynamics in the early Universe. Suppose that the Universe had
some arbitrary curvature constant k before inflation. As inflation
expanded the Universe, the ρR² term in the equation above will vary as
ρR² ∝ R^{−3(1+w)} R² = R^{−3(1+w)+2} = R^{−(3w+1)}. If w < −1/3, then the ρR²
term increases with the scale factor. This means that the ρR² term will eventually
dominate and the kc² term will be very small in comparison, and can be neglected
or taken to be zero. Thus after inflation, the Universe is left in a state that is close
to spatial flatness. Another way of thinking of this is that the process of inflation
takes one tiny local patch that appears locally flat, and expands it enormously.
Thus no matter how wrinkly the initial state of spacetime before inflation, a small
enough local region will appear locally flat, so the result after inflation is a
spacetime that’s spatially flat. Spatial flatness is in fact a key prediction of
inflation.
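The scaling argument can be made concrete with a toy calculation (our own illustration): for an inflationary equation of state (w = −1) the ρR² term grows as R², whereas for radiation (w = 1/3) it decays, so only in the former case does the curvature term become negligible.

```python
def rho_R2(R, w):
    """Scaling of the rho*R^2 term with scale factor R: R^(-(3w+1))."""
    return R ** (-(3 * w + 1))

for R in (1.0, 10.0, 1000.0):
    print(f"R = {R:6.0f}: w=-1 term = {rho_R2(R, -1.0):.0e}, "
          f"w=1/3 term = {rho_R2(R, 1/3):.0e}")
```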
Finally, inflation also changes the estimates of the age of the Universe, because
the epoch of inflation could be arbitrarily long. In principle this is one way
of solving the singularity at t = 0 in the Robertson–Walker metric. Having
said that, we can calculate the minimum time needed for inflation to solve the
2.8 The inflaton field
[Figure 2.6 Schematic representation of the value of the inflaton field φ, versus the energy associated with the field V (φ), showing a false vacuum at higher V (φ) and a true vacuum at the global minimum.]

[Figure 2.7 Illustration of how temperature-dependent effects can create a false vacuum. Early in the history of the Universe, the inflaton field is around the energy minimum at φ = 0, but as the Universe cools, a second, deeper minimum appears elsewhere. The Universe slides or (if necessary) quantum tunnels to the new minimum.]

In either case, there is an energy difference between the upper and lower vacuum
states of ΔV . If we take V = 0 as the true vacuum, then the elevated state has an
effective cosmological constant (though strictly speaking this is a misnomer as it
would not be constant in this case). The order of magnitude for ΔV expected in
GUTs is huge, giving prima facie plausibility to inflation: the characteristic
energy density can be shown to be (on, for example, purely dimensional grounds)
ρ ≈ E⁴_GUT/(ℏ³c⁵) ≈ (10¹⁵ GeV)⁴/(ℏ³c⁵) ≈ 10⁸⁰ kg m⁻³.
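This order of magnitude can be checked directly (a sketch in SI units):

```python
E_gut = 1e15 * 1.602e-10   # 10^15 GeV in joules
hbar = 1.0546e-34          # J s
c = 2.998e8                # m s^-1

rho = E_gut**4 / (hbar**3 * c**5)   # mass density / kg m^-3
print(f"rho ~ {rho:.1e} kg m^-3")   # comes out at ~2e80 kg m^-3
```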
Going back to our analogy of an object sliding inside a potential well (Figure 2.6),
the full equation of motion turns out to be (in natural units of c = ! = 1, see box):
φ̈ + 3Hφ̇ − ∇²φ + dV(φ)/dφ = 0 .  (2.18)
The derivation of this formula is lengthy, but it follows ultimately from energy
conservation considerations in the quantum scalar field. The gradient ∇ is with
respect to proper spatial coordinates (not comoving ones), and the dots are time
derivatives. Note that it involves the Hubble parameter H. In the analogy of an
object sliding down the valley, the H φ̇ term is equivalent to a friction term, while
dV /dφ is the force acting on the object.
Natural units
Particle physicists sometimes opt to use ‘natural units’ in which ! = c = 1
to keep the algebra simpler, avoiding fiddly factors of ! and c that can be
determined at the end from dimensional analysis. The thinking is to treat !
and c as implying ‘conversion factors’ between different dimensions. For
example, c could be thought of as the conversion factor between space
measurements and time measurements. What defines these conversion
factors? Well, for us it’s about how we choose to measure lengths
(e.g. metres) and times (e.g. seconds). The Universe doesn’t care whether
we use metres or seconds, or miles and years, so why not choose units in
which c is set to one? For us c has dimensions LT−1 (e.g. metres per
second), so c = 1 has the effect of treating space units in the same way as
time units. In natural units, energies have the same dimensions as mass
(because E = mc2 ) and 1/time (because E = hν).
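As an illustration of ℏ and c as conversion factors (our own sketch), one can start from an energy of 1 GeV and recover the equivalent mass, time and length in SI units, using E = mc², E ~ ℏ/t and E ~ ℏc/l:

```python
hbar = 1.0546e-34   # J s
c = 2.998e8         # m s^-1
GeV = 1.602e-10     # 1 GeV in joules

E = 1.0 * GeV            # an energy of 1 GeV
mass = E / c**2          # ~1.8e-27 kg (roughly the proton mass)
time = hbar / E          # ~6.6e-25 s
length = hbar * c / E    # ~2.0e-16 m (sub-nuclear scales)
print(f"{mass:.2e} kg, {time:.2e} s, {length:.2e} m")
```

In natural units all three of these would simply be quoted as ‘1 GeV’ (or ‘1 GeV⁻¹’ for the time and length).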
A similar consideration turns out to give the pressure and energy density, again in
natural units (c = ! = 1):
p = ½φ̇² − ⅙(∇φ)² − V(φ),  (2.19)

ρ = ½φ̇² + ½(∇φ)² + V(φ).  (2.20)
In natural units the equation of state parameter w = p/(ρc²) is written as
w = p/ρ. Equations 2.19 and 2.20 could generate a negative equation of state
parameter: for example, if V ≫ φ̇² and spatial derivatives are negligible, then
w = −1. If we define φ to have the units of energy, then Equations 2.19 and 2.20
come out in conventional units as
p = φ̇²/(2ℏc³) − (∇φ)²/(6ℏc) − V(φ),  (2.21)

ρc² = φ̇²/(2ℏc³) + (∇φ)²/(2ℏc) + V(φ),  (2.22)
so each term has the dimensions of energy density.
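The limiting behaviour can be seen with a two-line function (a sketch in natural units, dropping the spatial gradients): the equation-of-state parameter sits exactly at the inflation threshold w = −1/3 when φ̇² = V, and tends to −1 as the potential dominates.

```python
def w(phidot_sq, V):
    """w = p/rho for a homogeneous field, from Equations 2.19 and 2.20."""
    p = 0.5 * phidot_sq - V
    rho = 0.5 * phidot_sq + V
    return p / rho

print(w(1.0, 1.0))    # phidot^2 = V: w = -1/3, the inflation threshold
print(w(0.01, 1.0))   # V >> phidot^2: w ~ -0.99, approaching -1
print(w(1.0, 0.0))    # pure kinetic energy: w = +1
```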
It’s usual in inflationary calculations to assume that the spatial derivatives are
negligible, because we’re inflating a small, locally-homogeneous region to a giant
size, so any inhomogeneities will become negligible. This means that in most
contexts, φ is the value of the field throughout the observable Universe. If we
assume that the field φ is approximately the same everywhere, then ∇φ ≈ 0 and
∇²φ ≈ 0.
We also found in Section 2.7 that we need w < −1/3 for inflation to happen. In
order to achieve this, we need that the potential V in Equations 2.19 and 2.20
starts off by dominating over the kinetic energy term involving φ̇. When this is no
longer true, inflation will cease. At this point the ‘object’ in Figure 2.6 will
oscillate around the minimum, with the oscillations damped by the H φ̇ term. In
addition, it’s expected that the inflaton field will then decay into conventional
matter and radiation. This particle generation would appear as another
friction-like term in the equation of motion. At this point in the history of
the Universe, the temperature would have been very low, because the energy
densities of matter and radiation will have been reduced by factors of a³ and a⁴,
respectively, where a is the dimensionless scale factor of the Universe. The
subsequent particle generation process is known as reheating, but the exact
mechanism is not known since the underlying physics of the inflaton field is not
known. During this process, the matter–antimatter asymmetry of the Universe
may have been generated. The end result of inflation is that the Universe is left
with more or less the same energy density as when it started, but in the form of
radiation and matter, and with an imbalance of matter over antimatter.
The requirement that V starts out much bigger than the kinetic energy term can
also be shown to imply that we need φ̈ to be small, and that φ is homogeneous.
The proof of this is very involved, but we can sketch a demonstration. Suppose
that φ has some intrinsic variation over a spatial scale δx. We’d expect there also
to be intrinsic temporal variations over timescales of δt = δx/c. We could
think of this as being equivalent to a kinetic energy term φ²/(δt)². In order for
the potential to dominate, we need that V (φ) is much bigger than φ²/(δt)²,
i.e. V (φ) ≫ φ²/(δt)². If we differentiate this with respect to φ, we find that
dV /dφ ≫ 2φ/(δt)². But this will be of the order of φ̈. We should therefore expect
to be able to neglect the φ̈ term in Equation 2.18. This approximation is known as
the slow-roll approximation. The result is that the slow-roll approximation leads
us to approximate the equation of motion as

3Hφ̇ = −dV/dφ = −V′.  (2.23)

[Margin note: The expression ‘slow-roll’ is perhaps misleading because it seems to suggest that the object in Figure 2.6 acquires some angular momentum. To avoid this, we’ve used the verb ‘slide’ in preference to ‘roll’ where we can, but be aware that most of the technical literature and textbooks use ‘roll’ in this context.]

The next step is to substitute this into the Friedmann equation (Equation 1.7)
rewritten in natural units. (To do this, we replace the factor of G with one of the
Planck scales in Equations 2.13–2.15 — conventionally, mass.) We then use
V ≫ φ̇² to show that

H² = (8π/(3m²_Pl ℏc)) [φ̇²/(2ℏc³) + (∇φ)²/(2ℏc) + V(φ)] ≈ (8π/(3m²_Pl)) V(φ).  (2.24)
Putting these together, one can show that the requirement that V ≫ φ̇² can be
expressed as constraints on two new dimensionless quantities:

ε = −Ḣ/H² = (m²_Pl/(16π)) (V′/V)² ≪ 1,  (2.25)

η = φ̈/(Hφ̇) = (m²_Pl/(8π)) (V″/V) ≪ 1,  (2.26)

where V′ = dV/dφ and V″ = d²V/dφ².
These equations are a dimensionless way of expressing the constraint that the
potential V must be shallow and flat enough to allow slow-rolling. These criteria
are requirements for inflation to start, and inflation will end when, for example,
ε ≈ 1.
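To make the slow-roll conditions concrete, here is a sketch for an illustrative quadratic potential V(φ) = ½m²φ² (our own choice of potential, not one singled out by the text), in units where m_Pl = 1. For this potential V′/V = 2/φ and V″/V = 2/φ², so ε = η, and both are small only when φ is well above the Planck mass:

```python
import math

def slow_roll(phi, m_pl=1.0):
    """Slow-roll parameters (Eqs 2.25-2.26) for V = m^2 phi^2 / 2."""
    eps = (m_pl**2 / (16 * math.pi)) * (2.0 / phi)**2   # (m_Pl^2/16pi)(V'/V)^2
    eta = (m_pl**2 / (8 * math.pi)) * (2.0 / phi**2)    # (m_Pl^2/8pi)(V''/V)
    return eps, eta

print(slow_roll(15.0))   # both ~3.5e-4: comfortably slow-rolling
print(slow_roll(0.28))   # eps ~ 1: inflation is ending
```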
We don’t know the shape of the inflation potential. There are many varieties of
inflation, each of which hypothesizes a differently shaped potential. However, the
observational consequences of inflation all rely on the last stages of inflation when
the ‘object’ in Figure 2.6 is close to the minimum, so they don’t depend strongly
on the shape of the potential. In a sense this is a pity, because it restricts our
ability to constrain this new physics experimentally, but it also greatly simplifies
the predictions of inflation and makes them more robust to changes in the
underlying assumptions. We’ll describe some of the observational consequences
of inflation in the next section.
2.9 The primordial density power spectrum
Note how similar these equations are. F (k) is known as the Fourier
transform of f (x). Transforming twice gets you almost back where you
started: if you make a Fourier transform of F (k), you get f (−x) back.
Fourier transforms occur throughout physics. For example, diffraction in
optics involves Fourier transforms. The image of a star seen through a
telescope is the Fourier transform of the telescope aperture — well, almost.
The amplitudes of the waves hitting your detector are the Fourier transform
of the telescope aperture, but what you measure is the energy of the light on
your detector, which is proportional to the amplitude of the electromagnetic
wave squared, so your image will be the Fourier transform of the telescope
aperture, squared.
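This relationship is easy to demonstrate numerically; a minimal one-dimensional sketch (a slit aperture, our own toy setup) in which the ‘image’ is the squared modulus of the Fourier transform of the aperture:

```python
import numpy as np

n = 1024
aperture = np.zeros(n)
aperture[n//2 - 32 : n//2 + 32] = 1.0   # open slit; opaque elsewhere

# Far-field amplitude is the Fourier transform of the aperture;
# the detector records the squared modulus of the amplitude.
amplitude = np.fft.fftshift(np.fft.fft(aperture))
image = np.abs(amplitude) ** 2

print(image.argmax() == n // 2)   # True: central diffraction maximum
```

For a slit this produces the familiar sinc² diffraction pattern; for a circular telescope aperture the analogous two-dimensional calculation gives the Airy pattern.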
Let’s write the average density of matter as ρ and the deviation from this average
as δρ. This deviation will vary with position. It’s common to express the
clumpiness in terms of the fractional overdensity or underdensity, (δρ)/ρ. Often
this fractional overdensity is simply abbreviated as δ. By definition, the mean
value of δ is zero. Since δ will vary as a function of position, we’ll write this as
δ(x).
Imagine that we are considering a large box in the Universe with side length L.
We can write δ(r) as a Fourier series in three dimensions. For simplicity for now,
however, let’s just consider a one-dimensional universe so the ‘box’ is just a
length L. The density of matter expanded as a Fourier series will be
δ(x) = δρ/ρ = Σ_{n=−∞}^{∞} C_{k_n} e^{i k_n x},  (2.35)
where the sum is over all wave numbers k = (kx , ky , kz ) in the box, e.g.
kx = 2πn/L and similarly for y and z. The conventional symbol used to
represent the Fourier coefficients is δk :
δ(r) = Σ_k δ_k e^{ik·r},  (2.38)

δ_k(k) = (1/L³) ∫_{within L³} δ(r) e^{−ik·r} dr.  (2.39)
So far we’ve just written down Fourier series; how can we use these to
characterize the clumpiness? One approach is to measure how much variation
there is in the Fourier coefficients. Since the root-mean-square (RMS) is a measure of
the standard deviation of a random sample, we can estimate the variance by
averaging the squares of the Fourier coefficients over different realizations of the
density field for a fixed k, i.e. ⟨|δ_k|²⟩, where |δ_k|² = δ_k δ_k∗ (remember that δ_k is a
complex number, so δ_k∗ is its complex conjugate).
How might this work in practice? First, if the δ(x) distribution is isotropic on
average, the Fourier coefficients won’t on average depend on the direction of k.
Second, the amount of clumpiness could depend on how closely you look at the
density field map. For example, the density distribution could be clumpy on
medium-sized scales, but look smooth on larger scales and on smaller scales. For
this reason it’s useful to calculate the variance in the Fourier coefficients as a
function of the length of the wave number vector, k = |k|. This is known as the
power spectrum and is written as

P(k) = ⟨|δ_k|²⟩.  (2.40)
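A minimal sketch (our own toy example) of estimating P(k) = ⟨|δ_k|²⟩ for a one-dimensional density field: generate many realizations of white noise, take Fourier coefficients with the 1/L normalization of Equation 2.39, and average |δ_k|² over realizations. White noise should give a flat spectrum.

```python
import numpy as np

rng = np.random.default_rng(42)
n, n_realizations = 256, 200
power = np.zeros(n)

for _ in range(n_realizations):
    delta = rng.normal(0.0, 1.0, n)   # one realization of delta(x)
    delta -= delta.mean()             # enforce <delta> = 0
    delta_k = np.fft.fft(delta) / n   # Fourier coefficients (1/L normalization)
    power += np.abs(delta_k) ** 2     # |delta_k|^2 = delta_k times its conjugate

power /= n_realizations               # P(k) = <|delta_k|^2>
print(power[1 : n // 2].mean())       # ~1/n for unit-variance white noise
```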
In the present-day Universe, an overdensity (δ(x) > 0) will attract surrounding
matter through gravity and will tend to increase the value of δ. Similarly,
underdensities (δ < 0) will empty out of matter, causing the value of δ to become
more negative. The density perturbations δ(r) will initially evolve from
self-gravity in such a way that each Fourier mode evolves independently. This is
also referred to as the ‘linear regime’ in the evolution of the density field. This is
one reason why the power spectrum is used in cosmology, rather than other
measures of clustering.
These effects of self-gravity can be neglected during inflation, but inflation makes
very clear predictions for the initial density power spectrum. The key idea for
inflation is that the gravitational potential laid down by the inflating Universe was
invariant under time translation, i.e. the Universe should look the same on average
if you make the transformation t → t + Δt, regardless of your choice of Δt (as
long as it’s shorter than the duration of inflation). Therefore there must be a
constant level of fluctuations on (say) the scale of the horizon. In other words,
there must be a continuous time-invariant process in which quantum fluctuations
are being created within the Hubble volume then inflated out of it. Also, these
fluctuations cannot have any characteristic length scale (or the Universe would not
look the same regardless of the choice of Δt). This is a fractal universe.
To see what this means in terms of the power spectrum, we need to express the
fluctuations in a scale-invariant way, then state that the fluctuations are constant.
For example, it’s no good measuring the power spectrum on scales of k = 1 m⁻¹
to k = 2 m⁻¹, because this invokes the characteristic scale length of the metre.

[Margin note: Note that k here is the wave number not the curvature parameter!]

What we can do, however, is measure the power spectrum in an interval between
any wave number k and double that wave number, 2k. We’d then require that the
value of the power spectrum shouldn’t depend on the choice of k. In other words,
in any factor-of-two interval in k, we should measure the same power spectrum.
Another way of expressing this is to use the natural log of the wave number, ln k,
and require that the power spectrum is the same in any logarithmic interval
Δ ln k = ln 2. Of course, there’s nothing special about the choice of 2. The
general way of expressing this scale-invariance is to say that the variance in the
density field per logarithmic k interval is constant:
Δσ²/Δ ln k ≈ dσ²/d ln k = constant.  (2.41)
This is known as the scale-invariant spectrum or the Harrison–Zel’dovich
spectrum.
[Figure: the inflaton potential V (φ) versus φ, with a field fluctuation δφ corresponding to a time offset δt.]
The difference in density between these two regions at the scale of the horizon δH
will be roughly H δt, where H is the Hubble parameter during inflation. Quantum
field theory (and in fact dimensional analysis) predicts that the RMS of δφ on the
scale of the horizon is equal to H/(2π) in natural units, so the horizon-scale
fluctuations will be of the order δ_H = H²/(2πφ̇). This will depend on the shape
of the inflation potential and is one of the free parameters in fitting inflationary
models to data on the large-scale structure of the Universe.
Scale-invariance breaks down once the expansion ceases to be exponential, so we
expect a slight deviation from scale-invariance to be imprinted on the fluctuations
as inflation ends. This will depend on the shape of the inflation potential near
the end of inflation, i.e. on the parameters ε and η (defined in Equations 2.25
and 2.26). The result is that n_s ≈ 1 + 2η − 6ε for the values of ε and η near
the end of inflation. There is also a prediction of a clustered background of
gravitational waves (which we shall meet in Section 2.16), which also has a
dependence on ε.
Two other key predictions of inflation are worth mentioning. First, the
perturbations of the matter and radiation number densities should be equal; these
are known as adiabatic perturbations because adiabatic expansion conserves the
ratio of matter and radiation number densities. Second, the phases of the Fourier
decomposition should be random and uncorrelated with each other. Intuitively
this seems reasonable since the quantum fluctuations at one time should be
uncorrelated with the quantum fluctuations at a later time; the earlier quantum
fluctuations give rise to the Fourier components on larger spatial scales, while the
later quantum fluctuations are responsible for the Fourier modes on smaller spatial
scales. It can be shown that random phases imply that the fluctuations are a
Gaussian random field, which means that the joint probability distribution for
the density at any number of points must be a multivariate Gaussian distribution.
Because a Gaussian random field has no information contained in the phases
(i.e. they are all uniformly randomly distributed), all the statistical information
about the density field is contained in the amplitudes, so the power spectrum
completely characterizes the density fluctuations.
The overall amplitude of the initial perturbations depends on the shape of the
inflation potential, as does the deviation from scale-invariance. We’ll see in
Section 2.16 that the gravitational wave background does too. The CMB
fluctuations have been shown to be consistent with adiabatic perturbations, so we
won’t discuss alternative sources of perturbations here (e.g. ‘isocurvature’
perturbations); if there is a non-adiabatic contribution, it must be small
(Trotta, R., 2007, Monthly Notices of the Royal Astronomical Society, 375, L26). Many
tests have been made of the CMB clustering to search for non-Gaussian character,
though no unequivocal signal has yet been found. The expectation is that the
reheating at the end of inflation was the time of baryogenesis, which set the
subsequent entropy per baryon, but the GUT-scale physics that determined these
processes (and inflation itself) is still uncertain.
One implication of inflation is that there may be regions far off the minimum
V (φ) that inflate eternally. If φ is very large, the quantum fluctuations in φ would
make φ perform a random walk that overwhelms the drift towards the minimum
V (φ). Our observable portion of the Universe could be just an infinitesimal part of a
much, much larger complex. One of the enduring surprises of observational
cosmology is that it is possible at all — that is, we can build telescopes that are
big enough to detect light from most of the way back to the Big Bang, and
observe galaxies throughout most of the Hubble volume (Section 1.9). However,
if one of these variants of inflation is correct, the observable part of the Universe
is a very tiny part of it indeed. This boggles the mind.
Finally, it’s worth remembering that one of the motivations of inflation is to
solve the horizon problem and many others without invoking Planck scale
physics such as quantum gravity. We’re describing a general relativistic Universe,
which inevitably involves the gravitational constant G, and quantum mechanics,
which inevitably involves Planck’s constant ! = h/(2π). It’s therefore perhaps
inevitable but a little disappointing that the Planck scale should occur in various
forms in the inflation equations. The following exercise will demonstrate the
inevitability of the Planck scale in inflation.
Exercise 2.7 The number of e-foldings of inflation is roughly N = ∫ H dt.
Use the slow-roll approximation to show that

N = −(8π/m²_Pl) ∫_{φ₂}^{φ₁} (V/V′) dφ,  (2.47)
where φ2 and φ1 are values of the inflaton field at the start and end points of
inflation, respectively. We can choose φ1 = 0 without loss of generality.
Next, make the assumption that V % is roughly of the order of V /φ (which should
be true if the potential is reasonably smooth and slowly-varying) to show that
N ∼ (φ2 /mPl )2 and hence that we need φ2 significantly larger than mPl . ■
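The last step of the exercise can be checked numerically (a sketch): with V′ ~ V/φ the integrand V/V′ is ~φ, and integrating from 0 to φ₂ gives N ~ 4π(φ₂/m_Pl)², so tens of e-foldings already require φ₂ of order a few m_Pl.

```python
import math

def efolds(phi_2, m_pl=1.0):
    """N = (8*pi/m_Pl^2) * integral of phi dphi from 0 to phi_2."""
    return 4 * math.pi * (phi_2 / m_pl) ** 2

print(efolds(1.0))   # ~13 e-foldings for phi_2 = m_Pl
print(efolds(2.2))   # ~61 e-foldings for phi_2 ~ 2 m_Pl
```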
2.10 The real music of the spheres
Similarly, the same criteria can be used to show that the parameters ε and η are
both ≪ 1 (Equations 2.25 and 2.26). Inflation ends when φ ∼ m_Pl, so we have
not escaped consideration of the Planck scale.
P_l^m(x) = ((−1)^m/(2^l l!)) (1 − x²)^{m/2} d^{l+m}/dx^{l+m} (x² − 1)^l,  (2.49)

where d^{l+m}/dx^{l+m} is the (l + m)th-order derivative. When the CMB structure is
expanded (like a Fourier series) in terms of spherical harmonics, the coefficients
used are named the monopole, dipole, quadrupole, octopole, and so on. The l = 1
Legendre polynomials have just one trigonometric function of θ (e.g. sin θ), while
the l = 2 polynomials have two (e.g. sin²θ), and so on. Spherical harmonics are
also used in quantum mechanics, especially in describing electron orbits in atoms,
and in helioseismology.
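Equation 2.49 can be checked directly for small l and m with a short script (our own sketch) that performs the (l + m)-th derivative of (x² − 1)^l by manipulating polynomial coefficients:

```python
from math import comb, factorial

def plm(l, m, x):
    """Associated Legendre function P_l^m(x) via Equation 2.49."""
    # coefficients of (x^2 - 1)^l: power 2j has coefficient C(l, j)(-1)^(l-j)
    coeffs = {2 * j: comb(l, j) * (-1) ** (l - j) for j in range(l + 1)}
    for _ in range(l + m):                      # differentiate l+m times
        coeffs = {p - 1: c * p for p, c in coeffs.items() if p > 0}
    deriv = sum(c * x ** p for p, c in coeffs.items())
    return (-1) ** m / (2 ** l * factorial(l)) * (1 - x * x) ** (m / 2) * deriv

print(plm(1, 0, 0.5))   # P_1(x) = x              -> 0.5
print(plm(2, 0, 0.5))   # P_2(x) = (3x^2 - 1)/2   -> -0.125
print(plm(1, 1, 0.5))   # P_1^1(x) = -(1-x^2)^0.5 -> -0.866...
```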
We calculate (δT)/T_CMB, where T_CMB is the average CMB temperature and
δT = T − T_CMB. As with the power spectrum above, we shall write this as δ,
though in this case δ will depend on the angular position q = (θ, φ) on the sky
rather than the spatial position x.
more of the Cl spectrum, shown in Figure 2.9. We shall see why the power
spectrum has peaks in Section 2.12. Many ground-based and balloon-borne
experiments have placed constraints on the highest-l region, though the maps at
this resolution are not yet all-sky. This will change shortly with the European
Space Agency Planck mission, which launched on 14 May 2009. Planck is
also expected to make tremendous advances in measuring the clustering of the
polarized CMB, about which we shall hear more later.
[Figure: the CMB power spectrum plotted against angular scale from 90° down to 0.2°, with the y-axis running from 0 to 5000.]
Figure 2.9 The CMB power spectrum measured in the first five years of the WMAP satellite. The curve shows
the best fit to the data, in which the cosmological parameters are inferred. The grey region shows the scatter in the
data that one would expect from cosmic variance, i.e. the fact that you’re sampling only a finite region of the
Universe. The fluctuations plotted on the y-axis are l(l + 1)C_l T²_CMB/2π.
We’ve seen how the theory of inflation predicts a roughly scale-invariant spectrum
of density perturbations, and that the real horizon size at recombination was in
fact much larger than one would predict without inflation. Nevertheless, the
apparent (i.e. non-inflationary) horizon size is still a useful scale length. On sizes
much smaller than this scale, regions will have had time since inflation to affect
each other. On sizes much larger than this scale, the only causal contact could
have been during or prior to inflation. We’d therefore expect that the power
spectrum on large scales should have the roughly scale-invariant behaviour
predicted by inflation. This is known as the Sachs–Wolfe plateau and is indeed
what’s seen in observations. (The clustering amplitude on the Sachs–Wolfe
plateau also agrees with the amplitude of matter fluctuations in the local Universe
on 8 Mpc scales, known as σ8 , of which more later.) However, we’ll see in
Section 2.13 that the passage of photons through the Universe over the past 13 or
so billion years can cause some additional distortions on the largest scales.
Another effect that might leave its imprint on the CMB is the topology of the
Universe. If we travel for long enough in one direction, might we go right round
the Universe and back to where we started? It’s possible to show that in any k > 0
(spatially spherical) universe, the expansion is too fast to permit this motion.
However, there’s another way to make this happen. Imagine a sheet of paper. We
can easily draw geodesics on this surface: they are just straight lines that you
would draw with a ruler. Now curl the paper into a tube. The lines previously
drawn are still geodesics. However, if you travel in one direction for long enough,
you get back where you started, despite the fact that geodesically the surface is
spatially flat. This is rather like the 1970s arcade game Asteroids in which you
can disappear off the edge of the screen in one direction and reappear at the
opposite edge. Curvature that changes the geodesics is called intrinsic curvature,
while curvature that doesn’t is called extrinsic curvature. Einstein’s theory
of general relativity makes predictions only for intrinsic curvature; we have
no theory making any prediction for extrinsic curvature. Going back to our
tube made of a piece of paper, we can’t link the two ends of the tube in three
dimensions without bending the tube and so generating intrinsic curvature, but if
we had four spatial dimensions, we could link the two ends and still have zero
intrinsic curvature. The paper would then be arranged into a torus shape, i.e. it has
assumed a different topology to a single sheet. What is the topology of our
Universe? A complex topology would leave characteristic imprints on the CMB if
the wrap-around scales were small enough. No such features have been found,
implying that any wrap-around topology in our Universe has to be at least around
the size of the Hubble volume.
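The Asteroids analogy can be made concrete. On a flat torus the geodesics are locally straight (zero intrinsic curvature), yet the shortest path between two points may wrap around the fundamental domain; a minimal sketch in Python, with an arbitrary box size:

```python
def torus_separation(x1, x2, L=1.0):
    """Shortest 1D separation on a circle of circumference L
    (the 'wrap-around' or minimum-image distance)."""
    d = abs(x1 - x2) % L
    return min(d, L - d)

# Two points near opposite edges of the fundamental domain:
# going 'off the edge' is shorter than crossing the middle.
d_wrap = torus_separation(0.9, 0.1)  # 0.2, not 0.8
```

The same minimum-image logic, applied in three dimensions, is how one would search for matched circles of temperature in the CMB in a wrap-around universe.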
We’ll see below how some of the fluctuations that are seen are due to acoustic
oscillations in the early Universe. Some audio representations of the acoustic
oscillations after the Big Bang can be found in the further reading section. The
cosmologist Peter Coles estimated the amplitude of these acoustic oscillations in
decibels (setting aside the obvious objections that there were no people to hear
them and the conditions were too hot and dense for terrestrial life anyway) and
found that the Big Bang was no louder than a rock band.
We'll choose coordinates so that our motion is along the x-axis, and we'll consider a light ray in the xy-plane. There is
no z-axis component of the light ray’s motion, so the z-axis component of the
wave vector is zero, which is also true for all observers. We’ll also assume that
there is a CMB rest frame in which it appears uniform.
First, imagine a stationary observer on the Earth. He or she receives a CMB
photon in the xy-plane. Now we imagine making a Lorentz transformation to the
CMB rest frame, which we’ve chosen to be a velocity boost along the x-axis.
We’ll give the CMB rest frame primed coordinates. Applying the Lorentz
transformation (Appendix B, Section B.4), we find that an observer moving
relative to the Earth along the x-axis with velocity v will see a wave vector
$$\left(\frac{\omega'}{c},\, k'_x,\, k'_y,\, k'_z\right) = \left(\gamma\,\frac{\omega}{c} - \gamma\,\frac{v}{c}\,k_x,\;\; \gamma k_x - \gamma\,\frac{v}{c}\,\frac{\omega}{c},\;\; k_y,\;\; 0\right). \tag{2.56}$$
Focusing on the time-like (zeroth) component, we find that the observer in the
CMB rest frame will see the light at a different frequency:
$$\frac{\omega'}{c} = \gamma\,\frac{\omega}{c} - \gamma\,\frac{v}{c}\,k_x. \tag{2.57}$$
We can relate kx to ω using the null length of the wave vector (Equation 2.55) and
kz = 0:
$$\left(\frac{\omega}{c}\right)^2 = k_x^2 + k_y^2. \tag{2.58}$$
This is Pythagoras’s theorem, with the hypotenuse of the triangle as ω/c.
The angle that the light ray makes with the x-axis, θ, can be found from
trigonometry: cos θ = adjacent divided by hypotenuse, or kx /(ω/c). Therefore
kx = (ω/c) cos θ. Plugging this into Equation 2.57 and rearranging, we find that
$$\omega' = \gamma\omega\left(1 - \frac{v}{c}\cos\theta\right), \tag{2.59}$$
i.e. there is a θ-dependent blueshifting or redshifting. We’ve already found that a
redshifted or blueshifted black body spectrum is still a black body spectrum,
though with a different temperature. Therefore we can write
$$T' = \gamma T\left(1 - \frac{v}{c}\cos\theta\right), \tag{2.60}$$
where T′ is the temperature in the CMB rest frame, while T is the temperature as seen from Earth. But we've assumed that the CMB has a uniform temperature in the CMB rest frame, i.e. T′ = constant, so we must see a fractional temperature variation
$$\begin{aligned}
\frac{T}{T'} &= \frac{1}{\gamma\left(1 - \frac{v}{c}\cos\theta\right)} \\
&= \left[1 - \left(\frac{v}{c}\right)^2\right]^{1/2}\left(1 - \frac{v}{c}\cos\theta\right)^{-1} \\
&= \left[1 - \frac{1}{2}\left(\frac{v}{c}\right)^2 + \cdots\right]\left[1 + \frac{v}{c}\cos\theta + \frac{v^2}{c^2}\cos^2\theta + \cdots\right] \\
&= 1 + \frac{v}{c}\cos\theta + \frac{v^2}{c^2}\left(\cos^2\theta - \frac{1}{2}\right) + \cdots.
\end{aligned}$$
Comparing this to Section 2.10, we see that our motion relative to the CMB
induces a dipole as well as having smaller effects on higher-order multipoles.
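As a numerical check of Equation 2.60, here is a short Python sketch; the velocity v ≈ 369 km s⁻¹ is the approximate measured speed of our motion relative to the CMB frame (treat the exact value as an assumption here):

```python
import math

C = 2.998e8      # speed of light / m s^-1
T_CMB = 2.725    # mean CMB temperature / K
V = 3.69e5       # assumed velocity relative to the CMB frame / m s^-1

def observed_temperature(theta):
    """T(theta) from Equation 2.60: T = T' / [gamma (1 - (v/c) cos theta)]."""
    gamma = 1.0 / math.sqrt(1.0 - (V / C) ** 2)
    return T_CMB / (gamma * (1.0 - (V / C) * math.cos(theta)))

# Dipole amplitude: half the difference between the hottest (theta = 0)
# and coldest (theta = pi) directions; approximately T * v/c.
dipole_K = 0.5 * (observed_temperature(0.0) - observed_temperature(math.pi))
```

The result, about 3.4 mK, is far larger than the intrinsic anisotropies, so the dipole has to be modelled and removed before maps like Figure 2.9 can be made.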
Exercise 2.8 Imagine that you are in the early Universe shortly after
recombination, watching the surface of last scattering recede from you. Would the
CMB at this time have the same acoustic peaks that we see today? ■
The first acoustic peak is determined mainly by the sound horizon size. This in
turn is mainly dependent on the Hubble parameter at that time, H, and therefore
on H0 . The angular size of this structure is found by calculating the angular
2.12 The acoustic peaks in the CMB
the numbers of baryons also increases the rate of collisions that the photons
experience. The shape of this damping tail is an important consistency check for
the cosmological parameters derived from the acoustic peaks.
One final surprise is that this frozen cosmological sound wave structure has never
gone away — we’ll see in Chapter 3 that it’s still visible in the galaxy distribution!
alone, they took small segments of the WMAP sky map around each of their
galaxy clusters and averaged these small images together. They did the same on
the voids. The resulting averaged images are shown in Figure 2.11. The resulting
detection is significant at over 4 standard deviations.
Figure 2.10 The CMB, with the positions of known foreground galaxy clusters
(red) and voids (blue) marked.
voids clusters
20
8◦
10
4◦
0◦ 0
µK
−4◦
−10
−8◦
−20
−8◦ −4◦ 0◦ 4◦ 8◦ −8◦ −4◦ 0◦ 4◦ 8◦
Figure 2.11 Average CMB images at the positions of voids and clusters.
We know from the acoustic peaks in the CMB that we live in a spatially flat
universe to a good approximation (Section 2.12); the detection of a late-time ISW
effect can be reconciled with spatial flatness only if there is a cosmological
constant (or, more generally, dark energy — see later). We’ll see below that it is
very difficult to measure ΩΛ from the CMB alone, so cross-correlating foreground
populations with the CMB to search for the ISW is a very valuable additional use
of the CMB maps.
This technique (averaging images of separate objects in order to detect the
average signal from the population) is an example of a stacking analysis. This is
used very widely in observational cosmology.
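To see why stacking works, consider averaging N noisy measurements of the same underlying signal: the noise averages down roughly as 1/√N while the common signal survives. A toy Python sketch (the signal, noise level and sample size are all illustrative, not WMAP values):

```python
import random

random.seed(42)

SIGNAL = 1.0   # hypothetical common signal in each cutout (arbitrary units)
NOISE = 4.0    # per-cutout noise standard deviation (arbitrary units)
N = 256        # number of stacked cutouts

# Each 'cutout' here is a single noisy pixel value; a real stacking
# analysis averages whole images in exactly the same way, pixel by pixel.
cutouts = [SIGNAL + random.gauss(0.0, NOISE) for _ in range(N)]
stacked = sum(cutouts) / N   # expected error ~ NOISE / sqrt(N) = 0.25
```

An individual cutout here has a signal-to-noise ratio of only 0.25, far too low to detect anything, yet the stack recovers the signal at about 4 standard deviations, comparable to the detection quoted above.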
2.14 Reionization
After the Big Bang, what generated the first light in the Universe? Were stars the
first luminous objects in the Universe that illuminated the darkness, or accreting
black holes? In Chapter 8 we’ll discuss what observational constraints we have on
the first light in the Universe, and also another way of finding Ωb . However, the
CMB gives its own unique constraint on the first light in the Universe.
The effect is similar to Silk damping (Section 2.12). Once the first luminous objects have reionized the Universe, the free electrons liberated by ionizing the atoms can once again scatter CMB photons through Thomson scattering. As with Silk damping, the effect is to suppress the acoustic peaks in a characteristic manner. This time, however, the Universe is transparent and the mean free path of the photons is much larger, of the order of the horizon size. (This also implies that new acoustic oscillations won't form.) The suppression therefore acts on both large and small scales. The overall effect of reionization resembles a change in the overall normalization of the fluctuations, except on the largest scales.

'Reionization' is perhaps a misleading term. We speak of 'recombination' at z ≈ 1000 to describe the formation of the first neutral atoms. These are the first atoms, so we can hardly speak of their 'recombining', since they are combining for the first time! Nevertheless, the term 'recombination' is the one used. Similarly, 'reionization' is open to criticism, since these atoms are being ionized for the first time. They are re-making an ionized plasma that existed at z > 1000, but this is the first example of the ionization process, so the term 'reionization' at z > 6 could be considered as inappropriate as 'recombination' at z ≈ 1000. Nevertheless, these are the terms in use. One can only apologize.

Another effect of reionizing the Universe is to change the polarized components of the CMB, because scattered light is polarized. We'll discuss the polarization of the CMB in Section 2.16. The current CMB constraints on the epoch of reionization from the five-year WMAP results are shown in Figure 2.12. While the redshift of reionization is not very well determined, the optical depth τ to Thomson scattering is better measured: the probability of a photon undergoing Thomson scattering is defined as $(1 - e^{-\tau})$, where the five-year WMAP CMB data set the constraint τ = 0.087 ± 0.017.
Figure 2.12 The constraints on reionization from the WMAP CMB maps. The left-hand panel is the likelihood
constraints from the WMAP 3-year and 5-year data sets (note the improved constraint from the extra two years of
data), assuming an instantaneous reionization at a redshift zr . But the CMB data don’t in themselves require the
reionization to be instantaneous. The right-hand panel shows the constraints if we assume a two-step model: the
reionization was instantaneously set to a level of xe at redshift zr , then instantaneously completely ionized at
redshift 7. The dark shaded region is the 1σ contour, i.e. there is an approximately 68% chance that the underlying
value is in that region, while the light shaded region is the 2σ contour (≈95%).
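To put the measured optical depth in perspective, the implied scattering probability is small; a one-line check in Python:

```python
import math

tau = 0.087  # five-year WMAP Thomson optical depth to reionization

# Probability that a given CMB photon was re-scattered after reionization:
p_scatter = 1.0 - math.exp(-tau)   # roughly 8%
```

So only around one photon in twelve is affected, which is why reionization perturbs rather than erases the acoustic peak structure.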
Figure 2.13 The dependence of the CMB power spectrum (ΔT/µK) on the cosmological parameters: the panels show the effect of varying Ωtot,0, ΩΛ,0, Ωb,0h² and Ωm,0h².
The positions of the acoustic peaks are only very weakly dependent on ΩΛ,0 , for a
fixed Ωk,0 . If Ωb,0 is fixed and Ωk,0 = 0, then we can constrain Ωm,0 (panel (d) of
Figure 2.13), but there is not enough information in the CMB to constrain Ωk,0
and Ωm,0 and Ωb,0 and ΩΛ,0 all simultaneously.
The fact that constraints on one parameter can correlate with constraints on another is known as a parameter degeneracy. These degeneracies are intrinsic to
CMB experiments, but the degeneracies can be broken by including a comparison
with other, non-CMB experiments. For example, Figure 2.14 shows the
constraints on Ωm,0 and ΩΛ,0 from the CMB and from high-redshift supernovae
(which we shall meet in Chapter 3). The supernovae and CMB constraints
both have degeneracies, but they are in opposite directions, so combining the
constraints makes it possible to measure Ωm,0 and ΩΛ,0 separately. Another
example that we’ve already met is the late-time ISW (Section 2.13), from which
one can infer ΩΛ,0 if Ωk,0 is known.
[Figure 2.14 plots ΩΛ,0 (vertical axis) with a 'flat' line marked; model universes are colour-coded by H0 from 30 to 100 km s⁻¹ Mpc⁻¹.]
Figure 2.14 The constraints on the cosmological density parameters from the WMAP CMB measurements and
from high-redshift supernovae. The contours show the 1σ (inner ring), 2σ (middle ring) and 3σ (outer ring)
allowed range for the supernova data of Kowalski et al (2008). The dots show some Monte Carlo (i.e. random)
realizations of the WMAP data, in which model universes are selected in proportion to their likelihood of fitting the
WMAP data of Dunkley et al. The value of the Hubble parameter H0 is colour-coded for these model universes.
The combination of the WMAP data and the supernova data leaves only a very small region of this plane mutually
allowed, thus breaking the parameter degeneracy in the WMAP data alone.
Together, the parameter constraints from the CMB and other sources have
converged remarkably on the parameters. Overall, the level of agreement between
the CMB and other constraints has led to the resulting cosmological model being
called the concordance cosmology.
A completely different (and controversial) type of constraint is also worth
mentioning. The Universe contains intelligent life. Can we use this fact on its own
to constrain the cosmological parameters? This line of argument is known as the
anthropic principle. This might explain why, for example, we find ourselves in a
Universe after recombination and at a time when Ωm > Ωr so gravitational
collapse is possible and stars can form. The Universe also cannot be so old that stars no longer form, as might be the case if the age were measured in hundreds of billions of years, or when Ωm ≈ 0. Using anthropic arguments to constrain our
position in time and space within a given cosmological model is known as the
weak anthropic principle. This is quite a departure from the Copernican
principle! It is also an example of a ‘selection effect’, about which we shall say
more in Chapter 4. Anthropic arguments attracted some interest prior to precision
cosmology, but the experimental constraints on the concordance cosmological
model are now much stronger than the anthropic constraints.
A more radical variant is to suppose that an ensemble of universes exists (a
so-called ‘multiverse’), each universe having different fundamental physical
constants. Anthropic arguments can then be used to constrain which parts of this
ensemble intelligent observers could inhabit. The underlying assertion that our
own Universe must be suitable for the formation of intelligent life (from which
one might constrain the fundamental physical constants) is sometimes known as
the strong anthropic principle. While this could explain various apparent
fine-tunings in physical constants, the disadvantage of these arguments is that they
give no physical mechanism for explaining parameter values. There is also
predictably some disagreement over how to best calculate likelihood distributions
for fundamental physical constants on anthropic grounds. Also, is this part of
testable science? It may be or may become so if a testable theoretical framework
could be found for explaining this ensemble. In any case, is ‘science’ exclusively
concerned with things that are testable in practice, now? We won’t rehearse the
debates in this book but you will no doubt sense that this is an area that can
generate a great deal of controversy.
wave! It’s not essential to our story, but if you want details on this analogy, see
the box below.) There’s a physical reason for doing this: it turns out that the
B-modes are sometimes also primordial B-mode is due entirely to primordial gravitational waves!
called tensor modes, though we
don’t usually use that expression
The Helmholtz–Hodge theorem
in this book. The term is related
to how they can be expressed as The electromagnetic analogy is as follows. There’s a general mathematical
perturbations of the metric. theorem, known as the Helmholtz–Hodge theorem, that any vector field v
can be expressed in two parts: v = B + ∇φ, where φ is a scalar field and B
has no divergence, i.e. ∇ · B = 0. This is like electromagnetism, where the
electrostatic field is E = −∇φ and φ is the scalar potential in electromagnetism;
also, the lack of magnetic monopoles in electromagnetism implies that
∇ · B = 0, where B is the magnetic field. It’s also generally true that a curl
of a gradient is zero, i.e. ∇ × ∇φ = 0 for any scalar field φ, implying
∇ × E = 0. So any vector field can be broken into a ‘magnetic’
(i.e. divergence-free) component and an ‘electric’ (i.e. curl-free) component.
Now, in the case of our CMB linear polarizations we are dealing not with a
vector field but with something subtly different, because if we rotated the
polarization by 180◦ we’d get the same polarization, which isn’t true of a
vector. The mathematical expressions for the ‘electric’ and ‘magnetic’
components of CMB polarization are therefore slightly different. We won’t
go into these differences here, but there is more information in the further
reading section.
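The identity used above, that the curl of a gradient vanishes, is easy to verify numerically. A sketch using central differences on a hypothetical scalar field φ(x, y) = x²y:

```python
def phi(x, y):
    """A hypothetical scalar field, chosen purely for illustration."""
    return x * x * y

def grad(f, x, y, h=1e-5):
    """Central-difference gradient (fx, fy) of a 2D scalar field."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

def curl_z_of_grad(f, x, y, h=1e-4):
    """z-component of curl(grad f): d(fy)/dx - d(fx)/dy; should vanish."""
    dfy_dx = (grad(f, x + h, y)[1] - grad(f, x - h, y)[1]) / (2 * h)
    dfx_dy = (grad(f, x, y + h)[0] - grad(f, x, y - h)[0]) / (2 * h)
    return dfy_dx - dfx_dy

residual = curl_z_of_grad(phi, 1.3, -0.7)  # ~0, up to rounding error
```

This is the sense in which the 'electric' part of the decomposition is curl-free: any purely gradient field has zero curl everywhere.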
How do you measure the clustering of the polarized CMB? Using a mathematical
formalism similar to the unpolarized CMB structure, it’s possible to quantify the
structure in the polarized background as a function of angular size. Instead of
measuring the difference in unpolarized temperatures between two locations
(known as the T T power spectrum, with T standing for temperature), we could
measure the differences in (say) the temperatures of the E-component of the
polarization between two locations. This is sometimes referred to as the EE
power spectrum. Similarly, we could measure the clustering of the B-mode,
which would be called the BB power spectrum. We could also compare, say, the
E-mode polarization temperature in one place with the unpolarized temperature
in another. This cross-correlation would go by the name TE. In all there are six possible combinations: TT, EE, BB, TE, TB, EB. Of these, it can be shown from parity arguments that TB and EB should be zero, so there are four astrophysically useful power spectra. The predicted levels of these clustering
strengths are shown in Figure 2.15. Note that the EE oscillations are out of phase
with the T T oscillations, for reasons related to how the light is scattered at the
time of recombination (see the further reading section for more details).
The detection of B-mode polarized clustering would be tremendously exciting,
because the primordial gravitational wave background constrains the shape of the inflaton potential (Section 2.8) and would be the first genuine consistency test of inflation. If we describe the scalar clustering power spectrum as $C_l^S \propto l^{\,n_s-3}$ (where $n_s$ is as defined in Equations 2.44 and 2.45) and the tensor clustering as $C_l^T \propto l^{\,n_T-3}$, then inflation's predictions for the powers are $n_s \approx 1 + 2\eta - 6\epsilon$ and $n_T \approx 1 - 2\epsilon$ respectively, where ε and η are defined in Section 2.8 and are related
[Figure 2.15 plots ΔT/µK against multipole moment l (from 10 to 1000) for the TE, EE and BB spectra; the BB curve is split into gravitational lensing and gravitational wave contributions, and a reionization feature appears at low l.]
Figure 2.15 The predicted unpolarized CMB power spectrum (upper line),
compared to the T E power spectrum, the EE power spectrum and the BB power
spectrum. Negative values are dashed, and the predicted 1σ (i.e. 68% confidence)
uncertainties from the European Space Agency Planck satellite are shown as bars.
Note that the polarized signals are much weaker than the unpolarized signal.
to the shape of the inflaton potential. These powers are sometimes referred to as spectral indices of the density perturbations. There is also a prediction for the amplitudes of the scalar and tensor clustering to be related by $C_l^T/C_l^S = 12.4\epsilon$.
The consistency test of inflation is whether the tensor spectral index agrees
with the relative strengths of the scalar and tensor modes. Unfortunately, the
closer the scalar power spectrum is to scale-invariance, the harder it is to test
inflation; the current experimental constraint on the scalar spectral index is
0.0081 < 1 − ns < 0.0647 (WMAP five-year constraint).
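To make these relations concrete, here is a sketch with purely illustrative slow-roll parameters (the values of ε and η are invented for the example; the formulae are the ones quoted above):

```python
EPSILON = 0.01   # illustrative slow-roll parameter, not a measured value
ETA = 0.01       # illustrative slow-roll parameter, not a measured value

n_s = 1 + 2 * ETA - 6 * EPSILON   # scalar spectral index: 0.96
n_t = 1 - 2 * EPSILON             # tensor spectral index (book's convention)
ratio = 12.4 * EPSILON            # predicted C_l^T / C_l^S amplitude ratio

# Check against the WMAP five-year window 0.0081 < 1 - n_s < 0.0647:
consistent = 0.0081 < (1 - n_s) < 0.0647
```

For these particular (invented) values the scalar tilt sits comfortably inside the WMAP window, and the predicted tensor amplitude is about 12% of the scalar amplitude; the consistency test asks whether a measured n_T would match that ratio.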
This is a very challenging measurement for the current generation of detectors, as can be seen from Figure 2.15. Currently, the only constraints on the
polarized signals are from the T E cross-correlation power spectrum from the
WMAP satellite, shown in Figure 2.16. The detection of primordial gravitational
waves is the next great challenge for CMB experiments and could show us the
path to new physics. One of the problems is the existence of astronomical
foregrounds that could contribute to the BB power spectrum. Gravitational
lensing of the CMB by the intervening large-scale structure of the Universe
deflects the CMB photons and so changes the Cl clustering signal (see Figure 2.15). It
can produce a BB correlation, particularly at smaller scales of l > 500 or so. We
discuss gravitational lensing in more depth in Chapter 7. It’s by no means certain
that there is a gravitational wave background to detect: a rival theory to inflation
known as the ‘ekpyrotic Universe’ predicts no detectable CMB gravitational wave
background. This model is based on an extension of superstring theory known as
M-theory. These theories predict more dimensions than our usual three space and
one time. Our Universe is imagined to be a ‘sheet’ in a higher-dimensional
spacetime, and the trigger for the Big Bang in this model was the collision
between two such sheets.
[Figure 2.16 comprises two panels, TE (top) and TB (bottom), plotting $T_{\rm CMB}^2\, l(l+1)C_l/2\pi$ in µK² against multipole moment l from 10 to 1000.]
Figure 2.16 The T E polarized CMB clustering detected by the first seven
years of WMAP, and WMAP’s non-detection (as expected) of the T B clustering
signal. The boxes show WMAP’s five-year constraint, showing the improvement
from adding a couple of years’ more data.
We shall meet gravitational wave observatories briefly in Chapter 6. The direct
detection of primordial gravitational waves in the new generation of gravitational
wave observatories is also very challenging, partly because the energy density in
gravitational radiation redshifts in the same way as the photon energy density.
Another source of CMB photon scattering is the electrons that are liberated at the
epoch of reionization. As with the surface of last scattering at recombination, the
scattered CMB light will be partly polarized. The polarized structure of the CMB
is therefore simultaneously sensitive to physical processes at both recombination
and reionization. This shows up as a bump in the E-mode polarization on large
angular scales, around l of a few.
2.17 Dark energy and the fate of the Universe
Exercise 2.10 List at least four major differences between dark matter and
dark energy. ■
Figure 2.17 shows the observational constraints on the dark energy equation of
state. Note that there is a strong parameter degeneracy between the present-day
dark energy density ΩΛ and the equation of state. However, combining this with
other observational constraints narrows the field considerably, such as the Hubble
Space Telescope determination of the Hubble parameter, baryonic acoustic
oscillations and high-redshift supernovae (which we shall meet in Chapter 3).
If we admit the possibility of a time-varying equation of state, the constraints worsen considerably. If we write $w = w_0 + w'z/(1+z)$ for the dark energy equation of state, the corresponding constraints are shown in Figure 2.18.
However, there are good reasons for believing that the constraints will improve.
For example, the late-time ISW is sensitive to a late phase of accelerated
expansion in the Universe, and changes in the dark energy equation of state
produce changes in the ISW signal.
[Figure 2.17 comprises four panels plotting ΩΛ,0 and Ωk,0 against w, with contours for WMAP alone and for WMAP combined with HST, BAO and supernova (SN) data.]
Figure 2.17 The constraints on the dark energy equation of state w from the WMAP CMB maps and other
surveys, compared to the constraints on other parameters. The dark shaded regions are the 1σ contours, i.e. an
approximately 68% likelihood that the underlying value is in that region, while the lighter shaded regions are 2σ (≈95% likelihood). The bottom-right panel uses BAOs (Chapter 3) from SDSS luminous red galaxies, while the
bottom-left panel uses a wider compilation.
Are there any theoretical reasons for expecting w to depend on time? We could
follow a similar line of reasoning to inflation and imagine that spacetime is filled
with (another!) scalar field, which we'll call φΛ. (We'll use the Λ subscript to distinguish this field from the inflaton field.) If it has a potential VΛ(φΛ), then, following Section 2.8, the field will satisfy $\ddot\phi_\Lambda + 3H\dot\phi_\Lambda = V'_\Lambda(\phi_\Lambda)$, where dots are time derivatives and the prime is a derivative with respect to φΛ. (Compare Equation 2.18; again we're assuming that the field is the same everywhere in
space.)

Figure 2.18 The constraints on a time-varying dark energy equation of state, $w = w_0 + w'z/(1+z)$, in the w0–w′ plane. The dark shaded regions are the 1σ contours, i.e. there is an approximately 68% likelihood inferred from the CMB data that the underlying value is in that region, while the lighter shaded regions are 2σ (≈95% likelihood).

Pursuing the analogy of a ball rolling or sliding down a hill, we could
regard VΛ as a potential energy, and ½φ̇Λ² as a kinetic energy that we'll call KΛ. As with inflation, it turns out the energy density comes out as ρΛ ∝ KΛ + VΛ, while the pressure is pΛ ∝ KΛ − VΛ (compare Equations 2.19 and 2.20), so the equation of state is w = (KΛ − VΛ)/(KΛ + VΛ). This would naturally vary with time. If the field is slowly rolling, then KΛ ≪ VΛ, so w ≈ −1, i.e. it would look like a cosmological constant. Before that point it would have w in the range −1/3 to −1
(Section 2.7). These models are sometimes called quintessence since they
postulate a fifth fundamental field in addition to the four fundamental forces of
nature. The mass of the particle associated with this field turns out to be very
small indeed in particle physics terms, namely 10−33 eV, which leads to a whole
new set of fine-tuning problems.
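The behaviour of w = (KΛ − VΛ)/(KΛ + VΛ) is easy to verify; a sketch with arbitrary illustrative energies:

```python
def equation_of_state(kinetic, potential):
    """w = (K - V)/(K + V) for a homogeneous scalar field."""
    return (kinetic - potential) / (kinetic + potential)

# Slowly rolling (K << V): looks like a cosmological constant.
w_slow = equation_of_state(0.01, 1.0)   # close to -1
# Kinetic-dominated (K >> V): w approaches +1 instead.
w_fast = equation_of_state(1.0, 0.01)   # close to +1
```

As the field evolves and the balance of KΛ and VΛ shifts, w slides between these limits, which is exactly why quintessence models generically predict a time-varying equation of state.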
One curious regime that is not prohibited by experimental constraints is w < −1,
sometimes called ‘phantom energy’. These models radically change the projected
fate of the Universe. To see why, we’ll use the variation of energy density for an
adiabatic expansion: $\rho \propto (R/R_0)^{-3(1+w)}$, which we also met in Section 2.7. We won't prove this thermodynamic formula here, but note that for a photon gas (w = 1/3) this gives $\rho \propto (R/R_0)^{-4}$, while for pressureless matter (w = 0) this
gives $\rho \propto (R/R_0)^{-3}$. Adapting Equation 2.61, we find that
$$\frac{H^2}{H_0^2} = \sum_i \Omega_{i,0}\left(\frac{R}{R_0}\right)^{-3(1+w_i)}, \tag{2.64}$$
where the wi are the equations of state of each of the contents of the Universe.
Once we get to an epoch when dark energy dominates, we'll have that
$$H^2 \propto \left(\frac{R}{R_0}\right)^{-3(1+w_\Lambda)},$$
where we’re now writing wΛ for the equation of state of the only remaining
component, dark energy. If wΛ = −1 (i.e. a cosmological constant), then the
right-hand side becomes a constant, so H tends to a constant value, which is what
we found in Section 1.11. However, if wΛ < −1, then H is perpetually increasing.
This means that the radius of the cosmological event horizon will shrink. When it
becomes smaller than the size of any gravitationally-bound object, that object will
become unbound. Clusters of galaxies will be unbounded, then galaxies, then
stars and planets. Eventually even atoms will become unbound. This model
universe has been called the ‘big rip’. The event horizon size reaches zero in a
finite time, at which point the scale factor of the Universe becomes infinite. This
singularity represents an effective end of the Universe. For wΛ = −3/2, the
Universe ends in about 21 Gyr. Galaxy clusters would be ripped apart about
1 Gyr before the end, galaxies about 60 Myr before the end, solar systems about
3 months before the end, planets about half an hour before the end, and atoms
about 10−19 seconds before the end. The Universe would also spend most of its
history in a dark-energy-dominated phase, avoiding the need for anthropic
arguments. There are, however, ongoing debates as to whether any viable
phantom energy model could be generated from particle physics considerations.
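The 21 Gyr figure can be reproduced using a commonly quoted approximation for the time to the big rip, t_rip − t₀ ≈ (2/3)|1 + wΛ|⁻¹ H₀⁻¹ (1 − Ωm,0)⁻¹ᐟ², with Ωm,0 = 1 − ΩΛ,0 taken from the values used in Exercise 2.11 (the approximation itself is not derived in this chapter):

```python
import math

H0 = 72.0            # Hubble parameter / km s^-1 Mpc^-1
OMEGA_M = 0.258      # assumed matter density, 1 - 0.742 as in Exercise 2.11
W_LAMBDA = -1.5      # phantom equation of state from the text

MPC_KM = 3.0857e19   # kilometres per megaparsec
GYR_S = 3.156e16     # seconds per gigayear

hubble_time_gyr = (MPC_KM / H0) / GYR_S   # 1/H0 in Gyr, about 13.6

# Approximate time from now until the big rip singularity:
t_rip_gyr = (2.0 / 3.0) / abs(1.0 + W_LAMBDA) * hubble_time_gyr \
            / math.sqrt(1.0 - OMEGA_M)
```

This lands at about 21 Gyr, matching the figure quoted above; making wΛ less negative pushes the rip further into the future, and wΛ → −1 sends it to infinity.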
Finally, it’s worth linking the discussion of dark energy with our earlier discussion
of inflation. When we calculated the effective pressure and density of the inflaton field in Section 2.8, we found that $V \gg \dot\phi^2$ led to an effective equation of state w = −1 (Equations 2.19 and 2.20). As we've seen in this section, this is
equivalent to a cosmological constant, so the slow-roll approximation leads to a
Universe very much like a cosmological-constant-dominated de Sitter universe.
This inflation was driven by the difference between the initial value of V (φ)
and the minimum. We implicitly assumed that when the Universe reaches the
minimum in V (φ), inflation stops, implying V = 0. However, it’s by no means
clear that V = 0 is the natural minimum value. The framework of inflation only
deals with potential differences, but in general relativity the absolute value of the
energy (in the form of the energy–momentum tensor) determines the curvature
and dynamics of spacetime.
One might, for example, expect the zero-point energy to be set by Planck scale physics. Since Λ has dimensions of one over length squared, we might expect $\Lambda \sim r_{\rm Pl}^{-2} \approx 10^{70}\,{\rm m}^{-2}$. However, as the following exercise shows, this is wildly out of kilter with the observations.
Exercise 2.11 Using the experimental values ΩΛ,0 = 0.742
and H0 = 72 km s−1 Mpc−1 , show that the observed value of Λ is
1.3 × 10−52 m−2 . ■
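Exercise 2.11 can be checked numerically, using Λ = 3ΩΛ,0 H₀²/c² (the standard relation between Λ and its density parameter):

```python
OMEGA_LAMBDA = 0.742         # observed dark energy density parameter
H0 = 72.0e3 / 3.0857e22      # 72 km/s/Mpc converted to s^-1
C = 2.998e8                  # speed of light / m s^-1

# Lambda = 3 * Omega_Lambda * H0^2 / c^2, in m^-2:
LAMBDA_OBS = 3.0 * OMEGA_LAMBDA * H0 ** 2 / C ** 2   # ~1.3e-52 m^-2

# Compare with the naive Planck-scale guess Lambda ~ 1e70 m^-2:
orders_of_magnitude = 70 - (-52)   # the famous ~122-order discrepancy
```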
This dimensional analysis gets the answer wrong by 122 orders of magnitude! We
might make a slightly more physically-motivated estimate by imagining that
the quantum vacuum is made up of quantum mechanical simple harmonic
oscillators. It’s a standard result in quantum mechanics that the wave function of a
particle of mass m has a zero-point energy of E0 = hω/(4π), where ω is the
angular frequency of the oscillator. We could consider a box in the Universe
with a side L and , add up all the zero-point energies of these oscillators:
E = (1/4π) × j hωj , where ω 2 = k 2 c2 + m2 c4 /!2 and k = 2π/λ, where λ is
the de Broglie wavelength. We can let the dimension of the box L tend to infinity.
The periodic boundary conditions of the box imply that the only wavelengths
allowed are λx = L/nx in the x-direction (where nx is an integer). Therefore in
the interval kx to kx + dkx, there should be (L/2π) dkx separate values of kx. The
same applies in y and z, in which case the energy per unit volume becomes
$$\frac{E}{L^3} = \frac{h}{4\pi}\int \frac{\omega(k)}{(2\pi)^3}\,{\rm d}^3k.$$
Unfortunately, this integral diverges. Perhaps this is OK, since we should expect our low-energy quantum mechanical calculations to break down at some length scale, at which $1/\lambda = k \gg mc/\hbar$. If we set this minimum length scale to
the Planck length, the vacuum energy density comes out again as 120 orders of
magnitude greater than the observed Λ. As the cosmologist John Peacock said:
‘We are left with the strong impression that some vital physical principle is
missing. This should make us wary of thinking that inflation, which exploits
changes in the level of vacuum energy, can be the final answer.’
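The size of the mismatch can be reproduced in a few lines. For a massless field the integral above evaluates to an energy density ħc k⁴max/(16π²); here we take the cutoff kmax ≈ 1/lPl (a choice, not a derivation) and compare with the observed dark energy density ΩΛ,0 ρcrit c²:

```python
import math

HBAR = 1.0546e-34         # J s
C = 2.998e8               # m s^-1
G = 6.674e-11             # m^3 kg^-1 s^-2
L_PLANCK = 1.616e-35      # m
H0 = 72.0e3 / 3.0857e22   # s^-1
OMEGA_LAMBDA = 0.742

# Vacuum energy density for a massless field, cut off at k_max ~ 1/l_Pl:
k_max = 1.0 / L_PLANCK
rho_vacuum = HBAR * C * k_max ** 4 / (16 * math.pi ** 2)   # J m^-3

# Observed dark energy density (Omega_Lambda times critical density):
rho_critical = 3 * H0 ** 2 / (8 * math.pi * G)             # kg m^-3
rho_observed = OMEGA_LAMBDA * rho_critical * C ** 2        # J m^-3

discrepancy = math.log10(rho_vacuum / rho_observed)        # ~120 orders
```

The exact number of orders of magnitude depends on where the cutoff is placed, but no plausible choice brings the estimate anywhere near the observed value.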
So this is the awkward position that we find ourselves in. We’ve tried to solve the
horizon problem and other problems avoiding Planck scale physics with the
GUT-scale inflation model, but we found that inflation doesn’t quite escape all
considerations of the Planck scale. We have no way of predicting even the order
of magnitude of the cosmological constant from fundamental principles. We’ve
filled space with at least three fundamental scalar fields (Higgs, inflaton, dark
energy), while being unable to reconcile the fundamental conceptual bases of
general relativity and quantum field theory at the Planck scale. Perhaps there is a
tremendous conceptual breakthrough coming soon, but it’s overdue, as the
problems have been around for some decades. Perhaps there’s some vital piece
of experimental evidence that will provide the trigger, or perhaps an insight
will provide a thrilling breakthrough. There are some contenders, but with
experimental constraints and/or quantitative testable predictions hard to come by,
they're still a little too speculative to cover in depth in a book such as this. As a
professional scientist one has a choice: do you take a punt on chasing these
fundamental problems, or perhaps do you try something fundamental but more
tractable? We have the astonishing capability to build telescopes that can detect
galaxies throughout most of the Hubble volume, i.e. almost the entire observable
Universe. In the next chapters we shall cover some of what’s been discovered
about the evolution of the Universe since the time of the CMB.
Summary of Chapter 2
1. The cosmic microwave background (CMB) is the light from the surface of
last scattering from when the Universe was last opaque. At the time of last
scattering the motion of the photon gas became decoupled from the baryonic
matter.
2. A redshifted black body spectrum is also a black body spectrum.
3. The CMB is expected to be a black body because the photon–baryon collision rate scales as a⁻⁶ (where a is the dimensionless scale factor of the Universe); as a tends to zero this rate increases faster than the inverse dynamical timescale, so thermalization was increasingly easy at early epochs.
The CMB spectrum is observed to be an excellent black body.
4. Baryon number conservation predicts perfect matter–antimatter symmetry
(unless an asymmetry is incorporated into the initial conditions of the
Universe), so baryon number non-conservation is expected in grand unified
theories.
5. The η parameter is the ratio of the number density of baryons to that of
photons. Since CMB photons dominate the entropy density of the Universe,
18. On the smallest scales, the Silk damping effect damps down the acoustic
peaks. This effect is due to photon diffusion during the process of
decoupling, between the times when photons were tightly coupled to matter
and when the Universe became transparent.
19. The integrated Sachs–Wolfe effect applies during the passage of photons through the Universe since the time of last scattering. The change in gravitational
potentials as photons pass through the Universe leaves imprints that are
detectable through stacking analyses of galaxy clusters in CMB maps. The
strength of the late-time Integrated Sachs–Wolfe effect is sensitive to the
evolution of dark energy.
20. The reionization of the Universe by the first luminous objects in the
Universe (stars or accreting black holes) also increased the optical depth to
Thomson scattering experienced by CMB photons. This optical depth is a
free parameter in fitting the acoustic peaks of the CMB.
21. Primordial gravitational waves (also known as tensor modes) generate
B-mode polarization. The spectral index and intensity of the gravitational
wave background is determined by the same inflation parameters that predict
the departure from scale invariance in the scalar perturbations, so detecting
these primordial gravitational waves (either in B-mode CMB maps or in
gravitational wave observatories) would be a direct test of the inflationary
framework.
22. Dark energy models generalize the cosmological constant, whose effective
equation of state parameter is w = −1, and consider the possibility of w
varying with time. This could occur, for example, if Λ is associated with a
scalar field, analogous (but not identical) to the inflaton.
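The effect of w in item 22 can be sketched with the flat-universe Friedmann equation, in which dark energy with constant w contributes a density scaling as (1 + z)^(3(1+w)); the parameter values below are illustrative:

```python
def hubble_ratio(z, omega_m=0.27, omega_de=0.73, w=-1.0):
    """H(z)/H0 for a flat universe with a constant equation of state w.

    Dark energy with w = -1 behaves as a cosmological constant;
    w > -1 makes its density grow with redshift.
    """
    matter = omega_m * (1 + z) ** 3
    dark_energy = omega_de * (1 + z) ** (3 * (1 + w))
    return (matter + dark_energy) ** 0.5

print(hubble_ratio(0.0))          # ~1 today by construction
print(hubble_ratio(1.0, w=-1.0))  # cosmological-constant case
print(hubble_ratio(1.0, w=-0.8))  # w > -1: faster expansion at z = 1
```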
Further reading
• Audio representations of CMB acoustic peaks can currently be found on Mark Whittle’s web pages ([Link]) and John Cramer’s web pages ([Link]).
• For more details on energy and momentum conservation in general relativity,
see Lambourne, R., 2010, Relativity, Gravitation and Cosmology, Cambridge
University Press.
• It’s also possible for advanced undergraduate-level students to achieve a deeper
knowledge of general relativity, though this is usually outside the scope of most
undergraduate degrees. For readers who would like to try this immensely
rewarding intellectual adventure, we would recommend an accessible text on
general relativity such as Hobson, M.P., Efstathiou, G.P. and Lasenby, A.N.,
General Relativity: An Introduction for Physicists.
• The discovery papers of the CMB are: Penzias, A.A. and Wilson, R.W., 1965,
‘A measurement of excess antenna temperature at 4080 Mc/s’, Astrophysical
Journal, 142, 419; Dicke, R.H., Peebles, P.J.E., Roll, P.G. and Wilkinson, D.T.,
1965, ‘Cosmic black-body radiation’, Astrophysical Journal, 142, 414.
• Steigman, G., 2007, ‘Primordial nucleosynthesis in the precision cosmology era’, Annual Review of Nuclear and Particle Science, 57, 1 (available at arXiv:0712.1100).
• For more on the statistical physics that connects the microscopic world of
molecules and collisions with the macroscopic world of densities and
pressures, see, for example, Mandl, F., 1988, Statistical Physics, Wiley.
• For more about neutrino astronomy, see Spiering, C., 2008, ‘High energy
neutrino astronomy: status and perspectives’, Proceedings of the 4th
International Meeting on High Energy Gamma-Ray Astronomy, AIP
Conference Proceedings Vol. 1085, pp. 18–29 (also available at
arXiv:0811.4747), or Learned, J.G. and Mannheim, K., 2000, ‘High-energy
neutrino astrophysics’, Annual Review of Nuclear and Particle Science, 50,
679.
• For more on Fourier series and transforms, see, for example, Gillett, P., 1984,
Calculus and Analytic Geometry, Houghton Mifflin Harcourt.
• For more about the physics of the CMB, try Hu, W. and Dodelson, S., 2002,
‘Cosmic microwave background anisotropies’, Annual Review of Astronomy
and Astrophysics, 40, 171, or Hu, W. and White, M., 1997, ‘A CMB polarization primer’, New Astronomy, 2, 323.
• A more substantial introduction to many themes in this chapter can be found in
the graduate-level text Peacock, J.A., 1999, Cosmological Physics, Cambridge
University Press.
Chapter 3 The local Universe
‘When I use a word,’ Humpty Dumpty said, in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.’ ‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’ ‘The question is,’ said Humpty Dumpty, ‘which is to be master — that’s all.’
Lewis Carroll, Through the Looking-Glass
Introduction
You may have been surprised at the number of fundamental unknowns in the
basic processes in the early Universe. In this chapter we’ll cover our cosmic
neighbourhood and find out some of what is, and what isn’t, understood even
here. We’ll cover some of the phenomenology and terminology (a necessary evil)
and some more tools of precision cosmology, and also start assembling the
evidence for how the birth of galaxies like our own Milky Way happened.
Figure 3.1 (a) An optical image of the approximately edge-on spiral galaxy NGC 1560 and (b) the galaxy’s
rotation curve. The data points show the observed velocities. The dashed curve shows the predicted contribution to
the velocities from the inferred mass of stars, while the dotted line shows the contribution from the inferred mass of
the gas. These are not enough on their own to explain the observed velocities. The dash–dotted line is the dark
matter halo required to make up the remaining mass.
(where the sum is over the neutrino species) is too low to account for all the dark
matter.
Two leading dark matter particle possibilities are the neutralino and the axion. The neutralino, predicted by supersymmetry (an extension of the standard model of particle physics), is a quantum mixture of the superpartners of the photon, Z boson and neutral Higgs bosons. The neutralino is
an example of a weakly interacting massive particle (WIMP). If neutralinos exist
in cosmologically-significant numbers, they may be found by direct detection
experiments such as the one at Boulby Mine in the UK, through detecting the
recoils of nuclei colliding with WIMPs. The interaction probability between a
WIMP dark matter particle and normal matter is predicted to be extremely low,
but with a sufficient flux of particles through the Earth (as the Solar System
traverses the Galaxy) very rare collisions may be detected. As yet there are
no uncontested claims of WIMP detection from these experiments. Axions,
meanwhile, are particles proposed in quantum chromodynamics with a very low
mass (10⁻³–10⁻⁴ eV) that couple only very weakly to electromagnetism. They
are predicted to decay into two photons in the presence of a strong magnetic field.
Some recent laboratory claims of axion direct detection exploiting this decay (or
its time reverse) were unfortunately later withdrawn or proved unrepeatable in
other experiments.
In 2008 there was a flurry of excitement over the cosmic ray positrons detected by
the PAMELA satellite (Payload for Antimatter Matter Exploration and
Light-nuclei Astrophysics), as well as the ATIC (Advanced Thin Ionization
Calorimeter) balloon-borne measurements of electrons and positrons (which
ATIC couldn’t distinguish). Both experiments inferred a peak in the spectrum of
cosmic ray electrons at about 500 GeV. Could this be the signature of dark matter
3.2 The Hubble tuning fork
3.3 Spiral galaxies and the Tully–Fisher relation
Figure 3.3 A selection of astronomical filters. The bottom row shows the UBVRI system in use at the Kitt Peak
National Observatory. The next row up shows the Sloan Digital Sky Survey filters. The second from top shows the
ZYJHK infrared filters in use at the United Kingdom Infrared Telescope, while the top row shows the filter set of
the COMBO-17 survey (Chapter 4). The y-axis gives the relative transmission from the top of the atmosphere to
the detector in the telescope, in arbitrary units.
the ‘low surface brightness’ galaxies. These also appear to follow the same
Tully–Fisher relationship, but with a bigger scatter. The correlation is therefore
somewhat surprising, and the low scatter about the best fit for high surface
brightness galaxies is doubly surprising, so reproducing the observed Tully–Fisher
relation has been a key constraint of models of spiral galaxy discs. Empirically,
the star formation rate in galactic discs is proportional to the gas density ρ to the
power n, where n ≈ 1–2, but star formation is a very complex process and the
theoretical underpinning of this relation (known as the Schmidt law) is still
sketchy.
Having said that, we’ll see below that the Tully–Fisher relation is also empirically
useful because it can be used to determine the luminosity distance to a galaxy,
from which the Hubble parameter can be found.
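A minimal sketch of that procedure, with a hypothetical Tully–Fisher calibration (the fiducial luminosity L_STAR, the fiducial velocity V_STAR and the fourth-power slope are assumptions for illustration, not fitted values):

```python
import math

# Hypothetical Tully-Fisher calibration: L = L_STAR * (v / V_STAR)**4
L_STAR = 2.0e10 * 3.828e26   # W, ~2e10 L_sun (illustrative)
V_STAR = 220.0               # km/s (illustrative)

def tully_fisher_luminosity(v_rot):
    """Luminosity inferred from a measured rotation speed (km/s)."""
    return L_STAR * (v_rot / V_STAR) ** 4

def luminosity_distance(v_rot, flux):
    """Distance from the inverse-square law: F = L / (4 pi d^2)."""
    L = tully_fisher_luminosity(v_rot)
    return math.sqrt(L / (4 * math.pi * flux))  # metres

# Round trip: place a galaxy at 20 Mpc, compute its flux, recover the distance.
MPC = 3.0857e22  # m
d_true = 20 * MPC
flux = tully_fisher_luminosity(180.0) / (4 * math.pi * d_true**2)
d_est = luminosity_distance(180.0, flux)
print(d_est / MPC)  # ~20
```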
The profiles of spiral discs have a surface brightness I that varies with radius
typically as I ∝ exp(−r/rscale ), where rscale is in this case known as the scale
length. Similarly, the disc of our Galaxy (and indeed all spiral galaxies) has a
density of stars that varies exponentially with the vertical height above (or below)
the disc: I ∝ exp(−h/hscale ), where hscale is known as the scale height of the
disc. This scale height appears to vary with the type of stars: the youngest stars
have hscale ∼ 100 pc, while the oldest stars have hscale ∼ 1.5 kpc. This latter
population is also known as the thick disc. It’s not known what generated
the thick disc in our Galaxy, though these stars are observed to have a lower
metallicity so may represent a relic from an earlier phase of the formation of our
Galaxy, or perhaps they are a fossil of a disruption that our Galaxy underwent
early in its history from another passing galaxy. Further out, stars are found
throughout the halo of the Galaxy. These halo stars are typically much less
enriched in heavy elements, suggesting that they are from a very early phase in
the formation of our Galaxy.
The discs of spiral galaxies are far more abundant in gas than elliptical galaxies,
which we’ll meet in the next section. Hydrogen (mainly in the form of
molecular H2 , but also neutral H I and ionized H II) dominates the gas mass of
galaxies, but very often the easy-to-measure CO emission at radio wavelengths is
used as a substitute measure for the gas content. CO molecules have a J = 1 → 0
transition at 115 GHz, and other transitions at integer multiples of this frequency.
The CO-to-H2 abundance is sometimes taken as a fixed quantity in extragalactic
astronomy, but the CO-to-H2 conversion factor (known as XCO ) is only accurate
to ±50% at 1σ (see the further reading section for more information). We’ll
see in later chapters that many galaxies are strongly luminous at far-infrared
wavelengths, caused by thermal radiation from dust.
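The CO ladder mentioned above is easy to tabulate, since the J → J−1 rest frequency is approximately J times the 115 GHz fundamental, and an observed line from redshift z appears at ν/(1 + z). A sketch (taking 115.27 GHz for the J = 1→0 rest frequency):

```python
CO_10_GHZ = 115.27  # rest frequency of CO J = 1 -> 0, in GHz

def co_observed_freq(j_upper, z):
    """Observed frequency (GHz) of the CO J -> J-1 line from redshift z.

    The rotational ladder is (approximately) harmonic: the J -> J-1
    rest frequency is j_upper times the J = 1 -> 0 fundamental.
    """
    rest = j_upper * CO_10_GHZ
    return rest / (1 + z)

# The low-J ladder at z = 0 ...
print([round(co_observed_freq(j, 0), 1) for j in (1, 2, 3)])
# ... and CO(3-2) redshifted to z = 2.5 lands in the ~100 GHz radio window
print(round(co_observed_freq(3, 2.5), 1))
```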
galaxies in Equation 3.2. If we assume that the galaxies are gravitationally bound,
then a dimensional analysis (or the virial theorem — see below and the further
reading section) implies σv² ∝ M/r0. Now suppose that there is a mass-to-light ratio that is a weak function of mass, so M/L ∝ M^a. Together with L ∝ I0r0², this can be shown to imply that

\[
L^{1+a} \propto \sigma_v^{4-4a}\, I_0^{a-1}, \qquad (3.3)
\]

which fits the observations provided that a ≈ 1/4.
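The algebra leading to Equation 3.3 can be checked numerically: construct mock galaxies obeying σv² ∝ M/r0, L ∝ I0r0² and M/L ∝ M^a at fixed I0, and confirm that log L^(1+a) against log σv^(4−4a) has unit slope. A sketch in arbitrary units:

```python
import math

a = 0.25     # weak mass dependence of M/L
I0 = 1.0     # hold surface brightness fixed (arbitrary units)

def mock_galaxy(M):
    """Return (L, sigma_v) for a mass M obeying the three scalings."""
    L = M ** (1 - a)               # from M/L ∝ M^a (constants set to 1)
    r0 = math.sqrt(L / I0)         # from L ∝ I0 r0^2
    sigma_v = math.sqrt(M / r0)    # from the virial scaling sigma_v^2 ∝ M/r0
    return L, sigma_v

# Slope of log L^(1+a) against log sigma_v^(4-4a) should be 1
(L1, s1), (L2, s2) = mock_galaxy(1.0e10), mock_galaxy(1.0e12)
slope = ((1 + a) * math.log(L2 / L1)) / ((4 - 4 * a) * math.log(s2 / s1))
print(slope)  # ~1
```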
The fundamental plane is also sometimes known as the Dn –σ relation. The
parameter Dn is the diameter in which a galaxy’s surface brightness is larger than
a given constant. This diameter will depend approximately on the fundamental
plane parameters as power laws: Dn ∝ r0α I0β for some constants α and β. If the
reference surface brightness is chosen carefully, one can find a definition for Dn
in which it is also proportional to some power of σv , and so approximates the
fundamental plane. In practice the Dn –σ relation is used as a distance indicator:
knowledge of σv and a measurement of the angular size gives an angular diameter
distance to the galaxy.
Mgas/Mtotal ∝ r0^(1/2) LX^(1/2). So, if we know the distance to a cluster, we can calculate
its size and X-ray luminosity, and find the baryonic mass fraction. This is further
evidence for dark matter, i.e. the gas is not massive enough on its own to entrain
itself in hydrostatic equilibrium. It’s often assumed that the baryonic mass ratio
shouldn’t depend on redshift, since the mass is accreted into the cluster purely
gravitationally, regardless of whether it’s baryonic or not. This leads to another
constraint on the cosmological parameters: if the angular size is θ0 and the X-ray
flux is FX , we have that θ0 = r0 /dA (where dA is the angular diameter distance),
while LX ∝ FX d2L (where dL is the luminosity distance). If Mgas /Mtotal is
constant, then r0^(1/2) LX^(1/2) must be constant, which equals θ0^(1/2) FX^(1/2) (1 + z)^(1/2) r^(3/2),
where r is the proper motion distance (using Equations 1.50) or the comoving
distance if we’re assuming a spatially-flat universe. The expression for r depends
on the cosmology (Equation 1.44), so the assumed constancy of Mgas /Mtotal can
be used to derive cosmological parameters.
Galaxy clusters were first discovered in optical photographic imaging surveys. The Abell cluster catalogue (Abell, G.O., 1958, Astrophysical Journal Supplement, 3, 211) was compiled by an exhaustive visual inspection of photographic plates. Clusters were classified according to their ‘richness’, i.e. their density of galaxies, with Abell richness class 0 being the sparsest and 5 the densest. Abell found galaxy clusters out to around z = 0.2, though with some incompleteness in the catalogue at the higher redshifts and lower richness classes.
X-ray sky surveys and modern digital optical sky surveys have both been used to
find galaxy clusters to higher redshifts, and in the future many more are expected
to be found in the next generation of CMB maps using the Sunyaev–Zel’dovich
effect (Section 3.6).
The number counts of clusters are also a cosmological constraint. The formation
of virialized clumps of dark matter can be calculated using a knowledge of the
matter power spectrum and the cosmological model. If the Universe underwent an
accelerated expansion in recent history (e.g. z < 0.5), this will have a strong
effect on the gravitational collapse of clumps. The mass function of clusters is
therefore a strong function of the dark energy parameter w(z).
At the centre of a galaxy cluster is usually a large galaxy, usually the brightest in
the cluster (known as the brightest cluster galaxy or BCG). Often this galaxy is a
giant elliptical; otherwise it is a very large S0 galaxy known as a D galaxy (the D
stands for ‘diffuse’, since they have unusually extended envelopes) or cD galaxy
(a larger variant of D galaxies). These galaxies sometimes have multiple cores,
suggesting that they are built up by ongoing galaxy mergers.
case), the length ΔY should be the same as the angular size, so we can derive an
angular diameter distance to the cluster. The Hubble parameter can be derived
from cluster Sunyaev–Zel’dovich measurements, and though the constraints are
not currently as tight as other determinations (61 ± 3 (random) ± 18 (systematic)
km s−1 Mpc−1 , where the systematics are from, for example, uncertainties in the
radial temperature profiles in the cluster), it is important to have independent
checks.
The Compton scattering conserves the number of CMB photons, but redistributes
their energies. The CMB photons in the Rayleigh–Jeans regime are suppressed,
but on the other side of the black body peak the number of photons is enhanced.
This characteristic wavelength-dependent suppression and enhancement, known
as the Sunyaev–Zel’dovich effect (or S–Z effect), will be searched for in the
next generation of CMB maps to make comprehensive all-sky galaxy cluster
surveys. Figure 3.4 shows the predicted absorption and emission from the
Sunyaev–Zel’dovich effect.
Figure 3.4 The Sunyaev–Zel’dovich effect on the intensity Iν of the CMB. The left panel shows an exaggerated demonstration of the effect, while the right panel shows the change in the CMB intensity from the thermal and kinetic S–Z effects for an electron temperature of 10 keV, a Compton y-parameter of 10⁻⁴ and a 500 km s⁻¹ peculiar velocity.
An additional S–Z effect can occur if the electrons have additional kinetic
energies from bulk motion of the cluster relative to the cosmic rest frame, though
this effect (termed ‘kinetic S–Z’ to distinguish it from the ‘thermal S–Z’ discussed
in this section) is much smaller, as shown in Figure 3.4. A more lengthy
calculation of the S–Z distortion (see the further reading section) gives
\[
\frac{\Delta T}{T_{\rm CMB}} = f(x)\, y = f(x) \int \frac{kT}{m_{\rm e}c^2}\, n_{\rm e}\sigma_{\rm T}\, \mathrm{d}Y, \qquad (3.4)
\]
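The frequency dependence in Equation 3.4 is carried by f(x), with x = hν/kT_CMB; in the non-relativistic limit f(x) = x coth(x/2) − 4, which is negative in the Rayleigh–Jeans regime and crosses zero near 217 GHz, reproducing the suppression and enhancement described earlier. A sketch:

```python
import math

H = 6.62607015e-34   # J s
K = 1.380649e-23     # J/K
T_CMB = 2.725        # K

def f_sz(nu_ghz):
    """Thermal S-Z spectral function f(x) = x coth(x/2) - 4 (non-relativistic)."""
    x = H * nu_ghz * 1e9 / (K * T_CMB)
    return x / math.tanh(x / 2) - 4

# Decrement at low frequency, null near 217 GHz, increment above it
print(round(f_sz(30), 2))    # close to the Rayleigh-Jeans limit f -> -2
print(round(f_sz(217), 3))   # ~0: the thermal S-Z null
print(f_sz(350) > 0)         # enhancement on the Wien side
```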
Cross sections
Cross sections express the probability in quantum mechanics that objects
interact, such as a photon being absorbed by a hydrogen atom. If atoms were
classical spheres, the probability that the photon is absorbed would be 1 if
the photon passed into the sphere, and 0 if it did not. The cross section σ in
this case would be σ = πr2 , where r is the radius of the sphere. This is the
size of the target that the photon must hit if it is to be absorbed. However,
the atom does not have a sharp boundary. Instead, we can define the total
cross section of N atoms to be R/S, where S is the number of incident
photons per unit area per unit time, and R is the absorption rate, i.e. the
number of absorptions per unit time. The cross section for one atom would
then be σ = R/(N S).
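The definition σ = R/(NS) can be tested with a toy Monte Carlo in which the ‘atoms’ really are classical discs, so the recovered cross section should come out close to πr². All numbers here are illustrative:

```python
import math
import random

random.seed(42)

r = 0.01           # disc radius; the true cross section is pi r^2
n_atoms = 100      # number of "atoms" (discs) in the unit square
n_photons = 20000  # incident photons per unit time; flux S = n_photons / area

centres = [(random.random(), random.random()) for _ in range(n_atoms)]

def absorbed(x, y):
    """A photon is absorbed if it lands inside any disc."""
    return any((x - cx) ** 2 + (y - cy) ** 2 < r * r for cx, cy in centres)

hits = sum(absorbed(random.random(), random.random()) for _ in range(n_photons))

# sigma = R / (N * S): absorption rate over (number of atoms * flux)
sigma_est = hits / (n_atoms * n_photons)   # area = 1, so S = n_photons
print(sigma_est, math.pi * r * r)          # agree to within Poisson noise
```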
3.7 The morphology–density relation
Figure 3.5 The morphology–density relation. The fractions of elliptical, S0 and spiral + irregular galaxies are
plotted as a function of the logarithm of the projected galaxy density.
However, this doesn’t appear to be the whole story as to how elliptical galaxies
themselves form. The tightness of the fundamental plane relations and their old,
red colours (consistent with highly evolved stellar populations) suggest instead a
formation early in the history of the Universe (e.g. redshifts z > 2) in a large
starburst (a strong burst of star formation), followed by passive stellar evolution
(i.e. no subsequent star formation, so the galaxy colours change purely through
the passage of their stars through the Hertzsprung–Russell diagram). This became
known as the monolithic collapse model, and reproduced the fundamental plane and the colour–magnitude diagrams of local elliptical galaxies. (The classic reference for this model is Eggen, O.J., Lynden-Bell, D. and Sandage, A.R., 1962, Astrophysical Journal, 136, 748.) This model also explained why the metal-poor stars in the Galactic halo have highly elliptical orbits: they were formed when the initial gas cloud was in the process of collapse. Metal-rich stars in the disc, meanwhile, are formed later in this model. The
alternative is the hierarchical structure formation model in which, as we’ll see,
galaxies are formed piecewise by the accretion and influence of neighbours. In
particular, numerical simulations predict that the merger of spiral discs will
result in an elliptical galaxy morphology. The relative dominance of these two
mechanisms is currently the subject of some debate, and we shall return to this
nature versus nurture issue in later chapters. The fundamental difficulty is
that while the evolving distribution of dark matter can be (reasonably) easily
calculated, because it only interacts through gravity, the evolving distribution of
baryons such as stars and gas is much more complicated.
The following exercises will derive the Jeans mass: the mass above which an
object is unstable to gravitational collapse.
Exercise 3.2 Imagine a uniform sphere of radius R and density ρ. Write down
the gravitational potential energy of a shell inside this sphere of radius r and
thickness dr, then integrate these shells to show that the gravitational binding
energy of the sphere is
\[
E_{\rm GR} = -\frac{3}{5}\frac{GM^2}{R}, \qquad (3.6)
\]
where M is the total mass of the sphere.
Exercise 3.3 According to the virial theorem, an object is stable against
gravitational collapse when 2EK + EGR = 0, where EK is the kinetic energy.
Assuming that the sphere is made of an ideal gas, show that the condition for
gravitational collapse can be stated as
\[
M > \left(\frac{5kT}{Gm_{\rm p}}\right)^{3/2} \left(\frac{3}{4\pi\rho}\right)^{1/2}. \qquad (3.7)
\]
This mass threshold is known as the Jeans mass. Put in the numbers from
previous chapters to show that a baryonic overdensity of about a million solar
masses was unstable to collapse at the epoch of recombination. ■
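A sketch of the final numerical step of Exercise 3.3, assuming T ≈ 3000 K at recombination and a present-day baryon density of about 4 × 10⁻²⁸ kg m⁻³ scaled back to z ≈ 1100 (both representative assumptions rather than values quoted here):

```python
import math

k_B = 1.380649e-23    # J/K
G = 6.67430e-11       # m^3 kg^-1 s^-2
m_p = 1.67262e-27     # kg
M_SUN = 1.989e30      # kg

# Assumed conditions at recombination (z ~ 1100):
T = 3000.0                                  # K
rho_b0 = 4.2e-28                            # present baryon density, kg/m^3 (assumed)
rho = rho_b0 * (1 + 1100) ** 3              # baryon density at recombination

# Jeans mass, Equation 3.7: M_J = (5kT / G m_p)^(3/2) * (3 / 4 pi rho)^(1/2)
M_J = (5 * k_B * T / (G * m_p)) ** 1.5 * (3 / (4 * math.pi * rho)) ** 0.5
print(f"M_J ~ {M_J / M_SUN:.1e} solar masses")  # of order 10^6
```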
This mass is tantalizingly similar to the masses of present-day globular
clusters, which are some of the oldest bound objects known in the Universe.
However, it’s unlikely that globular clusters formed as early as this, because of
observed relationships between globular clusters and the galaxies that they
inhabit. Nevertheless, you’ll see in Exercise 6.6 of Chapter 6 that seeds of around
this mass at redshifts higher than all known quasars are needed to explain the
existence of supermassive black holes in quasars. The main difficulty, however, in
making predictions is that baryons can do much more than just collapse under
gravity, as we’ll see in later chapters.
3.9 The cooling flow problem
relationships for the two types are different; only type I Cepheid variables are
used as primary distance indicators.
• Planetary nebula luminosity function: this is the number of planetary nebulae
per unit volume per unit luminosity. The luminosity is usually measured in the
[O III] 500.7 nm emission line of oxygen. It has a characteristic shape with a
sharp cut-off above a fixed luminosity, which can be used as a standard candle.
• Globular cluster luminosity function: this is found to be approximately
Gaussian. The mean of this Gaussian can be used as a secondary distance
indicator (a standard candle).
• The light echo from the supernova SN 1987A is a standard rod.
• The rotation velocity of the water maser (a naturally-occurring microwave
laser) in the galaxy NGC 4258 can be compared to its proper motion, finding a
distance of 7.2 ± 0.5 Mpc.
• Surface brightness fluctuations: if there are on average N stars per square
arcminute on the sky in a galaxy, this number will be subject to Poisson statistics and will fluctuate with standard deviation √N. The fluctuations in
surface brightness can therefore give an estimate of the number of stars per unit
area on the sky in that galaxy, from which an angular diameter distance to that
galaxy can be derived.
• Type Ia supernovae are standard candles that will be discussed in more detail
below.
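The surface brightness fluctuation entry in the list above can be illustrated with a toy simulation: the mean surface brightness is independent of distance, but the relative pixel-to-pixel fluctuation scales as 1/√N, and hence as 1/d for a fixed angular pixel. A sketch using a Gaussian approximation to the Poisson counts (all numbers illustrative):

```python
import math
import random

random.seed(1)

def relative_fluctuation(mean_stars_per_pixel, n_pixels=20000):
    """Simulate star counts per pixel and return std/mean of the counts."""
    counts = [random.gauss(mean_stars_per_pixel, math.sqrt(mean_stars_per_pixel))
              for _ in range(n_pixels)]
    mean = sum(counts) / n_pixels
    var = sum((c - mean) ** 2 for c in counts) / n_pixels
    return math.sqrt(var) / mean

# Doubling the distance quadruples the stars per fixed angular pixel,
# halving the relative fluctuation: the ratio should be ~2.
near = relative_fluctuation(400)    # N per pixel at distance d
far = relative_fluctuation(1600)    # 4N per pixel at distance 2d
print(near / far)                   # ~2
```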
Figure 3.6 summarizes how larger-scale distance indicators depend on earlier
stages. The techniques have been labelled as standard rods, candles etc., but note
that for the nearest objects the various types of distances are indistinguishable.
From a comparison with the redshifts, the Hubble parameter can be found through
Equation 1.44 and its variants. In the low-redshift limit, the various distances
(angular diameter, luminosity, comoving) are indistinguishable and the expression
for H0 reduces to H0 = cz/d, where d is the distance. One of the primary goals
of the Hubble Space Telescope (HST) was the determination of the Hubble
parameter to 10% accuracy. This was achieved by measuring the periods and
magnitudes of Cepheid variables in 18 spiral galaxies at distances < 20 Mpc. The
high angular resolution of the HST was needed to identify the stars in these
crowded fields. This provided the fundamental calibration of the Tully–Fisher
relation, the fundamental plane, type Ia supernovae and surface brightness
fluctuations; the final result was H0 = 72 ± 8 km s−1 Mpc−1 .
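The low-redshift estimator H0 = cz/d can be sketched directly; the (cz, distance) pairs below are hypothetical numbers invented for illustration, not Key Project data:

```python
# Hypothetical (cz in km/s, Cepheid distance in Mpc) pairs
galaxies = [
    (1100.0, 15.0),
    (1300.0, 18.5),
    (950.0, 13.0),
    (1450.0, 20.0),
]

# H0 = cz/d for each galaxy, then a simple average
h0_values = [cz / d for cz, d in galaxies]
h0 = sum(h0_values) / len(h0_values)
print(f"H0 ~ {h0:.0f} km/s/Mpc")
```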
Perhaps the most striking recent progress with the distance scale has been with
type Ia supernovae. These supernovae are caused when a white dwarf is accreting
matter from a companion star, which eventually sends the star above the threshold
for ignition of carbon. The star explodes and briefly has a luminosity of about ten
billion times that of our Sun. Type Ia supernovae occur only about once every two
hundred years in our Galaxy (the last one in our Galaxy was seen by Tycho Brahe
in 1572), but may occur at the rate of one per second in the observable Universe.
Since white dwarfs have a small range of masses, type Ia supernovae have a very
restricted range of luminosities, making them ideal for consideration as standard
candles. Empirically they have a peak luminosity that appears to be related to the
3.10 The cosmological distance ladder
[Figure 3.6 diagram: distance-ladder rungs. ~100 Mpc: SN Ia, Dn–σ, Tully–Fisher, surface brightness fluctuations; ~10 Mpc: tip of the red giant branch, Local Group novae, NGC 4258 maser, GCLF, PNLF; ~1 Mpc: HST Cepheids, Local Group RR Lyrae; ~100 kpc: LMC Cepheids, SN 1987A light echo, globular cluster RR Lyrae; ~10 kpc: Galactic novae, RR Lyrae statistical parallax, cluster Cepheids.]
Figure 3.6 The cosmological distance ladder, showing how each method
depends on the calibration for more nearby methods (arrows). Insecure calibration
steps are shown as dashed lines. The pink boxes refer to methods that are useful in
star-forming galaxies, while the blue boxes are useful in early-type galaxies. Open
boxes are geometric distance determinations. The PNLF box (planetary nebulae
luminosity function) works for all galaxy populations in the local supercluster.
[Figure 3.7 plot: cz/km s⁻¹ against luminosity distance/Gpc, comparing binned type Ia supernova data with empty, flat dark-energy, closed dark-energy, Einstein–de Sitter, dusty Einstein–de Sitter, closed matter-only, de Sitter and evolving-supernova models.]
Figure 3.7 Constraints on the Hubble diagram of type Ia supernovae. The flat dark energy model has Ωm = 0.27 and ΩΛ = 0.73. It is, of course, also possible to fit the data with a model in which supernovae have a specially-tailored evolution.
3.11 The large-scale structure of the Universe
[Figure 3.9 plot: positions of nearby dwarf galaxies (Leo I, Leo II, UMi, Dra, Sex, LMC, Sgr) around the Milky Way, with axes x/kpc (l = 0°), y/kpc (l = 90°) and z/kpc (b = +90°) extending to about ±200 kpc.]
Figure 3.9 The neighbourhood of our galaxy. MW marks the Milky Way; l and b are the galactic longitude and latitude coordinates. The other annotations are abbreviated names of the nearby galaxies.
Figure 3.10 The Andromeda galaxy, M31, as seen in the optical.
Our Galaxy is falling towards and will eventually itself be accreted by its more
massive neighbour, the Andromeda galaxy (Figure 3.10), also known as M31. In
about a billion years’ time, Andromeda will appear about the size and brightness
of the Magellanic Clouds today. Even today, though, it is visible to the naked eye
on a dark enough night in the northern hemisphere. M31 also has its own satellite
galaxies currently being accreted.
Moving slightly out, both our Galaxy and M31 are members of the Local Group,
known to contain over 40 galaxies. Edwin Hubble was the first to recognize that
these neighbours represent an overdensity relative to the average galaxy density.
Moving slightly out again, the Local Group is itself interacting with other galaxy
groups, such as the Maffei 1 Group, the Sculptor Group, the M81 Group and
the M83 Group (each named after one of their members). Galaxy groups are
gravitationally bound but are not as massive as galaxy clusters; there is no established dividing line between the two, but systems more massive than 10¹⁴ M⊙ are usually regarded as clusters. Our nearest galaxy cluster is the Virgo cluster (Figures 3.11
and 3.12). The Virgo cluster currently has a positive redshift, but the Local Group
is still likely eventually to be accreted by it. The Virgo cluster and the next nearest
cluster, Coma, are themselves part of a structure known as the local supercluster. On these progressively larger scales (Figure 3.13), the Universe looks filled with sheets and filaments — indeed, the local supercluster galaxies are preferentially (though not exclusively) in a particular plane.
Figure 3.11 The centre of the Virgo cluster of galaxies, as seen in optical light. The cluster contains over 2000 galaxies and has a diameter ten times bigger than the full Moon on the sky, but is too faint to be seen by unaided eyesight.
Figure 3.12 The location of our Local Group of galaxies and the Virgo cluster.
Figure 3.13 The Virgo cluster in relation to nearby superclusters. On this truly
gigantic scale, the Local Group is barely discernible on the page.
Compare the scale to Figure 3.14. The large-scale structure of the Universe looks
cobwebby, with superclusters, walls and giant voids. Note that the sensitivity in
Figure 3.15 tapers off at the largest distances. On the very largest scales the
Universe is beginning to look homogeneous, which we shall quantify with the
power spectrum.
[Figure 3.14 plot: a wedge diagram spanning right ascensions 8–17 hr, with radial velocities marked at 5000 and 10 000 km s⁻¹.]
Figure 3.14 The famous ‘stick man’ in the CfA galaxy redshift surveys. The angular axis is right ascension, and the distance from the apex is galactocentric.
[Figure 3.15 plot: a wedge diagram in right ascension with redshifts out to z ≈ 0.2.]
Figure 3.15 The distribution of galaxies seen in the 2dF galaxy redshift survey.
of God and the Kaiser effect flattening. This flattening is determined by the value of Ωm^0.6/b, where b is known as the bias parameter and expresses how much stronger the clustering of galaxies is compared to dark matter (see also Chapter 4).
[Figure 3.17 plot: Δ²(k) against k/h Mpc⁻¹, for Abell, radio, Abell × IRAS, CfA, radio × IRAS, IRAS, APM (angular) and APM/Stromlo samples.]
Figure 3.17 The dimensionless power spectrum of galaxies, selected through a variety of means. The bias parameter depends on the nature of the selection, so the power spectra have been normalized to b = 1.
In other words, the correlation function is the Fourier transform (see the box in Section 2.9) of the power spectrum. If the density field is isotropic, then the power spectrum will depend only on the magnitude of the k vector, not its direction: ⟨|δ_k|²⟩ = |δ_k|²(k). Since ξ(r) must be real, we can replace e^{ik·r} with cos(kr cos θ), and integrating over all angles in three dimensions can be shown to give

\xi(r) = \frac{V}{(2\pi)^3} \int \frac{\sin kr}{kr}\, P(k)\, 4\pi k^2\, dk.  (3.11)
Similarly, we can write the power spectrum as a transform of the correlation function:

\Delta^2(k) = \frac{V}{(2\pi)^3}\, 4\pi k^3 P(k) = \frac{2}{\pi}\, k^3 \int_0^\infty \frac{\sin kr}{kr}\, \xi(r)\, r^2\, dr.  (3.12)
In summary, the power spectrum P (k) (or Δ2 (k)) is very closely related to the
correlation function ξ(r).
So far we’ve talked about three-dimensional power spectra and the correlation
function of galaxies in three dimensions. To measure this we need the positions of
galaxies on the sky, and the distance (e.g. redshift) to the galaxies. But what if we
don’t have redshifts? Can we determine anything about the clustering? It turns out
that we can, though we need to make some assumptions about the number of
galaxies that we see per unit redshift. This might evolve for example because of
galaxy merging, and needs to take account of the fact that less-luminous galaxies
can be seen only nearby while the rarer bright galaxies can be seen further away.
We can write down an angular correlation function w(θ) on the sky in a similar
way to the spatial correlation function, as the excess neighbours that you’d see
over an unclustered population:
d^2 Pr(θ) = n^2 (1 + w(θ)) dΩ_1 dΩ_2,  (3.13)
where dΩ_1 and dΩ_2 are the solid angles of nearby patches of sky, and n is the number of galaxies per unit area on the sky. We won’t prove the relationship here between ξ(r) and w(θ) (it is known as Limber’s equation), but if w(θ) is a power law with w(θ) = Aθ^{1−γ} (where A is some constant), then it turns out that ξ(r) is also a power law with ξ(r) = (r/r_0)^{−γ} (where r_0 is a scale length known as the clustering length scale). The constants A and r_0 can be related with a knowledge of the number of galaxies per unit redshift (see, for example, Gonzalez-Solares, E.A. et al., 2004, Monthly Notices of the Royal Astronomical Society, 352, 44).

The clustering of galaxies was different at high redshifts. We can add an evolution term into the correlation function:

\xi(r) = \left(\frac{r}{r_0}\right)^{-\gamma} (1+z)^{-(3+\varepsilon)}.  (3.14)
If ε = 0, then the galaxy clustering is constant in physical coordinates, i.e. it’s
unaffected by the expansion of the Universe. If ε = 3 − γ, then the clustering is
constant in comoving coordinates. Equation 3.14 still yields a power law angular
correlation function w(θ) = Aθ^{1−γ}. Just so you have it, in this case Limber’s equation (which we won’t prove; see, for example, Phillipps, S. et al., 1978, Monthly Notices of the Royal Astronomical Society, 182, 673) is

A = C r_0^\gamma\, \frac{\int d_A^{1-\gamma}\, g^{-1}(z)\, (1+z)^{-(3+\varepsilon)}\, (dN/dz)^2\, dz}{\left[\int (dN/dz)\, dz\right]^2},  (3.15)
where d_A is the angular diameter distance, g(z) is the derivative of proper distance with redshift, and dN/dz is the number of galaxies per unit redshift. The normalization C is a function of γ and is given by C = √π Γ((γ − 1)/2)/Γ(γ/2), where Γ is a standard function (usually calculated numerically) known as the gamma function.
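As a quick numerical check, this normalization is straightforward to evaluate with the standard library’s gamma function. A minimal sketch in Python (the slope γ = 1.8, a typical galaxy clustering value, is chosen here purely for illustration):

```python
import math

def limber_normalization(gamma_slope):
    """C = sqrt(pi) * Gamma((gamma - 1)/2) / Gamma(gamma/2), as in the text."""
    return (math.sqrt(math.pi)
            * math.gamma((gamma_slope - 1) / 2)
            / math.gamma(gamma_slope / 2))

# For an illustrative clustering slope gamma = 1.8, C comes out near 3.68.
C = limber_normalization(1.8)
```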
We’ll end our tour of the large-scale structure of the Universe with the clustering
of quasars, shown in Figure 3.18. Quasars, as we’ll see in Section 4.6, are
supermassive black holes in the centres of galaxies. The accretion discs around
these black holes can outshine the rest of the galaxy, as we’ll discuss in Chapter 4.
This extraordinary luminosity makes them visible in large numbers throughout
most of the Hubble volume (Section 1.9). The Universe appears far more
homogeneous on these scales, but some clustering signal is still present. The
strength of quasar clustering today is comparable to the predicted dark matter
clustering, but in the early Universe the amplitude of the quasar correlation
function was higher (unlike the dark matter). In general, virialized dark matter
haloes (when considered as objects in their own right) cluster more strongly than
the dark matter distribution as a whole, so the enhanced clustering strength of
quasars could reflect the haloes that they inhabit. The observed quasar clustering
could be accounted for by assuming that quasars inhabit dark matter haloes with a typical mass of around 10^{13} M_⊙. The phenomenon can be characterized by the bias parameter b, mentioned briefly above:

\left(\frac{\delta\rho}{\rho}\right)_{\rm QSOs} = b \left(\frac{\delta\rho}{\rho}\right)_{\rm dark\ matter},  (3.16)
where the bias parameter can be a function of redshift. There isn’t much
theoretical basis for assuming this relationship; it’s more of an empirical rule of
thumb.
Figure 3.19 The deviations from a smooth power spectrum seen in the galaxy power spectrum, compared to the predictions for baryon wiggles: panels plot log10[P(k)/P(k)smooth] against k/h Mpc−1. The galaxy surveys are the 2dFGRS, the whole SDSS galaxies, and the subset of luminous red galaxies from SDSS.
Exercise 3.4 If LBAO is the comoving size of the BAOs at some redshift z, and
θBAO is the observed angular size in a flat universe, show that the comoving
distance to redshift z is dcomoving = LBAO /θBAO .
Exercise 3.5 If δz is the size along the redshift axis of the BAOs, show that
H(z) = c δz/LBAO , where LBAO is the comoving size of the BAOs.
Exercise 3.6 Is the scale length of baryon wiggles dependent on galaxy
bias? ■
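The two BAO measurements in Exercises 3.4 and 3.5 can be made concrete with a short calculation. In this sketch the BAO scale of 150 Mpc is approximate, and the angle and redshift extent are invented illustrative values, not survey measurements:

```python
C_KM_S = 2.998e5      # speed of light in km/s
L_BAO = 150.0         # comoving BAO scale in Mpc (approximate, assumed)

# Exercise 3.4: the transverse BAO size gives the comoving distance,
# d_comoving = L_BAO / theta_BAO (flat universe, small angles).
theta_bao = 0.0436    # observed angular size in radians (illustrative)
d_comoving = L_BAO / theta_bao        # ≈ 3440 Mpc

# Exercise 3.5: the radial (redshift-axis) BAO extent gives the Hubble
# parameter, H(z) = c * delta_z / L_BAO.
delta_z = 0.035       # illustrative redshift extent of the feature
H_z = C_KM_S * delta_z / L_BAO        # ≈ 70 km/s/Mpc
```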
Summary of Chapter 3
1. One piece of evidence in favour of dark matter is galaxy rotation curves.
2. Local galaxies can be classified by eye (or by software) along the Hubble
tuning fork. Irregular galaxies lie outside this framework, and low surface
brightness galaxies were unknown in Hubble’s time. Local elliptical
galaxies tend to occur in galaxy clusters, while local spiral galaxies tend to
occur in the field (the morphology–density relation).
3. There is experimental support for the Tolman test (that surface brightness decreases as (1 + z)^4) once stellar evolution is taken into account.
4. Spiral galaxies obey the Tully–Fisher relation, that luminosity L scales with velocity width Δv as L ∝ (Δv)^α, with α ≈ 3–4. This can also be used as a distance indicator.
5. Elliptical galaxies also have a similar relation, known as the Faber–Jackson relation: L ∝ σ_v^α, where σ_v is the velocity dispersion and α ≈ 3–4, though the physical underpinnings are different.
6. There is a more general relationship for ellipticals, known as the fundamental plane: L ∝ I_0^x σ_v^y, where L is the luminosity, I_0 is the surface brightness within a given radius, and σ_v is the velocity dispersion. This can also be used as the basis of distance indicators.
7. Clusters of galaxies show evidence for dark matter, from the virial theorem
(kinetic energy equals minus one half times the potential energy for a
virialized system).
8. If we assume that the baryonic mass fraction of galaxy clusters doesn’t
depend on redshift, then galaxy cluster sizes and X-ray luminosities can be
used to constrain cosmological parameters.
9. The Sunyaev–Zel’dovich effect is the change of temperature of CMB
photons passing through a galaxy cluster, due to Compton scattering. It can
be used to find the line-of-sight size of a cluster, and assuming spherical
symmetry (true on average) gives an angular diameter distance estimate.
10. Galaxy surveys to determine the large scale structure of the Universe reveal
redshift space distortions known as the fingers of God and the Kaiser effect.
These are due to the velocity dispersion of galaxies within a cluster and the
peculiar velocities of galaxies falling into clusters respectively.
11. The clustering of galaxies can be measured with the correlation function
ξ(r), which is a measure of the number of neighbours that a galaxy has in
excess of what’s expected in an unclustered population. This is proportional
to the Fourier transform of the galaxy power spectrum. It’s also related to
the angular correlation function, i.e. the clustering of galaxies as they appear
on the sky.
12. The acoustic peaks imprinted into the cosmological density field in the early
Universe are still present in the present-day clustering of galaxies. These
features, known as baryonic acoustic oscillations or simply baryon wiggles,
are a standard rod that can be used to derive angular diameter distances and
also the variation in the Hubble parameter H(z). This is a route to
constraining the dark energy equation of state parameter w.
Further reading
• The Galaxy Zoo website is currently [Link].
• Gaitskell, R.J., 2004, ‘Direct detection of dark matter’, Annual Review of Nuclear and Particle Science, 54, 315.
• For more on the virial theorem, see, for example, Chapter 2 of Ryan, S.G. and
Norton, A.J., 2010, Stellar Evolution and Nucleosynthesis, Cambridge
University Press.
• Young, J.S. and Scoville, N.Z., 1991, ‘Molecular gas in galaxies’, Annual
Review of Astronomy and Astrophysics, 29, 581.
• Reese, E.D., 2004, ‘Measuring the Hubble constant with the
Sunyaev–Zel’dovich effect’, in Freedman, W.L. (ed.) Measuring and Modeling
the Universe, Carnegie Observatories Centennial Symposia, Cambridge
University Press (also available at level 5 of [Link]).
• Freedman, W.L., 2001, ‘Final results from the Hubble Space Telescope key
project to measure the Hubble constant’, Astrophysical Journal, 553, 47.
• For an accessible account of recent developments in the cosmological distance
ladder, see, for example, Rowan-Robinson, M., 2008, ‘Climbing the
cosmological distance ladder’, Astronomy and Geophysics, 49, 3.30–3.33.
• An accessible review of the WiggleZ baryon oscillation survey is in Blake, C.,
2008, Astronomy and Geophysics, 49, 5.19–5.24.
• The classic graduate-level text for galaxy clustering (e.g. galaxy correlation
functions and power spectra) is Peebles, P.J.E., 1980, The Large-scale
Structure of the Universe, Princeton University Press.
• Albrecht, A. et al., 2006, Report of the Dark Energy Task Force, available at
astro-ph/0609591.
Chapter 4 The distant optical Universe
It’s one of nature’s ways that we often feel closer to distant generations than
to the generation immediately preceding us.
Igor Stravinsky
Introduction
Perhaps one of the most astonishing surprises about the Universe is that it’s
possible to build telescopes that can observe galaxies throughout almost all the
Hubble volume, most of the way back to the Big Bang. In this chapter we’ll
discuss some of the profound insights that optical telescopes have brought us on
the evolution of galaxies in our Universe.
4.2 Cold dark matter and structure formation
This changed with the advent of precision cosmology and the concordance model
(Chapters 1 and 2) with Ωm,0 h2 = 0.1326 ± 0.0063, ΩΛ,0 = 0.742 ± 0.030 and
h = 0.72 ± 0.03. A flat universe with a cosmological constant has both more time
and more volume at high redshifts than a flat Λ = 0 universe. Both these effects
increased the expected numbers of galaxies at the faint end of the B-band number
counts. This resolved most of the faint blue galaxies problem, but perhaps it’s a
pity that optical galaxy number counts are no longer demanding such strong
changes to our picture of galaxy formation and evolution. (The number counts of
galaxies at submm wavelengths nevertheless turned out to be an unexpectedly
strong constraint, as we’ll see.)
Although this chapter is mainly about optical astronomy, it’s worth just adding
that radio source counts were found as early as 1959 to be steeper than an S^{−5/2}
power law. This was one of the first pieces of evidence that we don’t live in a flat,
steady-state universe, and it predated the discovery of the CMB.
Figure 4.1 Matter power spectrum from the largest to the smallest scales, derived from multiple methods: the CMB, 2dF galaxies, cluster abundance, weak lensing and the Lyman α forest, plotted against k/(h Mpc−1) (with wavelength λ/(h−1 Mpc) along the top axis). This figure pre-dates WMAP. The solid red curve is a model based on CDM in the concordance cosmology.
The density of the surrounding universe can be found by using the results of
Exercises 1.4 and 1.5 in Chapter 1, and applying Ωm = 1 (Equation 1.15). The
density of this Einstein–de Sitter universe comes out as

\rho_{\rm EdS} = \frac{1}{6\pi G t^2}.  (4.7)
Taking the ratio of these two densities gives

\frac{\rho}{\rho_{\rm EdS}} = \frac{3R_{\rm max}c^2}{8\pi G R^3(\theta)}\, 6\pi G\, t^2(\theta) = \frac{9}{4}\, R_{\rm max} c^2\, \frac{t^2(\theta)}{R^3(\theta)} = \frac{9}{4}\, R_{\rm max} c^2\, \frac{R_{\rm max}^2}{4c^2}\, \frac{8\,(\theta - \sin\theta)^2}{R_{\rm max}^3\,(1 - \cos\theta)^3} = \frac{9}{2}\, \frac{(\theta - \sin\theta)^2}{(1 - \cos\theta)^3}.
The function of θ is singular at θ = 0, but by writing a Taylor expansion around a point arbitrarily close to θ = 0, it can be shown that

\frac{(\theta - \sin\theta)^2}{(1 - \cos\theta)^3} \simeq \frac{2}{9}\left(1 + \frac{3}{20}\theta^2 + \cdots\right)

and therefore

\frac{\rho}{\rho_{\rm EdS}} \simeq \frac{9}{2} \times \frac{2}{9}\left(1 + \frac{3}{20}\theta^2 + \cdots\right) = 1 + \frac{3}{20}\theta^2 + \cdots.

This is still in terms of θ, but if we now Taylor-expand Equation 4.3, we find that

t(\theta) \simeq \frac{R_{\rm max}}{2c}\left(\frac{\theta^3}{6} + \cdots\right)

and so

\frac{\rho}{\rho_{\rm EdS}} \simeq 1 + \frac{3}{20}\left(\frac{12ct}{R_{\rm max}}\right)^{2/3}.

We can therefore identify the fractional overdensity as

\delta \simeq \frac{3}{20}\left(\frac{12ct}{R_{\rm max}}\right)^{2/3}.  (4.8)
Despite the high orders of θ that we’ve used, this is known as the linear theory approximation (we’ll see why in Exercise 4.2). In this approximation, density perturbations grow as t^{2/3}, i.e. proportional to the dimensionless scale factor a. However, by the time the overdensity reaches the turn-around radius R_max, linear theory has broken down. This is at θ = π, i.e. t = πR_max/(2c) (Equation 4.3). The overdensity will then have a turn-around density of exactly ρ = 3M/(4πR_max^3). Using Equations 4.5 and 4.7 we find that the density ratio is
\frac{\rho}{\rho_{\rm EdS}} = \frac{3M}{4\pi R_{\rm max}^3}\, 6\pi G t^2 = \frac{9}{2}\, \frac{MG}{R_{\rm max}^3} \left(\frac{\pi R_{\rm max}}{2c}\right)^2 = \frac{9\pi^2 MG}{8c^2 R_{\rm max}} = \frac{9}{16}\pi^2 \simeq 5.552,

so the overdensity is δ = 5.552 − 1 = 4.552. Linear theory would predict δ = (3/20)(6π)^{2/3} ≈ 1.062 at this point (Equation 4.8).
If the cycloid solution applied at θ = 2π, then the overdensity would collapse to a point of infinite density. The linear theory prediction (now infinitely wrong) is δ = (3/20)(12π)^{2/3} ≈ 1.686 at this point. However, it’s more likely that the
overdensity will stabilize at the virial equilibrium, where the potential energy
equals −2 times the kinetic energy. It’s usually assumed that virialization happens
around the time θ = 2π. It’s not too hard to calculate the final virialized density.
Total energy is conserved, and since there’s no kinetic energy at turn-around, the
total energy is just the binding energy at turn-around (E ∝ 1/Rmax , Equation 3.6).
But since the virial binding energy Ev must be related to the virial kinetic energy
EK,v by Ev + 2EK,v = 0, and the total energy is E = Ev + EK,v , we must have
E = Ev /2. This means that the virialized radius must be half the turn-around
radius (Equation 3.6), i.e. the virialized density is eight times the turn-around
density. The collapse time is at θ = 2π, which is twice the turn-around time
(proved either by symmetry in Figure 4.3 or using Equation 4.3), so the density of
the surrounding Universe has gone down by a factor of 4 (Equation 4.7). The
overdensity is therefore 8 × 4 = 32 times bigger than the turn-around overdensity,
i.e. about 146. This collapse also sheds light on the fingers of God. Figure 4.4
shows how the collapse of a spherical overdensity would appear in redshift space.
(Figure 4.4’s panels are labelled: linear regime (squashing effect), turnaround, and collapsed.)
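The numbers in the spherical collapse argument above can be gathered into a short script (a recap sketch of the values just derived, using the book’s convention of multiplying the turn-around overdensity by 32):

```python
import math

# At turn-around (θ = π): ρ/ρ_EdS = (9/16)π² ≈ 5.55.
turnaround_ratio = (9 / 16) * math.pi ** 2
turnaround_overdensity = turnaround_ratio - 1          # ≈ 4.55

# Linear theory at the same moment: δ = (3/20)(6π)^(2/3) ≈ 1.06.
linear_at_turnaround = (3 / 20) * (6 * math.pi) ** (2 / 3)

# At collapse (θ = 2π): the virialized density is 8× the turn-around density
# and the background has dropped by 4×, so the overdensity grows 32-fold.
collapse_overdensity = 32 * turnaround_overdensity     # ≈ 146

# Linear theory at collapse: δ = (3/20)(12π)^(2/3) ≈ 1.686.
linear_at_collapse = (3 / 20) * (12 * math.pi) ** (2 / 3)
```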
Of course, you might quite reasonably object to this analysis, because real overdensities are much more likely to be highly aspherical. The growth of these density perturbations can be modelled with numerical N-body simulations. However, the spherical collapse model contains more elements of the underlying physics than you might expect, as we’ll see. We’ve also assumed an Ω = 1 universe — but what happens more generally?

In general, the growth of dark matter perturbations in the Universe is governed by fluid dynamics equations. Taking the linear approximation of these equations gives the relation (deriving Equation 4.9 would take us too far off-topic, and introduce mathematical machinery that we won’t use elsewhere, but if you’re not satisfied with having this used without proof, you can find proofs in Chapter 15 of Peacock, 1999, or Chapter 11 of Coles and Lucchin, 1995, in the Further reading section below)

\ddot{\delta} + 2\,\frac{\dot{a}}{a}\,\dot{\delta} = 4\pi G \rho_{\rm m}\, \delta,  (4.9)
where δ is the shorthand for the matter overdensity (δρ)/ρ0 , and a = R/R0 is the
dimensionless scale factor (Chapter 1).
Exercise 4.2 Show using Equation 4.9 that in a flat matter-dominated universe with no cosmological constant, matter perturbations grow following the power law δ(t) ∝ t^{2/3}, i.e. δ(t) is just proportional to the scale factor a. (You may assume that H(t) = 2/(3t) in this universe.) ■
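Exercise 4.2 can also be checked numerically. In an Einstein–de Sitter universe H = 2/(3t) and ρ_m = 1/(6πGt²) (Equation 4.7), so G cancels and Equation 4.9 becomes δ″ + (4/(3t))δ′ = (2/(3t²))δ. A rough sketch (the time range and step count are arbitrary choices):

```python
def grow(t0, t1, n=200_000):
    """Integrate δ'' + (4/(3t)) δ' = (2/(3t²)) δ with a simple Euler scheme,
    starting on the growing mode δ = t^(2/3)."""
    dt = (t1 - t0) / n
    t = t0
    delta = t0 ** (2 / 3)
    ddelta = (2 / 3) * t0 ** (-1 / 3)   # d(t^(2/3))/dt at t0
    for _ in range(n):
        accel = -(4 / (3 * t)) * ddelta + (2 / (3 * t * t)) * delta
        ddelta += accel * dt
        delta += ddelta * dt
        t += dt
    return delta

# Growing from t = 1 to t = 8 should multiply δ by 8^(2/3) = 4,
# confirming δ(t) ∝ t^(2/3) ∝ a.
ratio = grow(1.0, 8.0)
```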
The solution for general cosmologies can be expressed as (see Peacock, 1999, in the Further reading section)

\frac{\delta(z=0,\ \Omega_{m,0},\ \Omega_{\Lambda,0})}{\delta(z=0,\ \Omega_{m,0}=1,\ \Omega_{\Lambda,0}=0)} \simeq \frac{5}{2}\,\Omega_{m,0}\left[\Omega_{m,0}^{4/7} - \Omega_{\Lambda,0} + \left(1 + \frac{\Omega_{m,0}}{2}\right)\left(1 + \frac{\Omega_{\Lambda,0}}{70}\right)\right]^{-1}.  (4.10)
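Equation 4.10 is simple to evaluate numerically; in this sketch the concordance-style values Ω_{m,0} = 0.26 and Ω_{Λ,0} = 0.74 are illustrative inputs rather than fitted parameters:

```python
def growth_suppression(omega_m, omega_l):
    """Equation 4.10: linear growth today relative to Einstein–de Sitter."""
    return 2.5 * omega_m / (
        omega_m ** (4 / 7) - omega_l
        + (1 + omega_m / 2) * (1 + omega_l / 70)
    )

# Sanity check: an Einstein–de Sitter universe (Ωm,0 = 1, ΩΛ,0 = 0) gives 1.
eds = growth_suppression(1.0, 0.0)

# A concordance-like universe has had its linear growth suppressed by ~25%.
g = growth_suppression(0.26, 0.74)
```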
Convolution
If you’ve ever experimented with image software like Photoshop, you’ll
know that it’s possible to smooth or blur an image. How does this work?
Well, at any particular point in the image, the software will consider a little
box around that point and take the average value in that little box. Perhaps
all pixels in that little box will be given the same weight, or perhaps it’ll be a
weighted average with more weight given to pixels in the middle of the little
box than the edge. The software repeats this little-box-averaging around all
the positions in the image, and the result is a blurred or smoothed image.
It turns out that Fourier analysis has a very simple and beautiful way of
expressing this. Let’s suppose that the pixels are small enough that we can
consider an integral instead of a sum. In one dimension, the blurring could be expressed as B(x) = \int_{-\infty}^{\infty} I(x+\alpha)\, P(\alpha)\, d\alpha, where I(x) is the un-blurred data, P is the weighting function (e.g. 1 in a range −L to +L, representing the little box size, and 0 otherwise), and B(x) is the blurred data. The weighting function P is sometimes called the kernel. In two or
more dimensions, we’d change x into a vector x and α into a vector α, but
otherwise it’s the same. We call this the convolution of I with P , and
it’s sometimes written as B = I ⊗ P. Now, it turns out that the Fourier transforms of B, I and P, which we can write as B̃, Ĩ and P̃, are related by B̃ = Ĩ × P̃. In other words, a convolution in real space can be expressed as
just a multiplication in Fourier space. The reverse is true too: a convolution
in Fourier space is the same as a multiplication in real space.
An example is the treble control on a stereo. High musical notes have high
frequencies and short wavelengths, and since the Fourier wave number k is
proportional to the reciprocal of wavelength, the high musical notes are in
the high Fourier k modes. Now suppose that we don’t want the treble (high)
notes to be so loud, so we rig up some electronics that suppresses these
high k modes, with (say) an exp(−½(k/k_0)²) factor, where k_0 is some
constant. We picked this factor because it’s a Gaussian, and it turns out that
the Fourier transform of a Gaussian is another Gaussian. Multiplying
the Fourier transform of our sound with a Gaussian to suppress the high
treble notes is therefore the same as convolving our sound with another
Gaussian. So, when you’re turning down the treble control on a stereo,
you’re smoothing out the sound waves coming out of the speaker, just like
blurring an image in Photoshop by convolving it with a Gaussian.
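The box’s central claim, that convolution in real space is multiplication in Fourier space, can be verified directly. This self-contained sketch uses a deliberately slow direct Fourier transform, and the signal and 3-pixel box kernel are arbitrary illustrations:

```python
import cmath

def dft(x):
    """Direct (slow, O(n²)) discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(X):
    """Inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

n = 16
signal = [float((i * 7) % 5) for i in range(n)]                  # arbitrary data
kernel = [1 / 3 if i in (0, 1, n - 1) else 0.0 for i in range(n)]  # 3-pixel box

# Real-space (circular) convolution: B(i) = sum_j I(i - j) P(j).
direct = [sum(signal[(i - j) % n] * kernel[j] for j in range(n)) for i in range(n)]

# Fourier-space route: multiply the transforms, then invert — B~ = I~ × P~.
via_fourier = [z.real for z in
               idft([a * b for a, b in zip(dft(signal), dft(kernel))])]

match = all(abs(d - v) < 1e-9 for d, v in zip(direct, via_fourier))
```

The two routes agree to rounding error, which is exactly the convolution theorem the box describes.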
The number density of dark matter haloes per unit mass, known as the mass
function, can be predicted by assuming that the density fluctuations are Gaussian.
First we convolve the density field with a spherical kernel with radius R, which
corresponds to a mass scale

M = \frac{4}{3}\pi R^3 \rho_0.  (4.11)

The probability that a random point has an overdensity δ_convolved is just

\Pr(\delta_{\rm convolved})\, d\delta_{\rm convolved} = \frac{1}{\sqrt{2\pi\sigma_{\rm convolved}^2}} \exp\left(\frac{-\delta_{\rm convolved}^2}{2\sigma_{\rm convolved}^2}\right) d\delta_{\rm convolved},  (4.12)

i.e. just a Gaussian distribution. It turns out that the variance σ_convolved² can be calculated from the power spectrum P(k) ∝ k^{n_s} (Equation 2.44): σ_convolved ∝ M^{−(3+n_s)/6} (see, for example, Coles and Lucchin, 1995, in the Further reading section). In particular, the probability that a region is above the
function on small scales (e.g. less than one comoving Mpc), where this function
reflects the way that galaxies populate the haloes.
Figure 4.5 shows why baryons must behave differently to dark matter: if they
didn’t, galaxies would have far more satellites, just as galaxy clusters have
many satellite clumps containing galaxies. Recent discoveries of Milky Way
satellites have alleviated this problem (Section 3.11), but there is still a factor of ∼4 deficit at some mass ranges. It’s possible to explain this deficit if the
star formation in the lowest-mass haloes is suppressed, which numerical
simulations of feedback (Chapters 5 and 6) suggest may be the case. If so, our
Milky Way galaxy is surrounded by puddles of ‘failed dwarf galaxies’ that
have yet to undergo their first star formation. Another dark matter puzzle that
baryonic physics might solve is the lack of sharp density cusps at the centres
of dark matter haloes: N -body simulations predict that haloes should have a
Navarro–Frenk–White profile — which we shall meet in detail in Chapter 7, and
which predicts a density profile varying as ρ(r) ∝ 1/r at the centre. Local galaxy
observations suggest otherwise, perhaps because winds from supernovae or black
hole accretion drive out matter and smooth out the density profile.
Figure 4.6 Simulated galaxy spectra (relative flux Fλ against wavelength λ/Å) following a 1 Gyr-long burst of star formation. Ages are marked in Gyr on each spectrum, from 0.4 Gyr to 17.4 Gyr. The spectra have been normalized to 1 and offset vertically for clarity. Some spectra have lines or continua off the top of the figure, and have been truncated for clarity.
The initial mass function is one of the key unknowns in observational astronomy.
Often the initial number of stars per unit mass, dN/dm, is assumed to have a simple form such as dN/dm ∝ m^{−2.35} over a range 0.1–100 M_⊙ or so, which matches inferences from present-day mass distributions of Galactic stars.
However, it’s not at all clear whether the IMF varies from place to place within a
galaxy, or between galaxies. We’ll see in Chapter 5 that when star formation rates
of galaxies are estimated, the estimates all depend on light generated ultimately
by the most massive stars. In order to find the total number of stars being formed,
we need to extrapolate from these largest stars to the smallest, which is very
sensitive to the shape of the IMF.
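The sensitivity to the IMF can be illustrated with a short calculation (a sketch, not from the book; the 8 M_⊙ threshold is a conventional marker for massive stars, and the alternative slope of −2.7 is an arbitrary comparison):

```python
def mass_in_range(m_lo, m_hi, slope=-2.35):
    """Mass formed between m_lo and m_hi for dN/dm ∝ m^slope:
    the integral of m^(slope+1), evaluated analytically."""
    p = slope + 2
    return (m_hi ** p - m_lo ** p) / p

total = mass_in_range(0.1, 100.0)      # all stars, 0.1–100 solar masses
massive = mass_in_range(8.0, 100.0)    # stars above ~8 solar masses
fraction = massive / total             # ≈ 0.14: most mass is in faint stars

# Steepening the slope slightly changes the answer a lot — extrapolating
# from the luminous massive stars to the total is IMF-sensitive:
fraction_steep = (mass_in_range(8.0, 100.0, -2.7)
                  / mass_in_range(0.1, 100.0, -2.7))
```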
Dust reddening is another unknown. Dust preferentially absorbs blue optical light
compared to red. This changes the (B–V) colour of a background star by an
amount symbolized as E(B–V). We can write the observed value of (B–V) as
(B–V)obs , and it’s related to the true value (B–V)true by
(B–V)obs = (B–V)true + E(B–V). (4.15)
4.3 Population synthesis
Figure 4.7 A large interplanetary dust grain collected from the upper atmosphere of the Earth by a NASA aircraft in 2003, during the Earth’s passage through the dust stream from comet 26P/Grigg-Skjellerup. Parts of the dust grain appear to be pre-solar, and other parts originated in the interstellar medium; labelled components include presolar silicates, interstellar organic matter, supernova olivine and an interstellar nanoglobule (scale bar: 2 µm).
Figure 4.8 The relative extinction Aλ/AV as a function of inverse wavelength (1/λ)/µm−1, for dust found in the LMC in two locations (the LMC2 supershell and the LMC average), in the SMC bar and in our Galaxy (RV = 3.1).
One way of measuring the amount of dust attenuation is the Balmer decrement. The hydrogen Balmer lines Hα (656.3 nm) and Hβ (486.1 nm) are emitted by gas that’s ionized by hot stars, with a characteristic ratio that’s calculable from atomic physics. (This assumes that the gas is optically thick to the Lyman continuum — see, for example, Osterbrock, D.E. and Ferland, G.J., 2005, Astrophysics of Gaseous Nebulae and Active Galactic Nuclei, University Science Books.) We’ll meet these lines again in Chapters 5 and 8. If we write the optical depth to Hα photons as τ_Hα, the Hα fluxes are by definition attenuated by e^{−τ_Hα}. The optical depth turns out to be related to AV by τ_Hα ≈ 0.7AV. (It shouldn’t be a surprise that it’s a linear relation, since magnitudes are a logarithmic system and the optical depth appears in an exponential, e^{−τ_Hα}.) However, as we see in Figure 4.8, the extinction and optical depth to Hβ will be greater. For Galactic dust, τ_Hβ ≈ 1.45τ_Hα. If we compare the observed flux ratio of Hα and Hβ, S_Hα/S_Hβ, with the predicted ratio of 2.8 from atomic physics, we can infer τ_Hα and hence AV. The extinction is also empirically related to the gas column density (which we shall meet in subsequent chapters) through N_H/E(B–V) = 4.93 × 10^{21} atoms cm^{−2} mag^{−1} (see, for example, Diplas, A. and Savage, B.D., 1994, Astrophysical Journal, 427, 274).
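The Balmer-decrement inference described above can be sketched in a few lines, using the quoted relations τ_Hβ ≈ 1.45τ_Hα, τ_Hα ≈ 0.7AV and an intrinsic ratio of 2.8 (the observed decrement of 4.2 is an invented example):

```python
import math

def av_from_balmer(observed_ratio, intrinsic_ratio=2.8):
    """Infer A_V from an observed S_Hα/S_Hβ flux ratio (foreground-screen dust).

    Observed ratio = intrinsic × e^(−τ_Hα) / e^(−τ_Hβ)
                   = intrinsic × e^(0.45 τ_Hα),
    so τ_Hα = ln(observed/intrinsic) / 0.45 and A_V = τ_Hα / 0.7.
    """
    tau_ha = math.log(observed_ratio / intrinsic_ratio) / 0.45
    return tau_ha / 0.7

# An observed decrement of 4.2 implies roughly A_V ≈ 1.3 magnitudes.
av = av_from_balmer(4.2)
```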
It’s not just the type of dust that affects the colours, it’s the location of the dust, as
the following (optional) exercise shows.
Exercise 4.4 The previous discussion of Balmer decrements assumed that the
dust is placed in a screen in front of the emission-line-emitting gas. Suppose
instead that the dust is evenly mixed in with this gas. Calculate the Balmer
decrement SHα /SHβ that you’d measure in this situation. Now suppose that you
observed this but wrongly assumed that the dust was in a screen in front of the
stars. What AV would you wrongly infer? (This is a more difficult and slightly
more open-ended exercise than most in this book, so for quantities not supplied in
the question, improvise!) ■
Another way of looking at this exercise is that the optical depth of extinction to a
given position depends on the wavelength, so Hα photons would have a lower
optical depth than Hβ photons or ultraviolet photons. However, the light that we
receive will always be dominated by the regions with optical depths of τ < 1, so
the observed light at shorter and shorter wavelengths will be dominated by regions
with lower and lower extinctions. The modern approach in population synthesis is
to include assumptions or predictions for the dust location, density variation and
composition.
wavelength λem , and hence infer the redshift from the observed wavelength λobs
and 1 + z = λobs /λem (see Equation 1.10). But how can we find which emission
line is which? Often in practice certain features are characteristic enough to be
instantly recognizable, but this is not always the case.
One way is to use the wavelength ratios. Suppose that we have two emission lines
observed at wavelengths λobs,1 and λobs,2 . If they’re at the same redshift, then
\frac{\lambda_{\rm obs,1}}{\lambda_{\rm obs,2}} = \frac{\lambda_{\rm em,1}(1+z)}{\lambda_{\rm em,2}(1+z)} = \frac{\lambda_{\rm em,1}}{\lambda_{\rm em,2}}.  (4.17)
There are a limited number of astrophysically-plausible emission lines in galaxies,
so the observed wavelength ratios can quickly identify the emitted wavelengths.
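This ratio matching is easy to automate. A toy sketch (the line list and the 1% tolerance are illustrative assumptions); note that with these numbers the [O II]/[O III] pair also matches within the tolerance, which illustrates how ratio matching alone can be ambiguous:

```python
# Rest wavelengths in nm for a few common emission lines (illustrative list).
REST_NM = {"Lyman α": 121.6, "[O II]": 372.7, "Hβ": 486.1,
           "[O III]": 500.7, "Hα": 656.3}

def identify(obs1, obs2, tol=0.01):
    """Candidate (line1, line2, z) IDs for two observed wavelengths in nm,
    with obs1 the shorter-wavelength line. Ratios are redshift-independent."""
    matches = []
    for n1, w1 in REST_NM.items():
        for n2, w2 in REST_NM.items():
            if n1 == n2 or w1 >= w2:
                continue
            if abs(obs1 / obs2 - w1 / w2) < tol * (w1 / w2):
                matches.append((n1, n2, obs1 / w1 - 1))
    return matches

# Lines observed at 972.2 nm and 1312.6 nm have the Hβ/Hα ratio,
# suggesting z = 972.2/486.1 − 1 ≈ 1.0 (an [O II]/[O III] pair at a
# different redshift also fits, hence the ambiguity).
candidates = identify(972.2, 1312.6)
```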
Even when a galaxy doesn’t have emission lines, the absorption lines can be used
to find redshifts. Another absorption feature is the ‘break’ in the spectra in
Figure 4.6 at about 400 nm, known as the 4000 Å break, caused by Balmer
continuum absorption in the atmospheres of stars.
Spectroscopy is not always available, because it’s harder to get good
signal-to-noise ratios in spectra (where you need enough photons in each of
hundreds of pixels along the wavelength axis) than in broad-band imaging (where
all the photons passing through the filter are directed onto a few pixels in the
image). No matter how big your telescope, it’s always possible to make images of
galaxies too faint to find redshifts from spectra with that telescope. The next best
approach is to use the broad-band colours of galaxies to estimate the redshift. If a
galaxy has photometric measurements in many filters (e.g. UBVRIJHK —
see Section 3.3), this is effectively a spectrum with a very coarse wavelength
resolution. Instead of using emission lines, this approach uses the broad shape
of the spectrum to estimate the redshift. With good enough photometric
measurements in enough filters, it’s possible to distinguish redshifting from the
reddening from dust or the presence of old stars. The most likely redshift is
usually found from the minimum-χ² fit to the photometric data using template spectra (see, for example, Bolzonella, M., Miralles, J.-M. and Pelló, R., 2000, Astronomy and Astrophysics, 363, 476). Figure 4.9 shows that this works on the whole, but it’s hard to avoid a few strong outliers known sometimes as ‘catastrophic failures’ of the fitting. Reducing these outliers is one of the main challenges in calculating photometric redshifts. The Sloan Digital Sky Survey (SDSS) used a custom-made set of broad-band filters named u, g, r, i, z. The sharper boundaries of the filter curves help with photometric redshifts.
As we’ll see in Chapter 8, high redshift galaxies can be identified from the fact
that intervening neutral clouds cause absorption of Lyman series lines. This
means that at observed wavelengths shorter than the redshifted Lyman α line (the
n = 2 → 1 hydrogen transition at an observed wavelength of 121.6(1 + z) nm),
and particularly those below the redshifted Lyman limit (91.2(1 + z) nm,
n = ∞ → 1), the continuum from the high redshift galaxy is strongly suppressed.
This means that z > 4 galaxies can be found by searching for galaxies with
detections in the B or g filters and longer wavelengths, yet non-detections in the U
or u filters. This photometric redshift technique is known as U-band dropouts or
Lyman break galaxies. Searches for galaxies at higher redshifts tend to focus on
dropouts at longer wavelengths, though it becomes progressively more difficult
because we start running out of cosmological volume, since dV /dz tends to zero
as z increases (the Hubble volume has a finite comoving size). Follow-up
Figure 4.9 A comparison of photometric redshift estimates zphot with spectroscopic redshifts zspec (which have much higher precision). Note that while the technique works for most objects, there are some strongly outlying points known as ‘catastrophic’ failures. Five survey data sets are used (HDF-S, MS-1054, CDF-S, MUSYC and HDF-N) and are marked in different symbols.
spectroscopy often finds a Lyman α emission line (rest wavelength 121.6 nm), as
shown in Figure 4.10. Note the weak continuum at λ above Lyman α, and the lack
of continuum at λ below. However, the presence of the rest-frame 4000 Å break in
lower-redshift galaxies and the nearby [O II] 372.7 nm emission line has resulted
in some high-z galaxy claims that failed under closer examination.
(Figure 4.10: the spectrum of a z = 5.34 galaxy, plotted as flux density Fν/µJy against wavelength λ/Å, with the Lyα emission line near 7700–7800 Å.)
4.5 Luminosity functions
Exercise 4.5 Figure 4.11 shows the number density of galaxies per absolute
magnitude. How is this related to the number density of galaxies per unit
luminosity? ■
Calculating the number density of galaxies in principle is very simple: count the
number of galaxies N within a volume V (we’ll assume that the galaxies aren’t
evolving), to find the number density ρ = N/V . This can be done if we know all
the galaxies within a particular volume V0 , known as a volume-limited sample.
For example, if there’s only a probability of 1/10 that a particular sort of galaxy
is in the volume, then each time we see that type of galaxy in the volume,
there must be nine more we’ve missed, so each one we see is ‘worth’ ten. If
it’s a volume-limited sample, then pi = 1 for every galaxy, and it reduces to
Equation 4.19. The root-mean-square (RMS) estimate of the uncertainty in ρ from
Equation 4.20 is just
\[ \sigma_\rho = \frac{1}{V_0} \sqrt{\sum_{i=1}^{N} \frac{1}{p_i^2}}. \qquad (4.21) \]
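A minimal numerical sketch of this estimator, assuming Equation 4.20 has the form ρ = (1/V₀) Σᵢ 1/pᵢ (consistent with the volume-limited limit described above); the function name is invented for illustration:

```python
import math

def density_estimate(p, V0):
    """Detection-probability-weighted number density and its uncertainty.

    p  : list of detection probabilities p_i, one per observed galaxy
    V0 : survey volume
    Assumes the estimator rho = (1/V0) * sum(1/p_i) (the form implied by
    the text for Equation 4.20) and the RMS uncertainty of Equation 4.21,
    sigma_rho = (1/V0) * sqrt(sum(1/p_i**2)).
    """
    rho = sum(1.0 / pi for pi in p) / V0
    sigma = math.sqrt(sum(1.0 / pi**2 for pi in p)) / V0
    return rho, sigma

# Volume-limited case: every p_i = 1, so rho = N/V0 and sigma = sqrt(N)/V0,
# recovering the familiar Poisson result.
rho, sigma = density_estimate([1.0] * 100, V0=1000.0)
print(rho, sigma)   # 0.1 0.01
```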
key test is therefore whether they seem to evolve. There are many ways of testing this, but one method is particularly free of other assumptions: the ⟨V/Vmax⟩ test.
The trick here is to use the cumulative probabilities. For any probability
distribution p(x), a sample x will be in the 10th percentile exactly 10% of the
time, in the 50th percentile 50% of the time, and so on. These percentiles are
essentially the cumulative probabilities. If we plotted a histogram of the values
of x, the shape would be p(x) (within the uncertainties). However, if we plotted
a histogram of the percentiles associated with each x, the histogram would
be uniform (again within the uncertainties).
In other words, the cumulative probability distribution c(x) = ∫₀ˣ p(x′) dx′ is uniform from 0 to 1.
In the case of our flux-limited sample, the cumulative probability that a galaxy at
distance di is seen at any distance from 0 to di is Vi /Vmax,i , where Vi is the
volume enclosed by the distance di , and Vmax,i is, as before, the volume enclosed
by dmax,i . The test that there’s no evolution is therefore that Vi /Vmax,i should be
uniformly distributed from 0 to 1. In particular, this means that the average value should be ⟨V/Vmax⟩ = 1/2. To test the null hypothesis of no evolution, we need a measured value of ⟨V/Vmax⟩ and its uncertainty. You might think that the best approach is to propagate the uncertainties on the redshifts, but in practice these are effectively negligible. However, even if the null hypothesis holds and there is no evolution, there would still be some variation in ⟨V/Vmax⟩, as Exercise 4.6 shows. The usual procedure is to use this expected variation (in the case of no evolution) as the uncertainty in ⟨V/Vmax⟩, which is then used to test whether the
sample data are consistent with no evolution.
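The ⟨V/Vmax⟩ test is easy to implement. The sketch below uses the fact that a uniform distribution on [0, 1] has mean 1/2 and variance 1/12, so the standard error on the mean of N values under the null hypothesis is 1/√(12N); the function name and the toy sample are invented for illustration.

```python
import math

def v_over_vmax_test(ratios):
    """Test a flux-limited sample for evolution via <V/Vmax>.

    ratios : list of V_i / V_max,i values, one per galaxy.
    Under the null hypothesis of no evolution these are uniform on [0, 1],
    with mean 1/2 and variance 1/12, so the standard error on the mean of
    N values is 1/sqrt(12 N).
    Returns (mean, expected standard error, significance in sigma).
    """
    n = len(ratios)
    mean = sum(ratios) / n
    stderr = 1.0 / math.sqrt(12.0 * n)
    return mean, stderr, (mean - 0.5) / stderr

# A toy non-evolving sample: values spread evenly over (0, 1).
sample = [(i + 0.5) / 48 for i in range(48)]
mean, err, sig = v_over_vmax_test(sample)
print(round(mean, 3), round(err, 3))   # 0.5 0.042
print(abs(sig) < 1e-6)                 # True: consistent with no evolution
```

A sample of evolving sources (e.g. quasars that were more common in the past) would instead pile up towards V/Vmax = 1 and give a mean significantly above 1/2.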
4.6 Active galaxies
Exercise 4.9 Assume that the galaxy luminosity function has a Schechter
function form, with a faint-end slope α < 1. Show that most of the light emitted
by galaxies per unit volume is dominated by galaxies near L∗ (also near the Milky
Way luminosity). ■
Figure 4.14 Optical spectrum of the quasar 3C273. Note the broad optical
emission lines from the quasar’s broad-line region (e.g. Hα).
Quasars typically (though not always) have very blue optical–ultraviolet spectra
and broad emission lines, sometimes with widths suggesting Doppler motions of
thousands of km s−1 . The optical–ultraviolet continuum varies on timescales of
weeks or less, suggesting physical sizes of light-weeks. These observations
suggest accretion onto a central massive object, and the size and velocity
constraints make it likely that this object is a supermassive black hole, i.e. one
with a mass ≳ 10⁶ M⊙, if only because any astrophysical alternatives would very
rapidly evolve into a single black hole.
We’ll cover some of the detailed physics of quasars in Chapter 6 (see also
the further reading section, especially Kolb’s book Extreme Environment
Astrophysics). For now, we’ll state a few general results to define the terminology.
The modern picture of quasars and radiogalaxies is of a dusty torus that makes the
optical appearance depend strongly on orientation. Figure 4.15 shows a schematic
diagram of these models of active galactic nuclei (AGN).
Figure 4.15 Schematic view (not to scale) of the dust-torus-based unified model of radio-loud active galaxies.
Where the broad-line region is visible, the active galaxy is seen as a type 1 object (i.e. with broad and narrow
lines), such as a Seyfert 1 or quasar. When the torus obscures the line of sight to the broad-line region, the active
galaxy is seen as a type 2 object (i.e. with narrow lines only), such as a Seyfert 2 or radiogalaxy. When the jet is
pointed directly along the line of sight, the jet luminosity can sometimes swamp the rest of the active nucleus;
these examples are referred to as blazars.
When the central black hole accretion (sometimes called the central engine) is
visible along the line of sight, the broad (> 1000 km s−1 ) emission lines and blue
optical–ultraviolet continuum are visible; when the host galaxy is visible, these
are known as Seyfert 1 galaxies, and in general these are type 1 AGN. When the
dusty torus obscures the line of sight to the central engine, only narrow emission
lines are visible and the continuum is dominated by starlight from the host
galaxy; these are Seyfert 2 or type 2 AGN. The broad line region and narrow line
region are shown in Figure 4.15. These AGN also show evidence for gas with
higher ionization than found in starbursts, such as having high ratios of [O III]
495.9 + 500.7 nm to [O II] 372.7 nm or Hβ 486.1 nm, or a high ratio of [N II]
654.8 + 658.4 nm flux to Hα 656.3 nm flux.
Figure 4.16 illustrates how star-forming galaxies and AGN emission lines
differ. About 10% of active galaxies are radio-loud, i.e. they have luminous
radio-emitting lobes. It’s not clear why some active galaxies have radio lobes
while others (the radio-quiet AGN) do not, but the lobes appear to be
caused by particle jets emanating from the central engine. These impact on
the interstellar/intergalactic medium and create a cocoon of plasma, in which
electrons spiral along magnetic field lines and emit synchrotron radiation at radio
wavelengths. Even the radio-loud objects come in at least two distinct types:
Fanaroff–Riley (FR) types I and II (not to be confused with Seyfert type). FR-I
radiogalaxies have less luminous radio lobes that taper off in brightness towards
the edges (‘edge-darkened’), while FR-II radiogalaxies have more luminous,
edge-brightened lobes. About half the energy output of an FR-II radiogalaxy
comes out as jet kinetic energy — an astonishing output given that a quasar’s
luminous energy can exceed that of the rest of the galaxy combined.
Figure 4.16 The relative strengths of narrow emission lines can be used to diagnose whether black hole accretion or star formation is present in a galaxy: the intensity ratio of [O III] 5007 Å to Hβ 4861 Å is plotted against that of [N II] 6583 Å to Hα 6563 Å, both on logarithmic scales. This diagram is known as the 'Baldwin–Phillips–Terlevich diagram' or sometimes 'BPT diagram'. Both wavelength pairs are close in wavelength, so are insensitive to dust reddening. Open circles represent galaxies with emission lines from H II regions (i.e. star formation), while the closed symbols are active galaxies. (Filled circles are Seyfert 2 galaxies, and triangles are weaker AGN known as 'low-ionization nuclear emission regions' or LINERs.) AGN can be separated from star-forming galaxies using the curved line.
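A classification of this kind can be sketched in a few lines. The text only states that a curved line separates the two populations; the specific demarcation adopted below is the empirical curve of Kauffmann et al. (2003), so treat its exact coefficients as an assumption of this sketch.

```python
import math

def bpt_is_agn(nii_ha, oiii_hb):
    """Crude BPT classification from narrow-line flux ratios.

    nii_ha  : [N II] 658.3 nm / Halpha flux ratio
    oiii_hb : [O III] 500.7 nm / Hbeta flux ratio
    Uses the empirical demarcation of Kauffmann et al. (2003),
    log([O III]/Hb) = 0.61 / (log([N II]/Ha) - 0.05) + 1.3,
    as an assumed stand-in for the curved line in Figure 4.16.
    """
    x = math.log10(nii_ha)
    y = math.log10(oiii_hb)
    if x >= 0.05:                      # beyond the curve's asymptote:
        return True                    # only AGN-like ratios live here
    boundary = 0.61 / (x - 0.05) + 1.3
    return y > boundary

print(bpt_is_agn(1.0, 5.0))    # strong [N II] and [O III]: True (AGN-like)
print(bpt_is_agn(0.1, 1.0))    # H II-region-like ratios: False
```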
Baade’s original question on the triggers of active galaxies is still with us. As
we’ll see in Chapter 6, the formation of supermassive black holes and their
growth through accretion are closely related to the formation of stars in their
host galaxies. The luminosity function of quasars is not in general well fit
by the Schechter function; instead, researchers have opted for an arbitrary
double-power-law parameterization:
\[ \phi(L) = \frac{\phi^*}{(L/L^*)^\alpha + (L/L^*)^\beta}, \qquad (4.24) \]
where φ∗ , L∗ , α and β are free parameters to be determined from the data. This
model doesn’t have any underlying physical motivation, but if α > β then it
has the property that φ ∝ L−β at luminosities far below the break luminosity
(i.e. L ≪ L∗), and φ ∝ L−α at L ≫ L∗. The overall shape is therefore of a
shallow power law at faint luminosities, steepening at luminosities around L∗ to a
steeper power law at high luminosities.
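These asymptotic slopes are easy to verify numerically. The sketch below (with invented parameter values) measures the logarithmic slope of Equation 4.24 at the faint and bright ends:

```python
import math

def double_power_law(L, phi_star, L_star, alpha, beta):
    """Double-power-law luminosity function, Equation 4.24:
    phi(L) = phi_star / ((L/L_star)**alpha + (L/L_star)**beta)."""
    x = L / L_star
    return phi_star / (x**alpha + x**beta)

def log_slope(L, **kw):
    """Numerical logarithmic slope d ln(phi) / d ln(L)."""
    eps = 1.001
    return (math.log(double_power_law(L * eps, **kw))
            - math.log(double_power_law(L, **kw))) / math.log(eps)

# Hypothetical parameters with alpha > beta, as in the text.
kw = dict(phi_star=1.0, L_star=1.0, alpha=3.0, beta=1.5)
print(round(log_slope(1e-6, **kw), 2))   # -1.5: faint end goes as -beta
print(round(log_slope(1e+6, **kw), 2))   # -3.0: bright end goes as -alpha
```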
Quasars held another surprise: the luminosity function of quasars evolves very
strongly. Figure 4.17 shows the evolution of bright quasars from SDSS.
Figure 4.17 The total comoving number density of SDSS quasars with i-band
absolute magnitudes brighter than −27.6.
It appears that quasars were far more common at a redshift of z = 2 than at the
present. The first measurements of this evolution parameterized these changes as
evolution in φ∗ only (pure density evolution or PDE) or in L∗ only (pure
luminosity evolution or PLE). Initial indications were that PLE fitted better,
suggesting a long-lived population of quasars that dimmed over cosmological
times; however, the radio-loud subset also showed PLE, while the electron energy
loss from synchrotron radiation in their radio lobes implied ages of only tens of
millions of years. The PLE decline in quasar number density from z = 2 to z = 0
varies approximately as (1 + z)3 . We’ll return to the underlying causes of this
sudden increase and decline in quasar activity in the cosmos in Chapters 5 and 6,
but an interesting clue comes from comparing the evolution in galaxy–galaxy
major merger rates (i.e. mergers of similarly-sized galaxies) inferred from optical
galaxy surveys: the merger rate varies as (1 + z)2.7±0.6 at least up to z = 1.
Numerical simulations have shown that galaxy–galaxy mergers could drive gas to
the final common centre, so a link is plausible. At the highest redshifts, the quasar
number density appears to drop quickly (Figure 4.17), known as the redshift
cut-off of quasars. Again the underlying physical causes of this change are still
debated.
Exercise 4.10 How would PDE and PLE translate a luminosity function in the
(log φ, log L) plane? ■
Exercise 4.11 (This is a more open-ended and difficult exercise than most in
this book.) The luminosity L of radio lobes depends on the kinetic energy output
per unit time of the radio jet, Q, on the density of the surrounding medium ρ
(which can be inferred independently from other observations), and on time. As
the radio jet burrows deeper into the surrounding medium, the cocoon of ionized
plasma increases in size with time, so the linear size r of the radio lobes increases.
It can be shown that the jet power output Q is related to these observables roughly as Q ∝ L^{6/7} r^{−4/7} ρ^{−1/2}. (See, for example, Miller, P., Rawlings, S. and Saunders, R., 1993, Monthly Notices of the Royal Astronomical Society, 263, 425.) Comment on whether radio lobe surface brightness is suitable for the Tolman test. ■

One advantage that radiogalaxies have over radio-loud quasars is that the
observed-frame optical luminosities are not dominated by the central active
nucleus, so the host galaxy is visible. The K-band light is much less sensitive to
young stars than the ultraviolet (Figure 4.6), so should be a measure of the
assembled stellar mass in the host galaxy. How does the K-band luminosity of
radiogalaxies evolve? It turns out that the K-band Hubble diagram (apparent
magnitude versus redshift) has a tight scatter, consistent with just passive stellar
evolution in the host galaxies that locally are giant ellipticals. However, this
scatter increased considerably above a redshift of around 2. Is this the formation
epoch of giant elliptical galaxies? Another effect conspired to increase the
observed scatter: at z > 2 the [O III] 500.7 nm emission line redshifts into the
K-band window at 2–2.5 µm, so the added dispersion in the K–z relation might
just reflect variations in the emission line contribution.
Figure 4.18 The Ursa Major constellation (top left), with a square degree region marked in red. A zoom into
this region shows the location of the Hubble Deep Field North (HDF-N) in red. Note that the region of sky is
unremarkable. The location was chosen to avoid bright objects at all wavelengths (e.g. bright stars in the optical,
bright galaxies in the far-infrared or radio); not every patch of sky is suitable for a blank-field survey. The panel to
the right shows the final zoom into the HDF-N itself.
These are part of what was originally called the ‘faint blue galaxy’ population.
These Lyman break galaxies cluster strongly. Combined with their number
density, it became clear that the bias parameter for these galaxies was high,
effectively cancelling the weaker clustering of dark matter at earlier epochs, so the
overall clustering strength resembled present-day galaxies.
Searches for U-band dropouts yielded another surprise: a population of galaxies
with a strong Lyman α emission line but weak continuum. (See, for example,
Steidel, C.C. et al., 2000, Astrophysical Journal, 532, 170.) These have become
known as Lyman α blobs. The widths of the lines suggest that the ionization
causing the Lyman α line is from star formation, though at least one candidate has
been found with an obscured X-ray core (Figure 4.20), suggesting hidden AGN.
We’ll return to the coupled formation of black holes and galaxies in later chapters.
It's not clear what fraction of the ionizing radiation escaped from galaxies at high redshifts; we'll return to this in Chapter 8.

Figure 4.20 A composite image of a Lyman α blob, with Lyman α coloured yellow, Spitzer infrared data marking the sites of dust-shrouded star formation coloured red, and Chandra X-ray data marking the site of supermassive black hole accretion coloured blue. This may be an example of the feedback processes in action (Chapters 5 and 6).

Other new galaxy populations have been discovered in these deep field surveys. The Extremely Red Objects (ERO) were found to be very faint at optical wavelengths, but very bright in the near-infrared, with colours (R–K) > 5. It was not initially clear whether these were very dusty star-forming galaxies or redshifted old stellar populations. In fact, both appear to be present in the ERO population. EROs cluster strongly, suggesting that they are associated with more massive dark matter haloes than Lyman break galaxies. We'll return to this in the next chapter. A related population is the BzK galaxies, selected in the (B–z) versus (z–K) colour–colour plane to satisfy (z–K)–(B–z) > −0.2 (when the magnitude zero points are given in the AB system). (See Daddi, E. et al., 2004, Astrophysical Journal, 617, 746.) This appears to select z > 1 star-forming galaxies independently of reddening because reddening moves galaxies parallel to the (z–K)–(B–z) = −0.2 threshold.
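The BzK criterion is a one-line test. The sketch below applies it, with hypothetical magnitudes invented purely to illustrate the two outcomes:

```python
def is_star_forming_bzk(b_mag, z_mag, k_mag):
    """BzK selection of z > 1 star-forming galaxies (Daddi et al. 2004).

    Takes AB magnitudes. The criterion from the text is
    (z - K) - (B - z) > -0.2; reddening moves galaxies roughly parallel
    to this threshold, so the selection is nearly reddening-independent.
    """
    bzk = (z_mag - k_mag) - (b_mag - z_mag)
    return bzk > -0.2

# Hypothetical magnitudes, chosen only to illustrate the two outcomes.
print(is_star_forming_bzk(b_mag=24.0, z_mag=23.0, k_mag=21.0))  # True
print(is_star_forming_bzk(b_mag=24.0, z_mag=22.0, k_mag=21.9))  # False
```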
Like quasars, the galaxy luminosity function evolves strongly at all wavelengths
investigated so far. Figure 4.21 shows the evolution in M∗ and φ∗ in the best-fit
Schechter functions at several wavelengths. The deepest survey so far is the
Hubble Ultra Deep Field (UDF). This is a further ultra-deep survey with the
more sensitive Advanced Camera for Surveys on the upgraded HST, some
details of which are shown in Figure 4.22. The most recent (and possibly final)
refurbishment of the HST introduced the WFC3 instrument, which has yielded
z ∼ 10 galaxy candidates in the UDF using the Lyman dropout technique, shown
in Figure 4.23. A surprise from these observations has been that the luminous
output from these galaxies is declining so quickly that these galaxies could not
have reionized the Universe (see Chapter 8).
We can use the evolution in the luminosity function to estimate the cosmic
star formation history, i.e. the amount of mass being turned into stars per unit
comoving volume per year, as a function of redshift. Because heavy elements
are synthesized in stars, this also traces the metal enrichment of the Universe
throughout its history. Recall that astronomers use ‘metal’ to refer to any elements
heavier than helium.
4.7 Deep-field surveys and wide-field surveys
Figure 4.21 The evolution of φ∗ and M∗ of the best-fit Schechter function for galaxies, at various wavelengths (1500 Å, 2800 Å, and the SDSS u′ and g′ bands), determined from the FORS Deep Field survey. Also shown are the local SDSS galaxy constraints.
Figure 4.22 Details from the Hubble Ultra Deep Field, taken with the HST’s
Advanced Camera for Surveys.
Figure 4.23 [A z ∼ 10 candidate galaxy in the Hubble UDF, shown in the V + i + z, Y, J and H bands.]
We’ve seen that star-forming regions have hot young O and B stars that are most
luminous at ultraviolet wavelengths. The amount of ultraviolet light being
emitted per unit comoving volume could therefore be used as an estimator of the
volume-averaged star formation rate. If we integrate L φ(L) (where L is the
ultraviolet luminosity and φ(L) is the luminosity function at this ultraviolet
wavelength), we can calculate the ultraviolet luminosity density. The next step is
to extrapolate from the O and B star formation to calculate the formation of stars
of all types. For this, one needs to assume an initial mass function for the stars.
The next step is to account for dust obscuration, to which ultraviolet luminosity is
particularly prone. This is also difficult. As we’ve seen, the extinction depends on
both the geometry of the dust and the dust composition. One approach (known as the Calzetti extinction law) is to use an empirical correction derived from a comparison of models with optical–ultraviolet rest-frame spectra of galaxies. (See Calzetti, D., Kinney, A.L. and Storchi-Bergmann, T., 1994, Astrophysical Journal, 429, 582.) Figure 4.24 shows the cosmic star formation history derived from ultraviolet observations with this extinction law. This was first known as the 'Madau plot' after the 1996 Madau et al. paper that first appeared to detect the high-redshift decline (Madau, P. et al., 1996, Monthly Notices of the Royal Astronomical Society, 283, 1388). Later usage changed to refer to the Madau diagram or Madau–Lilly diagram (the latter acknowledging earlier work that detected the initial increase from z = 0 to z = 1). Equivalently, the term cosmic star formation history is
used. This diagram has been enormously influential, but it’s wise to keep in mind
the uncertainties in the underlying assumptions. We’ll see in Chapter 5 other
approaches to constraining this diagram. It is superficially similar to the evolution
of quasars, which immediately suggested a physical connection between the
formation of stars and the growth of black holes.
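The first step, forming the luminosity density by integrating Lφ(L), can be sketched numerically. For a Schechter function with the convention φ(L) dL = φ∗ (L/L∗)^{−α} e^{−L/L∗} d(L/L∗), the integral has the closed form φ∗ L∗ Γ(2 − α), which gives a useful cross-check:

```python
import math

def uv_luminosity_density(phi_star, L_star, alpha, n=200000, x_max=50.0):
    """Numerically integrate L * phi(L) for a Schechter function,
    phi(L) dL = phi_star * (L/L*)**(-alpha) * exp(-L/L*) d(L/L*),
    using the trapezium rule in x = L/L*.  The closed form is
    phi_star * L_star * Gamma(2 - alpha)."""
    xs = [x_max * (i + 1) / n for i in range(n)]
    f = [x**(1.0 - alpha) * math.exp(-x) for x in xs]
    dx = x_max / n
    integral = (sum(f) - 0.5 * (f[0] + f[-1])) * dx
    return phi_star * L_star * integral

# With phi* = L* = 1 and alpha = 1, the answer is Gamma(1) = 1.
rho = uv_luminosity_density(phi_star=1.0, L_star=1.0, alpha=1.0)
print(round(rho, 3))                    # 1.0
print(round(math.gamma(2 - 1.0), 3))    # 1.0
```

Converting this luminosity density to a star formation rate density then requires the assumed initial mass function and dust correction described above.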
Another way of constraining the cosmic star formation history is with the
present-day optical spectra of nearby galaxies. Population synthesis models can
be used to determine (or at least constrain) when the bulk of the stars formed, i.e.
the location of the peak in the Madau diagram. The result of this analysis applied
to the SDSS spectra of about 3 × 10⁵ galaxies shows a decline from z = 1 to z = 0, but the analysis doesn't have the redshift resolution to find the location of the peak.

Figure 4.24 [The cosmic star formation history: the star formation rate per unit comoving volume, in M⊙ yr⁻¹ Mpc⁻³, plotted against redshift, with the corresponding cosmic time t/Gyr shown on the upper axis.]
the peak.
In the previous chapter we spent some time discussing local galaxy scaling
relationships, so it’s worth saying briefly how these relationships change at higher
redshift. At redshifts z < 1, ellipticals show some evolution in the fundamental
plane, at least some of which may be due to passive stellar evolution, though there
are discussions of how the selection effects of magnitude-limited samples affect
the result.3 Meanwhile, spiral galaxies show evolution in the Tully–Fisher
relation, suggesting ‘differential’ evolution,4 meaning that different types of
galaxies evolve differently: low-mass galaxies appear to have undergone more
star formation more recently than higher-mass galaxies. The z ≈ 1 edge-on
spiral discs appear to have thicker widths than their z = 0 counterparts, and
disturbances such as warps were more common5 at z = 1.
The HST seems to be a long way from being limited by galaxy–galaxy overlaps,
but other telescopes have not been so fortunate. There is a threshold known as the
confusion limit beyond which one can’t rely on being able to separate individual
objects. Even the HST can reach this limit when observing a dense star cluster.
Intuitively, there has to come a point where the RMS fluctuations of the image are no longer dependent on the noise in the image (which would reduce as 1/√time) but instead are dominated by the fluxes of the faint sources lying below the detection threshold.
The confusion limit is usually defined as three or five times the fluctuations from
background objects. The location of the confusion limit varies depending on the
slope of the source counts, and is surprisingly high, as the following worked
example shows. We shall use the term beam to mean the area on the sky occupied
3 See, for example, Treu, T., 2003, astro-ph/0307281, and Almeida, C., Baugh, C.M. and Lacey, C.G., 2007, Monthly Notices of the Royal Astronomical Society, 376, 1711.
4 See, for example, Böhm, A. and Ziegler, B.M., 2007, Astrophysical Journal, 668, 846.
5 See, for example, Reshetnikov, V.P., Dettmar, R.-J. and Combes, F., 2003, Astronomy and Astrophysics, 399, 879.
by a point source, i.e. an object that’s spatially unresolved by that telescope. For
example, if a point source has a Gaussian shape with a (standard deviation) width
of r arcseconds, then the beam area Ω could be regarded as πr2 square arcseconds
(though conventions differ slightly on this choice). We’ll use a detection limit of
five times the noise, because then less than one beam in 3.5 million will have a
random detection. This may seem excessively cautious, but remember that
megapixel cameras are now very common and it’s quite possible to have many
millions of beams in an image, particularly in wide-field astronomical mosaics.
Solution
In an interval S → S + dS, for cumulative source counts N(>S) = kS^{−α}, the total number of sources per unit area will be dN = αkS^{−α−1} dS. We can write this as dN = α(N(>S)/S) dS. Therefore the number of sources in one beam will be Ω dN = αΩ(N(>S)/S) dS. This will be subject to Poisson statistics, so the variance will equal the mean, i.e. αΩ(N(>S)/S) dS. The variance of the flux (as opposed to the number) will be S² times the variance in the number, i.e. αSΩN(>S) dS. Integrating this from S = 0 to S = Slim gives
\[ \mathrm{Var}(S) = \int_0^{S_{\mathrm{lim}}} \alpha S\, \Omega k S^{-\alpha}\, \mathrm{d}S = \frac{\alpha}{2-\alpha}\, k\, S_{\mathrm{lim}}^{2-\alpha}\, \Omega = \frac{\alpha}{2-\alpha}\, \Omega\, N(>S_{\mathrm{lim}})\, S_{\mathrm{lim}}^{2}. \]
Therefore the noise σ = √Var(S) will be
\[ \sigma = \sqrt{\frac{\alpha}{2-\alpha}\, \Omega\, N(>S_{\mathrm{lim}})}\; S_{\mathrm{lim}}. \]
Now, the quantity ΩN(>Slim) is the number of sources per beam. We'll find it more useful to work in the number of beams per source, (ΩN(>Slim))⁻¹. If we want the detection limit Slim to be five times the noise level, i.e. Slim = 5σ, then
\[ \left[ \frac{\alpha}{2-\alpha}\, \Omega\, N(>S_{\mathrm{lim}}) \right]^{-1/2} = 5, \]
which rearranges to give
\[ \bigl( \Omega\, N(>S_{\mathrm{lim}}) \bigr)^{-1} = \frac{25\alpha}{2-\alpha}. \]
If the source count slope is Euclidean, then α = 1.5, so we'd need one source per 75 beams! In practice the source counts flatten at the very faintest fluxes, and one source per 20–40 beams is usually considered as the confusion limit.
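The final rearrangement is a one-liner, and reproduces the figure of 75 beams per source for Euclidean counts:

```python
def beams_per_source(alpha, snr=5.0):
    """Beams per source at the confusion limit, from the worked example:
    (Omega * N(>S_lim))**-1 = snr**2 * alpha / (2 - alpha)
    for cumulative source counts N(>S) = k * S**-alpha."""
    return snr**2 * alpha / (2.0 - alpha)

print(beams_per_source(1.5))   # 75.0: Euclidean counts
print(beams_per_source(1.0))   # 25.0: flatter counts confuse less
```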
The confusion limit severely constrains what can be said about individual objects,
but it may still be possible to constrain the shape of the source count slope from
the histogram of pixel values in a confusion-limited map. This is known as a P(D)
analysis, where D is the deflection from the mean in the map, and P is the
probability of that deflection. The observed histogram is compared to the pixel
value distribution predicted by a given source count model. We’ve also assumed
that the underlying point sources are not clustered, but clustering will increase the
confusion limit. It may also be possible to constrain the clustering of the sources
below the confusion limit by measuring the angular power spectrum of the
distribution of pixel values, known as a fluctuation analysis.
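A P(D) analysis can be sketched with a small Monte Carlo: fill each beam with a Poisson number of unresolved sources drawn from power-law counts, and histogram the summed fluxes. Everything below (parameter values, flux cut-offs, function name) is invented purely for illustration:

```python
import math
import random

def simulate_pd(n_pix, mean_per_beam, alpha, s_lim, seed=1):
    """Monte Carlo sketch of a P(D) distribution.

    Each pixel (beam) receives a Poisson number of unresolved sources,
    with fluxes drawn from cumulative counts N(>S) propto S**-alpha
    truncated between s_lim/100 and s_lim.  Returns the list of summed
    pixel fluxes, whose histogram is the P(D) distribution.
    """
    rng = random.Random(seed)
    s_min = s_lim / 100.0
    pixels = []
    for _ in range(n_pix):
        # Poisson draw via the standard multiplication method.
        k, prod, target = 0, rng.random(), math.exp(-mean_per_beam)
        while prod > target:
            k += 1
            prod *= rng.random()
        # Inverse-transform sampling of the truncated power-law flux.
        total = 0.0
        for _ in range(k):
            u = rng.random()
            inner = s_min**-alpha + u * (s_lim**-alpha - s_min**-alpha)
            total += inner**(-1.0 / alpha)
        pixels.append(total)
    return pixels

pix = simulate_pd(n_pix=2000, mean_per_beam=3.0, alpha=1.5, s_lim=1.0)
print(len(pix), min(pix) >= 0.0)   # 2000 True
```

Fitting the resulting histogram against real map pixels is what constrains the source count model below the confusion limit.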
Exercise 4.12 You might think that a limit of five times the RMS fluctuations
(a one in 3.5 million chance of a random noise spike for Gaussian noise) is
extraordinarily conservative. For a beam of one square arcsecond, which is typical
in some ground-based optical imaging, calculate how big an image has to be
in square degrees to have one random 5σ noise spike on average, assuming
Gaussian noise. (For comparison, there are several cameras on world-class optical
telescopes with fields of view of at least a half a degree along each side.) ■
The HDF-N and HDF-S are examples of pencil-beam surveys: very deep but
narrow surveys, covering a long and thin volume of the Universe. The opposite
strategy is a wide-field survey: shallower, but wider area. On the widest scales,
we’ve already met the SDSS survey, which covers a quarter of the sky. Digitized
versions of photographic plates taken by Schmidt survey telescopes are available
online for the whole sky, known as the Digitized Sky Survey (DSS). (The DSS is available online from several sites, such as [Link].)

Exercise 4.13 For many observations, including HST imaging, the depth of an image is proportional to the square root of the time spent integrating, i.e. the faintest flux S is proportional to 1/√t. Suppose that the galaxy source counts
are Euclidean. Would you detect more galaxies in a pencil-beam survey or a
wide-field survey, for a fixed amount of observing time? What source count slope
would tip the balance in the opposite direction? ■
The HDF has had such an enormous impact on cosmology that it has perhaps emboldened time allocation committees on the HST and other telescopes. In
any case, the HDF-N and HDF-S have been supplemented by various wider-area
HST surveys, most notably COSMOS (Cosmological Evolution Survey), which
covered 2 deg2 . We shall meet some of its key results in Chapter 7.
Figure 4.25 shows the redshift histogram in the HDF-N. The non-uniformity
is quite striking. This illustrates one disadvantage of pencil-beam surveys:
large-scale structure fluctuations can have a big effect on some measurements
such as the redshift distribution. This was part of the motivation for making a
second Hubble Deep Field, HDF-S. The similarity of the galaxy populations in
HDF-N and HDF-S is sometimes cited as confirmation of the cosmological
principle, though the redshift histograms differ. We saw in Chapters 2 and 3 how
the variance of the galaxy distribution depends on scale, and we can use this to
give an order-of-magnitude estimate for the fluctuations in our survey, e.g. by
defining k = 1/V^{1/3}, where V is the comoving volume of the survey. (This is, in
fact, an underestimate of the fluctuations — see the further reading section.) The
clustering of dark matter is expected to be weaker at high redshift, but to first
order this is cancelled by the evolving bias parameter of galaxies, as we’ve seen.
Figure 4.25 Redshift histogram N(z) in the HDF-N estimated from a sparse sample of 140 objects. Note the peaks in the redshift distribution, caused by the pencil-beam survey passing through large-scale structures.
Figure 4.26 shows how the sizes of galaxies evolve in the Hubble UDF. Galaxies of a given luminosity are consistently smaller at higher redshifts, with a size-dependence scaling approximately as (1 + z)^{−1.1±0.2}. What could cause this size evolution? One clue comes from the phase space density of a galaxy. This is the density that a galaxy (or a portion of a galaxy) has in an imagined six-dimensional space of three space dimensions (x, y, z) and three velocity axes (vx, vy, vz). One can estimate it by dividing the mass density by the volume of an ellipsoid with the axes equal to the velocity dispersions in each of the three velocity axes. This is a useful quantity because numerical simulations of galaxy–galaxy mergers have shown that phase space density decreases by a factor of a few during a merger, and it can be shown by Liouville's theorem (Chapter 7) that phase space density cannot be increased, unless the stars' kinetic energy is dissipated into, for example, gas motions or radiation. Some elliptical galaxies have phase space densities consistent with being the merger product of spiral galaxy collisions, and numerical simulations predict that the final product would have an elliptical morphology. On the other hand, the cores of giant elliptical galaxies have phase space densities much higher than those of spiral galaxies, so they cannot be formed by (dissipationless) mergers. However, Lyman break galaxies at z > 5 have phase space densities similar to the cores of present-day massive ellipticals. Could these be the progenitors of today's giant ellipticals? Perhaps. We'll meet another population of galaxies making a similar claim in Chapter 5: the submm galaxies, which have giant starbursts as expected in the original monolithic collapse model (Chapter 3). It seems that we're still some way from resolving the monolithic collapse versus disc–disc merger debate on the origin of elliptical galaxies.

Figure 4.26 The half-light radii rhl (the radii containing 50% of the light) of galaxies in the Hubble UDF.
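The phase space density estimate described above is a simple ratio. The sketch below implements it with purely hypothetical numbers:

```python
import math

def phase_space_density(mass_density, sigma_vx, sigma_vy, sigma_vz):
    """Crude coarse-grained phase space density estimate, as in the text:
    mass density divided by the volume of the velocity-space ellipsoid
    whose semi-axes are the three velocity dispersions.  With mass_density
    in M_sun pc**-3 and dispersions in km/s, the result is in
    M_sun pc**-3 (km/s)**-3."""
    ellipsoid_volume = (4.0 / 3.0) * math.pi * sigma_vx * sigma_vy * sigma_vz
    return mass_density / ellipsoid_volume

# Hypothetical, purely illustrative numbers: since a dissipationless
# merger cannot raise this quantity (Liouville's theorem), a remnant
# should not exceed its progenitors.
spiral = phase_space_density(0.1, 30.0, 30.0, 20.0)
remnant = phase_space_density(0.05, 150.0, 120.0, 100.0)
print(remnant < spiral)   # True
```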
The sizes of high-redshift elliptical galaxies have recently thrown up another
fundamental puzzle, which at the time of writing is unresolved. There is a very numerous population of small, passively-evolving (i.e. not star-forming) elliptical galaxies at z > 1 whose luminosities imply large stellar masses of > 10¹¹ M⊙. These are sometimes called red nuggets, and they have no local counterparts. (See van Dokkum, P.G., Kriek, M. and Franx, M., 2009, Nature, 460, 717, and references therein.) Could their luminosities be a misleading measure of the underlying stellar mass, for some reason? Extremely deep spectroscopy of one example suggests not:
Figure 4.27 The spectrum of galaxy 1255-0 (grey), with a smoothed version shown in black and the best-fit
population synthesis model shown in red. The wavelengths of some absorption lines are marked in yellow.
The insert shows the likelihood distribution of the galaxy’s velocity dispersion, using two different methods of
estimating the noise. The panels to the right show the HST 1.6 µm image, a model, and the difference between the
two.
despite a small effective radius of just 0.78 ± 0.17 kpc, this galaxy has an enormous velocity dispersion of 510^{+165}_{−95} km s⁻¹ (Figure 4.27), implying a very large stellar mass.
What about elliptical galaxies that form later — why do they conform to the same
fundamental plane? One possibility is that red nuggets are the cores of giant
elliptical galaxies at z > 1, and we simply haven’t imaged deeply enough to see
the diffuse faint outer regions (see Mancini, C. et al., 2009, arXiv:0909.3088). Clearly, there are still many unanswered questions about the evolution of massive galaxies.
The underpopulated narrow region between these two has come to be known
as the green valley. Some authors have used the red sequence as a de facto
morphology-independent definition of early-type galaxies at z < 1; in fact, one
result of the Galaxy Zoo project is that the colour–density relation is stronger than the morphology–density relation (Chapter 3) in the local Universe (Bamford, S. P. et al., 2009, Monthly Notices of the Royal Astronomical Society, 393, 1324). The red sequence and blue cloud both evolve with redshift (Figure 4.30). The blue cloud becomes slightly redder with time, perhaps because of stellar ageing or increasing
dust content. There are also far more luminous blue cloud galaxies at z > 0.5 than
in the local Universe. Similarly, the red sequence becomes redder on average with
time, consistent with passive stellar evolution. Taking into account the effects of
passive stellar evolution, the numbers and magnitudes of galaxies in the red
sequence imply a build-up of stellar mass in early-type galaxies by a factor of 2
since z = 1.
These observations could be explained if some star-forming galaxies in the blue
cloud stop forming stars, then move to the red sequence and evolve passively.
Galaxy evolution in general would then be a story of formation in the blue
cloud (or merger-induced starbursts moving a system into the blue cloud), then
migration across the green valley to join the red sequence. This, however, raises
the question of what mechanism stopped the star formation. There have been
suggestions that active galaxies are more common in the green valley, suggesting
that AGN activity is somehow responsible for truncating star formation (because
the green valley objects might be expected to be transition objects). Furthermore,
ultraviolet estimates of the star formation rate in (morphologically) early-type
galaxies find that the amount of star formation anti-correlates with the velocity
dispersion of the galaxy, but not with the overall galaxy luminosity (Schawinski, K. et al., 2006, Nature, 442, 888). We’ll see in Chapter 6 that this velocity dispersion is closely linked to the mass of the central
supermassive black hole. The truncation of star formation does appear to have
something to do with the central black hole.
[Figure 4.30 panels: colour against absolute magnitude MV − 5 log10 h, each panel marked with an AV = 1 reddening vector.]
Figure 4.30 The blue cloud and red sequence, shown at a range of redshifts. The sloping solid lines are a fit to
the red sequence location, and the dashed line marks the distinction between clouds regarded as ‘blue’ or ‘red’.
The sloping dotted line is the approximate apparent magnitude limit of the survey. Simulated evolutionary tracks of
some galaxies are shown with lines and crosses. The predicted movement of a galaxy undergoing a reddening of
AV = 1 is shown as a vector. Only representative error bars are shown.
A detailed comparison of galaxy evolution models with the evolving blue cloud
and red sequence, together with the requirement of also reproducing the Madau diagram and the evolution of the total stellar mass density Ω∗, revealed
that the number densities of the most massive early-type galaxies could not be
reproduced. One possibility is that these most massive early-type galaxies are the
result of dry mergers, i.e. mergers of galaxies with very little gas, so no star
formation results. However, there is currently debate in the community as to
whether dry mergers could account for the mass–metallicity relation in early-type
galaxies, and the structure and sizes of early-types. The mass–metallicity relation
is the correlation between stellar mass and the metal enrichment as measured by
(for example) emission line ratios in star-forming galaxies. There is much that we
have yet to understand about the formation of the most massive elliptical galaxies.
Summary of Chapter 4
1. Dark matter is described as ‘cold’, ‘hot’ or ‘warm’ according to the relative
speeds of the dark matter particles.
2. The hierarchical formation model of large-scale structure describes the
merger of dark matter haloes to make progressively larger dark matter
haloes.
3. Population synthesis models describe the evolution of galaxy spectra by
modelling the evolution of stars (and dust) within them.
4. Dust extinction in the V-band is expressed as AV , measured in magnitudes.
This can be measured with the Balmer decrement, among other ways,
though the estimates are sensitive to assumptions about the dust distribution.
5. Redshifts can be determined from galaxy emission lines. Redshifts can also
be estimated by modelling the changes of observed colours with redshift, a
technique known as photometric redshifts. An important example of a
photometrically-selected high-redshift galaxy population is the Lyman break
galaxy population.
6. The luminosity function of galaxies is the number per unit luminosity (or per decade of luminosity), per unit comoving volume. It can be measured using the
1/Vmax statistic.
7. The V /Vmax values can be used to test for evolution, or in local galaxy
samples to test for incompleteness.
8. Type 1 active galaxies have a direct view of the quasar’s broad-line region,
whereas type 2 active galaxies do not. The type 2 systems can be
distinguished from star-forming galaxies using emission line ratios.
9. Quasars evolve strongly, with a peak in number density around z ≈ 2.5 and
a decline at higher redshifts.
10. The rest-frame ultraviolet luminosity density can be used to measure the
comoving star formation density of the Universe, known as the Madau–Lilly
diagram or the Madau diagram. This is similar, but it seems not identical, to
quasar evolution.
11. The Madau diagram can also be inferred from the ages of stars in local
galaxies.
12. The confusion limit restricts how deep a given telescope can image. This
limit is conventionally set at 3 or 5 times the noise per beam from
background objects.
13. Pencil-beam surveys such as the Hubble Deep Fields have redshift
histograms that show peaks due to the large-scale structures through which
the surveys pass.
14. Although the rest-frame ultraviolet morphologies of local galaxies can be
quite different to the rest-frame optical morphologies, it appears that these
differences are not responsible for most of the changing appearance of
galaxies in deep Hubble Space Telescope (observed-frame) optical surveys.
15. High-redshift galaxy surveys have used galaxy colour–magnitude diagrams
to generalize the local division of early-type and late-type galaxies into the
red sequence, the blue cloud and the green valley.
Further reading
• For more on the evolution of large-scale structure, see the graduate-level text
Peacock, J.A., 1999, Cosmological Physics, Cambridge University Press.
• Alternatively, try Coles, P. and Lucchin, F., 1995, Cosmology, Wiley.
• For more on gamma-ray bursts and active galaxies, see Kolb, U., 2010,
Extreme Environment Astrophysics, Cambridge University Press.
• Antonucci, R., 1993, ‘Unified models for active galactic nuclei and quasars’,
Annual Review of Astronomy and Astrophysics, 31, 473.
• Kennicutt, R.C., 1998, ‘Star formation in galaxies along the Hubble sequence’,
Annual Review of Astronomy and Astrophysics, 36, 189.
• There is a curious analogy between the clustering of galaxies or CMB
fluctuations, and the uncertainty principle in quantum mechanics: see
Tegmark, M., 1995, Astrophysical Journal, 455, 429, and Tegmark, M., 1996,
Monthly Notices of the Royal Astronomical Society, 280, 299.
• For an accessible review on the conundrums posed by red nugget galaxies, see
Glazebrook, K., 2009, Nature, 460, 694.
• Binney, J. and Tremaine, S., 2008, Galactic Dynamics, Princeton University
Press.
• To view John Michell’s famous paper of 1767 go to
[Link]
Chapter 5 The distant multi-wavelength Universe
The distant view is not always the truest view.
Nathaniel Hawthorne
Introduction
We’ve seen how strongly dust can affect optical observations — but what
happens hidden behind the dust? One of the great surprises in cosmology in
the past decade has been the tremendous amount of star formation and black
hole accretion hidden behind heavy dust extinction. This is invisible to optical
telescopes but not to other telescopes, as we’ll see in this chapter.
Exercise 5.2 We’ve just seen that the flux interval dSν that contributes the
most background will be the one in which Sν dN/dSν is a maximum. Which
logarithmic flux interval d ln Sν contributes the most background? ■
5.1 The extragalactic optical and infrared background light
Figure 5.2 Galaxy differential source counts from a variety of sources, normalized to the Euclidean prediction, at an observed wavelength of 15 µm, plotted against flux density S/mJy. Also shown are a no-evolution model and an evolution model that better fits the data (red).
Now, we’ve already seen how the galaxies that dominate the ultraviolet luminosity
density can dominate the (optically-derived) cosmic star formation history. What
this means is that the galaxies that dominate the cosmic star formation history at
some redshift are necessarily the same ones that contribute the most to the
extragalactic background, at that frequency and redshift. Finding out which
galaxies dominate the extragalactic background light is a very similar research
problem to measuring the cosmic star formation history.
Reproducing the extragalactic background light is therefore a key objective of
source count models, one of which is plotted in Figure 5.2. These models
aim to account for the observed number counts and (where available) redshift
distributions, at all wavelengths, and reproduce the present-day stellar mass
density Ω∗ . There are many approaches. One approach is to vary the numbers of
galaxies of different types with redshift, and find the best-fit evolution for this
assumed population mix. Another approach is semi-analytic modelling, in which
the locations of dark matter haloes are given by a numerical model, and the haloes
are populated by galaxies based on some physical assumptions with adjustable
free parameters. Ideally, one would like to simulate a cosmological volume right
down to the scales of the formation of stars in molecular clouds, but this is a very
long way from being computationally possible, so many source count models rely
on template galaxy spectral energy distributions (SEDs), i.e. template spectra
from the ultraviolet to the far-infrared and beyond, to extend the galaxy number
count predictions to different wavelengths. These templates are sometimes
predictions from numerical radiative transfer models of dust and stars in particular
galaxies, or sometimes taken directly from observations.
Figure 5.4 The flux variation with redshift of a typical star-forming galaxy with a fixed luminosity, at a variety of observed-frame wavelengths from the optical to 1.4 GHz. Flux density/mJy is plotted against redshift z (from 0.1 to 10), assuming a luminosity of 5 × 10^12 L⊙, a dust temperature of 38 K, Ωm,0 = 0.3, ΩΛ,0 = 0.7 and h = 0.65.
These galaxies were soon known as ‘SCUBA galaxies’, but as other cameras
became available (with names such as MAMBO, BOLOCAM, AzTEC,
SHARC-II, BLAST, LABOCA) the more generic term of submillimetre galaxies
(or SMGs) became current. The selection function of SMGs (or their mm-wave
counterparts MMGs) is strikingly uniform, as you can see in Figure 5.4: an SMG
at z = 10 would have almost the same brightness as an identical galaxy at z = 1.
● Would the histogram of redshifts of SMGs also be uniform?
❍ No. Even if the number density of SMGs didn’t evolve, we’d still be sampling
different amounts of comoving volume at different redshifts, i.e. dV /dz is not
constant.
Figure 5.5 The 850 µm image of the Hubble Deep Field North. This image has
a radius of 100 arcseconds. The zoom shows the corresponding Hubble Space
Telescope data in the region of the brightest SMG.
An intense campaign of optical spectroscopy with the Keck telescopes found a
median redshift of around z = 2.2 for SMGs.6 Even without redshifts, the far-infrared luminosities suggested star formation rates of around 1000 M⊙ per
year. (Because submm flux is more or less independent of redshift at 1 < z < 10,
the luminosities can be estimated even without redshifts.) We’ll discuss in
Section 5.7 how SMGs changed our physical picture of galaxy formation and
evolution.
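The near-constancy of submm flux with redshift (the negative K-correction illustrated in Figure 5.4) can be sketched numerically. Below, a fixed-luminosity grey body is redshifted through a flat cosmology and its observed 850 µm flux compared at z = 1 and z = 10; the dust temperature T = 38 K and cosmology follow Figure 5.4, while the emissivity index β = 1.5 and the overall normalization are assumptions:

```python
# Sketch of the negative K-correction: observed 850 um flux of a
# fixed-luminosity grey-body dust SED versus redshift. T = 38 K and the
# Omega_m = 0.3, Omega_Lambda = 0.7, h = 0.65 cosmology follow Figure 5.4;
# beta = 1.5 and the normalization are assumptions.
import numpy as np
from scipy.integrate import quad

H0 = 65.0              # km/s/Mpc
OM, OL = 0.3, 0.7
C_KM = 2.998e5         # speed of light, km/s
H_OVER_K = 4.80e-11    # Planck constant over Boltzmann constant, K s

def d_lum(z):
    """Luminosity distance in Mpc for a flat LCDM cosmology."""
    dc, _ = quad(lambda zz: 1.0 / np.sqrt(OM * (1 + zz)**3 + OL), 0.0, z)
    return (1 + z) * (C_KM / H0) * dc

def greybody(nu, temp=38.0, beta=1.5):
    """Optically thin grey body: L_nu ~ nu^(3+beta) / (exp(h nu / k T) - 1)."""
    return nu**(3 + beta) / np.expm1(H_OVER_K * nu / temp)

def flux_850um(z):
    """Observed 850 um flux, arbitrary units."""
    nu_obs = 2.998e8 / 850e-6       # observing frequency, Hz
    nu_rest = nu_obs * (1 + z)      # rest-frame emitted frequency
    return (1 + z) * greybody(nu_rest) / d_lum(z)**2

ratio = flux_850um(10.0) / flux_850um(1.0)
# Order unity, despite the ~250-fold change in D_L^2 between z = 1 and z = 10:
print(f"S(z=10)/S(z=1) = {ratio:.2f}")
```

The steep Rayleigh–Jeans side of the dust spectrum sliding into the observing band almost exactly cancels the dimming with distance, which is why the submm selection function is so uniform.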
Bigger submm- and mm-wave cameras have led to larger surveys of submm- and
mm-wave galaxies. One daring experiment to survey parts of the sky at submm
wavelengths involved dangling a telescope with a 2 m primary mirror from a
weather balloon! In 2006 the Balloon-borne Large Aperture Submm Telescope
(BLAST) flew for 11 days around the South Pole and mapped the Chandra Deep
Field South (CDF-S) field (among other fields), shown in Figure 5.6. Highly
redshifted galaxies should be visible at the longest wavelengths, but less visible at
the shorter wavelengths because the peak of the emission has redshifted past
(Figure 5.3). BLAST used the same detector technology as the SPIRE instrument
on the ESA Herschel Space Observatory, which launched in 2009. The first
images from Herschel have been spectacular. Figure 5.7 shows the local spiral
galaxy M74. Even in this short exposure, there are many background galaxies,
which appear to be clustered.
6 Chapman, S.C. et al., 2005, Astrophysical Journal, 622, 772.
5.2 Submm galaxies and K-corrections
Figure 5.6 Submm-wave maps (including λ = 0.50 mm) of the Chandra Deep Field South region from the BLAST mission. For reference, the same features have been circled in each of the images. The total intensity image is the sum of the three wavelengths.
But what are these SMGs, apart from being galaxies detected at submm
wavelengths? To find out, we need to cross-match the submm-wave objects with
images or catalogues at other wavelengths, including the optical if we want to take
optical spectroscopy of the optical counterpart. However, the diffraction limit of
telescopes makes this difficult: the angular resolution of a circular aperture is
1.22λ/D radians, where λ is the wavelength of the light and D is the diameter of
the aperture.
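The formula above translates directly into beam sizes. As a quick sketch (the telescope diameters and wavelengths below are illustrative choices, not values from the text):

```python
# Diffraction-limited angular resolution, theta = 1.22 * lambda / D (radians).
# The telescope parameters below are illustrative assumptions.
import math

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0

def beam_arcsec(wavelength_m, diameter_m):
    """Angular resolution of a circular aperture, in arcseconds."""
    return 1.22 * wavelength_m / diameter_m * RAD_TO_ARCSEC

# A 15 m submm dish observing at 850 um versus a 2.4 m optical telescope
# observing at 0.6 um:
print(beam_arcsec(850e-6, 15.0))   # ~14 arcsec
print(beam_arcsec(0.6e-6, 2.4))    # ~0.06 arcsec
```

A beam of order 14 arcseconds can contain many plausible optical counterparts, which is exactly the cross-matching problem described here.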
mid-infrared wavelengths are largely the same as the populations that dominate
the submm-wave extragalactic background at < 500 µm. However, much of the
background at 850–1100 µm is still unaccounted for.
We’ve already met the population of Extremely Red Objects (EROs), some of
which appear to be dusty starbursts, particularly at fainter K-band apparent
magnitudes. It turned out that approximately 10–20% of SMGs are also ERO
galaxies, and early indications are that a similar fraction appear to be BzK
starbursts. There is now a considerable variety of definitions of various types of
red galaxies, with often partly overlapping memberships, such as Distant Red
Galaxies (DRGs) with (J–K) > 2.3, or Dust Obscured Galaxies (DOGs) with
S24 /SR > 1000 (where SR is the R-band flux, and S24 is the 24 µm flux). DRGs
and BzK galaxies contribute tens of percent to the cosmic submm background light. (See, for example: Pope, A. et al., 2008, Astrophysical Journal, 689, 127; Knudsen, K.K. et al., 2005, Astrophysical Journal Letters, 632, 9; Takagi, T. et al., 2007, Monthly Notices of the Royal Astronomical Society, 381, 1154.)
One final subtlety is that selection effects may have an insidious effect on the types of galaxies seen in the far-infrared and submm. If there are populations of galaxies with lots of cool dust radiating predominantly at longer wavelengths, we might expect these galaxies to be over-represented in SMG samples. Similarly, galaxies selected in the far-infrared at, say, 70 µm, may tend to have warmer colour temperatures. It is too early to say definitively if this is the case, and to what extent these biases operate, but several observations have been found consistent with the presence of these subtle biases.
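The colour cuts quoted above are simple enough to write down as selection functions; the example magnitudes and fluxes below are hypothetical:

```python
# Sketch of the DRG and DOG colour selections defined in the text:
# (J - K) > 2.3 for Distant Red Galaxies, and S_24 / S_R > 1000 for
# Dust Obscured Galaxies. The example values are hypothetical.

def is_drg(j_mag, k_mag):
    """Distant Red Galaxy: (J - K) colour redder than 2.3 magnitudes."""
    return (j_mag - k_mag) > 2.3

def is_dog(s24_mjy, sr_mjy):
    """Dust Obscured Galaxy: 24 um to R-band flux ratio above 1000."""
    return s24_mjy / sr_mjy > 1000.0

print(is_drg(j_mag=22.9, k_mag=20.2))     # (J - K) = 2.7, so True
print(is_dog(s24_mjy=0.3, sr_mjy=1e-4))   # ratio 3000, so True
```

Note that both cuts are defined in the observed frame, which is why the selected populations overlap only partly with each other and with SMGs.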
5.3 Ultraluminous and hyperluminous infrared galaxies
Figure 5.8 Spectrum of the redshift z = 2.286 hyperluminous galaxy IRAS FSC 10214+4724, plotted as flux density/10^−20 W m^−2 Å^−1 against wavelength/Å from 4000 to 8000 Å. Note the numerous emission lines from ionized gas (marked lines include Ly α, O VI, N V, C IV, He II, N IV], N III], C III], C II], N II], Si IV + O IV], [Ne IV], [Na V], Mg VI and [O II]), some of which are characteristic of starburst galaxies, and others characteristic of active galaxy narrow line regions.
hyper-luminous infrared galaxies (or HLIRGs) with > 10^13 L⊙. Some models predicted different physical mechanisms driving the evolution of these classes, so these divisions were not without physical motivation, though these interpretations were by no means unique. Figure 5.9 shows HST morphologies of ULIRGs; it appears that major galaxy–galaxy mergers are important in the local Universe in triggering ultraluminous starbursts. There was a twist to the story of IRAS FSC 10214+4724: the enormous apparent luminosity turned out to be in part due to gravitational lensing, as we’ll see in Chapter 7, though it remains a prototypical HLIRG, albeit a lensed one. (As a joke I once tried to coin the term ‘überluminous’ in a paper describing hypothetical ∼10^14–10^15 L⊙ galaxies, but the referee (perhaps quite rightly) would have none of it!)
IRAS was followed by the ESA Infrared Space Observatory (ISO) in 1995. As an
observatory rather than a sky survey, it specialized in a few deeper surveys and
follow-ups of individual objects. NASA’s Spitzer Space Telescope, launched in
2003, had a primary mirror with a similar diameter to the ISO, but enormously
more sensitive detectors. Both the ISO and Spitzer resulted in the discovery of
many new ULIRGs and HLIRGs, as well as shedding (metaphorical) light on
many star-forming galaxies. Figure 5.10 shows the Antennae galaxies, a pair
of colliding galaxies observed with the HST and with Spitzer. Some of the
heavily-extincted regions in the HST image are strongly luminous at mid-infrared
wavelengths in the Spitzer image. In the local Universe, ULIRGs contribute
a negligible amount to the cosmic star formation density, but the discovery
of SMGs and the surveys by Spitzer have demonstrated that the fractional
contribution from ULIRGs to the cosmic star formation history increases strongly
with redshift, as we’ll see.
Exercise 5.4 The factor kd has a large uncertainty. Estimates range from
0.04 m2 kg−1 to 0.3 m2 kg−1 at a rest-frame wavelength of 800 µm, i.e. a range of
about a factor of 7, though more typical values are 0.15 ± 0.09 m2 kg−1 . For an
850 µm observation of an SMG at a redshift of z = 3, what would the fractional
range be in the possible dust mass assuming the following?
(a) kd = 0.15 ± 0.09 m2 kg−1 , a grey body emissivity index of β = 1.5 and a
fixed temperature.
(b) The same as (a), except also allowing a grey body emissivity index of
β = 1–2.
(c) The same as (b), but also allowing a range of assumed temperature of
T = 20–40 K (a wide but not unreasonable range for galaxies).
What advantages are there to measuring fluxes at more wavelengths than just
850 µm? ■
Although undoubtedly difficult to measure, the dust masses are often key
predictions for galaxy evolution models, especially those in which giant elliptical
galaxies form at high redshifts, converting gas to stars at high star formation rates and generating large dust masses.
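A common route from a single submm flux to a dust mass is M_d = S_obs D_L² / [(1 + z) k_d B_ν(ν_rest, T)]. The sketch below assumes this convention (details such as the K-correction factor vary between authors), with illustrative values for the temperature, emissivity index and luminosity distance:

```python
# Sketch of a grey-body dust-mass estimate from a single observed submm flux:
# M_d = S_obs * D_L^2 / ((1 + z) * k_d(nu_rest) * B_nu(nu_rest, T)).
# Conventions differ between authors; the parameter values are illustrative.
import numpy as np

H_PLANCK = 6.626e-34   # J s
K_B = 1.381e-23        # J/K
C = 2.998e8            # m/s
MPC = 3.086e22         # m
M_SUN = 1.989e30       # kg

def planck_nu(nu, temp):
    """Planck function B_nu in W m^-2 Hz^-1 sr^-1."""
    return 2 * H_PLANCK * nu**3 / C**2 / np.expm1(H_PLANCK * nu / (K_B * temp))

def dust_mass(s_obs_jy, z, d_lum_mpc, temp=35.0, beta=1.5, kd_800=0.15):
    """Dust mass in solar masses from an observed 850 um flux.
    kd_800 is the mass opacity at rest-frame 800 um in m^2/kg,
    scaled to the rest frequency as k_d ~ nu^beta."""
    nu_obs = C / 850e-6
    nu_rest = nu_obs * (1 + z)
    kd = kd_800 * (nu_rest / (C / 800e-6))**beta
    s_obs = s_obs_jy * 1e-26                   # Jy -> W m^-2 Hz^-1
    d_l = d_lum_mpc * MPC
    return s_obs * d_l**2 / ((1 + z) * kd * planck_nu(nu_rest, temp)) / M_SUN

# e.g. a 5 mJy SMG at z = 2.5 with an assumed D_L of ~20,000 Mpc:
print(f"{dust_mass(5e-3, 2.5, 2.0e4):.1e} M_sun")
```

Running this with the stated assumptions gives a dust mass of a few ×10^8 M⊙, and Exercise 5.4 shows how strongly the answer depends on the assumed kd, β and temperature.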
dust in galaxies is roughly around 70–130 µm, as shown in Figures 5.1 and 5.3.
Could we use this as our star formation rate indicator? How do we know that the
dust is heated by star formation, and not (for example) just by the ambient
interstellar radiation or from an active nucleus (Chapter 6)? In our Galaxy, the
ambient interstellar light heats the ambient dust, and the thermal radiation from
that dust has been detected in the all-sky far-infrared surveys from IRAS and the
Japanese AKARI space telescope. This dust has come to be known as cirrus
owing to its wispy appearance. (This foreground cirrus structure also places a limit on some deep-field observations, similar to point source confusion, known as cirrus confusion noise; see, for example, Gautier, T.N. III et al., 1992, Astronomical Journal, 103, 1313. The power spectrum of cirrus is approximately P(k) ∝ k^−3, where k is inverse angle, so cirrus is smoother on smaller scales; thus observations with larger beams are more susceptible to cirrus confusion.)
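The statement that cirrus is smoother on smaller scales can be checked by synthesizing a random field with P(k) ∝ k^−3; the grid size and normalization below are arbitrary choices:

```python
# Sketch: synthesize a Gaussian random field with a cirrus-like power
# spectrum P(k) ~ k^-3 and check that it is smoother on smaller scales.
# Grid size and spectrum normalization are arbitrary.
import numpy as np

rng = np.random.default_rng(42)
n = 256
ky, kx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
k = np.hypot(kx, ky)
k[0, 0] = 1.0  # avoid division by zero at the mean mode

# Fourier amplitudes ~ sqrt(P(k)) = k^-1.5, with random phases:
amp = k**(-1.5) * np.exp(2j * np.pi * rng.random((n, n)))
amp[0, 0] = 0.0
field = np.fft.ifft2(amp).real

def rms_increment(f, lag):
    """RMS difference between pixels separated by `lag` along one axis."""
    return np.std(f - np.roll(f, lag, axis=0))

# Small-scale increments are much smaller than large-scale ones:
print(rms_increment(field, 1), rms_increment(field, 64))
```

Because the power is concentrated at small k (large angular scales), neighbouring pixels differ far less than widely separated ones, which is why large beams pick up more cirrus confusion.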
Ultimately, the case for using the far-infrared luminosity to measure star formation
(like all other estimators) rests on astrophysical plausibility. Radiative transfer
models predict that the cirrus contribution is cooler than star-forming giant
molecular clouds (Orion, for example, is warmer than Galactic cirrus, shown in
Figure 5.11), and in star-forming galaxies the cirrus component is predicted to be
lower luminosity than the far-infrared emission from star formation. Supermassive
black hole accretion (Chapter 6) in a galaxy’s active nucleus also heats dust in the
circumnuclear torus, but most models predict this contribution to dominate in the
mid-infrared rather than at longer wavelengths. Far-infrared measurements
capture the obscured star formation but still aren’t free of IMF assumptions,
because the massive stars (above 5 M⊙ or so) dominate the dust heating.
Another star formation rate indicator relies on an entirely different physical
process, and is completely independent of dust obscuration: the radio luminosity.
Supernovae from massive stars accelerate charged particles, which spiral along
the field lines of the galaxy’s magnetic field, emitting synchrotron radiation. The
synchrotron luminosity should therefore be proportional to the recent supernova
rate in the galaxy. This synchrotron radiation dominates at radio wavelengths
(e.g. metre-scale) with a power-law spectrum Sν ∝ ν^−α with α ≈ 0.7 to 1.0 depending on the energy distribution of the charged particles (which itself depends on time). Dust clouds are transparent to this radiation, and synchrotron-emitting regions are themselves optically thin to synchrotron radiation, so the radio luminosity is ostensibly obscuration-independent. However, it’s still subject to the IMF. Figure 5.12 shows schematically how stars of different masses contribute to the radio, far-infrared and ultraviolet luminosities. All three are sensitive to massive stars, to varying degrees, but none covers stars less massive than 5 M⊙.
Figure 5.11 The Orion nebula, as seen by the AKARI space telescope at 140 µm. The constellation itself is marked in white. The bright far-infrared knot at the location of the nebula (inside the bottom of the constellation) is caused by dust-shrouded star formation.
These three star formation rate indicators are sensitive to different parts of the galaxy too: the ultraviolet traces the unobscured regions, the far-infrared traces the dust-shrouded regions, and the radio traces the total. Perhaps it’s not too surprising, then, that the radio and far-infrared luminosities of galaxies correlate, as in Figure 5.13. This correlation was discovered by George Helou, Tom Soifer and Michael Rowan-Robinson in 1985 (Helou, G., Soifer, B.T. and Rowan-Robinson, M., 1985, Astrophysical Journal Letters, 298, 7). However, the tightness of the radio–far-infrared correlation over nearly four orders of magnitude in luminosity is an unsolved puzzle, as is the physical origin of the normalization. The radio–ultraviolet and far-infrared–ultraviolet correlations, meanwhile, are less tight.
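For a pure power law Sν ∝ ν^−α, the K-correction takes a simple closed form, L_ν(rest) = 4π D_L² S_ν (1 + z)^(α−1). A minimal sketch, in which the flux, redshift and luminosity distance are hypothetical values:

```python
# Sketch of the radio K-correction for a power-law synchrotron spectrum
# S_nu ~ nu^-alpha:  L_nu(rest) = 4 * pi * D_L^2 * S_nu * (1 + z)^(alpha - 1).
# The flux, redshift and luminosity distance below are hypothetical.
import math

MPC = 3.086e22  # metres per megaparsec

def radio_luminosity(s_nu_jy, z, d_lum_mpc, alpha=0.8):
    """Rest-frame 1.4 GHz luminosity in W/Hz from an observed 1.4 GHz flux."""
    s_nu = s_nu_jy * 1e-26              # Jy -> W m^-2 Hz^-1
    d_l = d_lum_mpc * MPC
    return 4 * math.pi * d_l**2 * s_nu * (1 + z)**(alpha - 1)

# e.g. a 50 uJy source at z = 1 with an assumed D_L of ~7000 Mpc:
print(f"{radio_luminosity(50e-6, 1.0, 7000.0):.2e} W/Hz")
```

The (1 + z)^(α−1) factor is the whole K-correction, which is one reason radio surveys are attractive for tracing star formation, if only the redshifts of the faint radio sources were easier to obtain.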
5.4 Measuring star formation rates
Figure 5.12 The relative contributions to the far-infrared luminosity LFIR, the supernova rate νSN and the ultraviolet ionizing radiation Nuv, per logarithmic interval of stellar mass M/M⊙, assuming an IMF varying as M^−5/2 up to 100 M⊙.
Figure 5.13 The correlation between far-infrared and radio (1.49 GHz) luminosities of star-forming galaxies, plotted as log10[h^2 L1.49/W Hz^−1] against log10(h^2 LFIR/L⊙). Galaxies with active nuclei have been excluded from this plot.
Ideally, we’d like to track the cosmic star formation history with far-infrared,
radio and ultraviolet tracers at the same time. However, we’ve seen that the
diffraction limit of telescopes limits far-infrared observations. Moreover,
the atmosphere is opaque in most far-infrared wavelengths, and the Earth’s
atmosphere is strongly luminous in the few transparent windows.
Exercise 5.5 Why would a high background flux from (for example) the
Earth’s atmosphere limit astronomical observations? Demonstrate your answer
using Poisson statistics. ■
Radio observations get around the diffraction limit with interferometry, but this is
technically very challenging in the far-infrared partly because of the more
stringent timing requirements at higher frequencies. A solution to the terrestrial
sky background is space telescopes, but it’s difficult to launch large primary
mirrors into space. The largest so far is Herschel (Exercise 5.3). The proposed
Japanese SPICA space telescope will have a similar aperture, but will cool the
optics to reduce the background from the telescope and increase the sensitivity. At
the time of writing, there are also ambitious proposals with both NASA and ESA
for future far-infrared space-based interferometers.
The coarse angular resolution of far-infrared images makes it difficult to identify which optical galaxy is the far-infrared emitter: the angular size of the submm image in Figure 5.5 is approximately the same as the whole Hubble Deep Field North. This also increases the confusion noise. However, there are some possible shortcuts. One method is to use stacking analyses (Chapter 2), such as averaging together the far-infrared images of galaxies detected at other wavelengths, in order to measure the average far-infrared flux for those galaxies. Another method is to see what else the far-infrared luminosities correlate with. In the local Universe, the mid-infrared luminosities correlate with bolometric luminosities (i.e. total luminosities). The shorter wavelengths in the mid-infrared lead to higher angular resolutions. Figure 5.14 shows stacked far-infrared images of 24 µm-selected galaxies, implying that this correlation exists at higher redshifts too.
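A stacking analysis of the kind described above can be sketched on simulated data: plant faint sources in a noisy map, then average cutouts at the known positions (all the numbers below are arbitrary choices):

```python
# Minimal sketch of a stacking analysis: average far-infrared cutouts at the
# positions of galaxies detected at another wavelength. Here the "map" is a
# simulated noisy image with faint sources planted at known positions.
import numpy as np

rng = np.random.default_rng(1)
n, n_src, flux, noise = 512, 200, 0.5, 1.0
image = rng.normal(0.0, noise, (n, n))
positions = rng.integers(8, n - 8, size=(n_src, 2))
for y, x in positions:
    image[y, x] += flux  # plant a faint source, below the 1-sigma noise

def stack(image, positions, half=4):
    """Average cutout centred on each position."""
    cutouts = [image[y - half:y + half + 1, x - half:x + half + 1]
               for y, x in positions]
    return np.mean(cutouts, axis=0)

stacked = stack(image, positions)
centre = stacked[4, 4]
# Averaging 200 cutouts beats the noise down by sqrt(200), so the 0.5-sigma
# sources emerge clearly in the stack:
print(f"stacked central value = {centre:.2f} (input flux {flux})")
```

The price of stacking is that only the average flux of the input sample is recovered, not the fluxes of individual galaxies.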
But why should mid-infrared luminosities correlate with far-infrared? The dust emitting the mid-infrared light is physically distinct from the far-infrared emission. In order to be radiating at these shorter wavelengths, the dust must be hotter. It turns out that the dust grains responsible are small (less than 0.1 µm, or even as small as tens of atoms), and are transiently heated sometimes by a single photon. The mid-infrared spectra of star-forming galaxies have strong spectral signatures (see, for example, Figure 5.3), often (but not always) attributed to polycyclic aromatic hydrocarbons (or PAHs). This small-grained and PAH dust is heated by very short-wavelength light, because wavelengths much longer than the grain size are largely unaffected by these small dust particles, so the mid-infrared spectra measure the dust-shrouded ultraviolet light from O and B stars. The PAH emission line ratios can also be used to investigate the dust composition, and at around 10 µm there is an additional absorption feature from silicate dust grains. These mid-infrared spectral features do, however, make the K-corrections quite complicated. This can make it difficult to estimate the mid-infrared luminosities, but the Japanese AKARI space telescope turned this to its advantage and made deep surveys in many mid-infrared filters, in order to make mid-infrared photometric redshifts possible.
Figure 5.14 The average 70 µm (left) and 160 µm (right) images of 24 µm-selected galaxies, for a variety of ranges of 24 µm flux: 0.599–0.892, 0.337–0.389, 0.190–0.219, 0.107–0.123 and 0.080–0.089 mJy. Most of these 24 µm galaxies are individually undetected at longer wavelengths. For fainter 24 µm fluxes, the detections in the average images are noisier, but nonetheless significant.
One disadvantage of using the mid-infrared for estimating star formation rates is that black hole accretion can also contribute. The dust tori in active galactic nuclei (Chapter 4) are predicted to emit the peak of their radiation at exactly these wavelengths. However, the thermal spectra from dust tori are largely featureless (notwithstanding a 10 µm absorption feature) so mid-infrared spectra can in principle distinguish star formation from black hole accretion.
The proposed successor to the HST, the James Webb Space Telescope (JWST),
will have an expected collecting area of 25 m2 , and will operate from 0.6 µm to
28 µm. SPICA should still out-perform the JWST in the mid-infrared,
despite SPICA’s smaller mirror, because of its cooled optics. The proposed
ESA planet-finding mission Darwin may also be able to take high-resolution
mid-infrared images, using mid-infrared space-based interferometry.
One more star formation rate measure deserves a brief mention. High-mass
X-ray binary stars (HMXBs) have one massive star, emitting a wind that is
accreted by a companion neutron star, and the accretion is responsible for the X-ray radiation. Since massive stars are shorter-lived, the numbers of HMXBs must be a measure of the recent star formation rate. In practice the high-redshift X-ray luminosities from star formation are often overwhelmed by the emission from supermassive black hole accretion, though this star formation rate estimator can be useful in some systems. For more details, see the further reading section.
5.5 Multi-wavelength surveys
area. Note that in any single tier, there is a strong correlation of luminosity with
redshift (Section 4.5), which as a selection effect is also known as Malmquist
bias. However, by combining surveys of different depths, there is better coverage
of the luminosity–redshift plane.
Figure 5.15 Simulated catalogues for planned surveys with the Herschel Space Observatory. Luminosity per
unit solid angle is plotted against redshift, with objects in each survey colour-coded. The luminosities of the local
starbursts Arp 220 and M82 are marked with dashed lines. The wide and shallow surveys cover the upper parts of
this figure, while the deep and narrow pencil-beam surveys cover the lower parts of the figure. In each survey in
isolation, redshift is correlated with luminosity, due to the Malmquist bias selection effect. Taking the surveys as an
ensemble improves the coverage of the luminosity–redshift plane. The slight horizontal banding is an artefact of
the simulation, but the paucity of high-luminosity objects at low redshifts is real and due to Malmquist bias.
● Why are there few high-luminosity objects at low redshift in any of the
surveys?
❍ This is partly because the luminosity function evolves, so high-luminosity
objects are intrinsically more common at higher redshifts. However, it’s also
because the amount of comoving volume sampled per unit redshift is smaller
at low redshift than at high redshift (Figure 1.19).
Exercise 5.6 Draw (or otherwise indicate) the approximate line of any flux
limit in Figure 5.15. ■
5.6 Cosmic star formation history and stellar mass assembly
the rest-frame ultraviolet luminosity density. There’s a strong increase from the
local Universe to redshift z = 1 seen in radio surveys, but it’s difficult to extend
this to higher redshifts because of the difficulty in obtaining reliable redshift
estimates for the radio-selected starbursts.
However, this has been achieved for some submm-wave surveys. In these, the
negative K-corrections give a very uniform selection function over most of the
Hubble volume, as we’ve seen. Figure 5.17 shows the cosmic star formation
history of SMGs. While this seems similar to the ultraviolet star formation
history, remember that the observed optical identifications of SMGs are often faint
or unprepossessing optical galaxies, and not strongly luminous in the rest-frame
ultraviolet. Therefore this star formation is in addition to the ultraviolet-derived
star formation rate!
Meanwhile, the Spitzer Space Telescope has made it possible to conduct
wide-field and deep extragalactic surveys at 3–8 µm. In the near-infrared, the
luminosities of galaxies are insensitive to star formation (see Chapter 4) and
relatively insensitive to dust obscuration, notwithstanding the possibility of a large
proportion of old stars in dust clouds. We could therefore use the near-infrared
luminosity density to measure the total density in stars, Ω∗ . By definition, this
must be an integral of the cosmic star formation history. These constraints are
shown in Figure 5.16. Differentiating this with respect to time gives the cosmic
star formation history in Figure 5.17. Surprisingly, these constraints turn out not
to be consistent with the observed cosmic star formation history!
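The statement that Ω∗ is an integral of the star formation history (and that differentiating recovers it) can be sketched numerically. The data below are invented purely to illustrate the finite-difference step, and are not the measurements plotted in Figures 5.16 and 5.17:

```python
# Sketch: recovering a star formation rate density by differentiating a
# stellar mass density history, as in the step from Figure 5.16 to
# Figure 5.17. The numbers below are illustrative, NOT the book's data.
t_gyr = [4.0, 6.0, 8.0, 10.0, 12.0, 13.7]               # cosmic time (assumed grid)
rho_star = [1.0e8, 1.8e8, 2.4e8, 2.7e8, 2.85e8, 2.9e8]  # M_sun Mpc^-3 (made up)

sfrd = []                                    # M_sun yr^-1 Mpc^-3
for i in range(len(t_gyr) - 1):
    dt_yr = (t_gyr[i + 1] - t_gyr[i]) * 1e9  # Gyr -> yr
    sfrd.append((rho_star[i + 1] - rho_star[i]) / dt_yr)

print(sfrd)  # declines towards the present day
```

Any tension between the measured star formation history and the derivative of the stellar mass density then shows up as a mismatch between curves like these.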
Figure 5.16 The stellar mass density of the Universe, log₁₀[ρ∗/M⊙ Mpc⁻³] (equivalently ρ∗/ρ∗(z = 0)), as a function of redshift z and of lookback time in Gyr. A compilation of earlier estimates is shown in coloured symbols, while the most precise determination so far is shown in black symbols. Open circles show the direct measurements, while filled stars incorporate an additional correction for galaxies below the flux limits of the data.
Chapter 5 The distant multi-wavelength Universe
What could cause this discrepancy? There are many potential sources of
systematic uncertainties that we’ve discussed in this chapter and the previous
one. Perhaps, for example, the dust extinction corrections need revision, or
perhaps some of the star formation rate indicators have additional unrecognized
contributions from AGN. A more fundamental underlying cause could be changes
in the IMF. Indeed, some semi-analytic models assume a strongly top-heavy IMF
in order to be able to reproduce the SMG population without overproducing Ω∗ .
The revisions to the IMF are radical, though. Many observations point to the same
or very similar IMFs, but there is evidence that the IMF is strongly top-heavy in
the central parsec of our Galaxy (see, for example, Larson, 2005, in the further reading section, and Bartko, H. et al., 2009, arXiv:0908.2177).
Another surprise has been the high-redshift reversal of the relationship between star formation and environment. In the local Universe, star-forming galaxies avoid rich environments such as the cores of galaxy clusters (see Section 3.7). However, there is increasing evidence that at z ≳ 1, star-forming galaxies detected at 24 µm are more common in richer environments. This is already a challenge to
semi-analytic models, and might suggest that mergers were more important in
triggering star formation at higher redshifts. There are also hints that SMGs
are found in richer environments than non-starbursting galaxies. The natural
generalization of the Butcher–Oemler effect (Chapter 3) is to measure the cosmic
star formation history in different environments, and this is the subject of much
active research.
5.7 Downsizing
When did galaxies form? Up to the 1980s, the tendency was to speak of an epoch
of galaxy formation at some (as yet unfathomed) redshift, perhaps involving
monolithic collapse (Chapters 3 and 4). The advent of deep HST imaging
changed the terminology and the thinking, with galaxy formation then being seen
as an ongoing process. The growing sophistication of N -body and semi-analytic
simulations led to a growing acceptance of hierarchical galaxy formation,
following the schematic picture in Figure 4.2.
The discovery of SMGs at redshifts z > 2 with enormous star formation rates of
thousands of solar masses per year therefore came as a tremendous surprise.
Evidence mounted that present-day massive galaxies formed earlier than
present-day less-massive galaxies. This is exactly the opposite to what you might
expect from hierarchical structure formation (Figure 4.2). Figure 5.18 shows the
fraction of stellar mass assembled in present-day galaxies as a function of
redshift. A similar trend is seen in direct measurements of the star formation
history at an observed wavelength of 24 µm (Figure 5.19).
Figure 5.18 The fraction of the present-day assembled stellar mass, ρ∗/ρ∗(z = 0) as a percentage, plotted against redshift z (and lookback time in Gyr) for a variety of galaxy masses, in bins from 10^10.0 < M/M⊙ < 10^11.0 up to M/M⊙ > 10^12.0. Note that the most massive galaxies formed their stars earlier in the history of the Universe.
Note that massive starbursts comprised a greater proportion of the cosmic star
formation rate at higher redshifts. This is so striking that it was sometimes
called ‘anti-hierarchical’. Nevertheless, hierarchical semi-analytic models were
ultimately able to account for these populations, though needing to invoke
non-standard IMFs (Section 5.6) or particular feedback mechanisms, which we
shall meet in Section 5.8 and in Chapter 6. A more apt and widely-used term to
describe this top-down galaxy formation is downsizing.
If SMGs are the progenitors of giant elliptical galaxies, then we’d expect them to
be in massive dark matter haloes, and (according to semi-analytic models)
strongly biased tracers of the underlying dark matter distribution. They should
therefore cluster strongly. There have been hints of strong clustering in the SCUBA surveys, but one of the key aims of the new SCUBA-2 camera on the JCMT is to measure the clustering of SMGs. The BLAST balloon-borne telescope has already inferred a clustering signal in SMGs from fluctuations in its background measurements, consistent with z ∼ 1 SMGs having typical halo masses of ∼10¹³ M⊙. Also, it appears that the central brightest cluster galaxies (BCGs) in z > 1 galaxy clusters have already formed > 90% of their stellar masses by z = 1.5, compared to their z = 0 counterparts (see Collins, C.A., 2009, Nature, 458, 603). This suggests that the formation of BCGs is more akin to monolithic collapse than to the result of repeated mergers. The forthcoming wide-field surveys by SCUBA-2 and Herschel may find the rare violent starbursts that accompanied this collapse.
Figure 5.20 A simulation of a collision between two galaxies, with (top) and
without (bottom) energy input from accretion round a supermassive black hole.
The gas temperature distribution is colour-coded, blue to red. The maximum in
the star formation and black hole accretion is at 1.6 Gyr, when the galaxies merge.
The energy output from black hole accretion expels the gas from the inner regions
of the merged galaxy after this stage. However, without this energy input, the
result would be very different.
Another source of energy input and a cause of gas expulsion is supernovae. Local
starburst galaxies often have supernova-driven ‘superwinds’ of gas being expelled
from the galaxy. The amount of star formation (or black hole accretion) can
therefore affect the future star formation; in general, this is known as feedback.
Both star formation and black hole accretion are very complex phenomena and
are so far best addressed with large numerical simulations, the results of which are
used in generic ways in semi-analytic models. The role of feedback is perhaps the
single greatest unknown in our understanding of galaxy evolution.
The intra-cluster medium in galaxy clusters is enriched with heavy elements,
which is good evidence that galactic winds have played an important role in
galaxy evolution. The figure on the cover of this book shows an example galactic
wind observed with the Chandra X-ray telescope. Numerical simulations show
that supernova-driven winds succeed in driving out most of the gas only in dwarf galaxies (< 10⁸ M⊙). However, the lack of resolution in these simulations means
that they don’t account for the Rayleigh–Taylor instability (which could help the
hot wind escape) or the Kelvin–Helmholtz instability. Alternatively, an outflow
driven by the energy from black hole accretion could also lead to a bigger wind
from the galaxy than supernovae can generate on their own.
AGN feedback could also solve the cooling flow problem in galaxy clusters
(Section 3.9). Figure 5.21 shows a smoothed X-ray image of the Hydra A galaxy
cluster (greyscale) tracing the hot gas of the intra-cluster medium, superimposed
on contours of radio flux density from the radio lobes. The X-ray gas temperature
and profile is consistent with a cooling flow, but the AGN radio lobes have cleared
out cavities. Perhaps AGN radio lobes are the mechanism by which cooling flows
are stopped or regulated? A further source of mechanical energy input from the
AGN was found in Chandra X-ray images of the Perseus galaxy cluster, shown
in Figure 5.22 (the cavities are similarly due to radio lobes). After an image
processing technique called ‘unsharp masking’ (which amplifies high-frequency
Fourier components in the image, while suppressing low-frequency components),
large-scale ripples are seen (Figure 5.23). The lack of temperature changes
in these oscillations led to them being interpreted as acoustic waves with a
wavelength of about 11 kpc and a period of 9.6 × 106 years, or a frequency
about 57 octaves below middle C.8 AGN can therefore inject energy into the
surrounding medium in two ways: radiation energy and mechanical energy. These
are sometimes called radiative mode and kinetic mode.
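The 'octaves below middle C' figure is a quick logarithm that you can check for yourself. A sketch (the result lands within an octave of the quoted value, depending on rounding and on the reference pitch assumed for middle C):

```python
# Sketch: checking the 'octaves below middle C' arithmetic for the
# Perseus ripples, from the quoted period of 9.6 million years.
import math

period_s = 9.6e6 * 3.156e7   # 9.6 million years in seconds
f_ripple = 1 / period_s      # frequency of the ripples, Hz
f_middle_c = 261.6           # middle C, Hz (assumed reference pitch)
octaves = math.log2(f_middle_c / f_ripple)
print(f"about {octaves:.0f} octaves below middle C")
```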
Figure 5.21 Smoothed X-ray image of the galaxy cluster Hydra A (grey shading), compared to the radio lobes from the active nucleus (contours). The bar marks a distance of 20 arcseconds.
Figure 5.22 X-ray image of the Perseus galaxy cluster, colour-coded by temperature. The image is 350 arcseconds across, or 131 kpc.
Figure 5.23 The result of applying the unsharp masking technique to Figure 5.22, which removes large-scale features and accentuates small-scale features, revealing subtle ripples.
AGN may also play a more complex role than simply shutting down star
formation and expelling gas. Radio jets from radio-loud AGN have been observed
to trigger star formation in some systems, and there is at least one quasar in which
this jet-induced star formation has been argued9 to pre-date the formation of the
quasar host galaxy.
⁸ The press referred to this as the 'deepest bass' note in the Universe.
⁹ See Elbaz, D. et al., 2009, arXiv:0907.2923, and references therein.
Summary of Chapter 5
1. The energy density in the cosmic optical/near-infrared background is about
the same as the energy density in the cosmic far-infrared background.
2. The observed cosmic backgrounds from optical to far-infrared wavelengths
are closely linked to the evolving comoving luminosity densities:
I_\nu(\nu_0) = \frac{1}{4\pi} \int \frac{E_\nu}{1+z}\,\mathrm{d}d_{\rm comoving}. \qquad (Eqn 5.5)
At any redshift, the populations that contribute most to the background will
be the same as those that contribute most to the corresponding luminosity
density. This links the extragalactic background light to the cosmic star
formation history.
3. A common unit used in astronomy is the jansky (symbol Jy), where 1 Jy = 10⁻²⁶ W m⁻² Hz⁻¹.
4. No-evolution source count models don’t follow the Euclidean slope at the
faint end, partly because of K-corrections and partly because of the
redshift-dependence of luminosity distance. In other words, the source
counts of a non-evolving population in a flat non-expanding universe are
very different to no-evolution counts in a flat expanding universe.
5. K-corrections often strongly affect the redshift distributions of surveys.
6. The angular resolution of a telescope in radians is given by θ = 1.22λ/D,
where λ is the observed wavelength and D is the diameter of the telescope’s
primary mirror. The large diameters of submm-wave telescopes are not
enough to compensate for the large wavelengths, so optical telescopes have
sharper images. Many optical galaxies can sometimes be found within the
observed position of an SMG, and it’s not always obvious which optical
galaxy is responsible for the submm flux. Other information is needed in
these cases.
7. Infrared-luminous galaxies are sometimes described as LIRGs (10¹¹–10¹² L⊙), ULIRGs (10¹²–10¹³ L⊙) and HLIRGs (> 10¹³ L⊙).
Locally, ULIRGs and HLIRGs appear to be galaxy–galaxy mergers.
8. Far-infrared luminosity in galaxies is not always cospatial with
optical-ultraviolet light.
9. There are many assumptions in calculating dust masses: the values of kd , the
temperature and the grey body index are needed, even if assuming a single
dust temperature.
10. There are many methods of determining star formation rates, such as
ultraviolet luminosities from young stars, far-infrared luminosities from star
formation in giant molecular clouds, or radio luminosities deriving
ultimately from supernovae. However, all measure the numbers of massive
stars, so one needs to assume an initial mass function to extrapolate to all
stellar masses.
11. The far-infrared and radio luminosities of star-forming galaxies correlate
strongly.
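Point 6 of the summary is easy to quantify. A sketch comparing an optical and a submm-wave telescope (the mirror sizes and wavelengths are illustrative assumptions, not values fixed by this chapter):

```python
# Sketch: diffraction-limited angular resolution theta = 1.22 * lambda / D,
# converted to arcseconds, for two illustrative telescopes.
import math

def resolution_arcsec(wavelength_m, diameter_m):
    """Diffraction limit theta = 1.22 * lambda / D, in arcseconds."""
    theta_rad = 1.22 * wavelength_m / diameter_m
    return math.degrees(theta_rad) * 3600

# Assumed example telescopes (illustrative parameters):
optical = resolution_arcsec(550e-9, 2.4)   # 2.4 m mirror at 550 nm
submm = resolution_arcsec(850e-6, 15.0)    # 15 m dish at 850 um
print(f"optical: {optical:.3f} arcsec; submm: {submm:.1f} arcsec")
```

Even with a much larger dish, the submm-wave beam is hundreds of times coarser, which is why identifying the optical counterpart of an SMG is hard.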
Further reading
• Kolb, U.C., 2009, Extreme Environment Astrophysics, Cambridge University
Press.
• The Space Telescope Science Institute has an online tool for converting
between janskys, magnitudes and other systems, currently at
[Link] [Link].
• Condon, J.J., 1992, ‘Radio emission from normal galaxies’, Annual Review of
Astronomy and Astrophysics, 30, 575.
• De Zotti, G. et al., 2009, ‘Radio and millimeter continuum surveys and their
astrophysical implications’, arXiv:0908.1896.
• Hauser, M.G. and Dwek, E., 2001, ‘The cosmic infrared background:
measurements and implications’, Annual Review of Astronomy and
Astrophysics, 39, 249.
• Kennicutt, R.C., 1998, ‘Star formation in galaxies along the Hubble sequence’,
Annual Review of Astronomy and Astrophysics, 36, 189.
• Larson, R.B., 2005, ‘Thermal physics, cloud geometry and the stellar initial
mass function’, Monthly Notices of the Royal Astronomical Society, 359, 211.
• For more about the BLAST mission, including its crash and the heroic recovery
of its data, see the BLAST web page currently at [Link].
Chapter 6 Black holes
Black holes . . . are the most perfect macroscopic objects there are in the
universe: the only elements in their construction are our concepts of space
and time.
S. Chandrasekhar
Introduction
Where are the biggest black holes in the Universe? Why does every galaxy
contain a giant black hole at its centre, and how can we tell? And where did they
come from?
We’ll see in this chapter that black holes have been extremely important in galaxy
evolution. Most of the light that’s generated in the Universe has, ultimately, two
main origins: the release of nuclear binding energy through nuclear reactions in
stars, and the release of gravitational binding energy through accretion onto black
holes. This accretion luminosity is extraordinarily efficient — black holes turn out
to be not so black after all.
the equations above, black holes spinning above the maximal rate have no event horizons. It turns out that it then becomes possible for particle world-lines around the black hole to be closed time-like loops, i.e. the black hole would be a time machine! (This view has recently been challenged by Jacobson, T. and Sotiriou, T.P., arXiv:0907.4146.) Roger Penrose has proposed a 'cosmic censorship hypothesis' that there are no naked singularities in nature, i.e. no singularities without event horizons.
At the time of writing, this remains unproven in classical gravity (unless naked
singularities are taken as pre-existing). The hypothesis is the subject of a bet
between Stephen Hawking, who contends that it is correct, and Kip Thorne and
John Preskill, who contend that it is not. The stake of the bet (re-formulated in
1997 to eliminate possible loopholes) is: ‘The loser will reward the winner with
clothing to cover the winner’s nakedness. The clothing is to be embroidered
with a suitable, truly concessionary message.’ A related conjecture by Stephen
Hawking is the ‘chronology protection conjecture’ that fundamental physics
forbids closed time-like loops. This conjecture would be addressed by, or could
form part of, a future theory of quantum gravity.
General relativity is time-symmetric, so one could conceive of the time-reverse of
black holes, sometimes called ‘white holes’. Perhaps entropic reasons forbid
them, for a reason similar to why molecules in a lake do not suddenly conspire to
throw pebbles out. We shall not dwell on this, except to say that many deep
problems in classical and quantum gravity involve entropic aspects of gravity.
White holes also occur in discussions of the Kerr metric. The curvature singularity
in the Kerr metric is ring-shaped, unlike the point-like Schwarzschild singularity.
Infalling matter in a Kerr metric could pass through a further inner horizon and through the ring, and, if the same metric holds past these points, would emerge out
of a white hole — where? This has been the subject of much speculation in
science fiction. But an infalling traveller attempting this journey would receive an
infinite flux of radiation that fell into the hole from the other side, infinitely
blueshifted — an infinitely ferocious gauntlet to run. Clearly, there is still much to
be understood about these singular regions.
Black holes can form as the end-point of the most massive stars’ evolution,
when nuclear reactions or degeneracy pressure can no longer support a stellar
core against gravity. This conclusion was first reached by Chandrasekhar and
infamously opposed by Sir Arthur Eddington, who said: ‘I think there should be a
law of nature to prevent a star from behaving in this absurd way.’ Hindsight has
shown Chandrasekhar to be correct, but ironically Eddington’s name has become
associated with black hole accretion, as we shall see in the next section. One
might nevertheless share a similar reaction to the inevitable curvature singularities
inside black holes.
Exercise 6.1 Demonstrate this, using some quantified argument of your own
invention. (Like Exercises 4.4 and 4.11 this is a more open-ended exercise.) ■
The inward and outward forces balance when F_photon = F_gravity, so

\frac{\sigma_{\rm T} L}{4\pi c r^2} = \frac{G M_{\rm BH} m_{\rm p}}{r^2}. \qquad (6.5)

Remarkably, the r² terms cancel, so the balance is independent of radius. The luminosity at this balance is

L_{\rm E} = \frac{4\pi G c\, M_{\rm BH} m_{\rm p}}{\sigma_{\rm T}}, \qquad (6.6)

which is known as the Eddington luminosity, after Sir Arthur Eddington who first made these calculations in the context of stellar opacity.
In black hole accretion, this luminosity is generated by the accretion disc, so
L ∝ dMacc /dt. Ultimately, some of this energy output comes from the release of
gravitational binding energy of matter falling towards the black hole. This
happens because friction in the accretion disc leads to energy losses through
thermal radiation, so the orbital radius of the matter decreases. As the accreting
matter approaches the black hole, some of the combined mass-energy is converted
to luminosity. The accretion luminosity can be expressed in terms of the
mass-energy accretion rate (E = mc2 so Ė = ṁc2 ):
dMacc 2
L=η c , (6.7)
dt
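Putting numbers into Equations 6.6 and 6.7 gives a feel for the scales involved. A sketch (the 10⁸ M⊙ black hole mass and the efficiency η = 0.1 are illustrative assumptions, not values fixed by the text):

```python
# Sketch: evaluate the Eddington luminosity (Eqn 6.6) and the accretion
# rate that would sustain it (inverting Eqn 6.7), for assumed parameters.
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8          # speed of light, m s^-1
m_p = 1.673e-27      # proton mass, kg
sigma_T = 6.652e-29  # Thomson cross-section, m^2
M_sun = 1.989e30     # kg
L_sun = 3.828e26     # W
yr = 3.156e7         # seconds per year

M_BH = 1e8 * M_sun   # assumed black hole mass
L_E = 4 * math.pi * G * c * M_BH * m_p / sigma_T   # Eqn 6.6

eta = 0.1                    # assumed accretion efficiency
Mdot = L_E / (eta * c**2)    # accretion rate giving L = L_E, from Eqn 6.7

print(f"L_E  = {L_E:.2e} W ({L_E / L_sun:.1e} L_sun)")
print(f"Mdot = {Mdot * yr / M_sun:.1f} M_sun per year")
```

A 10⁸ M⊙ black hole accreting a couple of solar masses per year can therefore outshine an entire galaxy of stars.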
6.2 The Eddington limit
6.3 Accretion efficiency
(The figure here sketches a path x(t) running from x(t₀) at time t₀ to x(t₁) at time t₁, together with a neighbouring path displaced from it by a small wiggle ε(t).)
We set ε(t0 ) = ε(t1 ) = 0 so that the paths start and end at the same points.
The velocity on the least-action path is v(t) = dx(t)/dt, while on the
neighbouring path it’s v(t) + ε̇(t). To first order, the Lagrangian L(x, v) on
the neighbouring path will be
L(x + \varepsilon, v + \dot{\varepsilon}) = L(x, v) + \varepsilon(t)\,\frac{\partial L}{\partial x} + \dot{\varepsilon}(t)\,\frac{\partial L}{\partial v}, \qquad (6.11)

so the action on the neighbouring path will be A + δA, where

\delta A = \int_{t_0}^{t_1} \left[ \varepsilon(t)\,\frac{\partial L}{\partial x} + \frac{d\varepsilon}{dt}\,\frac{\partial L}{\partial v} \right] dt. \qquad (6.12)

Integrating the second term by parts gives

\int_{t_0}^{t_1} \frac{d\varepsilon}{dt}\,\frac{\partial L}{\partial v}\,dt = \left[ \varepsilon(t)\,\frac{\partial L}{\partial v} \right]_{t_0}^{t_1} - \int_{t_0}^{t_1} \varepsilon(t)\,\frac{d}{dt}\,\frac{\partial L}{\partial v}\,dt. \qquad (6.13)

But we set ε(t₀) = ε(t₁) = 0, so the term in the square brackets is zero, and

\delta A = \int_{t_0}^{t_1} \varepsilon(t) \left[ \frac{\partial L}{\partial x} - \frac{d}{dt}\,\frac{\partial L}{\partial v} \right] dt. \qquad (6.14)
Now, you already know that if a variable a minimizes a function y(a), then
changing a at that minimum doesn’t change y to first order (see Figure 6.2).
Figure 6.2 Near the minimum of this curve y(a), the displacement Δy is proportional to a². This is because dy/da = 0 there, which means that the first-order term in a Taylor series expansion around this point is zero, and only the second-order term is non-zero. Elsewhere dy/da ≠ 0, so Δy ∝ a to first order.
Similarly, if x(t) is the path that minimizes the action, then putting a small wiggle ε(t) onto the path won't change the action to first order, i.e. δA = 0 (sometimes written as δ∫L dt = 0). But though ε is small, it's arbitrary, so δA = 0 can happen only if
\frac{\partial L}{\partial x} = \frac{d}{dt}\,\frac{\partial L}{\partial v}. \qquad (6.15)
This is known as the Euler–Lagrange equation.
We’ve done this in one dimension for simplicity, but if the Lagrangian
depends on many coordinates q1 , q2 , . . . , q̇1 , q˙2 , . . . — where, for example,
the coordinates could be Cartesians (q1 , q2 , q3 ) = (x, y, z) or polars
(q1 , q2 , q3 ) = (r, θ, φ) or indeed any coordinate system — then
∂L d ∂L
= (6.16)
∂qi dt ∂ q̇i
for every qi .
We can use this to find conservation laws in physics. If the Lagrangian L is
independent of a Cartesian coordinate x, and v = dx/dt, then
\frac{d}{dt}\,\frac{\partial L}{\partial v} = 0,
so ∂L/∂v must be a constant. Empty space with no potential (V = 0) has
this property, and the conserved quantity in Cartesian coordinates turns out
to be linear momentum. Similarly, L in polar coordinates won’t depend
on θ, and the conserved quantity turns out to be angular momentum. In
Newtonian gravitation, L also doesn’t depend on θ, and the conserved
quantity gives Kepler’s second law. This can also be proved from angular
momentum conservation in elliptical orbits, but the Lagrangian trick is much
quicker and easier.
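As a concrete illustration of the Kepler's-law claim above, here is the standard central-force calculation, sketched rather than quoted from this book:

```latex
% Newtonian Lagrangian for motion in the orbital plane, polar coordinates:
L = \tfrac{1}{2}\,m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - V(r)
% L does not depend on theta, so Equation 6.16 with q_i = theta gives
\frac{\mathrm{d}}{\mathrm{d}t}\,\frac{\partial L}{\partial \dot{\theta}}
  = \frac{\partial L}{\partial \theta} = 0
\quad\Longrightarrow\quad
\frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} = \text{constant}.
% The rate at which the radius vector sweeps out area is (1/2) r^2 thetadot,
% so a constant m r^2 thetadot is precisely Kepler's second law.
```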
These coordinate independencies can also be thought of as symmetries, because if L is not a function of x (i.e. L ≠ L(x)), then L is invariant under
the transformation x → x + δx for any δx. The fact that symmetries in the
Lagrangian imply conservation laws is known as Noether’s theorem, after
the brilliant physicist Amalie Emmy Noether (Figure 6.3), and is very
widely used in fundamental physics. A related argument can show that
energy conservation in general reflects time-independence. Maxwell’s
equations and charge conservation are equivalent to Lorentz invariance of
the electromagnetic vector potential — or rather, a more general invariance
known as gauge invariance (see, for example, Ryder, L.H., 1985, Quantum
Field Theory, Cambridge University Press). Einstein’s field equations of
general relativity can be found by minimizing an appropriate Lagrangian.
All fundamental physics can be thought of as simple symmetries and
conservation laws. This realization was a crowning achievement of
nineteenth and early-twentieth century physics, and is still true today.
What is energy, anyway? What is momentum? Lower-level textbooks tend
to describe their effects, but fundamentally these quantities are mostly
interesting because they are conserved. If they weren’t, we’d find and use
other quantities that were. This thinking is useful when particle physics
presents you with more abstract quantities such as strangeness or colour
charge or lepton number, and you wonder what they are.
You might well ask why Nature obeys the principle of least action —
after all, the total energy EK + V is an obviously physical quantity, but
L = EK − V isn’t. Classically, it’s hard to interpret, but the physicist
Richard Feynman found that in quantum mechanics the phase of the wave
function is A/!. He imagined particles in quantum mechanics taking all
possible paths simultaneously, but nearby paths would have wave functions
that tend to cancel out (because to first order the phases are different), except
in the regions where the phases are all the same to first order, i.e. where
δA = 0, which is the path of least action. The smallness of ℏ ensures that this cancellation happens only very close to the least-action path, so
macroscopically (i.e. classically) the particle appears to take only the
least-action path.
Figure 6.3 Amalie Emmy Noether, 1882–1935.

Geodesics are the paths that maximize the total relativistic spacetime distance ∫ds along the path (i.e. δ∫ds = 0 in the notation of the box above). In fact, we can pick any two points on the path and the geodesic will follow the maximum ∫ds between those points — because if it didn't, we could tweak the path between those points and find a better global ∫ds. In general relativity, any free-falling frame is locally the metric of special relativity, and the maximum spacetime interval in special relativity between two events is just a straight line in spacetime:

\delta s = \sqrt{(c\,\delta t)^2 - (\delta x)^2 - (\delta y)^2 - (\delta z)^2}.

We can think of a geodesic
as a sum of these δs contributions measured in a chain of free-falling reference
frames along the path. But if δs is a maximum, then so is (δs)2 . By instead
summing up these (δs)2 contributions, we can make a new quantity, say Y2 , that’s
also maximized along geodesics:
Y_2 = \int g_{\mu\nu}\,\frac{dx^\mu}{d\tau}\,\frac{dx^\nu}{d\tau}\,d\tau. \qquad (6.17)
Figure 6.4 plots this function for various specific angular momenta J̃. Most angular momenta have a stable minimum at one radius. This is also the radius of a circular orbit at that J̃, because for circular orbits, dr = 0 ⇒ dr/dτ = 0 ⇒ Ẽ = Ṽ. But at sufficiently low angular momenta, there is no stable circular orbit. There is therefore a closest possible inner edge of the accretion disc.
We’ll use this closest stable circular orbit to calculate the maximum efficiency for
converting mass-energy to luminosity in a black hole accretion disc. It’s not too
hard (but a bit tedious) to show that the smallest stable circular orbit around a Schwarzschild black hole has J̃ = √3 RS and radius r = 3RS. (Either find where d²Ṽ/dr² = 0, or set dṼ/dr = 0 and require that a finite root exists.) Putting in the numbers, the fractional binding energy of an orbit at this radius is therefore

\frac{mc^2 - E}{mc^2} = 1 - \tilde{E} = 1 - \sqrt{8/9} = 0.0572\ldots \approx 6\%. \qquad (6.22)
If a particle spirals in from r = ∞, radiating binding energy or mass-energy as
luminosity via friction, this is the fraction of energy released. For comparison,
< 0.1% of the uranium rest mass in a fission-based atomic bomb is converted to
energy.
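A one-line numerical check of Equation 6.22 (a sketch, nothing more):

```python
# Sketch: fractional binding energy at the smallest stable circular orbit
# (r = 3 R_S) of a Schwarzschild black hole, Eqn 6.22.
from math import sqrt

efficiency = 1 - sqrt(8 / 9)
print(f"{efficiency:.4f}")   # about 0.0572, i.e. roughly 6 per cent
```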
The corresponding result for a maximally-spinning Kerr metric is a minimum stable equatorial circular orbit radius of r = (5 ± 4)GM_BH/c² (use − for prograde orbits, + for retrograde). The fractional binding energy is 1 − 5/(3√3) ≈ 4% for retrograde orbits and an astonishing 1 − 1/√3 ≈ 42% for prograde orbits. (If the black hole has a spin just less than the maximal value, e.g. 0.998 times the maximal spin, then this can drop to 'merely' ∼30%, but this is still an astonishingly high efficiency; see, for example, Thorne, K.S., 1974, Astrophysical Journal, 191, 507.) The proofs follow similar methods to the Schwarzschild case, but are longer for this more complicated metric (details are in Misner, Thorne and Wheeler). Retrograde accretion orbits seem unlikely, either because of frame dragging or because the black hole and accretion disc both formed from matter with similar angular momentum axes, so spinning black holes are expected to be even more efficient converters of mass–energy to luminosity.

Figure 6.4 The effective potential Ṽ around a Schwarzschild black hole, given by Equation 6.21. The numbers on the curves are the values of the specific angular momentum relative to the black hole mass, J̃c²/(GM), where M is the black hole mass; the marginally stable case has J̃c²/(GM) = 2√3 = 3.464.

6.4 Cosmic mass density of black holes, ΩBH

Almost 200 years after John Michell's suggestion of matter hidden in dark stars, the cosmologist Andrzej Soltan spotted that the number counts of quasars give an ingenious constraint on the present-day total mass of black holes, which is independent of H₀, Ωm and ΩΛ. This is done by estimating the total energy output of quasars throughout the history of the Universe, and applying the Eddington limit.

The energy output of quasars per unit comoving volume and per unit time is E = L Φ(L, z), where L is the quasar bolometric luminosity and Φ is the bolometric luminosity function. The bolometric energy output in a time t to t + dt from quasars with luminosities from L to L + dL is then

E(L, t)\,dL\,dt = L\,\Phi(L, z)\,dL\,dt,

where time t and redshift z are related through Equation 1.34. Now, the number counts are related to the luminosity function by

4\pi\, n(S, z)\,dS\,dz = \Phi(L, z)\,\frac{dV}{dz}\,dL\,dz,

where n(S, z) is dN/dS evaluated at flux S for objects with redshift z, and V is the comoving volume. Luminosity and flux are also related: L = 4πd_L²S, so

E(L, t)\,dL\,dt = (4\pi)^2\, S\, n(S, z) \left(\frac{dV}{dz}\right)^{-1} d_{\rm L}^2\,dS\,dt. \qquad (6.23)

Now one can show that

4\pi d_{\rm L}^2 \left(\frac{dV}{dz}\right)^{-1} dt = \frac{1}{c}\,(1 + z)\,dz. \qquad (6.24)

Putting this into Equation 6.23 and integrating over L and t, we find that the total energy output of quasars throughout the history of the Universe is

E_{\rm total} = \int_{L=0}^{\infty}\int_{t=0}^{t_0} E(L, t)\,dL\,dt = \frac{4\pi}{c} \int_{z=0}^{\infty}\int_{S=0}^{\infty} (1 + z)\, S\, n(S, z)\,dS\,dz. \qquad (6.25)
This does not depend on either H0 or the cosmology!
where ⟨z|f_B⟩ is the mean redshift at a given B-band flux f_B. The uncertainty in the B-band number counts n(f_B) is much larger than the uncertainties in ⟨z|f_B⟩.
Once we have the total energy emitted by quasars, we can convert this to a
present-day black hole mass density using ρBH c2 = Etotal (1 − η)/η, where η is
the conversion efficiency of black hole mass accretion to luminosity. The factor of
(1 − η), not originally used by Soltan, accounts for the fact that not all the
accreted matter falls into the black hole. Putting in the numbers, Soltan found a
present-day black hole density of
ρBH = (0.1/η) × 8 × 10⁴ M⊙ Mpc⁻³
for a bolometric correction of kbol = 6.0 (justified on the basis of quasar spectral
energy distributions). This corresponds to a cosmological density of black holes
from ‘dormant’ quasars of
\Omega_{\rm BH} h^2 = 3 \times 10^{-7}\,\frac{1-\eta}{0.9}\,\frac{0.1}{\eta}.
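As a numerical check on the conversion from mass density to density parameter (a sketch; the critical-density value ρ_crit = 2.775 × 10¹¹ h² M⊙ Mpc⁻³ is a standard number, not one quoted in this section):

```python
# Sketch: converting Soltan's black hole mass density to Omega_BH h^2,
# assuming eta = 0.1 and the standard critical density in these units.
eta = 0.1                    # assumed accretion efficiency
rho_BH = (0.1 / eta) * 8e4   # M_sun Mpc^-3, from the text
rho_crit_h2 = 2.775e11       # critical density / h^2, M_sun Mpc^-3
Omega_BH_h2 = (1 - eta) * rho_BH / rho_crit_h2
print(f"Omega_BH h^2 = {Omega_BH_h2:.1e}")
```

The result is consistent with the ∼3 × 10⁻⁷ quoted above.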
For comparison, the present-day mass density in stars has been estimated as Ω∗h = (2.9 ± 0.43) × 10⁻³ for a Salpeter initial mass function, so ΩBH in
dormant quasars is only about 0.01h% of Ω∗ . However, we’ve only counted the
type 1 (broad line) active galaxies and not the type 2 (narrow line) ones, so this
should be regarded as a lower limit.
Nearby type 1 active galaxies have had black hole masses measured using the widths of the Balmer emission lines. On the assumption that the dynamics of the broad line region is dominated by gravity, M_BH ≈ v²R_BLR/G, where v is the velocity width and R_BLR is the broad line region radius. This latter parameter can be estimated from models of the ionization within quasars, or from reverberation mapping, which we shall meet in the next section. This yields an integrated mass density of ≈ 600 M⊙ Mpc⁻³ from local active galaxies. This is two orders of magnitude smaller than Soltan's estimate of ΩBH. Most of the present-day ΩBH
must therefore be in dormant quasars, which we shall cover in the next section.
6.5 Finding supermassive black holes
Exercise 6.5 Find the angular size in arcseconds of the sphere of influence of a
108 M) black hole in a galaxy with a central velocity dispersion σ = 220 km s−1 ,
at a distance of 10 Mpc. Compare this to the best angular resolution typical of
optical ground-based telescopes (known as 'seeing') of ∼0.5″ set by turbulence in
the Earth’s atmosphere. (One arcsecond is 1/3600th of a degree.) ■
Figure 6.5 The velocity dispersion σ and mean velocity v (both in km s⁻¹) along the major axis of the core of M31, plotted against radius r in arcseconds.
The Andromeda galaxy (M31) is 2.52 million light-years away, or 0.77 Mpc. It is
one of our Galaxy’s closest neighbours and can even be seen with the naked eye
on a dark enough night. At this distance, one arcsecond is about 3.7 pc, so the
sphere of influence of the black hole may just be within the capabilities of
ground-based telescopes. Figure 6.5 shows the velocities of the stars in the bulge
of M31 (measured from the integrated starlight rather than detecting individual
stars). The characteristic sharp feature in the centre implies a black hole mass of
≈ 10⁶ M⊙ . Subsequent HST spatially-resolved spectroscopy revised this upwards
to (3.0 ± 1.5) × 10⁷ M⊙ .
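The small-angle conversion used above (‘one arcsecond is about 3.7 pc’ at M31’s distance) can be checked directly. This minimal sketch assumes only the quoted distance of 0.77 Mpc:

```python
# Quick check of the quoted plate scale: at M31's distance of 0.77 Mpc,
# one arcsecond should subtend about 3.7 pc.

import math

def arcsec_to_pc(distance_mpc, angle_arcsec=1.0):
    """Physical size (pc) subtended by a small angle at a given distance."""
    angle_rad = angle_arcsec * math.pi / (180.0 * 3600.0)
    return distance_mpc * 1e6 * angle_rad   # distance in pc times angle in rad

print(f"1 arcsec at M31 = {arcsec_to_pc(0.77):.2f} pc")
```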
The HST found evidence for a much larger black hole in another nearby galaxy,
M87, by measuring the Doppler shifts in ionized gas close to the centre. The
spatial sampling of the HST’s Faint Object Camera is 0.028″ , corresponding
to only about 2 pc in M87. Two sets of observations by the HST yielded
(2.4 ± 0.7) × 10⁹ M⊙ within 18 pc (0.25″ ), followed by (3.2 ± 0.9) × 10⁹ M⊙
within 3.5 pc (0.05″ ). The HST has since been used for measuring black hole
masses in several other galaxies. For comparison, the total mass of the entire
Small Magellanic Cloud (Chapter 3), including its dark matter, is about
6.5 × 10⁹ M⊙ .
6.5.3 Megamasers
The early HST discoveries received a lot of press coverage, but at about the same
time — though finding less press notice — a much stronger constraint on a
supermassive black hole came from radio astronomy. The galaxy NGC 4258 (also
known as M106) has naturally-occurring masers (microwave lasers) generating
coherent radiation through stimulated emission, with H2 O molecules providing
the masing medium. These masers are believed to be generated in random
directions but with a small random subset lying along the line of sight to the
Earth. (This abundant maser activity is sometimes referred to as ‘megamasers’.)
This masing emission can be detected by radio interferometry, which routinely
has milliarcsecond (mas) resolutions or better. Figure 6.6 shows the line-of-sight
velocities of masers in NGC 4258 observed with the Very Long Baseline Array
(VLBA) of radio telescopes. Currently, observations are consistent with a central
mass of (3.82 ± 0.01) × 10⁷ M⊙ within a central radius of 0.13 pc (4.1 mas, i.e.
0.0041″ ).
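As a consistency check (a sketch using standard constants), the quoted enclosed mass and radius imply Keplerian orbital speeds of order 1000 km s⁻¹, comparable to the maser line-of-sight velocities plotted in Figure 6.6:

```python
# Check that the quoted enclosed mass and radius for NGC 4258 imply
# roughly Keplerian maser speeds of ~1000 km/s.

import math

G = 6.674e-11            # m^3 kg^-1 s^-2
M_SUN = 1.989e30         # kg
PC = 3.086e16            # one parsec in metres

M = 3.82e7 * M_SUN       # central mass from the maser fit
r = 0.13 * PC            # radius of the innermost masers

v_kepler = math.sqrt(G * M / r)   # circular orbital speed
print(f"Keplerian speed at 0.13 pc: {v_kepler / 1e3:.0f} km/s")
```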
Figure 6.6 The line-of-sight (LSR) velocities, in km s⁻¹ , of megamasers around
the central supermassive black hole in the galaxy NGC 4258, plotted against
impact parameter in milliarcseconds.
Chapter 6 Black holes
Figure 6.7 The orbit of a star close to the central supermassive black hole
(Sgr A*) in our Galaxy. The numbers indicate the dates of the observations,
expressed as decimal years, spanning 1992.23 to 2002.66; the angular scale bar
corresponds to 0.05″ (2 light-days).
Figure 6.8 The variations in the continuum (Fλ at 1350 Å and at 5100 Å) and in
selected emission lines (He II, Lyα, C IV and C III]) in the Seyfert 1 galaxy
NGC 5548, shown as a function of Julian date (JD − 2 400 000). The right-hand
panels show the cross-correlation of C(t) with L(t − Δt) as a function of delay
Δt (from 0 to about 20 days), where C(t) are the continuum measurements and
L(t) are the line measurements.
6.6 The Magorrian relation
Figure 6.9 Isodelay surface in a quasar broad line region. Broad line clouds
lying along the isodelay surface have a time delay of Δt = (r/c)(1 + cos θ)
relative to light from the centre.
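The isodelay relation in the caption is easily explored numerically; the 10 light-day radius below is an illustrative value, chosen to give delays of up to 20 days:

```python
# Sketch of the isodelay relation Delta_t = (r/c)(1 + cos(theta)) for a
# broad-line cloud at radius r, with theta as defined in Figure 6.9.

import math

def delay_days(r_light_days, theta):
    """Light-travel time delay (days) for a cloud at angle theta."""
    return r_light_days * (1.0 + math.cos(theta))

r = 10.0  # radius in light-days (illustrative)
for theta in (0.0, math.pi / 2, math.pi):
    print(f"theta = {theta:.2f} rad -> delay = {delay_days(r, theta):.1f} days")
```

The delays range from zero (a cloud directly between the nucleus and the observer) up to 2r/c (a cloud on the far side), which is why the correlation delays in Figure 6.8 probe the size of the broad line region.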
In practice the long-term monitoring data are available for only a few quasars, so
the reverberation measurements act as calibrators for black hole mass estimators
of the form MBH = k Lν^α (Δv)² , where Lν is the continuum luminosity measured
near a particular emission line, and k and α are constants specific to that emission
line. Typically α ≈ 0.5, as required if r ∝ √L.
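A sketch of such a single-epoch estimator (with a purely illustrative normalization k, in arbitrary units) shows the key scaling: with α = 0.5, quadrupling the continuum luminosity at fixed line width doubles the inferred mass, since r ∝ √L:

```python
# Sketch of a single-epoch mass estimator M_BH = k * L**alpha * dv**2.
# The normalization k is purely illustrative; in practice it is
# calibrated against reverberation-mapped quasars.

def mbh_single_epoch(L, dv, k=1.0, alpha=0.5):
    """Black hole mass estimator (arbitrary units for this sketch)."""
    return k * L**alpha * dv**2

# Quadrupling L at fixed line width doubles the inferred mass:
print(mbh_single_epoch(4.0, 1.0) / mbh_single_epoch(1.0, 1.0))
```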
The quasars in the Sloan Digital Sky Survey (SDSS) have had black hole mass
estimates ranging from around 10⁷ M⊙ to an astonishing > 10¹⁰ M⊙ (compare
the mass of the entire Small Magellanic Cloud galaxy in Subsection 6.5.2).
However, these black hole mass estimates have underlying uncertainties of many
tens of per cent at best. It’s possible that these largest black holes are in fact
lower-mass objects in which the underlying uncertainties happen to have given
rise to a higher measurement. Nevertheless, the SDSS quasar data set is consistent
with black holes existing up to a maximum mass of 3 × 10⁹ M⊙ .
If we correlate the supermassive black hole masses against the total host galaxy
masses, the correlation is somewhat weak. A stronger correlation (shown in
Figure 6.10) is between MBH and the spheroid luminosity, i.e. the bulges in the
case of spiral galaxies, and the entire galaxies in the case of ellipticals.
Figure 6.10 Left: correlation between the black hole masses (in M⊙ ) and central
velocity dispersions for local galaxies. Right: correlation between black hole
masses and bulge B-band luminosities of the same sample of local galaxies.
Ellipticals are displayed as circles, spirals as triangles, and the squares represent
both lenticulars and compact elliptical galaxies.
This was first discovered in 1998 by a team led by the astronomer John
Magorrian, and has become known as the Magorrian relation. There is an even
stronger correlation between the spheroid velocity dispersions σ and MBH . This
correlation is also shown in Figure 6.10. The dispersion of the data points is
almost entirely attributable to the uncertainties in the measurements — in other
words, the measurement of the underlying scatter is almost consistent with zero.
The best-fit relationship is
MBH = (1.66 ± 0.32) × 10⁸ M⊙ × (σ / 200 km s⁻¹)^(4.58±0.52). (6.32)
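Ignoring the uncertainties on the fitted parameters, Equation 6.32 can be evaluated directly. This sketch tabulates a few representative velocity dispersions:

```python
# Evaluate the best-fit M_BH--sigma relation of Equation 6.32,
# M_BH = 1.66e8 * (sigma / 200 km/s)**4.58 solar masses.
# Uncertainties on the fit parameters are ignored in this sketch.

def mbh_from_sigma(sigma_km_s):
    """Black hole mass (solar masses) from the bulge velocity dispersion."""
    return 1.66e8 * (sigma_km_s / 200.0)**4.58

for sigma in (100.0, 200.0, 300.0):
    print(f"sigma = {sigma:.0f} km/s -> M_BH ~ {mbh_from_sigma(sigma):.1e} M_sun")
```

The steepness of the power law means that a factor of three in σ spans some two orders of magnitude in black hole mass.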
Central supermassive black holes are about 0.6% of the masses of their galactic
bulges (or the whole galaxies in the case of ellipticals), with a sphere of influence
that is less than a thousand billionth of the volume of the bulge, yet the black hole
mass correlates astonishingly strongly with the galaxy velocity dispersion.
Clearly, the creation of a central supermassive black hole has somehow been
strongly connected to the formation of its galaxy. We shall speculate on how and
why later in this chapter.
6.7 The hard X-ray background
The Moon reflects solar X-rays, but the dark side is clearly obscuring a
faint background. The image is grainy because X-ray detectors respond to
individual photons. The dark side of the Moon isn’t quite black because the
Earth’s geocorona, or extended outer atmosphere, emits X-rays, and the satellite
orbits within this geocorona. (The extragalactic X-ray research community
sometimes refers to the ‘background’ as being the proportion that is not yet
resolved into point sources, rather than the total flux from the sky. We shall follow
the conventions at other wavelengths and take the ‘background’ to mean the total
flux, rather than the unresolved component of it.)
As with the CMB, the X-ray background appears to be fairly isotropic, once the
Galaxy has been subtracted. This on its own suggests a cosmological origin. The
X-ray background is included in Figure 5.1. One can make inferences about
the evolution of black hole accretion from this, but one thing quickly became
apparent. The spectrum of the extragalactic X-ray background is very different to
the spectrum of a star-forming galaxy or an unobscured quasar, so what could
generate this background? This is sometimes known as the X-ray spectral
paradox.
The shape of the X-ray background therefore requires some objects with harder
X-ray spectra, i.e. with a greater proportion of higher-energy X-rays. The most
likely candidate is type 2 AGN. Figure 6.12 shows the effect of X-ray and optical
Figure 6.12 Left: the transmitted X-ray flux (0.1–10 keV) through a neutral
hydrogen column density of (from top to bottom) 10²⁰ , 10²¹ , 10²² and 10²³ cm⁻² .
The greater the column density, the less flux is transmitted through. On the
right-hand side, the corresponding near-infrared to ultraviolet extinction is shown
for NHI = AV × 1.8 × 10²⁵ m⁻² mag⁻¹ . The right-hand curves show AV = 0.055,
0.555, 5.55 and 55.5 (from top to bottom).
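The gas-to-dust conversion in the caption can be checked with a short sketch: each of the quoted AV values maps onto one of the column densities in the left-hand panel (e.g. AV = 0.055 corresponds to NHI ≈ 10²⁰ cm⁻²):

```python
# Sketch of the gas-to-dust conversion used in Figure 6.12:
# N_HI = A_V * 1.8e25 m^-2 mag^-1, converted here to cm^-2.

def column_density_cm2(a_v_mag):
    """Neutral hydrogen column (cm^-2) for a given visual extinction."""
    n_m2 = a_v_mag * 1.8e25      # column in m^-2
    return n_m2 * 1e-4           # 1 m^-2 = 1e-4 cm^-2

for a_v in (0.055, 0.555, 5.55, 55.5):
    print(f"A_V = {a_v:>6} mag -> N_HI ~ {column_density_cm2(a_v):.1e} cm^-2")
```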
absorption as a function of column density. There must therefore have been some
significant contribution from Compton-thick active galaxies to the black hole
accretion history of the Universe.
If the geometry of the absorption allows it, it may be possible to detect some of
the Compton scattered component directly (though faintly). The spectrum of this
‘Compton reflected’ component is expected to be a broad peak around 20–30 keV,
depending on the ionization of the scattering medium. Also, Compton scattering
from iron nuclei can excite a strong iron Kα line (Subsection 6.5.5). Both the iron
line and the spectral shape can be useful indicators of Compton-thick absorbers,
though with current X-ray telescopes this can be done only in luminous and/or
local active galaxies.
The softer X-ray background has been mostly resolved into its constituent
point sources by ROSAT (≈ 75% of the 0.5–2 keV background). It has taken
longer to do the same at harder X-ray fluxes where the X-ray spectral paradox
suggested new populations. The Japanese ASCA satellite resolved ≈ 35%
of the 2–10 keV background, while the Italian BeppoSAX mission resolved
≈ 20–30% of the 5–10 keV background. The breakthroughs in resolving most
of the hard X-ray background into its constituent point sources came from
the European Space Agency’s XMM-Newton space telescope and NASA’s
Chandra space telescope. Figure 6.13 shows the deep pencil-beam surveys
taken by XMM-Newton and Chandra. The point sources in these surveys can
account for ≈ 80–90% of the 2–6 keV background, but only 50–70% of the
6–10 keV background. Astronomical X-ray CCDs detect individual photons by
converting them to electrons via the photoelectric effect (about 10–80% of
incident photons are converted, depending on the X-ray photon energies), then
reading the accumulated charge in each pixel. The faintest X-ray sources found by
XMM-Newton and Chandra have electron count rates of only ≈ 1 per day.
The term ‘hard X-ray background’ often tends to refer roughly to the 2–10 keV
range, but one should not forget the higher-energy background. Most of the
energy density in the cosmic X-ray background is at 10–100 keV, but only a few
per cent of this background has been directly resolved so far. The proposed future
European Space Agency X-ray space telescope, currently named the International
X-ray Observatory (IXO), will be able to observe 0.1–40 keV and directly probe
the Compton-thick populations.
What sort of objects dominate the hard (2–10 keV) X-ray background? Most
turn out to be active galaxies, but there is an extraordinary range in the optical
properties: the X-ray–optical flux ratios vary by over four orders of magnitude.
There are unobscured and obscured AGN, which appear in optical spectra as
broad line and narrow line objects, respectively. More surprisingly, there is a
minority of objects with obscured X-ray spectra but broad optical emission lines,
and others with low X-ray column densities yet narrow line AGN optical spectra
(implying that the obscuring gas doesn’t follow the same distribution as the
obscuring dust in Figure 4.15). Many objects in the XMM-Newton and Chandra
pencil-beam surveys are too faint for optical spectroscopy, even with the largest
8–10 m-mirror optical telescopes, though broad-band photometry is consistent
with these being mainly distant AGN. Some objects have no optical counterparts
in even the deepest ground-based and space-based optical imaging. There are also
X-ray bright, optically-normal galaxies (with the delightful acronym XBONGs),
which show X-ray evidence for AGN (sometimes obscured, but not always) yet
no optical AGN evidence (such as broad or high-ionization emission lines) in the
optical spectra. A minority of the hard X-ray background also comes from
starburst galaxies (see Chapter 5), galaxy clusters and groups (see Chapters 3
and 7), and Galactic stars.
Figure 6.13 Deep fields taken by the XMM-Newton and Chandra space telescopes. (a) The Chandra Deep Field
North (CDF-N). This is a 2 Ms image (i.e. an exposure of 2 × 106 seconds) taken by Chandra, in the region of
the Hubble Deep Field North (HDF-N, marked in green). The Spitzer GOODS survey field is also shown in
green. There are X-ray data over around 448 arcmin2 , i.e. around 60% of the angular size of the Moon. This
image represents 0.5–2 keV photons as red, 2–4 keV as green (except for annotations), and 4–8 keV as blue.
Confusingly, there is also a Chandra Deep Field South (CDF-S), which does not coincide with the Hubble Deep
Field South (HDF-S), but the Chandra field is nevertheless the site of the Hubble Ultra Deep Field (UDF). (b) The
XMM-Newton deep field in a region of sky known as the Lockman Hole in Ursa Major (named after Felix
Lockman who discovered that this region had very low X-ray absorption from Galactic neutral hydrogen). Here,
0.5–2 keV photons are represented as red, 2–4.5 keV as green, and 4.5–10 keV as blue. Objects are broader closer
to the edges because the instrumental angular resolution is coarser. This image covers about 1556 arcmin2 .
But where are the Compton-thick objects? A few high-redshift objects are known
to be Compton-thick from X-ray observations, such as the hyperluminous galaxy
IRAS F10214+4724 (Chapters 5 and 7), but most are very hard to detect in
X-rays. An intriguing clue has recently come from infrared surveys. Good
candidates for highly dust-shrouded quasars were found through high 24 µm to
3.6 µm flux ratios (Martínez-Sansigre, A. et al., 2005, Nature, 436, 666). (A faint
3.6 µm flux implies that it’s not an unobscured quasar, in which case the 3.6 µm
flux is dominated by the host galaxy, while the 24 µm excess suggests hot dust as
expected for an AGN dust torus.) Furthermore, as we
saw in Chapter 5, star-forming galaxies have strong emission and absorption
features in the mid-infrared, while the mid-infrared spectra of active galaxies are
typically featureless. Mid-infrared spectroscopy of some mid-infrared-bright
but optically-faint galaxies in the Chandra Deep Field North (also known as
GOODS-North) has found evidence of active nuclei, but the X-ray emission of
these galaxies is weak or absent. This suggests that there is a population of
‘mid-infrared excess’ galaxies having high mid-infrared–optical flux ratios, at
least some of which are Compton-thick active galaxies. Could these be the
missing galaxies that dominate the highest-energy X-ray backgrounds?
6.8 Black hole demographics
territory. There’s a connection to the cooling flow problem in Section 3.9: dense
cooling core clusters are a nearby example where we can observe the influence of
a black hole suppressing star formation in the central galaxy.
One of the problems is that it’s not currently technologically feasible to directly
resolve the black hole sphere of influence in any but the most local galaxies. The
local MBH –σ relationship is the end result of billions of years of evolution,
including multiple galaxy mergers. Intuition suggests that the MBH –σ relationship
was somehow imprinted early on, and numerical simulations have confirmed that
this relationship, once established, is maintained surprisingly well in galaxy
mergers. Unfortunately, the primordial high-redshift links between black holes
and their host galaxies are very difficult to observe directly.
However, there are slightly less direct approaches. As we’ve seen, reverberation
mapping can be used to derive the masses of quasar black holes, while the
velocity dispersions can be inferred from the widths of absorption lines in the
quasars’ host galaxies. Provided that these are both reliable measures, we can
make some constraint on the evolution of the black hole–host galaxy relationship.
Another approach is to use radio-loud active galaxy unification models. The
radio-loud active galaxy population in general has powerful radio jets emitted
from the active nucleus, terminating in a bow shock in the intergalactic medium.
Figure 4.12 is an example. Only about 10% of active galaxies are radio-loud, but
it is not clear why.
In the radio-loud unification model, active galaxies have a dusty torus that
obscures the view of the quasar broad lines from some orientations (see
Figure 4.15). Quasars and radiogalaxies with the same radio lobe luminosities
should be members of the same population, though seen with different
orientations. Therefore we can measure the host galaxy properties by studying the
radiogalaxies, then measure the corresponding quasar properties by comparing the
radiogalaxies’ counterparts in the quasar population. There is some (admittedly
weak but suggestive) evidence for the evolving black hole mass and host galaxy
properties inferred from this method.
So which came first, the black hole or its galaxy? The comparison of the Madau
diagram (Chapters 4 and 5) to the black hole accretion history suggests that there
was plenty of star formation before the quasar epoch. However, the comparison of
3CRR host galaxy masses and black hole masses suggests that the MBH /Mspheroid
ratio increases with redshift, which in turn suggests that the most massive black
holes were pre-existing and spheroids formed around them.
On the other hand, there are hints that the submm-selected galaxy population has
a smaller MBH /Mgalaxy mass ratio than these quasars, at least in those submm
galaxies with broad optical emission lines (Figure 6.15). What’s more, many
submm galaxies have been detected in hard X-rays, implying that black hole mass
accretion appears to be much more common in those galaxies than in the general
galaxy population. Perhaps the most massive starbursts are eventually shut off by
the energy input from an exponentially-growing accreting black hole. Quasars at
all redshifts are ultraluminous starbursts on average, though this doesn’t tell us
whether the quasar phase comes at the start of the starburst or at the end. At the
time of writing, it seems that the star formation rate in quasars varies roughly as
the square root of the quasar luminosity, not linearly:
6.9 Observations of black hole growth and the effects of feedback
dM∗/dt ∝ (dMBH /dt)^(0.44±0.07) (Serjeant and Hatziminaoglou, 2009, Monthly
Notices of the Royal Astronomical Society, 397, 265). This feature has not so far
been reproduced by models of quasar feedback.
Figure 6.15 Black hole–galaxy mass ratios for the galaxies selected at submm
wavelengths (SMGs) that are obscured at X-ray wavelengths with galaxy masses
inferred from observed stellar mass and via the width of a carbon monoxide
emission line. Also shown is the relationship for local ultraluminous infrared
galaxies (ULIRGs), active galaxies, and an indication of the range spanned on
average by X-ray luminous QSO SMGs. The effect of changing the assumed
value of η (Equation 6.7) for SMGs is also shown. The SMGs have black hole
masses that are smaller than those of quasars, for their host galaxy sizes.
The scatter in the MBH –σ relationship appears to be smaller than that of the
MBH –Mhalo correlation, suggesting that the velocity dispersion and not the mass
of the dark matter halo is primary. This has been shown (at some length) to be
consistent with self-regulated black hole growth (Wyithe and Loeb, 2005,
Astrophysical Journal, 634, 910), in which the energy output from black hole
accretion is enough to unbind the gas, which chokes off the supply of
fuel to the black hole. There is currently a great deal of research activity in this
area, aiming at inferring the strengths of the physical links from the tightnesses of
the correlations. For example, the surprise lack of a black hole in the galaxies
M33 and NGC 205, even though their central star clusters obey the same central
mass versus spheroid mass relationship, may point at a different fundamental
relation. This is still being debated and studied. Also, in nearby active galaxies,
there appears to be a different distribution of Eddington ratios for galaxies with
recent star formation, compared to those with more quiescent stellar populations.
Galaxies with recent star formation also seem to have higher Eddington ratios
(Figure 6.16). It’s been suggested that this is consistent with self-regulated black
hole growth while the gas supply is plentiful (which also fuels the star formation),
but when the gas supply runs out, it seems that the only fuel for the black hole
comes from mass loss from evolved stars, starving the black hole.
Figure 6.16 Distribution of inferred Eddington ratios L/LE for galaxies with young stellar populations (left),
and with old stellar populations (right). The AGN luminosity is taken to be proportional to the luminosity in the
[O III] emission line, symbolized as L[O III], and the y-axis is the logarithm of the fraction of the population. The
colours represent black hole mass ranges, as shown in the right-hand panel.
Einstein himself initially thought that gravitational waves did not exist. Eddington
is said to have dismissively quipped that gravitational waves travel ‘at the speed of
thought’, but in truth his remark was directed at a certain spurious subset, and in
fact he showed that members of another class of gravitational waves do indeed
carry energy (Eddington, A. S., 1922, Proceedings of the Royal Society of
London A, 102, 268–82).
Gravitational waves have been inferred in the binary pulsar PSR B1913+16, in a
beautiful verification of the predictions of general relativity that won Russell
Hulse and Joseph Taylor the 1993 Nobel Prize (Figure 6.17). The pulsar is in a
binary orbit with another star, detectable through subtle variation in the timings
of the pulses. (Doppler shifts imply timing variations, in the same way that
cosmological redshift implies supernova time dilation — see Chapter 1.) The
energy loss from gravitational radiation leads to a gradual spiralling in of the two
pulsars, detectable from the timings. Primordial gravitational waves are also
expected to contribute to the CMB power spectrum (Chapter 2), though direct
detection of primordial gravitational waves will be extremely challenging.
Figure 6.17 The cumulative change in the periastron time (the time of closest
approach of the two stars) of pulsar PSR B1913+16 from 1975 to 2005, compared
to the predictions of general relativity. The data agree with the predictions
to 0.2%.
The merger of two black holes would generate copious gravitational waves.
Within our Galaxy, many merger events of black holes and neutron stars should be
detectable (see Figure 6.18). At cosmological distances, one could detect only the
mergers of supermassive black holes. Could this happen? At the time of writing,
at least one credible candidate for a binary supermassive black hole has been
found in a quasar (Figure 6.19). But supermassive black hole mergers may be
much more common than this single example suggests. If quasars and starbursts
are triggered by mergers, then galaxy–galaxy merging is common in the history of
the Universe. Merging galaxies with pre-existing supermassive black holes will
have their supermassive black holes forming a binary system within a million
years of the merger, according to numerical simulations. The expectation is of a
few tens of supermassive black hole merger events per year detected with LISA.
The gravitational waves from inspiralling black holes are also sufficiently
well-understood that they could be treated as standard candles, and they are
sometimes referred to as ‘standard sirens’. The physical simplicity of such a
system, completely determined in practice by two masses and two spins, is very
Figure 6.19 The quasar SDSS J1536+0441, whose spectrum has evidence of
three redshifts, including z = 0.3889 and z = 0.3727.
Summary of Chapter 6
1. Non-rotating uncharged black holes are described by the Schwarzschild
metric, and their rotating counterparts by the Kerr metric.
2. The accretion efficiency of a black hole can be calculated by finding the
energy released from dropping from infinity to the radius of the smallest
stable circular orbit. For a Schwarzschild metric this is 6% of the rest mass
at infinity, while for the Kerr metric it is 42%. This is the most efficient
conversion process known from mass-energy to luminosity, making black
holes prime candidates for powering the central engines of quasars.
3. The present-day contribution to Ωm from black holes can be estimated from
the source counts of quasars, using measurements of the average redshift as
a function of apparent quasar magnitude, combined with assumptions of the
bolometric correction and the accretion efficiency. This constraint is
independent of the Hubble parameter H0 and the cosmological density
parameters.
4. A similar constraint can be made using the hard X-ray background. The
background has a harder spectral shape (i.e. more output at higher energies)
Further reading
• John Michell’s 1767 paper ‘An inquiry into the probable parallax and
magnitude of the fixed stars’ is at
[Link] It includes a derivation of
integral source counts.
• John Michell’s 1784 paper is at
[Link]
• For more on black holes at this level, see Lambourne, R., 2010, Relativity,
Gravitation and Cosmology, Cambridge University Press.
• At the time of writing, some audio renderings of gravitational waves from
inspiralling black holes can be found at
[Link]
Chapter 7 Gravitational lensing
Do not Bodies act upon Light at a distance, and by their action bend its
Rays; and is not this action (caeteris paribus) strongest at the least distance?
Isaac Newton, Opticks
Introduction
Some of the most beautiful images in cosmology are found in gravitational
lensing. In these, we see the direct effect that matter has on the curvature of
spacetime around it. Most astronomy can investigate only luminous matter, but
this is one of the very few opportunities to infer much more. Gravitational lensing
effects are created only by the intervening matter distribution, regardless of
whether it’s luminous or dark, or in equilibrium or not. Lensing can’t distinguish
between these different sorts of intervening matter, but the positive side of this is
that we don’t miss anything.
7.1 Gravitational lens deflection
Figure 7.3 The galaxy cluster Abell 2218 observed with the Hubble Space
Telescope (HST) with the WFPC2 instrument. Note the background galaxies
distorted into arcs.
from the background source is partially obscured by dust in the lensing galaxy, it
could give the appearance of different colours, but this achromaticity test can
sometimes be done at radio wavelengths where dust extinction has no measurable
effect. If the source has some variation in colour and has different magnifications
in different parts, then the multiple images could have different colours. One must
then carefully model the lens system to see if any observed achromaticity could be
due to differential magnification (e.g. Figure 7.4).
Figure 7.4 HST image of the hyperluminous galaxy IRAS FSC 10214+4724, taken at a wavelength of around
800 nm (just to the red of the visible range). The IRAS galaxy is the arc to the left, gravitationally lensed by the
foreground galaxy (marked as 2). There is a second image (‘counterimage’) of the IRAS galaxy, marked as 5.
Objects 2 and 5 have their central pixels boosted artificially in this image for clarity. The contours are HST data at
around 400 nm, which surprisingly failed to detect the counterimage; the slight shift in the 400 nm and 800 nm
images suggests some colour gradient in the IRAS galaxy and hence differential magnification.
How much does an object deflect light by gravitational lensing? We can make a
Newtonian prediction of the gravitational lens deflection angle by a mass M , by
treating an incoming photon as being a particle with initial velocity c, as shown in
Figure 7.6. In this Newtonian model, the deflection angle φ in radians will be
φ ≈ tan φ = vy /c, where vy is the y-axis velocity acquired by the photon as it
passes the Sun. We neglect any x-axis change since the imparted velocity will be
≪ c.
In Newtonian gravity, the photon will move with acceleration a = GM/r2 in the
direction towards the Sun. The vertical (y-axis) acceleration in Figure 7.6 will just
be ay = a cos θ = (GM/r2 ) cos θ. We can shortcut some tedious algebra
by using Kepler’s second law (i.e. the conservation of angular momentum):
r2 dθ/dt = constant. We’ll need the value of that constant, and another trick
helps: Kepler’s laws apply even if the mass M is limitingly small or even zero.
Therefore the constant must be bc (where b, known as the impact parameter, is
shown in Figure 7.6), because that would be the value of r2 dθ/dt at the point of
closest approach to the mass if the photon were not deflected.
Now imagine a short time interval dt. The change in y-axis velocity in that
time will be dvy = ay (t) dt, because ay = dvy /dt. But we can rearrange
r2 dθ/dt = bc to get dt = (r2 /bc) dθ. Putting this together, we find
Figure 7.5 Summed spectrum of the two nearest foreground objects that
dominate the gravitational lensing of the redshift z = 2.286 galaxy
IRAS FSC 10214+4724. The discontinuity is the 4000 Å break (Section 4.4),
redshifted to about z = 0.9. This spectrum is very different to that of the IRAS
galaxy (Figure 5.8).
Figure 7.6 The gravitational lensing deflection of a photon by a mass M .
Radial distances r are measured outwards from the mass M . The distance b is
sometimes known as the impact parameter.
dvy = ay (t) dt = (GM/r²) cos θ dt
    = (GM/r²) cos θ × (r²/bc) dθ = (GM/bc) cos θ dθ.
Integrating this from θ = −π/2 to +π/2, we find that
vy = 2GM/(bc),
so
φNewtonian ≈ vy /c = 2GM/(bc²). (7.1)
In the weak-field limit, the full general relativistic treatment turns out to be
exactly a factor of two greater:
φ = 4GM/(bc²). (7.2)
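Evaluating Equations 7.1 and 7.2 for a light ray grazing the Sun’s limb (a sketch using standard solar values) recovers the classic deflection angles of about 0.87″ and 1.75″:

```python
# Evaluate the Newtonian and general-relativistic deflection angles
# (Equations 7.1 and 7.2) for a light ray grazing the Sun's limb.

import math

G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
M_SUN = 1.989e30     # kg
R_SUN = 6.96e8       # m; the impact parameter for a grazing ray

RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi

phi_newton = 2 * G * M_SUN / (R_SUN * C**2) * RAD_TO_ARCSEC
phi_gr = 2 * phi_newton   # the weak-field GR result is exactly twice as large

print(f"Newtonian: {phi_newton:.2f} arcsec, GR: {phi_gr:.2f} arcsec")
```

The factor-of-two difference was the basis of Eddington’s famous 1919 eclipse test of general relativity.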
Why exactly a factor of two? This is difficult to answer. As you saw in Chapter 6,
there is a similar conservation of angular momentum r2 dθ/dλ = constant, where
λ is a parameter measured along the path of the photon. (For a massive particle
we could use dλ = dτ , where τ is the proper time, but photons have ds = 0 so
dτ = 0.) Converting λ to coordinate time t involves a factor also involving
GM/c2 (due to the spacetime curvature), which leads ultimately to the larger
deflection angle. We’ll return to this in Section 7.5.
But beware of a subtlety that traps the unwary. The vertical distances in Figure 7.7
are angular diameter distances (Chapter 1). For example, DS is the angular
diameter distance from the source to the observer, while DL is the angular
diameter distance from the lens to the observer. But DLS is the angular diameter
distance to the source as seen from the lens, so DS does not necessarily equal
DLS + DL !
● Do any cosmological distances add up, so that Earth-to-source equals
Earth-to-lens plus lens-to-source?
❍ Comoving distances add up in exactly this way.
If the lens, image, background source and the Earth are all in the same plane, as in
Figure 7.7, then
β = θ − α. (7.3)
(To show that this is true while avoiding DS ≠ DLS + DL, compare distances
along the top of Figure 7.7.) But what if they aren’t in the same plane? This could
7.2 The lens equation
happen if the lens is not symmetrical, for example. In this case we can treat the
angles as vectors on the sky, so
β = θ − α(θ). (7.4)
Exercise 7.1 Derive a flat space expression for DLS involving the comoving
distances rL and rS (the comoving distances to the lens and source, respectively),
and the lens and source redshifts zL and zS .
Exercise 7.2 Write down a proof of Equation 7.4 by working with vectors on
the source plane, keeping in mind that DS ≠ DLS + DL. ■
So far we’ve not used any information on the lens mass distribution, or on how
much deflection that mass causes. Let’s see what happens for a point mass M .
Adapting Equation 7.2, a point mass M will cause a deflection of
\[ \hat{\alpha} = \frac{4GM}{c^2\,\xi}. \tag{7.5} \]
(By symmetry, the light rays are all confined to a plane in this case, so we don’t
need to use vectors.) This deflection is related to the observed shift α by
\[ \alpha = \frac{D_{LS}}{D_S}\,\hat{\alpha} \tag{7.6} \]
using Figure 7.7, so the lens will cause a visible deflection of
\[ \alpha = \frac{D_{LS}}{D_S}\,\frac{4GM}{c^2\,\xi}. \tag{7.7} \]
We can rewrite the (scalar) lens equation as
\[ \beta = \theta - \alpha = \theta - \frac{D_{LS}}{D_S}\,\frac{4GM}{c^2\,\xi}, \]
and using θ = ξ/DL (Figure 7.7) we reach
\[ \beta = \theta - \frac{D_{LS}}{D_L D_S}\,\frac{4GM}{c^2\,\theta}. \tag{7.8} \]
Exercise 7.3 What if the background object is exactly behind the lens, so
β = 0? What will this look like? (Give this some thought before looking up the
answer!) ■
This angular size is often known as the Einstein radius and given the symbol θE :
\[ \theta_E = \sqrt{\frac{4GM}{c^2}\,\frac{D_{LS}}{D_L D_S}}. \tag{7.9} \]
It depends only on the source redshift zS , the lens redshift zL and the lens
mass M . (Note that θE isn’t just a property of the lens, because it also depends on
the distance to the background source.) It’s an important quantity in gravitational
lensing in general.
When the source position β is around θE or less, the magnifications are typically
strong. Conversely, if β ≫ θE, then there is typically very little magnification.
We’ll show in Section 7.6 how θE can also be a boundary between having
multiple images and having only one image. Also, multiple images tend to have
separations of roughly 2θE , as we’ll show. Figure S7.1 from Exercise 7.3 is an
example of an Einstein ring.
Substituting in the numerical values and assuming a point mass, we obtain an
equation that’s useful for cosmological lensing:
\[ \frac{\theta_E}{\mathrm{arcseconds}} = \left(\frac{M}{10^{11.09}\,M_\odot}\right)^{1/2} \left(\frac{D_L D_S/D_{LS}}{\mathrm{Gpc}}\right)^{-1/2}. \tag{7.10} \]
Typically, galaxy–galaxy lensing gives Einstein radii of the order of an arcsecond,
while lensing by a galaxy cluster typically has θE about ten times bigger. At the
opposite size scale, gravitational microlensing (which we shall meet later in this
chapter) can be characterized with
\[ \frac{\theta_E}{\mathrm{milliarcseconds}} = \left(\frac{M}{1.23\,M_\odot}\right)^{1/2} \left(\frac{D_L D_S/D_{LS}}{10\ \mathrm{kpc}}\right)^{-1/2}. \tag{7.11} \]
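These normalizations follow directly from Equation 7.9 and can be verified numerically. A quick sketch (not from the book; constants are rounded SI values, and for simplicity the lens is placed so that DL = DS = DLS):

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8            # speed of light, m s^-1
M_SUN = 1.989e30       # solar mass, kg
ARCSEC = math.pi / (180 * 3600)   # one arcsecond in radians
GPC = 3.086e25         # one gigaparsec in metres
KPC = 3.086e19         # one kiloparsec in metres

def einstein_radius(mass, d_l, d_s, d_ls):
    """Equation 7.9: theta_E in radians for a point-mass lens (SI units)."""
    return math.sqrt(4 * G * mass / C**2 * d_ls / (d_l * d_s))

# Galaxy-scale lens: M = 10**11.09 M_sun with D_L*D_S/D_LS = 1 Gpc
theta_gal = einstein_radius(10**11.09 * M_SUN, GPC, GPC, GPC)
print(theta_gal / ARCSEC)            # ~1 arcsecond, as in Equation 7.10

# Stellar microlens: M = 1.23 M_sun with D_L*D_S/D_LS = 10 kpc
theta_star = einstein_radius(1.23 * M_SUN, 10 * KPC, 10 * KPC, 10 * KPC)
print(theta_star / (1e-3 * ARCSEC))  # ~1 milliarcsecond, as in Equation 7.11
```

Both reference masses reproduce an Einstein radius of one unit to about 0.1%, which is where the numbers 10^11.09 and 1.23 come from.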
For our point mass lens, we can write Equation 7.8 as
\[ \beta = \theta - \frac{\theta_E^2}{\theta}. \tag{7.12} \]
This quadratic equation has the solution
\[ \theta = \frac{1}{2}\left(\beta \pm \sqrt{\beta^2 + 4\theta_E^2}\,\right), \tag{7.13} \]
giving two possible values for θ.
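Equation 7.13 is easy to evaluate directly; a minimal sketch (the numbers below are illustrative, with angles in units of θE):

```python
import math

def image_positions(beta, theta_e):
    """Equation 7.13: the two image positions of a point-mass lens.

    beta and theta_e can be in any common angular unit."""
    root = math.sqrt(beta**2 + 4 * theta_e**2)
    return (beta + root) / 2, (beta - root) / 2

theta_plus, theta_minus = image_positions(beta=0.5, theta_e=1.0)
# Both roots satisfy the lens equation beta = theta - theta_E^2/theta (Eq 7.12)
for theta in (theta_plus, theta_minus):
    print(theta, theta - 1.0**2 / theta)   # second number should be 0.5
```

Note that one root always comes out negative, which is the subject of Exercise 7.4.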
Exercise 7.4 Show that one value of θ in Equation 7.13 is always negative. Is
this a physical solution? If it is, then what does it correspond to? If it isn’t, then
why does it occur in this equation?
Exercise 7.5 In general, is there a unique image position θ for any given
source position β? Also, is the reverse true — is there a unique source position β
for every image position θ? ■
7.3 Magnification
Gravitational lensing magnifies not only the sizes of distant galaxies, but also their
fluxes. This is no coincidence: it turns out that surface brightness (flux per unit
area on the sky) is conserved in gravitational lensing. We’ve outlined a proof
briefly in the box below, but this is only in case you’re unsatisfied with having
surface brightness conservation unproven. We won’t use the proof later in the
book.
Figure 7.8 The space volume Vs = Ac δt and momentum space volume Vp = |p|² Δ|p| ΔΩ = (p0)² Δp0 ΔΩ of photons hitting the detector in a time δt. To make the figure clearer, we've flipped the momentum diagram and shown the detector as an emitter instead of a receiver. [The spatial volume has depth δz = c δt; the momentum-space volume is bounded by momenta p1, p2 and p3.]
In a short time δt the detector receives the photons from a volume Vs = Ac δt.
Those photons have energy E → E + δE, and since E = pc for photons,
their z-axis momenta are E/c → E/c + δE/c. Their momentum space
volume is Vp = (1/c³)E² δE ΔΩ (see Figure 7.8). Putting this together with
the spatial volume, we find that our N photons have phase space density
\[ \rho_{\mathrm{phase}} = \frac{N}{V_s V_p} = \frac{N c^3}{A c\,\delta t\, E^2\, \delta E\, \Delta\Omega} = \frac{N c^3}{h^3\, A c\,\delta t\, \nu^2\, \delta\nu\, \Delta\Omega} = \mathrm{constant}, \tag{7.14} \]
where for the second step we used E = hν, with h being Planck’s constant
and ν the frequency, and the last step is just stating Liouville’s theorem.
The surface brightness is the amount of energy per unit area, per unit solid angle, per unit time and per unit frequency; it is proportional to ν³ρphase, so a constant phase space density implies that the surface brightness is conserved.
So the flux of an object with a uniform surface brightness Iν and an area Ω on the
sky is Sν = Iν × Ω. Lensing increases the area to Ωlensed = µΩ (where µ is the
magnification factor), and Iν is the same, so the lensed flux is
Sν,lensed = Iν Ωlensed = Iν µΩ = µSν .
But hang on — doesn’t surface brightness conservation violate energy
conservation? We’ve conserved surface brightness and made the image bigger, so
where have the extra photons come from? In fact, it’s still consistent with energy
conservation. Part of the answer is that photons are being redirected, so in some
directions the background source could be demagnified. Another part of the
answer is that you must take account of the spatial curvature around the lens: the
photons from the background source are now being spread over slightly less than
4π steradians. There is still the same number of photons, but they’re being
distributed over slightly less space.
The magnification factor of an image is therefore equal to the factor increase
of the image’s area on the sky. If the lens is circularly symmetric, then the
magnification is given by
\[ \mu = \frac{\theta}{\beta}\,\frac{\mathrm{d}\theta}{\mathrm{d}\beta}, \tag{7.16} \]
where θ and β are as given in Figure 7.7.
Exercise 7.6 Show by differentiating Equation 7.12 (or otherwise) that lensing
by a point mass (a special case of circular symmetry) gives rise to a magnification
\[ \mu = \left[1 - \left(\frac{\theta_E}{\theta}\right)^4\right]^{-1}, \tag{7.17} \]
where, as we’ve seen, θ has two possible values for any source position β.
Exercise 7.7 If an image is within the Einstein radius, i.e. θ < θE , then the
magnification in Equation 7.17 is negative. Is this a physical solution? If it is,
what does this correspond to? If it isn’t, why does this occur in this equation?
(Hint: Why would µ in Equation 7.16 be negative?) ■
We can write the total magnification caused by a point mass as µ = |µ1 | + |µ2 |,
where µ1 and µ2 are the magnifications of each of the two images. After a little
algebra, it turns out that the total magnification of a point mass is
\[ \mu = |\mu_1| + |\mu_2| = \frac{2 + (\beta/\theta_E)^2}{(\beta/\theta_E)\sqrt{(\beta/\theta_E)^2 + 4}}. \]
This has the remarkable property that it is always larger than 1, for any β or θE !
Again, doesn’t this violate energy conservation? Again, it doesn’t. Putting a point
mass lens into the Universe couldn’t change the number of photons that the
background source put out, but it would change the volume over which they are
distributed, because the point mass has a spatial curvature around it. Just like with
our discussion of the surface brightness conservation above, the same number of
photons is being distributed over slightly less than 4π steradians because of this
curvature, so if we compare a universe without the lens (more volume) to one with
the lens (less volume), it’s possible for the magnification always to be > 1.
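The total magnification formula can be checked against Equations 7.13 and 7.17 numerically; a short sketch (illustrative values, with angles in units of θE):

```python
import math

def point_lens_magnifications(beta, theta_e):
    """|mu| of the two point-mass images (Equations 7.13 and 7.17)."""
    root = math.sqrt(beta**2 + 4 * theta_e**2)
    mags = []
    for theta in ((beta + root) / 2, (beta - root) / 2):
        mags.append(abs(1.0 / (1.0 - (theta_e / theta)**4)))  # Equation 7.17
    return mags

def total_magnification(beta, theta_e):
    """Closed-form total magnification of a point-mass lens."""
    u = beta / theta_e
    return (u**2 + 2) / (u * math.sqrt(u**2 + 4))

mu1, mu2 = point_lens_magnifications(1.0, 1.0)
print(mu1 + mu2, total_magnification(1.0, 1.0))   # both ~1.342; always > 1
```

Trying other source positions confirms that the total never drops below 1, however large β becomes.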
Gravitational lens magnification has a curious effect on the source counts of
extragalactic objects. We can imagine putting a population of lenses between
ourselves and some extragalactic background objects. These lenses will give
each extragalactic background object a random magnification |µ|, which has
a probability distribution Pr(|µ|). Lenses are generally quite sparse on the
sky, so Pr(|µ|) will have a sharp peak close to |µ| = 1. (We’ll ignore any
redshift-dependence of this probability for the purposes of demonstration.)
The magnification |µ| can be less than 1, in general, so some objects could be demagnified, while some are boosted in flux.

The underlying magnification probability is Pr(|µ|), but the observed magnification histogram could look very different. Imagine surveying the sky for background objects with an observed flux of S0, and suppose that these background objects have power-law source counts around S0, i.e. dN/dS ∝ S−α, with α being some constant. There will be a few objects brighter than S0 that are demagnified, so appear to have flux S0. However, there will be many more objects fainter than S0, as shown in Figure 7.9, some of which have a high µ so appear to have flux S0. The net effect is that high magnifications will be over-represented, compared to what you'd expect from the shape of Pr(|µ|). The steeper the source counts, i.e. the higher the value of α, the more high-magnification objects you'd find. This is known as magnification bias and may be an important new way of finding gravitational lenses, as we'll see in Section 7.11.

Figure 7.9 The magnification bias effect. At any flux S0, there are many objects fainter than S0, some of which will be magnified to the flux S0. There are far fewer objects brighter than S0, of which again some will be demagnified to S0. The asymmetry between the brighter and fainter populations changes the distribution of magnifications for objects with a fixed observed flux of S0. The steeper the source count slope, the more the observed magnification distribution is skewed towards higher magnifications. [The plot shows log10(dN/dS) against log10(S), with S0 marked.]

If the lens does not have circular symmetry, the magnification calculation is a little more complicated. The mapping from source position β = (βx, βy) to image position θ = (θx, θy) is in general done with a matrix A: a small change in β, dβ = (dβx, dβy), relates to dθ via dβ = A dθ, where
\[ A = \frac{\partial \boldsymbol{\beta}}{\partial \boldsymbol{\theta}} = \begin{pmatrix} \partial\beta_x/\partial\theta_x & \partial\beta_x/\partial\theta_y \\ \partial\beta_y/\partial\theta_x & \partial\beta_y/\partial\theta_y \end{pmatrix}. \tag{7.18} \]
(Equation 7.16 is a special case of Equation 7.18 for circular symmetry.)

To calculate the magnification, we want to know how the background source area (proportional to dβ²) relates to the observed image area (proportional to dθ²). This comes out as
\[ \frac{\mathrm{d}\theta^2}{\mathrm{d}\beta^2} = \frac{1}{\det A}, \]
where det A means the determinant of the matrix A, sometimes written using
modulus signs:
\[ \det A = \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc. \tag{7.19} \]
For this reason the matrix A is sometimes called the inverse magnification
tensor. (If you need a reminder about what a tensor is, see the box below.) The
magnification tensor is M = A−1 , so 1/det A = det M .
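Nothing in Equation 7.18 requires the derivatives to be done analytically. As a sketch (using the point-mass lens of Equation 7.12, extended to two dimensions on the assumption of circular symmetry), we can build A by finite differences and recover the magnification of Equation 7.17:

```python
import numpy as np

THETA_E = 1.0   # Einstein radius, arbitrary angular units

def beta_of_theta(theta):
    """2-D point-mass lens equation: beta = theta*(1 - theta_E^2/|theta|^2)."""
    return theta * (1.0 - THETA_E**2 / np.dot(theta, theta))

def inverse_mag_tensor(theta, h=1e-6):
    """Equation 7.18, A = d(beta)/d(theta), by central finite differences."""
    a = np.zeros((2, 2))
    for j in range(2):
        step = np.zeros(2)
        step[j] = h
        a[:, j] = (beta_of_theta(theta + step) - beta_of_theta(theta - step)) / (2 * h)
    return a

theta = np.array([1.5, 0.0])
mu_numeric = 1.0 / np.linalg.det(inverse_mag_tensor(theta))
mu_analytic = 1.0 / (1.0 - (THETA_E / 1.5)**4)   # Equation 7.17
print(mu_numeric, mu_analytic)                   # both ~1.246
```

The same finite-difference recipe works for any lens model for which β(θ) can be evaluated, which is how numerical lens codes typically proceed.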
What is a tensor?
Suppose that you have two springs connected to wires as shown in
Figure 7.10a. Both springs have the same spring constant k. What is the
force on the object of mass M in this figure? By Hooke’s law, the force from
each spring is proportional to the displacement, so we have
F = (Fx , Fy ) = (−kx, −ky) = −k(x, y) = −kr, (7.20)
where r is the displacement vector.
Figure 7.10 (a) A mass M pulled by two springs. The springs have
frictionless rings that slide along bars that follow the x- and y-axes. The
displacement vector r is also shown. The resulting force vector F is aligned
with r (though in the opposite direction). (b) Now the mass M is pulled by
one spring along the y-axis direction but two along the x-axis direction. The
resulting force vector F is no longer aligned with the displacement vector r.
Now let’s put a second spring on the x-axis, as shown in Figure 7.10b.
What’s the force now? The force F is in a different direction to the
displacement r, and we can’t pull out the factor of k as we did in
Equation 7.20. But we could write it as a matrix:
\[ \boldsymbol{F} = (F_x, F_y) = (-2kx, -ky) = -\begin{pmatrix} 2k & 0 \\ 0 & k \end{pmatrix}\boldsymbol{r} = -K\boldsymbol{r}, \]
where K could be called, say, the ‘spring tensor’ by analogy to the spring
constant. So we can think of this tensor as a matrix that operates on the
displacement vector r to give us the force vector F .
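The two-spring example can be written out directly; a toy sketch with k = 1 (the numbers are illustrative only):

```python
import numpy as np

k = 1.0
K = np.array([[2 * k, 0.0],    # two springs along the x-axis
              [0.0,   k  ]])   # one spring along the y-axis

r = np.array([1.0, 1.0])       # displacement at 45 degrees
F = -K @ r                     # force from the 'spring tensor'
print(F)                       # [-2. -1.]: no longer anti-parallel to r
```

With one spring on each axis (K = kI) the force would point straight back along −r; the anisotropic K is what rotates F away from the displacement direction.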
This is nearly sufficient to define this type of tensor, but not quite. Not every
matrix can be a tensor, because a tensor must obey certain transformation
rules. You may not have been aware of this, but the definition of a vector
7.4 The singular isothermal sphere model
includes the fact that it obeys the right transformation laws. In Galilean
relativity, any spatial three-vector must by definition obey the Galilean
transformation, so its length and direction are observer-independent. If they
aren’t, it’s not a vector. Similarly, in special relativity, a four-vector must by
definition obey the Lorentz transformation and have a Lorentz-invariant
‘length’ (such as the interval Δs in the case of the position four-vector). The
definition of a tensor is that it must obey similar transformation laws.
This takes us beyond the scope of this book, but it’s one of the key ideas
underpinning the beautiful theory of general relativity.
We’ve discussed only two-dimensional tensors here, but there can be
higher-order ones too, e.g. cubical arrays or hypercubes of numbers.
(The distance ξ was shown in Figure 7.7.) This mass distribution is known as the
singular isothermal sphere. You’ve seen already why this spherically-symmetric
distribution is isothermal. It’s called ‘singular’ because the mass density and
surface density tend to infinity as r and ξ respectively tend to zero. There are
various modifications that can be made to the model to avoid this singularity. The
total mass enclosed within a projected distance ξ is just
\[ M(\xi) = \int_0^\xi \Sigma(\xi')\, 2\pi\xi'\, \mathrm{d}\xi' = \frac{\pi\sigma_v^2}{G}\,\xi. \tag{7.23} \]
● Can the singular isothermal sphere model be extended to infinity?
❍ M (r) ∝ r, so the mass would tend to infinity. Therefore in practice this
model has to be truncated at some radius (typically > θE ) for it to be physical.
What about gravitational lensing by a singular isothermal sphere? By Birkhoff’s
theorem (Chapters 4 and 6), the deflection by any spherically-symmetric mass
distribution will depend only on the mass within the angular distance ξ, i.e. M (ξ):
\[ \hat{\alpha} = \frac{4GM(\xi)}{c^2\,\xi}, \tag{7.24} \]
which comes out as
\[ \hat{\alpha} = 4\pi\,\frac{\sigma_v^2}{c^2} \simeq (1.4'')\left(\frac{\sigma_v}{220\ \mathrm{km\,s^{-1}}}\right)^2 \]
(compare Equation 7.5). Similarly, the Einstein radius is
\[ \theta_E = \sqrt{\frac{4GM(\theta_E)}{c^2}\,\frac{D_{LS}}{D_L D_S}} \tag{7.25} \]
(compare Equation 7.9), so
\[ \theta_E^2 = \frac{4GM(\theta_E)}{c^2}\,\frac{D_{LS}}{D_L D_S} = \frac{4GM(\theta_E)}{c^2}\,\frac{D_{LS}\,\theta_E}{D_S\,\xi} = \frac{4G}{c^2}\,\frac{\pi\sigma_v^2\,\xi}{G}\,\frac{D_{LS}\,\theta_E}{D_S\,\xi}, \]
thus
\[ \theta_E = 4\pi\,\frac{\sigma_v^2}{c^2}\,\frac{D_{LS}}{D_S}. \tag{7.26} \]
We can use the scalar lens equation because this lens is circularly symmetric:
β =θ−α (Eqn 7.3)
(see Figure 7.7). Remember that α can be positive or negative.
If β = 0 (i.e. the source is directly behind the lens), then θ = ±θE. The lens equation
is therefore
β = θ ± θE. (7.27)
If β > θE , this gives only one possible solution, θ = β + θE . However, if β < θE ,
there is also a negative solution for θ, i.e. on the other side:
θ = β ± θE . (7.28)
This solution is shown in Figure 7.11a.
7.5 Time delays and the Hubble parameter
Figure 7.11 Graphical representation of the gravitational lens solution for (a) a singular isothermal sphere in Equation 7.27, (b) an isothermal sphere with a smoothed-out density profile in the core, sometimes called a 'softened isothermal sphere'. [Both panels plot β against θ, with ±θE marked; the intersections give two images in panel (a) and three in panel (b).]
If we write θ± for the two images, then the magnifications from Equation 7.16
come out as
\[ \mu_\pm = \frac{\theta_\pm}{\beta} = 1 \pm \frac{\theta_E}{\beta} = \left(1 \mp \frac{\theta_E}{\theta_\pm}\right)^{-1}. \tag{7.29} \]
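These singular isothermal sphere results are easy to check numerically. A sketch (not from the book; the 1.4″ check assumes σv = 220 km s⁻¹ and, for simplicity, DLS/DS = 1, and the image example uses angles in units of θE):

```python
import math

C = 2.998e8   # speed of light, m/s
ARCSEC = math.pi / (180 * 3600)

def sis_einstein_radius(sigma_v, dls_over_ds):
    """Equation 7.26: theta_E (radians) of a singular isothermal sphere."""
    return 4 * math.pi * sigma_v**2 / C**2 * dls_over_ds

theta_e = sis_einstein_radius(220e3, 1.0)
print(theta_e / ARCSEC)    # ~1.4 arcseconds, as quoted below Equation 7.24

def sis_images(beta, theta_e):
    """Equations 7.28 and 7.29: SIS image positions and magnifications."""
    images = [(beta + theta_e, 1 + theta_e / beta)]
    if beta < theta_e:                       # second image on the far side
        images.append((beta - theta_e, 1 - theta_e / beta))
    return images

for theta, mu in sis_images(0.4, 1.0):
    print(theta, mu)       # (1.4, +3.5) and (-0.6, -1.5)
```

The negative magnification of the inner image is the mirror-reversed (negative parity) solution discussed for saddle points in Section 7.6.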
Strictly speaking, there would be a third image at θ = 0, because a single photon
shot straight through the middle could not be deviated (by symmetry). However,
this can only come from a zero-sized point in the background source, and the flux
of this central image comes out at zero. There could be a non-zero central image
if the central density cusp in the singular isothermal sphere mass distribution is
smoothed out somehow, which must be the case if the density profile is physical.
An example is shown in Figure 7.11b. This faint central image then appears
even if β ≠ 0. The magnification of this image is typically |µ| < 1, i.e. it's
demagnified. In general, more complicated density profiles also have faint central
images that depend on the central mass distribution. One of the aims of the new
eMERLIN array of radio telescopes in the UK (see Figure 7.12) is to detect faint
central images in order to determine the density profiles at the centres of galaxies.
Exercise 7.8 Suppose that the lens is an infinite sheet of matter with a constant surface density Σ. Show that the deflection angle α is given by
\[ \alpha(\theta) = \frac{4\pi G\Sigma}{c^2}\,\frac{D_L D_{LS}}{D_S}\,\theta. \]
Next suppose that Σ takes the critical value
\[ \Sigma_{\rm cr} = \frac{c^2}{4\pi G}\,\frac{D_S}{D_L D_{LS}}. \tag{7.30} \]
What will happen? Do gravitational lenses in general focus light? ■

Figure 7.12 The Lovell telescope at Jodrell Bank, near Manchester in England. This telescope is part of the eMERLIN array of radio telescopes.
mass, which curiously turned out to be exactly a factor of two more than the
Newtonian prediction. We’ll shed a little more light on that here.
In Section 7.1 we found the Newtonian deflection as
\[ \phi_{\mathrm{Newtonian}} = \frac{v_y}{c} = \int_{\theta=-\pi/2}^{+\pi/2} \frac{GM}{bc^2}\cos\theta\,\mathrm{d}\theta = \frac{1}{c}\int_{t=-\infty}^{\infty} \frac{GM}{r^2}\cos\theta\,\mathrm{d}t. \]
Now, (GM/r2 ) cos θ is the gradient of the gravitational potential, Φ = −GM/r,
in the y-axis direction in Figure 7.6. We can write this as ∇⊥ Φ, where ⊥ refers to
differentiation being made along a direction perpendicular to the direction of
motion of the particle. (Again, we’re treating this as effectively the same thing as
the y-axis direction, because the change in direction is small.) The deflection is
therefore
\[ \phi_{\mathrm{Newtonian}} = \frac{1}{c}\int_{-\infty}^{\infty} \nabla_\perp\Phi\,\mathrm{d}t. \tag{7.31} \]
i.e. a bit less than c (remember that Φ is negative). We could again think of the
lens as having an effective refractive index
\[ n = \sqrt{\frac{1 - 2\Phi/c^2}{1 + 2\Phi/c^2}} \simeq 1 - \frac{2\Phi}{c^2} \]
(using the first terms in a Taylor series expansion). This time, however,
∇⊥n = (2/c²) ∇⊥Φ, explaining the extra factor of two back in Section 7.1.
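The deflection integral can be checked numerically along the undeflected straight-line path, where ∇⊥Φ = GMb/(b² + x²)^{3/2}. A toy check in units where G = M = c = b = 1 (so the Newtonian answer should be 2 and the general-relativistic one 4):

```python
import numpy as np

x = np.linspace(-1000.0, 1000.0, 2_000_001)   # coordinate along the straight path
grad_perp_phi = 1.0 / (1.0 + x**2) ** 1.5     # GMb/(b^2 + x^2)^(3/2) with G=M=b=1

# Trapezoid rule (written out explicitly, so it works on any NumPy version)
dx = x[1] - x[0]
integral = float(np.sum(grad_perp_phi[:-1] + grad_perp_phi[1:]) * dx / 2)

phi_newtonian = integral        # (1/c^2) * integral of grad_perp Phi dx, Eq 7.31
phi_gr = 2.0 * integral         # the extra factor of two, Eqs 7.2 and 7.36
print(phi_newtonian, phi_gr)    # ~2 and ~4, i.e. 2GM/(b c^2) and 4GM/(b c^2)
```

The truncation of the integral at |x| = 1000b costs only about one part in a million, since the integrand falls off as x⁻³.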
A photon takes time (1/c) dℓ to travel a distance dℓ in empty flat space. If there is a refractive index n, the time spent is (n/c) dℓ. Therefore putting a gravitational lens in between the source and the observer will induce a total time delay of
\[ \Delta t = \int_{\rm source}^{\rm observer} \frac{1}{c}\,\mathrm{d}\ell - \int_{\rm source}^{\rm observer} \frac{n}{c}\,\mathrm{d}\ell = \int_{\rm source}^{\rm observer} \frac{1-n}{c}\,\mathrm{d}\ell = \int_{\rm source}^{\rm observer} \frac{2\Phi}{c^3}\,\mathrm{d}\ell, \tag{7.35} \]
where the integrations are done over the light path from the source to the observer. This is known as the Shapiro delay, after its discoverer. Two different images of a background source would have two different path lengths and experience different potentials, so in general we should expect there to be a relative time delay between different images of a background source.

This leads to an ingenious method of finding the Hubble parameter H0. Most of the lensing equations that we've derived up to now have been dimensionless. For example, angles are dimensionless, and DLS/DS is dimensionless. Therefore there's no way to use the lens configuration or arrangement of images to determine the absolute size scale of the lens system (see, for example, Figure 7.14). However, the Shapiro delay is proportional to the path length from the source to the observer. Cosmological distances are proportional to c/H0 (see Chapter 1), so the time delay between two images will be Δt ∝ (1/H0) × a number that depends on the lens mass model. So if we can find a mass model of the lens that reproduces the lens geometry (e.g. image configurations, lens redshift and source redshift), we can predict the value of H0Δt; then by measuring Δt we can infer the Hubble parameter!

This has been done in several lenses, such as the quasar QSO 0957+561. The main uncertainty in this experiment is the mass model. (This uncertainty is much larger than the effect that varying ΩΛ or Ωm would have on the lens geometry.) Also, the time delay itself can sometimes be hard to discern from the data. A recent compilation of time delays from 10 different gravitational lens systems found an average Hubble parameter of H0 = 72^{+8}_{−11} km s⁻¹ Mpc⁻¹ (Saha, P. et al., 2006, Astrophysical Journal Letters, 650, L15).

Figure 7.14 Schematic view of how the geometry of a gravitational lens depends on the Hubble parameter H0. The lens is marked as L, while the source and observer are S and O, respectively. It's not possible to tell from the positions of images alone what the absolute size scale of the system is, but the time delay between two different images can give an absolute scale and hence H0. [Two sketches, labelled 'large H0' and 'small H0', show the same image configuration at different physical scales.]
It follows from Section 7.5 that the deflection from a gravitational lens is
\[ \phi = \frac{2}{c^2}\int_{-\infty}^{\infty} \nabla_\perp\Phi\,\mathrm{d}x, \tag{7.36} \]
where Φ is the Newtonian potential. This deflection is also the angle α̂ in Figure 7.7. The observed deflection α will therefore be
\[ \boldsymbol{\alpha} = \frac{2}{c^2}\,\frac{D_{LS}}{D_S}\int \nabla_\perp\Phi\,\mathrm{d}x, \tag{7.37} \]
where we have switched to the more general vector notation. The lens equation is therefore
\[ \boldsymbol{\beta} = \boldsymbol{\theta} - \boldsymbol{\alpha} = \boldsymbol{\theta} - \frac{2}{c^2}\,\frac{D_{LS}}{D_S}\int \nabla_\perp\Phi\,\mathrm{d}x. \tag{7.38} \]
c DS
We could rewrite this in a simpler-looking form as
β = θ − ∇θ ψ (7.39)
if we can find a suitable new function ψ. Here ∇θ means derivatives with respect
Note that we’re not equating two to θ, i.e.
numbers or variables, but rather C D
∂ ∂
two operators. This is a subtle ∇θ = , . (7.40)
∂θx ∂θy
but radical change in the use of
the = sign. The simplest choice of ψ that works is
*
DLS 2
ψ(θ) = Φ dx. (7.41)
DL DS c2
This is sometimes called the scaled projected Newtonian potential. It’s related to
the deflection angle α through
∇θ ψ = α. (7.42)
We can then rewrite the lens equation as
\[ 0 = \boldsymbol{\theta} - \boldsymbol{\beta} - \nabla_\theta\psi = \nabla_\theta\left[\tfrac{1}{2}(\boldsymbol{\theta} - \boldsymbol{\beta})^2 - \psi\right]. \tag{7.43} \]
To see what the term in square brackets means, here is the corresponding equation for the time delay:
\[ \Delta t(\boldsymbol{\theta}) = \frac{(1 + z_L)}{c}\,\frac{D_L D_S}{D_{LS}}\left[\tfrac{1}{2}(\boldsymbol{\theta} - \boldsymbol{\beta})^2 - \psi\right] = \Delta t_{\rm geom} + \Delta t_{\rm grav}, \tag{7.44} \]
where zL is the redshift of the lens. We won’t prove this directly (it would take us
too far off-topic); instead, we’ll point out some general features. The two terms in
the square brackets correspond to a gravitational Shapiro time delay (Δtgrav )
involving the projected potential ψ, and a geometrical time delay (Δtgeom )
involving the angular offset between β and θ. The geometrical term is caused by
the fact that the light ray is simply travelling further in getting around the lens.
The factor of (1 + zL ) is necessary because a time delay of Δt as the light passes
the lens will be time dilated by an additional factor of (1 + zL ) by the time it’s
received on the Earth.
Together, Equations 7.43 and 7.44 imply that ∇θ Δt(θ) = 0. This means that we
find images at stationary points in the time delay. This is a cosmological version
of Fermat’s principle. We’ll see in Section 7.9 that this projected potential ψ can
also be related to the projected mass density Σ.
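For a point-mass lens the projected potential satisfying ∇θψ = α = θE²/θ is ψ(θ) = θE² ln|θ| (this particular form is assumed here rather than derived in the text). A quick numerical check, in units of θE, that the images of Equation 7.13 sit at stationary points of the bracket in Equation 7.44:

```python
import math

THETA_E, BETA = 1.0, 0.5

def fermat(theta):
    """The bracket in Equation 7.44: 0.5*(theta - beta)^2 - psi, with the
    point-mass projected potential psi = theta_E^2 * ln|theta|."""
    return 0.5 * (theta - BETA)**2 - THETA_E**2 * math.log(abs(theta))

def dfermat(theta, h=1e-6):
    """Central finite-difference gradient of the time-delay surface."""
    return (fermat(theta + h) - fermat(theta - h)) / (2 * h)

# Images from Equation 7.13 sit at stationary points of the time delay
root = math.sqrt(BETA**2 + 4 * THETA_E**2)
for theta in ((BETA + root) / 2, (BETA - root) / 2):
    print(theta, dfermat(theta))    # gradient ~0 at each image
```

The gradient of the Fermat potential is θ − β − θE²/θ, which vanishing is exactly the point-mass lens equation (Equation 7.12), so the check is really a consistency test of Equations 7.43 and 7.44.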
7.6 Caustics and multiple images
The time delay is sometimes called the time delay surface since it varies in
general with both θx and θy on the sky. Images will form at the stationary
points of this surface (minima, maxima, saddle points and points of inflection).
However, if the lens is circularly symmetric, we need to consider only one axis.
Figure 7.15 shows how the two components of the time delay vary with position
for a particular circularly-symmetric lens. Note the images at the three stationary
points in the time delay. If we move the position of the background source, the
geometric time delay component moves (see Figure 7.16), which changes the
shape of the total time delay.
Figure 7.15 The geometric and gravitational time delays for a particular circularly-symmetric lens. The position of the source is marked as β, while the gravitational component peaks at the centre of the lens (marked with a dotted line). There are three images marked as black dots that occur at stationary points in the total time delay curve. [The plot shows tgeom, tgrav and ttotal against position on the sky.]

Figure 7.16 The variation of the total time delay and the positions of the images, as the position of the background source is changed. The lens is closely aligned with the background source in the top panel, offset in the central panel, and offset by more in the bottom panel. Note how the leftmost image merges with the central image, and the combined image disappears.
Notice in Figure 7.16 how the image in the left-hand minimum point merges with
the image at the maximum, then they vanish. Images can only be created and
destroyed in pairs, because creating a new minimum means that we must also
create a new maximum. Therefore, provided that the lens is non-singular, there
must always be an odd number of images (sometimes called the odd-number
theorem). This is also true in the general non-circularly-symmetric case.
Another nice feature of these time delay curves is that the time delay between two
images is the vertical distance between them in these plots. In Figure 7.15, for
example, the image furthest from the lens will vary first. This is often the case in
cosmological lens configurations.
Exercise 7.10 Suppose that you have a softened isothermal sphere potential,
like the one in Figure 7.11b, and you gradually let the potential in the centre get
deeper, so it looks more and more like the singular isothermal sphere model in
Figure 7.11a. What happens to the time delay of an image seen right through the
centre? And where does the image go when the lens potential becomes exactly a
singular isothermal sphere? ■
The images that form at maxima, minima and saddle points are each quite
different in character. How can we find whether images are minima or maxima?
In one-dimensional calculus, a function y(x) with a stationary point at x = x0 has
dy(x0 )/dx = 0. This point is a minimum if d2 y/dx2 > 0 there, a maximum if
d2 y/dx2 < 0, and a point of inflection if d2 y/dx2 = 0. The two-dimensional
equivalent is to consider the matrix
\[ T = \begin{pmatrix} \mathrm{d}^2 t/\mathrm{d}\theta_x\mathrm{d}\theta_x & \mathrm{d}^2 t/\mathrm{d}\theta_x\mathrm{d}\theta_y \\ \mathrm{d}^2 t/\mathrm{d}\theta_y\mathrm{d}\theta_x & \mathrm{d}^2 t/\mathrm{d}\theta_y\mathrm{d}\theta_y \end{pmatrix}. \tag{7.45} \]
The criteria are more complicated than in the one-dimensional case. They rely on
the determinant and the trace of the matrix. We defined the determinant of a 2 × 2
matrix in Equation 7.19, while the trace of a 2 × 2 matrix is defined as
\[ \operatorname{tr} A = \operatorname{tr}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = a + d. \tag{7.46} \]
The criteria are given in Table 7.1.
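Table 7.1 itself is not reproduced in this extract, but the standard two-dimensional second-derivative test that it encodes can be sketched as follows (the classification rules below are the usual ones, assumed rather than copied from the table):

```python
def classify_stationary_point(t_matrix):
    """Classify a stationary point of the time-delay surface from the
    matrix T (Equation 7.45), using its determinant and trace."""
    (a, b), (c, d) = t_matrix
    det, tr = a * d - b * c, a + d        # Equations 7.19 and 7.46
    if det < 0:
        return "saddle point"
    if det > 0:
        return "minimum" if tr > 0 else "maximum"
    return "degenerate"                   # det = 0: higher-order test needed

print(classify_stationary_point([[2, 0], [0, 1]]))    # minimum
print(classify_stationary_point([[-1, 0], [0, -3]]))  # maximum
print(classify_stationary_point([[1, 2], [2, 1]]))    # saddle point
```

Since T is proportional to the inverse magnification tensor (shown below in Equation 7.48), the same determinant also gives the image magnification.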
We have already met something like the T matrix in a different form. If we
differentiate Equation 7.44 twice, we find that
\[ T \propto \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} \mathrm{d}^2\psi/\mathrm{d}\theta_x\mathrm{d}\theta_x & \mathrm{d}^2\psi/\mathrm{d}\theta_x\mathrm{d}\theta_y \\ \mathrm{d}^2\psi/\mathrm{d}\theta_y\mathrm{d}\theta_x & \mathrm{d}^2\psi/\mathrm{d}\theta_y\mathrm{d}\theta_y \end{pmatrix}. \tag{7.47} \]
Back in Section 7.3 we met the inverse magnification tensor, which we defined
as A = ∂β/∂θ (Equation 7.18). If we use the lens equation to expand this
(Equation 7.4, β = θ − α), we find that
\begin{align*}
A &= \begin{pmatrix} \partial\beta_x/\partial\theta_x & \partial\beta_x/\partial\theta_y \\ \partial\beta_y/\partial\theta_x & \partial\beta_y/\partial\theta_y \end{pmatrix} \\
&= \begin{pmatrix} \partial(\theta_x-\alpha_x)/\partial\theta_x & \partial(\theta_x-\alpha_x)/\partial\theta_y \\ \partial(\theta_y-\alpha_y)/\partial\theta_x & \partial(\theta_y-\alpha_y)/\partial\theta_y \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} \partial\alpha_x/\partial\theta_x & \partial\alpha_x/\partial\theta_y \\ \partial\alpha_y/\partial\theta_x & \partial\alpha_y/\partial\theta_y \end{pmatrix} \\
&= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} \mathrm{d}^2\psi/\mathrm{d}\theta_x\mathrm{d}\theta_x & \mathrm{d}^2\psi/\mathrm{d}\theta_x\mathrm{d}\theta_y \\ \mathrm{d}^2\psi/\mathrm{d}\theta_y\mathrm{d}\theta_x & \mathrm{d}^2\psi/\mathrm{d}\theta_y\mathrm{d}\theta_y \end{pmatrix}, \tag{7.48}
\end{align*}
where we’ve used α = ∇θ ψ in the last step. Therefore the matrix T is just
proportional to the inverse magnification tensor A.
One consequence of T ∝ A = M −1 is that we can immediately say what the
magnifications of the different types of images are, because µ = 1/det A. These
magnifications are listed in Table 7.1. The saddle point images also have the
curious property of having negative parity, i.e. being mirror-reversed.
Another consequence of T ∝ A = M −1 is that the curvature of the time delay
surface is proportional to inverse magnification, so if the surface is more curved,
the image is less magnified.
Table 7.1 The types and properties of gravitational lens images, and how to
identify them from the matrix A or T . (We show in the text that T is proportional
to A.)
7.8 Microlensing
In 1936 Einstein published a short note about the gravitational amplification that would occur if two stars happen to appear very close in projection on the sky, which has since been called microlensing, for reasons that will become clear. He wrote: 'there is no great chance of observing this phenomenon, even if dazzling by the light of the much nearer star . . . is disregarded.' He published this paper after being encouraged to investigate the effect by an amateur named Rudi Mandl (though unknown to both, Eddington and Chwolson had each published little-known papers on related effects). Einstein also wrote a private note to the journal editor saying: 'Let me also thank you for your cooperation with the little publication, which Mister Mandl squeezed out of me. It is of little value, but it makes the poor guy happy.' Einstein reckoned without the tremendous advances in optical imaging technology that have happened in the past few decades.

Einstein, A., 1936, Science, 84, 506. For more on this story, see Renn, J., Sauer, T. and Stachel, J., 1997, Science, 275, 5297.
We can get a rough idea of the probability of one star gravitationally lensing another from the Einstein radius. We found this for star–star lensing in Equation 7.11, with the result that it would be typically measured in milliarcseconds (10⁻³ of an arcsecond, which itself is 1/3600th of a degree). The number of stars per unit area on the sky varies, with higher densities closer to the Galactic plane. In crowded fields (for example, towards the Galactic bulge), it turns out that we'd expect of the order of one faint foreground star per square arcsecond. It may be too faint to detect on its own, but it might nevertheless be a potential lens. The probability of this foreground star lensing a background one would be of the order of θE²ρ, where ρ is the number of potential lenses per unit area on the sky, which comes out around 10⁻⁶. So, to detect this type of lensing, one would need to monitor millions of stars simultaneously. (A more careful calculation takes into account the fact that lenses close to the source or close to the Earth have smaller θE than ones more centrally placed.)

See, for example, Griest et al., 1991, Astrophysical Journal Letters, 372, L79.

In Einstein's time, wide-field optical astronomy could be done only with
photographic plates. Wide-field CCD arrays have now made microlensing
searches possible. Figure 7.18 shows one of the first discoveries of gravitational
microlensing, made with a long-term monitoring campaign of the Large
Magellanic Cloud. As the foreground lens passes in front of the background star,
the background star is gravitationally lensed and magnified. Note the similar
profiles in the red and blue filters: achromaticity is an important test that it is
gravitational lensing, and not some unknown type of variable star.
The original aim of microlensing searches was to detect clumps of dark matter,
which were given the acronym MACHOs (massive compact halo objects). These
clumps could be black holes, clumps of non-baryonic elementary particles, or
dark baryonic matter such as planetary-sized objects or cometary nuclei such as
are found in the Oort cloud of our Solar System. The team that made the early
detection in Figure 7.18 also named their survey ‘The MACHO Project’. For this
experiment one wants to avoid star–star lensing, so surveys for MACHOs have
been done outside the Galactic plane, e.g. towards the Large Magellanic Cloud.
Microlensing events are rarer outside the plane of the Galaxy. The initial results
suggested a large population of ∼0.5 M☉ lenses in the Galactic halo, but with
larger surveys the current best limit is that < 8% of the dark matter halo of the
Galaxy is made up of compact objects.
See, for example, Tisserand et al., 2007, Astronomy and Astrophysics, 469, 387.
Chapter 7 Gravitational lensing
[Figure: microlensing light curve panels showing the blue-filter magnification Ablue, the red-filter magnification Ared, and their ratio Ared /Ablue, plotted against days from 2 Jan 1992; the best fit has Amax = 6.86 and t̂ = 33.9.]
Figure 7.18 The light curve of one of the first observations of gravitational
microlensing events, also showing the best fit to the data. The best-fit maximum
magnification and timescales are quoted in the figure. Note that the amplification
is achromatic, as expected for lensing.
● Could all of the dark matter in the Universe be clumps of baryonic matter,
like free-floating Jupiters?
❍ No, because this would violate the Big Bang nucleosynthesis constraint on Ωb
(Chapter 2).
To describe gravitational microlensing, one ideally takes into account the finite
source size and limb darkening (stars not being uniformly bright circles), but a
good approximation is a point mass lens magnification (Equation 7.17). The
distances in this case are not cosmological, so we can just use Euclidean distances
in which DLS does equal DS − DL . Despite the fact that the lens is moving across
our line of sight to a background star, mathematical descriptions of microlensing
are simplest from the lens’s point of view, in which the lens is stationary but the
background source is moving. This is shown schematically in Figure 7.19.
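This lens-frame description leads directly to the classic microlensing light curve: the source crosses the lens plane at constant angular speed, so the impact parameter in Einstein radii is u(t) = √(u0² + ((t − t0)/tE)²), and the magnification follows the standard point-mass formula A(u) = (u² + 2)/(u√(u² + 4)), which we take to be the form of Equation 7.17. A minimal sketch with illustrative parameters (u0 and tE are assumptions, not values from the text):

```python
import math

def magnification(u):
    """Total magnification of a point-mass lens for impact parameter u
    (in units of the Einstein radius): A = (u^2 + 2) / (u * sqrt(u^2 + 4))."""
    return (u**2 + 2) / (u * math.sqrt(u**2 + 4))

def light_curve(t, t0=0.0, u0=0.15, tE=30.0):
    """Paczynski light curve: a source moving past a static lens.
    u0 = minimum impact parameter, tE = Einstein-radius crossing time (days)."""
    u = math.sqrt(u0**2 + ((t - t0) / tE)**2)
    return magnification(u)

# The curve is achromatic by construction: the same A(t) applies in every filter,
# which is the test applied to the red and blue light curves of Figure 7.18.
for t in (-60, -30, 0, 30, 60):
    print(f"t = {t:+4d} d  A = {light_curve(t):.2f}")
```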
Figure 7.20 Exaggerated view of weak lensing by the cosmic large-scale structure of matter. The shear
component of the gravitational magnification will tend to be aligned with the nearby large-scale structure (red), so
measured galaxy ellipticities (blue) on average will trace the foreground large-scale matter distribution.
7.9 Cosmic shear
[Figure: a source imaged under lensing with convergence alone, and with convergence + shear.]
Universe. A good test of whether there are systematic errors lurking in the data
analysis is to rotate the background galaxies by 45◦ . If the measured tangential
shear is due to gravitational lensing, then this 45◦ -rotated signal should be
consistent with zero. This is indeed what’s seen in Figure 7.23.
[Figure 7.23: the tangential shear ⟨γT⟩ and the 45°-rotated shear ⟨γX⟩ plotted against radius/arcmin; ⟨γX⟩ is consistent with zero.]
At the time of writing, LSST survey operations are planned to begin around 2016.
Figure 7.26 shows the projected sensitivity for the dark energy equation of
state parameters for the LSST, assuming that the systematics from intrinsic
alignments and point spread function variations can be well-characterized.
All these future and imminent surveys also seek to measure baryon wiggles,
high-redshift supernovae and the evolution of galaxy clustering (Chapter 3).
Besides intrinsic alignments, the main difficulty with ground-based optical
measurements of weak lensing is the characterization and stability of the point
spread function. There are two quite different solutions that other forthcoming
cosmic shear experiments will (or may) use. One solution is to move the telescope
above the Earth’s turbulent atmosphere. At the time of writing there are two major
space missions proposed to do this: the European Space Agency EUCLID
mission, and the NASA Joint Dark Energy Mission (JDEM). Both missions are
ambitious wide-field optical/near-infrared imaging and spectroscopy surveys
using a ∼1.2–1.5 m space telescope.
It has been proposed that the missions should merge and form a joint ESA/NASA
project. The other option is to use radio telescopes, because the angular resolution
of radio interferometry is not subject to the seeing limitations of ground-based
optical astronomy. Getting enough galaxies over a large enough sky area is
challenging for the current generation of radio telescopes, but the Square
Kilometre Array (SKA; see also Chapter 8) will revolutionize this field. The SKA
should be completed around the year 2020, though early science observations
with a subset of the array will happen in the preceding few years. These future
projects aimed at measuring cosmic shear may also be useful for finding new
strong gravitational lenses (Section 7.11).
7.10 Galaxy cluster lenses
of the sky at that wavelength. Outside the core of the cluster the magnification
factors are modest, but within the core the magnification factors of individual
background galaxies vary typically from around 2 to 10, so these images are in
addition up to 10 times deeper than can be achieved in unlensed parts of the sky.
7.11 Finding gravitational lenses
The SLOAN survey has also been the source of another large catalogue of
lenses. The SLOAN Lens ACS survey (SLACS) has found 131 galaxy–galaxy
lenses by searching the SLOAN spectra for an absorption-dominated redshift
combined with nebular emission lines (e.g. [O II] 372.7 nm or [O III] 500.7 nm) at
another, higher, redshift in the same spectrum. These lens candidates were
followed up with high-resolution imaging from the HST Advanced Camera for
Surveys (ACS). Figure 7.31 shows some of the beautiful lens systems from this
survey. The mass profile implied by these lenses is approximately isothermal
(Section 7.4), but on average the mass profile is not the same as the light profile.
The mass profile does not seem to have evolved since z = 1. Most of these
lenses are elliptical galaxies, because ellipticals tend to be massive galaxies, so
their cross section for lensing is higher. Typical Einstein radii are about an
arcsecond, with lens masses roughly in the range 1010 –1012 M☉. The SLACS
lenses also follow a fundamental plane (Chapter 3) that is consistent with the local
fundamental plane once luminosity evolution is accounted for.
Submm-wave surveys also have steep number counts (Chapter 5), particularly at
bright fluxes, so bright submm-wave galaxies should be more prone to
magnification bias (Section 7.3). At the time of writing, there are two forthcoming
surveys that may find many new lenses: the SCUBA-2 All-Sky Survey (SASSy)
and the Herschel ATLAS key project (Astrophysical Terahertz Large Area
Survey). Both projects aim to scan the sky quickly to a shallow sensitivity in
order to find the rare bright objects that may be lensed. Nearby galaxies that
make up the Euclidean slope of the counts will probably be easily excludable
by their cross-identifications with obvious nearby galaxies in optical surveys
Summary of Chapter 7
Chapter 8 The intervening Universe
Birth: the first and direst of all disasters.
Ambrose Bierce
Introduction
After the CMB, what made the first light in the Universe — early stars, or black
hole accretion? When were these first things created? In this final chapter, we
shall explore what we know of the very earliest objects in the Universe. Much of
this evidence comes from absorption lines, which we shall meet first. These
absorbers also usefully track the cosmic consumption of gas in star formation, and
give us a wonderful method of counting the total number of baryons in the
observable Universe.
Figure 8.1 The Lyman α forest in the spectrum of a quasar. The spectrum spans 4000–5600 Å (λ/Å on the horizontal axis).
[Figure: hydrogen energy levels, energy/eV: E1 = −13.6, E2 = −13.6/4, E3 = −13.6/9, E4 = −13.6/16, up to the continuum at zero.]
Figure 8.2 Hydrogen energy levels. The energy of an energy level is given by
En = −13.6 eV/n2 . The potential energy from the nucleus is shown as a black
curve.
On the left-hand side of the Lyman α line, i.e. at shorter wavelengths, the
spectrum seems much noisier. This is not noise; it is the Lyman α absorption lines
from neutral hydrogen clouds between us and the quasar. The absorbing atoms
each have an electron in the n = 1 energy level that is promoted to n = 2
using an absorbed photon's energy. Since the clouds are at lower redshift than
the quasar, their Lyman α absorption is less redshifted, so appears at shorter
wavelengths. Figure 8.3 shows this schematically. (Note that 'Lyman α' is
sometimes abbreviated as Ly α.)
Note that n = 1 → 2 is absorption of a photon and a promotion of the electron,
while n = 2 → 1 is emission of a photon and demotion of the electron.
[Figure 8.3: schematic of the light path from a quasar past cloud 1 and cloud 2 to Earth, with the spectrum Fλ (λ) shown at each stage acquiring successive Lyman α absorption lines.]
These Lyman α clouds, collectively called the Lyman α forest, are another of
the few ways that astronomers can view the non-luminous Universe. From a
knowledge of the cross section of the absorption (see the box in Chapter 3),
which one can measure in a laboratory, one can calculate the projected number of
hydrogen atoms along this line of sight, per unit area. This is known as the
column density of the absorption, NH I .
8.1 The Lyman α forest
What are these absorbers? Are they intervening galaxies, for example? It turns
out that they are not — or at least, there is no one-to-one correlation between
intervening Lyman α absorbers and galaxies that appear to be close in projection
on the sky. Low-redshift absorbers are more likely close to gas-rich local galaxies;
nonetheless, galaxy haloes cannot account for all Lyman α clouds.
Instead, it seems that these absorbers are clumps of intergalactic material that
(for the most part) have not yet condensed to form galaxies. They would be
undetectable, were it not for the fact that they absorb light from background
quasars. The distribution of these primordial clumps is not subject to most of
the complicated physics that determines the distribution of galaxies, such as
non-linear gravitational collapse and feedback. The Lyman α forest can therefore
be used as a tracer of the underlying matter distribution. This is very useful for
testing cosmological models, as we shall see in the next section.
It may surprise you to read that Lyman α clouds exist even in the present-day
Universe. It was once imagined that galaxy formation was something that
happened only early in the history of the Universe, but more recently galaxy
formation has been seen as an ongoing process. There are even nearby galaxies
that seem to have formed all their stars very recently, such as the galaxy I Zw 18
(Figure 8.4). We imagine that a pre-existing puddle of neutral hydrogen has been
disturbed or interacted with in some way that has triggered the formation of stars
within it. It’s not clear what the triggers were for I Zw 18, however.
Exercise 8.4 Assuming that Jν ∝ ν −α , show that the optical depth is τ > 1
when NH I > 1.3 ((α + 3)/α) × 1021 m−2 . This is known as self-shielding. ■
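As a quick numerical check on the result of Exercise 8.4, the self-shielding threshold 1.3 ((α + 3)/α) × 1021 m−2 can be evaluated across a plausible range of ambient spectral slopes (the particular α values below are illustrative):

```python
def self_shielding_threshold(alpha):
    """Column density (m^-2) above which tau > 1 for ionizing photons,
    for an ambient spectrum J_nu proportional to nu**(-alpha)."""
    return 1.3 * (alpha + 3) / alpha * 1e21

for alpha in (0.5, 1.0, 2.0):
    print(f"alpha = {alpha}: N_HI > {self_shielding_threshold(alpha):.2e} m^-2")
```

A steeper (softer) ionizing spectrum raises the threshold, since relatively fewer photons lie above the Lyman limit.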
8.2 Comparison with cosmological simulations
For reference, quasar spectra have α around 0.5–1 in the ultraviolet, while
galaxies have redder spectra. Taking account of intervening absorption, a
Lyman α cloud might experience an ambient light spectrum of α ≈ 2, depending
on redshift.
[Figure 8.5 shows the normalized spectrum (flux/continuum flux) around Lyman α; Figure 8.6 shows the Ly 7 to Ly 11 transitions in panels of normalized flux against relative velocity/km s−1 .]

Figure 8.5 Damped Lyman α system in the quasar QSO 0913+072. The
spectrum has been divided by the expected quasar flux, so a flux of 1.0 means no
absorption. The red line shows the best-fit Voigt profile (see Section 8.5).

Figure 8.6 Hydrogen and deuterium absorption in the quasar QSO 0913+072.
The spectrum has been divided by the expected quasar flux, so a flux of 1.0
means no absorption. In this notation, Lyman α is Ly1, Lyman β is Ly2, and so
on. The x-axis units are km s−1 relative to the quasar hydrogen Lyman series
absorption. The red lines mark the expected positions of the hydrogen and
deuterium absorption. Ly11 is off to the left of the bottom panel.

This absorber was chosen to have very few associated absorption lines from
heavier elements. Higher Lyman transitions have lower optical depth, and in
Figure 8.6 the companion deuterium Lyman lines can be seen, slightly blueward
of the hydrogen absorption. The [D/H] abundance depends on the hydrogen
column density determined from the Lyman α profile in Figure 8.5, which is
difficult to fit to given the presence of other intervening Lyman α absorbers. (This
is the principal source of systematic uncertainties.) The deuterium abundance is
log10 [D/H] = −4.56 ± 0.04, which when combined with other similar
measurements gives Ωb,0 h2 = 0.0213 ± 0.0010. Later in this chapter we shall see
how little of this baryonic content of the Universe is stars and planets, and how
much is still in its primordial state.

Figure 8.7 compares the WMAP cosmological parameter constraints with the
constraint from the [D/H] abundance. This figure shows that combining the
WMAP data with the measured baryon density requires that ns < 1. Several lines
of evidence now appear to disfavour an ns = 1 scale-invariant spectral index. In
inflationary models, ns depends on the shape of the inflation potential. Is this
measurement a hint of the new physics of the inflation potential? Inflation models
with ns ≠ 1 also predict a gravitational wave background that might eventually be
detectable directly in future gravitational wave observatories, or whose effects
may be measurable in the polarized CMB with the recently launched Planck space
telescope or other later CMB missions. This will be a critical consistency test for
inflation.
8.4 The column density distribution
We’ve seen that the Universe is awash with primordial hydrogen that follows the
filaments and clumps of the underlying matter distribution. But is most of this
hydrogen still lurking in wispy filaments, or is it already in galaxy-sized clumps
waiting to be turned into galaxies?
Figure 8.7 Constraints on the baryon density of the Universe, Ωb,0 h2 , and on the
primordial spectral index of scalar density perturbations ns . The points sample the
allowed distribution from the WMAP data, coloured according to the Hubble
parameter H0 in units km s−1 Mpc−1 . The shaded regions are the 1σ and 2σ bounds on
Ωb,0 h2 based on the deuterium abundance. (1σ means that there is a roughly 68% chance of
the true value lying in that range; 2σ corresponds to 95%.) The curves are the 1σ and 2σ
constraints from combining the WMAP measurements with the deuterium abundance.
In Chapter 4, we used the luminosity function φ(L) of galaxies to find which
galaxies contribute most of the luminosity in the Universe: they were around the
peak of the L φ(L) distribution. In Chapter 5, we also used the source counts
dN/dS to find which galaxies dominate the extragalactic background light: they
were around the peak of the S dN/dS distribution. We shall use a similar
trick with the column density distribution to find out where most of the neutral
hydrogen is in the Universe.
The numbers of Lyman α clouds change strongly with cosmic time. Figure 8.8
shows the spectrum of a quasar at low redshift. Comparison with Figure 8.1
shows that the low-redshift quasar clearly has far fewer Lyman α absorbers than
the high-redshift quasar. It’s tempting to suppose that this is exactly the emptying
out of the voids, and filling up of the overdensities, that the cosmological
simulations predict. However, that supposes that we’re sampling the same
comoving volume in the two spectra. For example, could the 1250–1350 Å
observed wavelength range in the low-redshift quasar just be sampling much less
volume than the 4600–4700 Å observed wavelength range in the high-redshift
quasar? This might explain why there are fewer Lyman α lines at low redshift.
To find out, we shall calculate the number of absorbers that a photon would
encounter along its travel from the quasar to us.
Unfortunately there’s an annoying collision of notation: NH I is conventionally
used to mean column density, while N is conventionally used to mean numbers in
source counts dN/dS. The column density distribution (i.e. the number of
absorbers per unit column density) would then be dN/dNH I . To avoid this
clumsy notation, it’s conventional to use N to mean the number of absorbers.
The number of absorbers that a photon might encounter will be proportional to the
path length that it travels, dY, and proportional to the density of absorbers ρ, and
to the average geometrical cross section A of any single cloud. (Don’t confuse
this with the absorption cross section σ of a single atom.) Since ρ = nco (1 + z)3 ,
where nco is the comoving density of the absorbers, and dY = c dt, we can write
the number of absorbers encountered by a photon in a cosmic time interval
t → t + dt as dN = nco (1 + z)3 Ac dt. It'll be useful to have the number of
absorbers per unit column density per unit redshift, which we can write as

d2 N = nco (NH I , z) A(NH I , z) (1 + z)3 c |dt/dz| dNH I dz. (8.2)

We've written d2 N as a double differential, which it is, but be warned that some
texts use dN when referring to d2 N .
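The |dt/dz| factor in Equation 8.2 follows from dt/dz = −1/[(1 + z)H(z)]. The sketch below evaluates the expected number of absorbers per unit redshift for a flat ΛCDM model; the cosmological parameters, the comoving cloud density and the cloud cross section are all illustrative assumptions, not values from the text:

```python
import math

H0 = 70 * 1000 / 3.086e22    # s^-1  (assumed H0 = 70 km/s/Mpc)
Om, OL = 0.3, 0.7            # assumed flat LambdaCDM densities
c = 2.998e8                  # m/s
Mpc = 3.086e22               # m
kpc = 3.086e19               # m

def H(z):
    """Hubble parameter H(z) for a flat LambdaCDM universe."""
    return H0 * math.sqrt(Om * (1 + z)**3 + OL)

def dt_dz(z):
    """|dt/dz| = 1 / ((1 + z) H(z))."""
    return 1.0 / ((1 + z) * H(z))

def dN_dz(z, n_co, A):
    """Absorbers per unit redshift: n_co (1+z)^3 A c |dt/dz|
    (Equation 8.2 integrated over column density)."""
    return n_co * (1 + z)**3 * A * c * dt_dz(z)

# Illustrative absorber population: one cloud per comoving Mpc^3,
# each a disc of 25 kpc diameter (both numbers assumed for the example)
n_co = 1.0 / Mpc**3
A = math.pi * (12.5 * kpc)**2

for z in (0.5, 2.0, 4.0):
    print(f"z = {z}: dN/dz = {dN_dz(z, n_co, A):.2f}")
```

Even with these crude inputs, the count rises steeply with redshift, because both the (1 + z)3 proper-density factor and the path length per unit redshift work in the same direction at these redshifts.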
[Figure: the column density distribution, log10 f (NH I , X).]
[Figure: lDLA (X), plotted from 0.00 to 0.12, against redshift z from 0 to 5.]
cause Doppler shifts. There may also be Doppler shifts from the atoms’ thermal
motion, known as thermal broadening. (A third possibility, relevant in stars but
not at the expected densities of Lyman α systems, is pressure broadening: the
presence of nearby atoms affects the photon emission of any particular atom.)
The Doppler broadening in Lyman α clouds is generally treated as a Gaussian
distribution, because a Gaussian form occurs in the expected Maxwell–Boltzmann
thermal velocity distribution: Pr(v) ∝ exp(−mv2 /(2kT )).
To find the total effect on the absorption profile, we convolve this Gaussian
distribution with the Lorentzian profile. The resulting curve is known as the Voigt
profile. Figure 8.5 shows the best-fit Voigt profile for this damped Lyman α
system. By analogy with the classical case, the shape of the wings of the profile
depends on the damping term in the oscillator equation of motion. Since the
centres of the profiles in these absorbers are essentially black (i.e. essentially
completely opaque), these damped wings dominate the profile shape, which is
why these Lyman α absorbers are known as ‘damped’. This happens typically at
column densities > 1024 m−2 or so.
The depth of the absorption can also be expressed as equivalent width, illustrated
in Figure 8.11. This is defined by imagining another absorption line, which
removes the same energy but is completely opaque and has a rectangular shape.
The width of this line (W in Figure 8.11) is the equivalent width. Note that this is
just a measure of the intensity of the absorption, and has nothing to do with
velocity widths. Mathematically the equivalent width is

W = ∫−∞∞ [C(λ) − S(λ)]/C(λ) dλ, (8.8)

where C(λ) is the continuum level without the absorption, and S(λ) is the
observed spectrum with the absorption. In terms of optical depth, equivalent
width can be written as

W = ∫−∞∞ (1 − e−τ (λ) ) dλ. (8.9)
How can we use the observed equivalent widths to derive the column densities?
Figure 8.11 The equivalent width, marked as W , is the width of the box that
has an area (hatched) the same as the area of the absorption line (in yellow).
Figure 8.12 shows the ‘curve of growth’ for damped Lyman α absorption,
meaning a curve of how the width depends on the optical depth. The optical depth
to ionizing photons τ is related to the column density:
τ (λ) = σ(λ) NH I , (8.10)
where σ is the cross section for absorption. The equivalent width increases
linearly with optical depth: W ∝ τ for small τ . This regime corresponds to
overdensities of δρ/ρ ≈ 0–15, corresponding to the linear or mildly non-linear
regime of cosmological structure formation. Once τ is around unity, the absorber
is essentially black, and there is little change to the equivalent width with
increasing optical depth until column density is high enough for the damping
wings to start affecting the equivalent width. Once τ > 105 or so, the equivalent
width increases as the square root of τ .
[Figure 8.12: the curve of growth, W/Å against optical depth at the line core, showing the linear, logarithmic (flat) and square-root regimes, with inset absorption-line profiles near 1215 Å.]
Figure 8.12 The variation of equivalent width with optical depth at the line
core.
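The three regimes of the curve of growth can be reproduced by integrating Equation 8.9 numerically. The sketch below uses an illustrative line profile (a Gaussian core plus weak Lorentzian damping wings, a crude stand-in for a Voigt profile; the wing amplitude of 10−4 is an assumption), with wavelength offset measured in Doppler widths rather than Å:

```python
import numpy as np

x = np.linspace(-2000.0, 2000.0, 400001)   # wavelength offset in Doppler widths
dx = x[1] - x[0]

# Illustrative profile: Gaussian core plus weak Lorentzian damping wings
profile = np.exp(-x**2) + 1e-4 / (1 + x**2)

def equivalent_width(tau0):
    """Equation 8.9: W = integral of (1 - exp(-tau)) over wavelength offset."""
    return np.sum(1.0 - np.exp(-tau0 * profile)) * dx

for tau0 in (0.01, 0.1, 1e3, 1e5, 1e7):
    print(f"tau0 = {tau0:8g}: W = {equivalent_width(tau0):8.3f}")
```

At small τ0, W grows linearly; once the core saturates, W barely changes; and once the damping wings dominate, W grows roughly as the square root of τ0, as in Figure 8.12.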
In Exercise 8.4 we found that Lyman α absorbers above the threshold for Lyman
limit absorption are self-shielded, i.e. they are dense enough that ionizing
radiation has an optical depth > 1 and does not penetrate the cloud. This implies
that the gas in these higher column density absorbers must be mostly neutral,
particularly in damped Lyman α systems. This cannot be said of the Lyman α
forest in general, as we shall see later in this chapter.
In the nearby Universe, the objects with the biggest neutral hydrogen column
densities are spiral discs, and this continues to about z ≈ 1.6. Could the
higher-redshift damped Lyman α systems be the primordial progenitors of these
spiral discs? Even if they are, how can we separate the faint light of these
primordial galaxies from the glare of the background quasar?
One approach is to take high-resolution spectra to try to detect narrow emission
lines from star formation in the galaxy causing the damped Lyman α system.
8.5 Damped Lyman α systems
Astronomers have looked for Hα, redshifted into the infrared, or Lyman α in the
centre of the damped absorption trough. Similarly, one can take an image with a
narrow filter, chosen to cover the dark central region of the damped absorption
trough (see Figure 8.13); such an image may also detect faint, narrow Lyman α
emission from star formation. Only three damped Lyman α systems at z > 1.6
have any star formation detected so far using emission lines, though several have
upper limits. It seems that we are seeing a key stage in the assembly of galaxies,
before they are luminous.
[Figure 8.13, left panel: flux/10−19 J s−1 m−2 Å−1 and filter transmission plotted against λ/Å over 4500–4750 Å.]
Figure 8.13 The left panel shows the transmission of a narrow-band filter, compared to the spectrum of a
damped Lyman α system in the quasar PKS 0528-250. Light that passes through this filter should have little or no
contribution from the background quasar. The right image (65″ × 65″ in size) is taken through this narrow-band
filter. The position of the quasar is marked as a red cross. Nearby, there is a galaxy that is ostensibly responsible
for the damped Lyman α absorption.
Another possibility is to use other absorption lines in the quasar spectrum. If the
interstellar medium of the galaxy causing the damped Lyman α absorption has
been enriched by star formation, there should be metal absorption lines in the
quasar spectrum, and these have been detected in many systems. Also, the C II∗
133.57 nm absorption line has been argued to correlate well with the [C II]
158 µm emission line, which in turn is an indirect star formation rate indicator.
From this it’s possible to estimate the star formation rate in projection, in units of
M☉ per year per kpc2 , in damped Lyman α systems. However, it turns out that the
mean metal content of damped Lyman α systems is about 10 times lower than
expected from their inferred cosmic star formation history! Could rapid star
formation use up the neutral hydrogen, so damped systems don’t stay damped and
others take over? At these low star formation rates, the timescales are too slow for
this to work. Could the metals be ejected from supernova-driven winds? This
would disagree with observations of the metallicity of the intergalactic medium.
It’s not clear what the solution is, but some approaches that have the neutral gas
spatially distinct in the absorbing galaxies from their active star forming regions
may be consistent with the data. Whatever the solution, it’s clear that damped
Lyman α absorption gives us a unique window into otherwise invisible aspects of
galaxy formation.
Exercise 8.6 There is some evidence that the highest column density damped
Lyman α systems are more common in quasars with bright apparent magnitudes.
What could cause this?
Exercise 8.7 How could one use observations of the background quasars to
investigate the dust content of damped Lyman α systems? ■
The vital clue has come from the proximity effect in quasar spectra: as we
approach the redshift of any quasar, the numbers of Lyman α clouds in that
quasar’s spectrum decreases. This is caused by the ionizing radiation from the
quasar, which can be estimated independently from extrapolating the quasar
spectrum. When the quasar’s ionization equals that from the ambient background,
the number of Lyman α clouds dN /dX will be half the number that there are
elsewhere (e.g. along other lines of sight to other quasars, far from a quasar).
At z ≈ 2.5 the background turns out to be around 10−24 J m−2 s−1 Hz−1 sr−1 . As with the
Hubble parameter, this is sometimes expressed as a dimensionless quantity:
Iν = J−21 × 10−21 (νion /ν)α erg cm−2 s−1 Hz−1 sr−1 , where α is the slope of
the spectrum. (Note: 1 erg cm−2 = 10−3 J m−2 .) In other words, J−21 is the
background at νion in units of 10−21 erg cm−2 s−1 Hz−1 sr−1 . If α = 1, then
J−21 = 1 corresponds to a proper photon density of 63 photons m−3 .
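The quoted photon density can be recovered by integrating the background spectrum over frequency and solid angle: the proper number density is nγ = ∫νion∞ (4π/c) Jν /(hν) dν, which for Jν = J0 (νion /ν)α evaluates to 4πJ0 /(chα), independent of νion . A quick check for J−21 = 1 and α = 1:

```python
import math

h = 6.626e-34   # J s
c = 2.998e8     # m s^-1

# J_-21 = 1 background: 1e-21 erg cm^-2 s^-1 Hz^-1 sr^-1, converted to SI
J0 = 1e-21 * 1e-7 * 1e4   # = 1e-24 J m^-2 s^-1 Hz^-1 sr^-1

alpha = 1.0
# For J_nu = J0 (nu_ion/nu)^alpha, the integral over nu and solid angle
# of J_nu / (h nu) / c gives 4 pi J0 / (c h alpha).
n_gamma = 4 * math.pi * J0 / (c * h * alpha)

print(f"n_gamma = {n_gamma:.1f} photons per cubic metre")
```

This reproduces the quoted 63 photons m−3.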
The value of J−21 comes curiously close (within a factor of a few) to the total
ionizing background estimated from integrating the quasar luminosity function.
Do star-forming galaxies provide the rest of this ionizing background? The
similarity of the quasar contribution to the total would then just be a coincidence.
Or are there errors or inaccuracies in the calculations, and quasars provide it all?
The jury is still out. In case ‘coincidence’ is read as pejorative, remember that
there are other coincidences in astronomy (indeed, there must be): for example,
the similar angular sizes of the Sun and the Moon are a coincidence that makes
total solar eclipses possible. In any case, this ‘coincidence’ may reflect some
underlying physical connection between quasar activity and star formation,
already hinted at in the Magorrian relation.
The ionizing background is a fairly constant J−21 ≈ 1 at 1.6 < z < 4, but there is
a very quick decline in the ionizing background at z < 1.6. At z ≈ 0.5, J−21 is
only 6 × 10−3 , as the epochs of cosmic quasar activity and star formation draw to
a close. At the earliest cosmic epochs, the ability of primordial galaxies to ionize
their environments will depend on the escape fraction of ionizing photons from
these galaxies, of which we shall hear more later in this chapter.
Finally, a creative way to constrain the lifetimes of quasars and test the isotropy of
their emission is the transverse proximity effect: if you have two quasars that
have different redshifts but appear close on the sky, then you can use the Lyman α
forest in the spectrum of the more distant quasar to measure the ionization effect
of the nearer quasar. If quasars are found in rich environments on average, this
will complicate the interpretation, since the richer environment might compensate
for the loss of Lyman α clouds from ionization. (A similar bias may be present in
the proximity effect measurements of J−21 .)
See Goncalves, Steidel and Pettini, 2008, Astrophysical Journal, 676, 816.
include the µ term to account for the contribution of helium to the neutral gas
mass. The comoving neutral hydrogen matter density must therefore be
ρH I (z) = µmH ∫ nco NH I A(NH I , z) dNH I = (H0 µmH /c) ∫ NH I f (NH I , z) dNH I , (8.15)

using Equation 8.4. It's conventional to measure ρH I in units of the critical
(matter) density, i.e.

ΩH I = 8πG ρH I /(3H 2 ) (8.16)

(compare Equation 1.15). In practice, the total H I is estimated over an absorption
distance ΔX by summing the column densities in the interval ΔX:

∫ NH I f (NH I , z) dNH I = Σi NH I,i /ΔX.
Here, NH I,i refers to the column density of the ith absorber.
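This estimator translates directly into code. The sketch below uses made-up column densities and absorption distance purely to illustrate the bookkeeping; µ = 1.3 is a typical helium correction and H0 = 70 km s−1 Mpc−1 is assumed (every numerical input here is an assumption, not survey data):

```python
import math

G = 6.674e-11               # m^3 kg^-1 s^-2
c = 2.998e8                 # m s^-1
m_H = 1.673e-27             # kg
H0 = 70 * 1000 / 3.086e22   # s^-1 (assumed Hubble constant)
mu = 1.3                    # helium correction to the neutral gas mass (assumed)

def omega_HI(columns_m2, delta_X, H=H0):
    """Omega_HI from summed column densities over absorption distance delta_X:
    rho_HI   = (H0 mu m_H / c) * sum(N_HI,i) / delta_X   (Equations 8.15, 8.4)
    Omega_HI = 8 pi G rho_HI / (3 H^2)                   (Equation 8.16)."""
    rho_HI = H0 * mu * m_H / c * sum(columns_m2) / delta_X
    return 8 * math.pi * G * rho_HI / (3 * H**2)

# Illustrative 'survey': a handful of damped systems along sightlines
# spanning a total (dimensionless) absorption distance delta_X = 100
columns = [3e24, 8e24, 1.5e25, 5e24]   # column densities in m^-2
print(f"Omega_HI ~ {omega_HI(columns, 100.0):.2e}")
```

Even these toy numbers land at a few × 10−4, the right order of magnitude for the measurements in Figure 8.14.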
Figure 8.14 shows the evolution in ΩH I measured by quasar absorption lines.
At the time of writing, the picture is somewhat confusing. Over the redshift
interval 2 < z < 6, there seems to be significant evolution, consistent with the
consumption of gas by star formation. The data point at z = 0 is consistent with
this broad trend. However, at 0.16 < z < 2 there are marginally discrepant data
points. It’s not yet clear what are the causes of the discrepancy, or whether this
represents a genuine effect.
[Figure 8.14: ΩH I /10−3 (0 to 1.5) against redshift, with lookback time/Gyr (0 to 12) along the top axis.]
It was originally thought that the high-redshift ΩH I matched the current comoving
stellar mass density, often written as Ω∗ . The decline in ΩH I could then be
attributed to the consumption of gas by star formation in galactic discs. This
interpretation, in which damped Lyman α systems (which dominate ΩH I ) do not
interact much with their environment, is sometimes called the ‘closed box’
8.8 How big are Lyman α clouds?
Quasar pairs, on the other hand, are more widely separated and probe separations
of 1–2 h−1 Mpc. The line-of-sight comparisons are much less striking. From
statistical cross-correlations, there do appear to be some coherent structures on
Mpc scales, but it’s less clear that one is taking two lines of sight through a single,
coherent object — one might be just tracing the same large-scale structure.
These size constraints are already enough to constrain the physics of Lyman α
clouds. One early suggestion was that the Lyman α clouds are neutral clumps in
pressure equilibrium with a tenuous ionized medium, but the predicted range of
sizes of 0.03–30 kpc in this model is inconsistent with these size observations.
However, as we’ve seen, the sizes are consistent with cosmological simulations in
which the Lyman α forest is the neutral ‘tip of the iceberg’ of the predominantly
ionized hydrogen gas, which follows the bottom-up gravitational collapse of
matter perturbations.
Figure 8.16 Numerical simulation of reionization in a (2h−1 )3 comoving Mpc3 volume by Nick Gnedin. The
brown opaque fog symbolizes neutral hydrogen. Glowing blue gas is dense ionized hydrogen, and less dense
ionized hydrogen is rendered as transparent. Yellow dots represent galaxies. The redshifts shown are z = 12.1,
10.4, 9.1, 8.1, 7.3, 6.6, 6.3, 6.0.
8.9 Reionization and the Gunn–Peterson test
It’s easy to show that most of the Universe at 1.6 < z < 4 is, on average, ionized.
The present-day density of the Universe is ρ0 = 1.8789 × 10−26 Ω0 h2 kg m−3
(Chapter 1). Putting in the nucleosynthesis value of Ωb,0 h2 ≈ 0.015, and
remembering that density scales as (1 + z)3 , we find the baryon density of the
Universe to be ρb (z) ≈ 2.8(1 + z)3 × 10−28 kg m−3 . About 75% is hydrogen, as
we've seen, and the mass of a proton is 1.67 × 10−27 kg, so there are on average
about 0.13(1 + z)3 hydrogen ions or atoms per cubic metre. The average free
electron density is therefore ne = 0.13(1 + z)3 x per cubic metre, where x is the
average hydrogen ionization fraction. We've already seen that the estimated
J−21 ≈ 1 at z ≈ 3 implies about nγ = 63 photons per cubic metre, so the
ionization parameter is U = nγ /ne = 500x−1 (1 + z)−3 . Using Equation 8.14 we
can find a quadratic equation for x:
107.9
x2 = (1 − x) ,
(1 + z)3
for which the only positive solution is (1 − x) 1 10−8 (1 + z)3 or x 1 1.
Therefore the z 1 3 Universe should on average be highly transparent to
Lyman α, and it’s only because of density inhomogeneities that any Lyman α
absorbers can be seen. If we assume that a Lyman α cloud is 25 kpc in size
(Section 8.8), the neutral hydrogen density must be around ρH I 1 NH I /25 kpc,
which comes out as ρH I 1 (NH I /1019 m−2 ) × 0.013 atoms per cubic metre. This
is much lower than the total hydrogen density of the Universe from primordial
nucleosynthesis (see above), so again we see that most of the hydrogen must be
ionized.
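This estimate is easy to replay numerically. The following sketch is ours, not from the text; the function names are invented for illustration, and the constants are those quoted above.

```python
# A numerical sketch of the Section 8.9 estimate; constants follow the text.

def hydrogen_density(z, omega_b_h2=0.015, hydrogen_fraction=0.75):
    """Mean number of hydrogen atoms/ions per cubic metre at redshift z."""
    rho_b0 = 1.8789e-26 * omega_b_h2    # baryon mass density today, kg m^-3
    m_proton = 1.67e-27                 # kg
    return hydrogen_fraction * rho_b0 * (1 + z)**3 / m_proton

def ionization_fraction(z):
    """Positive root of x^2 = 10^7.9 (1 - x)/(1 + z)^3."""
    k = 10**7.9 / (1 + z)**3
    return (-k + (k**2 + 4 * k)**0.5) / 2   # root of x^2 + k x - k = 0

print(hydrogen_density(0))           # ~0.13 atoms per cubic metre
print(1 - ionization_fraction(3))    # neutral fraction ~1e-6: highly ionized
```

Running this reproduces the quoted 0.13(1 + z)3 hydrogen density and a neutral fraction of order 10−6 at z ≈ 3.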
What’s more, an ionizing flux of J−21 = 1 is clearly enough to ionize the
Universe at any redshift for which we are likely to observe an object. However, if
we see the highest-redshift Universe becoming opaque on average to Lyman α
photons, then J−21 must have dropped sharply, and the Universe will be
predominantly neutral. This was first proposed by Gunn and Peterson in 1965 and
is now known as the Gunn–Peterson test. The transition between opaque and
transparent would then be probing the epoch of reionization in which the first
Strömgren spheres expand around the very first luminous objects in the Universe.
We had to wait several decades for the first thrilling hints of reionization in
quasar spectra from a Gunn–Peterson absorption trough at the highest redshifts.
Figure 8.17 shows the spectra of the highest redshift quasars — note the near-total
disappearance of flux in the Lyman α forest at redshifts z > 6. We can convert
this to a Lyman α optical depth, shown in Figure 8.18. Whether this represents a
transition to the epoch when the Strömgren spheres were just beginning to overlap
is still a matter of debate. The Lyman α opacity is sensitive to the presence of rare
voids in the intergalactic medium, so measurements of the Gunn–Peterson
trough are sensitive to assumptions about the distribution of gas. Known quasars
are also biased tracers of the underlying matter distribution, and the quasars
or starbursts responsible for reionization may well also have been strongly
biased, so reionization is likely to have been inhomogeneous. Nevertheless, these
high-redshift quasars are the first to give useful constraints on reionization
simulations.
Chapter 8 The intervening Universe
Figure 8.17 High-redshift quasars in the Sloan Digital Sky Survey (SDSS);
the labelled spectra include J1306 + 0356 at z = 5.99.
Note the increasing Gunn–Peterson opacity at the highest redshifts.
Was the first light in the Universe from star formation, or from black hole
accretion in quasars? At redshifts z > 3 the comoving number density of the most
luminous quasars drops quickly (Chapter 4). The slope of the quasar luminosity
function tells us whether fainter quasars could be important. It turns out that the
slope at z > 4 is shallower than at low redshift, which implies that quasars did not
contribute the majority of J−21 during the tail-end of reionization at z ≈ 6.
Another possibility for the origin of the first light is star-forming galaxies, since
young massive O and B stars are prodigious emitters of ionizing radiation. But
what fraction of this ionizing radiation escapes star-forming galaxies? It’s
difficult to measure Lyman continuum photons from high-redshift galaxies
because of the presence of intervening Lyman α absorbers and Lyman limit
systems; measurements of escape fractions so far range from fesc < 0.1 to
fesc > 0.5. However, even assuming fesc = 1, the luminosity function of z > 6
optically-selected galaxies suggests, as for quasars, that they are insufficient to
reionize the Universe.
Perhaps a new population of objects — such as intermediate mass accreting black
holes — reionized the Universe, but this mini-quasar population could easily
exceed the unresolved soft X-ray background. Perhaps the luminosity function of
z > 5 star-forming galaxies steepens at luminosities much fainter than probed so
far, invoking a new population of dwarf star-forming galaxies. At the time of
writing, the objects that reionized the Universe remain a mystery.
We can’t yet rule out more than one reionization epoch. Figure 8.19 shows
the constraints on the neutral fraction (1 − x) as a function of redshift. Two
reionization epochs might happen if there is an initial flurry of star formation that
creates predominantly massive stars because of the low metallicity (known as
population III stars), but subsequent stars (population II) are less massive so less
able to ionize their surroundings. The intergalactic medium in this model would
then recombine until enough stars have formed to ionize it again. (Radiation
pressure limits the maximum luminosity of stars, but the primordial gas from
which population III stars formed lacked metal absorption lines, reducing the
radiation pressure on the gas.)
8.10 The Lyman α forest of He II
The epoch of hydrogen reionization is tantalizingly just beyond our grasp, but
25% of the baryons in the Universe are in helium, and helium reionization is
already within our grasp. He II is harder to ionize than hydrogen: 54.4 eV are
needed, compared with 13.6 eV for hydrogen.
Figure 8.19 Experimental constraints on the reionization history of the Universe. The lines are two models that
are consistent with the Gunn–Peterson data, and one that is not (but which still is marginally consistent with
WMAP). The Strömgren sphere point is a constraint on the sizes of ionized regions around high-redshift quasars.
Figure 8.20 The expected average transmission to Lyman continuum photons in the spectrum of a z = 3.2
quasar. The dashed lines show the ±1σ range expected in the opacity. Also shown is the location of the He II
Lyman α line (rest wavelength 304 Å) at z = 3.2. For some quasars, we might expect enough transparency to be
able to detect this line.
Figure 8.21 shows the He II Gunn–Peterson trough in the quasar HE 2347-4342.
The He II opacity is strikingly different to the H I opacity, at 4 times longer
wavelengths. Some regions lacking H I Lyman α lines are also relatively
transparent to He II, but for the most part the spectrum is opaque to He II. (There
is a region with high He II opacity but no obvious H I Lyman α absorbers, possibly
caused by thermal broadening or instrumental noise, or possibly related to
variations in the hardness of the ionizing radiation.)
Taking all available observations, He II reionization is measured to have happened
at a redshift of z = 2.8 ± 0.2. Despite the decline in quasar comoving number
density at z > 2, quasars are more than enough to reionize He II. There is some
tentative evidence, from comparing the H I and He II opacities, that the spectrum
of ionizing radiation is softer at high redshifts, i.e. a smaller proportion of
high-energy photons, consistent with star-forming galaxies providing a bigger
proportion.
Figure 8.21 Signatures of He II reionization in the quasar HE 2347-4342. The top panel shows the optical
spectrum, normalized to the quasar spectrum (so a flux of 1.0 is no absorption). The lower panel shows the
ultraviolet spectrum. The wavelengths in the top panel have been divided by approximately four, to match the
wavelengths of the H I and He II Lyman α forests. The thin dotted, roughly horizontal line is the 1σ uncertainty in
the ultraviolet measurements. The thick vertical dotted line marks the expected position of He II Lyman α (no
emission line is detected), and data redward (i.e. rightward) of the dashed vertical line are affected by absorbers
within the quasar itself or its host galaxy. The quasar redshift is z = 2.885, and redward of the dashed line the
quasar is known to have absorption lines associated with the quasar itself, i.e. zabs ≈ zQSO .
infrared. There have been several claims of detections of this cosmic near-infrared
background, independently from the Infrared Telescope in Space (IRTS) and
the Diffuse Infrared Background Experiment (DIRBE) on the COBE CMB
mission. This would be a ground-breaking discovery, but this faint background
(∼10–50 nW m−2 sr−1 at wavelengths of 1–4 µm) is around a hundred times
fainter than the reflected sunlight from the zodiacal dust in our own Solar System.
This is a very delicate experiment that requires careful control of the systematic
uncertainties, and opinion is still divided as to whether genuinely cosmic infrared
background signals have been detected. Another approach is to take the DIRBE
maps, subtract the infrared fluxes of known stars and galaxies, and/or mask them
out, then look for the clustering of the residuals (analogously to the CMB). This is
independent of the absolute cosmic infrared background level. The clustering
measurements in the cosmic near-infrared have been argued to be consistent with
reionization population predictions, but opinion is again divided because this is
again an experiment that needs careful treatment of systematic uncertainties.
Nevertheless, the potential reward of discovering the reionization population
makes this a hot topic in current cosmology.
A completely independent approach to constraining the cosmic near-infrared
background comes from gamma-ray observations of quasars. If the Universe
is filled with a homogeneous cosmic background of near-infrared photons,
they should interact with the gamma rays through the pair production reaction
(γ + γ → e− + e+ , the inverse reaction of electron–positron annihilation, where
one γ is a gamma-ray photon and the other γ is a near-infrared photon), which
results in a measurable gamma-ray opacity. This opacity has not been seen, which
places important limits on the cosmic near-infrared background.
8.12 The Square Kilometre Array
The SKA also promises to revolutionize many of the topics discussed in this
book: the SKA team aim to measure the dark energy equation of state (from
cosmic shear and baryon wiggles), test whether dark energy clusters (using the
Integrated Sachs–Wolfe effect), measure the power spectrum of primordial
But we know that Robs /Rem = dtobs /dtem = (1 + z), so this just becomes

dz/dtobs = (1 + z)H0 − H(z).

Without being too disingenuous we could write this as

dz/dtobs = ż = H0 [(1 + z) − H(z)/H0 ] ,   (8.17)

where H(z)/H0 is the factor by which the Hubble parameter has changed.
The rate of change of redshift, ż, is plotted in Figure 8.24. So the constant of
proportionality is generally a bit smaller than 1, at least in the concordance
cosmology.
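Equation 8.17 can be tabulated with a few lines of Python. This is a sketch of ours, not from the text: it assumes a flat matter-plus-Λ model, and the H0 value (≈70 km s−1 Mpc−1, converted to yr−1) is illustrative.

```python
import math

# Sketch of Equation 8.17: the real-time redshift drift dz/dt_obs for a flat
# cosmology with matter and a cosmological constant.

H0_PER_YR = 7.2e-11   # ~70 km s^-1 Mpc^-1 expressed in yr^-1 (illustrative)

def redshift_drift(z, omega_m=0.3, omega_lambda=0.7):
    """zdot = H0 * [(1 + z) - H(z)/H0] for a flat matter + Lambda model."""
    h_ratio = math.sqrt(omega_m * (1 + z)**3 + omega_lambda)
    return H0_PER_YR * ((1 + z) - h_ratio)

# The drift is tiny (of order 1e-10 per year), so many Lyman alpha absorbers
# must be averaged to detect it; it changes sign between low and high z.
for z in (1, 2, 4):
    print(z, redshift_drift(z))
```

Note that for an accelerating (Λ-dominated) model the drift is slightly positive at low redshift and negative at high redshift, consistent with the behaviour plotted in Figure 8.24.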
Figure 8.24 The real-time rate of change of redshift, ż, for various cosmological
models (ΩΛ,0 = 0, 0.4, 0.6, 0.7, 0.8 and 0.9). Some reference velocity changes
(ΔV = 0.5, 1.0 and 2.0 cm s−1 ) are shown as thin black dashed lines.
So the prospects for measuring the real-time expansion of the Universe seem
grim. However, one approach that might work is the Cosmic Dynamics
Experiment, or CODEX. The objective is to take very-high-resolution spectra of
the Lyman α forest with an extremely careful and stable wavelength calibration.
By cross-correlating the spectrum with a second spectrum at least 10 years later,
the shifts in redshifts should be detectable. Figure 8.25 shows simulated spectra
separated in time; note that it’s only by averaging the shifts of many Lyman α
absorbers that the expansion is detectable. This averaging also washes out any
peculiar acceleration in individual objects. CODEX is currently a proposed
experiment for the proposed European Extremely Large Telescope, and is still
many years from taking its first data. Another approach could be to use the SKA
to monitor the H I 21 cm forest in a z > 10 radio-loud active galaxy, if we can find
one that is bright enough (see Figure 8.23). Either way, it is possible that within
our lifetimes we shall have detected the real-time expansion of the Universe.
Figure 8.25 The change in redshift in the Lyman α forest expected in five
million years. The shift in ten years will be much smaller, and detectable only
statistically.
Summary of Chapter 8
1. The Lyman α forest of absorption lines blueward of the Lyman α emission
line in quasars and galaxies is caused by intervening, lower-redshift
Lyman α absorbers.
2. Most of the electrons in hydrogen atoms in the Lyman α clouds are in the
ground state, implying no Hα absorption.
3. If sufficiently dense, the clouds will be self-shielding, i.e. will have an
optical depth > 1 to Lyman α photons.
4. The large-scale structure of the Lyman α forest traces the power spectrum of
baryonic matter on scales that are close to the linear regime for the growth of
perturbations.
5. The largest contribution to ΩH I comes from damped Lyman α systems. The
term ‘damped’ refers to the damping wings of the Lorentzian absorption
profile.
6. The column density and optical depth of an absorption line can be related to
the equivalent width via the curve of growth.
7. The comoving number density of absorbers along a line of sight is measured
with the use of absorption distance.
8. Ionizing photons near a quasar reduce the comoving number density of
Lyman α clouds. This is known as the proximity effect.
Further reading
• Fan, X., 2006, ‘Observational constraints on cosmic reionization’, Annual
Review of Astronomy and Astrophysics, 44, 415.
• Loeb, A. and Barkana, R., 2001, ‘The reionization of the Universe by the first
stars and quasars’, Annual Review of Astronomy and Astrophysics, 39, 19.
• Rauch, M., 1998, ‘The Lyman alpha forest in the spectra of QSOs’, Annual
Review of Astronomy and Astrophysics, 36, 267.
• Wolfe, A.M., Gawiser, E. and Prochaska, J.X., 2005, ‘Damped Ly α systems’,
Annual Review of Astronomy and Astrophysics, 43, 861.
• Cen, R., 2003, ‘The Universe was reionized twice’, Astrophysical Journal,
591, 12.
• Faucher-Giguère, C.A., Lidz, A., Hernquist, L. and Zaldarriaga, M., 2008,
‘Evolution of the intergalactic opacity: implications for the ionizing
background, cosmic star formation, and quasar activity’, Astrophysical Journal,
688, 85.
• Hauser, M.G. and Dwek, E., 2001, ‘The cosmic infrared background:
measurements and implications’, Annual Review of Astronomy and
Astrophysics, 39, 249.
• Rybicki, G.B. and Lightman, A.P., 1979, Radiative Processes in Astrophysics,
Wiley.
• More on the Square Kilometre Array can be found at its website, currently
[Link]
Epilogue
How wonderful it would be to become wise.
Genesis 3, 6
Where will the next big changes in thinking in cosmology come from? Many of
the previous big changes have come from unexpected observational discoveries,
which makes it difficult to foresee the next leaps. As we’ve seen, the population
of high-redshift submm-luminous galaxies seemed unremarkable to optical
telescopes, yet these galaxies were found to be convulsed in violent star formation
by submm-wave imaging. This led in part to the new model of galaxy downsizing.
As I write this, the submm-wave Herschel Space Observatory will be launched in
six days, and the submm-wave SCUBA-2 camera will shortly be commissioned at
the James Clerk Maxwell Telescope in Hawaii. Both have tremendous scope for
new discoveries. Cosmology has also seen a change from small teams and lone
scientists, to large international consortia using many different astronomical
facilities and techniques. Despite that, it’s still possible for individual scientists to
make a mark, whether on their own or as part of a small or large team. As a result
of these large-scale international efforts and developments in survey technology at
all wavelengths, we are in a very data-rich phase of astronomy.
So what’s next? Large CCD arrays are just making time-domain optical
astronomy possible. Projects such as Pan-STARRS and Gaia will repeatedly
survey large areas of sky. These will almost certainly uncover many new
gravitational microlens events and many nearby supernovae. Gamma-ray
monitoring of the sky led to the completely unexpected discovery of gamma-ray
bursts, which themselves have optical transients, so what else lies in wait to be
discovered in time-domain optical astronomy? Perhaps the new generation of
radio telescopes such as LOFAR and the SKA, or the HST’s successor the JWST,
or the next generation of ≈ 50 m-diameter optical/near-infrared telescopes, will
detect unexpected reionization populations that generated the first light in the
Universe after the Big Bang. Perhaps the new gravitational wave observatories
LIGO and LISA will detect inspiralling black holes and confront us with
irreconcilable inconsistencies with general relativity. Perhaps the anisotropies
in the CMB will eventually be found inconsistent with inflation, or the LHC
could fail to find the Higgs boson, either of which would force big changes in
fundamental physics. Perhaps the signatures of dark matter particles will be found
in terrestrial direct detection experiments or at the LHC, or their annihilation
signatures will be inferred from cosmic rays, which will tell us what dominates
most of the matter content of the Universe. We know very little indeed about the
dark sector in general, whether dark matter or dark energy. We assume, perhaps
blithely, that dark matter only responds to gravity, but perhaps it has its own
intricate suite of dark physics. Perhaps the delicate measurements of cosmic
shear or baryon wiggles or the expansion of the Universe will constrain the
phenomenological parameters of dark energy, and give some insight on the
physical causes of what dominates the current expansion of the Universe, or
even challenge our assumptions of the size scales at which the Universe is
homogeneous. There has surely never been a more exciting time in observational
cosmology.
Appendix A
Table A.1 Common SI unit conversions and derived units.
Quantity Unit Conversion
speed m s−1
acceleration m s−2
angular speed rad s−1
angular acceleration rad s−2
linear momentum kg m s−1
angular momentum kg m2 s−1
force newton (N) 1 N = 1 kg m s−2
energy joule (J) 1 J = 1 N m = 1 kg m2 s−2
power watt (W) 1 W = 1 J s−1 = 1 kg m2 s−3
pressure pascal (Pa) 1 Pa = 1 N m−2 = 1 kg m−1 s−2
frequency hertz (Hz) 1 Hz = 1 s−1
charge coulomb (C) 1 C = 1 A s
potential difference volt (V) 1 V = 1 J C−1 = 1 kg m2 s−3 A−1
electric field N C−1 1 N C−1 = 1 V m−1 = 1 kg m s−3 A−1
magnetic field tesla (T) 1 T = 1 N s m−1 C−1 = 1 kg s−2 A−1
temperature absolute zero: 0 K = −273.15 ◦ C; 0 ◦ C = 273.15 K
energy 1 eV = 1.602 × 10−19 J; 1 J = 6.242 × 1018 eV
Particle constants
charge of proton e 1.602 × 10−19 C
charge of electron −e −1.602 × 10−19 C
electron rest mass me 9.109 × 10−31 kg
= 0.511 MeV/c2
proton rest mass mp 1.673 × 10−27 kg
= 938.3 MeV/c2
neutron rest mass mn 1.675 × 10−27 kg
= 939.6 MeV/c2
atomic mass unit u 1.661 × 10−27 kg
Astronomical constants
mass of the Sun M☉ 1.99 × 1030 kg
radius of the Sun R☉ 6.96 × 108 m
luminosity of the Sun L☉ 3.83 × 1026 W
mass of the Earth M⊕ 5.97 × 1024 kg
radius of the Earth R⊕ 6.37 × 106 m
mass of Jupiter MJ 1.90 × 1027 kg
radius of Jupiter RJ 7.15 × 107 m
astronomical unit AU 1.496 × 1011 m
light-year ly 9.461 × 1015 m
parsec pc 3.086 × 1016 m
Hubble parameter H0 (70.4 ± 1.5) km s−1 Mpc−1
(2.28 ± 0.05) × 10−18 s−1
age of Universe t0 (13.73 ± 0.15) × 109 years
current critical density ρc,0 (9.30 ± 0.40) × 10−27 kg m−3
current dark energy density ΩΛ,0 (73.2 ± 1.8)%
current matter density Ωm,0 (26.8 ± 1.8)%
current baryonic matter density Ωb,0 (4.4 ± 0.2)%
current non-baryonic matter density Ωc,0 (22.3 ± 0.9)%
current curvature density Ωk,0 (−1.4 ± 1.7)%
current deceleration q0 −0.595 ± 0.025
Appendix B
Introduction
In this appendix, we shall take you through a very quick revision of special relativity.
You will need the Lorentz transformation, time dilation and Lorentz contraction in
this book, as well as to be able to use Einstein’s mass–energy equivalence.
Proving E = mc2 will take us into discussions of four-vectors in this appendix,
though four-vectors are not needed in themselves for this book. Where algebraic
steps have been left out for brevity, enough information should be given for you to
fill them in, should you wish to.
There isn’t space to describe Einstein’s ingenious thought experiments that led
him to this theory, nor the many astonishing relativistic paradoxes. For these and
more, consult a specialist text, such as Lambourne’s Relativity, Gravitation and
Cosmology published by Cambridge University Press.

B.1 Principles

The principles of special relativity are:
• There is no universal standard of rest.
• The speed of light (c) is invariant.

B.2 Time dilation

Consider a light clock (two mirrors with a light ray bouncing between them),
shown in Figure B.1. The clock is moving with velocity v. One can use
Pythagoras’s theorem to show that

δt1 = γ δt0 ,   (B.1)

where

γ = 1/√(1 − v2 /c2 ) ,   (B.2)

δt1 is the time between reflections in the moving frame, and δt0 is the time in the
stationary frame. This can be remembered as ‘moving clocks run slowly’.
Sometimes the notation β = v/c is used.

Figure B.1 A Feynman light clock, made of two mirrors at a fixed distance L0 ,
between which a light pulse bounces. In a stationary clock, bounces occur at
intervals of δt0 = L0 /c. In a moving clock, the intervals δt1 between bounces are
longer, because the light travels a distance √(L02 + (v δt1 )2 ) > L0 . Setting this
distance equal to c δt1 and rearranging, one obtains Equation B.1.

B.3 Lorentz contraction and simultaneity

In Figure B.2, the light pulses leave the corner simultaneously. The impacts at the
top and side mirrors are simultaneous in the stationary frame, but not in the
moving frame because the outward journey is longer than the return journey for
the light ray moving parallel to the direction of motion. In general, there is no
universal standard of simultaneity in special relativity.
It can be shown that the length in the stationary frame L0 (measured along the
direction of motion) and the length in the moving frame L are related by

L = L0 /γ .   (B.3)

Note that this is contraction, not dilation: ‘moving rulers are short’. There is no
contraction perpendicular to the motion. To prove this, imagine two circular hoops
with the same rest-frame radius, both aligned to be perpendicular to the x-axis.
One hoop moves along the x-axis. If one hoop passed inside the other, then there
would be a preferred standard of rest, in contradiction with the first principle.

Figure B.2 A modified Feynman light clock, made of two sets of mirrors, both at a
fixed rest-frame distance L0 , between which light pulses bounce (shown as dashed
lines).

B.4 Lorentz transformation

The transformation of ct, x, y, z coordinates from one reference frame to another
(which we denote as primed and unprimed coordinates) can be expressed as

(ct′ , x′ , y ′ , z ′ )T = Λ (ct, x, y, z)T ,   (B.4)

where Λ is a 4 × 4 matrix (not to be confused with the cosmological constant).
We assume that the origins of the coordinate systems coincide. For x-axis motion
with velocity v,

    ( γ         (−v/c)γ   0   0 )
Λ = ( (−v/c)γ   γ         0   0 )   (B.5)
    ( 0         0         1   0 )
    ( 0         0         0   1 )
This can be proved elegantly using only symmetries.
First, y ′ = y and z ′ = z, because there is no Lorentz contraction perpendicular to
the motion. Suppose that

(ct′ )       (ct)   (A B) (ct)
(x′  )  = Λ  (x ) = (C D) (x ) .

(We neglect the y- and z-components for reasons of space.) Consider light rays in
the positive and negative x-directions. From the principles of special relativity, the
line x = ct must transform to x′ = ct′ , i.e.

(A B) (1)     (1)
(C D) (1) = a (1) ,

where a is a non-zero constant, which implies that A + B = C + D. Similarly,
x = −ct must transform to x′ = −ct′ , i.e.

(A B) ( 1)     ( 1)
(C D) (−1) = b (−1) ,

where b ≠ 0 is another constant, implying that A − B = −(C − D). Together,
these imply that A = D and B = C, i.e. Λ is a symmetric matrix:

    (A B)
Λ = (B A) .
B.5 Invariants

Using the Lorentz transformation, one can show that the interval δs is invariant
(i.e. the same in all reference frames) under Lorentz transformations, where

(δs)2 = c2 (δt)2 − (δx)2 − (δy)2 − (δz)2 .

Note that δs = c δτ , where τ is the proper time. Note also that δs = 0 for photons.
We can write this as

(δs)2 = Σα,β ηαβ δxα δxβ ,

where xα does not mean ‘x to the power of α’, but rather in this context refers to
the components of the four-vector (ct, x, y, z). The convention is for this to count
from zero, i.e. x0 = ct, x1 = x, x2 = y, x3 = z. One can only apologize for the
obvious inadequacies of this very common notation. It is usually clear from the
context whether superscripts refer to components, or mean ‘to the power of’.
The matrix

    ( 1   0   0   0 )
η = ( 0  −1   0   0 )
    ( 0   0  −1   0 )
    ( 0   0   0  −1 )

is called the metric tensor. Real intervals are known as time-like, and imaginary
intervals as space-like (see Figure B.3). (Note that some textbooks use
diag(−1, 1, 1, 1), resulting in the opposite convention for δs for space-like and
time-like intervals.) This metric is sometimes known as Minkowski spacetime.

Figure B.3 Special relativistic lightcone diagram. The origin and point A have a
time-like separation for all observers, while the origin and point B have a
space-like separation for all observers.
In general, if Aα and B α are the components of four-vectors, then Σα,β ηαβ Aα B β
is also invariant.
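The invariance of the interval under the boost of Equation B.5 is easy to verify numerically. The following check is ours, not from the text, and works in units where c = 1:

```python
import math

# Numerical check that (delta s)^2 = eta_ab dx^a dx^b is unchanged by the
# x-axis boost matrix Lambda of Equation B.5 (units with c = 1).

def boost(beta):
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -beta * g, 0, 0],
            [-beta * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def interval_sq(x):
    # metric tensor eta = diag(1, -1, -1, -1)
    return x[0]**2 - x[1]**2 - x[2]**2 - x[3]**2

def transform(lam, x):
    return [sum(lam[i][j] * x[j] for j in range(4)) for i in range(4)]

x = [5.0, 1.0, 2.0, 3.0]           # (ct, x, y, z) in arbitrary units
xp = transform(boost(0.6), x)      # same event, boosted frame
print(interval_sq(x), interval_sq(xp))   # both 11.0
```

The interval 25 − 1 − 4 − 9 = 11 is reproduced in the boosted frame, as the algebra above guarantees.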
B.11 E = mc2
To reach Einstein’s famous equation, we start from U · a = 0:

0 = Σα,β ηαβ U α maβ = Σα,β ηαβ U α F β
  = U 0 dP 0 /dτ − Σi U i γf i = c (dt/dτ ) d(γmc)/dτ − Σi (dxi /dτ ) γf i .
Appendix B
B.12 Photons
Photons have zero rest mass, but nevertheless carry momentum and energy
consistent with Equation B.7: E = pc. The four-velocity is not defined for a
photon, but the four-momentum is (E/c, px c, py c, pz c), where px is the
x-component of the relativistic three-momentum, and so on. The invariant
interval δs along any two points on a light ray is always zero.
One curious and little-known aspect of special relativity is that it implies Planck’s
famous formula E = hν, but doesn’t give a value for h. If we consider a photon
with energy E moving along the x-axis, and Lorentz transform to the frame of
an observer also moving along the x-axis with speed v, one can show that
E ′ /E = √((c + v)/(c − v)). This is the relativistic Doppler shift (but don’t
confuse it with cosmological redshift in Chapter 1).
Alternatively, we could consider a monochromatic plane wave. We can define a
wave four-vector as k = (ω/c, kx , ky , kz ), where the three-vector (kx , ky , kz )
points along the direction of the wave, and ω = 2πν, where ν is the frequency. (To
see why k must be a four-vector, note that the phase φ must be an invariant scalar,
and that k · x = φ.) The wavelength is λ = 2π/√(kx2 + ky2 + kz2 ). A Lorentz
transformation of the wave four-vector of a wave moving along the x-axis leads
eventually to ν ′ /ν = √((c + v)/(c − v)). Therefore E ′ /E = ν ′ /ν, or E ∝ ν.
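The Doppler factor can be cross-checked numerically. This sketch is ours, not from the text; it works in units with c = 1 and chooses the sign of the velocity so that the observer approaches the source (the receding case flips the sign of beta):

```python
import math

# Check that boosting a photon's four-momentum reproduces the relativistic
# Doppler factor E'/E = sqrt((1 + beta)/(1 - beta)), with c = 1.

def boosted_photon_energy(energy, beta):
    """Boost the four-momentum (E, E, 0, 0) of a photon moving along +x,
    with the observer approaching the source at speed beta."""
    g = 1.0 / math.sqrt(1 - beta**2)
    # E' = gamma*E + beta*gamma*p_x, and p_x = E for this photon
    return g * energy + beta * g * energy

E, beta = 2.0, 0.6
ratio = boosted_photon_energy(E, beta) / E
print(ratio, math.sqrt((1 + beta) / (1 - beta)))   # both 2.0
```

At beta = 0.6 the energy is doubled, matching √(1.6/0.4) = 2 from the formula above.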
Solutions to exercises
Exercise 1.1 When we calculated that the sky is as bright as the Sun, we
assumed that the line of sight stopped on the star, i.e. stars are opaque. When we
calculated the brightness of the sky for a S −5/2 power law, we integrated down to
zero flux, which (for any particular type of star) means integrating to r = ∞.
So the lines of sight don’t stop on stars in the latter case; stars are treated as
transparent.
Exercise 1.2 We use E = γm0 c2 , where E is the energy, m0 is the
rest mass, γ = (1 − v 2 /c2 )−1/2 , and c is the speed of light. We have that
1020 eV = γc2 × 938.28 MeV/c2 ≈ γ × 109 eV . The quoted accuracy of the
energy does not justify carrying more than just the first significant figure on the
proton’s rest mass. The γ factor is then just γ = 1020 /109 = 1011 . The cosmic
ray is moving at very close to the speed of light, so it would take about 100 000
years for the proton to cross the Galaxy in the Galaxy’s rest frame. But moving
clocks run slow, so it would take 100 000/γ years in the proton’s rest frame, i.e.
105 /1011 years, or 10−6 years, or about 30 seconds!
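The arithmetic in this solution can be replayed in a few lines. This is our own sketch, with invented variable names:

```python
# Replaying the Exercise 1.2 numbers: a 1e20 eV cosmic-ray proton.

gamma = 1e20 / 1e9            # Lorentz factor: 1e20 eV / (~1e9 eV rest energy)
galaxy_crossing_yr = 1e5      # ~100 000 yr to cross the Galaxy at ~c
seconds_per_year = 3.156e7

proper_time_yr = galaxy_crossing_yr / gamma   # moving clocks run slowly
proper_time_s = proper_time_yr * seconds_per_year
print(gamma, proper_time_s)   # ~1e11 and ~30 seconds
```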
Exercise 1.3 First we differentiate Equation 1.7 with respect to time t to get

2ṘR̈ = (8πG/3)(ρ̇R2 + 2ρRṘ) + (2Λc2 /3)RṘ,   (S1.1)

where we write ρ = ρm + ρr for brevity and the ‘dot’ notation is used to indicate
differentiation with respect to time, i.e. Ṙ = dR/dt and R̈ = d2 R/dt2 . The
conservation of matter energy gives

d(ρc2 R3 )/dt = ρ̇c2 R3 + 3ρc2 R2 Ṙ = −p d(R3 )/dt = −3pR2 Ṙ,

so

ρ̇c2 R3 = −3pR2 Ṙ − 3ρc2 R2 Ṙ.

Equation S1.1 has a term ρ̇R2 , so we rearrange the above to find

ρ̇R2 = −3pRṘ/c2 − 3ρRṘ = −RṘ (3p/c2 + 3ρ).

Substituting this into Equation S1.1 gives

2ṘR̈ = (8πG/3)[2ρRṘ − RṘ(3p/c2 + 3ρ)] + (2Λc2 /3)RṘ
     = (8πGRṘ/3)(2ρ − 3p/c2 − 3ρ) + (2Λc2 /3)RṘ
     = (−8πGRṘ/3)(ρ + 3p/c2 ) + (2Λc2 /3)RṘ
     = −8πG (ρ + 3p/c2 )(RṘ/3) + (2Λc2 /3)RṘ.
θ2 ∝ dA−2 . The flux will be inversely proportional to dL2 (Equation 1.49), i.e.
S ∝ dL−2 . The surface brightness will therefore vary as S/θ2 ∝ d2A /d2L . But
dL = (1 + z)2 dA (Equation 1.50), so surface brightness must vary as (1 + z)−4 .
Exercise 1.7 In Section 1.5 we are given that H0 = 72 ± 3 km s−1 Mpc−1
and ΩΛ,0 = 0.742 ± 0.030. One parsec is 3.09 × 1016 m, so in SI
units, H0 = (2.3 ± 0.1) × 10−18 s−1 . Equation 1.17 relates these two
quantities to Λ: ΩΛ,0 = Λc2 /(3H02 ), so Λ = 3ΩΛ,0 H02 /c2 . Putting in the
numbers, we get Λ = (1.3 ± 0.2) × 10−52 m−2 . The horizon size will be
√(3/Λ) = (1.5 ± 0.1) × 1026 m, or 4900 ± 300 Mpc. This cosmological event
horizon will be exceedingly distant; for comparison, the current radius of the
observable Universe in Section 1.9 is about 3.53c/H0 = 14 900 Mpc.
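These numbers can be recomputed directly. A sketch of ours, using the constants quoted in the solution:

```python
import math

# Recomputing the Exercise 1.7 numbers for Lambda and the event horizon.

MPC_M = 3.09e22                     # metres per megaparsec (1 pc = 3.09e16 m)
H0_SI = 72 * 1000 / MPC_M           # 72 km s^-1 Mpc^-1 in s^-1
omega_lambda_0 = 0.742
c = 3.00e8                          # m s^-1

lam = 3 * omega_lambda_0 * H0_SI**2 / c**2   # Lambda = 3*Omega_L*H0^2/c^2, m^-2
horizon_m = math.sqrt(3 / lam)               # de Sitter horizon sqrt(3/Lambda)
print(lam)                # ~1.3e-52 m^-2
print(horizon_m / MPC_M)  # ~4900 Mpc
```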
Exercise 2.1 The 13.6 eV photon does ionize another atom. However, the
process of recombination needn’t result in the emission of just one photon.
Sometimes the electron will bind first in a high energy state (releasing one photon
with an energy < 13.6 eV), then release the remaining energy in stages as the
electron drops down the energy levels of the hydrogen atom. Each of these stages
will involve the release of a photon, but none of these photons will have enough
energy on its own to ionize hydrogen atoms.
Exercise 2.2 We are given that T = 2.725 ± 0.001 K, so the energy density
must be ρr,0 c2 = 4σT 4 /c = 4 × 5.67 × 10−8 × 2.7254 /(3.00 × 108 ) joules per
cubic metre, i.e. ρr,0 c2 = 4.17 × 10−14 J m−3 , or mass-equivalent density of
ρr,0 = 4.64 × 10−31 kg m−3 . Applying Equation 1.16, and remembering that
H0 = 100h km s−1 Mpc−1 = 3.24 × 10−18 h s−1 , we find that

Ωr,0 = 8πG ρr,0 /(3H02 ) = 2.47h−2 × 10−5 .

So Ωr,0 h2 ≈ 2.5 × 10−5 , as required.
Exercise 2.3 The matter energy density scales as R−3 , while the
photon/neutrino energy density scales as R−4 . Therefore from Equations 1.15
and 1.16, Ωr /Ωm = (1 + z) Ωr,0 /Ωm,0 . From Exercise 2.2 and the text following
it, we have that Ωr,0 h2 ≈ 4.2 × 10−5 (TCMB,0 /2.725 K)4 . The epoch of
matter–radiation equality must by definition satisfy Ωr /Ωm = 1, so

1 + zeq = Ωm,0 /Ωr,0 = Ωm,0 h2 (TCMB,0 /2.725 K)−4 /(4.2 × 10−5 )
        ≈ 23 800 Ωm,0 h2 (TCMB,0 /2.725 K)−4 ,

as required. Using Ωm,0 = 0.268, h = 0.704 and TCMB,0 = 2.725 K gives
zeq ≈ 3160.
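The final arithmetic is quickly checked. A sketch of ours, with parameter defaults set to the values used in this solution:

```python
# The Exercise 2.3 arithmetic for the redshift of matter-radiation equality.

def z_equality(omega_m0=0.268, h=0.704, t_cmb=2.725):
    omega_r_h2 = 4.2e-5 * (t_cmb / 2.725)**4   # photons plus neutrinos
    return omega_m0 * h**2 / omega_r_h2 - 1

print(round(z_equality()))   # ~3160
```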
Exercise 2.4 The analysis is the same up to Equation 1.30, where ρ this time
is ρr . However, instead of ρ = ρ0 × R03 /R3 , we must also take into account the
fact that photons lose energy from redshifting, so ρr = ρ0 × R04 /R4 . With Λ set to
zero, the equivalent of Equation 1.32 comes out as

(H/H0 )2 = (1 + z)2 [1 − Ωr,0 + Ωr,0 (1 + z)2 ],

and inserting Ωr,0 = 1 and using H 2 = (1 + z)−2 (dz/dt)2 , we find that

(dz/dt)2 = H02 (1 + z)6

so

dz/dt = d(1 + z)/dt = H0 (1 + z)3 .

Now the dimensionless scale factor a is related to redshift via a = 1/(1 + z), so
we could write this as

d(a−1 )/dt = H0 a−3

thus

−a−2 da/dt = H0 a−3

hence

a da ∝ dt.

Integrating this gives a2 ∝ t, or a ∝ t1/2 as required.
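The a ∝ t1/2 behaviour can also be confirmed numerically. A sketch of ours, integrating da/dt = H0 /a by plain Euler steps with H0 = 1 in arbitrary units:

```python
# Numerical check that da/dt = H0/a integrates to a proportional to t^(1/2).

def evolve(a0, t0, t1, steps=200_000):
    a = a0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        a += dt / a      # da/dt = 1/a, i.e. H0 = 1
    return a

# The exact solution through a(0.5) = 1 is a = sqrt(2t), so a(2) should be 2.
a1 = evolve(1.0, 0.5, 2.0)
print(a1)   # ~2.0000
```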
Exercise 2.5 ℏ is measured in J s. A joule has dimensions of energy (like
½mv²) so it has dimensions ML²T⁻², where we write M for the dimension of
mass, L for length, and T for time. (Note that numerical constants are ignored
in dimensional analysis.) Therefore we can write the dimensions of ℏ as
[ℏ] = ML²T⁻¹. Similarly, the dimensions of c are [c] = LT⁻¹. To find the
dimensions of G, we can start with the familiar equation F = GM m/r², and note
that force is mass times acceleration, so ma = GM m/r² or G = ar²/M, so the
dimensions of G are [G] = LT⁻² L²/M = M⁻¹L³T⁻². Now let's suppose that the
Planck time is given by a formula of the form ℏˣ cʸ Gᶻ, where the constants x, y
and z are to be determined. The result must have the dimensions of time, so

T = (ML²T⁻¹)ˣ (LT⁻¹)ʸ (M⁻¹L³T⁻²)ᶻ.

Multiplying this out and rearranging gives

T = M^(x−z) L^(2x+y+3z) T^(−x−y−2z).

The left-hand side has no mass M, so x − z must equal zero, i.e. x = z. The
left-hand side also has no length L, so 2x + y + 3z = 0. The left-hand side has
exactly one power of T, so −x − y − 2z = 1. We have three simultaneous
equations for three unknowns. Substituting x = z into the other two equations
gives 5x + y = 0 and −3x − y = 1. Therefore y = −1 − 3x = −5x, or
x = 1/2. Since x = z, we have z = 1/2. Finally, any of the equations involving y
imply that y = −5/2. Therefore the characteristic time must be of the form

ℏˣ cʸ Gᶻ = ℏ^(1/2) c^(−5/2) G^(1/2) = √(ℏG/c⁵), as required.
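The exponent solution and the resulting Planck time can be checked numerically (SI constants rounded to three figures):

```python
import math

# Check of Exercise 2.5: the exponents (x, y, z) = (1/2, -5/2, 1/2) satisfy the
# three dimensional-analysis conditions, and t_Pl = sqrt(hbar G / c^5).
x, y, z = 0.5, -2.5, 0.5
assert x - z == 0              # no net mass dimension
assert 2*x + y + 3*z == 0      # no net length dimension
assert -x - y - 2*z == 1       # exactly one power of time

hbar, c, G = 1.055e-34, 3.00e8, 6.67e-11   # SI values (rounded)
t_planck = math.sqrt(hbar * G / c**5)
print(f"t_Pl = {t_planck:.3g} s")
```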
Exercise 2.6 We have already that (1/R) d2 R/dt2 = α(α − 1)t−2 . Since t is
positive and α > 1, the right-hand side must be positive. Therefore the left-hand
side must also be positive. Since R is also positive, d2 R/dt2 > 0.
Exercise 2.7 We start with

3H φ̇ = −V′ (Eqn 2.23)

and then use

H² = (8π/(3mPl²)) V. (Eqn 2.24)

Now the H dt term in the integral in the question can also be expressed as

H dt = H (dt/dφ) dφ = (H/φ̇) dφ.

Next we use Equation 2.23 to get

H dt = H dφ/(−V′/3H) = −3H² dφ/V′.

Finally, using Equation 2.24 this comes out as

H dt = (−8π/mPl²) (V/V′) dφ,

so we reach the required integral:

N = (−8π/mPl²) ∫_{φ2}^{φ1} (V/V′) dφ.

For the next part, we set V′ ≈ V/φ and φ1 = 0 (as advised in the question) to
write this as

N = (−8π/mPl²) ∫_{φ2}^{0} φ dφ.

Evaluating this integral gives

N = (4π/mPl²) φ2² = (2√π φ2/mPl)².

Thus to have N > 60 we need φ2 > mPl √60/(2√π), or in other words,
φ2 > 2.2mPl.
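The final threshold is simple enough to verify directly:

```python
import math

# Check of Exercise 2.7: the minimum field value for N > 60 e-foldings,
# phi_2 > m_Pl * sqrt(60) / (2 sqrt(pi)).
phi2_over_mpl = math.sqrt(60) / (2 * math.sqrt(math.pi))
print(f"phi_2 > {phi2_over_mpl:.2f} m_Pl")
```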
Exercise 2.8 No, not immediately. At first the CMB will appear very uniform,
as you receive light from only your immediate neighbourhood. As time progresses
you will receive light from larger and more distant parts of the Universe. You’ll
only be able to see the structures with wavelength λ once light has had time to
travel the distance λ, i.e. after a time δt = λ/c, where c is the speed of light. The
size of the largest acoustic peak is set by the sound horizon after inflation. Once
light has had time to travel this distance, all the acoustics will start to become
visible. Also, the acoustic peaks will have a different angular size on the sky,
because the surface of last scattering was closer. Finally, the CMB wouldn’t have
peaked at microwave wavelengths then, so perhaps we shouldn’t call it the CMB
then!
Exercise 2.9 We found in Section 2.7 that the particle horizon radius at
recombination was 2c/H = 0.46 Mpc. The sound speed is cs = c/√3, so the
sound horizon will be 2cs/H = (2c/H) × (cs/c) = 0.46/√3 Mpc = 0.27 Mpc.
Exercise 2.10 Dark matter clumps through gravitation, while dark energy
appears to be smoothly distributed through space. Dark matter is also essentially
pressureless, with Ωm dominated by the rest mass of the dark matter particles,
while dark energy has a strong negative pressure. Dark matter makes up about
20% of the total energy density of the Universe, and at recombination made up
about 70%. Dark energy, meanwhile, was negligible at recombination and yet
dominates the present-day energy density of the Universe. (One hopes that it
will soon be possible to add that the dark matter particle has been directly
detected, though that is not yet true at the time of writing; certainly, the proposed
particle physics mechanisms for generating dark matter and dark energy are very
different.)
Exercise 2.11 One parsec is about 3.09 × 10¹⁶ m, so
H0 = 72 × 10³/(10⁶ × 3.09 × 10¹⁶) ≈ 2.33 × 10⁻¹⁸ s⁻¹. In Chapter 1 we saw
that ΩΛ,0 = Λc²/(3H0²), so Λ = 3ΩΛ,0 H0²/c². Putting in the numbers gives
Λ = 1.3 × 10⁻⁵² m⁻².
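A numerical check; note that ΩΛ,0 = 0.73 is assumed below, since the value used by the exercise is not restated in this excerpt:

```python
# Check of Exercise 2.11. Omega_Lambda,0 = 0.73 is an assumption here,
# since the exercise's value is not restated in this excerpt.
c = 3.00e8                                  # speed of light, m s^-1
H0 = 72e3 / (1e6 * 3.09e16)                 # 72 km/s/Mpc converted to s^-1
omega_lambda0 = 0.73
Lambda = 3 * omega_lambda0 * H0**2 / c**2   # cosmological constant, m^-2
print(f"H0 = {H0:.3g} s^-1, Lambda = {Lambda:.2g} m^-2")
```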
Exercise 3.1 The luminosity contributed by a shell of radius r → r + dr will
be I(r) times the area of the shell, 2πr dr. Summing these shells, the total
luminosity will be L = ∫₀^∞ I(r) 2πr dr. Let's define L0 to be the luminosity with
I0 = r0 = 1, i.e.

L0 = ∫₀^∞ f(r) 2πr dr.

Now let's calculate the luminosity in the more general case:

L = ∫₀^∞ I0 f(r/r0) 2πr dr = I0 r0² ∫₀^∞ f(r/r0) 2π (r/r0) d(r/r0).

But this integral has the same form as the integral defining L0, which also
integrates from 0 to ∞, so L = I0 r0² L0.
Exercise 3.2 A shell of thickness dr and radius r will have mass
dM = 4πr²ρ dr. The gravitational potential energy of this shell will be

dEGR = −G M(<r) dM/r, (S3.1)

where M(<r) is the mass enclosed within a radius r, i.e.

M(<r) = (4/3)πr³ρ,

and the mass of the shell is

dM = 4πr²ρ dr.

Substituting this into Equation S3.1 gives

dEGR = −(G (4/3)πr³ρ/r) dM = −G (4/3)πr²ρ × 4πr²ρ dr

so

dEGR = −3G × ((4/3)πr²ρ)² dr.

Integrating this from radius 0 to radius R gives

EGR = −3G ∫₀^R ((4/3)πr²ρ)² dr = −3G ((4/3)πρ)² R⁵/5
    = −(3G/5R) ((4/3)πR³ρ)²
    = −3GM²/(5R),

where M = (4/3)πR³ρ is the total mass of the sphere.
Exercise 3.3 The kinetic energy will be EK = (3/2)N kT, where N is the number
of gas particles. Virial equilibrium is 2EK = −EGR, i.e.

3N kT = (3/5) GM²/R.

The requirement for gravitational collapse is therefore

3N kT < (3/5) GM²/R.

To reach Equation 3.7, we need to eliminate N and R. To a good approximation,
at recombination we can assume that the gas particle masses are the proton
mass mp, so the number of particles must be N = M/mp. We can also use
M = (4/3)πρR³ to eliminate R, since R = (3M/4πρ)^(1/3). Inserting these
substitutions gives

3 (M/mp) kT < (3/5) GM² (4πρ/3M)^(1/3),

which when rearranged in terms of M gives the required equation.

The current temperature of the CMB is about 2.7 K, and the redshift of
recombination is about z = 1000, so the photon temperature at recombination
must be T = 2.7(1 + z) ≈ 3000 K. Matter and radiation will just have been in
thermal equilibrium, so this will have been the matter temperature too. The
baryonic density will be proportional to (1 + z)³, and using Equation 1.26 and
ρb = Ωb ρcrit (Equation 1.22), we have that the baryonic density at z = 1000 will
be

ρb = ρb,0 (1 + z)³
   = ρcrit × Ωb,0 (1 + z)³
   = 1.8789 × 10⁻²⁶ × Ωb,0 h² (1 + z)³ kg m⁻³
   = 1.8789 × 10⁻²⁶ × 2.273 × 10⁻² × (1 + 1000)³ kg m⁻³
   ≈ 4.3 × 10⁻¹⁹ kg m⁻³.

Putting in the numbers gives

M > [5kT/(G mp)]^(3/2) × [3/(4πρb)]^(1/2)
  = [5 × (1.381 × 10⁻²³ J K⁻¹) × 3000 K / ((6.673 × 10⁻¹¹ N m² kg⁻²) × (1.673 × 10⁻²⁷ kg))]^(3/2) × [3/(4π × 4.3 × 10⁻¹⁹ kg m⁻³)]^(1/2)
  ≈ 2 × 10³⁶ kg ≈ 10⁶ M⊙, as required.
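The final evaluation can be checked numerically:

```python
import math

# Check of Exercise 3.3: the collapse mass limit at recombination.
k = 1.381e-23       # Boltzmann constant, J K^-1
G = 6.673e-11       # gravitational constant, N m^2 kg^-2
mp = 1.673e-27      # proton mass, kg
T = 3000.0          # matter temperature at recombination, K

rho_b = 1.8789e-26 * 2.273e-2 * (1 + 1000)**3   # baryon density, kg m^-3
M_min = (5 * k * T / (G * mp))**1.5 * (3 / (4 * math.pi * rho_b))**0.5
print(f"M > {M_min:.2g} kg, i.e. about {M_min / 1.99e30:.1g} solar masses")
```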
Exercise 3.4 For a flat universe, the comoving distance is the same as the
proper motion distance (Equation 1.56). This isn’t true in general (watch out!) but
it’s true in a flat universe. The proper motion distance is related to the angular
diameter distance dA by Equation 1.50, which gives dA = dcomoving /(1 + z). The
definition of angular diameter distance in Equation 1.47 gives us a relationship
between the size of an object as it was at the time of redshift z and the angular
size as it appears today. The proper size of the BAO wiggles is just the comoving
size divided by (1 + z), i.e. LBAO /(1 + z). The angular diameter distance to
redshift z is therefore dA = (LBAO /(1 + z))/θBAO . The comoving distance to
redshift z must therefore be dcomoving = dA × (1 + z) = LBAO /θBAO , as required.
Exercise 3.5 Here the trick is to use Equation 1.43. It follows from that
relation that a small comoving interval along the redshift axis must equal
δdcomoving = c δz/H(z). Setting this comoving interval to LBAO gives us
LBAO = c δz/H(z), so H(z) = c δz/LBAO , as required.
Exercise 3.6 No. The amplitude of the fluctuations could depend on the bias,
but the scale length itself is bias-independent.
Exercise 4.1 First, we need to get Equation 1.7 into a form where the only
time-dependent parameter is R. The density ρ is time-dependent and varies as
ρ = ρ0 (R0/R)³ (where subscript 0 indicates present-day values), so we have

(dR/dt)² = (8πG/3) ρ0 (R0/R)³ R² − c² = (8πGρ0 R0³/3) R⁻¹ − c²

(where we've used k = +1). If we set dR/dt = 0 and solve, we find that
Rmax = 8πGρ0 R0³/(3c²). Therefore

(dR/dt)² = c² Rmax/R − c².

Using the chain rule we have that

(dR/dθ)² = (dR/dt)² (dt/dθ)² = (dR/dt)² (R/c)²

and so

(dR/dθ)² = (R/c)² (c² Rmax/R − c²) = Rmax R − R²,

as required.

We're asked to verify that Equation 4.2 works rather than proving it, so all we
have to do is substitute it in. Differentiating Equation 4.2 with respect to θ gives

dR/dθ = (Rmax/2) sin θ

so

(dR/dθ)² = (Rmax²/4) sin²θ = (Rmax²/4)(1 − cos²θ).

Meanwhile,

Rmax R − R² = (Rmax²/2)(1 − cos θ) − (Rmax²/4)(1 − cos θ)²
            = (Rmax²/4)(2 − 2 cos θ) − (Rmax²/4)(1 + cos²θ − 2 cos θ)
            = (Rmax²/4)(2 − 2 cos θ − 1 − cos²θ + 2 cos θ)
            = (Rmax²/4)(1 − cos²θ),

which equals (dR/dθ)² as above.

Finally, we just need to differentiate Equation 4.3, which gives

dt/dθ = (Rmax/2c)(1 − cos θ) = R/c,

as required.

Therefore Equations 4.2 and 4.3 are a solution.
Exercise 4.2 To show this, we’ll first get things in terms of H. It’s a flat
matter-dominated universe, so Ωm = 1 = 8πGρm /(3H 2 ), thus 4πGρm = 3H 2 /2.
We also know that H(t) = ȧ/a. Substituting this into Equation 4.9, we have
δ̈ + 2H(t) δ̇ = 3H²(t) δ/2.

Next we use H(t) = 2/(3t) to reformulate this in terms of a differential equation
involving just δ and time:

δ̈ + (4/(3t)) δ̇ = (3/2)(2/(3t))² δ = (2/(3t²)) δ.

Next, let's try power law solutions δ = bt^c where b and c are constants. Then
δ̇ = bc t^(c−1) and δ̈ = bc(c − 1) t^(c−2). Substituting in, we find

bc(c − 1) t^(c−2) + (4/(3t)) bc t^(c−1) = (2/(3t²)) b t^c.

Collecting the terms together, we find that

bc(c − 1) t^(c−2) + (4/3) bc t^(c−2) = (2/3) b t^(c−2),

and dividing through by b t^(c−2) gives

c(c − 1) + (4/3)c = 2/3.
The solution to this quadratic equation is c = 2/3 or c = −1. The −1 solution is
known as the decaying mode, and is not physically relevant in this universe (it
decays more rapidly than the growing mode grows and is quickly negligible). The
2/3 power law time-dependence (which we found ultimately from linearized fluid
dynamic equations) is identical to Equation 4.8, which is why the latter is known
as the linear theory.
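The two roots of the quadratic can be checked directly:

```python
import math

# Check of Exercise 4.2: roots of c(c - 1) + (4/3)c = 2/3,
# i.e. c^2 + c/3 - 2/3 = 0 (growing and decaying modes).
a, b, c0 = 1.0, 1/3, -2/3
disc = math.sqrt(b * b - 4 * a * c0)
roots = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])
print(roots)   # decaying mode (-1) and growing mode (2/3)
```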
Exercise 4.3 The redder colour will be the one with the larger V-band
to B-band flux ratio SV /SB . The fluxes are related to the magnitudes by
This ratio is independent of the geometrical cross section A and of the luminosity
density ρ. If the cloud is deep enough, then the term in brackets is ≈ 1, so we just
have Lout /Lno dust = 1/(kh). We can now write this for Hα light:

Lout(Hα)/Lno dust(Hα) = 1/(kHα h).

For Hβ, we have that τHβ ≈ 1.45 τHα, so kHβ = 1.45 kHα, thus

Lout(Hβ)/Lno dust(Hβ) = 1/(kHβ h) = 1/(1.45 kHα h) = (1/1.45) × Lout(Hα)/Lno dust(Hα).

Therefore

Lout(Hα)/Lout(Hβ) = 1.45 × Lno dust(Hα)/Lno dust(Hβ). (S4.1)

This is independent of h, so we've now removed all dependence on the geometry.
So even if kh is enormous and Lout ≪ Lno dust, the luminosity ratio of Hα and Hβ
is only ever 1.45 times the ratio that you get with no dust, when enough dust is
evenly mixed with the gas emitting the emission lines.

Now suppose that you wrongly assumed that it's a simple dust screen with an
optical depth of τHα for Hα and τHβ = 1.45 τHα for Hβ. Your luminosities
would be

Lout(Hα) = Lno dust(Hα) × e^(−τHα),
Lout(Hβ) = Lno dust(Hβ) × e^(−1.45 τHα),

so the luminosity ratio would be

Lout(Hα)/Lout(Hβ) = (Lno dust(Hα)/Lno dust(Hβ)) × e^(0.45 τHα). (S4.2)

Comparing this to Equation S4.1, we have 1.45 = e^(0.45 τHα), or
τHα = ln(1.45)/0.45 ≈ 0.83. Since τHα ≈ 0.7AV, we have AV ≈ 1.2. So, if you
have an optically-thick cloud in which the dust is well-mixed with the gas, but you
wrongly assumed a foreground dust screen, you'd infer a V-band extinction of just
1.2 magnitudes, regardless of what the real extinction τtotal is from one end of the
cloud to the other.
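The inferred optical depth and extinction can be checked numerically:

```python
import math

# Check of Exercise 4.4: the optical depth and V-band extinction inferred
# if a well-mixed cloud is misread as a foreground screen.
tau_Ha = math.log(1.45) / 0.45    # from 1.45 = exp(0.45 tau_Ha)
A_V = tau_Ha / 0.7                # using tau_Ha ~ 0.7 A_V
print(f"tau_Ha = {tau_Ha:.2f}, A_V = {A_V:.1f} mag")
```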
Exercise 4.5 Astronomical absolute magnitudes are defined as
m = −2.5 log10 L + constant, so

dm = −2.5 d(log10 L) = −2.5 d(ln L)/ln 10 = (−2.5/ln 10)(1/L) dL. (S4.3)

Therefore

dN/dm = −(ln 10/2.5) L dN/dL. (S4.4)
The − sign just indicates that the magnitude increment dm is in the opposite
sense to the luminosity increment dL, and is usually neglected.
Exercise 4.6 The variance of a probability distribution p(x) is the mean of the
squares minus the square of the mean, i.e.

Var(x) = ∫₀¹ x² p(x) dx − (∫₀¹ x p(x) dx)².
[Figure S5.1 plots luminosity/(L⊙ sr⁻¹), on a logarithmic axis from 10⁸ to 10¹², against redshift z from 0 to 3, with tracks labelled Arp 220 and M82.]
Figure S5.1 This is the same as Figure 5.15, but with the approximate location
of one possible flux limit marked as a thick black line.
Exercise 6.1 Suppose that we wanted to separate a human being into protons
and electrons, then hold them one metre apart. For a 60 kg mass, the force
required would be F = (ne)²/(4πε0 r²), where r = 1 m, n = 60 kg/mp and
ε0 is the permittivity of free space. This comes out as a gigantic
F ≈ 3 × 10²⁹ kg m s⁻². The luminosity of the Sun is L⊙ = 3.83 × 10²⁶ W, so the
momentum flux from the Sun is L⊙/c = 1.28 × 10¹⁸ kg m s⁻². If we could
employ all the momentum flux from all the ≈ 10¹¹ stars in the Galaxy in keeping
the positive and negative parts separate, it would be just sufficient to maintain a
1 m separation for just 60 kg. The potential barrier for separating the charged
components of a plasma accreting around a black hole is clearly insuperable for
radiation pressure.
Exercise 6.2 Putting the numbers into Equation 6.6 gives

LE = 4π × (6.67 × 10⁻¹¹ N m² kg⁻²) × (3.00 × 10⁸ m s⁻¹) × (1.99 × 10³⁰ kg) × (1.67 × 10⁻²⁷ kg) / (6.65 × 10⁻²⁹ m²)
   = 1.26 × 10³¹ W.

The luminosity of the Sun is 3.83 × 10²⁶ W, which is far below the Eddington
limit.
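The Eddington luminosity can be checked numerically with the same rounded constants:

```python
import math

# Check of Exercise 6.2: the Eddington luminosity for one solar mass.
G = 6.67e-11        # gravitational constant, N m^2 kg^-2
c = 3.00e8          # speed of light, m s^-1
M_sun = 1.99e30     # solar mass, kg
mp = 1.67e-27       # proton mass, kg
sigma_T = 6.65e-29  # Thomson cross-section, m^2

L_E = 4 * math.pi * G * c * M_sun * mp / sigma_T
print(f"L_E = {L_E:.3g} W")
```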
Exercise 6.3 Assuming that the mass of a 100 W light bulb is (say) about 50 g,
we get an Eddington limit of just 0.2 W . Clearly, a light bulb radiates at much
more than the Eddington limit. Light bulbs don’t blow themselves apart because
they are not gravitationally bound.
Exercise 6.4 To obtain Equation 6.26 we start with Equation 1.53, then use
Equation 1.41. It immediately follows that

dV = dA² (1 + z)³ × 4πc dz/((1 + z) H(z)) = 4π dA² (1 + z)² c dz/H(z).

(We ignore the − sign, which just refers to the directions in which the
infinitesimal increments are measured.) Next, putting in the relationship between
angular diameter and luminosity distance, dL = (1 + z)² dA (Equation 1.50), gives

dV = (4π dL²/(1 + z)⁴) (1 + z)² c dz/H(z) = (4π dL²/(1 + z)²) c dz/H(z).

Dividing by dz and multiplying by H0 /H0 gives

dV/dz = 4πc dL²/((1 + z)² H(z)) = (c/H0) × 4π dL²/((1 + z)² H(z)/H0),

as required.

We can rearrange this as

4π dL²/(dV/dz) = (1 + z)² H(z)/c.

Finally, we use Equation 1.28: |dz/dt| = (1 + z) H(z) (again we'll not worry
about the sign). Therefore

(4π dL²/(dV/dz)) dt = (1/c)(1 + z) dz,

which is Equation 6.24, as required.
Exercise 6.5 The angular radius θ will satisfy θ ≈ tan θ = rh/D,
where D = 10 Mpc and rh is given by Equation 6.29:
rh = 10 × (10⁸/10⁸) × (220/200)⁻² pc = 8.3 pc. Plugging in the numbers, we
have θ ≈ rh/D = 8.3 pc/10 Mpc = 8.3 × 10⁻⁷ radians. In arcseconds this is
θ = 8.3 × 10⁻⁷ × (360°/2π) × 60 × 60 = 0.17″ (or double that for the diameter).
This is clearly smaller than the seeing limit of ground-based telescopes.
Exercise 6.6 The e-folding timescale for Eddington-limited black hole growth
is te-fold = 4 × 10⁸ × η/(1 − η) yr. There have been 3 × 10⁹/te-fold e-foldings
since the start of the Universe, or 7.5 × (1 − η)/η e-foldings. To reach 10⁶ M⊙,
one needs loge(10⁶/10) = 11.5 e-foldings. If η = 0.42, this is only time for
10.4 e-foldings. In order to grow a black hole large enough, it must be spinning
more slowly and therefore have a lower accretion efficiency.
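A quick numerical comparison of the e-foldings available and needed:

```python
import math

# Check of Exercise 6.6: e-foldings available versus needed for eta = 0.42.
eta = 0.42
t_efold = 4e8 * eta / (1 - eta)    # e-folding time, years
n_available = 3e9 / t_efold        # e-foldings available in 3 Gyr
n_needed = math.log(1e6 / 10)      # to grow a 10 M_sun seed to 1e6 M_sun
print(f"available: {n_available:.1f}, needed: {n_needed:.1f}")
```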
Exercise 7.1 Comoving distances add, so rS = rL + rLS . Therefore
rLS = rS − rL . In flat space, angular diameter distance is simply comoving
distance divided by (1 + z) (Chapter 1), but in this case we need the redshift of
the background source as seen from the lens. We could write this factor as
(1 + zLS ). This is the factor by which the Universe expanded between the source
redshift and the lens redshift, i.e. RL /RS , where R is the scale factor. But
RL/RS = (RL/R0)/(RS/R0) = (R0/RS)/(R0/RL)

(where the subscript 0 refers to the present day), so (1 + zLS) = (1 + zS)/(1 + zL).
Therefore our final expression for the angular diameter distance DLS is

DLS = (rS − rL) × (1 + zL)/(1 + zS).
Exercise 7.2 First, matching distances along the top of Figure 7.7 shows that
θDS = βDS + α̂DLS. But α = α̂DLS/DS, so θDS = βDS + αDS. Dividing
out the scalar DS gives θ = β + α, which we can rearrange to β = θ − α, as
required.
Exercise 7.3 We set β = 0 in Equation 7.8. We can rearrange this to show that

θ = √[(4GM/c²)(DLS/(DL DS))].
But what would this look like? The background object is exactly behind the lens
and it’s deflected by an angle θ. Is it deflected to the left or right or up or down?
In fact, there is nothing to give the deflection any particular direction, so the
background source is lensed into a ring. These are very rare, but an example is
shown in Figure S7.1.
Exercise 7.4 β² + 4θE² is always positive, but the square root of it can be
positive or negative. √(β² + 4θE²) > β unless θE = 0, so the negative root must
always give a negative θ. This is indeed a physical solution and represents an
angle measured in the opposite direction: as shown in Figure 7.7, the image is on
the other side of the lens. Note that one image is at θ > θE and the other is at
θ < θE , unless θ = θE and the system is an Einstein ring.
Figure S7.1 The gravitational lens 0038+4133 (an Einstein ring) from the
COSMOS survey, taken by the HST. The image is 15″ by 15″.
Exercise 7.5 From the previous exercise, a source can have multiple images, so
there is not necessarily a unique image position θ for a given source position β.
In mathematical terms, we would speak of the mapping β → θ as being
one-to-many. However, each image position θ does map in a one-to-one way onto
a source position β, i.e. each image position can correspond to only one position
in the background source. To see why, consider Equation 7.4. The function α(θ)
must be a single-valued function, i.e. any particular input θ can give only one
possible output α. Therefore there can be only one value of β for a given input θ.
Exercise 7.6 We’re asked to differentiate Equation 7.12, which gives
dβ/dθ = 1 + (θE2 /θ2 ). This gives us one of the fractions in Equation 7.16. The
magnification is therefore

(θ/β)(dθ/dβ) = (θ/β)(1 + θE²/θ²)⁻¹ = θ (θ − θE²/θ)⁻¹ (1 + θE²/θ²)⁻¹
             = (1 − θE²/θ²)⁻¹ (1 + θE²/θ²)⁻¹ = (1 + θE²/θ² − θE²/θ² − θE⁴/θ⁴)⁻¹
             = (1 − θE⁴/θ⁴)⁻¹,

as required.
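The identity can also be verified numerically for a few sample image positions:

```python
# Check of Exercise 7.6: (theta/beta)(dtheta/dbeta) equals (1 - thetaE^4/theta^4)^-1,
# using beta = theta - thetaE^2/theta (Equation 7.12) for a point-mass lens.
thetaE = 1.0
for theta in (1.5, 2.0, 5.0):
    beta = theta - thetaE**2 / theta
    dbeta_dtheta = 1 + thetaE**2 / theta**2
    mu = (theta / beta) / dbeta_dtheta          # (theta/beta)(dtheta/dbeta)
    mu_closed = 1 / (1 - thetaE**4 / theta**4)  # closed-form magnification
    assert abs(mu - mu_closed) < 1e-12
print("magnification identity verified")
```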
Exercise 7.7 A negative magnification means that the image is mirror-reversed.
For example, a positive change dβ would have a corresponding dθ in the opposite
direction, so dθ is negative. Therefore dθ/dβ is negative in Equation 7.16.
Exercise 7.8 We start from Equation 7.24. The mass enclosed is Σπξ² and we
set ξ = DLθ:

α̂ = 4GM(ξ)/(c²ξ) = (4G/(c²ξ)) × Σπξ² = (4G/c²) × Σπ × DLθ.

Now,

α = (DLS/DS) α̂, (Eqn 7.6)

so

α = (4πGΣ/c²) × (DL DLS/DS) θ,

as required.
If we then set Σ = Σcr , we find that α(θ) = θ for any θ, so β = 0. This means
that the gravitational lens is acting as a perfect focusing lens! However, this is a
very special case — gravitational lenses in general do not focus light. As ‘lenses’
in the optical sense, they have all forms of aberration, except of course chromatic
aberration since gravitational lensing is strictly achromatic.
Exercise 7.9 From left to right, they are a saddle point, a maximum and a
minimum.
Exercise 7.10 The time delay of the image at the centre increases. In a diagram
like Figure 7.15, the central panel showing the gravitational time delay would be
acquiring a sharper and higher point in the centre. When the lens potential
becomes a singular isothermal sphere, the time delay becomes infinite, so the
image disappears. Photons would take an infinite amount of time to climb out of
the infinitely-deep potential well, and (by symmetry) spend another infinite
amount of time falling in beforehand. But a more thoughtful answer is that this
deep potential well would form a black hole. Right from Equation 7.1, we’ve been
assuming a weak-field limit, so a better answer is that these simple assumptions
break down as the potential becomes more extreme.
Exercise 7.11 The background objects have the same redshift, so we
could think of the luminosity function as differential source counts, thus
dN/dS ∝ S^(−α). Therefore the number of objects per unit area brighter than a
flux S0 will be N(>S0) ∝ S0^(1−α), which we could write as

N(>S0) = k S0^(1−α).

If the background galaxies are gravitationally magnified by a factor of µ, the
intrinsic fluxes will be Sintrinsic = S/µ, while the comoving volume sampled will
be smaller by a factor of 1/µ. Therefore the number of galaxies brighter than an
observed flux S0 will be

Nlensed(>S0) = (k/µ)(S0/µ)^(1−α) = k µ⁻¹ S0^(1−α) µ^(α−1) = k S0^(1−α) µ^(α−2) = N(>S0) µ^(α−2).

Therefore for a magnification of µ (where µ > 1), the lensing changes the number
of background galaxies per unit area by a factor of µ^(α−2). For this factor to be
bigger than 1 we need

µ^(α−2) > 1,
so log(µ^(α−2)) > log 1 = 0,
thus (α − 2) log µ > 0.
We already know that log(µ) > 0 (because µ > 1), so this can happen only if
α > 2. For example, if the source counts have a Euclidean slope (α = 2.5), then
lensing would increase the number of objects. The effect of sampling a smaller
volume due to lensing, and so finding fewer objects than the flux magnification
on its own would suggest, is known as the Broadhurst effect. (See Broadhurst,
T.J., Taylor, A.N. and Peacock, J.A., 1995, Astrophysical Journal, 438, 49.)
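A short numerical illustration of the μ^(α−2) factor, with magnifications and slopes chosen arbitrarily:

```python
# Illustration of Exercise 7.11: the factor mu^(alpha - 2) by which lensing
# changes the surface density of background sources (sample values only).
alpha = 2.5                      # Euclidean source-count slope
factors = {mu: mu**(alpha - 2) for mu in (2.0, 5.0, 10.0)}
print(factors)                   # all > 1: counts are boosted for alpha > 2

flat = 1.5                       # a flatter slope, alpha < 2
print(10.0**(flat - 2))          # < 1: a net deficit of objects
```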
Exercise 8.1 There’s no guarantee that the re-emitted photon comes out in the
same direction — in fact, it probably won’t. A corollary is that any Lyman α
cloud should glow faintly in Lyman α light in all directions from these re-emitted
photons, even if the cloud is not intercepting our line of sight to a quasar (because
there will always be some line of sight that does). This re-emission is in general
too faint to detect. However, Lyman α emission can sometimes be seen if there are
internal ionizing sources (e.g. star formation) within damped Lyman α systems,
which you will meet later in the chapter.
Exercise 8.2 The column density through the centre will be the same as that
seen through a cubical cloud with a side 2 Mpc, facing the observer (because the
absorption doesn't depend on the distribution of material that the light doesn't
pass through). One Mpc is about 3 × 10²⁴ cm, so a number density of 1 cm⁻³ is
(3 × 10²⁴)³ Mpc⁻³ = 2.7 × 10⁷³ Mpc⁻³. The total number of neutral hydrogen
atoms in the cube must be 2.7 × 10⁷³ Mpc⁻³ × 8 Mpc³ = 21.6 × 10⁷³, which is
spread over a projected area of 2 × 2 Mpc² = 36 × 10⁴⁸ cm². Therefore the
column density must be 21.6 × 10⁷³/(36 × 10⁴⁸) cm⁻² ≈ 6 × 10²⁴ cm⁻².
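A numerical check of the column density; the cloud density of 1 atom cm⁻³ below is an assumption inferred from the numbers in the solution, since the question itself is not restated in this excerpt:

```python
# Check of Exercise 8.2, assuming a cloud density of 1 atom cm^-3
# (inferred from the numbers in the solution; the question is not shown here).
cm_per_Mpc = 3e24
n = 1.0                                   # number density, cm^-3 (assumption)
N_atoms = n * cm_per_Mpc**3 * 8           # atoms in the 2 Mpc cube (8 Mpc^3)
area = (2 * cm_per_Mpc)**2                # projected area, cm^2
column = N_atoms / area                   # column density, cm^-2
print(f"N_HI = {column:.1g} cm^-2")
```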
Exercise 8.3 In order for a hydrogen atom to absorb an Hα photon, the
photon must have the right energy, and there must be an atom with an electron in
the n = 2 energy level ready to absorb the photon. This energy level is at
E = −13.6/n2 eV = −13.6/4 eV = −3.4 eV. In order to be in such a state, the
atom must have absorbed a photon of energy (−3.4 eV) − (−13.6 eV) = 10.2 eV.
Photons of this energy require a black body temperature of order

T ≈ E/k = (10.2 eV × 1.602 × 10⁻¹⁹ J eV⁻¹)/(1.381 × 10⁻²³ J K⁻¹) = 120 000 K.
This is hotter than the surface of an O star, and is much hotter than the typical
temperatures in the intergalactic medium. Lyman α clouds are too cold to have
many atoms with electrons already excited to the n = 2 level, so the clouds have
almost no Hα absorption.
Exercise 8.4 We can write σ(ν) = σ0 (ν/νlimit)⁻³, where
σ0 = 7.88 × 10⁻²² m², and νlimit is the frequency of the Lyman limit. Writing
Jν = kν⁻ᵅ and plugging the terms in, we find

τ = NH I ∫_{νlimit}^∞ (σJν/(hν)) dν / ∫_{νlimit}^∞ (Jν/(hν)) dν
  = NH I σ0 ∫_{νlimit}^∞ (ν/νlimit)⁻³ kν^(−α−1) dν / ∫_{νlimit}^∞ kν^(−α−1) dν
  = NH I σ0 νlimit³ ∫_{νlimit}^∞ ν^(−α−4) dν / ∫_{νlimit}^∞ ν^(−α−1) dν
  = NH I σ0 νlimit³ × [νlimit^(−α−3)/(α + 3)] / [νlimit^(−α)/α]
  = NH I σ0 α/(α + 3),

where we first cancelled the h terms, then cancelled the k terms. Setting τ > 1,
we find NH I > 1.3 ((α + 3)/α) × 10²¹ m⁻², as required.
Exercise 8.5 Equation 1.28 relates dz/dt to H(z). Taking the modulus
and reciprocal of that equation gives (1 + z) |dt/dz| = 1/H(z).
A population with constant proper sizes has constant A in Equation 8.2,
and a constant comoving density is constant nco in the same equation.
Therefore d²N ∝ (1 + z)³ |dt/dz| ∝ (1 + z)²/H(z). If we write
dX/dz = (1 + z)² H0/H(z), then

d²N = nco A × (1 + z)² (c/H(z)) dNH I dz

gives

d²N = (c/H0) nco A dX dNH I,

which is constant.
Exercise 8.6 Gravitational lensing of the background quasar by the damped
Lyman α system could cause such an effect. The strength of this effect, and the
biases that it creates on the measured cosmic evolution of neutral gas, are still the
subject of debate. However, it turns out that this is probably only a 10–20% effect
on ΩH I at z > 2.
Exercise 8.7 Dust in the damped Lyman α systems should redden the quasar
spectra, so one might compare the optical spectral indices or B–V colours of
quasars with and without damped Lyman α absorbers. However, if damped
systems are very dusty, they may induce so much reddening that the quasars drop
out of the parent sample, so bright quasar catalogues would be biased to detecting
low-reddening systems. Statistical analyses suggest that this latter effect does not
dominate, but direct results on quasar reddening are currently conflicting.
Exercise 8.8 The energy of the hydrogen Lyman limit is E = 13.6 eV,
i.e. E = 13.6 × 1.602 × 10−19 J = 2.179 × 10−18 J. This corresponds
to a frequency of ν = E/h, where h is Planck’s constant, which comes
out as ν = 3.289 × 1015 Hz. The wavelength of this light is λ = c/ν,
where c is the speed of light, which comes out as λ = 9.116 × 10−8 m, or
91.2 nm (i.e. 912 Å) to three significant figures. For the helium Lyman limit,
λHe = λ × 13.6/54.4 = 22.8 nm.
The redshifted hydrogen Lyman limit in Figure 8.20 is at a wavelength of
912 × (1 + z) Å = 912 × (1 + 3.2) Å = 3830 Å.
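The three wavelengths in Exercise 8.8 can be checked numerically:

```python
# Check of Exercise 8.8: hydrogen and helium Lyman limits,
# and the hydrogen limit redshifted to z = 3.2.
h = 6.626e-34       # Planck's constant, J s
c = 2.998e8         # speed of light, m s^-1
eV = 1.602e-19      # one electronvolt, J

E = 13.6 * eV                # hydrogen Lyman-limit energy
lam = c * h / E              # wavelength, m
lam_He = lam * 13.6 / 54.4   # helium Lyman limit
lam_obs = lam * (1 + 3.2)    # hydrogen limit observed at z = 3.2
print(f"H: {lam*1e9:.1f} nm, He: {lam_He*1e9:.1f} nm, "
      f"redshifted H: {lam_obs*1e10:.0f} Angstrom")
```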
Acknowledgements
Grateful acknowledgement is made to the following sources:
Figures
Cover image courtesy of the Spitzer Space Telescope,
© NASA/JPL-Caltech/STScI/CXC/UofA/ESA/AURA/JHU;
Figure 1.9: supernova data taken from Blondin, S. et al. (2008) The Astrophysical
Journal, 682, 724; Figure 1.10 top left: [Link] Christian Buil;
Figure 1.10 top right: European Southern Observatory (ESO); Figure 1.10
bottom left: Stanford, S. A. et al. (2000) ‘The first sample of ultraluminous
infrared galaxies at high redshift’, The Astrophysical Journal Supplement Series,
131, 185, The American Astronomical Society; Figure 1.10 bottom right: van
Dokkum, P. G. et al. (2005) ‘Gemini near-infrared spectrograph observations of a
red star-forming galaxy at z = 2.225: evidence of shock ionization due to a
galactic wind’, The Astrophysical Journal, 622, L13, The American Astronomical
Society; Figure 1.11: Carroll, S. M. (2004), ‘Why is the Universe accelerating?’,
Freedman, W. L. ed. Measuring and Modelling the Universe, Carnegie
Observatories Astrophysics Series, 2, Carnegie Observatories; Figure 1.13:
NASA and the Hubble Heritage Team (STScI/AURA); Figure 1.16: Springel, V.
et al. (2005) ‘Simulations of the formation, evolution and clustering of galaxies
and quasars’, Nature, 435, 629 ; Figures 1.18 & 1.19: adapted from Carroll,
S. M., Press, W. H. and Turner, E. L. (1992) ‘The Cosmological Constant’,
Annual Review of Astronomy & Astrophysics, 30, 499, © Annual Reviews Inc.;
Figure 1.20: Adapted from Knop R. A. et al. (2003), ‘New Constraints on ΩM ,
ΩΛ and w from an independent set of 11 high-redshift supernovae observed with
the Hubble Space Telescope', The Astrophysical Journal, 598, 102, © The
American Astronomical Society;
Figures 2.1, 2.2 & 2.9: NASA/WMAP Science Team; Figure 2.3: adapted from
Coc, A. (2009) ‘Big-bang nucleosynthesis: a probe of the early Universe’,
Nuclear Instruments & Methods in Physics Research A, 611, 224, Elsevier
Science BV; Figure 2.4: adapted from a figure by Professor Edward L. Wright,
UCLA; Figures 2.7 & 2.8: Peacock, J. A. (1999) Cosmological Physics,
Cambridge University Press; Figure 2.10: University of Hawaii; Figure 2.11:
Granett, B. R. et al. (2008) ‘An imprint of super-structures on the microwave
background due to the Integrated Sachs–Wolfe effect’, The Astrophysical Journal
Letters, 683, L99, Institute of Physics Publishing; Figure 2.12: adapted from
Dunkley, J. et al. (2009) ‘Five year Wilkinson Microwave Anisotropy Probe
(WMAP) observations: likelihoods and parameters from the WMAP data',
Astrophysical Journal Supplement Series, 180, 306, Institute of Physics
Publishing; Figures 2.13 & 2.15: Hu, W. and Dodelson, S. (2002) ‘Cosmic
microwave background anisotropies’, Annual Reviews of Astronomy &
Astrophysics, 40, 171, Annual Reviews; Figures 2.14 & 3.7: adapted from figures
by Edward L. Wright, UCLA and based on data from Kowalski, M. et al. (2009)
The Astrophysical Journal Supplement Series, 686, 749; Figure 2.16: adapted
from Larson, D. et al. (2010) ‘Seven year Wilkinson Microwave Anisotropy
Probe (WMAP) observations: power spectra and WMAP-derived parameters',
Astrophysical Journal Supplement Series (in press, arXiv:1001.4635), Institute of
Physics Publishing; Figures 2.17 & 2.18: adapted from Komatsu, E. et al. (2009)
312
Acknowledgements
from Dey, A. et al. (1998) ‘A galaxy at z = 5.34’, The Astrophysical Journal, 498,
L93, The American Astronomical Society; Figure 4.11: adapted from Bell, E. F.
et al. (2003) ‘The optical and near infrared properties of galaxies. 1. luminosity
and stellar mass functions’, The Astrophysical Journal Supplement Series, 149,
289, The American Astronomical Society; Figure 4.12: NRAO; Figure 4.13:
Sloan Digital Sky Survey; Figure 4.14: adapted from Yates, M. G. and Garden R.
P. (1989) ‘Near-simultaneous optical and infrared spectrophotometry of active
galaxies’, Monthly Notices of the Royal Astronomical Society, 241, 167, The
Royal Astronomical Society; Figure 4.16: adapted from Figure 2.3 of Peterson, B.
M. (1997) An Introduction to Active Galactic Nuclei, Cambridge University Press;
Figure 4.17: adapted from Richards, G. T. et al. (2006), ‘The Sloan Digital
Sky Survey Quasar Survey: quasar luminosity function from data release 3’,
The Astronomical Journal, 131, 2766, The American Astronomical Society;
Figure 4.18 left: A. Fujii; Figure 4.18 right: R. Williams (STScI), the Hubble
Deep Field Team and NASA; Figure 4.19: Robert Williams and the Hubble Deep
Field Team (STScI) and NASA; Figure 4.20: NASA/ESA, CXC, JPL-Caltech,
STScI, NAOJ, J. E. Geach (Univ Durham) et al.; Figure 4.21: adapted from
Gabasch, A. et al. (2004) ‘The evolution of the luminosity functions in the
FORS deep field from low to high redshift’, Astronomy & Astrophysics, 421, 41,
ESO; Figure 4.22: NASA, ESA, S. Beckwith (STScI) and the HUDF Team;
Figures 4.23 & 4.24: adapted from Bouwens, R. J. et al. (2009) ‘Constraints on
the first galaxies: z ∼ 10 Galaxy Candidates from HST WFC3/IR', Submitted to
Nature (arXiv:0912.4263); Figure 4.25: adapted from Cohen, J. G. et al. (1996)
‘Redshift clustering in the Hubble Deep Field’, The Astrophysical Journal, 471, 5,
The American Astronomical Society; Figure 4.26: adapted from Bouwens, R. J.
et al. (2004) ‘Galaxy size evolution at high redshift and surface brightness
selection effects: constraints from the Hubble Ultra Deep Field’, The Astrophysical
Journal, 611, 1, The American Astronomical Society; Figure 4.27: adapted from
van Dokkum, P. G., Kriek, M. and Franx, M. (2009) ‘A high stellar velocity
dispersion for a compact massive galaxy at redshift z = 2.186’, Nature, 460, 717,
Macmillan Publishers Limited; Figure 4.28: NASA Jet Propulsion Laboratory
(NASA-JPL); Figure 4.29: H. Ferguson, M. Dickinson, R. Williams, STScI and
NASA; Figure 4.30: adapted from Bell, E. F. et al. (2004) ‘Nearly 5000 distant
early-type galaxies in COMBO-17: a red sequence and its evolution since z ∼ 1’,
The Astrophysical Journal, 608, 752, The American Astronomical Society;
Figure 5.1: adapted from Hauser, M. G. and Dwek, E. (2001) ‘The Cosmic
Infrared Background: Measurements and Implications’, Annual Review of
Astronomy & Astrophysics, 39, 249, Annual Reviews Inc; Figure 5.2: adapted
from Hopwood, R. H. et al. (2010) ‘Ultra deep AKARI observations of Abell 2218:
resolving the 15 μm extragalactic background light’, Astrophysical Journal Letters,
716, 45; Figure 5.3: [Link]; Figure 5.4: adapted from
Blain, A. W. et al. (2002) ‘Submillimeter galaxies’, Physics Reports, 369,
111, Elsevier Science B.V.; Figure 5.5: adapted from Hughes D. H. et al.
(1998) ‘High-redshift star formation in the Hubble Deep Field revealed by
a submillimetre-wavelength survey’, Nature, 394, 241; Figure 5.6: BLAST
Collaboration; Figure 5.7: ESA and SPIRE Consortium; Figure 5.8: adapted from
Serjeant, S. et al. (1998) ‘A spectroscopic study of IRAS F10214+4724’, Monthly
Notices of the Royal Astronomical Society, 298, 321, Royal Astronomical Society;
Figure 5.9: adapted from Surace, J. A. et al. (1998) ‘HST/WFPC2 Observations
Digital Sky Survey II – data release 7’, Astronomy & Astrophysics, 505, 1087,
European Southern Observatory; Figure 8.10: Prochaska, J. X. et al. (2005)
‘The SDSS damped Ly alpha survey: data release 3’, Astrophysical Journal,
635, 123, The American Astronomical Society; Figure 8.12: Reynolds, S.
C. (2007) ‘Quasar Absorbers and the InterGalactic Medium’, taken from a
pedagogical Seminar at the Royal Observatory, Edinburgh, 8 March 2007,
[Link]/ifa/postgrad/pedagogy/2007− [Link]; Figure 8.13: Möller,
P. and Warren, S. J. (1993) ‘Emission from a damped Ly alpha absorber at
z = 2.81’, Astronomy & Astrophysics, 270, 43, European Southern Observatory;
Figure 8.15: Smette, A. et al. (1992) ‘A spectroscopic study of UM 673 A & B:
on the size of the Lyman-alpha clouds’, Astrophysical Journal, 389, 39, The
American Astronomical Society; Figure 8.16: Nick Gnedin, Department of
Astronomy & Astrophysics, The University of Chicago; Figures 8.17 & 8.19:
Fan, X. et al. (2006) ‘Observational constraints on cosmic reionization’, Annual
Review of Astronomy & Astrophysics, 44, 415, © 2006 by Annual Reviews;
Figure 8.18: Becker, G. D. et al. (2007) ‘The evolution of optical depth in the Ly
alpha forest: evidence against reionization at z ∼ 6’, The Astrophysical Journal, 662,
72, The American Astronomical Society; Figure 8.20: adapted from Möller, P.
and Jakobsen, P. (1990) ‘The Lyman continuum opacity at high redshifts: through
the Lyman forest and beyond the Lyman valley’, Astronomy & Astrophysics, 228,
299, European Southern Observatory; Figure 8.21: Smette, A. et al. (2002)
‘Hubble Space Telescope Space Telescope Imaging Spectrograph Observations of the
He II Gunn–Peterson effect toward HE 2347-4342’, Astrophysical Journal,
564, 542, The American Astronomical Society; Figure 8.23: Carilli, C. L. et
al. (2002) ‘H I 21 centimeter absorption beyond the epoch of reionization’,
The Astrophysical Journal, 577, 22, The American Astronomical Society;
Figures 8.24 & 8.25: Cristiani, S. et al. (2007) ‘The CODEX-ESPRESSO
experiment: cosmic dynamics, fundamental physics, planets and much more . . .’,
Il Nuovo Cimento, 122B, 1165, Societa Italiana di Fisica.
Every effort has been made to contact copyright holders. If any have been
inadvertently overlooked the publishers will be pleased to make the necessary
arrangements at the first opportunity.
Index
Items that appear in the Glossary have page numbers in bold type. Ordinary
index items have page numbers in Roman type.
21 cm forest, 279
21 cm Gunn–Peterson test, 277
21 cm transition, 277
2dF galaxy redshift survey, 110
2dF quasar survey, 116
3C273, 28, 139, 140
Abell cluster catalogue, 100
Abell 1835 galaxy, 247
Abell 2218 galaxy cluster, 216, 246
absorption distance, 260
acceleration four-vector, 288
accretion efficiency, 188
accretion luminosity, 186
achromatic, 217
acoustic peaks, 72, 76
action, 188
active galactic nuclei, 141
active galaxies, 155, 209
ADAFs, 187
adaptive optics, 196
adiabatic, 65
adiabatic expansion, 44, 56
adiabatic perturbations, 66
advection, 187
advection-dominated accretion flows, 187
age of the Universe, 25, 26
AGN, 170, 180, 205
AKARI space telescope, 170, 172
ALMA, 165
Andromeda galaxy, 109, 154, 196
angular correlation function, 114, 126
angular diameter distance, 32, 34, 99, 105, 220
annihilation, 47
Antennae galaxies, 168
anthropic principle, 78
anti-hierarchical, 178
apparent recession velocity, 22, 32
Arp 220 galaxy, 174
ASCA, 205
associated Legendre polynomials, 67
astronomical filters, 97
ATIC, 93
axion, 93
B stars, 132, 154, 169
Baldwin–Phillips–Terlevich diagram, 142
Balmer decrement, 132
Balmer line, 132, 169
Balmer series, 256
baryogenesis, 46
baryon asymmetry, 46
baryon density, 24, 47, 72
baryon drag, 73
baryon number, 46
baryon wiggles, 116, 245, 277
baryonic acoustic oscillations, 116, 246
baryosynthesis, 46
B-band filter, 96
BCGs, 178
beam, 149
BeppoSAX, 205
bias parameter, 115
Big Bang nucleosynthesis, 46, 238
big rip, 87
binary pulsar, 211
binary supermassive black holes, 212
Birkhoff’s theorem, 123, 183, 228
black body, 45
black body radiation, 42
black body spectrum, 40
black hole accretion, 273
black hole mass density, 194
black holes, 37, 38, 55, 115, 140, 141, 155, 159, 170, 172, 183, 237, 273
Blandford and Kochanek elliptical density profile, 236
BLAST, 164
blazars, 141
blue cloud, 154, 156
blurring, 126
B-mode, 79
bolometric correction, 194
bolometric luminosity, 33
Boltzmann distribution, 48
bottom-up structure formation, 122
break luminosity, 135
bremsstrahlung, 99
brightest cluster galaxy, 100, 178
broad line region, 194
brown dwarfs, 38, 92
Bullet cluster, 248
Butcher–Oemler effect, 104, 176
BzK galaxies, 146, 166
calculus of variations, 188
Calzetti extinction law, 148
Canis Major dwarf galaxy, 108
Cartesian coordinates, 16
causality, 15, 53
caustics, 231, 235
CDM, 121, 240, 257
central engine, 141
central limit theorem, 138
Cepheid variables, 105, 106
chain galaxies, 144, 154
Chandra, 205, 248
Chandra Deep Field North, 173, 206
Chandra Deep Field South, 165, 173, 206
Chandra space telescope, 205
chronology protection conjecture, 185
cirrus confusion noise, 170
cirrus dust, 170
CLASS, 249
cloud-in-cloud problem, 128
Cloverleaf lens, 249
CMB, 17, 19, 40, 160, 253, 257, 258, 270
CMB photons, 100
CMB power spectrum, 68, 211
CMB spectrum, 100
CO emission, 98
CO molecules, 98
COBE satellite, 42, 45, 68, 257, 276
CODEX, 278, 279
cold dark matter, 121
colour–density relation, 155
colour–magnitude diagram, 154
column density, 132, 204, 254, 258–261, 264
Coma, 109
COMBO-17 survey, 135, 154
comoving coordinates, 29, 31
comoving distance, 29, 30, 31, 34, 100, 220
comoving volume, 34
comoving volume derivative, 35
complete sample, 137
complex numbers, 61
Compton scattering, 101, 204
Compton y parameter, 102
Compton-thick active galaxies, 204, 206
concordance cosmology, 78
confusion limit, 149, 151
conservation of angular momentum, 218
conservation of energy, 19
convergence, 241
convolution, 126, 127
cooling flow problem, 105
coordinates
  comoving, 29, 31
  proper, 29, 31
correlation function, 113
cosmic censorship hypothesis, 185
cosmic microwave background, 17, 19, 40, 160, 253, 257, 258, 270
cosmic near-infrared background, 276
cosmic rest frame, 17
cosmic shear, 240, 241, 277
cosmic star formation history, 146, 148, 148, 162, 174, 175, 178
cosmic time, 17
cosmic variance, 68
cosmic X-ray background, 203
Cosmical Dynamics Experiment, 279
cosmological constant, 19, 27, 28, 36, 83, 84, 86, 107
cosmological event horizon, 37
Cosmological Evolution Survey, 243
cosmological redshift, 20
cosmological time dilation, 20, 33
COSMOS, 151, 243–245, 251
critical density, 23, 41, 47
critical lines, 235
cross section, 102, 254
CTIO, 244
Curie temperature, 54
curvature, 16, 18
curve of growth, 264
cycloid equations, 123
cycloid solution, 123
Cygnus A, 139
damped Lyman α systems, 261
dark energy, 84, 88
dark energy density, 84
dark matter, 24, 84, 92, 99, 121, 128, 236
dark matter density, 72
dark matter haloes, 115
dark sector, 84
Darwin mission, 172
de Sitter spacetime, 36
de Sitter universe, 87
de Vaucouleurs law, 98
deceleration parameter, 23
density fluctuations, 257
density parameters, 23, 23, 27, 28
density perturbations, 55, 63, 81
DES, 244
deuterium, 51, 257
deuterium abundance, 51, 257
deuteron, 49
differential magnification, 218
differential number counts, 120
differential source counts, 13, 120
diffraction limit, 165, 171
diffusion damping, 73
Digitized Sky Survey, 151
dimensionless frequency, 102
dimensionless power spectrum, 64
  of galaxies, 113
dimensionless scale factor, 20, 124, 126
dipole, 70
DIRBE, 276
Distant Red Galaxies, 166
Dn–σ relation, 99
DOGs, 166
Doppler peaks, 72
dormant quasars, 194
double quasars, 249
downsizing, 178
DRGs, 166
dry merger, 157
DSS, 151
dust, 56, 129, 131, 148, 159, 168, 175, 204, 218
  tori, 172
Dust Obscured Galaxies, 166
duty cycle, 207
early Integrated Sachs–Wolfe effect, 74
early-type galaxy, 94
Eddington limit, 187, 193
Eddington luminosity, 186
Eddington ratios, 209, 210
Eddington timescale, 187
Einstein Cross, 249
Einstein radius, 221, 228, 237
Einstein ring, 222
Einstein tensor, 83
Einstein’s field equations, 19, 183
Einstein–de Sitter model, 27, 124
ekpyrotic Universe, 81
elliptical galaxies, 98, 102, 144, 152
eMERLIN, 173, 229
E-mode, 79
energy conservation, 44, 224, 225
energy density, 43, 44, 59, 86
energy–momentum tensor, 44, 83
entropy per baryon, 47, 47
equation of state, 56, 59, 83, 84
equivalent width, 263
ERO, 146
EROS, 239
escape fraction, 273
EUCLID, 245
Euclidean-normalized differential source counts, 160
Euclidean source count, 120, 160
Euclidean space, 12
Euler–Lagrange equation, 190, 192
European Extremely Large Telescope, 279
event, 14
event horizon, 37, 53, 184
e-VLA, 173
exoplanet, 240
expansion of the Universe, 278, 279
extinction, 131, 148
extragalactic background, 159, 162
Extremely Red Objects, 146, 166
extrinsic curvature, 70
Faber–Jackson relation, 98, 207
failed dwarf galaxies, 129
faint blue galaxies problem, 120
false vacuum, 57
invariant, 14, 287
inverse magnification tensor, 234, 241
ionization, 41
ionization parameter, 266
ionizing background, J−21, 266, 267
ionizing radiation, 267
IRAS, 166
IRAS FSC 10214+4724 galaxy, 166, 206, 217–219
iron Kα line, 205
IRTS, 276
ISO, 167
isotropic, 17, 19
IXO, 205
I Zw 18 galaxy, 255
James Clerk Maxwell Telescope, 162
James Webb Space Telescope, 172
jansky, 160
JDEM, 245
Jeans mass, 103
jet-induced star formation, 180
jets, 141, 180, 199
JVAS, 249
JWST, 172, 275
Kaiser effect, 112
Kα line, 198
K-band filter, 96
K-correction, 34, 162, 175, 204
Kelvin–Helmholtz instability, 179
Kepler’s second law, 218
kernel, 127
Kerr metric, 184
kinetic mode, 180
kinetic S–Z, 101
Lagrangian, 188
Large Magellanic Cloud, 108, 237
large-scale structure, 19, 29, 40, 46, 74, 81, 108, 121, 126, 152, 240, 242, 257
laser guide star, 196
laser interferometry, 212
late-time Integrated Sachs–Wolfe effect, 74
late-type galaxy, 94
Legendre polynomial functions, 67
lens equation, 221, 232
lepton asymmetry, 46
light echo, 106
light element abundances, 50
lightcone, 16
light-travel distance, 29
LIGO, 212
limb darkening, 238
Limber’s equation, 115
linear regime, 63
linear theory, 124
LINERs, 142
Liouville’s theorem, 223
LIRGs, 166
LISA, 212
lithium abundance, 52
Local Group, 109–111
local supercluster, 109
Lockman hole, 206
lookback time, 26, 30
Lorentz contraction, 285
Lorentz transformation, 17, 70, 223, 227, 285, 286
Lorentzian profile, 262
low surface brightness galaxies, 95, 97
LSB galaxies, 95, 97
LSST, 244, 246
luminosity density, 161
luminosity distance, 32, 105
luminosity function, 135, 148
luminous infrared galaxies, 166
Lyman α absorption line, 254
Lyman α blobs, 146
Lyman α cloud, 255, 259, 263, 269, 271
Lyman α emission line, 253
Lyman α forest, 253, 254, 269, 271, 273
Lyman β transition, 256
Lyman break galaxies, 133, 146
Lyman limit, 133, 274
Lyman limit absorption, 256
Lyman series, 133, 256
M31, 196
M33, 209
M57, 22
M81 Group, 109
M82, 163, 174
M83 Group, 109
M87, 197
M106, 197
MACHOs, 92, 237, 239
Madau diagram, 148, 149, 156, 208, 269
Madau–Lilly diagram, 148
Maffei 1 Group, 109
Magellanic Clouds, 108
magnetic monopole, 54
magnification, 224
magnification bias, 225, 225
magnification tensor, 226
Magorrian relation, 201, 202, 267
Malmquist bias, 174, 174
masers, 197
mass–energy equivalence, 285
mass function, 127
mass–metallicity relation, 157
massive compact halo objects, 237
mass-to-light ratio, 96
matter density, 19, 24, 47
matter overdensity, 126
matter power spectrum, 64
matter–antimatter asymmetry, 59
megamasers, 197
merger rate, 143
MERLIN, 173, 249, 251
Mészáros effect, 121
metallicity, 96
metric, 14
metric coefficients, 14
metric tensor, 14, 287
MGC-6-30-15 galaxy, 198
microlensing, 237
millisecond pulsars, 278
Minkowski spacetime, 287
mixed dark matter, 121
MOA, 239
modified Newtonian dynamics, 248
momentum conservation, 44
momentum four-vector, 289, 290
MOND, 248
monolithic collapse model, 103, 153, 177
monopole problem, 54, 55
Moon, 203
morphological K-correction, 154
morphology–density relation, 102
M-theory, 21
multiple images, 231
multiverse, 79
natural units, 58
SASSy, 250
scalar field, 57, 85, 88
scale factor, 16, 16, 19, 21, 31, 53, 87
scale height, 97
scale length, 97
scale-invariance, 63
scale-invariant potential fluctuations, 64
scale-invariant spectrum, 63, 69
Schechter function, 135, 146
Schmidt law, 97
Schwarzschild metric, 183, 192, 230
Schwarzschild radius, 123, 183, 195
SCUBA, 162
Sculptor Group, 109
SDSS, 201
second acoustic peak, 73
selection effects, 138
selection function, 136
self-shielding, 256
semi-analytic model, 128, 179
semi-analytic modelling, 162
Sérsic profile, 98
Seyfert 1 galaxies, 141, 187, 199
Seyfert 2 galaxies, 141
Sgr A*, 187
Shapiro delay, 231
shear, 241
σ0, 23
Silk damping, 73
singular isothermal sphere, 228, 229, 234, 246
SKA, 245, 246, 277, 279
SLACS, 250
Sloan Digital Sky Survey, 110, 133, 201
SLOAN survey, 249
slow-roll approximation, 60
Small Magellanic Cloud, 108, 197
SMGs, 164, 165, 176, 178, 209
smoothing, 126
SN 1987A, 106
softened isothermal sphere, 234
Solar System, 18
sound horizon, 72
source count model, 162
source counts, 12, 150
source plane, 235
space-like interval, 15
spacetime curvature, 220
spacetime interval, 14, 15
sparse sampling, 137
spatial flatness, 56
spatially homogeneous, 17
special relativity, 14, 285
spectral energy distribution, 162
spectral indices, 81
spherical coordinates, 16
spherical harmonics, 67
SPICA, 171, 172
spiral galaxies, 96, 102, 152
spiral galaxy collisions, 102, 152
spiral–spiral mergers, 102, 152
Spitzer Space Telescope, 165, 167
SQLS, 249
stacking analysis, 75
standard candle, 105, 212
standard rod, 73, 105
standard siren, 212
star formation, 103, 155, 179, 273
star formation history, 169
star formation rate, 96, 97, 175, 265
starburst, 103
starburst galaxies, 179, 206
star-forming galaxies, 154, 172, 176
Steady State models, 45
Stefan–Boltzmann law, 43
Strömgren spheres, 270
strong anthropic principle, 79
submillimetre galaxies, 163
Sunyaev–Zel’dovich effect, 101
super-Eddington accretion, 187
supermassive black hole, 140, 142, 155, 195, 202
Supernova Cosmology Project, 107
supernovae, 20, 21, 78, 170, 179
supersymmetry, 21
surface brightness, 11, 34, 95, 216, 222, 224, 253
surface brightness fluctuation, 106
surface mass density, 227
surface of last scattering, 40
synchrotron radiation, 170
S–Z effect, 101
tensor, 44, 226
Tensor–Vector–Scalar theory, 248
TeVeS, 248
thermal broadening, 263
thermal radiation, 169
thermal radiation from dust, 159
thermalization, 45
thermal S–Z, 101
thick disc, 97
thin lens approximation, 220
Thomson scattering, 40, 76, 102, 123, 186, 204, 257, 270
  cross section, 186
three-vector, 227
three-velocity, 288
time delay surface, 233
time dilation, 285
time-like interval, 15
tip of the red giant branch, 105
tired light universe, 20
Tolman surface brightness test, 95, 144
topology, 69, 70
torus, 141, 206
transfer function, 126
transverse proximity effect, 267
true vacuum, 58
Tully–Fisher relation, 96, 106, 149
turn-around time, 123
21 cm forest, 277, 279
21 cm Gunn–Peterson test, 277
21 cm transition, 277
Two Micron All-Sky Survey, 108
type 1 active galaxies, 194, 207
type 1 AGN, 141
type 2 active galaxies, 207
type 2 AGN, 141, 203
type Ia supernovae, 106, 245
U-band dropouts, 133
UDF, 206
ULIRGs, 166, 167, 178, 209
Ultra-Deep Field, 173
ultra-deep survey, 144
ultraluminous infrared galaxies, 166
ultraluminous X-ray sources, 207
uncertainty principle, 262
unified model, 141
unobscured AGN, 205
unsharp masking, 180
vector field, 57
velocity dispersion, 153, 208
velocity four-vector, 288
velocity width, 96
Very Large Array, 173
Very Long Baseline Array, 197
Virgo cluster, 22, 109–111
virial equilibrium, 125
virial theorem, 99, 104
virialized assemblage, 99
VLA, 173
VLBA, 197
voids, 74
Voigt profile, 263
volume-limited sample, 135
warm dark matter, 121
water maser, 106
wave four-vector, 290
wave number, 61, 62
wave number vector, 63
weak anthropic principle, 79
white dwarfs, 38, 106
white holes, 185
wide-field survey, 151
Wien regime, 41
WIMP, 93
WMAP measurements, 75, 78
WMAP satellite, 24, 42, 68, 81, 258
wrap-around scales, 70
wrap-around topology, 70
X-ray background, 203, 273
X-ray spectral paradox, 203
Xallarap, 239
XBONGs, 205
XMM-Newton, 173, 205
The luminosity function provides insight into galaxy evolution by describing how galaxy luminosities are distributed per unit comoving volume. It is fundamental for deriving the number density of galaxies and how that population evolves over time. Through techniques such as the 1/Vmax statistic and studies of the evolving rest-frame ultraviolet luminosity density, researchers can track how galaxy populations change. Peaks in the redshift histograms from surveys such as those of the Hubble Deep Fields reflect large-scale structures, further elucidating evolutionary patterns.
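The 1/Vmax statistic mentioned above can be sketched as follows: each galaxy contributes 1/Vmax to its luminosity bin, where Vmax is the comoving volume within which that galaxy would still pass the survey's flux limit. The function name, toy luminosities and volumes below are illustrative, not values from the text.

```python
import numpy as np

def vmax_luminosity_function(luminosities, v_max, bin_edges):
    """Estimate a luminosity function with the 1/Vmax statistic.

    Each galaxy adds 1/Vmax to its luminosity bin, so galaxies visible
    only in a small volume (near the flux limit) are up-weighted.
    Returns the number density per bin (Mpc^-3 per bin).
    """
    luminosities = np.asarray(luminosities)
    v_max = np.asarray(v_max)
    phi = np.zeros(len(bin_edges) - 1)
    for i in range(len(phi)):
        in_bin = (luminosities >= bin_edges[i]) & (luminosities < bin_edges[i + 1])
        phi[i] = np.sum(1.0 / v_max[in_bin])
    return phi

# Toy catalogue: three galaxies, two luminosity bins
phi = vmax_luminosity_function(
    luminosities=[1e9, 2e9, 5e10],   # solar luminosities (illustrative)
    v_max=[1e6, 2e6, 8e6],           # accessible comoving volumes, Mpc^3
    bin_edges=[1e9, 1e10, 1e11],
)
```

Dividing each bin by its width (or by the bin width in log luminosity) would convert these densities into the per-luminosity form usually fitted with a Schechter function.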
Redshift determines the epoch of matter–radiation equality because the energy densities of matter and radiation dilute differently as the Universe expands. The redshift of matter–radiation equality is given by 1 + z_eq = 23 800 Ω_m,0 h² (T_CMB,0/2.725 K)^−4, where Ω_m,0 is the normalized matter density, h is the dimensionless Hubble parameter and T_CMB,0 is the present-day CMB temperature. Since matter density scales as (1 + z)³ while radiation density scales as (1 + z)⁴, radiation dominated before this redshift and matter dominated after it.
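A quick numerical check of the formula above, using illustrative concordance-like parameter values (Ω_m,0 ≈ 0.27, h ≈ 0.71, which are assumptions here, not values quoted in the text):

```python
def z_matter_radiation_equality(omega_m0, h, t_cmb0=2.725):
    """Redshift of matter-radiation equality from
    1 + z_eq = 23 800 * Omega_m,0 * h^2 * (T_CMB,0 / 2.725 K)^-4."""
    return 23_800 * omega_m0 * h**2 * (t_cmb0 / 2.725) ** (-4) - 1

# Illustrative concordance-like values give z_eq of a few thousand
z_eq = z_matter_radiation_equality(0.27, 0.71)
```

The result, z_eq of order 3000, places matter–radiation equality well before recombination at z ≈ 1100.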
The Magorrian relation is significant because it reveals a strong correlation between the masses of supermassive black holes and the luminosity of their host galaxy's bulge. This implies a deep connection between black hole growth and galaxy formation. The tightness of the correlation suggests that the processes governing star formation and black hole accretion are linked, possibly through feedback mechanisms that regulate star formation and gas inflow.
Hubble's Law helps constrain the Universe's early state by establishing a relationship between the distances of galaxies and their redshifts, demonstrating the Universe's expansion. The law, expressed as v = H0 d, where v is the recession velocity, d is the distance and H0 is the Hubble constant, implies that the early Universe was more compact and dense, supporting the Big Bang theory and the expansion of space over time.
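As a minimal worked example of v = H0 d (taking H0 = 70 km/s/Mpc as an assumed round value, not a figure from the text):

```python
def recession_velocity(distance_mpc, h0=70.0):
    """Hubble's law v = H0 * d.

    distance_mpc : distance in Mpc
    h0           : Hubble constant in km/s/Mpc (70 assumed here)
    Returns the recession velocity in km/s.
    """
    return h0 * distance_mpc

# A galaxy 100 Mpc away recedes at 7000 km/s for H0 = 70 km/s/Mpc
v = recession_velocity(100.0)
```

At low redshift v ≈ cz, so this also fixes the redshift–distance scaling; at larger distances the full relativistic distance measures take over.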
The correlations between supermassive black holes and host galaxy properties, such as the Magorrian relation and the M_BH–σ relation, imply a co-evolutionary history that informs theories of galaxy evolution. These correlations suggest feedback mechanisms in which black hole growth regulates star formation in galaxies, shaping their development. The tightness of the correlations indicates that galaxy and black hole growth are interlinked processes, challenging models that treat them as independent.
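The M_BH–σ relation is a power law in the bulge velocity dispersion; the sketch below uses a rough literature-like normalization and slope (M_BH ≈ 1.3 × 10⁸ M_⊙ at σ = 200 km/s, slope 4), which are illustrative assumptions rather than the fit quoted in the text.

```python
def black_hole_mass_from_sigma(sigma_kms, m0=1.3e8, alpha=4.0):
    """Illustrative M_BH-sigma scaling: M_BH = m0 * (sigma / 200 km/s)^alpha.

    m0 and alpha are rough, assumed values for illustration only;
    returns a black hole mass in solar masses.
    """
    return m0 * (sigma_kms / 200.0) ** alpha

# A bulge with sigma = 200 km/s maps to ~1.3e8 solar masses here;
# halving sigma reduces the mass by a factor of 2^alpha = 16
m_bh = black_hole_mass_from_sigma(200.0)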
The cosmic microwave background's uniformity presents the horizon problem, a challenge to our understanding of early-Universe conditions. The CMB has an almost identical temperature across regions separated by large angles on the sky, yet in standard Big Bang cosmology those regions could never have been in causal contact, so there is no mechanism by which they could have reached the same temperature. Resolving this requires inflationary theory or some other mechanism to explain how now widely separated regions achieved thermal equilibrium.
Astronomers face challenges such as the faintness of high-redshift galaxies and ambiguous redshift identifications. They address these using photometric redshifts, which infer a redshift from a galaxy's observed colours. Surveys also use the Lyman break technique, exploiting the dropout of flux blueward of the redshifted Lyman limit, to select high-redshift candidates efficiently. These methods mitigate issues related to cosmic dust, telescope sensitivity and the evolving luminosity function.
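The Lyman break selection reduces, in its simplest form, to colour cuts: a strong break across the band straddling the redshifted Lyman limit, combined with a blue continuum redward of it. The specific colour thresholds below are toy values for illustration, not the published selection criteria of any survey.

```python
import numpy as np

def lyman_break_candidates(u_minus_g, g_minus_r, drop_cut=1.5, blue_cut=1.2):
    """Toy U-band dropout selection.

    A candidate needs a strong U-G break (flux missing blueward of the
    redshifted Lyman limit) AND a blue G-R continuum (to reject red,
    low-redshift interlopers). Cut values are illustrative only.
    """
    u_minus_g = np.asarray(u_minus_g)
    g_minus_r = np.asarray(g_minus_r)
    return (u_minus_g > drop_cut) & (g_minus_r < blue_cut)

# Toy catalogue: only the second object shows both a strong break
# and a blue continuum, so only it is selected
mask = lyman_break_candidates([0.3, 2.1, 1.8], [0.2, 0.4, 1.5])
```

The second condition is what separates genuine dropouts from dusty or evolved low-redshift galaxies that are simply red in every colour.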
Magnetic monopoles pose a problem in cosmology because Grand Unified Theories (GUTs) predict their formation in the early Universe when the GUT symmetry breaks. Given the small horizon size at that time, their expected abundance should have left them as a dominant contribution to the present-day energy density. Observationally, however, magnetic monopoles have never been detected, which contradicts these predictions and suggests that something is missing from this picture of the Universe's early conditions.
Extremely Red Objects (EROs) are important for understanding galaxy formation and evolution because their distinctive optical and infrared colours point to high masses and association with massive dark matter haloes. EROs include both dusty star-forming galaxies and old, passively evolving stellar populations. Their strong clustering traces significant structures and informs us about the complex processes of star formation and galaxy assembly at high redshift, offering insights into the chronological sequence of galaxy formation.
Lyman α clouds are important because they trace the Universe's underlying matter distribution. They are less affected than galaxies by the complex physics of galaxy formation, such as non-linear gravitational collapse and feedback, so they provide a cleaner probe of large-scale structure. By studying these clouds, researchers can test cosmological models and map the distribution of baryonic matter.