Coaxing Cosmic 21cm Fluctuations from the Polarized Sky using m-mode Analysis J. Richard Shaw,1,∗ Kris Sigurdson,2 Michael Sitwell,2 Albert Stebbins,3 and Ue-Li Pen1 1Canadian Institute for Theoretical Astrophysics, 60 St. George St., Toronto, ON M5S 3H8, Canada 2Department of Physics and Astronomy, University of British Columbia, Vancouver, BC V6T 1Z1, Canada 3Theoretical Astrophysics Group, Fermi National Accelerator Laboratory, Batavia, IL 60510, USA Inthispaperwecontinuetodevelopthem-modeformalism,atechniqueforefficientandoptimal analysisofwide-fieldtransitradiotelescopes,targetedat21cmcosmology. Weextendthisformalism to give an accurate treatment of the polarised sky, fully accounting for the effects of polarisation leakage and cross-polarisation. We use the geometry of the measured set of visibilities to project down to pure temperature modes on the sky, serving as a significant compression, and an effective first filter of polarised contaminants. As in our previous work, we use the m-mode formalism 4 with the Karhunen-Lo`eve transform to give a highly efficient method for foreground cleaning, and 1 demonstrate its success in cleaning realistic polarised skies observed with an instrument suffering 0 from substantial off axis polarisation leakage. We develop an optimal quadratic estimator in the 2 m-mode formalism, which can be efficiently calculated using a Monte-Carlo technique. This is used to assess the implications of foreground removal for power spectrum constraints where we n findthatourmethodcancleanforegroundswellbelowtheforegroundwedge, renderingonlyscales a J k(cid:107) < 0.02 hMpc−1 inaccessible. As this approach assumes perfect knowledge of the telescope, we perform a conservative test of how essential this is by simulating and analysing datasets with 9 deviationsaboutourassumedtelescope. Assumingnoothertechniquestomitigatebiasareapplied, we find we recover unbiased power spectra when the per-feed beam width to be measured to 0.1%, ] O and amplifier gains to be known to 1% within each minute. Finally, as an example application, we extend our forecasts to a wideband 400–800MHz cosmological observation and consider the C implications for probing dark energy, finding a pathfinder-scale medium-sized cylinder telescope . h improves the DETF Figure of Merit by around 70% over Planck and Stage II experiments alone. p - o I. INTRODUCTION with GMRT [5], HERA [6], LOFAR [7], MWA [8], MI- r t TEoR[9]andPAPER[10]targetingtheEpochofReion- as Recent years have seen a surge in excitement at the isationandBAOBAB[11],BAORadio[12],BINGO[13], [ promise of radio cosmology. By using low frequency ob- CHIME[14],EMBRACE/EMMA[15],Ooty[16],Parkes servations of the 21cm line we can survey the distribu- [17] and Tianlai [18] aiming at the low redshift intensity 1 tionofneutralhydrogenthroughoutlargevolumesofthe mapping era. v 5 Universe. Radiointerferometersprovideanefficient,and In this paper we will focus on the low redshift, in- 9 cost effective method for doing this. tensity mapping epoch, though most of the results and 0 This transformation of radio interferometers into sur- techniques we describe apply equally well at higher red- 2 vey instruments has been driven by recent technological shift. Observationsattheselowredshiftstargetthesame 1. advances, particularly in the cheap low-noise amplifiers science as spectroscopic galaxy redshift surveys such as 0 required for mobile phones, and the constant progress [19, 20]: measuring Baryon Acoustic Oscillations and 4 ofMoore’slawmakinglarge, highbandwidthcorrelators through them probing the expansion history of the Uni- 1 economical. By correlating a large number of low cost verse [21–23]. However, they are very complimentary, v: feeds in a compact area we can produce a telescope ide- withradioobservationsprobingalargervolumeathigher i ally suited for wide-field surveys. redshift, with a completely different set of systematics. X There are three main epochs we can observe: low red- To make effective use of this new generation of radio r shift (z < 4), where we observe the large scale emission interferometers, we must develop new methods of inter- a from un∼resolved galaxies, a technique termed intensity preting and analysing their data. Progress has acceler- mapping [1, 2]; the Epoch of Reionisation (z 6–10) ated in recent years with many developments [24–31]. wheretheneutralIGMiseatenawaybythefirst∼ionising In a previous paper [32] we developed a new techique sources [3]; andperhapseven theprimordial structure in fortheanalysisofdatafromtheseexperimentscalledthe the dark ages (z > 30), though observations at these m-mode formalism. This method departs from the usual very low frequenci∼es (ν < 50MHz) will be extremely interferometric analysis—making no flat-sky or small challenging [4]. These eras are of huge cosmological im- field approximations—at the expense of being limited to portance, a fact reflected in the large number of current transit telescopes for which it is an exact treatment. It and planned experiments targeting 21cm observations, also brings computational advantages by allowing us to breakthedataintouncorrelatedm-modes,makingitfea- sible to treat the full statistics of the data. This opens up the possibility of performing optimal map making, ∗ [email protected] foreground subtraction, and power spectrum estimation, 2 which would be extremely difficult otherwise. and demonstrate its effectiveness on simulated polarised Perhapstheforemostchallengefacing21cmcosmology skies. In Section VIII we construct an optimal power is the presence of bright astrophysical radio sources at spectrum estimator in the m-mode formalism, which we frequenciesbelow1.4GHzwhicharearoundsixordersof use to study the performance of our foreground filter magnitude brighter than the 21cm signal. This emission (Section IX). In Section X we use this estimator to show comes mainly from synchrotron radiation, which is spec- how instrumental uncertainties give rise to power spec- trally smooth, and in principle this allows it to be sepa- trum biases. Finally we forecast the performance of our ratedfromthe21cmasitisdescribedbyasmallnumber example telescope at measuring the expansion history of of modes [33] and these can simply be removed. The re- the Universe and constraining the nature of dark energy mainingmodes,whichhavesignificantspectralstructure (Section XI). are assumed to be free of contamination. Unfortunately, thispictureiscomplicatedbytherealitiesofradioobser- vation: II. POLARISED TRANSIT TELESCOPE ANALYSIS Frequency dependent beams lead to mixing of an- • gular structure into spectral structure which con- In this section we develop a fully polarised version of taminates the foreground clean modes [34]. This problem, known as mode mixing, means that look- them-modeformalism,anewmethodforanalysingtran- ing at only the frequency direction of our data is sitinterferometersthatwasfirstintroducedinaprevious paper [32]. That treatment encapsulates all the essential insufficient to separate these two signals. ideasbutavoidstheaddedcomplexityoftrackingthepo- Synchrotronemissionfromourgalaxyishighlypo- larisation, and is a useful introduction to the full treat- • larised, and though the totally intensity is spec- ment given here. Polarised descriptions of full-sky inter- trally smooth, Faraday rotation by the magnetic ferometry have been given elsewhere (notably [36, 37]), interstellarmediummeansthatthepolarisedemis- but here we develop the transit telescope limit. sion is not. Unfortunately, the complicated polari- Any transit telescope can be viewed as a collection of sationresponseofrealtelescopesirreversiblymixes feeds, fixed relative to the ground frame. Each feed, F i some fraction of the polarised sky, introducing sig- measuresacombinationoftheelectricfieldE (nˆ)coming a nificant frequency fluctuations [35]. As the emis- fromvariousdirectionsonthesky. Inordertoaccurately sioncomesfromarangeofFaradaydepths,wecan- treat the polarisation when the response varies over the not simply de-rotate the emission. sky, we need to be able to keep track of the contribution from each direction to the electric field at a point. In Fundamentallytherearestillthesamenumberoflarge order to do this we define ε as the electric field density foregroundmodes,modemixingonlymakesthemharder in a frequency interval dν and solid angle d2nˆ by to identify. In [32] we developed a foreground removal techniquebasedontheKarhunen-Lo`eve(KL)transform. dE =(µ c)1/2ε(nˆ,ν)d2nˆdν . (1) Thisusesthefullcovariancestatisticsofthecontaminat- 0 ing foregrounds to find an optimal separation from the With this definition the Poynting flux is conveniently 21cm signal, fully accounting for this mode mixing ef- written as fect. However, the technique presented there was lim- ited in two important ways: no attempt was made to 1 S = E H address the problem of polarised foregrounds; and it as- p µ c × 0 sumed that we have full knowledge of the properties of (cid:90) our instrument, including the full polarised response of = d2nˆd2nˆ(cid:48)dνdν(cid:48)nˆ(cid:10)ε(nˆ) ε(nˆ(cid:48))(cid:11) . (2) · theprimarybeam,andanyper-feedamplitudegainsand phaseshiftsintroducedinthereceiversystem. Inthispa- Radio emission from the sky is generally incoherent and per we continue to develop both the m-mode formalism so we can write the correlations of ε explicitly in terms and KL transform for foreground cleaning, with particu- of the Stokes parameters lar emphasis on investigating these two limitations. a fWulel tsrteaarttmbeyntexotfenpdolianrgistahteiomn-(mSeocdteiofnorImI)a,laisnmd dtoiscguivses (cid:10)εa(nˆ,ν)ε∗b(nˆ(cid:48),ν)(cid:11)= 2λk2Bδ(nˆ −nˆ(cid:48))δ(ν−ν(cid:48)) how the unpolarised approach of [32] is a limiting case (cid:104) (cid:105) (Section III)). The example telescope we use throughout × PaTbT(nˆ)+PaQbQ(nˆ)+PaUbU(nˆ)+PaVbV(nˆ) , (3) is described in Section IV, and its harmonic space sen- sitivity is examined in Section V. Next we take a care- wheretheindicesareoverbasisvectorstransversetothe ful look at the geometry of the measured m-modes (Sec- line of sight. As in the unpolarised case we are more in- tion VI), leading us to a technique which both efficiently terested in the brightness temperature on the sky, and compresses the data and effectively removes polarised so we have written Equation (3) to make that explicit contamination. We give an overview of the Karhunen- (thus Q, U, and V are polarisation brightness tempera- Lo`eve scheme for foreground removal in Section VII, tures). The polarisation tensors X are related to the Pab 3 Pauli matrices (in an orthonormal basis), specifically n . In this work we will assume that the noise from ij different antennas, and frequencies are uncorrelated (we (cid:18) (cid:19) (cid:18) (cid:19) I = 1 1 0 , Q = 1 1 0 , discuss the statistics in more detail in Appendix A). Pab 2 0 1 Pab 2 0 1 Wenormaliseourvisibilitiessothattheyarethecorre- − (cid:18) (cid:19) (cid:18) (cid:19) lated antenna temperature in the noiseless limit (in par- 1 0 1 1 0 i PaUb = 2 1 0 , PaVb = 2 i −0 . (4) ticulartheauto-correlationis theantennatemperature). The factor of two in the definition of the transfer func- The standard basis to use in spherical geometry are the tion ensures that for an unpolarised sky with uniform polar and azimuthal directions, θˆ and φˆ, as these allow brightness Tb, the measured autocorrelation Vii = Tb. Note that in Equation (7) and onwards the symbol V spin spherical harmonics to be used straightforwardly to denotes two different quantities, the visibility V (φ) and decompose the polarisation field. ij theStokesVskyfieldV(nˆ). Thedistinctionwillbeclear Any feed on the telescope measures a weighted com- from the context. bination of the electric field coming from each direction The above Equations (7) and (8) are completely ex- on the sky. In particular we need to keep track of the act. The general approach to interferometric analysis antenna’s sensitivity to the orientation of the incoming is to approximate the above to a 2D Fourier transform, electric field. We’ll write the measured signal at the i-th which is valid for small fields of view. For wide-field ob- feed, as F which is given by i servations we can attempt to relax this with techniques (cid:90) such as mosaicing [38] and w-projection [39] though this F (φ)= d2nˆAa(nˆ;φ)ε (nˆ)e2πinˆ·ui(φ) , (5) i i a quickly becomes complicated. In our case we are inter- estedinaspecificclassoftransit interferometersintended and is directly proportional to the voltage induced in forsurveys. However,astheseinstrumentsareextremely the circuit. Here, and onwards, we will implicitly sum wide-field, this approach is limiting. Instead we will try over the polarisation index a. The antenna reception a different route, restricting our domain to transit tele- pattern Aa is a vector quantity describing the electric scopes, but otherwise attempting to keep the analysis i field response in a given direction. The response A exact. ∝ leff, the effective antenna length (choosing them equal To continue, we decompose into spherical harmonics, wouldmakeFi betheantennavoltage). WenormaliseA as they are a natural way of representing fluctuations suchthatthenormalisedantennapowerpatternPn(nˆ)= on the sky. As polarisation is not a scalar field we must A(nˆ)2 ensuring the solid angle of the beam is expandQandU inspin-2harmonicsY(±2)(nˆ)(theStokes lm | | V field transforms as a scalar). This yields (cid:90) Ωi = d2nˆ|A(nˆ)|2 . (6) T(nˆ)=(cid:88)aT Y (nˆ), (9) lm lm lm In Equation (5) we have also included an exponential (cid:88) Q(nˆ)+iU(nˆ)= a(+2)Y(+2)(nˆ), (10) factor which gives the phase relative to an arbitrary ref- lm lm erence point. As both this, and the antenna orientation lm (cid:88) change with the Earth’s rotation relative to the sky, we Q(nˆ) iU(nˆ)= a(−2)Y(−2)(nˆ), (11) − lm lm write them as functions of φ, the rotation angle. lm The fundamental quantity in radio-interferometry is V(nˆ)=(cid:88)aV Y (nˆ). (12) the cross correlation between two feeds, the visibility lm lm V =(cid:10)F F∗(cid:11). Using Equations (3) and (5) we can write lm ij i j down exactly what a visibility measures, explicitly keep- The polarised beam transfer matrices also transform as ing track of the different sky polarisations to give spin fields, and so we decompose them in the same way, with (cid:90) (cid:104) Vij(φ)= BiTj(nˆ;φ)T(nˆ)+BiQj(nˆ;φ)Q(nˆ) BT(nˆ;φ)=(cid:88)BT (φ)Y∗ (nˆ), (13) ij ij,lm lm (cid:105) lm +BiUj(nˆ;φ)U(nˆ)+BiVj(nˆ;φ)V(nˆ) d2nˆ+nij(φ) (7) BQ(nˆ;φ) iBU(nˆ;φ)=(cid:88)B(+2) (φ)Y(+2)∗(nˆ), (14) ij − ij ij;lm lm where the beam transfer functions BX encode all the lm informationabouttheopticsandgeomiejtryoftheinstru- BiQj(nˆ;φ)+iBiUj(nˆ;φ)=(cid:88)Bi(j−;2lm) (φ)Yl(m−2)∗(nˆ), (15) ment. They are given by lm (cid:88) BV(nˆ;φ)= BV (φ)Y∗ (nˆ). (16) 2 ij ij,lm lm BiXj(nˆ;φ)= Ω Aai(nˆ;φ)Abj∗(nˆ;φ)PaXbe2πinˆ·uij(φ) , (8) lm ij Note that we have decomposed with the complex con- (cid:112) where Ω = Ω Ω . The measured visibilities are cor- jugates of the spin-harmonics. This allows us to use the ij i j rupted by noise, which we include as an additional term orthogonalityofthe(spin)sphericalharmonicstorewrite 4 the visibility equation Equation (7) as productsoftheindividualFouriermodes(witharemain- ing summation over the l index). This equation fully de- (cid:88)(cid:104) 1 scribes how the measured visibilities are related to the V (φ)= BT (φ)aT + B(+2) (φ)a(+2) ij ij;lm lm 2 ij;lm lm polarised sky that we are observing. lm Itisworththinkingaboutwhatwearemeasuring. The 1 (cid:105) + B(−2) (φ)a(−2)+BV (φ)aV +n (φ). (17) visibility we see is a complex time series, which roughly 2 ij;lm lm ij;lm lm ij corresponds to the signal from the sky modulated by a complex Fourier mode. In our case the time variable is Though this has completely transformed the problem φ, the Earth’s rotation. Taking the Fourier transform into harmonic space, it will be more convenient if we of a visibility splits the time series into right and left change into the conventional E and B mode decomposi- moving waves (positive and negative m respectively). A tion as they are real scalar fields [40]. This can be done correlated beam pointing south of the north pole only by making the standard substitutions producesmodesmovinginonedirection(asthebeamon a(+2) = (cid:0)aE +iaB (cid:1) , (18) the sky is a Fourier mode), however pointing the same lm − lm lm beam beyond the north pole (that is north of it as de- a(−2) = (cid:0)aE iaB (cid:1) (19) lm − lm− lm fined in the ground frame), produces the other modes as the Fourier mode on the sky moves in the opposite aswellasthecorrespondingchangesforthebeammatri- directionwithrespecttotheEarthrotation. Oneimpor- ces tant consequence of this is that if we use the freedom to B(+2) = (cid:0)BE iBB (cid:1) , (20) choosetheorderoftheourfeedpairssuchthatthebase- ij;lm − ij;lm− ij;lm linevectorspointtowardstheeast,positivem-modesare B(−2) = (cid:0)BE +iBB (cid:1) , (21) produced below the pole, and negative m-modes come ij;lm − ij;lm ij;lm from above. If the primary beam does not extend over leaving the visibility as the pole only positive m’s are produced, though a small amount of negative m’s are seen because of the effect of V (φ)=(cid:88)(cid:104)BT (φ)aT +BE (φ)aE the primary beam. ij ij;lm lm ij;lm lm Infact, whilstthepositiveandnegativem-modesmay lm (cid:105) be independent measurements they are still observations +BiBj;lm(φ)aBlm+BiVj;lm(φ)aVlm +nij(φ). (22) of the same sky — for a real field alm = a∗l,−m and thus both Vij and Vij∗ measure the same harmonics on the In the above the harmonic coefficients are now all the m −m sky. Itwillbeusefultochangeournotationtomakethis transformsofrealscalarfields(theBX arethecomplex ij;lm fact transparent. conjugates of the spherical harmonic coefficients). Let us separate out the positive and negative m parts Giventheperiodicityofthesysteminφ,Fouriertrans- by defining forming Equation (22) is an obvious next step BX,+ =BX n+ =n (25) (cid:90) dφ ij;lm ij;lm ij;m ij;m Vij;m = 2πVij(φ)e−imφ . (23) BiXj;,l−m =(−1)mBiXj;∗l,−m n−ij;m =n∗ij;−m (26) which is valid for m 0. Additionally to prevent double WecalltheseFouriercoefficients,m-modes,andtheywill countingthem=0m≥easurementweneedtosetBX− = ij;l0 becomethekeyquantityinouranalysis. Asthevisibility n− = 0. For brevity of notation, we will introduce a is a complex timestream, the positive and negative m’s ij;0 label α which indexes both the positive and negative m are independent measurements. partsofallincludedfeedpairsij,suchthatanyparticular As the φ dependence simply rotates the functions α specifies exactly the values of ij, (exactly how α is about the polar axis the transfer function is trivially ± packed is unimportant). This gives the final form of the BX (φ) = BX (φ = 0)eimφ. The integral over the ij;lm ij;lm m-mode visibility equation that we use as the basis of exponential factors generates the Kroenecker delta δmm(cid:48) this work, and removes the summation over m entirely, and we can write the m-modes as V =(cid:88)(cid:104)BT aT +BE aE α;m α;lm lm α;lm lm (cid:88)(cid:104) l V = BT aT +BE aE (cid:105) ij;m ij;lm lm ij;lm lm +BB aB +BV aV +n . (27) l α;lm lm α;lm lm α;m (cid:105) +BB aB +BV aV +n . (24) As in [32] we can write this equation in an explicit ij;lm lm ij;lm lm ij;m matrixformwhichwillallowustosimplifythenotation. Though slightly hidden, this a property of the convo- The beam transfer matrices above can be written in an lution theorem. For a transit telescope the visibility explicit matrix notation timestream is an azimuthal convolution of the beam and (cid:16) (cid:17) sky. This means its Fourier conjugate, the m-modes, are BXm =BαX;,mνδνν(cid:48) (28) (αν)(lν(cid:48)) 5 wheretherowindexlabelsallbaseline(α)andfrequency N combinations (ν), whereas the column index is over all W multipole(l)andfrequencies(ν(cid:48)). Similarlywecandefine vectors for the visibilities and harmonic coefficients (v ) =Vν (cid:0)aX(cid:1) =aXν . (29) m (αν) α;m m (lν) lm D To keep track of the different polarisation states we define the block matrix and vector aT B=BT BE BB BV , a= aaBE (30) a V such that v =Ba+n. (31) FIG.1. Aschematicofacylindertelescope,consistingoftwo cylinders aligned North-South on the ground. Each cylinder Thisistheessenceofthem-modeformalism: asimple, is of width W, and has Nfeeds regularly spaced a distance D apart. In this paper we will only consider cylinders which linear matrix relation that exactly describes the whole are touching, making the total width of the array 2W. The measurement process for a transit interferometer. As we cylinders are assumed to be long enough that there are no will discuss in Section V both the number of m-modes opticaldifferencesbetweenfeedsattheedgeandinthecentre and the dimensionality of the B matrices is bounded by of the array. the physical size of the instrument. This means that we caneasilyapplyallthestandardtoolsofstatisticalsignal processing without even remembering that we’re dealing Under these constraints there is only one relevant linear with an interferometer. In the following sections we do combination of the four XX, YY, XY and YX visibili- this with gusto. ties that is sensitive to the total intensity, the average of Despite this being an interferometry paper, the uv- the XX and YY visibilities plane has not been mentioned at all so far. Though it is prevalent in many interferometric applications, as both 1 V = (V +V ) (34) an extremely useful aid for physical understanding and u 2 XX YY for computational efficiency (by virtue of the FFT), the 1 (cid:90) 1 = d2nˆA2(nˆ)e2πinˆ·uT(nˆ)+ (n +n ) . m-mode formalism does not make use of it. Eschewing Ω 2 XX YY the uv-plane is part of its power, helping it to work triv- ially for wide-field analysis, and focusing us on only the Because of the properties of the polarisation matrices, measured degrees of freedom. However, it comes at the this is not sensitive to the different polarisation modes costofmakingitdifficulttohaveconcretephysicalinter- Q, U and V, whilst all the orthogonal combinations are pretations of the process. insensitive to the total intensity T. In this limit, the combination V is equivalent to the u unpolarisedformalismgivenin[32],ifwerelabelthenoise III. UNPOLARISED LIMIT terms such that n=(n +n )/2. Provided the noise XX YY terms are uncorrelated this reduces the power spectrum In[32]wedevelopedanunpolarisedformalismbecause downbyafactoroftwo—thatistheunpolarisedsystem it gives a simpler problem to analyse, both conceptually temperature is T =T /√2. sys,u sys,p and computationally. However, under certain assump- tions it is directly equivalent to the full polarised case. For a telescope with dual polarised antennas, let us IV. CYLINDER TELESCOPES suppose that we can engineer our telescope optics such that the field patterns of the two feeds (labelled X and Cylinder telescopes are interferometric arrays consist- Y) obey two constraints. First, that their normalised ing of one or more parabolic cylindrical reflectors. They power patterns are equal everywhere have a long history in radio astronomy, with well known A 2 = A 2 =A2 , (32) facilities like the Molongo Synthesis Telescope [41], and | X| | Y| the Ooty Radio Telescope [42]. Though advances in am- and second that their polarisation orientations are or- plifiertechnologymeanttheysteadilylostfavourtodish- thogonal all over the sky basedinterferometers, interestinthemhasrecentlybeen revived. Reasons are twofold: the development of cheap, A A =0. (33) room temperature, low noise amplifiers has dramatically X · Y 6 antenna beam. Rather than try to accurately solve for TABLE I. Parameters of the example cylinder telescope. the beam, we try to capture these two effects. We will Parameters Value break the model down into the product of two 1D func- Number of cylinders 2 tions: a function for the E-W direction, calculated by Cylinder width [m] 20 illuminatingthecylinderwiththedipolebeam,andsolv- Feeds per cylinder 64 (dual-pol) ing for the diffraction in the Fraunhofer limit; and a N-S Feed spacing [m] 0.3 function which is just the reflected feed amplitude in the T [K] 50 sys N-S direction. We will also model the polarisation direc- Bandwidth [MHz] 400–800 tionasbeingthesameasthatofanunfocuseddipole(in Channel width [MHz] 2.5 spherical co-ordinates, for a dipole along the polar axis, Number of Channels 160 (in groups of 40) Telescope Latitude 45◦ the polarisation direction is θˆ). First we model the beam amplitude for the unfocused dipole in the E-plane and H-plane as taking the form improved sensitivity; and 21cm intensity mapping has (cid:18) ln2 tan2θ (cid:19) providedanapplicationforwhichtheyareideallysuited. AD(θ; θW)=exp − 2 tan2θ , (35) W Intensity mapping requires a large collecting area in a compact region to achieve high brightness sensitivity, where θ is the full width at half-power of the beam. W whichcylinderscanprovidedcheaply. Additionallycylin- For a horizontal dipole mounted a distance λ/4 over a der telescopes are a cost effective way of surveying large conducting ground plane (see [46, section 4.7]), we can amounts of sky at high speed [43]. And whilst arrays exactly calculate the widths in the H-plane (θ =2π/3) H of dipoles provided a bigger instantaneous field of view, and E-plane (θ 0.675θ ). We use these value for our E H ≈ the large number of elements required at a fixed angular fiducial beam model, though we will vary them later in resolution makes the receiver and correlation hardware this paper. increasingly expensive. In the E-W direction we are solving the Frauhofer Each cylinder has a parabolic cross section such that diffraction problem of a cylinder feed illuminating an they focus only in one direction. In the layout we as- aperture of a finite width. This has the solution sume(seeFigure1), thisgivesalongandandthinbeam othnethNeorstkhy-,Seoxuttehnddiinrgecntieoanrlybufrtowmhhicohrizisonontloyhaorroizuonndin2 AF(θ;θW,W)∝(cid:90)−WW2 AD(2tan−1(2Wx);θW)e−ikxsinθdx degrees wide East-West. Feeds are spaced along the axis 2 of each cylinder — when correlated these provide resolu- (cid:90) 1 − ln2 u2 −iπWusinθ e tan2θW 1−u2 λ du (36) tion in the N-S direction. As the telescope operates as a ∝ −1 transit telescope this means that the entire visible sky is observed once per sidereal day. where we have used the fact that for a cylinder with Inthispaperweillustratethem-modeformalismusing an f-ratio of 1/4 a ray striking a distance x from the cylindercentrereflectsbyanangleθ =2tan−1(2x/W)= amediumsizedcylindertelescope,similartotheCHIME 2tan−1u where W is the cylinder width. Pathfinder. Table I lists the parameters of this example Putting these components together, our overall beam instrument. model can be written as the product of three functions. For the X feed A. Beam model AX(nˆ)=A (sin−1(nˆ xˆ); θ ,W) a F · E In the m-mode formalism knowledge of the primary AD(sin−1(nˆ yˆ); θH)pa(nˆ; xˆ) (37) × · beams of our instrument is crucial. In our model we as- and for the Y feed sume an arrangement such that at each location there are two perpendicular dipoles: the X feed where the dipole is aligned across the cylinder (pointing East), and AYa(nˆ)=AF(sin−1(nˆ ·xˆ); θH,W) Y feedwherethedipoleisalignedalongtheaxis(pointing A (sin−1(nˆ yˆ); θ )p (nˆ; yˆ) (38) D E a North). Inbothcasesthefeedshangbelowaconducting × · ground plane which stops the beam spilling above the where the vectors xˆ is a unit vector transverse to the cylinder (which is assumed to have an f-ratio of 1/4). cylinder, pointing East, and yˆ is along the cylinder, Solving for the beam on the sky for a feed placed in pointing North. The function pa gives the unit vector a parabolic cylinder is a complex problem (for one ap- polarisationdirection on the sky for adipole in direction proach see [44, 45]). Crudely the cylinder acts in two dˆ ways: in the parabolic direction it focuses the antenna beam to a diffraction limited beam on the sky; in the pˆ (nˆ;dˆ)= 1 (cid:2)dˆ (nˆ dˆ)nˆ(cid:3) . (39) orthogonal direction it acts like a mirror, inverting the a (cid:0)1 (nˆ dˆ)2(cid:1)1/2 − · a − · 7 where the part in square brackets is the coefficient in a 0.001 0.01 0.1 1 0.0003 0.003 0.03 0.3 spherical harmonic expansion of the plane wave. The 90 amplitude of a spherical harmonic function can be con- I I P I venientlywrittenintermsofintegralsofBesselfunctions → → 60 [47, section 5.4]. For large l 1 we find (cid:29) 2l+1(cid:90) ∞(cid:20) (cid:18)tsinθ(cid:19)(cid:21)2 30 Y (θ,φ)2 = J J (t)dt | lm | 4π m 2 2l+1 0 g NS/de 0 ≈ 2lπJm(lsinθ)2 (43) 30 − where we have used the approximation that lim J (x) = δ(x n). Combining this with n→∞ n −60 Equation (42) shows that−the magnitude of the spherical harmonic coefficients of a plane wave are 90 − −10 −5 EW/0deg 5 10 −10 −5 EW/0deg 5 10 |alm|2 =8πljl(2π|u|)2Jm(lsinθ)2 . (44) In particular this shows that the coefficients are effec- FIG. 2. The primary beam of the cylinder telescope forms tivelyboundedinatrianglebyl<2π u and m <lsinθ a long strip on the sky from North to South. This figure | | | | because of the exponential decay of the Bessel functions illustrates the transfer into an instrumental Stokes I (XX+ for large order. YY polarisations), from the total intensity on the sky (left Thehighestfrequencyfouriermodemeasuredonasky panel), and from the polarised sky only (right panel). The red contour in the top panel, marks the half power point of by an individual baseline comes from the maximum dis- the beam. tance between illuminated areas on the correlated an- tennas (this is the largest distance from the origin in the uv-plane). FollowingthroughfromEquation(44),weex- In Figure 2 we illustrate the on-sky beam for this our pecttherangeofmeasureablemodestobel<2πd /λ max example telescope. We plot the response of an ‘instru- and m < 2πd /λ, where d is the largest distance E-W max mental Stokes I’, constructed from the combination of assoicated with baselines and d the largest in the E- XX+YY polarisation,toStokesIandpolarisatedemis- E-W W direction. sion on the sky. The reponse to Stokes I on the sky is Let us consider our cylinder (see Figure 1). A feed given by on the cylinder effectively illuminates the whole width R =(cid:0)Aa Ab +AaAb (cid:1) I . (40) of the cylinder, but a very short distance along its axis. I→I X X Y Y Pab This makes the largest E-W distance of all feed pairs As a measure of the response to polarised radiation we N W,andthelargestN-SdistanceN D. Intermsof cyl feeds use sphericalharmonicscoefficientsonthesky,wearelimited R2 = (cid:88) (cid:2)(cid:0)Aa Ab +AaAb (cid:1) P(cid:3)2 . (41) to P→I X X Y Y Pab P∈{Q,U,V} 2π(cid:113) l< (N W)2+(N D)2 , (45) For a beam with no polarisation leakage, this response λ cyl feeds is zero. Though our example has no leakage on-axis, 2π m< N W . (46) Figure 2 clearly shows that there is significant pickup of λ cyl polarisation away from the beam centre. Though this result is correct for a cylinder telescope, foraninterferometerwithacompactfieldofview,point- ing away from the celestial equator, it needs modifying. V. SENSITIVITY IN HARMONIC SPACE As before the resolution in the E-W direction is deter- mined by the maximum distance d , however, if the The geometry of an interferometer on the ground de- E-W primary beam does not cross the equator this resolution terminesitsangularsensitivityonthesky,withthetotal corresponds to a larger fraction of the circle of constant sizeoftheopticalsystemdeterminingthesmallestscales declinationatthatpoint. Asthem-modecorrespondsto that can be measured. This limits the number of har- the Fourier mode in the azimuthal direction, this means monic modes on the sky that we are able to measure, thatthelimitonmisinfactm<2πcosδd /λ,where reducing Equation (27) to finite sums. E-W δ is the declination of the point in the primary beam The set of spherical harmonics that a given baseline is closest to the celestial equator. sensitive to can be found by expanding a plane wave on Tolookatthesensitivityofthetelescopeinmoredetail the sky wecancalculatetheFishermatrixofthea coefficients lm e2πinˆ·u =(cid:88)(cid:2)4πilj (2π u)Y∗ (uˆ)(cid:3)Y (nˆ) (42) (we discuss the interpretation of Fisher matrices in de- l | | lm lm tail in Section VIII). For Gaussian noise the likelihood lm 8 function for the a ’s is lm Sensitivityin(l,m)space—log10[F−1/2/µK] 8 (cid:18) (cid:19) 1 6 L(a;v)∝exp −2(v−Ba)†N−1(v−Ba) . (47) 400 4 7 2 6 From this we can calculate the Fisher matrix for a par- ticular m 300 1 5 (cid:28) ∂2 (cid:29) Fll(cid:48) =− ∂aT∂aT lnL l 200 4 (cid:104) l l(cid:48)(cid:105) 0.5 dth dge 3 Weexpectthatinge=neraBl†TthNis−m1BatTrixll(cid:48)w.illbesingular(a4n8d) 100 0.5 cylinderwi telescopee 12 hencewecannotfindthecovariancematrixofthea co- lm efficients by finding F−1. One obvious source of this is 0 0 0 100 200 300 400 500 thattheinterferometerdoesnotseethewholesky—any- m thingdeclinationlessthanδ = 45◦ isbelowthehorizon —andthismanifestsitselfasc−orrelatedcombinationsof FIG.3. Sensitivityofthearraytotemperature,derivedfrom a ’s that we cannot separate. Additionally the angular the inverse of the diagonal elements of the Fisher matrix lm (F )−1. The plot above shows the log of the sen- resolution falls off towards the horizon meaning that we (lm)(lm) 10 sitivity in units of µK. The sensitivity to the three remain- do not have uniform sensitivity across the sky. ingStokesparametersarelargelyidentical. Thedashedblack Thisconsequenceofthisisobviousfromsimplycount- linesmarkthemcorrespondingtotheseparationbetweenthe ing the degrees of freedom involved. For the example cylinders, and the total width of the cylinders. As we would telescope, there are 762 unique baselines each of which expect the sensitivity peaks in m at the zero separation, and gives a noisy complex measurement of the sky. However, the single cylinder separation. It then falls off rapidly at the we are sensitive up to lmax 400 for each polarisation, edge of the telescope. ∼ giving 4(l m) complex degrees of freedom on the max − sky,sotheremustbesomecombinationsaboutwhichwe have no information. V 3 As in general we cannot determine the covariance ma- trixofthea , wewillusetheFishermatrixitselftode- lm scribe the sensitivity. In Figure 3 we show the diagonal P elements of the Fisher matrix at each m for the exam- ple telescope, this gives an illustration of the amount of V informationwehaveaboutanysphericalharmonicmode. 2 T T’ VI. SVD PROJECTION V For21cmcosmologyweareonlyinterestedinderiving 1 Visibility Space Sky Subspace real properties of the unpolarised sky. As we shall see thisisusuallyofmuchlowerdimensionthanthespaceof measurementsmadebyaninterferometer,leavingalarge FIG. 4. The information about the sky does not spread number of redundant degrees of freedom which are just throughout thespace of visibilities butis containedin a sub- filledbytheinstrumentalnoise. Eliminatingthesewould space, a linear combination of the measured signals which allow us to significantly compress the data space, with- does not span the whole visibility space. Directions orthog- out losing useful information. In Figure 4 we illustrate onal to this subspace are excited only by the instrumental the geometry of the measured visibilities. The matrix B noise, and contain no information about the sky. The left wholly describes the mapping between the sky and the panelillustratesthegeometryofthefullvisibilityspace,show- measured visibilities, and understanding its structure is ing the sky subspace as a plane. In the right panel we show the key to isolating the important degrees of freedom. only the sky plane. Within this sky subspace, there are yet lowerdimensionalsubspacesthatthetotalintensity(labelled To start with let us concentrate on how to reduce to T) and polarised (P) signals get mapped to. However, they only the degrees of freedom on the sky (ignoring their need not be orthogonal, an effect we must take into account. polarisations for now). The matrix B tells us how a sub- One way of treating this is to project onto the space orthog- space of the spherical harmonics a map into a subspace onal to polarisation (labelled T(cid:48)), this eliminates polarised in visibility space v. This visibility subspace (shown by contamination at the expense of some sensitivity to total in- the plane in the Figure 4), is termed the image of B. tensity. This is discussed in detail in the text. The subspace of visibilities orthogonal to the image, is 9 called the cokernel. The cokernel has no mapping to the As a first attempt we might consider projecting onto sky, and so measuring this subspace yields no useful in- the image of B , the total intensity transfer matrix, T formation. By projecting our data onto the image, we rather than the full B. In Figure 4 this would corre- remove the cokernel and compress our data by retain- spond to projecting straight onto the T vector, rather ingonlytherelevantdegreesoffreedom. InFigure4this than just the plane. correspondstoprojectingontotheplane,eliminatingthe Unfortunately as illustrated in Figure 4, the image of perpendicular dimensions. thetotalintensityneednotbeorthogonaltothesubspace The number of retained degrees of freedom is given by containingthepolarisedimage. Thisisamanifestationof the dimensionality of the image — that is, the rank of polarisationleakage. Inthiscasebydoingthiswelosethe B — and cannot exceed the number of measured modes abilitytodifferentiatebetweenpolarisedandunpolarised on the sky. For a single frequency and m, the rank is signals from the sky, resulting in catastrophic leakage of guaranteed to be less than the total number of spherical polarised foregrounds. harmonics required to describe the polarised sky, that is A resolution to this problem is to project not onto 4(lmax−m). However, inthecaseofincompleteskycov- the image of BT but to perform another projection, this erage, we cannot measure all spherical harmonic modes time onto the polarisation cokernel. In Figure 4 this is independently, and this coupling means that the rank is equivalent to projecting onto the vector T(cid:48). By doing likely to be reduced to around 4fsky(lmax m), where this we ensure that there is no leakage of the polarised − fsky is the fraction of sky observed. skyintoourcompresseddata,attheexpenseofthrowing Thesenumbersdependonlyonthephysicalsizeofthe awayinformationaboutthetotalintensitysignalthatlies telescope, and not details of the feed distribution. For in the overlap between the two spaces. compactinterferometerswithlittleredundancy,thenum- To project out the polarised signal, we first construct ber of feed pairs rapidly exceeds the rank of the matrix, the polarisation transfer matrix and so projecting onto the image gives a large computa- tional saving. To find the image of B we can use the Singular Value Bpol =BE BB BV , (51) Decomposition (SVD). However, first we will pre-whiten the visibilities with respect to the instrumental noise. This transforms it to be uncorrelated with unit variance then we use this to isolate the polarisation cokernel in and can be done by multiplying them with N−21, a ma- the sky compressed basis by performing another SVD trix such that N−21(N−21)† = N−1. As N is positive definite this factorisation always exists and can be found U†IN−12Bpol =UpolΣpolV†pol . (52) by Cholesky factorisation or eigendecomposition. This leaves Equation (31) as As before we separate into the image and cokernel of this matrix, by dividing up U into U and U N−12v =N−21Ba+N−21n. (49) respectively. Asbeforethesepaproaltionontpoolt,hIetwosppaocl,eNs We then take the SVD of the whitened beam transfer isnotexact,butdonethroughanumericalthreshold. By matrix projecting our dataspace onto the cokernel we achieve this final compression. N−12B=UΣV† . (50) Overall we have applied three transformations to our data: The matrix U defines the image and cokernel, given by columnsofUcorrespondingtonon-zeroandzerosingular Whiten the instrumental noise by applying N−12 values respectively. In practice many singular values are • numerically small but not precisely zero, giving modes Project onto the sky subspace by using U† whichareeithernon-zerobecauseofnumericalprecision, • I orsimplycarryaverysmallbutnon-zeroamountofinfor- Project out the polarised sky using U† mationaboutthesky. Inthiscaseweseparatetheimage • pol,N and cokernel using a numerical threshold. We partition Combined these define a new basis in which to consider the columns of the matrix U into two matrices UI and our data. One which strives to preserve as much of the UN which give the image and cokernel respectively. To relevant information as possible, whilst vastly reducing compress our data we simply filter with the matrix UI the number of degrees of freedom we must consider. We to give v(cid:48) =U†N−21v. define our filtered visibility data as I While this filtering can yield a large compression, we should note that it preserves all the information about v¯ =U† U†N−21v . (53) pol,N I the sky. However, the cosmological signal we are in- terested in is purely unpolarised and requires only We can write a modified version of the measurement ∼ fsky(lmax m) modes per frequency and m to describe equation (31) which relates this to the sky signal − it. This suggests that we should be able to improve our compression by around another factor of four. v¯ =B¯ a+n¯ (54) 10 where we have defined SVDspectrum(log10Σ/Σmax) 0 B¯ =U† U†N−12B, (55) 140 pol,N I n¯ =U† U†N−21n. (56) 120 pol,N I 100 2 As the columns of UI and Upol,N are orthonormal, de − the instrumental noise still has the identity covariance mo 80 (cid:104)on¯fn¯th†e(cid:105)=neNw¯ =maIp.pIinngFimguarteri5xwB¯e,schloewarltyheillsuinstgrualtairngvatlhuaest SVD 60 -3 4 we are only sensitive to a small number of modes on the 40 -5 − sky. -2 20 In order to visualise our data we will want to make -1 mapsfromourfiltereddataset. ForGaussiandistributed 0 6 0 50 100 150 200 250 300 350 400 − instrumental noise it is straightforward to make maxi- m mum likelihood maps of the sky as discussed in [32]. As we have whitened the instrumental noise, our data has a FIG. 5. The singular values of B¯ for the 400MHz channel likelihood function after removal of the polarised modes. Large singular values (cid:18) (cid:19) represent modes on the sky that are well measured. We see (a;v˜) exp 1(cid:12)(cid:12)v¯ B¯a(cid:12)(cid:12)2 (57) that at each m there are less than 100 measured degrees of L ∝ −2 − freedom from the sky, with the spectrum dropping off very steeply beyond this. This is a significant saving, before com- and thus we can solve for the maximum-likelihood solu- pression there are twice as many modes as there are unique tion using the Moore-Penrose pseudo-inverse, giving our baselines, including positive and negative m. In our example best estimate of the spherical harmonics simply as thereare762uniquebaselines(withoutauto-correlations),so there would be ∼1500 modes. aˆ =B¯+v¯ . (58) As in [32], to make a full map of the sky, we simply use skywillvarywiththeobservedfrequency,drivenby this estimator on a per-m and per frequency basis and theopticaleffectsofusingafixedphysicalaperture collate the estimates. We can then perform an inverse or feed spacing. Even if the angular fluctuations Spherical Harmonic Transform to produce sky maps at on the sky were frequency independent as we scan each frequency. As we have projected onto the polarisa- through in frequency the beam structure changes, tion cokernel, the data does not contain any information andthisintroducesvariationsofourmeasurements about the polarised sky. Combined with the minimum with frequency. powerpropertyoftheMoore-Penrosepseudoinversethis meansthatthepolarisedsphericalharmonicswillbezero. Model uncertainties: Astrophysical foregrounds are poorly constrained at the small angular and frequency scales that will VII. FOREGROUND REMOVAL WITH THE be probed by upcoming 21cm intensity mapping KARHUNEN-LOE`VE TRANSFORM experiments. Whilst there exist theoretical and phenomenological models of this regime, a success- The foremost challenge for any 21cm intensity map- fulforegroundremovalmethodshouldberobustto ping experiment is separating the cosmological signal uncertainties in the foreground statistics. Though from astrophysical contaminants which are around 104– most effort has focused on the uncertainties in the 106 times larger. The primary sources are the diffuse two-point correlations, we must also ensure that synchrotron emission from our own galaxy and emission higher order moments do not impair our analysis. from extra-galactic point sources [48]. All significant foregrounds are expected to be spectrally smooth [33], Given these complications, we would prefer a fore- however, the 21cm signal decorrelates quickly as each ground removal method to be conservative, throwing frequency corresponds to a different spatial slice. This away potentially useful information in order to be ro- gives an opportunity to separate the two. bust to them. It is better to be cautiously correct than Conceptuallyforegroundremovalissimple—wejustre- precisely wrong. move the smooth frequency component from our obser- Acceptingthatwemayprefertoloseinformationabout vations. Unfortunatelytherealityisfarfromstraightfor- the21cmsignalinordertobeunbiasedbyresidualfore- ward. The large dynamic range between the amplitude grounds, we would still like to perform the best job we of the foregrounds and the 21cm signal makes several can, requiring that we are effects extremely problematic. Statistically Optimal: Mode mixing: Whateverspacetheforegroundsareremovedinwe In a real experiment the shape of the beam on the must be able to keep track of the statistics of both