Astronomy & Astrophysics manuscript no. preprint February 2, 2008 (DOI: will be inserted by hand later) A robust algorithm for sky background computation in CCD ⋆ images F. Patat European Southern Observatory,K.Schwarzschild Str.2, 85748 - Garching - Germany 3 0 0 Received October 24, 2002; accepted Januar 24, 2003 2 n Abstract. In this paper we present a non-interactive algorithm to estimate a representative value for the sky a backgroundonCCDimages.Themethodwehavedevisedusesthemodeasarobustestimatorofthebackground J brightness in sub-windows distributed across theinput frame. The presenceof contaminating objects is detected 7 throughthestudyofthelocalintensitydistributionfunctionandtheperturbedareasarerejectedusingastatistical 2 criterion which was derived from numerical simulations. The technique has been extensively tested on a large amount of images and it is suitable for fully automatic processing of large data volumes. The implementation 1 we discuss here has been optimized for the ESO-FORS1instrument, but it can be easily generalized to all CCD v imagers with a sufficiently large field of view. The algorithm has been successfully used for the UBVRI ESO- 4 Paranal night sky brightness survey (Patat 2003). 3 5 Key words.techniques:photometric 1 0 3 0 1. Introduction changeablecollimators,whichgiveanimagingfieldofview h/ of 6.′8 6.′8 (standard resolution, SR) and 3.′4 3.′4 (high The study of the night sky brightness is fundamental to × × p resolution, HR) respectively. - monitor the quality of very dark astronomical sites and o sets the stage for the study of a whole series of effects FORS1 is offered during dark time both in Visitor r Mode (VM) and Service Mode (SM). Imaging data ob- t whichtakeplaceintheupperlayersofEarth’satmosphere s tained during SMruns are bias andflat-fieldcorrectedby (Leinert et al. 1998 and references therein). a the pipeline and undergo a series of quality checks before : With the exception of a very few cases, all night sky v they are finally distributed to the users. Due to the high i brightness surveys are executed with photoelectric de- X number of imaging frames produced by this instrument vices coupled to small telescopes and usually span a lim- (morethan4500fromApril2000to September 2001)and r ited number of nights (Benn & Ellison 1998), distributed a thevarietyofscientificcaseswhichdriveit,itisclearthat across several years. For these reasons, the data are usu- a complete and systematic study of the night sky bright- ally rather scanty and suffer from the inclusion of bright ness can be performed only by means of a robust and stars (V 13; see for example Walker 1988). ≥ automatic procedure, capable of identifying and rejecting Nowadays, with the availability of large telescopes all the cases which are not suitable for sky background equipped with CCD imagers and the possibility of re- measurements (e.g. large galaxies, crowded stellar fields ducing the data via dedicated pipelines, the paradigm is and so on). changing and different approaches become feasible. In this work we present and discuss the algorithm we In this spirit, during the beginning of year 2000 we have specifically designed for this purpose, while the re- have started a project to monitor the UBVRI night sky sults of the night sky brightness survey are reported in brightness at ESO-Paranal Observatory (Chile) as part Patat (2003). of the quality control (QC) procedures implemented for the FOcal Reducer/low dispersion Spectrograph (here- The paper is organizedasfollows.In Sec.2 we discuss afterFORS1).Thismulti-modeopticalinstrument,which the problems connected with the sky background mea- is mounted at the Cassegrainfocus of ESO-Antu/Melipal surement in digital images, while Sec. 3 deals with the 8.2m telescopes (Szeifert 2002), has two remotely ex- technique we have adopted to compute the mode of the image intensity distribution. The algorithm we have de- Send offprint requests to: F. Patat; e-mail: [email protected] vised to identify the presence of contaminating objects in ⋆ Partially based on observations collected at ESO-Paranal. the field and the tests we have performed on real FORS1 2 FerdinandoPatat: Skybackground computation data arepresentedin Sec.4and5 respectively.Finally,in Sec. 6 we summarize our conclusions. 2. Problems in estimating the sky background Widely available programs for object detection and pho- tometry like DAOPHOT (Stetson 1987) and Sextractor (Bertin & Arnouts 1996) use the mode of the image in- tensity distribution to estimate the local background or to construct the background map of a two-dimensional frame.Infact,besidesitsrobustness,themodeisastatis- tically powerful estimator, being the most probable value of the background brightness in a given region of the im- age. The use of the mode as the maximum-likelihood es- timator for the sky background implicitly assumes that the image has some regions which are not seriously con- taminated by any astronomical object. Of course distant and faint galaxies (and stars) will always be present, but we will consider them as part of the global background throughoutthispaper.Therefore,whenwetalkaboutcon- Fig.1. Illustration of the effect on the intensity distri- taminatingobjects,wemeanobjectswhicharedetectable bution when a Poissonian noise is added to a stellar field in the image, well above the background noise. There is withaconstantbackgroundI .Theverticalscaleinthe also another assumption that one has to make, i.e. that sky lower plots is arbitrary, while the intensity I is counted it is really improbable that the images to be analysed are from the input I value and normalised to σ = I . filled by a single extended object with a roughly constant sky p sky The solidthincurveinthe lowerleftplotdepicts the con- surface brightness. In fact this is the only case where one p tributiontothefinalintensitydistributionbythevalueof wouldhaveanoverallbackgroundincreasewithouthaving F(I) at I =I +σ . anysecondaryeffectsontheintensitydistributionfunction sky p andinthatcasetheprocesswouldleadtoanoverestimate of the sky background. In the case of FORS1 frames, es- pecially with the commonly used SR collimator field of 6.′ 8 6.′8, this assumption seems reasonable. × Due to the field of view of FORS1 and the wide va- Of course there is always the possibility that none of riety of scientific projects which are carried out by this the sub-windows is clean enough to allow for a reliable multi-modeinstrument,oneexpectstodealwithverydif- measurement. A few examples are galactic stellar fields ferentastronomicalobjects,whichwouldperturbinadif- with heavily saturated stars, outer parts of globular clus- ferent way the local sky background. Possible examples ters,nearbyinteractinggalaxiesandclose-bycomets,just are comets, clusters of galaxies, outskirts of big spirals, to cite a few real examples we encountered during this large ellipticals,diffuse nebulosities, crowdedstellar fields analysis. andsoon.Inallthesecasestherestillmightexistpartsof the image which are suitable for a sky background mea- Thedifficultiesonefacesinestimatingthebackground surement. For this reason it is clear that the analysis has are easily understood from the following considerations. to be performed using sub-windows distributed on a grid If the images to be analysed were noiseless and the sky acrosstheinputimage.Thechoiceofthesub-windowsize background constant and equal to I , the solution of sky hastobedoneinsuchawaythatthisisneithertoosmall, theproblemwouldbetrivial.Infact,inthiscase,thecor- becauseinthatcasethefractionofuncontaminatedpixels responding intensity distribution function F(I) would be might become statistically insignificant, nor too big, oth- 0 for I <I , would then suddenly peak at I =I and sky sky erwisetheprobabilityofincludinglargediffuseobjectsbe- finally show an extended tail, whose shape depends on comeslarge.Afterrunningsometestswehaveseenthata the number and intensity of contaminating objects. This 300 300px sub-window (whichcorrespondsto 1 arcmin2 is illustrated in the left partof Fig. 1, where we have pre- × for the SRcollimator)givessatisfactoryresults.Since the sented an artificial stellar field and its intensity distribu- guide probe of FORS1 is sometimes vignetting the outer tion. As usual, Nature behaves in a more subtle way and parts of the images, we have decided to use the central due to the photon statistics (and marginally to detector 1800 1800pxregionofthedetectoronly.Thisallowsone read-out)the imageswearegoingto dealwith arealways × toanalysetheimagesina6 6sub-windowsgridincluding affected by noise. From a mathematical point of view it 9 104 px each. × is very easy to predict the noise effect on the intensity × FerdinandoPatat: Skybackground computation 3 distribution becomes more and more skewed, with a tail appearing at the highest intensities. Clearly effect A and B will respectively lead to overestimates of the sky back- ground I and the noise, which in the case of an un- sky contaminated field is expected to be σ = I , accord- p sky ing to the Poissonian statistics. One possible solution to p this problem is achieved by reconstructing the intensity distribution one would have if no contamination effects were present. A good example of such a solution is repre- sentedbytheasymmetricclippingalgorithmdevelopedby Ratnatunga & Newell (1984), which computes the back- ground distribution via an iterative clipping of the right wing of the perturbed intensity distribution. Due to the high number of available FORS1 frames and the purpose of this work, we can afford a different approach. Instead of attempting to reconstruct the un- derlyingskybackgroundintensitydistributionincontam- inated regions,we rather try to identify these regionsand exclude them from all further calculations. For this pur- posewehavedevisedasimpleandrobusttestthatcanbe used to estimate the degree of contaminationin an image Fig.2. Intensitydistributions forthreesimulatedimages and operated in an automatic way on large amounts of obtainedinjecting 10,200and500stars(fromtopto bot- data. tom) on a background I =5000 e− with a Poissonian sky The first step in the application of our method is the noise distribution. The solid curves are Gaussian profiles mode estimate, which we describe in the next section. centred on the distribution mode and having the σ ex- pected for a Poissonian noise (see text for more details). distribution. In fact, if n(I) is the noise distribution, the observed intensity distribution is simply given by: 3. Mode estimate I Themode I ofadistributionf(I)isdefinedasthemost f(I)= F(I z)n(I,z)dz (1) h i − probable value of I (see for example Lupton 1993) or, in Z−∞ otherwords,the valueofI wheref(I)takesitsmaximum i.e. the convolution F n of the signal F(I) with a ⊗ value. For moderately skewed distributions the mode can variableresponsefunctionn(I,z),whichcanbeexpressed as n(I,z) = 1/√2π z exp[−21(z2/(I −z)], provided that bmeeaanpp(rKoxeinmdaatllely&cSomtupaurtte1d9a9s7)h.IUin≃fo3rt×unmateedlyi,anth−e 2ob×- the read-out noise is negligible. From these simple con- servedintensitydistributionsinFORS1imagesoftenshow siderations it is clear that what really matters for the very extended tails, due to several effects like saturated degradation of the resulting peak sharpness and the con- stars,cosmic-rayevents and so on.While this tailusually sequentuncertaintyintheestimateofitsoriginalposition does not affect the mode, it does perturb the mean and istheshapeoftheinputdistributionfunctionF(I)within theaboveformulawouldleadtowrongresults.Apossible 5 I fromits peak, while the behaviour at higher in- sky ∼ solution is given by the application of this approximation tensities is unrelevant (see Fig. 1, right plots). With re- p after an iterative clipping of the distribution around its spect to these effects, it is quite instructive to look at median, as it is done for example in Sextractor (Bertin & the results of some numerical simulations. In Fig. 2 we Arnouts1996).Inpracticeoneisforcedtousethismethod have plotted the relative intensity distributions of three whenonehastocomputethebackgroundmapofanimage 300 300px artificial frames with a fixed value of the sky × usingsmallsub-windows.Infactinthatcasethestatistics background (I =5000 electrons) on which we have ran- sky would be too poor to compute the mode in a reliable way domlyinjected10,200and500starsrespectively.Wehave directly using the distribution shape. usedaMoffatprofileforthestarswithβ=4andaFWHM of 5.3 px. In the case of FORS1 and SR collimator this This is not the case here, since we are more interested would correspond to a field of 1 arcmin2 and a seeing of inanaveragevalueratherthaninamapofthebackground 1′′. The peak intensity of the stars was randomly gener- withinthesameimage.Forthisreasonwecanperformour ated in the range 0-104 electrons. analysis in muchlargersub-windows so that we canbuild Basicallythreeeffectsarevisibleasthenumberofstars up a very good signal-to-noise distribution function. This grows: A) the mode I of the distribution steadily in- allows us to compute the mode just using its definition, h i creases; B) the distribution core width increases; C) the without any loss of generality and in a very robust way. 4 FerdinandoPatat: Skybackground computation 3.1. Introducing the Optimal Binning Technique (OBT) Since we are dealing with discrete distribution functions, finding the mode implies that the data have to be binned and the maximum of the distribution has to be found. To reduce the noise due to the finite (and possibly large)bin size∆I,wehavechosentorefinethedirectmodeestimate using a quadratic interpolation on the modal bin and the twoadjacentbins(seeforinstanceSpiegel1988).IfI and m f = f(I ) are the values of the intensity and intensity m m distribution in the modal bin and (I ,f ),(I ,f ) are the r r l l correspondingvaluesinthetwoadjacentbinsrespectively, then the mode I is estimated as the value of I where h i the interpolatingparabolareachesits maximum.One can show that this is given by the following expression: f f ∆I l r I =I + − (2) m h i f 2f +f 2 l m r − This correctionhas the effect of moving the estimated mode I within the whole bin ∆I according to the local h i shape ofthe distribution function. The modalbin bound- Fig.3. Upperpanel:Signal-to-noiseratioSNR reached m aries are reached when f = f or f = f (provided m l m r in the modal bin for the optimal value of ∆I as a func- that f =f ). Numerical simulations haveshownthat the l 6 r tion of npix. The solid line represents a linear fit to the parabolicinterpolationisveryeffectiveinreducingtheer- datawithn 90.Middlepanel:normalisedoptimalbin pix roronthemodeestimate(seebelow),atleastforthekind ≥ width. The solid line traces Eq. 8. Lower panel: optimal of distributions considered here. For more general cases, values of the RMS error on the mode. The dotted line is the interpolation option will improve the mode accuracy a best fit to the data (ǫ n−0.75) while the dashed onlyiftheasymmetryofthedistributionaroundthemode RMS ∝ pix line traces the law ǫ n−1. In all panels each point perturbs (fl fr)significantlyless thanthe averagevalue RMS ∝ pix − is the result of 5000 simulations. of the correction corresponding to an offset of the central bin. The choice of bin size is critical. If the bin ∆I is too This implies that if f(I) is the number of pixels falling in small the resulting histogram will be too noisy to give a a given bin, the RMS uncertainty on f(I) is simply given robust estimate of I . On the other hand, if ∆I is too by f(I) and hence we can define a signal-to-noise ratio h i large, then the mode estimate will be affected by a large SNR = f(I). In particular, we can introduce SNR , p m uncertainty due to the small signal-to-noise in the two i.e.thesignal-to-noiseratioreachedinthe modalbin.Iff p outer bins. For this reason we need to find an optimal is the fraction of pixels falling in the modal bin, we have value for ∆I which minimises the error on the mode. obviously Forthispurposewehaverunaseriesofnumericalsim- f =(SNR /n )2 (3) ulations of uncontaminated windows with a known sky m pix levelontopofwhichwehaveaddedaPoissoniannoiseand Having this in mind, we can now have a look at the atypicalread-outnoise(6electrons).Foreachofthe sim- behaviourofSNR asafunctionofn whentheoptimal ulatedframesthemodewasestimatedusingtheparabolic m pix bin size is used to estimate the mode in our simulations. interpolationdescribedabove.This hasbeen donefordif- This is portrayed in the upper panel of Fig. 3, where one ferent values of the bin size ∆I and the number of pixels caneasilyseethatSNR dependsalmostlinearlyonn . N = n2 included in each testing window. For each pair m pix pix Intherange90 n 300,thebestfittothesimulated (∆I,npix)wehaveperformed5000simulations.Whatone data (see Fig. 3≤, upppiexr≤panel) gives sees is that for a given value of n , the RMS deviation pix of the estimated mode from the known input value Isky SNRm 6.2+0.48npix (4) decreases as ∆I increases, reaches a minimum and then ≃ grows again, as expected from the above considerations For large values of n this relation can be approx- pix (see also Fig. 4). For this reason it makes sense to adopt imated as SNR n /2, which means that optimal m pix ≈ the value of ∆I where the RMS error ǫ reaches its results are obtained when 25% of the pixels fall in the RMS ∼ minimum as the optimum bin size for the mode estimate. modal bin. Beforeweproceedwiththepresentationoftheresults, Having this result, it is easy to compute the optimal we wantto discuss a few morepoints. The pixelcountsin fraction f of pixels by means of Eq. 3 and Eq. 4 (or its the intensity distribution bins obey Poissonian statistics. approximate expression). Then, assuming the underlying FerdinandoPatat: Skybackground computation 5 distribution f(I) to be a Gaussian with σ = σ , one can In the previous section we have seen that, for a suf- p derive the corresponding bin size in terms of σ as ficiently large value of n , the optimal bin size can be p pix expressed as: ∆I =2k(f)σ (5) p SNR2 ∆I =2.6 m I (9) wherekisimplicitlydefinedbythefollowingequation: sky N p 1 hIi+kσp (I I )2 Of course this requires to know the value of Isky, or f = √2πσ exp − −2σh2i dI (6) at least to have a rough estimate of it. For this purpose, p ZhIi−kσp (cid:20) p (cid:21) one can approximate it with the median of the distribu- This equation can be solved numerically and the so- tion Imed, and SNRm can be computed using Eq. 4. For lution fitted by a low order polynomial. For f 0.33 npix=300 we have SNRm 150. ≤ ≃ (n >60), the solution can be approximated rather ac- Once this guess value for the bin is computed, the pix curately by the following expression: histogram of the intensity distribution between I and min I is built and the mode is found. Since in the case of max k(f) 1.28f (7) skewed distributions the median is only a rough approxi- ≃ mation of the mode and the distribution is definitely not Finally,usingEqs.5and7onecaneasilycomputethe Gaussian, the actual signal-to-noise ratio SNR′ in the optimalbin width. Forn 90 we canapproximatethe m pix ≥ modalbinismeasuredandfromthisvalueanewbinisre- expression for the optimal bin as follows: computedas∆I′ =∆I (SNR /SNR′ )2.Thehistogram m m ∆I 6.2 2 is recalculated and a new mode value is estimated. This 2.6 0.48+ (8) procedure has a clear effect: if the distribution is skewed σ ≃ n p (cid:18) pix(cid:19) (and hence less peaked), then a larger bin size is required while, for very large values of npix, the ratio between to achieve the same SNRm. theoptimalbinsize∆I andσ approachesasymptotically The method has been tested using numerical simu- p the value 0.6. The result is shown in the central panel of lations of uncontaminated 300 300 px windows with a × Fig.3,wherewehavenormalisedthe optimalbintoσp to known sky level Isky on top of which we have added a remove the dependency on the sky level. The comparison Poissoniannoiseandatypicalread-outnoise(6electrons). betweenthepredictedvalues(solidline)andtheresultsof Foreachsimulatedfield,wehavethencomputedthemode the simulations are in good agreement across all the npix I andthedeviationfromtheinputskybackgroundvalue h i explored range. defined as Finally,theRMSdeviationfromtheinputvalueinour I I sky simulations clearlydecreasesfor increasingvalues ofn , ǫ= h i− (10) pix I as shown in the lower panel of Fig. 3. For comparison we sky have overplotted the n−1 law (dashed line) expected if InFig.4wepresentthe resultswe haveobtainedfora pix the error would scale proportionally to the overall signal- very low sky background (100 electrons). The RMS devi- to-noise ratio. It is interesting to note that the RMS er- ation ǫ (solid line) reaches its minimum value (0.2%) RMS ror in the simulations decreases at a slower rate, approxi- for SNR =150, as expected, and so does the maximum m mately as n−0.75 (dotted line). We will come back to this error ǫ (dotted line). In the same figure we have also pix max pointinthenextsection,whendiscussingtheefficiencyof plotted the RMS error one obtains without using the re- the method in reducing the error; for the time being, we finement given by Eq. 2 (ǫ′ , dashed line). This shows RMS onlynoticethatwithnpix=20theexpectedRMSerrorfor thatthe twomethodsgiveoptimalresultsattwodifferent Isky=100electronsisabout1%,whichreducesto0.2%for bin sizes. In such conditions the interpolationreduces the npix=300. RMSerrorbyaboutafactorof6.Forcomparisonwehave also plotted the bin half width (long-dashed line), which canbeassumedasanestimatorofthemaximumdeviation 3.2. Application of OBT to the real case when the parabolic interpolation is not used. As we have discussed in Sec. 2, the real data show a va- On the basis of these results one would tend to adopt riety of contaminating effects, which tend to skew the in- SNR =150,butthere isacaveat thatwehavetokeepin m tensity distribution and affect in different ways the mode mind.Theseresultsarevalidforuncontaminateddistribu- estimate. In that sectionwe have also mentionedthat the tions,wherethe signal-to-noiseratioSNR inthe modal m optimal window size for the mode estimate is n =300. bin is reached using the bin size given by Eq. 9. Since pix Both simulations and tests with real data show that with realdistributionsarecontaminated,alargerbinsizeisre- such a number of pixels we can achieve RMS errors on quiredinordertoachievethesameSNR andthisinturn m the mode smaller than 1% for I 100 electrons, which generates larger errors. The simulations and extensive sky ≥ we believe is a sufficient accuracy for our purposes. For tests with real data have shown that a good compromise this reason, from now on we will concentrate on the spe- is reached using SNR =120 which, for mildly contam- m cific case of n =300 and discuss several aspects of the inated distributions gives maximum errors smaller than pix method application. 1% for a backgroundlevel of 100 electrons (see Tab. 1). 6 FerdinandoPatat: Skybackground computation fitting Eq. 11 to the simulated data, we get N 1970. eff ∼ Combining Eq. 9 and Eq. 11 one gets the following rela- tionbetweenthebinsize∆I andthe expectedRMSerror σ on the mode: hIi N ∆I σ = (12) hIi N 2.6SNR2 eff m p Substituting the proper values in this expression, we obtain simply σ 0.05 ∆I, which means that for hIi ≃ SNR =120 and moderately contaminated distributions, m theexpectedRMSerroronthemodeisoftheorderof5% of the bin size. This is clearly shown in the lower panel of Fig. 5, where we have plotted the measured error as a function of the relativebin size derivedfromthe same set of simulations previously described. The match between the expected RMS (dotted line) and the observed RMS (dashed line) is fairly good. In both panels of Fig. 5 we have plotted the maxi- mum errors encountered during the simulations. As one cansee,theyarewellconfinedwithinthe5σ level(dashed line), which was computed using the RMS error given Fig.4. EstimatedRMSmodeerrors(solidline)fromsim- by the simulations. More precisely, maximum errors lie ulated 300 300 px un-contaminated windows as a func- × with a good approximation on the 4σ level. From Fig. 5 tion of SNR . The maximum error is traced with a dot- m we can conclude that the expected RMS errors on the ted line, while the dashed line indicates the RMS error mode estimate are below 0.3% at all sky backgroundlev- when the parabolic interpolation is not used. Finally, the els larger than 100 electrons, while maximum errors are long-dashed line indicates the percentage bin half width. always smaller than 1%. The values for each SNR level are the result of 5000 m Theintroductionofartificialstarshastheeffectofsys- simulations. tematically increasing the mode of the distribution with respecttotherealvalueoftheskybackground(seeSec.2). To see how the error behaves as a function of the sky For this reason, in the general case of contaminated dis- background,wehaveperformedanothersetofsimilarsim- tributions, we can talk about two different errors. While ulations,wherewehaveadoptedSNR =120andwehave m theformerisarandommeasurementerrorintrinsictothe variedthe input skylevelfrom102 to 104 electrons.A to- adopted method and to the quality of the data,the latter talof5000simulationsperlevelwereexecuted.Theresults is a systematic error which depends on the intensity dis- are presented in Fig. 5 and summarized in Tab. 1. tributionofthecontaminatingobjects.Aswehavesaidin In the upper panel of Fig. 5 we have plotted the per- Sec.2,wewanttouseonlythecaseswherethesystematic centage errors ǫ as a function of the backgroundlevel. As error is smaller than some threshold value. The approach expected, the percentage RMS error, traced with a solid to this problem is describedin Sec. 5, while here we focus line, decreases for increasing values of I , following very only on the randomerrorsrelatedto the way the mode is h i well the usual law for the standard error of the average: estimated. Toevaluatetheeffectofcontaminationonthemethod I error, we have performed several sets of simulations in- σ = h i (11) hIi sNeff jecting a fixed number of stars N∗ with random positions on a sky background of intensity I . The stars intensi- sky with the difference that Neff is smaller than the total ties I were uniformly generated in the range 0 < I ∗ ∗ number N of pixels. For this reason, Neff can be consid- (Isat Isky), where Isat is the detector’s numerical satu≤- eredastheeffectivenumberofdatapointsdrawnfromthe ration−level (I 106,000 electrons for the FORS1 de- sat originalsetonewouldhavetousetogetthesameRMSer- tector, high gain)∼. Finally, a Poissonian noise was added ror when adopting the averageas backgroundestimator1. to the artificial 300 300 px frames to simulate the pho- As we have seen at the end of Sec. 3.1, the RMS error tonshotnoiseandt×he modewasmeasuredwith the OBT scales as n−pi0x.75, and hence we have Neff = N0.75 when using SNRm=120.Numerical tests using a more realistic theoptimalSNRmisadopted.Sincewehavechosentouse intensity distribution, drawn from observed star counts, SNRm=120, we expect Neff to be even smaller. In fact, showthatverysimilarcontaminationeffectsareachieved. 1 Of course this is true only if the distribution is not con- The only difference is that one needs to generate much taminated, because otherwise the average gives much larger more artificial stars in the non-uniform case, and this systematic errors makes it numerically less efficient. FerdinandoPatat: Skybackground computation 7 Table 1. Estimated errors from numerical simulations of 300 300 px uncontaminated windows for three different × values of the sky background. Maximum errors are indicated within parenthesis. A total of 5000 simulations per sky level and SNR value were performed. m SNRm f (%) nb(±σ) ∆I/hIi(%) RMS error (%) 102 103 104 102 103 104 120 16.0 5 4.7 1.3 0.4 0.25 (1.00) 0.07 (0.27) 0.02 (0.11) 150 25.0 3 7.5 2.1 0.7 0.17 (0.60) 0.04 (0.16) 0.02 (0.06) 4. The ∆-test Oncethemode I oftheskybackgrounddistributionand h i the corresponding error σ have been computed, one is hIi left with the task of recognizing and rejecting contami- nated sub-windows, which is the topic of this section. As we have already mentioned, in an uncontaminated image the intensity distribution f(I) obeys the photon shotnoisedistributionandhencetheexpectedRMSnoise is given by σ = I . Since the left wing of f(I) is p h i much less disturbed than the right wing (cfr. Fig. 2), we p can estimate the underlying background noise using only the N pixels for which I I . As a robust estimator l ≤ h i we have chosen to use the Median Absolute Deviation (MAD) from the mode I applied to the left wing of h i the distribution: MAD =median I I I I (13) l {| −h i| ≤h i} Wenotethattheuseofthestandarddeviationinstead of MAD has the effect of over-estimating the noise in the real cases. In fact, the lower tail of the intensity distri- Fig.5. Relative errors on the mode estimate as a func- bution is always affected by defects like bad pixels and tion of the sky background (upper panel) and the rel- possible vignetting. While this fact has a rather strong ative bin size (lower panel). The circles represent the influence on the standard deviation, it leaves the MAD measured RMS deviations, while the dotted line indi- unperturbed, which is much less sensitive to the presence cates the expected RMS (see Eq. 12). In both panels, the of outliers. crossesindicate the maximumdeviationencountereddur- One can easily show that for a Gaussian distribution ing the simulations, while the dashed line traces the 5σ theratiobetweenthestandarddeviationandtheMADis level. Simulations were performed using SNR =120 and 1.483 (Huber 1981). Hence, to have an estimator which m ∼ RON=6 electrons. hasthesamemeaningofthestandarddeviationinthenor- malcase2,wedefineσ =1.483MAD .Thisallowsoneto l l comparedirectly σ andσ which, foranuncontaminated p l distribution,shouldbe approximatelythe same.Toquan- tifypossibledeviationsfromthePoissonianbehaviour,we As expected, the simulations show that the system- introduce the parameter ∆: atic error grows with N , while the random error keeps ∗ obeyingto Eq.12witha goodapproximation,atleastfor ∆= σl2−RON2−σp (14) N∗ <200 and I 500 electrons. More precisely, in the p σp h i ≥ case of N = 200, the simulations give σ 0.08 ∆I ∗ hIi in the range 102 I 104 electrons. We≃can safely which equals 0 in the Poissonian case, whilst it gets concludethattheR≤MhSier≤rorsintroducedbythemodede- largerandlargerasthedeviationfromthePoissoniandis- termination method we have described in this section are tribution grows. The idea is to try to estimate the un- smaller than 0.3% for I 5 102-104 electrons. Finally, knownsystematicerrorfrom∆,whichisameasurablepa- the RMSerrorcanbechonis≥erva×tivelyassumedtobe 8%of rameter. To illustrate this concept we use a realexample, the bin size ∆I in the same intensity range, which corre- 2 Infact,sincethecentralvaluehIioftheintensitydistribu- spondstoNeff 1000.Thesevalueshavebeenusedinthe tionswearedealingwithisalwayslargerthanseveralhundred ∼ remaining sections of this work and for all sky brightness counts, the noise Poissonian distribution can be very well ap- measurements discussed in Patat (2003). proximated with a Gaussian with σ= hIi. p 8 FerdinandoPatat: Skybackground computation Fig.6. An image of NGC 3521 obtained by FORS1 on Fig.7. Intensity distribution for the sub-window placed 04-04-2000 (R, 30 seconds, standard resolution collima- on a outer region of NGC 3521 (see Fig. 6). The vertical tor). The box indicates the test sub-window used for the linesareplacedatthedistributionmode(dashed)andthe exampledescribedinthetext.Itincludes300 300pixels, which correspond to 1′ 1′ on the sky. × realskybackground(dashed-dotted).Thelatterwasman- × uallymeasuredintheupperleftcorneroftheinputimage, in an object-free area. The solid and dotted lines plotted onthe leftside ofthe distributionaretwoGaussianswith where we have performed the analysis on a 300 300 px sub-window centred on the outer parts of the lar×ge spiral σ = σp and σ = σl respectively. The figure insert shows an intensity contour plot of the region. galaxy NGC 3521,as shown in Fig. 6. The corresponding intensity distribution is shown in Fig. 7, where the mode (indicatedbytheverticaldashedline)wascomputedwith i.e. 102, 103 and 104 electrons. The circles represent ǫ , theOptimalBinningTechnique(OBT)wehavedescribed whilethedashedanddottedlinesindicatetwodifferenthesi- in Sec. 3. timatesoftherandomerrorσ ;oneistheRMSdeviation hIi A superficial inspection of the input image shows al- directly computed from the simulated data and the other ready that this sub-window is strongly contaminated by is estimated using Eq. 11. While the systematic error is diffuse emission from the galaxy’s spiral arms. The mode clearly due to the presence of contaminating objects, the of the intensity distribution is in fact 34% higher than randomerrorσ isrelatedtotheaccuracyofthemethod ∼ hIi the background level manually measured on an object- one adopts to estimate the mode I (see Sec. 3). h i free areaof the same image close to the upper left corner. A clear result emerges from Fig. 8: the ∆ parameter However, the presence of strong fluctuations within the is effective in estimating the systematic error and hence sub-windows produces a significant increase in the distri- gives a statistically significantcriterionto decide whether bution core width, which becomes 80% larger than the a sub-window can be considered as unperturbed or not. ∼ expected photon shot noise (∆=0.78). This suggests that In order to have a simple description of the behaviour of ∆ may provide a tool to estimate the systematic error ǫ as a function of ∆, we have performed a linear fitting induced by contaminating sources and this, coupled with hofithe simulated results in the range 0 ∆ 10 (see some threshold criterion, would allow us to recognise and Fig. 8, continuous lines). Actually ǫ ten≤ds to≤bend for disregard the critical cases. To study this possibility, we larger values of ∆ and this has theheffiect that our linear haveexecuteda seriesofsimulationsofcontaminateddis- fittingslightlyoverestimatesthevalueof ǫ .Thisisnota tributions of the same type of those described in Sec. 3.2 problem, since in this sense the criterionhwie are going to for several values of the input sky background intensity establishisjustmorerestrictive.Asexpected,theslopeof Isky. For each simulation one can then compute the er- the ∆, ǫ relation is inversely proportional to the signal- ror ǫ of the mode estimate (see Eq. 10) and measure ∆. to-noisehriatiooftheskybackground.Thebestfitgivesthe The resulting range in the ∆ parameter can be binned to following result: some suitable value and the corresponding average error ǫ computed. In Fig. 8 we have plotted the results one 3.3 h i ǫ = (∆+0.96) (15) obtains for three different values of the sky background, h i I sky p FerdinandoPatat: Skybackground computation 9 intensity. For a typical value of ǫ =1%, ∆ is 1.1, max max 7.7 and 28.4% for sky levels of 102, 103 and 104 electrons respectively. Once all sub-windows in an input image have been analysed with the criterion we have just discussed, a first sky brightness guess can be obtained using the median of all selected values. This has the effect of excluding cases like those producedbyocculting masks insertedin the fo- calplane.Infact,toavoidstrongsaturationeffects,FORS instruments allow the user to place a number of movable blades on specific positions of the focal plane. The height of these blades is about 20′′, which correspond to 100 px whentheSRcollimatorisused.Whentheoccultedregions are larger than the adopted sub-window size, there is a chance that such areas pass the ∆-test. Since the counts in those regions are very low, the median always removes them successfully, provided that the occulted fraction of the fieldofviewisnotlargerthan50%,aconditionwhich is always fulfilled. After this selection is done, the only background fluc- tuations which are expected to be left in the remaining sub-windowsareduetoanonperfectflat-fielding.Infact, Fig.8. Errors on the estimated sky background I as sky as we have mentioned in the introduction, the use of twi- a function of ∆ (see text) from numerical simulations light flats introduces large scale gradients,which produce for three different sky background levels (102, 103 and maximumpeak-to-peakdeviationsof6%fromperfectflat- 104 electrons). The continuous lines represent linear least ness (see also next section). For this reason, we have de- squares fits in the range 0 ∆ 10, while the circles are ≤ ≤ cidedtooperateafurtherrefinement,choosingonlythose the average (systematic) errors. Each point is the result n sub-windows which deviate less than 3% from the me- of5000simulations.The dashedand dottedlines indicate g dianvalue.Thefinalestimateofthe backgroundintensity the RMS random errors computed from the simulations I is eventually obtained computing the weighted mean and Eq. 11 (N = 1000) respectively. The horizontal sky eff I = ng I w / ng w , (w = 1/σ2 ). To allow dotted line is placed at ǫmax=1%. sky j=1h ij j j=1 j j hIi for a statistically significant result, we have imposed that P P n 5inourautomaticprocedures.Ifthisconditionisnot g ≥ where I is given in electrons and ∆,ǫ are expressed met, then the input frame is considered as unsuitable for sky as percentage values. The fact that for ∆=0 ǫ =0 is skybackgroundmeasuresandrejected.Thishastheeffect h i 6 due to the following effect. When the sky background is ofoperatingafirstroughfilteringontheinputdata.Tobe contaminated by faint stars, whose maximum intensity is conservative,n isloggedtogetherwiththeotherrelevant g comparable to I , the mode and the width of the in- parameters,andafurtherandmorerestrictiveselectionis sky tensity distribution increase in such a way that ∆ tends always possible. p to be small, while the effective error is not. Intensive tests with realdata,as discussed in the next We can now use the total maximum error ǫ to section, have shown that the set of selection criteria we max fix a limit on ∆ below which we can consider a given havediscussedmakethemethodreliable,robustandsuit- intensity distributions as practically unperturbed. Since able to be implemented in a fully automatic procedure. the distribution of random errors around the system- atic deviation is Gaussian, we can conservatively assume 5. Testing the method on real data ǫ ǫ +3σ . Using Eq. 15 for ∆ one gets max hIi ≃|h i| The method we have just outlined has been tested on ǫ ∆ =0.9 I max σ 0.96 (16) a sample of 4678 FORS1 reduced frames, obtained with max h i 3 − hIi − different filters between April 1, 2000 and September 30, p h i 2001. For each input frame the results relative to all 36 whereσ isgivenbytheadoptedmodeestimator.As hIi sub-windows have been logged and this has allowed us to it is shown in Sec. 3, this can be approximated by Eq. 11 build-up a sample with more than 168,000 entries, each with N =1000, which gives the following expression: eff of which has been flagged according to the results of the ∆-test. The basic results are shown in Fig. 9, where we I ∆max = h i ǫmax 1.9 (17) have plotted the measured noise (corrected for the read- 3.3 − p out noise) as a function of the expected Poissoniannoise. Thecriticalvaluefor∆dependsofcourseontheaccu- The solid line traces the locus where the two noises have racyonewantstoreachintheestimateofthebackground thesamevalue,whilethedashedoneindicatesthelimiton 10 FerdinandoPatat: Skybackground computation Fig.9. Comparisonbetween the expected and measured Fig.10. Maximum deviations from the sky background noises on a sample of more than 4600 FORS1 images. weightedmeaninallframeswhichpassedthe∆-test.Solid Each point refers to a single sub-window. The solid line line indicates the data from all sub-windows, the dotted indicates the locus where the expected noise equals the line refers to the selected windows and the dashed line measured one, while the dashed line traces the limit set corresponds to the cases where n =36. The upper right g by Eq. 17. The dashed-dotted line indicates the expected insert shows the cumulative functions with the same line global noise generated by the photon statistics and the codings. flat-fielding correction for a typical FORS1 configuration (seetextformoredetails).Theupperpanelshowsthefrac- ally negligible with respect to the noise present in the tionofsub-windowswhichpassedthe∆-testasafunction science frame to be corrected. When the background ex- of the expected noise. posure level reaches high values, this is no longer true and the noise added by the flat-fielding process becomes the measured noise imposed by Eq. 17. All sub-windows significant. The expected global noise is in fact given by lying below the latter line wouldbe selectedfor sky back- σ2 = σ2 +I2 /(N I ), where I is the background p sky F F sky ground estimates. level in the input science frame. This law gives a fair re- The fraction of selected windows is plotted in the up- production of the observed behaviour, as it is shown in per panel of the same figure, again as a function of the Fig.9(dashed-dottedline),wherewehaveusedthetypical expected noise. As one can see, for noise values of less values of I and N quoted above. The few data points F F than 20 electrons (or background intensities smaller than lying below this line are generated by observations ob- 400 electrons), this fraction is very low. For larger in- tained with the low gain mode ( 0.3 ADU electron−1), ∼ ∼ tensities, itreachesa roughlyconstantvalue of80%.This for which I is abouta factor of2 largerthan in the high F means that basically all frames with I 400 electrons gainmode( 0.6ADUelectron−1),whichisalsothestan- h i ≤ ∼ will be rejected. In the adopted test sample, this however dardforFORS1imaging.Inprinciple,whencomputing∆, accounts for 11% of total number of frames only. From one could correct the measured noise for the flat-fielding the test sample we can conclude that, on average, 86.5% effect. Inpractice,atanexposurelevelof5 104 electrons × of FORS1 frames are suitable for sky-background mea- thecorrectiononthemeasurednoiseisoftheorderof20% surements, 95% of which have n 12. and hence a very small number of sub-windows which are g ≥ An interesting feature visible in Fig. 9 is the system- rejected by the ∆ criterion would move to the safe re- atic trend shown by the minimum measured noise. In gion (see Fig. 9). Furthermore, 90% of all sub-windows fact, this tends to deviate more and more from the ex- included in the test sample have a background intensity pectedPoissoniannoiseforbackgroundvalueslargerthan smallerthan2 104 electrons.Atthis level,the correction 104 electrons. This effect is explained by the following amounts to ab×out 10% only. For this reason we have de- ∼ considerations. FORS1 master sky flat-fields are usually cidedto ignorethe flat-fieldingeffect whenevaluating the obtained by the combination of N =3 twilight sky flats, deviation from the pure Poissoniannoise. F which have typical exposure levels of I 3.5 104 elec- Aswehavementioned(seealsoSec.3formoredetails), F ∼ × trons. The resulting noise on the combined frame is usu- the maximum formal errors on the mode estimate within