A New Technique for Galaxy Photometric Redshifts in the Sloan Digital Sky Survey

James J. Wray (1)
James E. Gunn (2)

(1) Department of Astronomy, Space Sciences Building, Cornell University, Ithaca, NY 14853, USA
(2) Princeton University Observatory, Princeton, NJ 08544, USA

ABSTRACT

Traditional photometric redshift methods use only color information about the objects in question to estimate their redshifts. This paper introduces a new method utilizing colors, luminosity, surface brightness, and radial light profile to measure the redshifts of galaxies in the Sloan Digital Sky Survey (SDSS). We take a statistical approach: distributions of galaxies from the SDSS Large-Scale Structure (LSS; spectroscopic) sample are constructed at a range of redshifts, and target galaxies are compared to these distributions. An adaptive mesh is implemented to increase the percentage of the parameter space populated by the LSS galaxies. We test the method on a subset of galaxies from the LSS sample, yielding rms ∆z of 0.025 for red galaxies and 0.030 for blue galaxies (all with z < 0.25). Possible future improvements to this promising technique are described, as is our ongoing work to extend the method to galaxies at higher redshift.

Subject headings: galaxies: distances and redshifts; techniques: photometric; catalogs

1. INTRODUCTION

Since Hubble (1929) discovered a linear relationship between the distances and redshifts of other galaxies, redshift measurements have been the primary method for determining distances to extragalactic objects. This is normally done using spectra of sufficiently high resolution that individual spectral lines can be resolved and matched to the same features in nearby objects, or by matching the spectrum to a model.

However, measuring the spectrum of an object with high spectral resolution and sufficiently high signal-to-noise requires a significantly longer integration time than recording broadband photometry of comparable quality. Thus, it is desirable to be able to measure an object’s redshift from broadband photometry alone. Redshifts measured this way are called photometric redshifts, or photo-z’s. Throughout this paper, we will refer to objects for which photo-z’s are sought as targets.

Photo-z techniques date back to Baum (1962), who combined nine photometric bands to form low-resolution SEDs for elliptical galaxies. These traced the steep 4000 Å break feature, which remains an excellent tool for photo-z determination since it produces a strong difference in flux between whichever two passbands straddle it at a given redshift. Koo (1985) was able to measure fairly accurate photo-z’s for both red and blue galaxies using only 3 or 4 photometric passbands; his method involved comparisons of observed galaxy colors with those predicted by the Bruzual spectral evolution models (Bruzual 1983, and companion papers cited therein) at a range of redshifts. Connolly et al. (1995) took a purely empirical (training set-based) approach, deriving a correlation between four-band photometric data and the measured spectroscopic redshifts of a sample of galaxies. Sawicki, Lin & Yee (1997) compare four-band target photometry to that predicted by empirical template spectra. More recently, hybrid techniques combining spectral template-fitting with training sets have been introduced (Budavári et al. 2000; Csabai et al. 2000, 2003).

All of the methods listed above use only the photometric fluxes (i.e. colors or apparent magnitudes) of their targets for calculating photo-z’s. However, galaxy images generally yield additional geometrical information, such as angular size, shape, and light distribution (radial and azimuthal).
In a review of photometric redshift techniques, Koo (1999) suggested that galaxy structural parameters, including surface brightness and radial light profile, could be used to reduce the number of passbands needed for precise redshift estimates. Indeed, the bulge-to-total flux ratio was used by Sarajedini et al. (1999) along with I-magnitude and V − I color, and Kurtz et al. (2007) have recently developed a novel method that uses only one color and the surface brightness from a single band.

Supervised neural networks have recently been used to compute photo-z’s from a range of input parameters, including Petrosian radii (Firth, Lahav, & Somerville 2003; Vanzella et al. 2004), concentration index (Collister & Lahav 2004), surface brightness and axial ratios (Ball et al. 2004). D’Abrusco et al. (2007) have incorporated Petrosian radii and information about the radial profile into their neural network. Wadadekar (2005) has used a different machine learning method to compute photo-z’s based on five passband fluxes along with the concentration index, while Way & Srivastava (2006) have used ensemble learning and Gaussian process regression to derive photo-z’s from colors and various morphological parameters.

This paper introduces a new, statistically-based photo-z technique, first conceived by David Schlegel, that uses surface brightness and the Sérsic index (a measure of the radial light profile) in addition to five-band photometry. The method is empirical: the seven properties listed are measured for a spectroscopic sample of galaxies, whose redshift information is used to estimate photo-z’s for the target galaxies.

Note that photometric redshifts have also been successfully applied to quasars (e.g., Richards et al. 2001; Budavári et al. 2001). This paper focuses on galaxy photo-z’s.

The paper is structured as follows: in §2, we describe the spectroscopic sample of galaxies used by the photo-z code.
The photo-z technique and its development are discussed in §3, along with other variations that were explored. A test of the photo-z code is described in §4. We present our conclusions in §5, and suggest future improvements for increasing the accuracy and applicability of the method.

2. THE SOURCE SAMPLE

2.1. SDSS & the NYU-VAGC

As of its Fourth Data Release (Adelman-McCarthy et al. 2006), the Sloan Digital Sky Survey (SDSS; York et al. 2000; Gunn et al. 1998, 2006) has imaged roughly 7000 square degrees of sky in five bands (u, g, r, i, z) ranging from the near-ultraviolet to the near-infrared (Fukugita et al. 1996; Smith et al. 2002). Follow-up spectroscopy has been performed on objects selected by one of several precisely defined target selection algorithms (Strauss et al. 2002; Eisenstein et al. 2001; Richards et al. 2002). SDSS has measured ∼10^6 galaxy spectra, but the number of galaxies detected in SDSS imaging is greater by roughly two orders of magnitude. Thus, despite the great size of the SDSS spectroscopic sample, which includes both a “Main” sample (flux-limited to r = 17.77) and a Luminous Red Galaxy (LRG) sample (flux- and color-selected, reaching down to r = 19.5), the huge size of the imaging survey makes it a very attractive target for photometric redshift techniques. We therefore work with SDSS data, although the method is in principle applicable to any other imaging survey with similar observable parameters.

The New York University Value-Added Galaxy Catalog (NYU-VAGC; Blanton et al. 2005) is essentially an “extended Main sample;” it extends the faint magnitude limit down to r = 18, and makes the other cuts on the Main sample less restrictive. It also includes all galaxies within 2 arcseconds of any target from the Main, LRG, or QSO samples, and thus is useful for analyzing large-scale structure.
In fact, also available are subsets of the NYU-VAGC called Large-Scale Structure (LSS) samples, which contain only well-characterized galaxies with measured spectroscopic redshifts. These samples are continually updated and expanded; we use sample14, which contains 221,617 galaxies with good photometry. Specifically, our sample results from an apparent magnitude cut, 14.5 < r < 17.5, an absolute magnitude cut, −23 < M_r < −17, and a redshift cut, 0.01 < z < 0.25. The redshift cut eliminates only a handful of galaxies that are not already eliminated by the photometric cuts.

Finally, the NYU-VAGC also contains a few derived parameters, including K-corrections and Sérsic indices for all galaxies. The Sérsic index n (Sérsic 1968; Graham & Driver 2005) is defined by fitting the radial surface brightness profile with a model of the form:

$$I(r) = A \exp\left[-(r/r_0)^{1/n}\right]. \qquad (1)$$

The value n = 1 produces an exponential light profile, typical of late-type galaxies (in addition to some low-luminosity early-type galaxies), whereas n = 4 produces a “de Vaucouleurs profile,” long considered a good description for many early-type galaxies. The SDSS photometric pipeline only performs fits for these two particular values, because computing an arbitrary best-fit value is computationally very expensive (Stoughton et al. 2002). Thus, Blanton et al. (2005) calculate this best-fit value of n themselves, for each galaxy in the NYU-VAGC (though they do the fits to circularly averaged profiles, whereas the SDSS pipeline performs a full 2-dimensional elliptical fit).

2.2. Examining the LSS samples

Blanton et al. (2003a) used the slightly older LSS sample12, with cuts very similar to the ones we used, to examine correlations among observable properties of SDSS galaxies. The quantities they studied were the four colors u−g, g−r, r−i, i−z; the absolute magnitude M_i; the surface brightness μ_i; and the Sérsic index n, with all parameters “corrected” to z = 0.1.
That is, using each galaxy’s redshift, its colors were K-corrected to the rest frame, but to ugriz bandpasses shifted blueward by a factor (1+0.1) in λ. The absolute magnitude and surface brightness are also for the (z = 0.1)-shifted i-band. Blanton et al. (2003a) produced arrays (e.g. their Fig. 7) of two-dimensional galaxy distributions for each pair of the seven properties listed, and discussed in depth the features of these bivariate distributions. The plots along the diagonal of their Fig. 7 are one-dimensional distributions of each property.

We use sample14 to generate similar plot arrays at a range of redshifts (Figs. 1-4), but we choose to use the apparent magnitude i, K-corrected and corrected for cosmological surface brightness dimming, instead of a band-shifted M_i. Thus all properties plotted are photometric observables for the galaxies in question, shifted to a common redshift. K-corrections are performed using the IDL code Kcorrect v3_2 (Blanton et al. 2003b). As in Blanton et al. (2003a), all magnitudes are Petrosian magnitudes (see descriptions in Blanton et al. 2001; Strauss et al. 2002), which measure a fraction of the galaxy light that is constant with distance or size (ignoring the effect of seeing); Graham et al. (2005) have described a simple method for converting Petrosian magnitudes to total magnitudes.

Note that Figs. 1-4, like Fig. 7 of Blanton et al. (2003a), attempt to show what a true sample of galaxies at the indicated redshift looks like; this is achieved by weighting each galaxy by 1/V_max, where V_max is “the volume covered by the survey in which this galaxy could have been observed” (Blanton et al. 2003a). This weighting accounts for the window function of the survey and the redshift distribution of the galaxies in the sample; §3.4 of Blanton et al. (2003a) provides further details.
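As an illustration only, the 1/V_max weighting of a one-dimensional distribution can be sketched in Python as follows. The function name, the fixed-width binning, and the assumption that V_max has already been computed for every galaxy are hypothetical choices made for this example; they are not taken from Blanton et al. (2003a) or from the IDL code used here.

```python
def vmax_weighted_histogram(values, v_max, lo, hi, n_bins):
    """Histogram `values` on [lo, hi), weighting each galaxy by 1/V_max.

    values : one observable per galaxy (e.g. an apparent magnitude)
    v_max  : survey volume in which each galaxy could have been observed,
             assumed to be precomputed (illustrative input, any units)
    """
    width = (hi - lo) / n_bins
    hist = [0.0] * n_bins
    for x, v in zip(values, v_max):
        if lo <= x < hi:
            # Clamp the index so rounding near `hi` cannot overflow the list.
            k = min(int((x - lo) / width), n_bins - 1)
            hist[k] += 1.0 / v
    return hist
```

Galaxies visible only within a small volume receive a large weight, which is how the weighted sample approximates a volume-limited one.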
As a result of this weighting, our 1-D i-distributions have the form of Schechter functions, but with a sharp drop at the faint end due to the absolute magnitude cut described above (the drop-off is not vertical because the cut was performed in the r-band).

Comparing Figs. 1-4 reveals the changes in photometric properties that occur as the same sample of galaxies is observed at different redshifts. These changes are plotted directly in Figs. 5-6. Five randomly selected galaxies that appear faint and blue (at z = 0.1) and have exponential profiles are plotted at a range of redshifts (Fig. 5); the same is done separately for five randomly selected bright (L*), red, de Vaucouleurs galaxies (Fig. 6). The plots along the diagonal of each figure have redshift z increasing along the horizontal axis.

By comparing Figs. 5-6 with Fig. 2, one sees that the Sérsic index is a very useful parameter for red galaxy photo-z’s, since it is constant with redshift while all other properties are not, and red galaxies exhibit a wide range in n. That is, the trajectory along which a red galaxy moves in redshift (Fig. 6) is roughly perpendicular to the galaxy distribution in all the 2-D plots containing n. The i-band apparent magnitude is also clearly a useful property when combined with any of the other observables: it changes strongly with redshift, and the red and blue galaxy trajectories never overlap in the 2-D plots. Note that there are degeneracies in some of the color-color plots (i.e., high-z blue galaxies look like low-z red galaxies), particularly those incorporating r-band data but not u-band data. However, the other colors and the apparent magnitude clearly are sufficient to break the degeneracy.

3. THE PHOTO-Z CODE

3.1. Theory

We can determine a galaxy’s redshift by combining its apparent (observable) properties with absolute quantities, i.e. by specifying its type T.
Thus, for a given galaxy targeted for photo-z measurement, we want to find the peak of P(T), the probability distribution of galaxy types that it could be. This information will allow us to compute its redshift.

The starting assumption of our photo-z technique is that the (shifted) empirical galaxy distributions of §2.2 can be used as probability distributions. That is, we want to use the 7-D distribution of the previously named observables (of which Figs. 1-4 show 2-D projections), corrected to a given redshift z, to approximate P(T|z), the probability distribution of galaxy types at that redshift. If the redshift corrections are reliable, then this should be a fairly good approximation given the large sample size. According to Bayes’ Theorem, a photo-z can then be computed as the redshift that maximizes

$$P(T) = P(T|z)\,P(z), \qquad (2)$$

where P(z) is the total probability distribution of redshifts for the sample of target galaxies. Estimating this function well will be an important step in applying this photo-z method to any new target sample.

The 7-D distributions are generated across a range of redshifts that is believed to cover all galaxies in the target sample, with an interval between the redshifts that is less than the rms error of the photo-z’s. At each redshift, a target galaxy falls somewhere in the P(T|z) distribution, and the value P(T|z) P(z) is computed and stored for comparison to values at other redshifts.

Initially, a slightly different approach was considered: only one distribution would be generated, and each target galaxy would be assigned many different redshifts in turn. Roughly speaking, the best-fit photo-z would then be that which places the target at the highest point in the distribution. However, Figs. 1-4 demonstrate that the distributions change shape with redshift, so information would be lost with this approach. Furthermore, for reasons described by Blanton et al.
(2003b), “one can observe a galaxy at z = 0.1 and reliably infer what it would look like at z = 0.3; it is only the reverse process that is difficult.” Since the median redshift of the LSS samples is z ∼ 0.1, we are much better off doing K-corrections to the sample galaxies than to a target galaxy that may have redshift z ∼ 0.3. Finally, the multiple-distribution method is more computationally efficient because we can generate the requisite distributions just once and store them, so that no K-corrections need be performed when we run the code on a set of targets. For all of these reasons, the method of generating multiple distributions is favored.

3.2. Implementation

We use IDL to implement the algorithm described above. Distance moduli (for shifting the source galaxies) are computed using the cosmological parameters Ω_m = 0.3, Ω_Λ = 0.7, and H_0 = 100 km/s/Mpc (following Blanton et al. 2003a). To avoid assigning as photo-z’s only those discrete redshifts at which the distributions are generated, we interpolate quadratically between the maximizing redshift and its immediate neighbors at higher and lower z. We assign the z-value corresponding to the peak of the fit parabola. For galaxies assigned the minimum or maximum redshift tested, we simply use that value; however, the redshift range can always be expanded so that there are few of these cases.

The shifted galaxies are placed into cells in a 7-D array, each dimension of which spans a range broad enough to include virtually every galaxy in the source sample, at every redshift to be tested. Given this broad range, we must have a large number of cells in each dimension in order to have reasonably high type-resolution.
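The quadratic interpolation step described above can be sketched in Python as follows. This is a minimal illustration, not the IDL implementation: the function and argument names are hypothetical, and the sketch assumes an evenly spaced redshift grid with one stored value of P(T|z) P(z) per grid redshift for a given target.

```python
def refine_photo_z(z_grid, scores):
    """Return the redshift at the peak of a parabola fit through the
    best-scoring grid point and its two immediate neighbours.

    z_grid : evenly spaced redshifts at which distributions were generated
    scores : value of P(T|z) * P(z) for one target at each grid redshift
    (Both names are illustrative assumptions.)
    """
    if not z_grid or len(z_grid) != len(scores):
        raise ValueError("z_grid and scores must be non-empty and equal in length")
    k = max(range(len(scores)), key=lambda i: scores[i])
    # At either end of the grid there is no neighbour on one side,
    # so the grid value itself is returned, as stated in the text.
    if k == 0 or k == len(scores) - 1:
        return z_grid[k]
    y_lo, y_mid, y_hi = scores[k - 1], scores[k], scores[k + 1]
    curvature = y_lo - 2.0 * y_mid + y_hi
    if curvature == 0.0:
        # Flat triplet: the parabola is degenerate, keep the grid value.
        return z_grid[k]
    step = z_grid[k + 1] - z_grid[k]
    # Vertex of the parabola through the three points.
    return z_grid[k] + 0.5 * step * (y_lo - y_hi) / curvature
```

Because the central point is the maximum, the vertex always lies within half a grid step of it, so the refined photo-z cannot jump past a neighboring grid redshift.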
However, the resolution is limited by both 1) the amount of memory available on the system on which the code is run (this is a real problem for 7-D arrays of numbers that can become fairly large near a peak in the distribution), and 2) the fact that the number of points (source galaxies) that go in the array is fixed, so that increasing resolution makes the array more and more sparsely populated. We balance these competing factors by using a resolution of 15 cells per dimension.

However, for a typical distribution generated at this resolution (in particular, for z = 0.01), only ∼0.03% of the cells in the array are populated, and the majority of these contain just one galaxy. Therefore an adaptive mesh is implemented, “smoothing” each single-galaxy cell across all neighboring cells. Specifically, the occupation number of each cell is multiplied by a (large) constant N, and then all cells that lie within one unit (in any combination of dimensions) of a single-galaxy cell are populated with numbers, the total of which, for any given single-galaxy cell, is N. Thus, after the initial multiplication by N, no points are added to the distribution; it is merely smoothed around each cell that formerly contained a single galaxy.

Furthermore, not all the cells newly populated by this step are given the same value, for they lie at different distances in parameter space from the central cell (the one that had only one galaxy). For example, a cell that has six coordinates in common with the central cell and only one that differs by unity is much “closer” than a cell with all seven coordinates differing by unity from those of the central cell. Thus we compute the center-to-center distance between each cell and the central one (in units of a cell), and place values in the cells that are inversely proportional to that distance. The central cell gets the largest value of all, though this is greatly reduced from the value it had before smoothing.
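The smoothing procedure described above can be sketched in Python for a sparsely stored array of any dimensionality. This is an illustrative sketch under stated assumptions, not the IDL code: the text does not specify the weight given to the central cell, the treatment of neighbors that fall outside the array, or the value of N, so the `center_dist` parameter (an effective distance of 0.5 cell for the central cell), the renormalization over in-bounds neighbors, and the default `big_n` are all hypothetical choices.

```python
from itertools import product
from math import sqrt


def smooth_single_cells(counts, shape, big_n=1000, center_dist=0.5):
    """Spread each single-galaxy cell over its neighbours.

    counts      : dict mapping cell index tuples to galaxy counts (sparse array)
    shape       : number of cells along each dimension
    big_n       : the large constant N (value assumed for illustration)
    center_dist : effective distance assigned to the central cell so that it
                  receives the largest share (an assumption of this sketch)
    """
    offsets = list(product((-1, 0, 1), repeat=len(shape)))
    smoothed = {}
    for cell, n in counts.items():
        if n != 1:
            # Multiply-occupied cells are only rescaled by N.
            smoothed[cell] = smoothed.get(cell, 0.0) + n * big_n
            continue
        # Collect in-bounds neighbours with inverse-distance weights.
        targets = []
        for off in offsets:
            nb = tuple(c + o for c, o in zip(cell, off))
            if all(0 <= x < s for x, s in zip(nb, shape)):
                dist = sqrt(sum(o * o for o in off)) or center_dist
                targets.append((nb, 1.0 / dist))
        total = sum(w for _, w in targets)
        for nb, w in targets:
            # The shares for one single-galaxy cell sum to N.
            smoothed[nb] = smoothed.get(nb, 0.0) + big_n * w / total
    return smoothed
```

In seven dimensions each single-galaxy cell is spread over up to 3^7 = 2187 cells, and storing only occupied cells in a dictionary sidesteps the memory cost of a dense 15^7 array.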
After this smoothing is performed, the z = 0.01 distribution mentioned previously populates ∼3% of the array, an improvement by two orders of magnitude. In the next section, we will see how this change affects photo-z measurements.

4. RESULTS

Photometric redshift routines are usually tested by applying them to objects with known (i.e., spectroscopic) redshifts. Since redshifts are known for all galaxies in LSS sample14, we can simply trim the sample that we use to generate the distributions, and use the remaining galaxies as the target sample. Specifically, we test the code on 1/4 of the sample (55,405 galaxies), using only the remaining 3/4 to generate the distributions. Distributions are generated over the redshift range 0.02 < z < 0.30, at intervals of 0.02 in z (note that the upper limit extends beyond the greatest redshift present in our source sample; still, we include z = 0.30 in order to verify that no galaxies are incorrectly assigned such a high redshift).

As explained in §3.1, an estimate of P(z) for the target sample is needed. In this special case, P(z) is the same for both the source and target distributions. P(z) is usually “divided out” from the source population when each galaxy is weighted by 1/V_max. Instead, in this case we can avoid estimating P(z) entirely by giving each source galaxy a weight equal to unity, effectively skipping the division by P(z). Then there is no need to multiply by P(z) later, for the probability computed from the distribution at each given redshift z gives us P(T) directly. Fig. 7 shows the 2-D projections of a unity-weighted distribution, as used in this particular test.

We define ∆z ≡ z − z_phot, where z is the spectroscopic redshift and z_phot is our photo-z. Without the adaptive mesh smoothing, this test yields an rms ∆z of 0.029, with systematic offset of essentially zero (mean ∆z ∼ −0.0005). However, our failure rate, i.e.
the percentage of galaxies that are not assigned a redshift because they do not fall inside an occupied cell at any of the redshifts tested, is ∼29%. With the smoothing incorporated, the failure rate drops to ∼11.3%, which should be acceptable for most purposes; the rms ∆z also improves slightly, to ∼0.0275. Fig. 8 is a plot of z_phot vs. z for all the galaxies here tested.

In addition, we examine the performance of the photo-z code on red and blue galaxies separately, using the “optimal color separator” of Strateva et al. (2001), u−r = 2.22. The target sample, thus divided, contains 25,296 “blue” galaxies and 30,109 “red” galaxies. The rms ∆z for the red galaxies is ∼0.0246; for the blue galaxies, it is ∼0.0303. Interestingly, the red galaxies have a notably higher failure rate (∼16.5%) than the blue galaxies (∼5.1%). Figs. 9 & 10 are plots of z_phot vs. z for the red and blue galaxy subsets, respectively. Table 1 divides the target sample even further, both by u−r color and by i-magnitude, and shows the variation of rms ∆z with these parameters. The errors are smaller for the brighter galaxies of all colors, despite the fact that the fainter galaxies are more numerous in both the training set and target sample.

Table 2 compares our photo-z accuracy to that achieved by other methods. Our rms ∆z is lower than that obtained by Csabai et al. (2003) using two template-fitting methods and their own hybrid technique, and comparable to the results of the quadratic-fitting approach of Connolly et al. (1995) and the support vector machine method of Wadadekar (2005). The template-fitting methods also produce significant systematic offsets (underestimates), while our method does not. Csabai et al. (2003) reported rms ∆z of 0.029 for red galaxies and 0.04 for blue galaxies, so our method shows the most pronounced improvement in the photo-z’s for blue galaxies. Csabai et al.
(2003) used a smaller sample of ∼35,000 galaxies, but using smaller training sets does not significantly increase the errors from our method (Mandelbaum et al., in preparation).

Padmanabhan et al. (2005) have achieved rms ∆z ∼ 0.03 using a template-fitting approach, but they used the deeper SDSS LRG sample, so their results are not directly comparable to ours.

As Table 2 shows, smaller rms ∆z has been obtained using the neural network technique of Collister & Lahav (2004) and two techniques (ensemble model and Gaussian process regression) introduced by Way & Srivastava (2006). Other neural network methods have similarly attained rms ∆z ∼ 0.02 (Vanzella et al. 2004; Ball et al. 2004; D’Abrusco et al. 2007). However, our method is arguably more transparent than the neural network techniques. The next section discusses additional improvements that could further reduce our errors in future implementations.

5. CONCLUSIONS

We have described a new method for determining photometric redshifts of SDSS galaxies. The method is empirical, and uses a large spectroscopic sample of SDSS galaxies to infer distributions of galaxy properties at a range of redshifts. The best-fit redshift is determined by comparing these distributions to a galaxy for which a photo-z is desired. The properties used are the five-band SDSS photometry, along with surface brightness and the Sérsic index. This represents one of the first alternatives to neural networks for deriving photo-z’s from imaging information beyond the photoelectric fluxes.

Our test of the method produces rms ∆z = 0.025 for red galaxies in the Main sample, and rms ∆z = 0.030 for blue galaxies. These rms errors are an improvement over those achieved by template-fitting and hybrid photo-z codes previously applied to SDSS galaxies, but are somewhat worse than the errors typical of neural network methods.
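For reference, the summary statistics quoted for the test (rms ∆z, mean ∆z, and failure rate) can be computed from paired redshifts as in the following Python sketch. The function name and the use of `None` to mark a target that was not assigned a redshift are hypothetical conventions for this example, and rms ∆z is taken here to mean the square root of the mean of ∆z squared, which is an assumption about the definition.

```python
from math import sqrt


def summarize_photo_z(z_spec, z_phot):
    """Return rms and mean of dz = z_spec - z_phot, plus the failure rate.

    z_phot entries equal to None mark targets that received no photo-z
    (an illustrative convention for failures).
    """
    if not z_spec or len(z_spec) != len(z_phot):
        raise ValueError("inputs must be non-empty and equal in length")
    dz = [zs - zp for zs, zp in zip(z_spec, z_phot) if zp is not None]
    failure_rate = 1.0 - len(dz) / len(z_spec)
    if not dz:
        return {"rms": None, "mean": None, "failure_rate": failure_rate}
    mean = sum(dz) / len(dz)
    rms = sqrt(sum(d * d for d in dz) / len(dz))
    return {"rms": rms, "mean": mean, "failure_rate": failure_rate}
```

Failures are excluded from the rms and mean, so the error statistics describe only the targets that were assigned a redshift.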

Implementing an adaptive mesh reduces our method’s failure rate, but has only a small effect on the rms ∆z, so further adjustments to the smoothing technique alone would not likely reduce our errors. Similarly, training sets even larger than the 166,212 galaxies used in our test are unlikely to improve the errors significantly (Mandelbaum et al., in preparation). Because our errors are currently larger than the redshift spacing (0.02 in z) used in generating the arrays for the test described here, generating the arrays at finer intervals does not by itself reduce our errors.

One modification that may help would be to change the cell spacing for various observables in the array; e.g., for the Sérsic index, cells could be evenly spaced in log(n) rather than evenly spaced in n. Alternatively, the spacing could be chosen (for any or all observables) such that the peaks in the distribution are spread across many cells, effectively providing higher resolution in P(T|z). This approach would have the added advantage of populating a larger fraction of the array, potentially reducing the failure rate.

Looking ahead, the next major challenge for photometric redshift techniques (including our own) is to make them applicable to higher-redshift galaxy samples. At redshifts only a little higher than the maximum for our sample, the intrinsic evolution of the target galaxies becomes significant. This evolution can be calculated with some reasonable confidence for the red, passively evolving galaxies, but not for the actively star-forming blue ones.

In any case, it is clear that to extend the present techniques to higher redshifts, evolutionary corrections will have to be applied if one wishes to use the SDSS Main sample to generate the 7-dimensional probability arrays. Of course, this approach will require one to estimate the redshift distribution P(z) of the target sample in order to compute the individual galaxy redshifts.
Alternatively, deeper surveys covering the larger redshifts could be used to generate a high-z training set, but the necessity to populate the arrays and determine evolutionary effects self-consistently demands very large datasets. It is likely that moderate-sized deep surveys can be used to verify empirical evolutionary corrections to the SDSS Main sample for higher-redshift photo-z estimates, and this is the path now being pursued here (Mandelbaum et al., in preparation). There are several redshift surveys deeper