A&A manuscript no. (will be inserted by hand later) ASTRONOMY AND Your thesaurus codes are: ASTROPHYSICS 03 (03.13.6; 11.17.4; 12.07.1) 10.1.1995 The light curve and the time delay of QSO 0957+561 1 2 2 2 J.Pelt , R.Kayser , S.Refsdal , and T. Schramm 1 Tartu Astrophysical Observatory, EE-2444T~oravere, Estonia 2 Hamburger Sternwarte, Gojenbergsweg 112, D-21029 Hamburg-Bergedorf, Germany received |; accepted | Abstract. We present a new analysis of the presently as probable as the delay obtained by Press and his col- available photometric data for the gravitationally lensed laborators. We present here an analysis of the presently quasar 0957+561 A,B with the aim of determining the available data with re(cid:12)ned versions of our new methods timedelaybetweenitstwoimages.Thebasicmethodused as described inPaperI.Ourresultnot onlygivesthe(cid:12)rst is the dispersion estimation technique. Even by using the statistical reliable estimation of the time delay but also simplest non-parametric form of our method we can con- gives evidence for gravitational microlensing in at least vincingly rule out a time delay near 536 days and show one of the two light curves. thatthetimedelayisinthe vicinityof 420days.Wethen The time delay between the images of a gravitation- introducere(cid:12)nementstoourmethodinordertogetasta- allylensedobject isofgreat astrophysicalinterestsinceit ble and reliable result for the time delay independent of maybeusedtodeterminetheHubbleparameteraswellas highfrequencynoiseinthedataandsamplingerrors.Our themass ofthe lens(Refsdal1964a,b, Borgeest& Refsdal bestresultforthetimedelay,checkedbyvariousstatistical 1984,Borgeest1986,Kayser1990,Falcoetal.1991b).The tests and using bootstrap error estimates, is 423(cid:6)6 days. double quasar 0957+561 A,B (Walsh et al.1979, Young We furthermore con(cid:12)rm our earlier result that the radio 1980, Falco 1992) is up to now the only gravitational lens data are compatible with this value. Using the best avail- system for which serious attempts have been made to de- ablemodelforthe massdistributionin thelensinggalaxy termine the time delay (cid:28) between its images (cf.Paper I and cluster, our result for the time delay constrains the and references therein). However, the controversy around Hubble parameter to be smaller than 70 km/(s Mpc). the di(cid:11)erent results obtained by di(cid:11)erent groups of au- thors has not yet been decided. Key words: Methods: statistical { time series analysis Afteradescriptionofthedatasetusedthroughoutour {Quasars: 0957+561A,B{gravitationallensing{Hubble paper (Sect.2) we recall in Sect.3.1 the methods used in parameter Paper I and describe some re(cid:12)nements of the techniques whichweapplylater.Inthefollowingsubsectionsweanal- yse thelightcurves withincreasing accuracy, compare re- sults to previous ones and conclude our best result for the time delay, including an estimation of the amount of 1. Introduction microlensing. In addition (Sect.3.5) we reanalyse the radio data The time base of the optical light curves of the dou- (Leh(cid:19)ar et al.1992) with the re(cid:12)ned methods and con(cid:12)rm ble quasar QSO 0957+561A,B has been signi(cid:12)cantly en- the result of Paper I that the radio data are compatible hancedduetoobservationsperformedmainlybyR.Schild with the result obtained from the optical data. and his collaborators (Schild & Thomson 1994). An anal- In Section 4 we discuss the methodology, our results ysisofthemuchsmallerdatasetpresentedbyVanderriest and their physical interpretation and meaning, including et al.(1989) as well as the radio data of Leha(cid:19)r (1992) led constraints on the Hubble parameter. Press et al.(1992a,b) to an estimation of the time de- laybetween the images of 536 days.Reanalysing thedata with a di(cid:11)erent but nevertheless equally sound statistical 2. Data sets method convinced us (Pelt et al.1993, hereafter Paper I) The extended data set, kindly made available to us by that the result of 536 days is in a statistical sense rather R.E. Schild and D.J.Thomson (1994, below ST94), con- unstableandthatatimedelayaround410daysisatleast tains observationsfrom November1979to July1994 (JD. Send o(cid:11)print requests to:R.Kayser 244193.9{249540.7). While it contains observations from 2 J. Pelt etal.: Light curve andtime delay of QSO 0957+561 3. Time delay 3.1. Dispersion spectra To estimate the time delay between the images A and B we use basically the same methods as described in Paper I. Nevertheless we repeat here shortly the general setup and describe some new variations of our techniques. Our data model consists of two time series: Ai =q(ti)+(cid:15)A(ti);i=1;:::;NA (1) and Bj =q(tj(cid:0)(cid:28))+l(tj)+(cid:15)B(tj);j =1;:::;NB (2) where q(t) denotes theinner variabilityof the quasar,l(t) is a variability component which contains the unknown Fig. 1. The window function NA;B((cid:28)) for the two data sets. ampli(cid:12)cation ratio and an additional low frequency noise Upper curve: data set of Schild and Thomson (1994, ST94); lower curve: data usedby Vanderriestet al.(1989, V89) componentduetohypotheticalmicrolensing,(cid:28) isthetime delay, (cid:15)A(t) and (cid:15)B(t) are observational errors for which 2 2 crude estimates for their dispersions (cid:14)A(ti) and (cid:14)B(tj) (or corresponding statistical weights WA(ti);WB(tj)) are di(cid:11)erent sources, the overwhelming majority of the data available. pointshavebeenmeasuredbyR.Schildandhiscollabora- For every (cid:12)xed combination(cid:28);l(t) we generate a com- tors. It is therefore the most homogeneous data set avail- bined light curve C by taking all values of A as they are ablefor QSO0957+561. A fulldescription of thedata set and \correcting" the B data by l(t) and shifting them by withexplanation ofreduction, correctionandcompilation (cid:28): procedures can be found in Schild and Thomson (1994). (cid:26) Ai; if tk =ti, Ck(tk)= (3) There are in total 831 time points with photometric Bj (cid:0)l(tj); if tk =tj +(cid:28), estimates for both components of the double quasar. For the A and B lightcurves the full amplitude of variability where k = 1;:::;N and N = NA +NB. The dispersion spectra is 16.36{16.96 and 16.27{16.96, respectively. The mean error for both light curves as estimated by the observers 2 2 D ((cid:28))=minD ((cid:28);l(t)): (4) 2 is 0.0158 magnitudes, giving a mean dispersion of D = l(t) 0:000251. However, one should note a striking di(cid:11)erence cannowbecomputedandsearched forsigni(cid:12)cantminima between the (cid:12)rst and second part of the ST94 data: The with respect to (cid:28). In Paper I we used (with di(cid:11)erent no- dispersion is signi(cid:12)cantly decreasing in time. tation) a (cid:12)xed l(t) = l0 as unknown magni(cid:12)cation ratio. There is a simple statistic which can be used to char- In this paper we do the same, but consider additionally acterise the sequences of observations from the point of perturbations modelled by low degree polynomials. It is view of usefulness in the search for time shifts. The win- important to note that the parameterization of l(t) is al- dow function NA;B((cid:28)) measures the number of pairs of ways done in the framework of the original time points nearby observations in the combined data set (where the (not the shifted ones). B light curve is shifted in time by (cid:28)) which contain one The properties of the dispersion spectra depend observation from curve A and one observation from curve stronglyonthechoice of theexactformfortheestimates. B (cf. Paper I). In Fig.1 the window function NA;B((cid:28)) is The simplest dispersion estimate (see also Paper I) shown for two di(cid:11)erent data sets{the ST94 data and the PN(cid:0)1 2 data compiled by Vanderriest et al. (1989, below V89). 2 k=1 Wk(Ck+1(cid:0)Ck) The absolute minimum (43 pairs) for the V89 data oc- D1 =ml(ti)n 2PNk=(cid:0)11Wk ; (5) curs for shifts in the range 534{537 days. This was one of the reasons which guided us in Paper I to seek another wheretheWk arestatisticalweightsof thecombinedlight probable value for the time delay besides the 536 days curve data proposed by Press et al.(1992a). For the ST94 data the WiWj absoluteminimumofNA;B((cid:28))occursforshiftsof 514and Wk =Wi;j = Wi+Wj; (6) 515 days with 327 pairs. Consequently the statistical re- liability (even for the shifts with most unfortunate time does not contain any free parameters except those which point spacings) is now at least seven times higher. describe the hypothetical microlensing (the degree of the J. Pelt etal.: Light curve andtime delay of QSO 0957+561 3 perturbingpolynomial inourparticularcase).Thesimple modi(cid:12)cationof the (cid:12)rst statistic PN(cid:0)1 2 2 k=1 WkGk(Ck+1(cid:0)Ck) D2 =min PN(cid:0)1 (7) l(t) 2 k=1 WkGk where Gk = 1 only when Ck+1 and Ck are from di(cid:11)er- ent images and Gk = 0 otherwise, measures the disper- sionspeci(cid:12)callyintheoverlapareasof thecombinedlight curve. Obviously, this is the dispersion we are interested 2 in, whereas D1 may become strongly dominated by the dispersion of only one of the lightcurves (A or B) in de- pendence of the window function (cf. Paper I). Thepairsofconsecutive observationsareincludedinto thedispersionestimatesD1 andD2 withoutconsideration of the distance between the two time points ti;tj. In this 2 2 Fig. 2.SpectraD1 (lowercurve)andD2 (uppercurve)forthe way two observations, one of which occurs before a long ST94 data set gap in time and another one which occurs after this gap can make a strong contribution to the overall estimate. However, this kind of long time correlation is certainly includes only pairs with less distance between two obser- ruled out for the red-noise like lightcurves of quasars. To vations (in the combined light curve) than (cid:14) (let us call avoid this we therefore add an additional constraint into thisparameterdecorrelationlength).Inthesecondscheme our dispersion estimate: the pairs with longer distances between observations get linearly decreasing weights PN(cid:0)1 2 2 k=1 SkWkGk(Ck+1(cid:0)Ck) (cid:26) D3 =ml(ti)n 2PNk=(cid:0)11SkWkGk (8) Sl(;2m) = 1(cid:0) jtl(cid:0)(cid:14)tmj; if jtl(cid:0)tmj(cid:20)(cid:14), : (12) 0; if jtl(cid:0)tmj>(cid:14) where The third weighting scheme (cid:26) 1; if jtk+1(cid:0)tkj(cid:20)(cid:14), Sk = (9) (3) 1 0; if jtk+1(cid:0)tkj>(cid:14) Sl;m = (cid:0)tl(cid:0)tm(cid:1)2 (13) 1+ (cid:12) and (cid:14) is the maximumdistance between two observations which can be considered as nearby. where(cid:12)playstheroleofthedecorrelationlength,includes 2 Thenumberofpairsofobservationswhichareincluded all pairs of observations into the D4;3 estimate. However, in the estimates D1(cid:1)(cid:1)(cid:1)D3 is normally not su(cid:14)cient to in our actual calculations we were forced to use a certain avoid strong noise in the corresponding dispersion spec- cut-o(cid:11) distance to makecomputing times acceptable. tra. Very often there are some in(cid:13)uential time points in The number of di(cid:11)erent dispersion estimates gives us the observational sequence, removal of which can change some (cid:13)exibility when analysing the actual data but also the local minima of the spectra quite signi(cid:12)cantly.To get involves a degree of arbitrariness. We therefore tried to statistically more stable spectra we used a fourth form of be as conservative as possible when using di(cid:11)erent algo- the dispersion estimates which now includes not only the rithms. We present our results step by step and invoke neighbouring pairs: modi(cid:12)cations only when the motivation to do so is quite obvious. 2 PNl=(cid:0)11PNm=l+1Sl(;km)Wl;mGl;m(Cl(cid:0)Cm)2 Error bars for the delay estimates were computed as D4;k =ml(ti)n PNl=(cid:0)11PNm=l+1Sl(;km)Wl;mGl;m (10) (dseesecrAibpepdeinndPixa)pteorIs.mWoeotahppthlieedcoamnbadinaepdtilvigehmtecduiarvne(cid:12)altnedr reshu(cid:15)ed the computed residuals 1000 times to generate where Wl;m are statistical weights, Gl;m selects pairs ac- bootstrap samples. The shifts obtained in this way were cording to the origin of the values Cl and Cm (as for 2 2 (k) then used to calculate the error bars. the estimates D2 and D3), and Sl;m weights the squares depending on the distance between corresponding time 3.2. Standard estimates points.Inthevariouscalculationswe usedthreeformsfor (k) the weights Sl;m. The (cid:12)rst one Inthissectionwethedescribe theoverallbehaviourofthe simplest dispersion spectra, computed for the ST94 data (cid:26) (1) 1; if jtl(cid:0)tmj(cid:20)(cid:14), set. We must stress here that in the following computa- S = (11) l;m 0; if jtl(cid:0)tmj>(cid:14) tions the data were used without any preprocessing. No 4 J. Pelt etal.: Light curve andtime delay of QSO 0957+561 2 2 Fig. 3. Spectra D2 for the ST94 data set (lower curve) and Fig. 4. Spectra D3 for (cid:14)=2;4;8;16;32;64 forthe V89 data set used inPaper I (upper curve) mates we are forced to introduce some additional consid- detrending or censoring was applied. Also, the hypotheti- erations and free parameters into our estimation scheme. cal microlensing was ignored (l(t)=l0). The simplestandmostobviousimprovementisthein- To get a general idea about the behaviour of the dis- troductionofanupperlimitforthedistancebetweendata persion spectra for a wide range of (cid:28) values we computed 2 2 2 points to be included into the cross sums (estimate D3). D1 andD2 dispersionspectra(whichwerealsousedinPa- This excludes possible (cid:13)uctuations due to the long gaps per I and which do not contain any free parameters) for in the combined lightcurve. (cid:28) =(cid:0)2000;(cid:0)1999;:::;1999;2000days.InFig. 2thespec- 2 2 2 In Fig.4 we have plotted a sample of D3 spectra with tra D1 and D2 are shown in logarithmic scale to demon- di(cid:11)erent values of (cid:14). There are some minor details which strate their essential similarity. change from one (cid:14) value to another, but the general (cid:13)uc- We can see that there are essentially global min- tuating nature of the spectra unfortunately remains. ima around 400(cid:1)(cid:1)(cid:1)450 days in both spectra. The fea- ture around 536 days is also well seen, however it is much less pronounced than for the V89 data (see Pa- 3.3. Re(cid:12)nement per I). Below we restrict the plot range for the (cid:28) val- 2 2 2 ues to 350;351;:::;549;550 days to demonstrate the As we saw the behaviour of D1, D2 and also D3 type behaviour of the di(cid:11)erent spectra in more detail for dispersion spectra is somewhat erratic due to the large this important subinterval. However, the actual compu- samplingerrors(themaximumpossiblesizeforasubsam- 2 2 2 tations were carried out for a wider interval (typically ple forD1,D2 andD3 isN(cid:0)1 but isgenerallylessdueto 100;101;:::;799;800 days). thespacingof theA,B or B,Atype pairs).Itisenough to In Fig. 3 the fragment of the D22 spectra is plotted remove only one or two points from the original data set with the analogous fragment computed for the V89 data andtheabsoluteminimuminthespectracan jumptenor set(1989).The(cid:13)uctuatingnatureof bothspectra isquite evenmoredays.Thegeneraldepressionaroundtheselocal obvious.Our goal in PaperI was to demonstrate that be- minimainthespectraisneverthelessquitestableandthis sidesthefeaturearound 536days,which isquiteunstable iswhywelookedformore stabledispersion estimates.We againstdetrendingandcensoringappliedtotheV89data, thus invoked in our analysis a fourth type of dispersion there are also other possible and plausible values for the estimates. 2 time delay. ThedispersionestimateD4;1includesinthecrosssums In the spectrum for the ST94 data the absolute min- all pairs whose corresponding time points do not di(cid:11)er imum is at a time delay of 430 days even without any more than (cid:14) (decorrelation length). Its modi(cid:12)ed version 2 detrending and censoring applied to the data. Around this D4;2 additionally weights pairs linearly.The overallnum- value, however, there are a number of other local minima berofpairsincludedinthecrosssumsisnowsigni(cid:12)cantly (at 405,411,416,424,435 and 439 days). Because the aim larger and consequently the corresponding dispersion es- of the current paper is to determine the time delay for timates are statistically more stable. However, the ad- thedouble quasar 0957+561 as exactly as possiblewe can ditional inclusion of pairs with longer distances between notsimplyusetheabsoluteminimumasthebestestimate time points introduces a certain bias into the estimates. because it depends on minutedetails of data spacing and Wecallit curvature biasbecauseforlinearlyincreasing or observational errors. To get statistically more stable esti- decreasing trends (or trend fragments)this bias is zero. J. Pelt etal.: Light curve andtime delay of QSO 0957+561 5 2 Fig. 5. Estimated shifts for three arti(cid:12)cial data sets. Thin Fig. 6. Spectra D4;2 ((cid:14) = 20) for the ST94 data set. Lower 2 2 lines:D4;1,thick lines:D4;2 curve: allpairsincluded, upper curve:only A,Bpairs included 2 To investigate the dependence of the estimates D4;k on the value of the decorrelation length (cid:14) we generated arti(cid:12)cialdatasetswithknowntimeshiftsfortheB curve. For this purpose we combined curve A (unshifted) and curveB (shiftedby(cid:28)),thensmoothedthecombinedcurve with an adaptive median (cid:12)lter with length 7, and (cid:12)nally put thesmoothedA and B valuesback intotheir original placesinthedataset.InthiswaytheAandB curves can be considered as two replicas of the original source light curve and all the sampling properties and also much of the variability structure of the original data is mirrored in them. In Fig.5 the results for three di(cid:11)erent shifts ((cid:28) = 400;450;500 days) are shown. It can be seen that the (cid:12)rst weighting scheme (11) results in much more bi- asedestimatesthanthesecondscheme(12).Consequently we preferred to use the second scheme. To balance statis- Fig. 7. Variability of shift estimates due to changes in the tical stability and curvature bias we choose (cid:14) = 20, the value of (cid:12). Arti(cid:12)cial data with known shift 423 days: thick highest value for which the maximum bias in all of our line; original ST94 data set:thin line trialcomputationsdidnot exceed2days.Inthe following discussion every shift value must therefore be considered it can be safely predicted that similar results will now be with an inherent (cid:6)2 error which originates from the dis- 2 alsoobtainedbyusingtheoptimalinterpolationmethodof persion estimate D4;2 itself. 2 Pressetal.(1992a),whichisquitesensitivetotheamount In Fig.6the resultingdispersion spectra D4;2, (cid:14) =20, of regularly spaced gaps in data. for the ST94 data set are shown. We see that the spectra are now indeed much smoother. The absolute minimum We get additional support forthe value423 daysfrom 2 forthe spectrum with allpairs included in thecross sums calculations with the third weighting scheme (D4;3). In is 424 days and it is 423 days for the spectrum with only Fig. 7 the dependence of the shift values with minimum A,B pairs included. The bootstrap estimate for the error dispersion is plotted against the value of the parameter is (cid:6)4 days. If we take into account the possible bias due (cid:12). One of the curves is computed for arti(cid:12)cial data with to the dispersion estimation scheme (described above)we a known shift of 423 days (thick line) and one for the can formulateour main result: original ST94dataset(thin line).Ascut-o(cid:11) limit forpair The most probable time delay between the two inclusion (to speed up computations) we used a value of light curves of the double quasar QSO 0957+561 3(cid:12). is 423(cid:6)6 days. There is a characteristic \plateau" for values (cid:12) = It is important to note that the sampling properties 1(cid:1)(cid:1)(cid:1)60 days in the curve for the arti(cid:12)cial data set. If we of the ST94 data set are good enough to obtain similar assumethatasigni(cid:12)cantcurvaturebiasstartsfor(cid:12) values 2 shiftsforbothvariantsofthestatisticD4;2.Consequently, higher than 60 days we can use dispersion estimates for 6 J. Pelt etal.: Light curve andtime delay of QSO 0957+561 2 Fig. 8. Trendin B(cid:0)A Fig. 9. Spectra D4;2 for models with polynomial degrees 0 to 10,thelevelofthedispersiondecreaseswithincreasingdegrees smaller (cid:12) values as statistical sample with an inner vari- abilityjust due to the sampling and observational errors. Table 1. Minima for the di(cid:11)erent degrees of correcting poly- It was extremely satisfying to see that the mean value, nomial. Note that several minima are listed for degrees 9 and themedianvalueandthemostfrequent valueinthissam- 10.In thelast columnthe part (cid:23) of thetotaldispersion which ple were all 423 days! However, as can be well seen from is explained by the particular model is given. Fig.7, the shift estimation error for particular values of (cid:12) can be as large as 5 days. The bootstrap error for the A,B All third weighting scheme is (cid:6)4 days. The mean additional Degree Shift Dispersion Shift Dispersion (cid:23) error due to the curvature bias is less than 2 days. 0 423 0.000396 424 0.000271 0.980 3.4. Hypothetical microlensing 1 424 0.000380 424 0.000265 0.982 2 410 0.000362 409 0.000260 0.984 To check the overall consistency of our solution we pro- 3 423 0.000337 424 0.000252 0.987 ceeded with a rather simple test. For every B value 4 422 0.000328 422 0.000249 0.988 (shifted by (cid:28) = 423 days) we looked for the nearest value 5 422 0.000328 423 0.000248 0.988 in the A curve and plotted B(cid:0)A against time. 6 422 0.000327 423 0.000248 0.988 In Fig.8 itcan be seen that forthe second part of the 7 422 0.000327 423 0.000248 0.988 8 422 0.000327 423 0.000248 0.988 observational sequence our solution works well, whereas 9 431 0.000324 431 0.000247 0.988 for the (cid:12)rst part the situation is quite unclear. First, the 9 424 0.000324 424 0.000248 0.988 dispersion of the observations is signi(cid:12)cantly higher than 10 410 0.000323 409 0.000247 0.988 for the second part and second there is a clearly visible 10 423 0.000323 423 0.000248 0.988 trend in the di(cid:11)erences. 10 431 0.000324 431 0.000247 0.988 To take into account this trend in the di(cid:11)erences be- tween the A and B curves (for the best shift estimate) we introduced a more complicated form for the ampli(cid:12)- cation ratio than l(t)=l0. We introduced simple polyno- There are a few peculiar results. For degrees 9 and 10 mials with degrees 0 to 10 as models for the hypothetical we listed several minima. The minima around 431 and 2 microlensing. The resulting spectra of D4;2 are shown in 410 days are only marginally deeper than the minima Fig.9. The minima and corresponding dispersions for dif- around 423 days and they are obviously due to over(cid:12)t- ferent degrees of polynomials are given in Table 1. In the ting. Note that we saw these (cid:13)uctuations already in the last column of the table we give the part 2 simplestspectrumD2.Thedegree2givesalsoaminimum D42;2;A;B(cid:0)D42;2;All at410dayswhichisoutsideoferrorbracketsforourmain (cid:23) =1(cid:0) 2 (14) result. D Total To seek an explanation for the additional features in ofthetotaldispersionwhichisexplainedbytheparticular the spectra computed with polynomial perturbations, we model.The total variabilityof the both lightcurves (esti- proceededwiththefollowingexperiment.Wedecomposed matedby taking into account weightsgivenby observers) both lightcurves into two by median (cid:12)ltering (see Ap- 2 is Dtot =0:00632. pendix). The (cid:12)ltered curves and the residual curves are J. Pelt etal.: Light curve andtime delay of QSO 0957+561 7 The nextconsistencycheck (alsoemployedinPaperI) 2 was to compute two versions of the D4;2 statistics { (cid:12)rst for all pairs included and the second with only A,B or B,A type pairs included (see Table 1). If our model and ourchoiceofthefreeparametersintheestimationscheme iscorrectthentheglobalminimaofbothdispersioncurves must have about the same value. From Fig.6 we can see 2 thattheresidualdispersionD4;2forthebestshiftwhenall pairs were included (0:000271) is quite near to the value of the mean dispersion of the estimated observational er- rors (0:000251). However, the minimumfor the A,B type spectrum (0:000396) is signi(cid:12)cantly higher. We decreased this value by applying polynomial corrections (up to de- gree 10) to around 0:000324, but still a signi(cid:12)cant part of the dispersion remained unexplained. Nonetheless, since the overall variabilitydispersion for both light curves to- Fig. 10.The (cid:12)ltered and residual light curves 2 gether is D = 0:00632 our models explain 98:0(cid:0)98:8% of the overall variability! In Paper I we used detrending for both light curves to get a self-consistent solution. With the new data the increasing of the correction polynomial degree led to a splitting of the main minimumfor degrees higher than 8. Consequently, more complicated models to describe the hypothetical microlensing can be introduced later in our analysis. At this stage it was importantto check the gen- eral stability of our time delay estimate against di(cid:11)erent choices of perturbing low frequency components whose possible existence is conjectured theoretically. The last point can be well illustrated by the simple plot in Fig.12. We combined the original A curve with the B curve (again shifted by 423 days). We then com- puteddi(cid:11)erencevaluesforA,Bpairswithdistanceintime not exceeding 20 days and (cid:12)tted standard cubic splines 2 with di(cid:11)erent numbers of knots (15(cid:1)(cid:1)(cid:1)21) into these dif- Fig. 11.The D4 spectra for (cid:12)ltered and residual light curves ferences. Depending on the degree of smoothing,di(cid:11)erent details in the (cid:13)ow of the di(cid:11)erences can be detected. The strongdepressioninthe(cid:12)rstpartofthedataspanwellex- shown in Fig.10. We were extremely conservative in our plainswhyitwasnotpossibletogettherightvalueforthe (cid:12)ltering process in that we used a short adaptive median V89 data without previous detrending. Other signi(cid:12)cant (cid:12)lter with a length of 5 points and a gap limit of 5 days. departures from the smooth(cid:13)ow of the di(cid:11)erences can be Consequently, the median (cid:12)ltering was applied only to used to explain the residual dispersion. At this stage of continuous fragments of the light curves, all single and the investigation we refrain from more quantitative and sparselyspaced pointswere left untouched. In Fig. 11the precise evaluation of these features. 2 D4;2 type spectra with (cid:14) = 5 are shown for the smooth and the residual components. The shorter value for (cid:14) was 3.5. Radio data chosentoshowthehighfrequency(cid:13)uctuationsinthespec- 2 tra. It is quite clear that the feature around 423 days is Because the D4;2 estimator gives more stable spectra we mainly due to the variability in the smooth component trieditalsoontheavailableradiodata(Leh(cid:19)aretal.1992) and that the features around 410 and 431 days are just for 0957+561. As for the optical data, we computed sev- 2 (cid:13)uctuations seen in the D4;2 spectrum for the residuals. eral trial spectra with known shifts to get the optimal Therefore we can attribute the exceptional shift values in value of (cid:14). We found that the value (cid:14) = 60 days was the Table1 to the interplay of the under- or over(cid:12)t and some best compromise between resolution and stability. From high frequency details in the original data set. This put Fig. 13 it can be seen that the spectra for the new statis- aside, it can be said that our main result is quite stable tic are smoother than that obtained in Paper I, but the against low frequency perturbations modeled by simple instability due to the two (probably outlying) points is polynomials. still there. It is enough to remove only one or two points 8 J. Pelt etal.: Light curve andtime delay of QSO 0957+561 The conclusion thus remains the same as in Paper I { theradiodatacannotbeusedtocomputetheexactvalue forthetimedelay.However,theyare compatible withour result of 423 days. 3.6. The composite optical light curve The obtained value for the time delay and our model for the hypothetical microlensing allows us to build a com- posite (intrinsic) light curve from both original curves. On Fig. 14 this curve is displayed. To build the compos- ite curve the following parameters were used: time delay 423 days, degree for correction polynomial 7, median (cid:12)l- ter length 7 days (to obtain a smoother representation), maximumgap length (cid:14) =20 days. Fig. 12.Spline approximations for the di(cid:11)erences B(cid:0)A 4. Discussion 4.1. Methodological Let us (cid:12)rst summarize the general methodological princi- ples we tried to follow when analysing the new extensive data set. { The originaldatasetwastreated‘asis’,i.e. nocensor- ing, detrending or smoothing was involved. { Wetriedtoavoid(asfaraspossible)anyinterpolations or approximations of the original data. { We accounted for the signi(cid:12)cant and fundamentaldif- ferencebetween thetwokindsofpairsof observations: the ones which contain observations from both light curves and the others where both of the values are from one and the same lightcurve. { We tried to avoid the introduction of free parameters 2 intoourschemes,andifwewereforcedtodootherwise, Fig. 13. The D4;2 ((cid:14) = 60 days) spectra for the radio data. we used always several values to see their e(cid:11)ect and Alldata,datawithoneandtwoo(cid:11)endingpoints excluded;the check the stability of our results against changes of removal ofeach point decreases the dispersion the parameters. { Weavoidedanyphysicalinterpretationduringthedata processing stage. This was facilitated by the fact that from the B curve and the minimumshifts signi(cid:12)cantly.If 2 oneoftheauthors(J.P.)isextremelycriticalaboutin- we look at the spectrum D4;2 which was computed with terpreting the small residual di(cid:11)erences as microlens- the probable outliers removed, we see the local minimum 2 ing events. at 421(cid:6)25 days (D4;2 = 0:000353, error estimate from { Finally we tried to avoid any mathematical models 1000 bootstrap runs). We used in Paper I the fact that to describe the light curves: no polynomial, Fourier, the global minimum for the dispersion can migrate from spline, wavelet, general random process, autoregres- 533 days to 409 days (after removal of only two points sive random process or whatever kind of models were from the B curve) as one of the arguments to stress the involved as models for the original data in di(cid:11)erent compatibilityof theradio datawithtimedelays inawide stages of analysis. The only principle we used was the range and not just around 536 days. Weignored the local simple recognition that both curves are statistically minimaat425,434and453daysanddidnottrytosmooth continuous, i.e. nearby observationsintimemusthave the dispersion curve to get more stable estimates. We see nearby values in brightness. that the smoothed estimator again favors a value around the best estimate found from the optical data. Still, we According to these general principles the results we ob- are not insisting that the radio data give strong support tained can be ranked. We start from the facts which to our best estimate of the delay value from the optical are methodologically \cleaner" and end with statements data. There are simply too few points in the published which are more open to further discussion and observa- radio data set to compute precise estimates. tional followup. J. Pelt etal.: Light curve andtime delay of QSO 0957+561 9 Fig. 14.The composite optical light curve of QSO 0957+561 2 { The depression in D ((cid:28)) around (cid:28) = 405 ::: 435 for the radio data. Similar results were obtained by days is the global minimum. In all of our cal- other researchers (Vanderriest et al.1989, Schild et culations with the ST94 data set we were not able al. 1990, 1993). Our new estimate is well inside the to see anything so prominent as the general depres- error bracketsof theprevious estimates,but neverthe- sionaround 405(cid:1)(cid:1)(cid:1)435 days.There were alot of other less we would like to comment on this speci(cid:12)cally. It minima (especially in the spectra with short decor- was interesting to obtain the value around 409 days relation length), but they were always less deep and (see Fig.11) in our computations with less smoothing more (cid:13)uctuation-like than the main minimum. The ((cid:14) = 5 days) or with a high degree correction polyno- featurearound536dayswhichwasadvocatedbyPress mial(seeTable1).Itshowsthatthepreviousestimates et al. 1992a is still there but much less apparent. arejustsmall(cid:13)uctuationsinthebottomofthegeneral { The best stable estimate for the time delay is depression around the delay of 423 days. These (cid:13)uc- 423(cid:6) 6 days. For (cid:28) = 423 days we are able to ex- tuations are due to the interplay of the small ‘bumps’ plainatleast98percentofthegeneralvariabilityofthe in the B lightcurve and sampling irregularities. De- twocurves. We useddi(cid:11)erent degrees (0(cid:1)(cid:1)(cid:1)10) forthe pending on themethod used (its degree of smoothing, perturbing polynomial l(t), analysed di(cid:11)erent random detrending etc.) these small features show themselves subsets of the data set (not all computations are ex- often as local (or sometimes even global) minima in plicitlydescribed inthis paper), useddi(cid:11)erent median the dispersion spectra. (cid:12)lterlengthsforgeneratingbootstrapsamplesetc.but were still left with the value 423 days. Even when we 4.2. Physical: Microlensing ignored the statistical weights (observational errors) given by observers and used unitary weights for all The slow trend and the \bump" in the di(cid:11)erence B (cid:0)A data we got the same results. (cf.Fig. 12) is most likely due to microlensing by stars in { There is a signi(cid:12)cant discrepancy between the the lensing galaxy. Since, however, the dispersion in the A and B light curves not fully described by light curves and consequently also in B(cid:0)A also shows a the simple lensing model. This fact is welldemon- strongtrend,therebydemonstratingtheprogressinobser- stratedinFig.12.Wemodi(cid:12)edourdispersionestimates vationaltechnique and experience, we can not completely to take into account smooth (with respect to time) rule out that the trend in B(cid:0)A is also a (learning curve di(cid:11)erences. Even after that, some small unexplained like)artefact of the decreasing observational errors. variations remained. Taken at face value the variability in the di(cid:11)erence { The previous delay estimates { 409 and 415 B (cid:0) A is in agreement with numerical simulations con- days { can be considered as di(cid:11)erent manifes- cerning its time scale and its brightness change (see, tations of the feature at 423 days. In Paper I we e.g., Kayser (1992), Wambsganss (1993), and references obtainedtwoestimatesforthemostprobable timede- therein). The \bump" can be interpreted as an high am- lay: 415(cid:6)32 for the optical data and 409(cid:6)23 days pli(cid:12)cation event in image A or as a deampli(cid:12)cation event due to overfocussing in image B (Kayser et al. 1986). 10 J. Pelt etal.: Light curve andtime delay of QSO 0957+561 4.3. Physical: Hubble parameter With our value for the time delay,the Hubble param- eter is already constrained to be smaller than 70 km/(s While one major uncertainty in the determination of the Mpc). The remaining uncertainty in the determination of HubbleparameterH0 fromthe0957+561lightcurveswas H0 is due to the uncertainty in the mass of the lensing up to now the controversy on the exact value of the time galaxy. delay, with our result the remaining uncertainty is due to the mass model of the lensing galaxy and cluster. The Acknowledgements. We are especially grateful to Rudy Schild mass distribution for this particular object is rather com- and David Thomson for providing us with their optical data plicatedsincethemainlensinggalaxy(atz =0:36)issitu- set prior to publication. Further thanks to P.Helbig for read- atedclosetothecentreofarichgalaxycluster.Bernstein, ing the manuscript and helpful discussions. Part of this work Tyson & Kochanek (1993, hereafter BTK) reported the wassupportedbytheDeutsche Forschungsgemeinschaft,DFG detection of some arcs 2000 away from the lensing galaxy. under 436EST and Schr 417. They concludedthat these arcs can onlybe explainedif a 0 group of galaxies at z = 0:5 and 1:5 away from the lens- A. Appendix. Adaptive median (cid:12)ltering inggalaxycontributes signi(cid:12)cantlytothelightde(cid:13)ection, therebycomplicatingthelensmodelsevenmore.However, Let f(ti);i = 1;:::;N be a time series, M the (cid:12)lter Dahle,Maddox&Lilje(1994,hereafterDML)showedthat length (M (cid:28) N) and (cid:14) the maximum distance between the arcs found by BTK are simple chance alignments of the two observations ((cid:14) (cid:28) tn (cid:0)t1) to be considered as separate objects. Using the method of Kaiser & Squires nearby. Then for every observation f(ti) its (cid:12)ltered value (1993) DML determined the mass distribution and espe- f^(ti) is de(cid:12)ned as the median value of the longest subse- ciallythecentreofthegalaxyclusterfromweaklensingof quence f(ti(cid:0)n);n = (cid:0)L=2;:::;L=2; L (cid:20) M of the orig- background galaxies around QSO 0957+561 and thereby inal data set which is centered around the point under re(cid:12)ned the model of Kochanek (1991). This model basi- discussion and which does not contain gaps longer than cally uses an elliptical pseudo-isothermal cluster and an (cid:14). In this way single observations remain untouched and elliptical isothermal galaxy. groups of nearby observations with less than M observa- With our new result for the time delay (cid:28) = 423 days tions in them will be (cid:12)ltered with shorter (cid:12)lter lengths. we obtain, using the Kochanek-DML model, This de(cid:12)nition allows us to (cid:12)lter the whole sequence in the best achievable way. km km 14 <H0 <67 : sMpc sMpc References The remaining uncertainty is due to uncertainty in the BernsteinG.M.,TysonJ.A.,KochanekC.S.,1993,AJ105,816 massofthelensinggalaxy.Despitethisthedetermination (BTK) of the time delay for QSO 0957+561 is now already con- Borgeest U., 1986, ApJ 309, 467 strainingH0 tobe smaller than 70km/(sMpc) with high Borgeest U., 1984, A&A 141, 318 statistical signi(cid:12)cance. Dahle H., Maddox S.J., Lilje P., 1994, ApJ 435, L79 (DML) Falco E.E., 1992, Lect. Not.Phys. 406, 50 Falco E.E., Gorenstein M.V., ShapiroI.I., 1991, ApJ 372, 364 5. Conclusions KayserR., 1990, ApJ 357, 309 KayserR., 1992, Lect.Not.Phys. 406, 143 Using the dispersion estimation technique developed in KayserR., Refsdal S., Stabell R.,1986, A&A 166, 36 Paper I in its simplest non-parametric form we were able Kochanek C.S., 1991, ApJ 382, 58 toshowonthebasisofthepresentlyavailablephotometric Leh(cid:19)ar J. Hewitt J. N., Roberts D. H., Burke B. F., 1992, ApJ datathatthetimedelaybetweentheimagesofthedouble 384, 453 quasar0957+561isinthevicinityof420daysratherthan Pelt J., Ho(cid:11) W., Kayser R., Refsdal S., Schramm T., 1994, 536 days. A&A 256, 775 (Paper I) Usingre(cid:12)nementstoourmethodsinordertominimise PressW.H., Rybicki G.B., Hewitt J.N.,1992a, ApJ385, 404 noise due to observational errors and sampling problems PressW.H., Rybicki G.B., Hewitt J.N.,1992b, ApJ385, 416 we then derived the (cid:12)rst accurate and statistical reliable Refsdal S., 1964a,MNRAS 128, 295 value for the time delay, namely 423(cid:6)6 days. Refsdal S., 1964b,MNRAS 128, 307 Schild R.E., 1990, Lect.Not. Phys. 360, 102 The remaining discrepancies between the light curves Schild R.E., Thomson D.J., 1993, in: Gravitational Lenses in of the images A and B can most naturally be explained The Universe,Proceedings of the 31th Li(cid:18)ege International by microlensing due to stars in the lensing galaxy. Using Astrophysical Colloquium, Eds.: J.Surdej et al., p.415 simplepolynomial (cid:12)tswe were ableto explain 98% of the Schild R.E., Thomson D.J., 1994, preprint (ST94), submitted variabilityof 0957+561 by our model. to AJ We show again, as in Paper I, that the available radio Vanderriest C., Schneider J., Herpe G., Chevreton M., Moles dataarecompatible withour resultobtainedfrom optical M.,Wl(cid:19)erick G., 1989, A&A 215, 1 (V89) data. WalshD.,CarswellR.F.,WeymannR.J,1979,Nature279,381