Wilson, J., Barker, S., & Ridgwell, A. J. (2012). Assessment of the spatial variability in particulate organic matter and mineral sinking fluxes in the ocean interior: Implications for the ballast hypothesis. Global Biogeochemical Cycles, 26(04), [GB4011]. https://doi.org/10.1029/2012GB004398 Publisher's PDF, also known as Version of record Link to published version (if available): 10.1029/2012GB004398 Link to publication record in Explore Bristol Research PDF-document Copyright © 2012 American Geophysical Union University of Bristol - Explore Bristol Research General rights This document is made available in accordance with publisher policies. Please cite only the published version using the reference above. Full terms of use are available: http://www.bristol.ac.uk/red/research-policy/pure/user-guides/ebr-terms/ GLOBALBIOGEOCHEMICALCYCLES,VOL.26,GB4011,doi:10.1029/2012GB004398,2012 Assessment of the spatial variability in particulate organic matter and mineral sinking fluxes in the ocean interior: Implications for the ballast hypothesis J. D. Wilson,1 S. Barker,1 and A. Ridgwell2 Received30April2012;revised10August2012;accepted16September2012;published2November2012. [1] Multiple linear regression analysis (MLRA) applied to sediment trap data has been highly influential in identifying a plausible ‘ballasting’ mechanism that directly links the settling fluxes of particulate organic carbon (POC) to those of denser, inorganic minerals. However, analysis to date has primarily been carried out at the global scale, missing spatial variability in the flux relationships that may be important. In this paper, Geographically Weighted Regression (GWR) is applied to an updated deep (>1500 m) sediment trap database (n = 156), using the MLRA approach of Klaas and Archer (2002) but now allowing the carrying coefficients to vary in space. While the global mean carrying coefficient values for CaCO , opal, and lithogenics are broadly consistent 3 with previous work, the GWR analysis reveals the existence of substantial and statistically significant spatial variability in all three carrying coefficients. In particular, the absence of a strong globally uniform relationship between CaCO and POC in our 3 spatial analysis calls into question whether a simple ballasting mechanism exists. Instead, the existence of coherent spatial patterns in carrying coefficients, which are reminiscent of biogeochemical provinces, points toward differences in specific pelagic ecosystem characteristics as being the likely underlying cause of the flux relationships sampled by sediment traps. Our findings present a challenge to ocean carbon cycle modelers who to date have applied a single statistical global relationship in their carbon flux parameterizations when considering mineral ballasting, and provide a further clue as to how the efficiency of the biological pump in the modern ocean is regulated. Citation: Wilson,J.D.,S.Barker,andA.Ridgwell(2012),Assessmentofthespatialvariabilityinparticulateorganicmatterand mineralsinkingfluxesintheoceaninterior:Implicationsfortheballasthypothesis,GlobalBiogeochem.Cycles,26,GB4011, doi:10.1029/2012GB004398. 1. Introduction 2007; Honjo et al., 2008]. Understanding the processes that control the efficiency of the biological pump in trans- [2] Sinking particles transfer particular organic carbon portingcarbonandnutrientstodepthiskeytounderstanding (POC) and associated nutrients from the upper ocean to the how the marine carbon cycle functions and regulates atmo- deep ocean and sediments in a process known as the bio- spheric carbon dioxide (CO ) (e.g. Archer and Maier- logical pump [Honjo et al., 2008]. These particles, ulti- 2 Reimer [1994]). mately derived from the growth of phytoplankton at the [3] The ratio of particulate inorganic to organic carbon sunlit surface and carbon fixation through photosynthesis, (PIC:POC) within sinking particles is known as the ‘rain are also often associated with biominerals such as biogenic ratio’ and is important in communicating changes at the silica (opal) and calcium carbonate (CaCO ). As these par- 3 surface to the deep ocean and sediments. For instance, on ticlessink,themajorityofPOCandassociatednutrientsare time-scalesofafewthousandyears,areductionintheexport remineralized (predominantly) by bacterial metabolic pro- rainratioof40%,ifcommunicatedtothesedimentscould,in cessesandzooplanktonfluxfeedingintheupper(cid:1)1000m, theory,leadtoa70–90ppmdrawdownofatmosphericCO leaving a small (5–10%) fraction (relative to that at 100 m) 2 through the increased dissolution of carbonate sediments sinking to depth [Stemmann et al., 2004; Loubere et al., [Archer and Maier-Reimer, 1994]. However, studies based on the analysis of deep sediment trap data have observed a 1SchoolofEarthandOceanSciences,CardiffUniversity,Cardiff,UK. strong global correlation between mass fluxes of POC and 2SchoolofGeographicalSciences,UniversityofBristol,Bristol,UK. CaCO , suggesting some mechanism of coupling exists 3 Correspondingauthor:J.D.Wilson,SchoolofEarthandOcean between these important parameters at depth [Armstrong Sciences,CardiffUniversity,MainBuilding,ParkPlace,CardiffCF103AT, etal.,2001].The‘ballasthypothesis’positsthatCaCO ,and 3 UK.([email protected]) toalesserextentopalandlithogenicmaterial,aidsthesink- ©2012.AmericanGeophysicalUnion.AllRightsReserved. ing of particles through increasing mean aggregate density 0886-6236/12/2012GB004398 GB4011 1of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 [Armstrong et al., 2001; Klaas and Archer, 2002]. If true, [7] SpatialvariabilityintherelationshipbetweenPOCand this would have the effect of buffering changes in the mineralswasnotedinsedimenttrapdatabyRagueneauetal. rain ratio originating at the surface and reducing the [2006]and De LaRocha andPassow[2007]whosuggested potential for altering atmospheric CO [Ridgwell, 2003]. global MLRA was, therefore, inappropriate and may have 2 In the context of ocean acidification and the potential for misleadinglyresultedinthelowcarryingcoefficientsobtained decreased pelagic calcification driven by falling surface foropalandlithogenics.BoydandTrull[2007]alsonotethat ocean carbonate saturation, ballasting creates a positive using globalannualmean fluxesignoresa large partofvari- feedback to CO by reducing the efficiency of the bio- ability resulting from processes like El Niño as well as the 2 logical pump [Barker et al., 2003; Heinze, 2004; Riebesell biogeochemicalsourcesofthefluxes.Apreviousbasin-scale et al., 2009]. analysis(Table1)showedconsiderableregionalvariabilityin [4] Particles have been shown to sink faster under labo- the dominance of one mineral over another. Global MLRA ratoryconditionsduetotherelativelyhighdensityofCaCO maythenbehidingimportantregionalvariabilitywhichhas 3 [Ploug et al., 2008; Engel et al., 2009; Iversen and Ploug, implications for how the ballast mechanism is interpreted 2010], supporting the statistically-based ballast hypothesis. and particularly for how it is mechanistically implemented However, other lines of evidence point to alternative inter- inglobalmodels. pretations.Forinstance,rollingtankexperimentsshowedthe [8] To date there has been no general assessment of the POC:mineral ratio was dependent on the amount of POC spatial variability of the carrying coefficients of ballast present acting as a ‘glue’, aggregating mineral fluxes minerals.Thisisimportantforunderstandingthepreviously [Passow, 2004; Passow and De La Rocha, 2006; De La observed differences between global coefficients and those Rocha et al., 2008] and controlling the sinking of CaCO seenfromindividualsites(Table1).Ragueneauetal.[2006] 3 rather than vice versa. Alternatively, variations in surface took the first step in this respect and applied MLRA to ecosystem composition might dictate the packaging and sedimenttrapdatadividedbymajoroceanbasin.Thisbroad remineralizationofparticles[Francoisetal.,2002;Lamand delineationwasreasonablyjustifiedbutfurtherreductionof Bishop, 2007; Lam et al., 2011] and give rise to the flux the spatial scale poses particular problems. Smaller spatial relationships observed at greater depth. These alternative groupings for regression could be justified, such as biogeo- explanations significantly challenge our mechanistic under- chemical provinces (see Vichi et al. [2011]) but this intro- standingofthedynamicsofPOCfluxesandcreatesubstan- duces a level of subjectivity, as well as problems with the tial uncertainty in both the magnitude and sign of carbon relatively sparse sampling coverage of sediment trap data cycle feedbacks to possible future perturbations [Barker sets compared to the number of biogeographical provinces. et al., 2003; Riebesell et al., 2009]. In response to this we describe the novel application of [5] TheglobalsedimenttrapanalysisofKlaasandArcher Geographically Weighted Regression (GWR), that allows [2002] has been highly influential in quantifying the corre- coefficients to vary in space and helps avoid the problems lation between POC and CaCO and helping to formulate stated above. The data set and technique are described in 3 the ballasting hypothesis. In that study, the mass flux of section2andappliedusingthecarryingcoefficientapproach POC was expressed as a linear function of three dominant ofKlaasandArcher[2002]toexplorethespatialvariability mineralfluxes(CaCO ,opal andlithogenic material),using ofthe thesestatistical parameters. 3 multiple linear regression analysis (MLRA). The derived ‘carrying coefficients’ (the regression coefficients) were 2. Methodology largest for CaCO (0.070–0.094), lowest for opal (0.023– 3 0.030), and rather variable for lithogenics (0.035–0.071) 2.1. SedimentTrap Data [Klaas and Archer, 2002] with the resulting statistical [9] An updated global sediment trap data set has been models able to explain a large proportion of the observed collatedforthisstudy.Thefluxandmetadatafromsediment variabilityinPOCflux.Thethreemineralmodelprovidesa trapsareavailableintheauxiliarymaterial.1Themajorityof basisforunderstandingglobalvariabilityinPOCtomineral the data set used here is from the U.S. Joint Global Ocean ratiosandcanreplacethisterminthemechanisticmodelof Flux Study (JGOFS) available online via: http://usjgofs. Armstrong et al. [2001], making this a useful method for whoi.edu/mzweb/syn-mod.htm. A full description of the parameterizing particle fluxes in a range of ocean carbon JGOFS data set and methodologies can be found in Honjo cycle models [Howard et al., 2006; Oka et al., 2008; et al. [2008]. Other data sets were obtained from addi- HofmannandSchellnhuber, 2009]. tional studies and the World Data Centre for Environ- [6] Theunderlyingassumptionwhenanalyzingtheglobal mental Sciences (WDC-MARE) online database. database in this way is that the statistical relationships (the [10] Sediment traps at relatively shallow depths (approx. coefficient values) are the same for any location in the <1000–1500 m) have been shown to be inefficient at trap- ocean, i.e. they are assumed stationary in space. The use of ping particulate material [Scholten et al., 2001; Yu et al., theseglobalstatisticalrelationshipsinmodelsthenexplicitly 2001]. Forthisreason,andtobeconsistentwiththebulkof makes this same assumption. However, it is reasonable to previous work, only flux data at >1500 m were primarily expectthattheserelationships maynotbeconstant inspace consideredhere,althoughadditionalanalysiswitha>1000m (ortime)i.e.theymayexhibitspatialnonstationarity,which cut-off depth was carried out to enable comparison to can be characterized by a non-random distribution of resi- Ragueneau et al. [2006]. Selected data includes only sedi- dualsinspace [Fotheringham etal., 1998].Thispotentially menttrapsthatsampledoveraminimumperiodof320days raises issues for the interpretation of global regression coefficients and their explicit use as a parameterization in 1Auxiliary materials are available in the HTML. doi:10.1029/ modelingstudies. 2012GB004398. 2of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 Table1. CarryingCoefficients(i.e.,KlaasandArcher[2002])DerivedFromPreviousMultipleLinearRegressionAnalysesAppliedtoa RangeofGlobal,RegionalandTimeSeriesSedimentTrapDatasetsa GlobalAnnualAverage CaCO Opal Lithogenicb R2 3 KlaasandArcher[2002]c n=78 0.075 0.029 0.052 – >2000m 0.064–0.086 0.020–0.037 0.034–0.070 Francoisetal.[2002]c n=62 0.074 0.015 0.074 0.93 >2000m 0.064–0.084 0.08–0.022 0.051–0.097 Ragueneauetal.[2006] n=189 0.081 0.031 0.035 0.89 >1000m 0.073–0.089 0.023–0.039 0.029–0.041 RegionalAnnualAverage Ragueneauetal.[2006] n=84 0.077 0.171 0.031 0.87 (Atlantic) >1000m 0.053–0.101 0.047–0.186 0.023–0.039 Ragueneauetal.[2006] n=16 0.026d 0.201 0.015d 0.96 (Indian) >1000m (cid:4)0.007–0.057 0.123–0.279 (cid:4)0.079–0.049 Ragueneauetal.[2006] n-89 0.063 0.041 0.024 0.95 (Pacific) >1000m 0.055–0.071 0.035–0.046 0.018–0.030 RegionalTimeseries Wongetal.[1999]c 1982–1993 0.021 0.013 0.0233 0.69 (OceanStationP) 3800m 0.002–0.039 0.008–0.034 0.170–0.297 Conteetal.[2001]c 1978–1984 0.045 0.063 0.065 0.98 (BermudaSCIFF) 3200m 0.038–0.053 0.024–0.102 0.034–0.096 HondaandWatanabe[2010]e 1998–2006 0.025 0.044 (cid:4)0.006 0.92 (W.PacificSubarcticGyre) 4810 n/a n/a n/a aRangesgivenindicate95%rangeofcarryingcoefficients(2(cid:5)standarderror). bAlllithogenicmaterialisestimatedasTotalMass(cid:4)(POC*POMConversionfactor)+CaCO +Opal,exceptHondaandWatanabe[2010]whichwas 3 derivedfromAlmeasurements. cAdaptedfromTable3ainBoydandTrull[2007]. dValuesfrommultipleregressionanalysiswereinsignificantatp>0.05. eDatawerenormalizedtoaverageofeachtimeseriescomponentbeforeregressionanalysis. to maximize the quantity of data and its spatial coverage function of the fluxes of CaCO3 (FCaCO3), Opal (FOpal) and while retaining a reasonable annual coverage. Finally, data lithogenic material (F )atdepth (z): litho were excluded if any observations were missing for major F ðzÞ¼b þb (cid:3)F ðzÞþb (cid:3)F ðzÞþb components(totalmassflux,POC,PIC,biogenicsilica).The POC 0 CaCO3 CaCO3 Opal Opal litho massfluxesofCaCO andopalwereestimatedfromPICand (cid:3)F ðzÞ ð1Þ 3 litho biogenic Si using conversion factors of 8.33 and 2.14 respectively [Mortlock and Froelich, 1989]. The flux of Previous analyses assumed that the regression passed lithogenicmaterialwasestimatedastheremainingfractionof through the origin [Ragueneau et al., 2006] requiring that total mass flux when CaCO , opal and particulate organic the flux of POC must be zero when the flux of minerals is 3 matter are subtracted. In a small number of cases this pro- zero.Weincludeanadditionalinterceptterm(b )sothatthe 0 duced negative values for lithogenic flux, which were then analysis is amenable and directly comparable to the geo- treatedaszero(asinSalteretal.[2010]). graphically weighted technique we employ (Section 2.2.2) [11] The resulting data set comprises 156 individual sedi- and also to create a more general model in which it is pos- menttrapobservationswhichincludedataonPOC,PIC(as siblethattherecouldadditionalPOCnotdirectlyassociated CaCO3), biogenic silica (as opal), lithogenic and total mass with themineral flux. fluxes. The data set includes observations from 25 biogeo- 2.2.2. Geographically WeightedRegression Model chemical provinces. In comparison to the previous global [13] Geographically weighted regression is a relatively data sets [Francois et al., 2002; Klaas and Archer, 2002], novel but simple technique of regression which allows the this data set is larger in size (n = 156 c.f. n = 62–78) and estimation of local statistical parameters (for a full descrip- provides greater spatial coverage, particularly for the tion see Fotheringham et al. [2002], also Brunsdon et al. southernhemisphere.Thedatasetisofacomparablesizeto [1998] and Fotheringham et al. [1998]). The global multi- a recent sediment trap collation [Honjo et al., 2008]. How- plelinear regression modelcan beconsidered as: ever, because approximately half of the data set is in com- X monwithpreviousanalyses,wewouldnotexpecttheresults yi ¼a0þ akxikþ(cid:1)i; ð2Þ of our global analysis to be substantially different from k previous studies. where k predictors are used to predict y at the i point in th 2.2. Regression Analysis space. The global model can be re-written to estimate local parameters as: 2.2.1. Global Regression Model X us[e1d2]inHeKrelawaes aapnpdlyAthrcehmerul[ti2p0l0e2l]in.eTarheregbraessisciornegarneaslsyisoins yi ¼a0ðui;viÞþ akðui;viÞxikþ(cid:1)i; ð3Þ k analysis expresses the flux of POC at depth (F ) as a POC 3of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 where (u, v) denotes the coordinates of the ith point in to assess the significance of the spatial variability of GWR i i space. a (u, v) isthen a realization of the continuous func- coefficients. Under thenull hypothesis,any randompermu- k i i tion a (u, v) at point i [Fotheringham et al., 1998]. The tation of thedata isequallylikely tooccur. k global model can be seen as a case where the parameter 2.2.3. Application ofGWRto SedimentTrap Data surfaceisconsideredconstantinspace.GWRapproximates [17] We suggest that geographically weighted regression the above equation by selecting a subset of data around i provides a rigorous approach to assessing the spatial vari- which is weighted according to their distance from i. This ability of carrying coefficients. Although there has been assumes that parameters display a degree of spatial consis- extensive work on extending the predictive use of GWR tency suchthat parameters become increasingly different as [Harrisetal.,2011;KumarandLal,2011],GWRisapplied distance increases fromi. here as an exploratory tool. We apply GWR to our updated [14] The weighting function (kernal) used in GWR can sediment trap data set using software (GWR 3.0) kindly takedifferentforms.Thesimplestapproachappliesaweight provided by M. Charlton of the National Centre for Geo- of 1 to all data within a set distance (d) of i and 0 to data computation at National University of Ireland Maynooth. outsideofthisarea.Thisfunctioncreatesartificialboundaries An adaptive kernel (defining a subset of data by number of and could create artifacts in the patterns of parameters esti- nearestneighbors)waschosentogroupdatabecauseafixed mated. An alternative is to use a continuous weighting kernel(definingasubsetofdatastrictlybydistance)failedto functionsuchasinexponentialformorabi-squarefunction produceresultsduetothesparsesamplingdensityinspace. that approximates a Gaussian weighting within the band- A fixed bandwidth could not account for the data points in widthandzerobeyondthis(4): relative isolation such as those in the Southern Ocean. The h (cid:2) (cid:3) i AICc minimization technique was used to find the optimal w ¼ 1(cid:4) d =b 2 2 if d <b¼0 otherwise; ð4Þ bandwidth. Data were weighted using bi-square function. ij ij ij AMonteCarlosignificancetestwasusedtodeterminewhether regressioncoefficientswerespatiallyvariable[Hope,1968]. where the weight applied (w ) is a function of distance (d) ij betweeniandjandaparameterreferredtoasthebandwidth (b), defining the total distance of the subset of data 3. Results [Fotheringham et al., 2002]. The bandwidth can be defined 3.1. Comparisonof Global Regression Model Results either as a fixed distance (fixed kernel) or as a number of 3.1.1. Global Regression Analysis nearest neighbors (adaptive kernel) in the data set and is a global parameter. The latter definition allows the weighting [18] The sediment trap data set shows similar global rela- tionshipstothoseobtainedpreviouslywithsmallerdatasets functiontorespond tochangesinsamplingdensity wherea (Figure 1). The correlation between POC and CaCO is fixed kernel may be unsuitable toapply. In both definitions 3 theweightingfunctiondefinesa‘bumpofinfluence’around strongest(r=0.60),whilethecorrelationbetweenPOCand opal is weaker (r = 0.35), with lithogenic fluxes being dataatpointi. intermediate (r = 0.45). Visually, the association of POC [15] ThesizeofthebandwidthisacriticalfactorinGWR with both opal and lithogenic material (Figure 1b) suggests asitdefinestheareaofinfluenceofeachregression.Alarger thepresenceoftwoseparatedistributions,oneofhighPOC bandwidthwillloseinformationaboutthespatialvariability flux and low mineral flux and the second of low POC flux in coefficients and bias the results toward the global withhighmineralflux(thiswasoriginallynotedforopalby regression. If regression coefficients are considered to be a Klaas and Archer [2002]). The scatterplot for CaCO also continuousfieldinspace,theneachdatapointinthesubset 3 displays more variability than previously recognized of data will have a unique coefficient value, but defining a (Figure 1b). All scatterplots display regional differences subset of data around each point forces the coefficient to a when separated into ocean basins as previously noted by common (essentially an average) value for that subset. Ragueneau et al. [2006] and De La Rocha and Passow Therefore, coefficients can never be completely unbiased [2007]. The global POC:mineral ratio is 0.052, close to the because there is always a level of spatial averaging. To ratio observedbyArmstrong etal.[2001]. minimizethebiasofcoefficients,asmallsubsetofdataclose to i is preferable although this increases the variance and [19] Multiple linear regression is used on the global data set to express the flux of POC as a summed linear function standarderroroftheestimate.Thereis,therefore,atrade-off of mineral fluxes, and, as observed previously, suggests a betweenincreasedvarianceatsmallsubsetsandbiastoward dominantroleforCaCO (Table2).Theresultingregression the global coefficients at larger subsets. To address this 3 model is significant at p < 0.001 and predicts 66% of the issue, the bandwidth can be calibrated using the cross vali- variability in POC fluxes (R2 = 0.66). The carrying coeffi- dation score (CV) or the corrected Akaike Information cient for CaCO is close to, although very slightly higher Criterion(AICc:seeAkaike[1974]).Thesevaluesexpressthe 3 than,previousestimates(Table1).Thecoefficientforopalis overall performance of regression models and can take into also consistent with previous estimates while the lithogenic account the bias-variance trade-off, providing an estimate of coefficient shows the most variability between studies. The thebestbandwidthtouse[Fotheringhametal.,1998]. lithogenic estimate here is much lower than that found by [16] A number of statistical tests have been defined to Klaas and Archer [2002]. Both our estimate and that of allow the assessment of GWR models against global Ragueneauetal.[2006]arederivedfromsignificantlylarger regression models (for full details, see Fotheringham et al. data sets (n = 156–189 c.f. n = 62–78) suggesting that this [2002]). These include Analysis of Variance (ANOVA), value could bemore globally representative. testing the null hypothesis that the GWR model represents noimprovementontheglobalmodel,andaMonteCarlotest [20] The R2 value is relatively low in comparison to the globalstudiesinTable1.Thisisduetotheinclusionofthe 4of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 Figure1. (a)Locationsofglobalsedimenttrapsat>1500m(dots)andadditionaldataat>1000m(open circle). Data locations are given in the context of biogeochemical provinces as per Longhurst [1998]. (b) Global annual mass fluxes of particulate organic carbon versus global annual mass fluxes of (left) CaCO , (middle) opal and (right) lithogenic material as measured by sediment traps >1500 depth. Flux 3 data from different ocean basins are indicated by symbols to highlight regional differences (adapted from Ragueneau et al. [2006]). Table2. Carrying Coefficients Calculated Using Multiple Linear Regression With Mass Flux Data for Different Depth Ranges andSpatialScalesa GlobalData CaCO Opal Lithogenic R2 3 >1500m(n=156) 0.089 0.023 0.027 0.66 0.076–0.102 0.012–0.034 0.017–0.037 >1000m(n=186) 0.080 0.017 0.033 0.65 0.068–0.097 0.007–0.028 0.027–0.039 RegionalData(>1500m) Atlantic(n=54) 0.083 0.152a 0.027NS 0.58 0.047–0.118 0.028–0.276 (cid:4)0.007–0.060 Indian(n=25) 0.083 0.058a 0.058 0.94 0.058–0.108 0.003–0.113 0.034–0.083 Pacific(n=63) 0.056 0.033 0.022 0.80 0.041–0.071 0.025–0.041 0.015–0.028 SouthernOcean(n=12) 0.183NS (cid:4)0.022NS 0.034NS 0.37NS (cid:4)0.065–0.431 (cid:4)0.139–0.095 (cid:4)0.050–0.117 aAllcoefficientsaresignificantatp<0.001exceptawherep<0.05.Ninety-fivepercentconfidenceintervalsaregivenas2(cid:5)standarderror.Notethat themodelfortheSouthernOceanisnotsignificant. 5of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 Figure 2. The spatial distribution of residuals from (a) global multiple linear regression and (b) geo- graphicallyweightedregressionanalysis.FluxesofPOCarepredictedusingmassfluxesofCaCO ,opal 3 and lithogenic material as in [Klaas and Archer, 2002]. The models predict (cid:1)66% and (cid:1)82%, respec- tively, of thevariability inobservedPOCfluxes. intercept term in equation (1). Repeating the analysis with- geographically weighted regression analysis we include the outtheintercepttermincreasesR2to0.89,avalueconsistent intercept, justified by the fact the residual mean squares are with previous studies (Table 1). (The large difference very similar and that the derived carrying coefficients only betweenthesevaluesisduetothecalculationofR2fromthe differbyafactorof0.0001.Note,therefore,thatR2valuesin total sum of squares which is uncorrected without an inter- thispaperarenotdirectlycomparabletopreviousstudies. cept, leading to inflated values of R2 [Montgomery et al., [21] A regional breakdown of the data to an ocean basin 2006].) To be consistent and directly comparable with scale is also shown in Table 2 and is comparable to that by 6of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 Table3. SummaryStatisticsoftheGWRModelsforSedimentTrapDataat>1500ma Bandwidth AIC R2 ANOVASig. CaCO Opal Lithogenic 3 66(optimal) 269.82 0.82 <0.001 0.066*** 0.033*** 0.022*** 20 318.04 0.90 <0.001 0.056 0.040*** 0.026*** 156 313.41 0.71 <0.001 0.089*** 0.025 0.026 aAbandwidthof66istheoptimalvaluechosenfromtheAICccalibration.Additionalbandwidthsof20and156areprovidedto explore the sensitivity of the outcomes to bandwidth. ANOVA statistics test the null hypothesis that the GWR model is no improvement on the global model. ‘Global’ carrying coefficients are the median of the 156 coefficients. Monte Carlo test for spatialvariabilityrejectsthenullhypothesisat*p<0.05,**p<0.01,***p<0.001. Ragueneau et al. [2006] as summarized in Table 1. The issignificantspatialvariabilityintheregressioncoefficients. carrying coefficient for opal in the Atlantic displays vari- A visual comparison of the residuals from the GWR ability compared to other basins and is quite distinct from (Figure 2b) against the global regression residuals theglobalvalue(0.152c.f.0.023).InthePacificandIndian (Figure 2a) indicates greater heterogeneity in areas previ- basins, our spatial coefficients are more consistent with the ously characterized by clustered residuals, such as the globalvalues.TheIndianbasinvaluesdiffer tothosefound equatorial Pacific, Atlantic and the Indian Ocean. These inTable1thereasonforwhichisunclear.Therelevantdata metrics suggest that the use of a spatially informed regres- setinTable1ismuchsmaller(n=16c.f.n=25)anditmay sion technique isjustified here. be that these coefficients are more influenced by outliers in [25] GWR calculates regression coefficients and other theregression.Inparticular,itisalsoworthconsideringthat statistics at each data point, allowing them to be mapped the Pacific basin is the largest in size, and potentially (Figure 3). The results show distinct regional groupings of includes multiple sources of variability unlike the smaller thecoefficientswithminimalvariabilitywithinthesegroups. Indian basin (i.e. as indicated by the number of biogeo- AnexceptiontothisarethecoefficientsforopalintheNorth chemical provinces in each). Therefore, it is uncertain Atlantic, whichshowa rangeof valuesinarelatively small whether the similarity of the Pacific coefficients to global area with no identifiable spatial trend. In this region, the values may be a product of averaging the potentially large bandwidthof66nearestneighborsthatdefinesthesubsetsof spatialvariabilityinfluxesorisactuallyrepresentativeofthe data is relatively large in comparison to the area (see values in this area. Table 2 also includes analysis of data Figure 6a) and is therefore influenced by significantly dif- from the Southern Ocean but the resulting model is not ferent values in the Arctic and equatorial Atlantic. This significant. This highlights the difficulties in conducting highlightsaparticularissueofusingGWRwiththisdataset: regional regression even at the scale of ocean basins. Mak- theanalysisislimitedbytherelativelysmallnumberofdata ing comparisons between areas like this is problematic pointscomparedtotheareasampled,suchthatareasoflow because of varying sample sizes and a lack of consistent sampling density may be produce spurious results. Data statistical methodology. points at the edge of ocean basins or in sparsely sampled [22] An alternative method of assessing whether regional areas, suchastheSouthernOcean,will alsotend toinclude variabilityexistsintheglobaldatasetistomaptheresiduals data from other basins. The inclusion of data from other of the global regression model (Figure 2a). An assumption ocean basins is an important caveat for this analysis and of regression is that the residuals should have a random limits the following section to the discussion of large-scale distributionaroundzero.Extendingthislogic,ifcoefficients spatial patterns in coefficients. We hereafter refer to this are truly global they should also exhibit residuals that are caveat asinter-basin influence. randomly distributed in space [Fotheringham et al., 2002]. [26] To explore the sensitivity of coefficients to the However, we note here that negative residuals appear to bandwidth and inter-basin influence, the bandwidth was clusterinthelow-latitudeAtlanticandPacificaswellasthe manually changed to 20 and 156 nearest neighbors in com- western sub-arctic Pacific whereas positiveresiduals cluster parison to the calibrated 66 neighbors (Table 3), and the in the Arabian Gulf and Indian Ocean. This supports the resultsfromthemostvariablemineral,CaCO ,wereplotted 3 contention that there is potential non-stationarity in the (Figures4aand4b).Abandwidthof156isusedtoassessif coefficients whicharenottruly global. the GWR technique can recover MLRA global coefficient values.Althoughthiswillincludealldatapointsinthedata 3.2. GeographicallyWeighted Regression set they will still be weighted by distance. Figure 4b shows [23] Geographically weighted regression analysis was thatmostofthecoefficientsconvergetowardaglobalvalue appliedtotheglobalsedimenttrapdatasetatdepths>1500m. around 0.09. The equatorial Pacific is the only area not to TheAICccalibrationwasusedtodetermineanoptimalband- conformtotheglobalvalue.Thisislikelytobearesultofa widthof66nearestneighbors. combination of densely sampled data being weighted 3.2.1. Assessing thePerformance of GWR heavily but in relative isolation to other data points. The [24] A reduction in the AIC score from the global model higher AIC score suggests this model has less ability to score of 341.6 and an increase in R2 from 0.66 to 0.82 predictPOCthanthecalibratedoptimalmodel,whichisalso suggest that GWR is an improvement on the global model suggested by the lower R2 value (0.71) although it is still (Table3).TheresultsoftheANOVAstatisticsshowthatthe higherthantheglobalmeanMLRAmodel(0.66)(Table1). GWR model is a significant improvement on the global With a much smaller bandwidth of 20, the R2 value increa- modelwhiletheresultsoftheMonteCarlotestsuggestthere ses but the higher AIC score indicates other aspects of the 7of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 Figure 3. The spatial distribution of coefficients calculated using geographically weighted regression analysisfor(a)CaCO ,(b)opal,and(c)lithogenicmaterial.Themodelisthesamemultiplelinearregres- 3 sionmodelappliedtotheglobaldatainTable2andFigure2.TheGWRanalysisusesabandwidthof66 nearestneighbors,definedbyanAICcminimizationcalibrationprocedure.Abi-squareweightingscheme wasused.TheGWRmodelpredicts(cid:1)82%ofthevariabilityinPOCfluxesandisanimprovementonthe globalmodel.Thecorrespondingglobalcoefficientvaluesare0.089,0.023and0.027for(a),(b)and(c), respectively. 8of15 GB4011 WILSONETAL.:SPATIALVARIABILITYOFPARTICLEFLUXES GB4011 Figure 4. Spatial distribution for CaCO :POC coefficients calculated using geographically weighted 3 regressionasinFigure3butwithmanuallyalteredbandwidthvaluesof(a)20and(b)156nearestneigh- bors. Only coefficients for CaCO are displayed, as it displays the most variability of the three minerals 3 considered. For reference, theglobalCaCO coefficient is 0.089. 3 model are worse, such as increased variance (Table 3). We Figure3a).Thissuggeststhatthesubsetofdatadefinedbya might expect coefficients to vary significantly at this band- bandwidth of 66 nearest neighbors may be influenced by widthbecausethecoefficientsrepresentlocalvaluesbutare significantlyhighervaluesintheSouthernOceanbutoverall also subject to influence from outliers. However, the large- our ability to capture the global mean MLRA coefficient scale general spatial patterns (Figure 4a) are comparable to values when relaxing the spatial bandwidth gives us those in Figure 3a, suggesting these are not an artifact of increased confidence inthis method. bandwidth size. A notable exception to this is the lower [27] The basin-scale coefficients in Table 2 appear to coefficients in the equatorial Atlantic (Figure 4a, cf. corroborate the GWR coefficients. Furthermore, to test the 9of15
Description: