ANCESTRY-CONSTRAINED PHYLOGENETIC ANALYSIS SUPPORTS THE INDO-EUROPEAN STEPPE HYPOTHESIS

Will Chang, University of California, Berkeley
Chundra Cathcart, University of California, Berkeley
David Hall, University of California, Berkeley
Andrew Garrett, University of California, Berkeley

Discussion of Indo-European origins and dispersal focuses on two hypotheses. Qualitative evidence from reconstructed vocabulary and correlations with archaeological data suggest that Indo-European languages originated in the Pontic-Caspian steppe and spread together with cultural innovations associated with pastoralism, beginning c. 6500–5500 bp. An alternative hypothesis, according to which Indo-European languages spread with the diffusion of farming from Anatolia, beginning c. 9500–8000 bp, is supported by statistical phylogenetic and phylogeographic analyses of lexical traits. The time and place of the Indo-European ancestor language therefore remain disputed. Here we present a phylogenetic analysis in which ancestry constraints permit more accurate inference of rates of change, based on observed changes between ancient or medieval languages and their modern descendants, and we show that the result strongly supports the steppe hypothesis. Positing ancestry constraints also reveals that homoplasy is common in lexical traits, contrary to the assumptions of previous work. We show that lexical traits undergo recurrent evolution due to recurring patterns of semantic and morphological change.*

Keywords: lexical change, linguistic phylogenetics, Indo-European chronology, Indo-European dispersal, steppe hypothesis

This article has three main goals. First, we show that statistical phylogenetic analysis supports the traditional steppe hypothesis about the origins and dispersal of the Indo-European language family. We explain why other similar analyses, some of them widely publicized, reached a different result. Second, for skeptics about phylogenetic methodology, we suggest that the agreement between our findings and the independent results of other lines of research confirms the reliability of statistical inference of reconstructed chronologies.
Finally, for linguistic phylogenetic research, we argue that analyses grounded in the evolutionary properties of the traits under study yield more reliable results. Our discussion makes reference to ancestry relationships, for example between Old Irish and two modern languages descended from it, Irish and Scots Gaelic, and draws on what can be learned from direct observation of changes over historical time. In our phylogenetic analyses, we introduce ancestry constraints and show that they result in more realistic inferences of chronology.

Our article is organized as follows. We first give background information about the steppe and Anatolian hypotheses, and about earlier phylogenetic analyses (§1), and discuss lexical traits (§2) and linguistic ancestry relationships (§3). We then describe our data and some measurements made directly on the data (§4), explain our phylogenetic methods (§5), and summarize our experimental results (§6). Finally, we discuss the effects of advergence (§7) and ancestry constraints in phylogenetic modeling (§8), followed by conclusions (§9) and appendices with details about methods and results.¹

* We thank Claire Bowern, Russell Gray, Simon Greenhill, Hannah Haynie, Gary Holland, Lev Michael, Johanna Nichols, Donald Ringe, and Michael Weiss for comments and discussion, Tom Recht for help with Greek data, Quentin Atkinson and Remco Bouckaert for sharing code and data, Michael Dunn for assistance with IELEX, and audiences in Berkeley, Oslo, and San Diego for feedback and suggestions. Our work was supported by the Diebold Fund for Indo-European Studies, University of California, Berkeley (CC, WC, AG) and a Google Ph.D. Fellowship and Grant IIS-1018733 from the National Science Foundation (DH).

Printed with the permission of Will Chang, Chundra Cathcart, David Hall, & Andrew Garrett. © 2015.

1. Indo-European background.

1.1. The steppe and Anatolian hypotheses.
The relationships of Indo-European (IE) languages have been studied for over two centuries, but it is still disputed when and where their common ancestor Proto-Indo-European (PIE) was spoken, and how they spread before they first appeared in historical records about 3,700 years ago. Two hypotheses dominate discussion.

According to a traditional hypothesis (Gimbutas 1973, 1997, Mallory 1989) accepted by many linguists (Ringe 2006, Parpola 2008, Fortson 2010, Beekes 2011), PIE was spoken in the Pontic-Caspian steppe, north of the Black and Caspian Seas. The steppe hypothesis associates IE language spread with the diffusion of cultural innovations relating to pastoralism, including horse domestication, wheeled vehicles, and the weaving of wool from woolly sheep. Analyses of archaeological data from this point of view suggest a PIE dispersal date c. 6500–5500 bp, probably in the first half of that period.² It is now also widely assumed that Anatolian was the first branch to separate from PIE.³ Within the framework of the steppe hypothesis the common ancestor of the non-Anatolian languages, Proto-Nuclear-Indo-European (PNIE), might then have been spoken c. 6000–5000 bp.

According to an alternative hypothesis proposed by Renfrew (1987), IE languages spread into Europe with the diffusion of agriculture from Anatolia; see also Renfrew 1999, 2000a,b, 2001, 2003. This mechanism is plausible, since clear cases of language dispersal with the spread of agriculture are known elsewhere in the world (Bellwood & Renfrew 2002, Diamond & Bellwood 2003, Bellwood 2004).
Given that farming reached southeast Europe by the seventh millennium bce (van Andel & Runnels 1995, Perlès 2001, Bocquet-Appel et al. 2009), the Anatolian hypothesis implies a PIE dispersal date c. 9500–8000 bp.⁴

¹ Data sets and figures showing summary trees for all of our analyses appear in online supplementary materials, which can be accessed at http://muse.jhu.edu/journals/language/v091/91.1.chang01.html. We cite PIE verb roots from Rix et al. 2001, and we use the following abbreviations: bp: before present (2000 ce), HPD: highest posterior density, IA: Indo-Aryan, IE: Indo-European, IELEX: Indo-European Lexical Cognacy Database (http://ielex.mpi.nl/), ME: Middle English, OE: Old English, PIE: Proto-IE, NIE: Nuclear IE (the non-Anatolian IE clade), PNIE: Proto-NIE, RM: root-meaning (traits), RSC: restriction site character, SDC: stochastic Dollo character, (T)MRCA: (time of the) most recent common ancestor.

² Archaeologists have suggested PIE dates ‘about 4,500 bc’ (Mallory 1989) and ‘about 4,400–4,200 bc’ (Anthony 2013); see also Darden 2001, Mallory & Adams 2006, Anthony 2007, and Anthony & Ringe 2015. Some phylogenetic analyses attribute PIE dates of 6000–5000 bp to the steppe hypothesis (Gray & Atkinson 2003, Atkinson et al. 2005, Bouckaert et al. 2012); Ryder and Nicholls (2011) characterize the hypothesis more accurately.

³ This idea has a long pedigree under the label Indo-Hittite (Sturtevant 1929, 1933) and is accepted in one form or another in much current research (Oettinger 1986, Strunk 1994, Melchert 1998, Lehrman 2001, Melchert 2001, Ringe et al. 2002, Jasanoff 2003, Rieken 2009, Yakubovich 2010).
⁴ A third hypothesis that has gained less traction can be seen as a hybrid of the Anatolian and steppe hypotheses. It posits that PIE was spoken at the time assumed by the steppe hypothesis, but in eastern Anatolia (Barber 2001) or the Caucasus (Gamkrelidze & Ivanov 1995, Ivanov 2001). According to this hypothesis, after PNIE spread north from Anatolia, the steppe was the staging ground for NIE expansion into Europe and Asia. All else being equal, however, linguistic geography favors a western origin for the Anatolian branch of IE, since the greatest branch-internal diversity is in the west; this is most easily explained by assuming that Proto-Anatolian spread across Anatolia from the west. Ivanov’s (2001) alternative analysis of Anatolian dialectology has not gained support from other specialists (Melchert 2001, Yakubovich 2010).

LANGUAGE, VOLUME 91, NUMBER 1 (2015)

In principle, evidence bearing on IE origins and dispersal may come from archaeology, genetics, or linguistics. At present, genetic data is insufficient to resolve the matter, since ancient European DNA and comparison of ancient and modern DNA confirm not just immigration from the Near East at the time of the farming dispersal, but also later population movement from northern Eurasia that is consistent with the steppe hypothesis; see Brandt et al. 2013 and Lazaridis et al. 2014. In practice, discussion has mainly focused on archaeological and linguistic arguments.

Several arguments have been advanced in favor of the steppe hypothesis; a recent review is by Anthony and Ringe (2015). The first argument is from archaeological analysis.
For example, based on correlations among archaeological data, documented cultural practices, and vocabulary, researchers have argued that Proto-Indo-Iranian was spoken c. 4300–3700 bp in an area of central Asia around the Aral Sea (Lubotsky 2001, Witzel 2003, Kuz’mina 2007).⁵ This includes the Sintashta culture of the steppe to the north, whose economy was pastoral and whose cemeteries contain horse sacrifices and chariots (Anthony 2009), as well as the more urbanized Bactria-Margiana Archaeological Complex to the south (Hiebert 1994). The latter may have been the staging ground for Indo-Iranian dispersal. If this argument is correct, then from what we can infer about cultural interactions in this region, Indo-Iranian speakers probably entered the area from the steppe. This line of reasoning locates speakers of an Indo-Iranian precursor north of the Caspian Sea c. 5000–4500 bp, close in time and place to PNIE if the latter was spoken in the steppe c. 6000–5000 bp. In other words, the diffusion of cultural traits that are observed in the archaeological record (and in some cases reported in later textual sources) correlates well with the chronology of the steppe hypothesis. Similar analyses link IE-speaking European populations with culture changes that can be identified as moving from the steppe and eastern Europe within the chronological framework of the steppe hypothesis.

A second argument is based on inferences about environment and material culture from reconstructed vocabulary. For example, wheeled-transport vocabulary is reconstructed for PIE (Mallory & Adams 1997, 2006, Parpola 2008) or PNIE (Darden 2001). Since wheeled transport was invented long after farming reached Europe, if PIE or PNIE had such vocabulary it cannot have been spoken by early farmers. It is fair to say that most of those writing from a linguistic perspective, though not all (Krell 1998, Clackson 2000), have been impressed by the extent of the evidence. The arguments are in any case based on an assemblage of individual points, each of which needs careful evaluation.

A third argument is based on linguists’ subjective impression that early IE languages are more similar grammatically and phonologically than would be expected from the Anatolian chronology; see Table 1. After 4,500 or more years of divergence on the Anatolian chronology, some grammatical patterns remain intact with only a few changes in each language; see, for example, Hittite [eːsmi, eːsi, eːst͜si] = Sanskrit [ásmi, ási, ásti]. Similarly, only a few sound changes distinguish Hittite [χanti], Sanskrit [ánti], and Greek [ánti]; such examples can be replicated throughout the grammar and lexicon. The divergence time posited by the Anatolian hypothesis is roughly twice that of the present-day Germanic, Romance, or Slavic languages, but many linguists have a subjective impression that the differences in Table 1 are not twice as great. Yet we have no generally accepted way to quantify impressions of similarity in phonology or grammar, or to show that examples like those in Table 1 are representative, so this argument remains impressionistic.⁶

⁵ Because language and material culture spread independently, some archaeologists emphasize that correlations of specific archaeological data with ethnicity or language cannot be demonstrated conclusively without inscriptional evidence. But most linguists would probably agree that Kohl (2009:236) goes too far, in his otherwise excellent survey of Bronze Age Eurasia, in suggesting that ‘[t]here was no single Indo-European or Proto-Indo-European “homeland” but just an ever unfolding historical process of development’. If there was a PIE language, it was spoken somewhere.

                   PIE                   Hittite                Vedic Sanskrit      Greek
‘I am’             *h₁ésmi [ʔésmi]       ēšmi [eːsmi]           ásmi [ásmi]         eimí [eːmí]
‘you (sg.) are’    *h₁ési [ʔési]         ēšši [eːsi]            ási [ási]           eĩ [ẽː] (dialectal essí)
‘s/he is’          *h₁ésti [ʔésti]       ēšzi [eːst͜si]          ásti [ásti]         estí
‘they are’         *h₁sénti [ʔsénti]     ašanzi [asant͜si]       sánti [sánti]       eisí [eːsí] (dialectal entí)
‘bear’             *h₂r̥tḱos [χr̩tkos]     ḫartaggaš [χartkas]    ŕ̥kṣas [ŕ̩kʂas]       árktos
‘cloud, sky’       *nébʰos [néb̤os]       nēpiš [neːbis]         nábhas [náb̤as]      néphos
‘wood’             *dóru [dóru]          tāru [taːru]           dā́ru [dáːru]        dóru
‘vis-à-vis’        *h₂ánti [χánti]       ḫanti [χanti]          ánti [ánti]         antí
‘yoke’             *yugóm [juɢóm]        yugan [juɡan]          yugám [juɡám]       z͜dugón

Table 1. Selected vocabulary in PIE and three early IE languages.

Two further arguments probably support only a weaker position, namely, that PNIE was spoken in the Pontic-Caspian steppe. One concerns evidence for early contact between IE and early western Uralic languages (Joki 1973, Koivulehto 2001, Janhunen 2009). Given the location of Uralic in northern Eurasia, such contact must have occurred north of the Black and Caspian Seas. The evidence for contact between early Uralic languages and the Indo-Iranian branch of IE is uncontroversial, supported by dozens of unambiguous loanwords, and accepted by specialists (Rédei 1986, Lubotsky 2001, Mallory 2002). There is similarly clear evidence for contact with Balto-Slavic (Kallio 2005, 2006, 2008) and Germanic (Hahmo et al. 1991–2012). The evidence for contact with PIE itself is weaker (Kallio 2009), perhaps because Uralic languages spread from the east into northern Europe and Proto-Uralic itself was not spoken in proximity to the steppe.⁷

Finally, some morphological evidence suggests that the Greek, Armenian, Balto-Slavic, and Indo-Iranian subfamilies form a clade within IE (Ringe et al. 2002). Since Greek is spoken to the west of Anatolia, and Armenian and Indo-Iranian to the east, it is hard to construct a diversification scenario consistent with the Anatolian hypothesis in which these languages remained in contact after PNIE, unless the latter was itself spoken on the steppe. The steppe hypothesis makes this easier: the four subfamilies in question remained in proximity, after the departure of Tocharian to the east and Italo-Celtic, Germanic, and others to the west.

Two main arguments support the Anatolian hypothesis. First, as originally noted by Renfrew (1987), the spread of agriculture provides a plausible mechanism for large-scale language dispersal, one with clear parallels elsewhere. The language dispersal mechanisms required in the steppe hypothesis are less well understood, partly because pastoral subsistence economies are not as common worldwide. Second, beginning with Gray & Atkinson 2003, the Anatolian hypothesis has been supported by research using statistical methods adapted from biological phylogenetics (Atkinson et al. 2005, Atkinson & Gray 2006, Nicholls & Gray 2008, Gray et al. 2011, Ryder & Nicholls 2011, Bouckaert et al. 2012, 2013). This is also the focus of our research.

Methodological differences between fields contribute to the present impasse. Arguments for the steppe hypothesis are mostly qualitative rather than quantitative, and come from traditional lines of reasoning in historical linguistics and archaeology. In contrast, a crucial argument for the Anatolian hypothesis is quantitative, relying on statistical methods that originated in another discipline. Thus some researchers have expressed skepticism about chronological inference with statistical methods (Clackson 2000, 2007, Evans et al. 2006, McMahon & McMahon 2006), while some advocates of such methods have written that historical linguistics methods ‘all … involve intuition, guesswork, and arguments from authority’ (Wheeler & Whiteley 2014). We hope our work can contribute to a rapprochement between the two research traditions.

⁶ Holman and colleagues (2011) provide a phonological distance measure from which they infer a PIE time-depth of about 4350 bp. That is far too late (only a few centuries before the first attestation of already differentiated IE languages), but it does quantify the impression that IE languages are more similar phonologically than in vocabulary patterns. See Clackson 2000 for a perceptive, skeptical discussion of the argument from impressionistic similarity.

⁷ For recent discussion see Häkkinen 2012 and Parpola 2012.

1.2. Indo-European phylogenetics.
IE linguistic phylogeny has been studied for many decades (Meillet 1922, Porzig 1954, Birnbaum & Puhvel 1966), but statistical phylogenetic research is relatively recent in IE (Tischler 1973). Dyen and colleagues (1992) used lexicostatistics to produce a classification of IE languages by analyzing a wordlist of eighty-four modern languages and 200 basic meanings compiled by Isidore Dyen. Their method assumed a similar overall rate of lexical change in all languages. An alternative approach to classification that dispenses with this assumption was employed by Ringe and colleagues (2002) to address the issue of higher-order structure in IE. They analyzed a data set created by Don Ringe and Ann Taylor, consisting of phonological, morphological, and lexical traits from twenty-four predominantly ancient and medieval languages. These two works yielded two wordlists with cognate coding; one or both, or both combined, were used in all subsequent work.

In 2003, Gray and Atkinson presented the first Bayesian phylogenetic analysis of IE chronology; they used the Dyen wordlist, supplemented with Hittite, Tocharian A, and Tocharian B data. Historically attested events were used to date various linguistic splits, and rate smoothing over the branches of the inferred tree was used to relax the assumption of a constant rate of change. As in all subsequent analyses prior to our work, the inferred root age supports the Anatolian hypothesis. Nicholls and Gray (2008) reworked Gray and Atkinson’s analysis by replacing the trait model, which permitted multiple gains in the same lexical trait, with one that did not; they also performed a separate analysis on the lexical traits of a subset of the languages from the Ringe-Taylor data set. Ryder and Nicholls (2011) then added a model of lexicographic coverage that enabled them to work with all twenty-four languages in the Ringe-Taylor data set, many of which, like Oscan and Old Persian, are scantly attested.
Bouckaert and colleagues (2012) performed a phylogeographic analysis: its goal was to infer the geographical location of PIE, but embedded in it was a phylogenetic analysis that superseded previous work in most respects.⁸ Most notably, the inference software supported many different trait models, including the single-gain trait model devised by Nicholls and Gray, and the data was based on a harmonization of the Ringe-Taylor and Dyen data sets. Bouckaert and colleagues (2013) addressed an error in the coding of the data without altering their general conclusions.

⁸ Our work to date does not model geographic dispersal or address their phylogeographic findings; this remains a desideratum for future research.

Figure 1. Analysis A1 summary tree. Modern languages with no ancestors in the data set are excluded. This tree shows median posterior node heights, median posterior branch rate multipliers (width of horizontal lines), time constraints on ancient and medieval languages (bright red bars), clade constraints (vertical black bars), and posterior clade probabilities less than 98%.

Figure 2. Analysis A2 summary tree. Modern languages are included from all IE subfamilies. See Fig. 1 caption to interpret graphical elements.

We now briefly preview our results. Using the same model and data set as Bouckaert and colleagues (2012, 2013), but with incremental changes to both, we found a root age that strongly supports the steppe hypothesis. The key difference was that we constrained eight ancient and medieval languages to be ancestral to thirty-nine modern descendants. Using ancestry constraints is similar in spirit to stipulating the known dates of historical languages, or stipulating uncontroversial clades that are not the object of inquiry. The ancestor-descendant relationships we posit are uncontroversial, but could not be inferred by the model. Figure 1 shows the result of an analysis with ancestry constraints and other refinements, where the only modern languages included are those with documented ancestors. Figure 2 shows a similar analysis with modern languages
from all IE subfamilies. Figure 3 shows the inferred IE root ages in selected studies, beginning with Gray & Atkinson 2003 and ending with our work.

Figure 3. Inferred IE root age distributions in selected studies. GA: Gray & Atkinson 2003; NG1, NG2: Nicholls & Gray 2008, using Dyen and Ringe-Taylor data sets; RN: Ryder & Nicholls 2011; B: Bouckaert et al. 2013; C: analysis A1 corrected root age from our work (§7.1). Plotted are the 95% highest-density interval (vertical lines), the mean (NG, RN) or median (B, C) if known, and intervals for the steppe and Anatolian hypotheses (dashed lines).

2. Lexical traits.

Linguists infer relationships from morphological, phonological, and lexical traits. Morphological and phonological traits, however, are interdependent in ways that are poorly understood.⁹ For this reason, and because large lexical data sets are available, most statistical work on language relationships analyzes lexical traits. There are at least two types of lexical traits, which we call cognate traits and root-meaning traits. They have not been distinguished in previous phylogenetic research.

Languages share a cognate trait if they share cognate words, that is, words descended from the same ancestral word form. For example, English and German share a cognate trait because timber and Zimmer ‘room’ are descended from Germanic *timra- (derived from a PIE root *demh₂- ‘build’); likewise, German Gast ‘guest’ and Latin hostis ‘stranger, enemy’ define a cognate trait because they are descended from a form *gʰosti- (Bammesberger 1990, Ringe 2006). Cognate words need not have the same meaning.
Cognate traits are widely studied in comparative and historical linguistics, but are only occasionally used in statistical phylogenetic studies (e.g. Gray & Jordan 2000).¹⁰

⁹ Typological traits are also sometimes used in analyzing language relationships, though they share drawbacks of both lexical traits (they spread easily) and nonlexical traits (they are often interdependent). Some difficulties of using morphological and phonological traits in statistical phylogenetic analyses have also been discussed by Taylor and colleagues (2000) and Ringe and colleagues (2002).

¹⁰ Cognate traits can be hard to use systematically, since they require detailed etymological knowledge. In English, for example, without medieval data a cognate trait would be missed by anyone who did not know that the second member of the compound bridegroom is the cognate of Gothic guma ‘man’; the cognate of Sanskrit mádhu- ‘honey’ would be missed without lexical data that included the infrequent word mead. A further problem in the analysis of cognate traits is to determine which words count as ‘descended from the same ancestral word form’. In Romance, for example, Latin neuters change gender, and simplex verbs are often continued by their frequentatives. Presumably Italian rapa ‘turnip’ counts as the ‘same’ as Latin rāpum, though rapa strictly speaking continues a plural rāpa reinterpreted as singular; but is French chanter < Latin cantāre ‘sing (repeatedly)’ the ‘same’ as Latin canere ‘sing’? Statistical analyses of root-meaning traits need not seek in vain for a principled way to untie such Gordian knots.

More often, the data consists of root-meaning (RM) traits, which encode whether the most semantically general and stylistically neutral word for a given meaning is based on a given ancestral root; meanings are often chosen from a ‘Swadesh’ list of one or two hundred basic meanings. Such traits are the basis for most IE analyses, including ours. For example, since English feather is derived from a PIE root *pet- ‘to fly’, English has a trait [*pet-, ‘feather’]; because Latin serpens ‘snake’ is derived from *serp- ‘to creep’,
Latin has a trait [*serp-, ‘snake’]. Languages can share an RM trait based on forms with different derivations, like English feather < *pet-trā (or *pet-rā) and Latin penna ‘feather’ < *pet-nā. These words share a root but are not cognate words, since they were derived with unrelated suffixes *-trā (or *-rā) and *-nā and so cannot descend directly from the same ancestral word form. By contrast, because the cognate words timber and Zimmer have different meanings, they do not define a shared RM trait.

2.1. Homoplasy and drift.

Cognate and RM traits evolve very differently, especially with respect to homoplasy or independent innovation.¹¹ Except in borrowing between languages, cognate traits ordinarily come into existence only once; this is the basis of the comparative method (Meillet 1925, Weiss 2014). Therefore models of trait evolution that do not permit homoplasy are well suited to cognate traits. But because the mechanisms of change underlying RM traits include semantic change and the derivation of new words from existing forms, RM traits are subject to at least two distinctive kinds of homoplasy. In describing them, we adapt Sapir’s (1921) term drift, which refers to the predisposition to undergo certain changes given certain precursor traits.

First, RM traits arise not only when word forms come into existence, but also when they change meanings. For example, Old English (OE) timber probably originally meant ‘building’ (like Old Saxon timbar); to model its shift in meaning to ‘timber (material for building)’, the trait [*demh₂-, ‘building’] would be said to be replaced by a trait [*demh₂-, ‘timber’].¹² Meaning changes fall into recurrent patterns across languages (Heine & Kuteva 2002, Traugott & Dasher 2002, Urban 2014). If the same meaning change affects the same root in related languages, a homoplastic RM trait results. For example, in a crosslinguistically common shift (Wilkins 1996), reflexes of PIE *pod- ‘foot’ came to mean ‘leg’ independently in Modern Greek and modern Indic and Iranian languages. Two other examples from our data set are given in 1.

(1) a. Old Irish seinnid meant ‘play or strike an instrument, sound’ but has shifted in Modern Irish and Scots Gaelic to mean ‘sing’. The ancestral root *swenh₂- referred to producing sound or music more generally, but the same semantic shift to ‘sing’ is seen in Persian xvāndan.
    b. Many languages distinguish a stative ‘sit’ verb (‘be in a sitting position’) from a change-of-state one (‘sit oneself down, take a seated position’), but shifts between the senses are common. In PNIE, the root *h₁eh₁s- expressed the stative sense, while *sed- expressed the change-of-state sense (Rix et al. 2001). Change-of-state *sed- came to express the stative sense independently in Armenian, Balto-Slavic, Celtic, Germanic, and Italic, a shift not shared by Greek or Indo-Iranian and therefore independent in at least some of the branches where it happened.

Recurrent meaning changes like these have been called ‘rampant’ in language (Ringe et al. 2002). For such changes we use the term semantic drift.

¹¹ Homoplasy is an evolutionary term for independent analogous innovation in parallel lineages. Changes like the t > k sound change that has occurred independently at least twenty times in Austronesian languages (Blust 2004) are said to be homoplastic.

¹² We write as if later meanings simply replace earlier ones. In reality, as in this case (Old English had both meanings), words usually pass through a polysemous stage, which would be modeled as the coexistence of traits (e.g. [*demh₂-, ‘building’] and [*demh₂-, ‘timber’]). The methodology of basic-vocabulary wordlist collection tends to suppress all but the clearest cases of polysemy in the languages under study, but this should not affect the analysis as long as the method is the same throughout the data set.

A second source of homoplasy in RM traits is derivational drift.
This refers to change that occurs because certain roots are semantically well suited to provide certain derivatives. For example, constructions that mean ‘cause to die’ are a recurrent source of verbs for ‘kill’ (Buck 1949). Therefore, as descendants of PIE *gʷʰen- ‘kill’ fell out of use, causative derivatives of PIE *mer- ‘die’ were used in this meaning. This happened independently in Irish and in modern Indic and Iranian languages (Rix et al. 2001), yielding a homoplastic RM trait.¹³ Three other examples from our data set are given in 2.

(2) a. Because words for ‘animal’ often evolve from expressions meaning ‘having breath’ and so forth, PIE h₂enh₁- ‘breathe’ is seen in several otherwise unrelated ‘animal’ terms. Latin animal itself is a derivative of anima ‘spirit’, a derivative of h₂enh₁-. In Indo-Iranian, Persian jānvār ‘animal’ and related forms descend from *wyāna-bāra- ‘having a spirit’, in which *wyāna- is a derivative of h₂enh₁-. And though not the basic word for ‘animal’, Tocharian B onolme ‘living being’ is also a derivative of h₂enh₁- ‘breathe’ (D. Adams 2013).
    b. Basic words for ‘live’ include derivatives of the PNIE root *gʷyeh₃-: Greek z͜dõ, Classical Armenian keam, Latin vīvō, and so forth, all of which are primary verbal formations. Basic Celtic words for ‘live’ in our data set are derivatives of an adjective *gʷih₃-wo- that was derived from the same root, that is, a construction ‘be alive’. It is natural to derive a stative verb ‘X’ from a construction ‘be X’ (with a stative adjective X).
    c. In the meaning ‘snake’, reflexes of the PIE noun *h₁ógʷʰis are widespread (Ancient Greek óphis, Vedic Sanskrit áhi-, etc.). A verb *serp- ‘crawl’ also often refers specifically to the motion of a snake. Derivatives of *serp- came to be the general term for ‘snake’ in Albanian, in Latin (and modern Romance languages), and in modern Indo-Aryan (IA) languages, for example, Hindi sā̃p, Assamese xāp. The homoplastic nature of such cases is shown by the fact that the various word forms that acquire the new meaning are often formed with different derivations. For example, though Albanian (Tosk) gjarpër ‘snake’ and Latin serpens are both based on the PIE root *serp-, the Albanian noun is formed with a suffix *-ena- (Orel 1998) and the Latin noun with a different suffix *-ent-. The word forms themselves do not go back to a single ancestor.

In short, though some analysts erroneously assume that RM traits are homoplasy-free (Atkinson et al. 2005:204, Gray et al. 2011:1094), semantic and derivational drift are endemic in RM data sets. This claim is further supported in §3 below and quantified in §4.2.¹⁴

¹³ Derivational drift can be understood as resulting from the semantic drift of derivative words. For example, the derivational drift whereby ‘cause to die’ replaces ‘kill’ can be viewed as a consequence of semantic change in these verbs. It is nonetheless helpful to distinguish the two categories when possible.

¹⁴ A third kind of lexical trait, intermediate between cognate and RM traits, would encode whether words for given meanings are expressed by cognate word forms, rather than roots. Such traits would have evolutionary properties like those of RM traits, since they would be liable to semantic drift (though not derivational drift). These are evidently the traits analyzed by Bowern and Atkinson (2012), who report that ‘words were not counted as cognate … if they were present in the language in a different meaning’ (p. 827). They also write that ‘languages are highly unlikely to independently gain the same cognate’ (p. 829), which is true but irrelevant, since (as they implicitly acknowledge by noting that cognates can have different meanings) the traits they analyze are susceptible to semantic drift.
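
The distinction between cognate traits and RM traits drawn in §2 can be illustrated with a small sketch. It is not the coding scheme of any data set discussed here; the `Word` record, the field names, and the helper functions are hypothetical and chosen only for illustration. The four entries are the examples cited in the text (feather, penna, timber, Zimmer), and a trait is treated simply as a set element: an RM trait is a (root, meaning) pair, and a cognate trait is an ancestral word form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One basic-vocabulary entry (hypothetical record layout)."""
    language: str
    form: str
    meaning: str   # the meaning the word expresses
    root: str      # ancestral root the word is based on
    etymon: str    # ancestral word form the word descends from


# Examples taken from the text; forms with alternatives use the first variant.
WORDS = [
    Word("English", "feather", "feather", "*pet-", "*pet-trā"),
    Word("Latin", "penna", "feather", "*pet-", "*pet-nā"),
    Word("English", "timber", "timber", "*demh2-", "*timra-"),
    Word("German", "Zimmer", "room", "*demh2-", "*timra-"),
]


def rm_traits(words, language):
    """RM traits of a language: the set of (root, meaning) pairs it attests."""
    return {(w.root, w.meaning) for w in words if w.language == language}


def cognate_traits(words, language):
    """Cognate traits of a language: the set of ancestral word forms it continues."""
    return {w.etymon for w in words if w.language == language}


def shared(trait_fn, words, lang_a, lang_b):
    """Traits of the given kind that two languages have in common."""
    return trait_fn(words, lang_a) & trait_fn(words, lang_b)


if __name__ == "__main__":
    # Same root and meaning, different derivations: shared RM trait only.
    print(shared(rm_traits, WORDS, "English", "Latin"))
    print(shared(cognate_traits, WORDS, "English", "Latin"))
    # Same ancestral word form, different meanings: shared cognate trait only.
    print(shared(cognate_traits, WORDS, "English", "German"))
    print(shared(rm_traits, WORDS, "English", "German"))
```

Under this toy encoding, a homoplastic RM trait would appear as the same (root, meaning) pair attested in two languages whose words for that meaning continue different ancestral word forms, as with Albanian gjarpër and Latin serpens in 2c.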