University of Pennsylvania ScholarlyCommons
IRCS Technical Reports Series, Institute for Research in Cognitive Science
1-1-2001

Anaphora and Discourse Semantics

Bonnie L. Webber, University of Pennsylvania
Matthew Stone, Rutgers University
Aravind Joshi, University of Pennsylvania
Alistair Knott, University of Otago

Webber, Bonnie L.; Stone, Matthew; Joshi, Aravind; and Knott, Alistair, "Anaphora and Discourse Semantics" (2001). IRCS Technical Reports Series. 30. https://repository.upenn.edu/ircs_reports/30

University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-01-13. At the time of publication, the author, Bonnie L. Webber, was affiliated with the University of Edinburgh. Currently, July 2007, she is a faculty member in the Department of Engineering at the University of Pennsylvania.

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/ircs_reports/30

Abstract

We argue in this paper that many common adverbial phrases generally taken to be discourse connectives signalling discourse relations between adjacent discourse units are instead anaphors. We do this by (i) demonstrating their behavioral similarity with more common anaphors (pronouns and definite NPs); (ii) presenting a general framework for understanding anaphora into which they nicely fit; (iii) showing the interpretational benefits of understanding discourse adverbials as anaphors; and (iv) sketching out a lexicalised grammar that facilitates discourse interpretation as a product of compositional rules, anaphor resolution and inference.
Anaphora and Discourse Semantics[*]

Bonnie Webber, Edinburgh University
Matthew Stone, Rutgers University
Aravind Joshi, University of Pennsylvania
Alistair Knott, University of Otago

Introduction

Several years ago, in an ACL workshop paper, Janyce Wiebe (1993) cited Example 1 to question the adequacy of tree structures for discourse.

(1) a. The car was finally coming toward him.
    b. He [Chee] finished his diagnostic tests,
    c. feeling relief.
    d. But then the car started to turn right.
The problem she noted was that the discourse connectives "but" and "then" appear to link clause (1d) to two different things: "then" to clause (1b) - i.e., the car starting to turn right being the next relevant event after Chee's finishing his tests - and "but" to a combination of clauses (1a) and (1c) - i.e., the car turning right failing the expectation of its continuing in the same direction and being the car that Chee is awaiting. (The former link is commonly called a sequence relation, and the latter, a form of contrast.) These relations are usually taken to be the basis for low-level discourse structure, leading to something like Figure 1 for Example 1. This structure might seem advantageous in allowing the semantics of the example to be computed directly by compositional rules and defeasible inference. However, this structure is in fact a DAG, a directed acyclic graph.[1] Viewed syntactically, arbitrary DAGs are completely unconstrained systems. They substantially complicate interpretive rules for discourse, in order for those rules to account for the relative scope of unrelated operators and the contribution of syntactic nodes with arbitrarily many parents.

[*] Division of Informatics, University of Edinburgh, 2 Buccleuch Place, Edinburgh UK EH8 9LW.
[1] The structure in Figure 1 is labelled with the type of relation taken to hold and its "support" from either a connective ("but") or adverbial ("then"). There are other possible structures for Example 1, but all of them are DAGs, so our point still holds.

While we are not committed to discourse structure being a tree (e.g., Figure 2 from (Bateman, 1999)), we feel that the cost to discourse theory of moving to arbitrary DAGs for discourse structure is too great to be taken lightly. So we want to suggest another explanation for these and other examples of apparent complex and crossing dependencies
in discourse: while structural connectives such as coordinating (e.g., "but") and subordinating (e.g., "although") conjunctions do indeed signal discourse relations between (the interpretation of) their conjuncts, discourse adverbials such as "then", "otherwise", and "nevertheless" are instead simply anaphors, signalling a relation between the interpretation of their matrix clause and the discourse context.

© 2001 Association for Computational Linguistics. Computational Linguistics, Volume xx, Number x.

[Figure 1: Possible discourse structure for Example 1. Relation labels: seq, contrast, seq, elaboration; clauses a, b, c, d.]

[Figure 2: Simple multi-parent structure. Relation labels: succession, manner; nodes 6, 8, 9.]

We argue that understanding discourse adverbials as anaphors accomplishes four important goals:

1. It recognises their behavioral similarity with the pronouns and definite noun phrases (NPs) that are the "bread and butter" of previous work on anaphora (Section 1).

2. It contributes substance to the view, expressed for example by Carter (1987), that anaphora comprises more than just pronouns and definite NPs:

    Anaphora is the special case of cohesion where the meaning (sense and/or reference) of one item in a cohesive relationship (the anaphor) is, in isolation, somehow vague and incomplete, and can only be properly interpreted by considering the meanings of the other item(s) in the relationship (the antecedent(s)). (Carter, 1987, page 33)

This is explored in Section 2.

3. It supports the direct computation of discourse semantics through compositional rules and defeasible inference. This is a goal that researchers have been struggling after for some time (Asher and Lascarides, 1999; Gardent, 1997; Kehler, 1995; Polanyi and van den Berg, 1996; Scha and Polanyi, 1988; Schilder, 1997a; Schilder, 1997b; van den Berg, 1996), and that Wiebe essentially recognises in her concern about the consequences for discourse structure of examples such as (1).
Enabling anaphor resolution to contribute to meaning[2] simplifies the process of compositional semantics and directs attention to how the meaning of discourse adverbials supports and complements other aspects of discourse semantics (Section 3).

4. It allows us to see more clearly how a lexicalised approach to the computation of clausal syntax and semantics extends naturally to the computation of discourse syntax and semantics, providing a single semantic matrix with which to associate speaker intentions and other aspects of pragmatics (Section 4).

[2] There is an analogous situation at the sentence level, where the relationship between syntactic structure and compositional semantics is simplified by factoring away inter-sentential anaphoric relations. Here the factorisation is so obvious that one does not even think about any other possibility.

The account we provide here is meant to be compatible with current approaches to discourse semantics such as DRT (Kamp and Reyle, 1993; van Eijck and Kamp, 1997) and Dynamic Semantics (Stokhof and Groenendijk, 1999) and with more detailed analyses of the meaning and use of individual discourse adverbials, such as (Jayez and Rossari, 1998a; Traugott, 1997): it provides what we believe to be a simpler and more coherent account of how discourse meaning is computed, rather than an alternative account of what that meaning is or what speaker intentions it is being used to achieve.

1 Discourse Adverbials as Anaphors

1.1 Discourse Adverbials do not behave like Structural Connectives

We take the building blocks of the most basic level of discourse structure to be explicit structural connectives between adjacent discourse units (i.e., coordinating and subordinating conjunctions, and "paired" conjunctions such as "not only ... but also", "on the one hand ... on the other (hand)", etc.) and inferred relations between adjacent discourse units (in the absence of an explicit structural connective). Here, adjacency is what triggers the inference.
Consider the following example:

(2) You shouldn't trust John. He never returns what he borrows.

Adjacency leads the hearer to hypothesize that the second clause is related to its left-adjacent neighbor, and more specifically, that a form of rhetorical relation holds between the two. (We discuss this more in Section 4.) Our goal in this section is to convince the reader that many discourse adverbials, including "then", "also", "otherwise", "nevertheless" and "instead", behave not like structural connectives, but instead like anaphors.

Structural connectives and discourse adverbials do have one thing in common: Like verbs, they can both be seen as heading a predicate-argument construction; unlike verbs, their arguments are independent clauses. For example, both the subordinate conjunction "after" and the adverbial "then" (in its temporal sense) can be seen as binary predicates (sequence or after) whose arguments are clausally-derived events.

But that is the only thing that discourse adverbials and structural connectives have in common. As we have pointed out in earlier papers (Webber, Knott, and Joshi, 1999; Webber et al., 1999a; Webber et al., 1999b), structural connectives have two relevant properties: while they admit stretching of predicate-argument dependencies, they do not tolerate their crossing. This is most obvious in the case of preposed subordinate conjunctions (Example 3) or "paired" coordinate conjunctions (Example 4). With such connectives, the initial predicate signals that its two arguments will follow.

(3) Although John is generous, he is hard to find.

(4) On the one hand, Fred likes beans. On the other hand, he's allergic to them.

Like verbs, structural connectives allow the distance between the predicate and its arguments to be "stretched" over embedded material, without loss of the dependency between them. For the verb "like" and an object argument "apples", such stretching without loss of dependency is illustrated in Example 5b.

(5) a. Apples John likes.
    b. Apples Bill thinks he heard Fred say John likes.

That this also happens with structural connectives and their arguments is illustrated in Example 6 (in which the first clause of Example 3 is elaborated by another preposed subordinate-main clause construction embedded within it) and Example 7 (in which the first conjunct of Example 4 is elaborated by another paired-conjunction construction embedded within it). Possible discourse structures for these examples are given in Figure 3.

(6) a. Although John is very generous -
    b. if you need some money,
    c. you only have to ask him for it -
    d. he's very hard to find.

(7) a. On the one hand, Fred likes beans.
    b. Not only does he eat them for dinner.
    c. But he also eats them for breakfast and snacks.
    d. On the other hand, he's allergic to them.

[Figure 3: Discourse structures associated with (i) Example 6 and (ii) Example 7. (i) concession[although], elaboration, condition[if] over clauses a, b, c, d; (ii) contrast[one/other], elaboration, comparison[not only/but also] over clauses a, b, c, d.]

But, as already noted, another property of structural connectives is that they do not admit crossing of predicate-argument dependencies. If we cross the dependencies in Examples 6 and 7, we get

(8) a. Although John is very generous -
    b. if you need some money -
    c. he's very hard to find -
    d. you only have to ask him for it.

(9) a. On the one hand, Fred likes beans.
    b. Not only does he eat them for dinner.
    c. On the other hand, he's allergic to them.
    d. But he also eats them for breakfast and snacks.

Possible discourse structures for these (impossible) discourses are given in Figure 4. Even if the reader finds no problem with these crossed versions, they clearly do not mean the same thing as their uncrossed counterparts: In (9), "but" now appears to link (9d) with (9c), conveying that despite being allergic to beans, Fred eats them for breakfast and snacks. And while this might be inferred from (7), it is certainly not conveyed directly.
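The difference between the uncrossed Examples 6 and 7 and their crossed counterparts 8 and 9 can be made concrete with a small sketch. It is only an illustration under stated assumptions, not part of the account itself: each paired connective is encoded as an arc between the clause that opens it and the clause that completes it, clauses (a) to (d) are numbered 0 to 3, and the names `crosses`, `crossing_pairs` and the `EXAMPLE_*` constants are hypothetical.

```python
from itertools import combinations


def crosses(dep_a, dep_b):
    """Return True if two dependencies, given as pairs of clause positions, cross."""
    (i, j), (k, l) = sorted(dep_a), sorted(dep_b)
    # Two arcs cross when exactly one end of one arc lies strictly inside the other.
    return i < k < j < l or k < i < l < j


def crossing_pairs(deps):
    """List the labels of every pair of dependencies whose arcs cross."""
    return [
        (a[0], b[0])
        for a, b in combinations(deps, 2)
        if crosses(a[1], b[1])
    ]


# Assumed encoding: clause positions a=0, b=1, c=2, d=3.
EXAMPLE_6 = [("although", (0, 3)), ("if", (1, 2))]
EXAMPLE_7 = [("one/other", (0, 3)), ("not only/but also", (1, 2))]
EXAMPLE_8 = [("although", (0, 2)), ("if", (1, 3))]
EXAMPLE_9 = [("one/other", (0, 2)), ("not only/but also", (1, 3))]

for name, deps in [("6", EXAMPLE_6), ("7", EXAMPLE_7),
                   ("8", EXAMPLE_8), ("9", EXAMPLE_9)]:
    print(name, crossing_pairs(deps))
```

Under this encoding, Examples 6 and 7 yield no crossing pairs (the embedded dependency nests inside the outer one), while Examples 8 and 9 each yield one.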
As a consequence, we stipulate that structural connectives do not admit crossing of their predicate-argument dependencies.

[Figure 4: (Impossible) discourse structures that would have to be associated with Example 8 (i) and with Example 9 (ii). (i) elaboration, concession[although], condition[if] over clauses a, b, c, d; (ii) elaboration, contrast[one/other], comparison[not only...] over clauses a, b, c, d.]

That is not all. Since we take the basic level of discourse structure to be a consequence of (a) relations associated with explicit structural connectives and (b) relations whose defeasible inference is triggered by adjacency, we stipulate that low-level discourse structure itself does not admit crossing structural dependencies. (In this sense, discourse structure may be truly simpler than sentence structure. To verify this, it might be useful to carefully examine the discourse structure of languages such as Dutch that allow crossing dependencies in sentence-level syntax. Initial cursory examination does not give any evidence of crossing dependencies in Dutch discourse.)

If we now consider the corresponding properties of discourse adverbials, we see that they do admit crossing of predicate-argument dependencies. Example 10 shows this clearly. Clause 10(d) contains the discourse adverbial "then". For it to get its first argument from (b) - i.e., the event that the discovery in (d) is "after" - it must cross the structural connection between clauses (c) and (d) associated with "because". This crossing dependency is illustrated in Figure 5.

(10) a. John loves Barolo.
     b. So he ordered three cases of the '97.
     c. But he had to cancel the order
     d. because then he discovered he was broke.

[Figure 5: Example 10, with structural realisation of all dependencies. Relation labels: contrast[but], seq[then], conseq[so], explanation[because]; clauses a, b, c, d.]
But of course crossing dependencies are not unusual in discourse, because anaphors (e.g., pronouns and definite NPs) give rise to them all the time, for example:

(11) Every man_i tells every woman_j he_i meets that she_j reminds him_i of his_i mother.

This suggests that in Example 10, the relationship between "then" and the previous discourse might usefully be taken to be anaphoric as well.

1.2 Discourse Adverbials do behave like Anaphors

There is additional evidence to suggest that "otherwise", "then" and other discourse adverbials are anaphors. First, anaphors in the form of definite and demonstrative NPs can take implicit material as arguments. For example, in

(12) Stack five blocks on top of one another. Now close your eyes and try knocking {the tower, this tower} over with your nose.

both NPs refer to the structure which is the implicit result of the block stacking. (Further discussion of such examples can be found in (Isard, 1975; Dale, 1992; Webber and Baldwin, 1992).) The same is true of discourse adverbials. In

(13) Do you want an apple? Otherwise you can have a pear.

the situation in which you can have a pear is one in which you don't want an apple - i.e., where your answer to the question is "no". But this answer isn't there structurally: it is only inferred. While it appears natural to resolve an anaphor to an inferred entity, it would be much more difficult to establish such links through purely structural connections: to do so would involve a substantial commitment to covert constituents in discourse structure.

Secondly, attempts to paraphrase "otherwise" in terms of the structural connective "or" demonstrate that "otherwise" has a wider range of options.[3] This is illustrated by the following pair of examples:

(14) a. If the light is red, stop. Otherwise you'll get a ticket.
        (If you do something other than stop, you'll get a ticket.)
     b. If the light is red, stop. Otherwise go straight on.
        (If the light is not red, go straight on.)
Only one of these two ways of resolving "otherwise" in the context of a preceding if-construction can be paraphrased with "or": the case where "otherwise" resolves to an alternative to the consequence clause, as in (14a); cf.

    If the light is red, stop or you'll get a ticket.

Paraphrasing (14b) with "or", as in

    If the light is red, stop or go straight on.

produces something whose meaning is quite different. Thus, "otherwise" has access to material that is not available to a structural connective. (Actually, in Section 4, we posit two separate lexico-syntactic entries for "or" as a structural connective: one for purely logical "or" and the other for "or" conveying an independent semantic relation between its arguments, as is the case here.)

Our final piece of evidence is that, like pronouns, these discourse adverbials can appear in an analogue of donkey sentences. Donkey sentences such as Example 15 have been used to argue for the intrinsic discourse nature of pronominal anaphors: that pronouns are not merely a reflex of a syntactic binding operation.

(15) Every farmer who owns a donkey feeds it rutabagas.

In donkey sentences, anaphors appear in a structural and interpretive environment in which a direct syntactic relationship between anaphor and antecedent is normally impossible. Therefore, donkey sentences are evidence for interpreting an anaphor by accessing a discourse entity instead of by syntactic binding.

While no one has ever argued that discourse adverbials are a reflex of a syntactic binding operation (they have always been treated as elements of discourse interpretation, signalling relations between adjacent clauses), it is significant that they can appear in their own version of donkey sentences, as in

(16) a. Anyone who has developed network software, has then had to hire a lawyer to protect his/her interests.
        (i.e., after developing network software)
     b. Many people who have developed network software, have nevertheless never gotten very rich.
        (i.e., despite having developed network software)
     c. Every person selling "The Big Issue" might otherwise be asking for spare change.
        (i.e., if s/he weren't selling "The Big Issue")

This suggests that discourse adverbials are accessing discourse entities (in particular, eventualities) rather than signalling a structural connection between adjacent clauses.[4]

[3] This was pointed out independently by Natalia Modjeska, Lauri Karttunen, Mark Steedman, Robin Cooper and David Traum, on presentation of this work at ESSLLI'01 in Helsinki, August 2001.
[4] While Rhetorical Structure Theory (RST) (Mann and Thompson, 1988) was developed as an account of the relation between adjacent units within a text, Marcu's guide to RST annotation (Marcu, 1999) has added an "embedded" version of each RST relation in order to handle examples such as in (16c) and others, in which the material in an embedded clause (here, a relative clause) bears a semantic relation to its matrix clause. While this importantly recognises the phenomenon, it does not contribute to understanding its nature.

These arguments have been directed at the behavioral similarity between discourse adverbials and what we normally take to be discourse anaphors. But this isn't the only reason to recognise them as anaphors: In the next section, we suggest a framework for anaphora in which discourse adverbials fit as neatly[5] as pronouns and definite NPs.

2 A Framework for Anaphora

2.1 Discourse referents and anaphor interpretation

If we want to take discourse adverbials to be anaphors, we have to ask what kind, since on the surface, adverbials neither walk nor talk like the anaphors we are most familiar with: pronouns and definite NPs.

All discourse anaphors involve, at the very least, an anaphoric expression $\alpha$ and one or more entities $e_r$ from the discourse context or context of utterance[6] that contribute in some way to the interpretation of $\alpha$, $e_\alpha$.
One thing we want to point out, although it is not critical to our discussion of discourse adverbials as anaphors, is that not all the material in the expression $\alpha$ need be anaphoric, i.e., interpreted with respect to $e_r$. For example, one type of expression that we take to be anaphoric is "other NPs"[7], as in:

(17) a. The new mayor of London has declared war on pigeons.
     b1. Other birds have not incurred his wrath.
     b2. Other birds that inhabit the city year round have not incurred his wrath.
     b3. Other birds with more sanitary habits have not incurred his wrath.
     b4. Other more sanitary birds have not incurred his wrath.

In (b1), one would obviously take the anaphoric expression to be the entire NP "other birds". Its interpretation involves the entity $e_r$ evoked by "pigeons" in (17a), which is excluded from the set of birds under consideration, which have not (we are told) incurred the mayor's wrath. Similarly, in (b2), one would take the anaphoric expression to be the entire NP "other birds that inhabit the city year round", with its interpretation involving the exclusion of $e_r$ (pigeons) from that set. In (b3) and (b4), however, if we take the anaphoric expression to be the entire NP, then it is not the case that $e_r$ (pigeons) is to be excluded from the set of birds with more sanitary habits (b3) or more sanitary birds (b4), since they don't belong to either set: they are simply being excluded from the set of birds. So one may want to allow for an anaphoric expression to comprise only part of a constituent, though the interpretation of the entire constituent will, as a result, depend on how the anaphor is resolved.[8]

Now, besides $e_r$ (the entity or entities from context$_{d/u}$) and $e_\alpha$ (the interpretation of the anaphoric expression $\alpha$), we have been motivated to introduce a third entity $e_i$ into the process of anaphor interpretation, which we call a contextual parameter: $e_i$ is derived from $e_r$ and supplied to the interpretation of $\alpha$.
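The three entities just introduced can be laid out in a toy sketch based on Example 17. Everything concrete here is an assumption made for illustration: entities are modelled as sets of strings, the bird names other than "pigeons" are invented, the names `AnaphorInterpretation`, `interpret_other_np` and `identity` are hypothetical, and the contextual parameter is derived from $e_r$ by simple identity, since the text so far says only that $e_i$ is derived from $e_r$, not how.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnaphorInterpretation:
    e_r: frozenset      # entity or entities taken from the context
    e_i: frozenset      # contextual parameter derived from e_r
    e_alpha: frozenset  # interpretation of the anaphoric expression


def identity(entities):
    """Assumed derivation of e_i from e_r: the entity itself."""
    return entities


def interpret_other_np(head_set, e_r, derive=identity):
    """Interpret "other N" as the set denoted by N minus the contextual parameter."""
    e_r = frozenset(e_r)
    e_i = frozenset(derive(e_r))
    return AnaphorInterpretation(e_r, e_i, frozenset(head_set) - e_i)


# Hypothetical toy domain for Example 17.
BIRDS = {"pigeons", "sparrows", "gulls"}
SANITARY = {"sparrows"}

# (17 b1): the whole NP "other birds" is anaphoric.
b1 = interpret_other_np(BIRDS, {"pigeons"})
print(sorted(b1.e_alpha))

# (17 b3): only "other birds" is anaphoric; the modifier restricts the result.
b3 = b1.e_alpha & SANITARY
print(sorted(b3))
```

Applying the modifier after the exclusion mirrors the observation about (b3) and (b4): pigeons are removed from the set of birds, not from the set of more sanitary birds.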
The motivation for this contextual parameter relates to the familiar phenomenon variously called textual ellipsis (Hahn, Markert, and Strube, 1996), partial anaphora (Luperfoy, 1992), indirect anaphora (Hellman and Fraurud, 1996), associative anaphora (Cosse, 1996), and bridging anaphora (Not, Tovena, and Zancanaro, 1999), illustrated in Example 18.

(18) Myra darted to a phone and picked up the receiver.

[5] To the extent that anything in human language can be considered "neat".
[6] Since we refer to this disjunction so often, we abbreviate it simply as context$_{d/u}$.
[7] There is more discussion of "other NPs" later in this section.
[8] That this occurs even with definite NPs was observed over twenty years ago by one of the co-authors (Joshi, 1978), who considered the question of whether a definite NP could simultaneously co-refer and provide new information about its referent.