Research Report AI-1989-04 Artificial Intelligence Programs The University of Georgia Athens, Georgia 30602 Available by ftp from aisun1.ai.uga.edu (128.192.12.9) Series editor: Michael Covington mcovingt@aisun1.ai.uga.edu PROBLEMS IN APPLYING DISCOURSE REPRESENTATION THEORY William H. Smith Piedmont College Demorest, GA 30535 Artificial Intelligence Programs The University of Georgia Athens, Georgia 30602 Introduction Discourse Representation Theory (DRT) was developed by Hans Kamp 1981 in order to combine "a definition of truth with a systematic account of semantic representations (277)". The semantic representations produced are to provide a bridge between syntactic parses and model theoretic semantics such that the representations can be used to determine the truth conditions of a discourse. This report describes that theory, both the original, basic form and some extensions that have been suggested by Kamp and others, and applies it to a "real" discourse in order to indicate further extensions that will be necessary if DRT is to be used as a complete theory of semantic representations. Truth in model theoretic semantics is determined by a mapping from a representation of the discourse to a model, a mapping that preserves the properties and relationships of the discourse. A model consists of two sets: a set of entities (the universe) and a set of properties of those entities and relations that hold among them. The discourse representation must likewise consist of two sets, a set of referenced items and a set of propositions about those items. A discourse is held to be true in a model if there is a mapping such that the set of referenced items maps to a subset of the universe and each property or relation expressed by the propositions is true of the corresponding entities in the model. Kamp addressed the questions of how a discourse places items in the set of referenced items and of how anaphoric relations can be expressed in the discourse representation. In particular, he was concerned with situations in which an item should not be placed in that set yet should serve as the antecedent for an anaphoric relationship (the so-called "donkey sentences"). To that end, he developed the basic version of DRT (1981). Part 1 2 of this report describes that basic theory. Part 2 presents some extensions to the basic theory. Part 3 describes an attempt to represent a text in DRT and suggests further extensions that might be made. 1. Discourse Representation Theory The central notion of DRT is the Discourse Representation Structure (DRS). A DRS K is a pair , where U is a set of reference markers (the universe) and C is a set of conditions (properties, relations, or complex conditions--negation, disjunction, or implication). The initial DRS, K0, contains none of the information in the discourse. As the discourse is processed, the DRS construction algorithm produces a series of K' as it incorporates material from the discourse into K. For example (1) Pedro owns a donkey. K: Although U and C are described as sets, at least one of these must be ordered by time of introduction into the discourse if the construction algorithm is to work properly in assigning antecedents to anaphoric expressions (cf Goodman 1988). Antecedent assignment is accomplished by finding an item in U that agrees with the anaphoric expression (for pronouns, an entity that agrees in gender and number). Thus, (2) shows an extension of (1) (:= is an assignment operator, as in Pascal; + is union of sets): (2) He beats it. K: (The basic theory would add R3, R4 to U, and then set them equal to R1, R2, respectively. A later "clean-up" operation would eliminate these redundant discourse markers. Here we assume that 3 the clean-up operation has been applied.) A discourse of any size, however, is likely to include several such entities; most often, the conflict is resolved by selecting from the candidates the one that was most recently encountered. In order to make that selection possible, the set of items must be ordered, and must in fact be re-ordered every time reference is made to an entity. For purposes of exposition, K0 is usually treated as consisting of empty sets. Such is often not the case in real discourse, where referring items are often exophoric, their referents to be found in the nonlinguistic context or in the shared knowledge of the participants. While DRT allows for such references, it is not clear how or when the antecedents are to be entered into K. Kamp 1985 describes the DRS construction algorithm is "a set of rules that operate, in a roughly top-down manner, on the nodes of the parse tree, (2)" converting those nodes into the conditions of C and, when appropriate, introducing new reference markers into U. As was noted in the introduction to this report, the basic version of DRT is directed toward the role of noun- phrase (NP) nodes in the discourse--their relationship to U. It would seem, at first glance, that every NP should be associated with an entity in the model and should therefore have a corresponding marker in U. (That view is, of course, a great oversimplification. Most work in DRT has limited itself to singular concrete NPs, where the oversimplification is not so drastic.) When the algorithm encounters a NP, it should either associate it with a marker already present in U (anaphora) or introduce a new marker, and these were the problems that Kamp 1981 addressed. The main problem for anaphora is that theories of sentential syntax do not provide for intersentential anaphora. DRT solves that problem by creating a unified representation for the discourse, so that all markers in the discourse are available for anaphoric relations (with exceptions to be treated shortly). Conflict resolution is not treated beyond the recency heuristic; this is not a weakness particular to DRT, for a full treatment of 4 pronominal anaphora must take into consideration grammar, pragmatics, and knowledge of the real world. Definite noun phrases perform as do personal pronouns but, since they carry more content, are less likely to introduce conflict. (Definite NPs used generically are not considered.) Proper nouns and indefinite NPs introduce new reference markers into U. Although this procedure corresponds to the "first glance" view of natural language, it encounters problems in sentences that involve negation, disjunction, or conditions: (3) Pedro does not own a donkey. (4) Pedro owns a donkey or a cow. (5) If Pedro owns a donkey he beats it. One certainly would not want to add a reference marker for 'donkey' in (3); the semantics would require that it map to an entity in the model, and the sentence explicitly denies its existence. The same holds for the donkey in (4), since it has perhaps a 50-50 chance of existing (although it might be useful to add a marker for the thing that Pedro owns). Sentence (5) is the so-called "donkey sentence"; it not only introduces a donkey that may or may not exist, but goes on to make anaphoric reference to it. DRT handles sentences of the above types by adding to C one or more sub-DRSs. Each sub-DRS has its own universe, which is not visible to the superordinate DRS, and its own condition set, and the truth value of the sub-DRS is determined by the logical connective that controls it. 5 (3') K: }> (4') K: K'': }> (5') K: K'', K': K'': }> Thus, (3) is true if there is no entity in the model that satisfies its universe and conditions, (4) is true if there is a successful mapping from one of its sub-DRSs to the model, and (5) is true if any mapping that satisfies the antecedent DRS also satisfies the consequent DRS. Sentences (4) and (5) introduce an additional problem; each could be followed by a sentence such as (6): (6) It is unhappy. The pair (5-6) is handled by including (6) in the consequent DRS for (5). The other pair, (4-6), seems to be overlooked by theorists, but it can be handled, as was suggested above, by adding to the main DRS a marker for the thing that Pedro owns and including only the properties -- donkey or cow -- in the sub- DRSs. Universal propositions have the same DRS form as conditionals. The scope of a universally quantified term relative to an existentially quantified term is indicated by the U in which the existentially quantified term is placed. Thus, the usual interpretation of (7) is represented by (7a), while the interpretation that places 'donkey' outside the scope of 'farmer' is shown in (7b): (7) Every farmer owns a donkey. 6 (7a) K: K'', K': K'': }> (7b) K: K'', K': K'': }> DRT, as described so far, does a very good job of handling a very small subset of English sentences. Kamp and others have offered a number of extensions to the basic theory in order to expand that subset. 2. Extensions to the Basic Theory The basic theory is confined to a very limited subset of natural language. In particular, it is limited to singular, non- generic NPs, to anaphoric reference (i.e. the referent is present in the discourse), and to sentences whose main verbs do not take propositions (i.e. DRSs) as arguments. Researchers have offered extensions to the basic theory that reduce the second and third of those limitations. Kamp 1983 and Pinkal 1986 have offered refinements to the reference-resolving algorithm for definite NPs that extend the power and accuracy of that algorithm. Kamp distinguishes four kinds of definite noun phrases (Pinkal: 369): (8a) Personal and possessive pronouns ( b) Complex demonstratives. (Demonstrative + NP; NP may include a restrictive relative clause.) ( c) Definite descriptions. ('the' + NP; NP may include a relative clause.) ( d) Functional definite descriptions. ('the' + NP + prepositional phrase, the latter limiting the set from which NP selects.) Complex demonstratives differ from definite descriptions in that the latter presuppose a unique referent while the former presuppose a contrast between two or more possible referents. Resolution of referential expressions requires the following 7 (Pinkal: 370): (9a) The DRS K. ( b) A salience ranking of the markers in UK. (Including recency of reference.) ( c) A selection set of UK whose members are available as antecedents. ( d) The universe of the real world needed for deictic reference. Pinkal argues that definite descriptions are not limited to the selection set and that there is no motivation for the distinction between anaphora and exophora (where the referent is not present in the discourse; it is either physically present--deixis--or present in shared knowledge). Guenthner et al. 1986 extend the basic theory by adding two new types of discourse markers: event markers and time markers. They include meaning rules in the DRS construction algorithm that assign each verb and each noun that refers to an action (e.g. 'accident') to an event marker. Each time reference (i.e. time of day or extent of duration) is assigned to a time marker. Events are temporally ordered with respect to each other and to time references: an event may precede or overlap another event or time, it may be given a time argument expressing its duration, and it may be a subset of another event. The addition of event markers makes it possible for predicates to take DRSs as arguments. Guenthner et al. do not include any examples of such a use of event markers, but Guenthner 1987 does. In that article he also makes a notational distinction between events, which advance the time of the discourse, and situations or static verbs, which do not. Spencer-Smith 1987 does not use event markers, but adds a different type of discourse marker, a proposition marker. This extension makes it possible to include embedded predicates, such as infinitival complements and beliefs: (10) Mary wants to marry a rich man. 8 K: }> The representation of beliefs, which is explored more fully in Kamp 1985, requires two further additions to DRT: internal and external anchors. Anchors are used to associate discourse markers to entities in the world. External anchors are ordered pairs, , that associate the two as they actually are, while internal anchors are DRS-like structures that associate items as the speaker believes they are. The use of anchors makes it possible to represent propositions that are in fact contradictory but are not so in the speaker's belief system because his internal anchors differ from the external anchors: (11) John believes that Hesperus is pretty and Phosphorus is not pretty. External anchors: , , Internal anchors: K: K: }> }> These extensions to DRT give it considerable power, but are far from giving it the power necessary to represent adequately the full range of meanings available in natural language. In the next section we attempt to apply DRT to a selection of natural language in order to discover further extensions that will be necessary if DRT is to become an adequate theory for the representation of natural language. 9 3. Application of DRT The passage to be analyzed here was treated extensively in Smith 1977 in order to determine the types of information that must be added to the text in order to obtain a complete representation of the situation reported by the text. The text is a narrative passage that has been normed at sixth-grade readability (ETS 1969). It is particularly interesting because it forces the reader to treat certain items as if they were in K0. In order to represent this passage, it is necessary to postulate ad hoc extensions to DRT. Although these extensions work for this passage, they should be regarded as suggestions only and not as fully developed extensions; some will reveal their weaknesses as the representation is developed. The DRS K for the passage will be developed incrementally, the DRS for each portion being added to the existing DRS. The clean-up of redundant discourse markers, however, is assumed to take place before the DRSs are combined. Additional symbols will be explained as they are introduced. As before, discourse sentences will be presented in the company of the DRSs that they add to K; since these sentences, unlike those in previous examples, have a cumulative effect, they will be denoted with the prefix N. In order to treat reference adequately, the items shown below must be included in DRS K0. These items are, in effect, imposed on the reader as possible referents. The marker Now indicates the time of reading. 10 (N0) K: (N1a) The cave widened out as he went U := U0 + {E1, E2} C := C0 + {cave(R1), E1:widen_out(R1), E2:go(R2), E2 o E1, E2 << Now } The symbol o indicates that E2 overlaps E1; << indicates that E2 (and therefore E1) precedes the time of reading. 'widened out' is treated as a unit verb; the 'out' is actually redundant. Since 'the cave' is definite, its referent must exist prior to (N1); for this reason R1 is included in U0, and the same is true of 'he' and R2. (N1b) and the bottom seemed to drop away little by little U := U + {R3, E3, P1} C := C + {bottom(R3), part-of(R3, R1), E3:seem(P1), P1: E2 o E3} R3, 'the bottom,' has no apparent antecedent and might have been included in U0. It seems more likely, however, that it existed implicitly and that a meaning rule (such as 'Every physical object has a bottom') resolves the reference. E3 is true if P1 seems to be true, even if P1 is actually false. Since E4 is controlled by 'seem,' it is a subset ( =< ) of E3. (I am not sure that this is what Guenthner et al. mean by subset, since they offer no examples, but it works here.) 11 (N1c) and then, with no warning, it split in two directions, U := U + {E5, Set1} C := C + {ŠK1c K1c: E5:split_in(R1, Set1), E3 << E5, direction(Set1)} Since there is no warning, R4 is not visible to the top- level K. 'directions' introduces what is perhaps the major weakness in current versions of DRT, a means of representing plural nouns. The ad hoc solution offered here is to use set markers, following a suggestion in Guenthner et al. The proposition direction(Set1) is a notational shorthand for a complex sub-DRS representing "All members of Set1 are directions." 12 (N1d) one path leading straight ahead and one off to the left. U := U + {E6, R5, R6, E7, R7, R8} C := C + {path(R5), E6:lead(R5, R6), R6 <- Set1, straight_ahead(R6), E6 =< E5, path(R7), E7:lead(R7, R8), R8 <- Set1, to_the_left(R8) E7 =< E5} The cohesion of R5 and R7 with R1 is indicated by the fact that R6 and R8 are members of ( <- ) Set1. (N2) "If I were an opening to this cave, where would I be?" he asked himself. U := U + {E8, P2, R9} C := C + {E8:ask(R2, R2, P2) P2:K2a -> K2b, K2a: K2b: E7 << E8 } (N2) is, on the one hand, almost ridiculous; its only contribution to the understanding of the passage is the knowledge that 'he' is lost (Smith 1977), but that knowledge is no more explicit in the DRS than it is in the sentence itself. On the other hand, it is a major headache for DRT. (N2) is an embedded contra-factual conditional whose antecedent is impossible and whose consequent is a rhetorical question (indicated by the ? as an argument to location). Its embeddedness, in this case, is wrong, in the sense that it is not a matter of 'his' belief, but 13 in another situation it might be. The implication itself is worthless, but another implication might not be. The conclusion that R9 is not R2 (indicated by \= ) is obvious but might be useful in another contrafactual. The whole DRS must be added to K so that the reader can infer, by conversational implicature, that 'he' does not know the answer to the rhetorical question and that, since he does not know the answer, he is lost. (N3) Luke wasn't frightened. U := U + {Sit1} C := C + {Luke(R2), ŠSit1, Sit1: Sit1 o E8} (N3) introduces a situation (more accurately, a non- situation) whose duration is vague but which at least overlaps E8. (N4a) Oh, he knew there were such things in this world as bottomless caves, U := U + {Sit2, P3} C := C + {Sit2:know(R2, P3), Sit2 o Sit1, P3: } P3 is like an external anchor, in that it is a fact about the world, but Kamp 1985 does not allow for propositions as external anchors. It could be treated as an internal anchor, but it is explicit in the discourse. Both Sit2 and Sit3 are true throughout the discourse, so they are irrelevant as temporal markers, but either might have changed during the discourse and the representation must allow for that possibility. 14 (N4b) where people fell in and were never heard of again, C4 := C4 + {K4a -> ŠK4b, K4a: } This is a continuation of the sub-DRS begun in (N4a); it is interpreted as a universal: 'No person who falls in such a cave is ever heard of again.' Since this universal is embedded in a belief, it does not matter whether such persons exist or not; if it were not, it would be necessary to replace R11 with a set of at least two members. (N4c) but if there had been any such thing around the cottage he would have heard about it. U := U + {Set2} C := C + {K4c -> K4d, cottage(R15), K4c: K4d: } It is not clear whether (N4c) should be treated as a continuation of the belief initiated in (N4a), as a different belief, or as a top-level condition. Viewed objectively, it is a belief (and an illogical one at that), yet it does not seem to be syntactically embedded in 'know,' or any other verb of belief. If it is a different belief, or a top-level condition (as it is treated here), Set2 must be promoted from P3 to the top-level so 15 that it can be visible to other sub-DRSs. R15 must be added to K0; it is a definite description whose referent cannot be deduced in the way 'bottom' can be deduced as 'part-of' a cave. As with K2a, it would seem reasonable to elevate the negation of the antecedent of a contrafactual to the top-level, but in that case R14 would not be accessible (it would exist in a subordinate universe). Since Sit4 is a general proposition, no temporal relation is assigned; the same is true of Sit5, Sit6, and Sit7 below. (N5a) This was just a plain, ordinary cave--deeper than most, but that was all-- U := U + {Sit5, Sit6, Set4} C := C + {Sit5:plain_cave(R1), Sit6:ordinary_cave(R1), just(Sit5), just(Sit6), cave(Set4), ŠK5, K5: cardinality(Set4)/2, R16 <- Set5, R1 <-\- Set5, deeper_than(R16, R1)}> } From a logical point of view, most of (N5a) is redundant; the only useful part is 'deeper than most,' and that belief lacks credibility. Nevertheless, it poses several problems: handling the adjective-common noun combination, handling 'just,' and accounting for 'most' in a manner suitable for logic. The adjectives 'plain' and 'ordinary' (unlike 'red,' e.g.) have little meaning until applied to a particular domain-- caves, in this case. The adverb 'just' means something like 'not other than' in this case, but how is that meaning determined? It does not seem to be a syntactic matter, but a DRS is composed from a 16 syntactic parse. K5 is an attempt to handle 'most'; the symbols < and / have their usual mathematical meanings; <-\- indicates 'not a member of.' (N5b) and some place there had to be an opening to it. U := U + {R17, R18, Sit7, P4} C := C + {place(R17), opening(R18), part-of(R18, R1), Sit7:necessary(P4), P4: } The truth of (N5b) is doubtful, but given its truth, R17 and R18 must exist at the top-level. It is possible that R18 is identical to R9, now raised to top-level. Sit7 suggests one way to handle modal auxiliaries. (N6) There was, though, one big difference about this cave: it was Luke's. U := U + {Sit8, Sit9} C := C + {Sit8:difference(R1, Set5, Sit9), Sit9:own(R2, R1), Sit8 o E1, Sit9 o Sit8} The noun 'difference' entails two things that are different (this cave and other caves) and the thing that distinguishes them (Sit9). However, only Sit9 is syntactically specified. 17 (N7) He had found it and it was his own secret place. U := U + {E12, Sit10, R19} C := C + {E12:find(R2, R1), E12 << E1, secret_place(R19), R19 = R1, Sit10:own(R2, R19) E12 << Sit10} The representation of (N7) is straight-forward; R19 is R1, but to replace it by R1 in Sit10 is to make Sit10 a copy of Sit9. This application of DRT has pointed out several needed extensions to the theory. One of the most obvious is the means of representing plural NPs, including those with quantifiers that are less specific than 'all' but more specific than 'some' (e.g. 'most'). Another needed extension is a means of handling terms that modify conditions: verbs that take verbals as complements (modals and verbs such as 'seem') and adjectives whose meanings depend on the particular nouns that they modify. A third extension is a formalism for specifying arguments that are not syntactically indicated (such as those for 'difference'). Whether or not the second and third extensions are feasible without appealing to semantic analysis prior to constructing the DRS remains to be seen; perhaps the needed machinery is available in the lexicon. Conclusion DRT has been successful in representing a small subset of natural language, and is being extended to increase the size of that subset. As we have seen in Part 3 of this report, other extensions will be necessary before it can handle the full range of natural language expressions. If those extensions can be accomplished without appeal to semantics, DRT will prove to be quite powerful. However, DRT is intended to provide a bridge between syntactic parses and model theoretic semantics; if 18 semantic analysis is necessary before a DRS can be constructed, the purpose of DRT has been lost, or at least seriously modified. REFERENCES ETS. 1969. Sequential Tests of Educational Progress. Princeton: Educational Testing Service. Goodman, D. 1988. An Implementation of an extension to discourse representation theory: Translating natural language to discourse representation structures to Prolog clauses. Unpublished master's thesis, The University of Georgia, Athens. Guenthner, F. 1987. Linguistic meaning in discourse representation theory. Synthese 73:569-98. Guenthner, F., H. Lehman, and W. Schonfeld. 1986. A Theory for the representation of knowledge. IBM Journal of Research and Development 30:1.39-56. Kamp, H. 1981. A Theory of truth and semantic representation. In J. Groenendijk, T. Janssen, and M. Stokhof (eds.) Formal Methods in the Study of Language, 277-322. University of Amsterdam. Kamp, H. 1983. SID without time or questions. Ms. Stanford, CA. Kamp, H. 1985. Unpublished discourse representation theory project description, University of Texas, Austin. Pinkal, M. 1986. Definite noun phrases and the semantics of discourse. COLING-86, 368-373. University of Bonn. Spencer-Smith, R. 1987. Semantics and discourse representation. Mind and Language 2:1.1-26. Smith, W. 1977. Types of information addition in the psycholinguistic process of reading. Unpublished doctoral thesis, The University of Georgia, Athens.