A LANGUAGE USE PERSPECTIVE ON THE DESIGN OF HUMAN-COMPUTER INTERACTION

Derek Brock
Naval Research Lab
Washington, DC, 20375, USA
[email protected]

ABSTRACT. Clark (1996) has identified common ground (information we take to be shared with others) as an indispensable requisite for everything people do with each other where language use or, more definitively, a collaborative use of meaning and understanding is involved. In interactive software development, though, theory and practice have placed little emphasis on the role and importance of this crucial aspect of cognition in user interactions, with the result that user interface and interaction designs are typically impoverished in ways they need not be. In this paper, the basic features of a language use approach to human-computer interaction are outlined, and a range of both noncomputational and computational implications for the design of interactive systems is examined. In particular, human-computer interaction is recast as a genuine instance of language use between the user and the system designer and, in a second layer, as a joint activity in which the system and the user are participants. In this view, principled interaction design is authorship that promotes common ground with the user at all times in all layers through noncomputational and/or computational means. Noncomputational means facilitate the user’s development and maintenance of common ground with the author’s design through passive and semipassive mechanisms that may also depend on user initiative. Computational means focus on actively maintaining a system-side image of common ground and using this to inform aspects of the system’s interaction behavior. Key issues in computing common ground include representing and verifying what the user does and does not know, identifying and using conventions, and solving coordination problems posed by the user.
Ways that both noncomputational and computationally based language use approaches to human-computer interaction designs can better support user comprehension and performance through the promotion and use of common ground, and cognitive issues these approaches address, are discussed.

Report Documentation Page (Standard Form 298, Rev. 8-98, prescribed by ANSI Std Z39-18; Form Approved OMB No. 0704-0188)
Report date: JAN 2002
Dates covered: 00-00-2002 to 00-00-2002
Title and subtitle: A Language Use Perspective on the Design of Human-Computer Interaction
Performing organization: Naval Research Lab, 4555 Overlook Ave SW, Washington, DC, 20375
Distribution/availability statement: Approved for public release; distribution unlimited
Supplementary notes: ONR TC3 Workshop, Cognitive Elements of Effective Collaboration, 15-17 Jan 2002, San Diego, CA
Abstract: see report
Security classification (report, abstract, this page): unclassified
Limitation of abstract: Same as Report (SAR)
Number of pages: 28

1 Introduction

Motivated by a continuing effort to appreciate ways in which software user interfaces can be better made to support users, this short paper examines at a high level how and why the study of language use is theoretically relevant to the design of human-computer interaction.

At its heart, the study of language use is concerned with collaborations where one person’s meaning and another’s understanding are essential to carrying out what they are doing together. Meaning and understanding are conveyed through the use of signals, and any device that can serve this purpose well, no matter if it is a gesture, a sign, or an utterance, amounts to a use of language. (Think of the adage "a picture is worth a thousand words.") Arguably, people’s language use skills, that is, their complementary abilities to devise ways to indicate their intentions and to successfully resolve the indications of others, are their most fundamental resource in all activities that require collaboration.

Observations such as the foregoing about language use have direct implications for human-computer interaction. Although computers are far from the equivalent of people, people nevertheless use them to do all sorts of things that would be difficult at best to do on one’s own. This use has the distinct character of a collaboration. People interact with software through user interfaces that require them to understand their role in the task and that constrain their goals and demand their cooperation in the coordination of a range of participatory actions.

But user interfaces are not devised by computers, of course. They are devised by people. As Terry Winograd noted in an interview published in Preece et al. (1994), "everything that is in [a] computer came from somebody in some context with some purpose and some meaning." The implication is that the design of human-computer interaction is fundamentally a representation problem that involves the designer’s meaning and the user’s understanding.

The broader implication is that all principles of language use are important in human-computer interaction. In this paper, it is argued that they are the proper basis of a principled framework for interaction designs.

2 What is language use?

Herbert Clark (1996) characterizes the study of language use not as a science of the structure of language but instead as a cognitive and social science that is centrally concerned with the notions of speaker’s meaning and addressee’s understanding. In this view, all instances of language use are instances of two or more participants acting jointly in one setting or another, though not necessarily at the same time, who jointly coordinate individual cognitive, physical, and perceptual actions as needed to accomplish a social purpose they share. The setting in which a particular instance of language use occurs determines how these joint actions are coordinated and what skills are needed. Face-to-face settings are the most basic and written settings are perhaps the most demanding. Language use also often involves more than one conceptual domain of action. These domains are called layers and are identified by references to persons and/or settings, and possibly other matters, that are not necessarily present in the initial layer of action.

Language use always serves people’s larger collaborative purposes, which are met and carried out socially in their joint activities. Every instance of language use, though, is itself an act of collaboration. Signals are presented with the intention of conveying meaning. But meaning can only be understood with an addressee’s participation. By taking up a signal and grasping its sense in context, addressees make the action behind the signal’s presentation a meaningful and complete joint action. The principles at work in this process are the primary concern of language use studies.

Thus, meaning and understanding are central, and whenever these notions are involved, even wholly nonlinguistic collaborations between people should be readily identified as instances of language use. Wholly essential for meaning and understanding, though, is the notion of common ground, which Clark identifies as an indispensable requisite in all things people undertake to do with each other.

2.1 What is common ground?

Common ground is essentially a specific kind of shared knowledge. More precisely, it is knowledge that people take to be held in common on the basis of shared information. People try to justify and use common ground by presenting what they believe is an adequate shared basis for it. In general, all signals, be they words, images, or something else, function as shared information when they are presented and perceived. Suppose that two people have agreed to go to lunch at noon and one shows the other that it is now just past twelve. When the time is presented, it is a signal that becomes shared information. What is meant and understood by it, that they have agreed to go to lunch now, is knowledge that resides in their common ground on the basis of their earlier agreement. Presenting the time both justifies and makes use of this common ground by referring to their previous agreement concerning lunch at noon and indicating that it is time to go.

Common ground is built up moment-by-moment over a lifetime of joint activities and experience and presupposes certain cognitive processes (most notably, ordinary perceptual skills, the workings of short- and long-term memory, and mundane reasoning abilities). Every event that openly takes place and is mutually attended to in a joint activity immediately becomes a piece of knowledge in its participants’ common ground. Ordinarily, this includes all of the shared bases people use with each other and, just as importantly, most of the inferences each would normally make in context. Clark argues that people conceptually divide their common ground in joint activities into at least three parts: first, the common ground they bring to the joint activity; second, an idea of the joint activity’s current state; and third, a record of what has been done so far. People naturally keep track of this information in one form or another, but just as importantly, they expect their counterparts to do very much the same.

2.2 Coordination in language use

Common ground makes it possible for people to cooperatively orient each other’s cognition with signals. To direct one person’s understanding to what another means by a particular signal requires each individual to coordinate a corresponding set of interdependent physical, perceptual, and cognitive actions with his or her counterpart. This coordinated effort is what makes each instance of language use a joint action.

Clark characterizes joint actions involving meaning and understanding as coordination problems that people, in most instances of language use, pose for each other and then immediately work together to solve. To be efficient, certain principles apply. Among these are that first, a straightforward solution should already be in mind when the problem is posed. Second, all of the information needed to reach the solution, given what is common ground, should be shared in the problem’s posing (this information functions as a coordination device). And third, this solution should be one that is readily and unambiguously apparent to everyone in context. Respectively, these are premises of solvability, sufficiency, and joint salience. Experience motivates people to expect these premises to be met as they go about coordinating their joint actions; otherwise, a larger effort is always needed to accomplish the same thing.

Working out coordination problems is a dominant part of people’s language use skills. In choosing the right basis to use as a coordination device, joint salience is usually the most important consideration. Capitalizing on this aspect of people’s attention in context greatly increases the likelihood of their finding the intended solution. Two important sources of coordination devices that readily meet this test are external representations and conventions. External representations are simply elements of the physical situation in a setting of language use that can be used to represent parts of the joint activity taking place there. Not all external representations have the same properties and advantages, but they generally make superior coordination devices because of their perceptual immediacy and their inherent significance in the common ground of the moment. Conventions are representations of standing solutions to frequently occurring coordination problems, and therein lies their value: they are successful and familiar regularities of action. When a convention is signaled and recognized as a coordination device, addressees usually know instantly how to proceed. Conventions and systems of conventions are preeminently useful for purposes of language use. Clark notes that, "Languages like English are conventional signaling systems par excellence." Conventions and external representations are often found together. In joint activities that take place in written settings, for instance, the medium itself is an external representation of the activity, and its elements serve as a continuous stream of conventional coordination devices (text, layout, etc.) that addressees use to complete most of the joint actions the writer has begun.

3 Language use in human-computer interaction

The basic claim of this paper is that human-computer interaction is, in fact, a genuine instance of language use. This is the case largely because of the nature of its supporting media: software user interfaces are not only design artifacts (i.e., products of human design) but are also a means for carrying out a class of joint activities (doing computationally based tasks) in which all principles of language use apply, and whose primary participants are people acting as themselves in the roles of designer and user. In contrast, the notion of interacting with the computer (human-computer interaction per se) is simply a secondary layer of action on top of this primary one (more on this below).

The joint activity between people defined by human-computer interaction is made up of joint actions that arise from the coordination of individual cognitive, physical, and perceptual actions on the part of both the designer and the user, albeit across time. These joint actions implicitly depend on the function of common ground and always involve the designer’s meaning and the user’s understanding. Put another way, a software user interface presents a designer’s notion of how a particular computational task can be organized and carried out. When a user takes up the task, part of what he or she faces is the job of solving a substantial array of coordination problems the designer has posed. Every aspect of the presentation is available to be used as a coordination device, and each solution the user correctly converges on makes joint a corresponding set of pre-coordinated actions the designer has initiated.

3.1 Comparing written settings

Because the designer’s conception has been worked out and produced in advance of its presentation, the joint activity that takes place in the primary layer of human-computer interaction must rightly be characterized as one that takes place in a written setting. In contrast to more conventional written media such as that of a book, the scene and medium of this written setting are those of a computer’s interactive display. All written presentations share certain basic characteristics, but the written media of human-computer interaction differ from conventional written media in several important ways.

3.1.1 Layers

First, unlike the second layer of action that is ordinarily presented in a book or magazine, wherein the reader remains in a written setting but participates in the imagining of a story or the domain of an essay, the second layer of action in human-computer interaction, perhaps surprisingly, places the user in a face-to-face setting. Here, the user is expected to interact with the computer to carry out the joint activity implicit in the software task, as if the computer and not the designer were the user’s real counterpart. This shift of setting and participants between layers naturally makes correspondingly different demands on the user’s language use skills and also has potentially damaging consequences for the function of common ground.

3.1.2 Presentation properties

A second key difference between conventional written media and that of human-computer interaction can be found in the organization and expectations of the latter’s presentation, which is nonlinear and deliberately opportunistic. This is decidedly different from the linear presentation that is the encouraged convention in most other written settings. Instead, in human-computer interaction, the user has opportunistic control of the joint activity. Inevitably, this leads to a contingent branching of focus that no nontrivial design can fully anticipate for the user. Hence, when designs rely on dependent concepts for the solution of widely divergent goals, as they sometimes must, users understandably may not know what to do next through no fault of their own. Users often have little recourse, though, but to depend on the designer, in posing coordination problems, to honor their natural expectations of solvability, sufficiency, and joint salience.

Several other presentation properties are worth highlighting at this point. Human-computer interaction relies on a language of coordination devices that is made up of a variety of linguistic and nonlinguistic signals including elements of natural language, visual artifacts, and behaviors (i.e., actions and procedures). Many of these presentation elements, in the modern culture of computer literacy, are rightly intended to be construed as conventions. The presentation also serves in part as an external representation of the joint activity. This aspect of a software user interface is usually intended to function as a manipulable model of the task at hand, but does not always fully indicate the task’s status. For instance, actions that take place in the external representation are often evanescent to the extent that a user may have to return to an earlier procedure to confirm the outcome of a previous event. And while the external representation is accessible to both the user and the system, it is mostly unused by the system as a means for justifying common ground in the second layer of action.

3.1.3 Common ground

Common ground turns out to be built differently in each of the first two layers of human-computer interaction, and this proves to be a third important way user interfaces understood as written media differ in their use from conventional written presentations. For both the designer and the user, finding common ground is an intuitive prerequisite for the conduct of their joint activity. More often than not, a presentation’s use of various sorts of display conventions will be enough to get this process started. Unfortunately, though, because software user interfaces are conventionally organized as nonlinear presentations, the designer’s control over the process of building a complete body of common ground with the user in the planned manner of a linear presentation is sacrificed. Linearly organized presentations in conventional written media have an important strength: they naturally correspond to what Clark calls the third part of common ground in joint activities, i.e., a record of what has been done so far. This correspondence permits addressees to use such presentations to their advantage; they are, for instance, intuitively indexed and readily open to review when misunderstandings or questions arise. However, in the written setting of human-computer interaction (specifically, in its first layer), the designer can only build common ground with the user in an irregular manner. The reason for this limitation is the opportunistic nature of the presentation. That is, opportunities for the designer to informally promote all of the common ground the user will need to efficiently coordinate any part of the joint activity published in the design are effectively contingent on what the user chooses to do. For users, a further difficulty with written presentations of this sort lies in their inherent lack of correspondence with the third part of common ground and the advantages that would attend. As a consequence, over repeated episodes of use, software users often develop a palpable sense of where their common ground with a design is missing and will even look for it on occasion unless the effort proves to be too costly.

Building common ground with the designer in a user interface’s first layer of action is, of course, wholly relevant to working directly with the operation of the computer in the second layer. By design, second layer interactions are in many respects intended to resemble the regular conduct of joint activities in face-to-face settings, albeit with a sophisticated information processing machine. Ordinarily in face-to-face interactions, people expect each other to keep track of what they are doing together as part of a transparent exercise of their mundane language use skills. Indeed, people nominally do this in any activity they undertake, even if others are not involved. Accordingly, it can readily be argued that the knowledge people accumulate and are able to justify in all of their undertakings corresponds structurally to Clark’s three parts of common ground: knowledge initially held about an activity at its start, knowledge of the activity’s current state, and knowledge of what has taken place so far.

When this kind of knowledge is acquired in an interactive setting with other people, it functions as common ground whenever shared informational bases exist to justify it as such. When it is acquired in another sort of interactive setting, one where the activity involves only components of the environment that maintain no such knowledge, it can only be construed pragmatically as an individual’s lone conception of the activity. The knowledge users acquire in the second layer of human-computer interaction, though, falls somewhere in between these two extremes: for the most part, it is only the user’s conception of the activity, but in certain ways, it is also part common ground. Put more explicitly, in face-to-face interactions with computers, users regularly keep track of contingent interaction knowledge corresponding to the three parts of common ground as well as any informational bases that may be needed to justify it. Throughout the course of the interaction, the software keeps track of some of this same knowledge and shares corresponding informational bases in limited, ad hoc ways (with undo mechanisms, for instance, and other kinds of interaction histories). Contingent interaction knowledge the software fails to be able to justify and use, though, cannot be construed as common ground in either the setting or the layer. Failures of this sort have the effect of withering most expectations of the computer’s capacity to keep track of the joint activity as another person would. As a consequence, many of the user’s face-to-face language use skills go largely unused.

3.2 Design challenges

As the foregoing discussion has attempted to illustrate, when the normal, linearly accumulative function of common ground in a language use setting is impaired, people are naturally forced to work harder cognitively to accomplish their goals. Unfortunately, this is the case in the settings of both layers of human-computer interaction. Since opportunistic interaction inherently builds incomplete common ground in the first layer of an interaction design, straightforward design strategies that reward user initiative with, for instance, inexpensive, well indexed, at hand access to missing knowledge when discrepancies are encountered and/or simple, unobtrusive ways to document procedures in context are certainly worth considering. A more difficult challenge obtains in the design of second layer interactions. Here, the clear need is to devise further reliable ways for user interfaces to accumulate, justify, and use additional contingent interaction knowledge as it develops in a given instance of human-computer interaction. With well designed access to second layer common ground, users should be able to make greater and more efficient use of their face-to-face language use skills and the cognitive resources these skills support.

4 Research approaches

As an adjunct to the short review of basic design challenges for language use in human-computer interaction presented at the end of the previous section, the remainder of this paper briefly describes two fundamental approaches for design research in this area, one targeted principally at issues of usability and the other at the problem of face-to-face interactions with user interfaces.

4.1 The noncomputational approach

The essential thrust of this approach is to revisit the notion of user-centered design (Norman and Draper, 1986) through the lens of language use principles. Common ground in this view is seen as a sine qua non for usability. Strategies worth investigating under this heading include design reviews that evaluate both the presentation language as a set of shared informational bases and the interaction design as a set of coordination problems. Such reviews should focus on issues of intended meaning and the user’s expectations of solvability, sufficiency, and joint salience. Certain legacy strategies that deserve renewed investigation as user initiated means for addressing the inherent first layer difficulty of building adequate common ground also fall into this approach. These strategies include the design of robust, low cost, point-of-use reference, tutorial, annotation, and interaction history mechanisms.

4.2 The computational approach

The long range goal of this approach is to share the user's cognitive load and promote the use of face-to-face language use skills in the second layer of interaction designs. Although many behaviors exhibited by user interfaces can be easily misconstrued as evidence of regular maintenance of common ground by a computer (e.g., various reflexive mechanisms, corrective and alerting behaviors, and history mechanisms in general), fully functional common ground between users and computers can only arise when the computer maintains a system-side representation of the three parts of common ground in user interactions and can justify and use this knowledge on the basis of shared information to advance the joint activity in reliable and productive ways. Ultimately, this is a design problem for artificial intelligence techniques (Brock and Trafton, 1999). Most aspects of reasoning about common ground are computationally challenging. Significant computational goals include representing and verifying what the user does and does not know about the joint activity, perceiving and solving coordination problems posed by the user, and identifying and using conventions (Alterman and Garland, 2001). Design issues raised by these goals include matters of system initiative, appropriate knowledge boundaries (reasoning about content vs. reasoning about functionality) and the design of face-to-face interactions about interactions, which are necessary to more fully support users' language use skills in second layer transactions.

References

Alterman, R. and Garland, A. (2001). Convention in Joint Activity. Cognitive Science, 25(4).
Brock, D. and Trafton, J. G. (1999). Cognitive Representation of Common Ground in User Interfaces. UM99 User Modeling: Proceedings of the Seventh International Conference. New York, NY: Springer-Verlag Wien.
Clark, H. H. (1996). Using Language. New York, NY: Cambridge University Press.
Norman, D. A. and Draper, S. W., eds. (1986). User Centered System Design: New Perspectives on Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., and Carey, T. (1994). Human-Computer Interaction. Reading, MA: Addison-Wesley.
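
As an illustration of the system-side representation discussed in Section 4.2, the following is a minimal sketch, not a method proposed in this paper, of how software might hold Clark's three parts of common ground and admit an event to the record only when a shared informational basis was actually presented to the user. All names (`CommonGround`, `SharedEvent`, `note_event`, `history`) are hypothetical and introduced only for this example, and treating a non-empty `basis` string as sufficient justification is a deliberate simplification.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEvent:
    """An event that was openly presented to the user (hypothetical type)."""
    action: str  # what was done, e.g. "open report.txt"
    basis: str   # the shared signal that justifies it, e.g. a window title


@dataclass
class CommonGround:
    """Hypothetical system-side image of the three parts of common ground."""
    # Part 1: what both sides are assumed to bring to the activity (conventions).
    initial: set[str] = field(default_factory=set)
    # Part 2: the current state of the joint activity.
    current_state: dict[str, str] = field(default_factory=dict)
    # Part 3: a record of what has been done so far.
    record: list[SharedEvent] = field(default_factory=list)

    def note_event(self, action: str, basis: str, **state: str) -> bool:
        """Admit an event only if a shared basis was presented for it."""
        if not basis:
            # Assumption: knowledge with no shared basis is not common ground.
            return False
        self.record.append(SharedEvent(action, basis))
        self.current_state.update(state)
        return True

    def history(self) -> list[str]:
        """Return the actions recorded so far, oldest first."""
        return [event.action for event in self.record]


ground = CommonGround(initial={"double-click opens a file"})
ground.note_event("open report.txt", basis="window titled report.txt",
                  open_file="report.txt")
ground.note_event("background autosave", basis="")  # never shown, so rejected
print(ground.history())  # ['open report.txt']
```

The linear `record` list is meant to mirror the third part of common ground, the part that the paper argues nonlinear interfaces tend to lose. A real system would need far richer reasoning about what the user has perceived than this sketch attempts.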
A Language Use Perspective on the Design of Human-Computer Interaction
Derek Brock
Navy Center for Applied Research in Artificial Intelligence
Naval Research Laboratory

Motivating question
How can user interfaces better support users?

Observations
• People use language (in all senses where meaning and understanding are needed) to do things with each other, e.g., to collaborate
• People’s language use skills are their most fundamental resource in all interaction contexts involving meaning and understanding
• Although people interact with computers to do things, people are also the authors of human-computer interaction designs

Implications
• The design of human-computer interaction is a representation problem that involves the designer’s meaning and the user’s understanding
• Accommodating the principles of how people use language is relevant as a framework for the design of human-computer interaction

What is language use?
Language use
– allows people to carry out joint activities
– is a collaborative form of joint action built on individual actions
– always involves at least one person’s meaning and another’s understanding
– depends on representation, i.e., the use of signs and symbols as signals
– is a cognitive science and a social science
– is not the study of linguistics