The growth of interactive communication fueled by web technologies has created a wealth of naturalistic dialogic communication. Like many areas of natural language processing, the modeling of dialog has begun to have a large impact on both military and commercial applications. Most approaches and computational tools for dialog to date have focused on analyzing simple dialogic interactions between two utterances. We propose to examine how complex dialogic goals above the single turn pair (such as advice giving, argument, recommendation, and persuasion) are accomplished across a range of conversational settings, including face-to-face conversation, voice-only mobile settings, online chat, and other online forums. This research will focus on developing resources, tools, and computational models that allow us to analyze dialog above the level of adjacency pairs, and that will in turn contribute to realistic computational models that move beyond current simplified models of joint intention or shared plans toward an understanding of how higher-level goals in dialog are manifested.
The proposed EAGER aims to explore how techniques for language generation and dialog management in NLP can be incorporated into the narrative structures of games, providing novel approaches to game authoring that we hypothesize will eventually lead to more compelling and engaging games, appealing to a much wider segment of the population and usable for a much wider range of purposes. We propose to develop a prototype architecture for a new kind of game, a Social Outdoor Role Playing Exercise Game (SORPEG). We will use the new SORPEG architecture to implement a prototype of a novel physical-activity (PA) based social role-playing mystery game, Spy Feet, aimed at encouraging physical activity in young women and girls.
The main technical contribution of this project concerns generating dialog interactions and managing the integration and optimization of dialog and narrative goals in a role-playing game. We will design Spy Feet to run on mobile devices equipped with GPS, similar to recent Augmented Reality (AR) educational games. Existing AR games do not use expressive language generation, computational dialogue management, or procedural content selection techniques, so there has been little opportunity to explore how such techniques could lead to different game-play experiences and outcomes. Our technical aims are: (1) to build a natural language generation engine, Spy-Gen, aimed at producing a range of linguistic styles for characters in role-playing spy and mystery games; (2) to develop an architecture for managing and dynamically generating dialogue game narrative sequences (plot structures) that exploits Spy-Gen and can dynamically adapt some aspects of game play to the user and her environment; and (3) to prototype a new game, Spy Feet, that allows us to test out our ideas with users and refine them in a participatory manner.
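To make the intended division of labor concrete, the sketch below shows one way a style-parameterized generator and a narrative manager might interact. The class names, style parameters, template realizer, and step-count trigger are all hypothetical illustrations under our own assumptions, not a description of Spy-Gen or the SORPEG architecture.

```python
# Hypothetical sketch: a style-parameterized generator plus a narrative manager
# that adapts the next plot beat to the player's physical activity. Everything
# here (names, parameters, templates) is invented for illustration only.

from dataclasses import dataclass
import random


@dataclass
class CharacterStyle:
    """Linguistic-style parameters a generator could vary per character."""
    formality: float   # 0.0 = casual, 1.0 = formal
    verbosity: float   # 0.0 = terse, 1.0 = wordy
    directness: float  # 0.0 = hedged hints, 1.0 = blunt instructions


class SpyGenSketch:
    """Toy stand-in for an expressive generator: picks the template variant
    whose directness score is closest to the character's."""

    TEMPLATES = {
        "give_clue": [
            (0.9, "The package was last seen at {place}. Go there now."),
            (0.5, "You might want to take a look around {place}."),
            (0.1, "I heard someone mention {place}... just saying."),
        ],
    }

    def realize(self, act: str, style: CharacterStyle, **slots) -> str:
        variants = self.TEMPLATES[act]
        _, template = min(variants, key=lambda v: abs(v[0] - style.directness))
        return template.format(**slots)


class NarrativeManagerSketch:
    """Toy plot manager: adapts the next beat to the player's recent
    physical activity and current location."""

    def __init__(self, generator: SpyGenSketch):
        self.gen = generator

    def next_beat(self, style: CharacterStyle, player_location: str,
                  steps_since_last_beat: int) -> str:
        # Reward recent walking with a new clue; otherwise nudge the player to move.
        if steps_since_last_beat > 500:
            place = random.choice(["the old boathouse", "the clock tower"])
            return self.gen.realize("give_clue", style, place=place)
        return f"Keep exploring around {player_location} -- your contact only meets people on foot."


if __name__ == "__main__":
    manager = NarrativeManagerSketch(SpyGenSketch())
    informant = CharacterStyle(formality=0.3, verbosity=0.5, directness=0.2)
    print(manager.next_beat(informant, "the campus quad", steps_since_last_beat=800))
```

In a sketch like this, the same dialogue act ("give_clue") can be realized in different linguistic styles per character, while the plot manager decides when a beat fires based on the player's movement, which is the kind of coupling between generation, dialogue management, and game state the project proposes to explore.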
The human ability to use language flexibly is a hallmark of robust intelligence. Utterances are not containers of meaning to be handed back and forth; in interactive dialog, utterances emerge collaboratively, tailored to the common ground and specific context shared with specific partners. Language processing and behavior adapt spontaneously to express and ground individuals’ knowledge, needs, and goals, as well as to coordinate their joint actions. Evidence of coordination emerges in several ways: speakers and addressees converge on the same words when they believe they share a perspective or meaning; they can also entrain on syntactic form, speaking rate, pronunciation, and style (e.g., in directness, detail, and strategy). Adaptive processing can be complementary as well as convergent.
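As one deliberately simple illustration of how such lexical convergence might be quantified, the sketch below computes the overlap between the vocabularies two partners use. This is an illustrative proxy under invented assumptions (the stopword list and example turns are made up), not a measure proposed by the project.

```python
# Illustrative proxy only: Jaccard overlap of the (non-stopword) vocabularies
# used by two partners, as one rough way to quantify lexical convergence.

STOPWORDS = frozenset({"the", "a", "to", "and", "of", "you", "i"})


def lexical_overlap(turns_a, turns_b):
    """Jaccard overlap of the content-word vocabularies of two speakers."""
    def vocab(turns):
        return {w.lower().strip(",.?!") for turn in turns for w in turn.split()} - STOPWORDS
    vocab_a, vocab_b = vocab(turns_a), vocab(turns_b)
    if not vocab_a or not vocab_b:
        return 0.0
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)


# A pair that has settled on "the weird cone" as a referring expression shows
# higher overlap than a pair still describing the landmark in different terms.
early = lexical_overlap(["the tall pointy thing ahead"], ["you mean the cone?"])
late = lexical_overlap(["ok, walk past the weird cone"], ["got it, I see the weird cone"])
assert late > early
```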
In contrast to human-human dialog, interaction with spoken dialog systems is anything but flexible. Commercially available telephone dialog systems are highly constrained and constraining, at best allowing speakers some flexibility in what they can say in response to a branching series of menus presented with pre-determined messages. GPS navigation systems do not adapt online or learn. On the input side, research on dialog management has too often focused on how to reduce or ignore the variability in what users say, rather than using it to support natural and efficient communication with spoken dialog systems. On the output side, even state-of-the-art research systems that aim to adapt to different types of users typically hand-craft the system’s messages. New approaches to dialog management and speech generation are needed if spoken dialog systems are to fully exploit the richness and potential of human language use; such approaches would combine different types of adaptation, including priming, user modeling, and adaptation to the dialog history. Adaptive strategies can be global (applying to a particular user or user stereotype during all or part of a dialog) as well as local (dynamic reactive or proactive adjustments in dialog strategy motivated by the local dialog context). Understanding, modeling, and prototyping adaptation is crucial in order to create flexible, powerful, and more usable spoken dialog systems. Users need to be able to move easily through their environments while staying connected via hands-free spoken or multimodal dialog; such dialog technology should be responsive, flexible, and accessible to all.
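The sketch below illustrates, under invented assumptions, how global adaptation (keyed to a user model or stereotype) and local adaptation (reactive adjustments to the immediately preceding turns) might be combined into generation parameters. The field names, thresholds, and policy values are hypothetical and are not drawn from any existing system.

```python
# Minimal sketch of layered adaptation in a dialog manager: a global policy
# keyed to a user model plus a local adjustment driven by the last few turns.
# All fields and values are invented for illustration.

from dataclasses import dataclass, field


@dataclass
class UserModel:
    """Global information accumulated about one user across a dialog."""
    prefers_terse: bool = False
    misrecognitions: int = 0        # running count of understanding failures


@dataclass
class DialogContext:
    """Local state from the immediately preceding turns."""
    last_user_words: list = field(default_factory=list)
    last_turn_failed: bool = False


def choose_prompt_style(user: UserModel, context: DialogContext) -> dict:
    """Combine global and local adaptation into generation parameters."""
    style = {
        # Global: user- or stereotype-level adaptation, stable across the dialog.
        "verbosity": "low" if user.prefers_terse else "medium",
        "confirmations": "explicit" if user.misrecognitions > 2 else "implicit",
        # Local: simple lexical priming, reusing the user's own recent words.
        "reuse_user_words": context.last_user_words[-3:],
    }
    if context.last_turn_failed:
        style["verbosity"] = "high"          # slow down and spell things out
        style["confirmations"] = "explicit"
    return style


if __name__ == "__main__":
    user = UserModel(prefers_terse=True, misrecognitions=1)
    ctx = DialogContext(last_user_words=["left", "at", "the", "fountain"],
                        last_turn_failed=True)
    print(choose_prompt_style(user, ctx))
```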
This exploratory, interdisciplinary project will collect a large, parameterized ‘Walking-Around’ corpus in which a remotely located partner gives directions to a pedestrian; the variability captured by this corpus will inform the design of an initial prototype spoken dialog system. Parameters will include each partner’s degree of prior knowledge about the navigation environment, their common ground from previous interaction with each other, the degree of visual evidence available about the current state of the task, and other parameters selected to support both observational studies and hypothesis testing about variability and adaptive processing in human spoken dialog. The domain of collaborative navigation provides a realistic setting in which at least one partner in the pair is mobile. The corpus will be collected and analyzed in a cascaded fashion, enabling it to inform and provide criteria for a spoken dialog prototype that will eventually use (rather than ignore) the natural variability in human speech. The prototype’s platform will rely on off-the-shelf components as well as components developed for this project. The ultimate goal of this exploratory effort is to support the synthesis of entirely new, flexible, and robust spoken dialog systems that are capable of both adapting and being evaluated on-line (in real time). Key to that effort will be determining which potential adaptations are actually functional, that is, beneficial for a particular task or context, and eventually testing those adaptations in human-computer spoken dialog systems. This EAGER project combines methods from psychological data collection and linguistic analysis with hypothesis testing and computational modeling in order to advance understanding of the psycholinguistics of interactive dialog and the coordination of joint action, and to advance computational research that models spoken dialog interaction.
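As a concrete but hedged illustration, the sketch below shows how the metadata for one corpus session could be recorded using the parameters named above (prior knowledge of the environment, common ground from previous interaction, and visual evidence). The field names, value scales, and file paths are guesses for illustration, not the actual corpus design.

```python
# Hypothetical per-session metadata record for a 'Walking-Around'-style corpus.
# Field names and values are illustrative assumptions only.

from dataclasses import dataclass, asdict
import json


@dataclass
class WalkingAroundSession:
    session_id: str
    pedestrian_id: str
    director_id: str                  # the remotely located partner giving directions
    pedestrian_knows_area: bool       # prior knowledge of the navigation environment
    director_knows_area: bool
    partners_previously_paired: bool  # common ground from earlier interaction
    visual_evidence: str              # e.g. "none", "snapshots", "live_video"
    audio_path: str
    transcript_path: str


if __name__ == "__main__":
    session = WalkingAroundSession(
        session_id="S001",
        pedestrian_id="P17",
        director_id="D05",
        pedestrian_knows_area=False,
        director_knows_area=True,
        partners_previously_paired=False,
        visual_evidence="snapshots",
        audio_path="audio/S001.wav",
        transcript_path="transcripts/S001.txt",
    )
    print(json.dumps(asdict(session), indent=2))
```

Recording the parameters explicitly per session is what would allow later analyses (and a prototype trained or evaluated on the corpus) to condition on prior knowledge, common ground, and visual evidence rather than averaging over them.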
If interactive spoken dialog technology is to be broadly useful, then it cannot be aimed only at the “average user”, at the motivated user, or at a few stereotyped users. Not all users are willing or able to produce scripted utterances or to understand the same messages; future dialog systems should adjust to individual users dynamically. If this project is successful, its products and knowledge will inspire and support longer-term research benefiting a large and diverse population of users, both through an improved basic understanding of communication and through guidance for follow-on prototypes. The interdisciplinary methods and results will lay foundations for developing algorithms and architectures for language adaptation and generation in both stationary and mobile applications, including those accessible to special-needs users. The parameterized Walking-Around corpus will be made available to other researchers for additional impact on the design of mobile GPS navigation systems. The PIs take a strongly interdisciplinary approach to both research and training, mentoring students from groups that include those currently underrepresented in technical and scientific fields. Both Stony Brook and UCSC comprise ethnically, racially, and economically diverse communities from which to recruit these students, who will receive training in computer science, computational linguistics, experiment design, and psycholinguistics.
Computer games and other forms of interactive media have many potential benefits -- ranging from the educational to the economic. Games are used to educate in areas such as computer science, health care, and language learning, while game industry revenue is now larger than feature film box office receipts. But many subjects we would wish to teach, and many genres in which we would like to entertain, are fundamentally limited by current authoring approaches: in particular, by the amount of character dialogue that must be hand-authored. This project will use what is known about the creative work of human authors, together with advanced techniques from the field of "natural language generation", to explore a new approach to addressing this problem. In particular, it will integrate a new model of dialogue generation into an advanced tool for interactive story authoring and then evaluate the results when both expert and beginning authors work with the tool, giving us our first understanding of the promise of such techniques for enhancing the creativity of authors.
The need for a new approach to dialogue is pressing. For example, the forthcoming commercial game L.A. Noire has a script of 2,200 pages (roughly equivalent to 12 feature films). Producing this amount of dialogue is simply impossible for educational game producers and is nearing the limit of what commercial producers can manage, yet games can only continue to grow in sophistication by having more dialogue. This research works toward a solution to this dilemma, opening the door to further educational development and economic growth by enhancing author creativity through cutting-edge computer science. In addition, this project will help develop broader understanding of the field of Creative IT, providing a case study of how the knowledge of creative professionals and scientists can combine to produce social benefits that would be impossible for either working alone.