Loqui: A wizard ablation dialogue system project

The name Loqui (lo-kwee) is a Latin word meaning "I speak"; because the "I" in the case of an ablated wizard is neither the wizard nor the system, we like the alliterative allusion to Loki (lo-kee), the Norse god of mischief.

This page contains four sections: The project, People, Publications, and Resources Produced.

 

Columbia University · The National Science Foundation · The City University of New York

The project:

The Loqui project is funded by the National Science Foundation under awards IIS-0745369, IIS-0744904 and IIS-084966.

Automated telephone dialogue systems depend heavily on accurate transcription of the speech signal into text. As the performance of automatic speech recognition (ASR) decreases, dialogue system performance often falls off sharply. Our project seeks better dialogue strategies that are less dependent on accurate ASR and that degrade gracefully. Our novel methodology, wizard ablation, collects simulated human-system dialogues that vary in controlled ways. Our testbed application, the CheckItOut dialogue system, is modeled on a corpus of telephone transactions between patrons and librarians that we collected at New York City’s Andrew Heiskell Braille & Talking Book Library; the application has appropriately limited complexity and potentially broad social benefit. CheckItOut is built on the Olympus spoken dialogue system architecture and the RavenClaw dialogue manager developed at Carnegie Mellon University.
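To make "degrade gracefully" concrete at the turn level, the sketch below shows a purely illustrative back-off policy that accepts, confirms, or re-prompts depending on ASR confidence and how well the hypothesis matches the catalogue. The class, function names, and thresholds are our own placeholders, not CheckItOut or Olympus/RavenClaw code.

```python
# Hypothetical turn-level policy: accept, confirm, or re-prompt a book-title
# request depending on ASR confidence and how well the hypothesis matches the
# catalogue. Thresholds and names are illustrative only.
from dataclasses import dataclass

@dataclass
class TurnResult:
    asr_hypothesis: str      # 1-best ASR string for the caller's utterance
    asr_confidence: float    # 0.0-1.0 recognizer confidence
    best_match_title: str    # top catalogue title returned by voice search
    match_score: float       # 0.0-1.0 similarity of hypothesis to that title

def choose_action(turn: TurnResult,
                  accept_threshold: float = 0.80,
                  confirm_threshold: float = 0.50) -> str:
    """Return a system action that backs off gradually as evidence weakens."""
    evidence = min(turn.asr_confidence, turn.match_score)
    if evidence >= accept_threshold:
        # Strong ASR and a strong catalogue match: proceed without confirming.
        return f"ACCEPT: checking out '{turn.best_match_title}'"
    if evidence >= confirm_threshold:
        # Middling evidence: ask the caller to confirm the best match.
        return f"CONFIRM: did you ask for '{turn.best_match_title}'?"
    # Weak evidence: re-prompt rather than act on a bad guess.
    return "REPROMPT: could you repeat the title, please?"

if __name__ == "__main__":
    turn = TurnResult("the color purple", 0.62, "The Color Purple", 0.91)
    print(choose_action(turn))   # -> CONFIRM: did you ask for 'The Color Purple'?
```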

The Wizard of Oz

In traditional wizard-of-Oz studies, unsuspecting users interact with a human wizard “behind the screen,” providing data on how people behave with what they believe to be a machine. In ablated wizard studies, the wizard is restricted to a subset of the data available to the dialogue system, and wizard and system collaborate to produce responses to users. This allows us to model wizard behavior using system features. We study multiple wizards to identify the more successful dialogue strategies, and to learn models from the best wizard teachers. Our experiments show that wizards differ in how accurately they interpret ASR output, and that we can model the best wizards using a combination of features from ASR, voice search (a database query made with the ASR hypothesis), semantic parsing, and dialogue state.
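The following minimal sketch illustrates, under our own simplifying assumptions, what modeling wizard behavior from system features can look like: each training example pairs features available at decision time (ASR confidence, top voice-search score, parse success, dialogue state) with the action the wizard took, and a standard classifier learns to predict that action. The feature names, toy data, and scikit-learn setup are illustrative, not the project's actual models.

```python
# A minimal sketch (not the project's actual code) of learning a model of
# wizard behavior from features the system has at decision time.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Each training example pairs system-side features for one caller utterance
# with the action the wizard actually took. Feature names are illustrative.
examples = [
    ({"asr_conf": 0.92, "vs_top_score": 0.95, "parse_ok": 1, "state": "ask_title"}, "accept"),
    ({"asr_conf": 0.55, "vs_top_score": 0.70, "parse_ok": 1, "state": "ask_title"}, "confirm"),
    ({"asr_conf": 0.30, "vs_top_score": 0.20, "parse_ok": 0, "state": "ask_title"}, "reprompt"),
    ({"asr_conf": 0.85, "vs_top_score": 0.40, "parse_ok": 0, "state": "ask_author"}, "confirm"),
    ({"asr_conf": 0.20, "vs_top_score": 0.15, "parse_ok": 0, "state": "ask_author"}, "reprompt"),
    ({"asr_conf": 0.95, "vs_top_score": 0.90, "parse_ok": 1, "state": "ask_author"}, "accept"),
]

features, actions = zip(*examples)
vectorizer = DictVectorizer()        # one-hot encodes the state, keeps numbers as-is
X = vectorizer.fit_transform(features)
model = LogisticRegression(max_iter=1000).fit(X, actions)

# Predict what a "good wizard" would do for a new utterance.
new_turn = {"asr_conf": 0.60, "vs_top_score": 0.85, "parse_ok": 1, "state": "ask_title"}
print(model.predict(vectorizer.transform([new_turn]))[0])
```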

Highlights:

The project has generated two ablated wizard corpora that will be released after the project ends. The first is a corpus of approximately 4,200 wizard-caller turn exchanges in which callers request a book by title. The second consists of 913 full dialogues (20,422 user utterances) with 6 wizards and 10 callers; callers requested four books per dialogue, by title, author, or catalogue number. These datasets support a wide range of research and engineering goals.

The key to learning a machine-usable model of wizard behavior is the selection of an appropriate set of features, meaning data available to the system at decision time that characterizes the user utterance and the dialogue context. Such selection is non-trivial. Our current research applies a variety of dialogue-specific feature selection methods to feature sets much larger than those commonly used to learn dialogue strategies from corpora. To create more habitable dialogue, we have begun to move toward an architecture that integrates utterance interpretation and dialogue management in a way that profits more fully from the rich set of features we now use to model wizard behavior.
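As a hedged illustration of the feature-selection step described above, the sketch below ranks a large candidate feature set by mutual information with the wizard's action and keeps the top few. The data are fabricated placeholders, and the criterion (scikit-learn's mutual_info_classif) is only one of many dialogue-specific methods one might try; it is not necessarily the project's.

```python
# A minimal sketch of one feature-selection step: ranking a large candidate
# feature set by mutual information with the wizard's observed action.
# Feature names and data are fabricated placeholders, not project data.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
feature_names = [f"feat_{i}" for i in range(50)]   # stand-ins for ASR, voice-search,
n_turns = 200                                      # parse, and dialogue-state features
X = rng.random((n_turns, len(feature_names)))

# Fabricated labels that depend on two of the features, so the ranking has
# something real to find; in practice y would be the wizard's observed action.
y = np.where(X[:, 0] + X[:, 3] > 1.0, "accept", "confirm")

# Score each candidate feature by how much it tells us about the wizard's
# choice, then keep the top k for training a dialogue-strategy model.
scores = mutual_info_classif(X, y, random_state=0)
top_k = sorted(zip(feature_names, scores), key=lambda p: p[1], reverse=True)[:10]
for name, score in top_k:
    print(f"{name}\t{score:.3f}")
```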

 

People:

Principal Investigators:

 

Ph.D. Students:

 

Master's Students:

 

Programmers:

  • Kevin McInerney (2010)

  • Pravin Bhutada (2008-2009)

 

Undergraduate Students:

Publications

Resources Produced

 
