Loqui: A wizard ablation dialog system project

The name Loqui (lo-kwee) is a Latin phrase meaning "I speak"; because the "I" in the case of an ablated wizard is neither the wizard nor the system, we like the alliterative allusion to Loki (lo-kee), the Norse god of mischief.

This page contains four sections: The project, The Wizard of Oz, People, and Publications.

 

Columbia University

The City University of New York

 


 

The project:

Automated telephone dialog systems depend heavily on accurate transcription of the speech signal into text. When the system has low confidence in the automatic speech recognition (ASR) of a caller’s utterance, a typical dialog strategy has the system repeat its best guess and ask for confirmation. This leads to unnatural interactions and dissatisfied callers. Given current ASR capabilities, our project seeks better dialog strategies by learning them automatically from contrasting corpora and comparing the results across corpora. Our previous research has shown that dialog strategies can be modeled as a Markov Decision Process (MDP) and learned automatically from a corpus of transcribed dialogs. Here we apply similar learning techniques to discover which dialog corpora are the best “teachers.” Our novel methodology, wizard ablation, collects simulated human-system dialogs that vary in controlled ways.

Our testbed application, the CheckItOut dialog system, is modeled on a corpus of telephone transactions between patrons and librarians that we collected at New York City’s Andrew Heiskell Braille & Talking Book Library. This application has appropriately limited complexity and potentially broad social benefit. We based CheckItOut on the RavenClaw/Olympus architecture and components, an experimental framework for spoken dialog systems developed at Carnegie Mellon University.
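To make the MDP idea concrete, here is a minimal sketch of learning a dialog strategy from a corpus of turns. It is an illustration under stated assumptions, not the project's actual method: the state names (low_conf, high_conf), the action names (confirm, proceed), the reward values, and the functions estimate_model and value_iteration are all hypothetical. The sketch estimates transition probabilities and mean rewards by counting, then runs standard value iteration to pick the best action in each state.

```python
from collections import defaultdict

# Hypothetical toy corpus: each turn is (state, action, reward, next_state).
# All names and numbers are made up for illustration; none come from CheckItOut.
CORPUS = [
    ("low_conf", "confirm", -1.0, "high_conf"),
    ("low_conf", "confirm", -1.0, "low_conf"),
    ("low_conf", "proceed", -5.0, "failed"),
    ("low_conf", "proceed", 10.0, "done"),
    ("high_conf", "proceed", 10.0, "done"),
    ("high_conf", "proceed", 10.0, "done"),
    ("high_conf", "confirm", -1.0, "high_conf"),
]


def estimate_model(corpus):
    """Estimate transition probabilities and mean rewards by counting turns."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sums = defaultdict(float)
    for state, action, reward, next_state in corpus:
        counts[(state, action)][next_state] += 1
        reward_sums[(state, action)] += reward
    model = {}
    for key, nexts in counts.items():
        total = sum(nexts.values())
        probs = {s2: n / total for s2, n in nexts.items()}
        model[key] = (probs, reward_sums[key] / total)
    return model


def q_value(model, values, state, action, gamma):
    """Expected return of taking `action` in `state` under current values."""
    probs, mean_reward = model[(state, action)]
    return mean_reward + gamma * sum(p * values[s2] for s2, p in probs.items())


def value_iteration(model, gamma=0.9, tol=1e-6):
    """Return (policy, values) for the estimated MDP."""
    actions = defaultdict(list)
    for state, action in model:
        actions[state].append(action)
    # States never seen as a source (here "done" and "failed") keep value 0.
    values = defaultdict(float)
    while True:
        delta = 0.0
        for state, acts in actions.items():
            best = max(q_value(model, values, state, a, gamma) for a in acts)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    policy = {
        state: max(acts, key=lambda a: q_value(model, values, state, a, gamma))
        for state, acts in actions.items()
    }
    return policy, dict(values)


if __name__ == "__main__":
    learned_policy, learned_values = value_iteration(estimate_model(CORPUS))
    print(learned_policy)
    print(learned_values)
```

Running the same procedure on two different corpora and comparing the resulting policies is one simple way to picture what it means to ask which corpus is the better “teacher.”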

The Wizard of Oz:

Our project aims to exploit the problem-solving strategies people would use if their abilities and options were restricted to be more like a machine’s. In conventional Wizard-of-Oz studies, unsuspecting users interact with human wizards “behind the screen,” providing data on how humans interact with (what they believe to be) machines. Unlike a conventional wizard, an ablated wizard is restricted to some of the inputs or outputs available to the dialog system. For example, under one condition the wizard sees only the output of a speech recognizer instead of hearing the user’s voice, and thus “hears” only a transcription. Under further ablation, the wizard must choose dialog actions (and the associated utterances) from the system’s repertoire, but can combine them freely. In this condition, the wizard and the system collaborate to produce a response to the user. This allows us to examine and learn from various combinations of system functionality and human intelligence. The book-borrowing scenarios for the wizard interactions are realistic, and Heiskell Library patrons participate in the studies. Our collected dialogs will be made available to the research community.
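The ablation conditions described above can be pictured as a small configuration object that controls what the wizard perceives and which responses are allowed. This is only a sketch under assumptions: the class AblationCondition, the condition names, the action names in SYSTEM_ACTIONS, and the helper functions are hypothetical and are not taken from the CheckItOut implementation.

```python
from dataclasses import dataclass

# Hypothetical repertoire of system dialog actions (illustrative names only).
SYSTEM_ACTIONS = {"ask_title", "ask_author", "confirm_book", "offer_alternatives"}


@dataclass(frozen=True)
class AblationCondition:
    name: str
    hears_audio: bool    # False: the wizard sees only the ASR transcription
    free_response: bool  # False: the wizard must pick from SYSTEM_ACTIONS


# Three example conditions, from no ablation to ablated input and output.
CONVENTIONAL = AblationCondition("conventional wizard", True, True)
ASR_ONLY = AblationCondition("ASR input only", False, True)
ASR_AND_REPERTOIRE = AblationCondition("ASR input, system actions", False, False)


def wizard_view(condition, audio_id, asr_text):
    """Return what the wizard may perceive for one user turn."""
    if condition.hears_audio:
        return {"audio": audio_id, "text": None}
    return {"audio": None, "text": asr_text}


def response_allowed(condition, actions):
    """Check a wizard response, given as a list of dialog action names.

    Under output ablation every action must come from the system's
    repertoire, but actions may be combined freely.
    """
    if condition.free_response:
        return True
    return all(action in SYSTEM_ACTIONS for action in actions)
```

For instance, under ASR_AND_REPERTOIRE the wizard could combine confirm_book with offer_alternatives in one response, while an action outside the repertoire would be rejected.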

 

People:

Principal Investigators:

 

Ph.D. Students:

 

Programmer:

  • Pravin Bhutada

Undergraduate Students:

  • 2008-2009
    • Alex Dieudonne (Hunter College - CUNY)

    • Brandon Maister (Hunter College - CUNY)

    • Boris Mindzak (Columbia College, Columbia University)

    • Jason Fitzsimmons (Hunter College - CUNY)

    • Thomas Flynn (Hunter College - CUNY)

    • Chris Haueter (Columbia College, Columbia University)

    • Patricia Paunescu (Hunter College - CUNY)

    • Davis Quintanilla (Hunter College - CUNY)

    • Geoffrey Rice (Hunter College - CUNY)

    • Eliane Stampfer (Columbia College, Columbia University)

    • Allan Zelener (Hunter College - CUNY)

  • 2007-2008
    • Carnegie Castillo (Hunter College - CUNY)

    • Kenneth Cordero (Hunter College - CUNY)

    • Cem Isbilir (Hunter College - CUNY)

    • William Ng (Hunter College - CUNY)

    • Davis Quintanilla (Hunter College - CUNY)

    • Allan Zelener (Hunter College - CUNY)

 

Publications:

Passonneau, Rebecca, Susan Epstein, and Joshua Gordon. 2009. Help me understand you: Addressing the speech recognition bottleneck. To appear in AAAI Spring 2009 Symposium, Agents that Learn from Human Teachers. Stanford, CA, March 23-25, 2009.

Levin, Esther and Rebecca Passonneau. 2006. A WOZ variant with contrastive conditions. Proceedings of the Interspeech Satellite Workshop, Dialogue on Dialogues: Multidisciplinary Evaluation of Speech-based Interactive Systems. Pittsburgh, PA.



