Loqui: A wizard ablation dialogue system project
The name Loqui (lo-kwee) is a Latin phrase meaning "I speak"; because the "I" in the case of an ablated wizard is neither the wizard nor the system, we like the alliterative allusion to Loki (lo-kee), the Norse god of mischief.
This page contains four sections:


The project:
Automated telephone dialogue systems rely disproportionately
on accurate transcription of the speech signal into
readable text. When the system has low confidence in
the automatic speech recognition (ASR) of a caller’s
utterance, a typical dialogue strategy requires the system
to repeat its best guess and ask for confirmation. This
leads to unnatural interactions and dissatisfied callers.
Given current ASR capabilities, our project seeks better
dialogue strategies. We learn strategies automatically from
contrasting corpora and compare the results across corpora. Our
previous research has shown that dialogue strategies can
be modeled as a Markov Decision Process (MDP), and can
be learned automatically from a corpus of transcribed
dialogues. Here we apply similar learning techniques to
discover which dialogue corpora are the best “teachers.”
Our novel methodology, wizard ablation, collects simulated
human-system dialogues that vary in controlled ways. Our
testbed application, the CheckItOut dialogue system, is
modeled on a corpus of telephone transactions between patrons
and librarians that we collected at
New York City’s Andrew
Heiskell Braille & Talking Book Library. This application
has appropriately limited complexity, and potentially
broad social benefit. We based CheckItOut on
the Olympus spoken dialogue system architecture and the RavenClaw
dialogue manager, both developed at Carnegie Mellon University.
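To make the MDP framing concrete, here is a minimal sketch of a toy dialogue MDP solved by value iteration. Every state, action, probability, and reward below is invented for illustration; none of these numbers come from the project's corpora, where such quantities would be estimated from transcribed dialogues rather than set by hand.

```python
# Toy dialogue MDP. All states, actions, probabilities, and rewards are
# hypothetical values chosen for illustration, not project data.
GAMMA = 0.95
TERMINAL = {"success", "failure"}

# TRANSITIONS[state][action] -> list of (probability, next_state, reward)
TRANSITIONS = {
    "low_conf": {
        # Act on the recognizer's best guess without checking it.
        "accept": [(0.5, "success", 10.0), (0.5, "failure", -10.0)],
        # Repeat the best guess and ask the caller to confirm (small cost).
        "confirm": [(0.9, "high_conf", -1.0), (0.1, "low_conf", -1.0)],
    },
    "high_conf": {
        "accept": [(0.95, "success", 10.0), (0.05, "failure", -10.0)],
        "confirm": [(1.0, "high_conf", -1.0)],
    },
}


def q_value(state, action, values):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + GAMMA * values[nxt])
               for p, nxt, r in TRANSITIONS[state][action])


def value_iteration(tol=1e-9):
    """Return (state values, greedy policy) for the toy MDP."""
    values = {s: 0.0 for s in list(TRANSITIONS) + sorted(TERMINAL)}
    while True:
        delta = 0.0
        for state in TRANSITIONS:
            best = max(q_value(state, a, values) for a in TRANSITIONS[state])
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    policy = {s: max(TRANSITIONS[s], key=lambda a: q_value(s, a, values))
              for s in TRANSITIONS}
    return values, policy


values, policy = value_iteration()
```

With these made-up numbers, the greedy policy asks for confirmation when recognition confidence is low and accepts the hypothesis when it is high. Changing the per-turn cost of confirming changes that trade-off.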

Our project aims to exploit the problem-solving strategies
people would use if their abilities and options
were restricted to be more like a machine’s. In
conventional Wizard-of-Oz studies, unsuspecting users
interact with human wizards “behind the screen,”
thus providing data on the way humans interact with
(what they believe to be) machines. Unlike a conventional
wizard, an ablated wizard is restricted to some of the inputs
or outputs available to the dialogue system. For example,
under one condition the wizard may see only the output of
a speech recognizer instead of hearing the human user’s
voice; thus the wizard “hears” only a transcription.
Under further ablation, the wizard must choose dialogue
actions (and the associated utterances) from the system’s
repertoire, but can combine them freely. In this state,
the wizard and the system collaborate to produce a response
to the user. This allows us to examine and learn from
various combinations of system functionality and human
intelligence. The book-borrowing scenarios for the wizard
interactions are realistic, and Heiskell Library patrons
participate in the studies. Our collected dialogues will
be made available to the research community.
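The ablation conditions described above can be sketched as a small data model. This is an illustrative assumption about how one might encode them, not the project's actual implementation; the class names, condition names, and helper functions are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Perception(Enum):
    AUDIO = "audio"        # wizard hears the caller's voice
    ASR_TEXT = "asr_text"  # wizard sees only the recognizer output


class Response(Enum):
    FREE = "free"              # wizard may respond however they like
    REPERTOIRE = "repertoire"  # wizard picks from the system's dialogue actions


@dataclass(frozen=True)
class WizardCondition:
    name: str
    perception: Perception
    response: Response

    @property
    def is_ablated(self) -> bool:
        # Any restriction on input or output counts as ablation.
        return (self.perception is not Perception.AUDIO
                or self.response is not Response.FREE)


# Hypothetical condition names covering the cases the text mentions.
CONDITIONS = [
    WizardCondition("conventional", Perception.AUDIO, Response.FREE),
    WizardCondition("asr_only", Perception.ASR_TEXT, Response.FREE),
    WizardCondition("asr_and_repertoire", Perception.ASR_TEXT, Response.REPERTOIRE),
]


def wizard_view(condition, audio_ref, asr_hypothesis):
    """Return what the wizard is shown for one caller turn."""
    if condition.perception is Perception.AUDIO:
        return {"audio": audio_ref}
    return {"text": asr_hypothesis}


def allowed_actions(condition, proposed, repertoire):
    """Keep only the proposed actions that the condition permits."""
    if condition.response is Response.FREE:
        return list(proposed)
    return [a for a in proposed if a in repertoire]
```

Under the most restricted condition in this sketch, the wizard may still combine several repertoire actions in one turn, which mirrors the "combine them freely" wording above.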
People:
Principal Investigators:
Ph.D. Students:
Masters Students:
Programmer:
Publications:
Hu, Jun; Passonneau, Rebecca; Rambow, Owen. 2009. Contrasting the interaction structure of an email and a telephone corpus: A machine learning approach to annotation of dialogue function units. To appear in Proceedings of the 10th SIGDIAL on Dialogue and Discourse. Queen Mary University of London, UK, 11-12 September, 2009.
Passonneau, Rebecca; Epstein, Susan; Gordon, Joshua; Ligorio, Tiziana. 2009. Seeing what you said: How wizards use voice search results. Proceedings of the 6th Workshop on Knowledge and Reasoning in Practical Dialogue Systems, International Joint Conference on Artificial Intelligence. Pasadena, CA, July 12, 2009.
Passonneau, Rebecca; Epstein, Susan; Gordon, Joshua. 2009. Help me understand you: Addressing the speech recognition bottleneck. AAAI Spring 2009 Symposium, Agents that Learn from Human Teachers. Stanford, CA, March 23-25, 2009.
Levin, Esther; Passonneau, Rebecca. 2006. A WOZ variant with contrastive conditions. Proceedings of the Interspeech Satellite Workshop, Dialogue on Dialogues: Multidisciplinary Evaluation of Speech-based Interactive Systems. Pittsburgh, PA.