Tuesday, February 22, 2011

Learning to be a better human through a Turing test

Each year for the past two decades, the artificial-intelligence community has convened for the field’s most anticipated and controversial event—a meeting to confer the Loebner Prize on the winner of a competition called the Turing Test. The test is named for the British mathematician Alan Turing, one of the founders of computer science, who in 1950 attempted to answer one of the field’s earliest questions: can machines think? ...

Instead of debating this question on purely theoretical grounds, Turing proposed an experiment. Several judges each pose questions, via computer terminal, to several pairs of unseen correspondents, one a human “confederate,” the other a computer program, and attempt to discern which is which. ... Turing predicted that by the year 2000, computers would be able to fool 30 percent of human judges after five minutes of conversation, and that as a result, one would “be able to speak of machines thinking without expecting to be contradicted.”

Turing’s prediction has not come to pass; however, at the 2008 contest, the top-scoring computer program missed that mark by just a single vote. When I read the news, I realized instantly that the 2009 test in Brighton could be the decisive one. I’d never attended the event, but I felt I had to go—and not just as a spectator, but as part of the human defense. A steely voice had risen up inside me, seemingly out of nowhere: Not on my watch. I determined to become a confederate. ...

[Joseph] Weintraub’s program, [the first winner of the Loebner Prize,] shifting topics wildly and spouting non sequiturs and canned one-liners, came off as zany, a jokester, a much more “human” personality type. At least I used to think so—before I learned how easy this was to mimic.

As Richard Wallace, three-time winner of the Most Human Computer award (’00, ’01, and ’04), explains:
“Experience with [Wallace’s chatbot] ALICE indicates that most casual conversation is ‘state-less,’ that is, each reply depends only on the current query, without any knowledge of the history of the conversation required to formulate the reply.”
Many human conversations function in this way, and it behooves AI researchers to determine which types of conversation are stateless—with each remark depending only on the last—and try to create these very sorts of interactions. It’s our job as confederates, as humans, to resist them.

One of the classic stateless conversation types is the kind of zany free-associative riffing that Weintraub’s program, PC Therapist III, employed. Another, it turns out, is verbal abuse. ...

[A]rgument is stateless—that is, unanchored from all context, a kind of Markov chain of riposte, meta-riposte, meta-meta-riposte. Each remark after the first is only about the previous remark. If a program can induce us to sink to this level, of course it can pass the Turing Test.
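The stateless/stateful distinction described here can be made concrete with a small sketch. This is only an illustration: the function and class names and the canned replies are invented for this example, and none of it is taken from ALICE, MGonz, or PC Therapist III. A stateless responder is a pure function of the last remark; a stateful one keeps the conversation history and can refer back to it.

```python
def stateless_reply(last_remark: str) -> str:
    """Reply using only the most recent remark (hypothetical canned rules)."""
    text = last_remark.strip().lower()
    if "stupid" in text or "idiot" in text:
        return "Who are you calling names?"
    if text.endswith("?"):
        return "Why do you ask?"
    return "Whatever you say."


class StatefulResponder:
    """Keeps the whole conversation so a reply can refer back to it."""

    def __init__(self) -> None:
        self.history: list[str] = []

    def reply(self, remark: str) -> str:
        self.history.append(remark)
        if len(self.history) == 1:
            return "Tell me more."
        # Anchor the reply to the opening remark, not just the latest one.
        return f"You opened with {self.history[0]!r}. How does this connect?"


if __name__ == "__main__":
    bot = StatefulResponder()
    for remark in ["I missed my train.", "Are you a bot?", "Are you a bot?"]:
        print("stateless:", stateless_reply(remark))
        print("stateful: ", bot.reply(remark))
```

The stateless function returns the same answer to the same remark no matter what came before it, which is the "Markov chain of riposte" property; the stateful responder's answer depends on where the conversation started.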

Once again, the question of what types of human behavior computers can imitate shines light on how we conduct our own, human lives. Verbal abuse is simply less complex than other forms of conversation. In fact, since reading the papers on MGonz, and transcripts of its conversations, I find myself much more able to constructively manage heated conversations. Aware of the stateless, knee-jerk character of the terse remark I want to blurt out, I recognize that that remark has far more to do with a reflex reaction to the very last sentence of the conversation than with either the issue at hand or the person I’m talking to. All of a sudden, the absurdity and ridiculousness of this kind of escalation become quantitatively clear, and, contemptuously unwilling to act like a bot, I steer myself toward a more “stateful” response: better living through science.
--Brian Christian, Atlantic Monthly, on the art of human conversation

