After first day on Jeopardy!, Watson is tied for lead

Last night (in North America), the long-awaited match between IBM’s “Watson” question-answering computer system and legendary Jeopardy! champs Ken Jennings and Brad Rutter began. A good part of this first program was devoted to an overview of the Watson system and its development, so only a few minutes were left for actual competition.

However, even in this brief introduction, Watson was impressive. In the first few minutes of the match, it performed so well that a runaway victory seemed likely, with the machine making a shambles of its human competitors. It did well even in categories such as song lyrics. For example, in response to the clue “Bang, bang, his silver hammer came down upon her head,” Watson correctly responded “Maxwell’s Silver Hammer.”

But then, after a commercial break, Watson showed its “human” side and made several gaffes. For instance, Watson responded “What is Harry Potter?” instead of “What is Voldemort?” in response to a clue about a Harry Potter episode. In another memorable blunder, Watson responded “What is finis?” to the clue “From the Latin for end, this is where trains can also originate” instead of the correct “What is a terminus?”

Perhaps the most telling example of Watson’s limitations came in a clue on 20th-century history. Ken Jennings rang in first with the incorrect response “What is the 1920s?” Watson then repeated the same wrong answer, a mistake that a human contestant, who can hear the other players, would never make. During some practice rounds, Watson was fed the other contestants’ responses, but this was not allowed in the official televised match.

One very interesting aspect of the match is that Watson’s running “scorecard” is displayed on the screen for television viewers. Changing in real time as Watson “thinks,” the display shows the level of confidence Watson has in each of its top three answers. Watson “rings in” only when it has a high confidence level (at least 70%) in its response.
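
The ring-in rule described above can be illustrated with a minimal Python sketch. Only the 70% threshold and the top-three display come from the description; the function names, the data layout, and the confidence values are hypothetical, and this is not a claim about how Watson is actually implemented.

```python
from typing import Dict, List, Optional, Tuple

# The 70% figure is taken from the text; everything else here is an assumption.
BUZZ_THRESHOLD = 0.70


def top_three(candidates: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return up to three (answer, confidence) pairs, highest confidence first."""
    ranked = sorted(candidates.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:3]


def ring_in(candidates: Dict[str, float],
            threshold: float = BUZZ_THRESHOLD) -> Optional[str]:
    """Return the best answer if its confidence meets the threshold, else None."""
    ranked = top_three(candidates)
    if not ranked:
        return None
    best_answer, confidence = ranked[0]
    return best_answer if confidence >= threshold else None


# Made-up confidence values, purely for illustration.
confident = {"answer A": 0.91, "answer B": 0.05, "answer C": 0.02, "answer D": 0.01}
unsure = {"answer A": 0.40, "answer B": 0.35, "answer C": 0.10}

print(top_three(confident))  # the three candidates a viewer would see on screen
print(ring_in(confident))    # rings in with "answer A"
print(ring_in(unsure))       # stays silent: None
```

Note that a rule like this guards only against uncertainty. A wrong answer held with high confidence still triggers a ring-in.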

Monday’s partial match ended inconclusively, with Watson and Brad Rutter tied for first place at $5000 and Ken Jennings trailing at $2000. To be continued!
