By now most everybody has heard of Watson, an A.I. built by IBM to compete on the Jeopardy! game show. If not, you should probably take a moment to learn a bit about it.
There’s a lot of debate about what Watson’s performance means. On one hand, IBM clearly created an amazingly powerful natural language based A.I. Jeopardy! represents an extreme challenge for people, let alone machine intelligence. The sort of questions posed in the contest require extreme linguistic skills and a deep understanding of cultural references. Winning against the best human players was far from an easy task. The competition focused on the skill humans are best at… and computers have traditionally fared poorly in such contests. On the other hand, Watson has absolutely no general intelligence. The DeepQA system it’s based on is merely an expert system. Watson’s domain was very broad, but required limited reasoning beyond language comprehension. It’s a fantastic leap forward, but doesn’t quite herald the arrival of our computerized overlords.
Given that, one thing is very clear about Watson’s victory: it was the most publicly visible demonstration of machine intelligence ever. Events such as the Loebner Prize or the series of human-computer chess matches happened more or less in the background. This one happened on national TV against popular champions. Whereas chess matches are battles of phenomenal logic and Turing Tests are essentially academic exercises, this was fast paced, exciting, and full of cultural relevance. It was an entirely different sort of contest for an entirely different audience. Turing Tests are purposefully done through chat windows and chess moves are enacted by solemn human assistants, but this was right out in the open.
Watson needed to be different from ALICE or Deep Fritz; it needed a face to show the world. Watson needed to do more than answer questions in the form of answers; it needed at least a few more “human” traits to show the audience. It needed a voice and a physical appearance. It even needed a bit of personality. One of IBM’s YouTube videos describes how they created all of these things.
Watson was clearly “excited” about certain categories. It was happy to answer questions correctly. It was sad when things went wrong. This was obvious without the video, because the designers did their jobs well. Hundreds of thousands of people will remember that more than the technical details of the accomplishment or the myriad products IBM will roll out in the wake of their success. Millions will one day interact with intelligences through faces whose designers were inspired by Watson’s animated version of an IBM trademark. Ever since we began dreaming of intelligent machines we’ve personified them somewhat, initially to ensure they made sense to science fiction readers and increasingly as part of real world user interfaces. If we assume that artificial intelligence will continue its advance, there will soon be a lot more faces in need of making. At this point, it’s not clear what they will look like.
Cues From History
Maybe history provides a good place to figure that out. Robots have long been envisioned with expressive faces, even when there wasn’t much of a mind behind them. There have always been some robots without faces, both in fiction and reality, but many of them were rather expressive. (Think R2-D2.) Only truly utilitarian devices, not intended for human interaction, are constructed without some sort of shell to give them a personable visage. It is extremely rare to see a fictional intelligent robot without a face, voice, and personality. There are about as many projects developing real world intelligence for the purpose of interacting through expressions as there are for developing autonomy. However, intelligence based solely in a computer was not always envisioned the same way.
Many early concepts of advanced computers were very similar to the mainframes that existed at the time, meaning there was little visual interface or interactivity. Computer based intelligence was envisioned without significant display of any sort. This can be seen in TV series such as Star Trek where the computer is merely a disembodied voice. It’s not uncommon in 1960s Sci Fi to see a computer communicating largely through dot matrix printers.
Disembodied Voices
Of course, for many people the archetypal image of an artificial intelligence is HAL 9000 from 2001: A Space Odyssey. HAL’s glowing red eye and smooth, synthetic voice made him seem more real… and helped him reach #13 on the American Film Institute’s list of the greatest movie villains. HAL’s eyes were present throughout the ship he monitored, heightening the sense of threat behind his character. Realistically, this is a useful feature for a computer designed to monitor a facility and is common with real life intelligences. It’s not fundamentally different from the Microsoft Kinect sitting on my TV right now.
In most of these early examples speech was the only intelligent interface. Unless the computer was saying something, or possibly blowing someone out an airlock, it was hard to tell what it was thinking. Regardless of what it did, an intelligence of this sort seemed more than inhuman… it was outright disconnected. Real A.I. developed to compete in Turing Tests tend to have this issue as well, as mastery of language is the key in these events. Essentially all of these tests work by dehumanizing the people participating in them, as opposed to tying their results to the creation of artificial faces. Intelligences designed primarily to converse with others are called chatterbots, and the blinking cursor in a chatterbot’s input field is in many ways just like the watchful sensors of a machine like HAL.
Virtual Assistants
In the last ten years, reasonably intelligent chatterbots have begun to proliferate. Using technology developed for Turing Test style competitions, like the ALICE technology developed by Richard Wallace for the Loebner Prize, these programs have offered a more personable way to interface with artificial intelligence. When these were brought to the personal computer by companies like Zabaware, the “brains” of chatterbots were quickly tied to the faces provided by animated characters. Many of these were courtesy of tools like Microsoft’s Agent system. Zabaware’s Ultra Hal Assistant helped popularize the idea that a chatterbot could read mail, assist in searches, and perform other simple tasks. Options ranged from cute to crudely human, and were never really accurate. While these Virtual Assistants did not become widely popular, they are still being developed and it is rare to see one without some sort of face.
Guile3D’s Virtual Assistant Denise is a newer example of these agents. Denise has a fairly realistic face with a set of simple expressions. Denise is a reasonably capable product with a variety of input options under development, including face recognition through the increasingly ubiquitous webcam “eyes” appearing in everyone’s homes. Denise has a voice as well, which can be swapped out using commercial voice packages. (I’ll be offering a review of Denise in the near future.)
Denise exemplifies a trend toward making more human avatars as chatterbot interfaces. There are several advantages to doing so, as humans automatically understand the expressions and actions of humanlike figures. Whereas Watson’s designers had to recreate happy and sad with color and position of swarming elements, Guile3D only had to render a smile and frown for Denise.
Of course, rare as it is, not all virtual assistants have faces, voices, or significant personalities. For instance, the Siri iPhone application merely uses text and voice. In its case, the device it runs on is in many ways its face. With Apple’s purchase of Siri, their legendary restriction of software interface design will likely keep it that way.
Online Chat Agents
Simple machine intelligence has been applied to phone trees for some time with fairly good voice synthesis, speech recognition, and simple expert systems at a variety of financial institutions and corporate service centers. These machines are designed to be as unobtrusive as possible in order to disarm callers’ angst over not speaking with a live person and, of course, they are only experienced as voices. When these machines went online, that had to change.
Web sites offer a perfect space to feature artificial sales and support bots. Without a whole lot of fanfare, companies have been introducing simple intelligence to the question and answer sections of their sites and in many cases giving it a face. While I have little statistical evidence to support it, these seem to be more popular in Europe. Chat agents are created by firms like DoYouDreamUp and CreateMyAssistant for a variety of sales and support sites, turning an F.A.Q. section into an expert system. Newer versions of these bots are designed to lead browsers into more traditional support tools, like live chat or ticketing systems. This development is currently bringing chat agents into more mainstream sites.
A good example of mainstream usage would be the virtual assistant at GoArmy.com, Sergeant Star. GoArmy is a site dedicated to the somewhat difficult task of recruiting for the United States Army. The Sergeant features a very complex animation set and user-specific personalization options. Based on the way he works, it seems that GoArmy expects users to interact with him multiple times. His designers seem to have put a lot of work into his image, which is important given the variety of prospective soldiers who are expected to chat, and maybe identify, with him.
Thoughts On The Future
Will all intelligent machines have avatars? Probably not.
However, there are already enough simple AIs in need of faces that a few businesses revolve around creating them. Some intelligences will continue to be molded to resemble realistic people. Some may be embodied by clearly fictional characters. Some may be humanoid whereas some will be more abstract. A good number will probably carry on a key element of their creators’ branding, like Watson does.
The important thing is that we’ve started creating intelligences complex enough that there is an advantage in giving them presence and expression. Now that we’ve started, there’s probably no stopping it.















