Like King Louie in The Jungle Book, artificial intelligence has to learn like people. Machine learning is surely a brilliant student, but it’s still a slow learner. Once trained to recognize patterns, analyze huge amounts of data, or interpret speech, machines can do the job at lightning speed, often better than humans can. But the training part of that equation can be a labor- and programming-intensive task, because machines still learn like machines: one thing at a time, often only after repeated instruction.
AI researchers are looking for ways to shorten the learning loop, and the Army, which plans to have soldiers and machines working side by side in the near future, has found a couple of ways to help close the circle. In short, machines can learn more like humans do by being taught more like humans are or, for that matter, the way dogs are.
The Army Research Laboratory (ARL), working with the University of Texas at Austin, recently demonstrated how researchers can teach a machine by providing feedback, both positive and negative, using an algorithm called Deep TAMER. In one demonstration, researchers needed only 15 minutes of human feedback to teach an intelligent agent to excel at Atari Bowling, a crusty 1979 game from the Pong era, but one that has proved troublesome for even state-of-the-art AI. Trained with Deep TAMER, using feedback like “good job” and “bad job,” the machine was capable of beating expert Atari players, ARL said.
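The idea of teaching by critique can be sketched in a few lines of Python. What follows is a toy illustration under stated assumptions, not ARL's Deep TAMER: the real system uses a deep neural network and a live human trainer, while this sketch uses a simple lookup table and a scripted stand-in for the human (the hypothetical `simulated_trainer`). The world, the names and the parameters are all invented for illustration. The agent keeps an estimate of how much praise or criticism each action would draw in each position, and picks the action it expects the trainer to like best.

```python
import random

ACTIONS = ["left", "stay", "right"]
STEP = {"left": -1, "stay": 0, "right": 1}


def simulated_trainer(position, action, target=3):
    """Hypothetical stand-in for a human saying "good job" (+1) or "bad job" (-1)."""
    new_position = position + STEP[action]
    moved_closer = abs(new_position - target) < abs(position - target)
    held_target = position == target and action == "stay"
    return 1.0 if moved_closer or held_target else -1.0


def train(episodes=200, size=7, lr=0.5, epsilon=0.2, seed=0):
    """Learn a table H that predicts the trainer's feedback for each (position, action)."""
    rng = random.Random(seed)
    H = {(s, a): 0.0 for s in range(size) for a in ACTIONS}
    for _ in range(episodes):
        pos = rng.randrange(size)
        for _ in range(10):
            if rng.random() < epsilon:
                # Occasionally try a random action.
                action = rng.choice(ACTIONS)
            else:
                # Otherwise take the action expected to earn the most praise.
                action = max(ACTIONS, key=lambda a: H[(pos, a)])
            feedback = simulated_trainer(pos, action)
            # Nudge the prediction toward the feedback actually received.
            H[(pos, action)] += lr * (feedback - H[(pos, action)])
            # Move, staying inside the line of positions 0..size-1.
            pos = min(size - 1, max(0, pos + STEP[action]))
    return H


def greedy_action(H, position):
    """Return the action the agent expects the trainer to approve of most."""
    return max(ACTIONS, key=lambda a: H[(position, a)])


if __name__ == "__main__":
    H = train()
    for position in range(7):
        print(position, greedy_action(H, position))
```

In this toy setup the agent never sees a score or a rule book; the only signal it gets is the trainer's approval or disapproval, which is the aspect of the approach the article describes.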
The researchers presented their findings in February at the Association for the Advancement of Artificial Intelligence (AAAI) Conference in New Orleans, where ARL team leaders talked about the importance of this kind of training in developing human-machine teams for search-and-rescue, surveillance and other military operations. In the field, they noted, humans can adapt to changing situations and improvise based on their training. AI agents, on the other hand, are at a loss if the situation isn’t what they’ve seen before.
“If we want these teams to be successful, we need new ways for humans to be able to quickly teach their autonomous teammates how to behave in new environments,” said ARL researcher Dr. Garrett Warnell. “We want this instruction to be as easy and natural as possible. Deep TAMER, which requires only critique feedback from the human, shows that this type of real-time instruction can be successful in certain, more-realistic scenarios.”
A key to furthering that kind of training would be the ability to converse beyond a good-dog, bad-dog paradigm, which is where another ARL project, this one with the University of Michigan, hopes to lay a foundation. The research team played a version of the game 20 Questions, in which players ask a series of yes-no questions such as “Does the thing have wheels?” or “Does the animal live in the mountains?” to reach a conclusion through a process of elimination.
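The process of elimination behind 20 Questions can be sketched as follows. This is a minimal illustration of the basic game only, not the ARL and University of Michigan method, which the article does not detail. The list of candidates, their attributes and the rule for choosing the next question (ask whatever splits the remaining candidates most evenly) are all assumptions made for the example.

```python
# Hypothetical knowledge base: each candidate maps yes/no attributes to answers.
CANDIDATES = {
    "truck":   {"has_wheels": True,  "is_animal": False, "lives_in_mountains": False, "flies": False},
    "drone":   {"has_wheels": False, "is_animal": False, "lives_in_mountains": False, "flies": True},
    "goat":    {"has_wheels": False, "is_animal": True,  "lives_in_mountains": True,  "flies": False},
    "eagle":   {"has_wheels": False, "is_animal": True,  "lives_in_mountains": True,  "flies": True},
    "dolphin": {"has_wheels": False, "is_animal": True,  "lives_in_mountains": False, "flies": False},
}
ATTRIBUTES = ["has_wheels", "is_animal", "lives_in_mountains", "flies"]


def best_question(remaining, attributes):
    """Pick the attribute whose answer splits the remaining candidates most evenly."""
    def imbalance(attr):
        yes = sum(1 for name in remaining if CANDIDATES[name][attr])
        return abs(2 * yes - len(remaining))
    return min(attributes, key=imbalance)


def play(answer_fn):
    """Ask yes/no questions until one candidate is left or the questions run out."""
    remaining = set(CANDIDATES)
    unused = list(ATTRIBUTES)
    asked = []
    while len(remaining) > 1 and unused:
        attr = best_question(remaining, unused)
        unused.remove(attr)
        reply = answer_fn(attr)
        asked.append((attr, reply))
        # Eliminate every candidate that contradicts the answer.
        remaining = {name for name in remaining if CANDIDATES[name][attr] == reply}
    return sorted(remaining), asked


if __name__ == "__main__":
    secret = "goat"
    guess, asked = play(lambda attr: CANDIDATES[secret][attr])
    print(guess, asked)
```

Each answer here is used only to cross candidates off a list. A purposeful conversation of the kind the researchers are after would also have to interpret each new question in light of everything asked and answered before it.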
The goal is to get past a simple question-and-answer session to a more productive conversation, both between humans and machines and among machines themselves, that can be applied to soldier-robot teaming. While AI assistants can give you the weather report or direct you to a restaurant, a “real, purposeful conversation, especially in complicated military environments, is different,” said ARL senior scientist Dr. Brian Sadler. “It requires the AI system to understand a whole sequence of questions and answers, and to handle every question or answer with consideration of what has been asked or answered before. Such computer algorithms do not yet exist, and the scientific theory for building such algorithms is not yet developed.”
The challenges ARL’s research takes on are among the subjects covered by Thomas Dietterich, a professor at Oregon State University, former president of AAAI, and the founding president of the International Machine Learning Society, in a wide-ranging recent interview in the National Science Review. Among other topics, Dietterich discussed machine teaching and reinforcement learning, as well as another practical issue for military users: user friendliness.
“An important goal of machine learning work,” he said, “is to make machine learning techniques usable by people with little or no formal training in machine learning.” That will be the ultimate goal of nearly every human-machine communication effort. As King Louie would say, “AI wanna learn like you-o-o…”