Artificial intelligence has been applied to everything from cybersecurity and financial management to human resources and self-driving cars, so it seemed only a matter of time before it could take over video surveillance duties. And while AI, machine learning, and neural networks have made some promising strides in this area, it’s not quite the slam dunk that it might seem.
An artificial intelligence robot might make a formidable chess opponent or a splendid accountant, but it wouldn’t be much of a movie-watching buddy. Machines struggle with moving pictures in part because, unlike humans, computer systems separate processing and memory, limiting their ability to connect current activity with what has already been learned.
For the intelligence community and the Department of Defense, this presents a couple of challenges. The first has to do with the hundreds of thousands of hours of surveillance footage collected by airborne drones, a volume that overwhelms the capacity of human analysts to view all the video, let alone analyze it. The other concerns the prospect of real-time video surveillance in which machines could unblinkingly watch endless feeds from CCTV and other cameras to identify suspicious activity outside a government building, inside a train station, or near a military post.
The Intelligence Advanced Research Projects Activity (IARPA), the research arm of the Office of the Director of National Intelligence, wants to tackle these challenges by pushing AI’s learning ability further toward a human-like cognitive realm, with a new program called Deep Intermodal Video Analytics, or DIVA. The program aims to advance artificial visual perception so that AI can both quickly comb through collected full-motion video and take over live monitoring of secure areas.
“There is an increasing number of cases where officials, and the communities they represent, are tasked with viewing large stores of video footage, in an effort to locate perpetrators of attacks, or other threats to public safety,” said Terry Adams, DIVA program manager. “The resulting technology will provide the ability to detect potential threats while reducing the need for manual video monitoring. The technology does not track the identity of individuals, and will be implemented to protect personal privacy.”
IARPA has selected six teams to work on new developments in identifying two primary types of behaviors: primitive activities (a person carrying an object or getting in or out of a car), and complex activities (a car arriving to pick a person up, two people exchanging an object, or a person carrying a particular object, such as a gun). But the program also intends to include different ambient settings and image sources. Phase One will work with visible light and fixed or limited pan-tilt-zoom cameras. Phases Two and Three will move into infrared and other light conditions while taking feeds from wearable or handheld cameras.
IARPA’s solicitation said it is looking to incorporate a variety of skills, including machine learning, deep learning, artificial intelligence, detection capabilities, 3D reconstruction from video, statistics and probability, and mathematics. The agency said it wants a scalable framework suitable for a cloud environment.