Teaching machines to learn like humans could help autonomous systems deal with unfamiliar environments - SRI


SRI is spearheading a way for autonomous systems, such as self-driving vehicles and drones, to effectively operate in evolving and adversarial environments such as war zones.


Autonomous systems are typically trained in high-fidelity simulations of environments such as warehouse interiors or city streets. Significant cost and effort go into making these training environments as realistic as possible, so that the data they generate will train the autonomous system to make the right decisions in the real world. However, this high-fidelity approach leaves the system vulnerable to novel situations and previously unseen observations, a challenge often referred to as the simulation-to-real gap.

To address this challenge with rapid autonomy transfer techniques that are robust against the quick and inevitable changes in dynamic environments, the Defense Advanced Research Projects Agency (DARPA) has launched a research program, Transfer from Imprecise and Abstract Models to Autonomous Technologies (TIAMAT), under the leadership of Alvaro Velasquez.

TIAMAT is specifically building toward flexible autonomy in defense applications, where autonomous systems cannot be expected to operate in controlled environments like those of warehouse robots or self-driving cars. Autonomy in defense must be able to handle surprises and shifts in the distribution of observations, such as encountering new kinds of vehicles with previously unobserved dynamics, navigating bomb-damaged buildings and rubble-strewn roads, and accommodating different lighting and weather conditions. Furthermore, an autonomous vehicle or aircraft must be able to withstand attempts by enemy forces to disrupt the system’s functioning.

By drawing inspiration from human learning, SRI is developing a new learning-based paradigm, FLASH, that seeks to replace data-hungry learning methods of today with the capability to create abstractions that can easily transfer to new environments. In this way, FLASH could provide the flexibility and resiliency needed in defense theaters.

“With FLASH, we aim to train autonomous systems to operate in unstructured and novel environments that are dynamic with active adversaries trying to fool the system,” said Susmit Jha, Technical Director of the Neuro-symbolic Computing and Intelligence (NuSCI) research group at SRI and FLASH principal investigator.

Bridging the sim-to-real gap

The novel methodology of FLASH — which stands for Functionality-based Logic-driven Anchors with Semantic Hierarchy — diverges from the dominant machine learning method of the last couple of decades. The conventional method involves training algorithms on massive amounts of data that aim to depict an environment as realistically as possible.

To take the example of recognizing a chair, the conventional method shows a model thousands of images of chairs, so the model memorizes that a “chair” is an object that usually has four legs connected to a flat horizontal plane, with another vertical plane attached.

Yet no matter how voluminous a training set may be, it will likely fail to capture the entire plausible spectrum of real-world chair types. When the model is set loose in reality, it may misidentify chairs without legs (such as recliners), chairs with three legs, or chairs with highly elongated backs. This inflexibility in apprehending the variability of the real world is an instance of the sim-to-real gap.

“It should be possible to create autonomous systems that can operate in diverse and novel environments with minimal retraining.” — Susmit Jha

SRI researchers plan to bridge the gap by having FLASH abstractly gauge what a chair is from a semantic perspective. FLASH will thus converge on the meaning or purpose of a “chair,” versus fixating on a defined, physical presentation. Semantically, a chair serves as a seat, which can also be a stack of rocks or a four-legged slab of wood.

“A ‘chair’ in this sense is better defined by its affordances and interaction with the agent, rather than as a labeled object in a large dataset,” said Jha. “Visual similarity is no longer the only cue, and the focus is on learning semantic abstractions that capture functionality.”
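
To make the contrast concrete, here is a toy sketch in Python. It is purely illustrative and is not SRI's FLASH implementation: the `SceneObject` fields, the `looks_like_chair` and `affords_sitting` functions, and every threshold are hypothetical names and values chosen for this example. It compares a fixed appearance template (four legs plus a backrest) with a check on whether an object could function as a seat for the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical description of an object; all fields are illustrative."""
    name: str
    leg_count: int
    has_backrest: bool
    surface_is_flat: bool
    surface_height_m: float   # height of the top surface above the ground
    supports_load_kg: float   # load the object can bear without failing


def looks_like_chair(obj: SceneObject) -> bool:
    """Appearance-based rule: matches only the 'typical' chair template."""
    return obj.leg_count == 4 and obj.has_backrest


def affords_sitting(obj: SceneObject, agent_mass_kg: float = 80.0) -> bool:
    """Function-based rule: can this agent sit on the object?

    The height range and default mass are assumed values for the sketch.
    """
    return (
        obj.surface_is_flat
        and 0.3 <= obj.surface_height_m <= 0.6
        and obj.supports_load_kg >= agent_mass_kg
    )


objects = [
    SceneObject("dining chair", 4, True, True, 0.45, 120.0),
    SceneObject("recliner", 0, True, True, 0.45, 150.0),
    SceneObject("three-legged stool", 3, False, True, 0.45, 100.0),
    SceneObject("stack of rocks", 0, False, True, 0.40, 300.0),
    SceneObject("glass coffee table", 4, False, True, 0.40, 20.0),
]

# Map each object name to (matches appearance template, affords sitting).
results = {o.name: (looks_like_chair(o), affords_sitting(o)) for o in objects}

for name, (by_look, by_function) in results.items():
    print(f"{name}: template match={by_look}, affords sitting={by_function}")
```

In this sketch the appearance rule accepts only the dining chair, while the functional rule also accepts the recliner, the stool, and the stack of rocks, and rejects the fragile table. A real system would have to learn such abstractions from perception rather than read them from hand-written fields.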

By developing a model that understands an object's likely function, rather than checking whether the object meets a strict definition of an object type, Jha said, it should be possible to create autonomous systems that can operate in diverse and novel environments with minimal retraining.

“If the system is going into a completely new environment, like a bombed building in an urban warfare situation, the notion of shelter, doors, and streets is going to be different than what the system ever trained on,” said Jha. “But if the system can abstractly generalize, like humans can, it can continue to operate effectively.” Furthermore, by having flexible, semantic concepts of the physical world, a FLASH-equipped autonomous system may be less brittle and more resilient to adversarial perturbations.

From big data to efficient abstractions

Machine learning's recent “big data” strategy has proved largely successful: as datasets have continued to grow in scope and scale, they have delivered performance improvements.

However, the trendline for the energy and the cost to run ever-larger training sessions is skyrocketing. The demand is becoming so high that tech companies are pushing for investment into new nuclear power plants. By using abstractions, the hope is that the “scaling law can be broken,” Jha said, “and we can show that with a much smaller amount of data, our model can transfer better to novel and even hostile environments.”

“We have high hopes for FLASH,” Jha continued. “The bigger and bigger data approach seems to have limits, and so by learning abstractions, we will be able to harness some of the remarkable abilities humans possess to quickly adapt to new contexts.”


Distribution Statement A: approved for public release, distribution is unlimited.