It's undeniable that machine learning has made enormous progress over the past few years: from amazing artificial intelligence accomplishments like defeating a top-ranking player at the ancient and complex game of Go, to simple everyday uses like auto-tagging personal photo collections. At the core of the most advanced algorithms behind these feats are artificial neural networks, technology's way of mimicking the human brain. But just how smart are these neural networks? Ever since my not-quite-two-year-old son started paying attention to the world beyond his mom and began learning, I have been in awe of the way his brain learns by making associations; by comparison, I wonder how much further machine learning has to go.
My toddler in an exciting moment of successful identification. He's not always as cooperative as an artificial neural network, but he's much cuter (Source: Gunn Salelanonda)
BabyX: The animated virtual infant
Apparently, I'm not the first to ask this question. Researchers at the University of Auckland have taken the comparison to the extreme by developing a highly realistic, intelligent toddler simulation.
The Laboratory for Animate Technologies webpage describes this simulation as an experimental vehicle incorporating computational models of basic neural systems involved in interactive behavior and learning. One of the intriguing features of this virtual baby is the way it is motivated to learn. For example, the software releases a simulated version of dopamine to make BabyX happy upon successfully identifying objects (because that's what every virtual construct wants — virtual dopamine). This makes for a highly interactive experience, including getting the virtual infant's attention and giving encouragement during learning sessions. In addition, the hyper-detailed and nuanced facial expressions make this psychobiological simulation eerily lifelike (and IMO more than slightly creepy).
BabyX v3.0 Interactive Simulation by Laboratory for Animate Technologies (Click Here to see a video. Source: Laboratory for Animate Technologies)
Comparing my toddler to the most advanced neural networks
While the aforementioned simulation is an intriguing accomplishment, it's still not even close to the real thing. When I teach my toddler new things (sometimes intentionally but often unintentionally), his reactions are much more varied. Sometimes he wants to show off a new skill or learned trait, while other times he makes it clear that he's “not our trained monkey.”
Maybe neural networks are better at identifying objects once they are trained, but the training itself seems to be far more efficient with my son (when suitably incentivized, say with animal crackers). For example, I can show him a picture of an animal he's never seen before and teach him the animal's name and the sound it makes. Usually, after getting it wrong a couple of times, he'll recognize that animal. After seeing five or six other images of the same kind of animal and being told that it's the same, he will, for the most part, be able to identify the entire category with all its variations: not just photographs and videos, but cartoon depictions, toys, and even stuffed caricatures.
My toddler doing some object recognition and some showing off for the camera (Click Here to see a video. Source: Gunn Salelanonda)
This is very different from training a neural network to identify animals, which typically requires millions of labeled images before the network starts to get it right.
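That sample-efficiency gap can be caricatured with a toy few-shot classifier: a nearest-centroid model that "learns" a category from five or six examples per class. This is a deliberately simple stand-in for illustration only (it is neither how deep networks train nor how a toddler learns), and all the data below is invented.

```python
import math

# Toy "few-shot" classifier: assign a new sample to the class whose
# centroid (the mean of a handful of labeled examples) is nearest.
# Feature vectors here are made-up 2-D points, purely illustrative.

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(d) / len(points) for d in zip(*points))

def classify(sample, labeled_examples):
    """labeled_examples: dict mapping label -> list of feature tuples."""
    centroids = {label: centroid(pts) for label, pts in labeled_examples.items()}
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))

# Five examples per class -- echoing the toddler's sample efficiency.
examples = {
    "cow":   [(0.9, 0.2), (0.8, 0.3), (1.0, 0.1), (0.85, 0.25), (0.95, 0.15)],
    "sheep": [(0.2, 0.9), (0.3, 0.8), (0.1, 1.0), (0.25, 0.85), (0.15, 0.95)],
}
print(classify((0.9, 0.2), examples))  # -> cow
```

Of course, real image recognition needs far richer features than two numbers per sample, which is exactly why deep networks end up needing so much more data.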
Another thing that infants have naturally, but machines have to strive for, is flexibility. Any hardware solution tailored to a specific algorithm or pre-trained network may be efficient, but it lacks the ability to adapt to new circumstances. With neural networks constantly evolving, growing deeper and adding more layers, a flexible solution is a must. At the other extreme, though, a completely open solution that can do anything is liable to be wasteful of resources. That's why a software solution implemented on an extremely efficient hardware architecture is, in my opinion, the best way to go. Like buying new shoes for little kids, a fine balance is required: a comfortable fit, with some room left to grow.
And then there's the issue of power consumption. The brain's capacity to identify images (like my kid and his uncanny ability to spot "tunnels" as well as anything that even looks like a "tunnel"), solve complex problems, and perform other tasks is unparalleled when it comes to energy efficiency. Comparisons between the human brain and machine intelligence, like this one, have estimated that deep learning machines may use around 50,000 times more energy to perform the same task (roughly 20 watts for the human brain versus about one megawatt for AlphaGo).
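The 50,000× figure follows directly from those two quoted power draws, assuming both contenders run for the same wall-clock time:

```python
# Back-of-the-envelope check of the quoted energy ratio: ~20 W for a
# human brain versus ~1 MW for the AlphaGo cluster. Both numbers are
# rough estimates taken from the comparison cited above.
brain_watts = 20
alphago_watts = 1_000_000  # one megawatt

ratio = alphago_watts / brain_watts
print(ratio)  # -> 50000.0
```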
Neural networks in embedded systems
One of the biggest challenges today is to bring the power consumption down to make this technology feasible inside battery-operated devices. Extremely efficient embedded processors can already use artificial intelligence to achieve outstanding results. Applications of AI, like these new Google projects, are becoming commonplace in everyday life.
Object recognition can already be performed on mobile devices with very high success rates. For example, in the video below you can see how the CEVA-XM4 vision processor runs a full working version of AlexNet, a large pre-trained deep convolutional neural network.
Using the CEVA Deep Neural Network (CDNN) on an FPGA development board, which runs at a tiny fraction of the speed of a production silicon SoC, the demo identifies an enormous variety of objects almost instantaneously. Running on silicon, it would respond twenty to thirty times faster! And all this is performed at extremely low power, enabling it to run on even the smallest battery-powered handheld devices.
CEVA CDNN Demo Running Full AlexNet on CEVA-XM4 Vision Processor (Click Here to see the video. Source: CEVA)
Will future AI be as smart as a toddler?
While today's artificial neural networks are very useful for a wide variety of use cases (from smart surveillance to autonomous vehicles), there is still room to grow.
What will the future look like when the learning phase can be implemented efficiently enough to run on portable devices as well? When will portable AI applications be able to learn new things quickly, from just a few examples, the way my toddler can? That future could hold some scary scenarios, as well as extremely exciting possibilities that may enable humankind to reach new heights.
Who knows, maybe by the time my son grows up, we'll have the answers to some of these questions…