As demand for artificial intelligence (AI) increases, chip makers strive to create more powerful and more efficient processors. The goal is to accommodate the requirements of neural networks with better and cheaper solutions, while staying flexible enough to handle evolving algorithms. At Hot Chips 2017, many new deep learning and AI technologies were unveiled, showing the different approaches of leading tech firms as well as budding startups. Check out this EETimes survey of Hot Chips for a good summary of the event focused on chips for AI data centers.
The future of AI is mobile
Some of the cool stuff that was revealed at Hot Chips included a deeper look at Google's second-generation tensor processing unit (TPU v2), which was developed to perform efficient neural network training. Also, Microsoft exhibited its Project Brainwave, a "soft," FPGA-based deep neural network (DNN) processing unit. These solutions are intriguing for heavy-duty data centers that perform cloud-based AI, but…
While data centers are important, the most interesting use cases for AI are mobile. Since not all mobile scenarios can rely on the cloud, some of the AI processing needs to be handled on-device or as edge processing. Similar to virtual reality, which has its own set of mobility challenges, AI must be untethered to unleash its full potential. In the rest of this column, we'll consider a few of the use cases that make it clear that the future of AI is mobile.
Smartphone photography is not what it used to be
Last year, I made a prediction that the future of smartphone photography would combine dual cameras and deep learning to generate enhanced images. Now that Samsung has come around, realizing that two cameras are better than one, I think it's safe to say that the prediction is coming true. Samsung's newly released Galaxy Note 8 flagship is the company's first smartphone to feature dual rear-facing cameras. Another development in this category is the arrival of smartphones that sport dual front-facing cameras to maximize the quality of selfies, like two of the new Asus Zenfone models. Of the six new models that Asus just released, five are equipped with either a front or rear dual-camera setup.
The Asus Zenfone 4 Selfie Pro uses two 12MP sensors to intelligently create one 24MP photo (Source: Asus)
Smartphone cameras have made giant strides forward in quality and effects, but there is still a lot more that can be done. Combining AI techniques like face recognition and tracking with advanced photo features like autofocus can result in amazing customizations. For example, you can teach your camera to identify your kids and automatically focus on them whenever they're in the frame. Alternatively, the same technology could automatically remove non-family members from the frame. Add to that a bokeh effect to blur the background, and you can capture amazing-looking pictures every time. This really redefines the concept of point-and-shoot!
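To make the idea concrete, here is a minimal sketch of the selection logic behind face-priority autofocus. The detection format, the confidence threshold, and the "family" label set are all assumptions for illustration; a real camera pipeline would get these boxes from an on-device face recognizer.

```python
# Illustrative sketch: picking an autofocus point from face-recognition
# results. Detections are (label, confidence, (x, y, w, h)) tuples; the
# labels and threshold below are hypothetical.

def pick_focus_point(detections, family=("kid_a", "kid_b"), min_conf=0.8):
    """Return the center of the largest confidently recognized family
    face, or None so the camera can fall back to its default autofocus."""
    best = None
    for label, conf, (x, y, w, h) in detections:
        if label in family and conf >= min_conf:
            area = w * h
            if best is None or area > best[0]:
                best = (area, (x + w // 2, y + h // 2))
    return best[1] if best else None

detections = [
    ("stranger", 0.95, (10, 10, 40, 40)),
    ("kid_a",    0.91, (120, 60, 80, 80)),  # largest family face wins
    ("kid_b",    0.88, (300, 50, 30, 30)),
]
print(pick_focus_point(detections))  # → (160, 100)
```

The same per-frame loop, run by a dedicated neural engine rather than the CPU, is what makes this kind of "teach the camera who matters" feature feasible without draining the battery.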
All this intelligent processing must take place on the device. In addition to privacy concerns, the speed with which we take photos, capturing fleeting moments, makes it unfeasible to process in the cloud. This means that the smartphones that take these photos must be able to perform deep learning tasks efficiently, without depleting the battery.
Voice interfaces would be smart to perform edge analysis
Since the advent of virtual voice assistants, voice interfaces have become huge. In smart speakers as well as in smartphones, most of the speech analysis is performed in the cloud. This, again, raises a host of issues such as privacy and latency, and even something as simple as being offline. As virtual assistants improve and become ubiquitous, users are growing accustomed to using their voice as a main interface. Performing edge processing keeps some of the functionality local and averts the problems of cloud processing, enabling a seamless and convenient user interface.
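One small piece of that edge processing can be sketched simply: before any speech is sent anywhere (or fed to a local recognizer), the device can run a cheap voice activity check so silent audio is dropped immediately. The frame layout and threshold below are assumptions for illustration, not any particular assistant's implementation.

```python
# Illustrative sketch of on-device gating for a voice interface: a
# short-term-energy voice activity detector that decides which audio
# frames are worth passing to the speech engine. Threshold is assumed.

def frame_energy(samples):
    """Mean squared amplitude of one audio frame (samples in [-1, 1])."""
    return sum(s * s for s in samples) / len(samples)

def voiced_frames(frames, threshold=0.01):
    """Indices of frames whose energy exceeds the threshold; the rest
    can be discarded locally, never touching the network."""
    return [i for i, f in enumerate(frames) if frame_energy(f) > threshold]

silence = [0.001] * 160                   # near-silent 10 ms frame
speech = ([0.3, -0.4, 0.5] * 53 + [0.3])  # energetic 160-sample frame
print(voiced_frames([silence, speech, silence]))  # → [1]
```

Real assistants use far more sophisticated keyword spotters, but the principle is the same: a tiny always-on computation runs locally, and only genuinely interesting audio wakes the heavier machinery.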
Deep learning also powers augmented reality
AI is also at the core of augmented reality (AR) functionality. As proven by the Pokémon Go craze, the ideal scenario for AR is to run on a mobile phone that can be used by anyone, anywhere. But, it also must run efficiently so that the device can still be used for other things, like — I don't know — phone calls? Since Pokémon Go first appeared, AR has been improving, and users have been anticipating the release of more and better applications.
At Apple's Worldwide Developers Conference (WWDC) earlier this year, the company announced its AR development kit, ARKit. It should be available on all devices running iOS 11, which will be released this year. In the meantime, Google beat them to the punch by releasing its own similar augmented reality platform, ARCore, made for Android. As opposed to Google's Project Tango, which requires hardware that isn't part of every phone, ARCore (see this introductory video) is a platform that could potentially run on any Android smartphone, which would make it the world's largest AR platform.
ARCore introduction video (Source: Google)
All smartphones will have a dedicated neural engine
Taking all of the significant benefits presented above into account, I feel confident enough to make a bold prediction: within a few years, most smartphones will be equipped with a dedicated neural engine. A few months ago, a slew of rumors hit the web that Apple is developing a chip, dubbed the Apple Neural Engine, dedicated to powering on-device AI. Soon, when the iPhone 8 is released, perhaps we'll find out whether the rumors have any basis. But even if this acceleration is delayed to a later model, Apple has already announced Core ML, a library that lets developers accelerate AI workloads using the device's existing engines.
More recently, at IFA 2017, Huawei announced that its new flagship processor would include a neural processing unit to accelerate AI computing. I think it's a safe bet that other companies will follow suit. We probably won't see a mobile version of the TPU in the upcoming Google Pixel 2, but — within a couple of years — I'm sure that Google, Samsung, and all the rest will include a dedicated deep learning engine in their smartphones and find many hot new features utilizing these engines.
Here is an example. User: "OK Camera, what's on the shirt of the person crossing the street?" Camera: "That's an NBA Warriors shirt; it says — 2020 Champs — last AI-free year." User: "What does that mean?" Camera: "That was the last season without AI-assisted shooting, which increased average shooting accuracy by 18.5 percent"…
Visit CEVA to find out how the CEVA Deep Neural Network toolkit can streamline development of embedded AI, and discover how CEVA's computer vision and deep learning embedded platform, the CEVA-XM6, is enabling AI to go mobile.