Ultra-low power neural network processor targets hearables - Embedded.com

Potential applications include active noise cancellation, voice pickup and spatial audio.

Less than two weeks after next-gen silicon arrived at GreenWaves HQ in Grenoble, the company was showing off advanced audio demos with its partners at Mobile World Congress.

The Gap9 processor, a successor to Gap8, which targets computer vision in IoT devices, is an ultra-low power neural network processor suitable for battery-powered devices. GreenWaves’ vice president of marketing Martin Croome told EE Times Europe that the company decided to focus Gap9 on the hearables market after Gap8 gained traction in this sector.

“We had a Tier 1 hearables manufacturer come to us and say, this is interesting, but it’s missing this, this and this,” he said. “Talking with them, we saw that they needed to do certain things. So during the design of Gap9 that meant we had someone to talk to about the decisions we made about the chip, which allowed us to focus on hearables as one of the core markets.”

Engineering samples of Gap9 at GreenWaves’ HQ (Source: GreenWaves)

The result is a tiny 3.7 × 3.7 mm chip optimized for advanced audio, which can fit into an earbud.

Potential applications include active noise cancellation (ANC), which has strict latency requirements: processing has to happen in the time between sound reaching the microphone on the outside of the headphone and reaching your ear. Croome said ANC is often a tricky trade-off between cancelling noise well in a particular band and cancelling it less well across a range of frequencies; eliminating high-frequency noise is often difficult, too. Various schemes exist, including adaptive ANC, which uses AI to detect the type of noise environment the user is in, then adapts the filter to compensate.
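To give a feel for how tight that window is, the budget can be estimated from the speed of sound. The sketch below is illustrative only: the 2 cm acoustic path is a hypothetical figure rather than a GreenWaves number, and the function names are invented for this example. The 768 kHz rate is the one quoted for the Orosound demo in this article.

```python
SPEED_OF_SOUND_M_S = 343.0  # dry air at roughly 20 °C


def anc_latency_budget_us(path_length_m: float) -> float:
    """Microseconds sound needs to travel path_length_m through air.

    This is the upper bound on processing time if the anti-noise is to
    arrive at the ear together with the noise it should cancel.
    """
    if path_length_m <= 0:
        raise ValueError("path length must be positive")
    return path_length_m / SPEED_OF_SOUND_M_S * 1e6


def sample_periods_in_budget(path_length_m: float, sample_rate_hz: float) -> float:
    """How many sample periods fit inside the latency budget."""
    return anc_latency_budget_us(path_length_m) * 1e-6 * sample_rate_hz


if __name__ == "__main__":
    path_m = 0.02  # hypothetical 2 cm from outer microphone to eardrum
    print(anc_latency_budget_us(path_m))
    print(sample_periods_in_budget(path_m, 768_000))
```

Under these assumptions, a 2 cm path leaves roughly 58 microseconds, or about 45 sample periods at 768 kHz, for the whole signal chain.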

Another potential application for Gap9 is voice pickup: when a user is speaking on the phone in a busy or crowded environment, the phone blocks out all voices except theirs. This is mostly a cloud application today, Croome said; although it is done on some PCs, it hasn’t made it to headsets yet. Then there is spatial audio, which makes sounds appear to come from a particular place. This requires an inertial measurement unit (IMU) on the headset so that when the user moves their head, the source of sound appears to move accordingly.

AI-based ANC

GreenWaves showed off several advanced audio demos on its ultra-low power chip at Mobile World Congress.

With partner Orosound (Paris, France), one demo showed hybrid ANC running on Gap9 inside the chip’s sound filtering unit (SFU). The demo was developed in a simulator and tuned on-chip in just four days, and Croome said it worked remarkably well. Orosound has developed AI-based ANC, operating at 768 kHz, which is selective and adaptive. Gap9 consumed 1.5 mW per channel. The partners are also working on implementing Orosound’s AI-based noise reduction algorithm on Gap9.

Partner Idun Audio (Copenhagen, Denmark) showed dynamic spatial audio on wireless headphones. The demo app allowed the user to click on speakers to move them around and hear the difference. The user can also move their head around and hear that sounds still appear to come from the same place. Processing in the headphones, as opposed to the smartphone, avoids the latency added by Bluetooth and enables high-quality dynamic spatial audio no matter what the playback device is. For this demo, Gap9 performed encoding (1.5 mW per channel at 44.1 kHz, 400 kbps) and decoding (1 mW per channel at 44.1 kHz, 400 kbps) as well as head tracking. Total consumption for Gap9 was 1.8 mW.
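The head-tracking step can be sketched in a few lines: the source stays at a fixed angle in the room, and the renderer subtracts the head yaw reported by the IMU to get the angle relative to the listener. The code below is a minimal illustration, not Idun Audio’s method. A real spatializer would apply head-related transfer functions; constant-power stereo panning stands in for that here, and the function names and the sign convention (positive angles to the listener’s right) are assumptions made for this example.

```python
import math


def relative_azimuth_deg(source_azimuth_deg: float, head_yaw_deg: float) -> float:
    """Angle of the source relative to the listener's nose, in [-180, 180).

    Both inputs are measured in the room's frame; positive is to the right.
    """
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


def constant_power_pan(azimuth_deg: float) -> tuple[float, float]:
    """Left and right gains for an azimuth clamped to [-90, 90] degrees.

    A stand-in for HRTF rendering: the squared gains always sum to 1.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    theta = (clamped + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)


def render_gains(source_azimuth_deg: float, head_yaw_deg: float) -> tuple[float, float]:
    """Gains that keep the source fixed in the room as the head turns."""
    return constant_power_pan(relative_azimuth_deg(source_azimuth_deg, head_yaw_deg))
```

If the source sits 30 degrees to the right and the listener turns 30 degrees to the right, the relative angle becomes zero and both ears receive equal gain, so the sound appears straight ahead.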

GreenWaves’ Gap9 chip was used for Idun Audio’s spatial audio demonstration (Source: Idun Audio / GreenWaves)

GreenWaves’ partner Cyberon (Taiwan) has a speech processing engine which can be used for keyword detection. This engine is phoneme-based, meaning it doesn’t need to be retrained for new keywords (just type them in). Croome said this demo consumed around 500 microwatts running continuously on Gap9 for voice activation.

Another partner, Segotia (Galway City, Ireland), is working on auditory attention decoding. This system takes an electroencephalogram (EEG), a multi-channel recording of the user’s brain waves, and uses AI to decode it to figure out which person in a room the user is trying to listen to. The idea is that this information could be used to boost sounds coming from a particular speaker and/or mute other speakers. According to Croome, this is an ideal application for Gap9’s combination of audio processing and neural network acceleration.

Segotia’s idea is to use an EEG to detect which speaker a person is trying to listen to, then adjust the audio accordingly (Source: Segotia / GreenWaves)

GreenWaves’ own demo at the show demonstrated voice ID-based speaker separation. The user provides a voice sample and the system makes an embedding of their voice, which is fed to a voice filter. The filter can then be tuned to pick up their voice only. Croome said the neural network that achieves this is “quite big,” at 8-10 MB of parameters, but it shows off the range of neural networks Gap9 can accelerate.

GreenWaves’ voice ID-based speaker separation system uses a relatively large neural network (Source: GreenWaves)

Sound filtering unit

How does GreenWaves achieve advanced audio applications at such low power? The key is the hardware sound filtering unit (SFU) introduced in Gap9, which provides stream-based, autonomous time-domain filtering. The SFU has multiple highly configurable hardware blocks, including 13 different filter patterns, that can be configured to form a data flow graph. It supports multiple graphs and dynamic graph parameter updates; GreenWaves has schemes for updating filter coefficients without glitching, which is important for ANC.
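As a rough software analogy for a stream-based filter graph with glitch-free updates, the sketch below chains second-order (biquad) sections and ramps coefficients linearly toward new values over a number of samples instead of switching them abruptly. This is not the SFU’s design or GreenWaves’ update scheme, neither of which the article describes; the biquad block, the `Biquad` and `run_chain` names, and the linear ramp are all assumptions chosen to illustrate the idea.

```python
class Biquad:
    """Direct form I second-order section with ramped coefficient updates."""

    def __init__(self, b, a):
        # Coefficient order: b0, b1, b2 (feed-forward), a1, a2 (feedback).
        self.coeffs = list(b) + list(a)
        self.target = list(self.coeffs)
        self.steps_left = 0
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def retune(self, b, a, ramp_samples):
        """Request new coefficients, reached linearly over ramp_samples."""
        self.target = list(b) + list(a)
        self.steps_left = max(1, ramp_samples)

    def process(self, x):
        if self.steps_left > 0:
            # Close an equal share of the remaining gap on every sample.
            self.coeffs = [c + (t - c) / self.steps_left
                           for c, t in zip(self.coeffs, self.target)]
            self.steps_left -= 1
        b0, b1, b2, a1, a2 = self.coeffs
        y = b0 * x + b1 * self.x1 + b2 * self.x2 - a1 * self.y1 - a2 * self.y2
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def run_chain(blocks, samples):
    """Push each sample through the blocks in order (a linear graph)."""
    out = []
    for x in samples:
        for block in blocks:
            x = block.process(x)
        out.append(x)
    return out
```

For example, a block built as `Biquad((1.0, 0.0, 0.0), (0.0, 0.0))` passes audio through unchanged; calling `retune((0.0, 0.0, 0.0), (0.0, 0.0), ramp_samples=4)` then fades a constant input of 1.0 through 0.75, 0.5, 0.25 and 0.0 rather than cutting it off in a single step, which is the kind of discontinuity a listener hears as a click.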

The architecture of GreenWaves’ multi-core RISC-V for Gap9 (Source: GreenWaves)

The overall effect of putting these filters into a hardware block is threefold. Latency is reduced (to 1.3 microseconds of structural latency for ANC), power is reduced compared to a software approach (the result is 1.3 mW per ANC channel) and the execution time becomes fully deterministic.

“There’s no indeterminism because [the SFU] has its own resources in it, it’s sitting between the interfaces and the memory, so it can actually process interface to interface, at which point it’s completely autonomous – it has its own memory for its coefficients,” said Croome. Determinism means tasks take a known number of clock cycles so designers can scale the clock appropriately to balance latency and power consumption.
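The clock-scaling argument comes down to simple arithmetic: if a task always takes the same number of cycles, the slowest clock that still meets a deadline is the cycle count divided by the deadline. The sketch below shows that relationship; the function names and the example numbers are hypothetical and are not Gap9 figures.

```python
def min_clock_hz(cycles_per_task: int, deadline_s: float) -> float:
    """Lowest clock frequency at which a fixed-cycle task meets its deadline."""
    if cycles_per_task <= 0 or deadline_s <= 0:
        raise ValueError("cycle count and deadline must be positive")
    return cycles_per_task / deadline_s


def task_latency_s(cycles_per_task: int, clock_hz: float) -> float:
    """Execution time of a fixed-cycle task at a given clock frequency."""
    if cycles_per_task <= 0 or clock_hz <= 0:
        raise ValueError("cycle count and clock must be positive")
    return cycles_per_task / clock_hz


if __name__ == "__main__":
    # Hypothetical task: 2,000 cycles that must finish within 10 microseconds.
    print(min_clock_hz(2_000, 10e-6))
    print(task_latency_s(2_000, 400e6))
```

With these made-up numbers the task needs a clock of at least 200 MHz. Running faster than that shortens latency but generally costs power, which is the balance the article refers to; without determinism the designer would have to budget for a worst case instead of a known cycle count.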

Target applications for the SFU are ANC, heavy duty filtering, sound spatialization, sound effects, etc.

Development boards for GreenWaves’ Gap9 chip are available now. Production qualification for Gap9 is expected in Q3 2022.

>> This article was originally published on our sister site, EE Times Europe.
