SUNNYVALE, Calif. – A group of nearly 200 engineers and researchers gathered here to discuss forming a community to cultivate deep learning in ultra-low-power systems, a field they call TinyML. In presentations and discussions, they openly struggled to get a handle on a still-immature branch of tech’s fastest-moving area in hopes of enabling a new class of systems.
“There’s no shortage of awesome ideas,” said Ian Bratt, a fellow in machine learning at Arm, kicking off a discussion.
“Four years ago, things were getting boring, and then machine learning came along with new floating-point formats and compression techniques—it’s like being young again. But there’s a big shortage of ways to use these ideas in a real system to make money,” Bratt said.
“The software ecosystem is a total Wild West. It is so fragmented, and a bit of a land grab with Amazon, Google, Facebook and others all pushing their frameworks… So how can a hardware engineer get something out that many people can use?” he asked.
An engineer from STMicroelectronics agreed.
“I just realized there are at least four compilers for AI, and the new chips won’t be used by the traditional embedded designer. So we need to stabilize the software interfaces and invest in interoperability – a standards committee should work on common interfaces,” the STM engineer suggested.
It may be too soon for software standards, said Pete Warden, a co-chair of the TinyML group and the technical lead of Google’s TensorFlow Lite, a framework that targets mobile and embedded environments.
“We blame the researchers who are constantly changing the operations and architectures. They are still discovering things about weights, compression, formats, and quantization. The semantics keep changing, and we have to keep up with them,” Warden said.
“Over the next few years, there’s no future for accelerators that don’t run general-purpose computation to handle a new operation or activation function, because two years from now it’s likely people will bring different operations to the table,” he added.
A Microsoft AI researcher agreed. “We are very far from where we think we should be, and we won’t get there in a year or two. This was the reason Microsoft invested in FPGAs” to accelerate its Azure cloud services. “We need to build the right abstraction layers to enable hardware innovation…and if there was an open source hardware accelerator, it might help,” he added.
“Maybe a compliance standard is the first step, so researchers get the same experience at the edge as in the cloud,” Bratt of Arm suggested.
“We need robust functional specs for whatever level you live in. If we have them at enough levels, it will give people an entry point to other layers, and this group is the best one to tackle defining them,” said Naveen Verma, a Princeton professor whose research focuses on AI processors-in-memory.