When a DSP beats a hardware accelerator

Embedded CPUs took off almost everywhere because they offer flexibility along with good performance, low power and, usually, much lower cost. Compared with a solution that couples a separate microprocessor or microcontroller to your custom hardware, switching to a design based on an embedded CPU was a no-brainer. But CPUs of any kind have limits. We can move our algorithms into software, but the potential complexity of those algorithms is unbounded. We can write the programs and they will run, but not necessarily in an acceptable time or within a reasonable power budget.


Source: CEVA

That’s why microprocessor makers quickly came up with the concept of hardware accelerators: hardware functions that perform commonly needed tasks, floating-point arithmetic for example, much faster than software running on the CPU could. The idea caught on and other accelerators started to appear, for cryptography, regular expression handling and graphics functions, to name just a few.

All of this works very well but sacrifices one important advantage of software-based solutions: ease of modification. Because an accelerator’s implementation is mostly hard-coded, it’s difficult to change. Accelerators may allow some limited tuning through register controls, but otherwise, if you need to fix a bug or change the algorithm, you have to redesign the hardware. Responding to field failures and evolving market demands becomes much more expensive.

What you really want is the best of both worlds – a way to accelerate algorithms while still being able to define those algorithms in software. Of course the range of all possible algorithms is infinite so there is unlikely to be one solution for all cases. But for a substantial set of very commonly used functions, DSPs can provide exactly this solution.

Consider almost any operation that must work on streaming data. Obvious examples can be found in audio processing, from filtering to PDM-to-PCM conversion to acoustic echo cancellation. Or think about stream ciphers such as SNOW 3G and ZUC (both used in LTE). In a signal processing context, think about channel estimation between base stations and cell phones. Channel estimation aims to adapt transmissions to current conditions for maximum reliability, and it requires complex matrix computation on received signals. More generally still, think about any application that can benefit from very wide parallelism, such as AES cryptography.
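To make the streaming case concrete, here is a minimal sketch of a sample-by-sample FIR filter, the kind of audio filtering mentioned above. The names (`fir_state`, `fir_init`, `fir_push`, `FIR_MAX_TAPS`) are hypothetical and not taken from any vendor library. The sketch returns the raw 64-bit accumulator, whereas a production filter would normally scale and saturate the result to the output sample width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIR_MAX_TAPS 64  /* illustrative upper bound, an assumption */

/* Hypothetical filter state: coefficients plus a circular delay line. */
typedef struct {
    int16_t coeffs[FIR_MAX_TAPS];   /* filter coefficients */
    int16_t history[FIR_MAX_TAPS];  /* most recent input samples */
    size_t  num_taps;               /* number of taps in use */
    size_t  head;                   /* index of the newest sample */
} fir_state;

/* Copy the coefficients and clear the delay line. */
void fir_init(fir_state *f, const int16_t *coeffs, size_t num_taps)
{
    assert(num_taps > 0 && num_taps <= FIR_MAX_TAPS);
    for (size_t i = 0; i < num_taps; i++) {
        f->coeffs[i]  = coeffs[i];
        f->history[i] = 0;
    }
    f->num_taps = num_taps;
    f->head = 0;
}

/* Consume one input sample and return one (unscaled) output sample. */
int64_t fir_push(fir_state *f, int16_t sample)
{
    /* Advance the circular buffer and store the newest sample. */
    f->head = (f->head + 1) % f->num_taps;
    f->history[f->head] = sample;

    /* Multiply-accumulate from the newest sample back to the oldest. */
    int64_t acc = 0;
    size_t idx = f->head;
    for (size_t k = 0; k < f->num_taps; k++) {
        acc += (int32_t)f->coeffs[k] * f->history[idx];
        idx = (idx == 0) ? f->num_taps - 1 : idx - 1;
    }
    return acc;
}
```

The inner loop is a chain of multiply-accumulate operations over a short window of recent samples. That is the pattern DSP datapaths are typically built around, and because it is ordinary C, the coefficients or even the filter structure can be changed with a software update.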

Streaming computation, complex math (matrix, floating point) and high levels of parallelism are all areas where a DSP shines and should be considered seriously as an alternative to a hardware accelerator. In many cases a DSP implementation will also be smaller than the hard-coded accelerator, reducing the unit cost of your product. As for power, the accelerator may draw a little less than the DSP implementation, but the DSP will still draw much less than a CPU-based equivalent. Better yet, if your acceleration functions don’t need to run at the same time, you may be able to consolidate several of them onto one DSP and eliminate the need for separate accelerators. For even more processing horsepower, you can use a multi-core DSP, just as you can use multi-core CPUs.



Most important, a DSP implementation is programmable, in C, just like your CPU core. You’ll need to do some things a little differently, optimizing for parallelism for example, but a good compiler and modeling simulator for the DSP should make this relatively easy. So you get all the advantages of bug-fixing and product upgradability without needing to change the underlying hardware. That means improved customer satisfaction and improved revenue streams. Not bad.
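As a rough illustration of what "a little differently" can mean, the sketch below writes the same dot product two ways in plain C. The function names are hypothetical. Whether a particular DSP compiler needs the second form is an assumption, since many compilers will unroll or vectorize the simple loop on their own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Straightforward version: a single accumulator, so every
   multiply-accumulate depends on the one before it. */
int64_t dot_simple(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * b[i];
    }
    return acc;
}

/* Parallel-friendly version: four independent accumulators give the
   compiler four multiply-accumulates per iteration with no dependency
   between them, which it may map onto parallel MAC units. */
int64_t dot_unrolled(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int64_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        acc0 += (int32_t)a[i]     * b[i];
        acc1 += (int32_t)a[i + 1] * b[i + 1];
        acc2 += (int32_t)a[i + 2] * b[i + 2];
        acc3 += (int32_t)a[i + 3] * b[i + 3];
    }
    /* Handle the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        acc0 += (int32_t)a[i] * b[i];
    }
    return acc0 + acc1 + acc2 + acc3;
}
```

Both functions return the same result for integer inputs. The difference lies only in how much independent work each loop iteration exposes, which is the kind of restructuring a DSP toolchain's profiler or simulator would help you evaluate.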

There’s another advantage: as a processor, a DSP can support multiple functions. Consider GNSS (global navigation satellite system) positioning, a function that benefits significantly from DSP-based computation. This is certainly a good feature to have in mobile devices, but there’s now also a boom in GNSS for fixed devices to simplify provisioning, updates and maintenance. If your device is already DSP-enabled, GNSS may be available from some vendors as a software add-on that can run in quiet periods when other functions are dormant. If you already had a hardware-based GNSS block, or were planning to add one, you can save yourself both area and power.

I’m not suggesting that DSP implementations can necessarily replace all your hardware accelerators. Some accelerator functions may not be a good fit for the strengths of a DSP. And some may fit within a certain range but not outside it; for example, your only option for a very large filter may still be a hard-wired implementation. But that leaves a lot of functions where a DSP comes close to an equivalent hardware accelerator on performance and power, may actually be better on cost, and offers far more flexibility than the hardware version. Worth considering.

This blog is the second in a series that started with “Why DSPs are suddenly everywhere”. Stay tuned for the third blog: “Decisions, decisions: Hardware accelerator or DSP?”.


Guy Givoni works with the CEVA Inc. Customer Solution Group as an Architecture and VLSI Expert. Guy brings with him 10 years of multi-disciplinary experience in software development, hardware design, micro- and macro-architecture development, and system architecture and integration. In his current position, Guy is dedicated to guiding CEVA’s customers toward peak competitive performance by improving their system design and integration, hardware implementation, and software-over-hardware design. Guy holds a B.Sc. in Computer Engineering from Bar-Ilan University and an MBA from Tel-Aviv University.

 
