Addressing the challenges of embedded analytics


Analytics are often touted as the solution to many problems across a variety of embedded applications such as surveillance, automotive, industrial, and even purpose-built high-performance compute servers. While a variety of processing solutions can run the many analytic algorithms that exist, it’s important that designers pick the technology that will be the most efficient and effective for their design. This is even more important in embedded analytics, where solutions are often extremely size and power constrained. In these embedded spaces especially, the real-time, math-intensive architecture of digital signal processors (DSPs) is proving to be an extremely efficient processing solution.

Embedded analytics are all around us. They’re in our cars and our places of work and in our homes. Most new automobiles are great examples of intelligent analytics systems. Whether helping people to parallel park or automatically accelerating and braking as part of an adaptive cruise control system, advanced driver assistance systems (ADAS) are becoming increasingly commonplace.

Figure 1. Advanced driver assistance systems (ADAS) rely heavily on embedded analytics.

Security and surveillance cameras, which have become increasingly advanced in their use of analytics, are another example of embedded analytics. They are now capable of running algorithms like trip zone detection, motion detection, people counting and tamper detection all within the camera itself. For even more advanced capabilities, these cameras can be connected to a networked video recorder (NVR) that can stitch images from multiple cameras together to present a more continuous view or run highly complex algorithms like facial recognition and object detection. Chances are that if you work in an office building, there are similar cameras protecting you. Less sophisticated networked cameras are now common for home monitoring, and while their analytics mostly run on a networked PC today, it won’t be long before advanced features like identification capabilities enter this space.

Even if you don’t have a car or a camera running analytics, you most assuredly have products that were either inspected or shipped using embedded analytics systems. Machine vision systems are widely used to help assemble and inspect products ranging from smart phones to fruits and vegetables.

Data growth and the need for edge processing
It’s not newsworthy to say that the amount of data traffic is increasing. Cisco’s popular yearly report on networking traffic is always full of amazingly quotable trends like “This year the world will use more than half as much web data as was used in the entire history of the world prior to this year.” The predictions of the increase in data traffic in Cisco’s report are simultaneously staggering and perfectly believable. In addition to the continuing demand for higher definition video, there is the rapid increase in the sheer number of connected devices that are all sending data of some sort or another.

The key technology challenges for all the data being generated are to store it and to make it useful.

Analytics are the key to discovering meaningful relationships and patterns buried in data and they can provide the ability to either facilitate making an intelligent decision or, in some cases, actually make a decision based on the data. Either way, by extracting useful information from the data, analytics provide a way to reduce network bandwidth by only transmitting the relevant information and not the entire data stream. Edge processing, where analytics are used near the sensors (edge) of the network to reduce the amount of data being transmitted, will be increasingly needed as storing all the data being generated will simply not be practical in our advancing connected world.
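The bandwidth argument can be made concrete with a minimal sketch of an edge filter, assuming a simple frame-differencing motion check. The frame representation (a flat list of grayscale pixel values), the thresholds, and the function names are all illustrative assumptions, not part of any particular camera or DSP product.

```python
# Illustrative sketch: frame format, thresholds, and names are assumptions.
from typing import Iterable, Iterator, List, Tuple

Frame = List[int]  # one grayscale frame flattened to pixel intensities (0-255)


def changed_fraction(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Fraction of pixels whose intensity changed by more than pixel_delta."""
    changed = sum(1 for a, b in zip(prev, curr) if abs(a - b) > pixel_delta)
    return changed / len(curr)


def edge_filter(
    frames: Iterable[Frame], min_fraction: float = 0.05
) -> Iterator[Tuple[int, float]]:
    """Yield (frame_index, changed_fraction) only for frames that show motion.

    Everything else stays at the edge, so only small event records
    travel over the network instead of the full video stream.
    """
    prev = None
    for index, frame in enumerate(frames):
        if prev is not None:
            fraction = changed_fraction(prev, frame)
            if fraction >= min_fraction:
                yield index, fraction
        prev = frame


if __name__ == "__main__":
    still = [10] * 100
    moved = [10] * 80 + [200] * 20
    stream = [still, still, moved, moved, still]
    # Five frames go in; only the two frames where the scene changed come out.
    print(list(edge_filter(stream)))
```

In this toy stream, three of the five frames produce no network traffic at all. A real deployment would use a far more robust detector, but the data-reduction principle is the same.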

However, there are several constraints on these edge processing applications that need to be considered when choosing a processor. These systems require the processing element to be near the data collection source. This criterion often sets hard limits on the size and power of a processor. In a machine vision system, for example, the processor has to be placed within the camera enclosure. Many of these cameras need to be quite small so that they can be easily placed into an elaborate and expensive machine automation system. The small enclosure places requirements on the physical size of the processor as well as the lens and other electronics. Additionally, the power consumed by the processor is important because imaging sensors are sensitive to heat. A processor that runs too hot will adversely affect the quality of the image.

Similarly, many systems using embedded analytics, such as industrial systems, have stringent reliability requirements. These systems must run continuously for long periods without failure, which requires a processor that can operate reliably in rugged conditions. Often these systems cannot use a cooling fan because of reliability concerns, further magnifying the importance of a low-power processor.


Most embedded systems that run analytic algorithms require a processor capable of delivering high performance. Many analytics algorithms are computationally intensive and require real-time processing, meaning the system must be able to analyze the input and provide the output consistently in a known amount of time. This often requires the processor to run a real-time operating system and to service interrupts with a low and deterministic latency. Additionally, the data must be moved quickly into and out of the arithmetic units of the processor, often with a known latency, to facilitate the real-time processing.

To summarize, embedded analytics processors have several requirements:

  • Small size
  • Low power
  • High reliability
  • High-performance computational processing
  • Real-time operating system

DSP solutions are ideal for adding analytics to the edge, as these processors are specifically designed for these types of application spaces and to meet all of the above requirements.

Small size. DSPs are designed for the embedded space and consequently pack a lot of performance into a very small package. There are popular DSP solutions available in packages as small as 13 x 13 mm that provide greater than 2.5 GFLOPS of processing. For edge designs that can fit a 21 x 21 mm package, there are DSP solutions that provide as much as 40 GFLOPS of performance, which is enough to run multiple channels of multiple analytic algorithms like motion detection.

Low power. DSPs are often the best choice in terms of performance per watt of power consumed. The 13 x 13 mm, >2.5 GFLOPS DSP mentioned above consumes only about three-quarters of a watt. Larger DSPs can deliver 10-15x this performance for around 3-5 W. The largest DSP suppliers have portfolios of scalable devices to deliver the amount of processing needed for a particular analytics requirement.

High reliability. DSPs are designed for products with long life spans and large continuous use cycles. Many DSPs are tested for 100,000 power-on hours and have been used for decades in automotive, satellite, defense, industrial and medical imaging applications. They have been field tested and proven to be high-reliability devices.

High-performance computational processing. Analytic algorithms are often computationally intensive, with mathematical operations at the heart of the algorithm. The DSP architecture has been optimized to maximize mathematical computation and throughput with many arithmetic units, specialized instructions for complex math operations, and a sophisticated data movement engine that can keep the arithmetic units running at high efficiency. It is this mathematical architecture that often makes DSPs the most efficient processor for running analytics-based algorithms.

Real-time operating system. Even if a processor has enough performance and throughput to handle the requirements of an embedded analytics application, it often will not be able to meet real-time performance criteria without the ability to run a real-time operating system (RTOS). In this case, real-time really refers to determinism, where the processor responds to an interrupt in a known and consistent fashion. This is very different from high-level operating systems (HLOS), where the time to service an interrupt can vary wildly from one call to the next. DSPs started in the world of real-time programming with real-time operating systems. While many of the more powerful DSPs today support a high-level OS like Linux, the DSP can also be run with an RTOS for embedded analytics applications that require deterministic processing.
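One way to picture the difference is to compare the spread (jitter) and worst case of interrupt-service latencies against a deadline. The sketch below assumes latencies have already been captured as a list of microsecond values. The sample numbers are invented purely for illustration and are not measurements of any real RTOS or HLOS.

```python
# Illustrative sketch: the latency samples below are made-up placeholders,
# not measurements from any real operating system or processor.
from typing import Sequence


def jitter_us(latencies_us: Sequence[float]) -> float:
    """Spread between the slowest and fastest observed response."""
    return max(latencies_us) - min(latencies_us)


def meets_deadline(latencies_us: Sequence[float], deadline_us: float) -> bool:
    """A real-time system is judged by its worst case, not its average."""
    return max(latencies_us) <= deadline_us


if __name__ == "__main__":
    rtos_like = [12, 13, 12, 14, 13]    # tight, predictable responses
    hlos_like = [15, 40, 18, 950, 22]   # usually fast, occasionally very slow

    for name, samples in (("RTOS-like", rtos_like), ("HLOS-like", hlos_like)):
        print(name, "jitter:", jitter_us(samples),
              "meets 50 us deadline:", meets_deadline(samples, 50))
```

The HLOS-like trace has a reasonable typical latency, yet a single outlier breaks the deadline. That worst-case behavior is what determinism is meant to rule out.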

The figure below shows a high-level block diagram of a DSP used for advanced embedded analytics. This DSP provides two high-performance DSP cores in a small 21 x 21 mm package that, when running at 1 GHz, provides 32 GFLOPS of processing capability at around 3.5 W.

Figure 2. High level block diagram of a DSP device used for advanced embedded analytics.
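As a rough worked comparison, the performance-per-watt figures implied by the numbers quoted above can be computed directly. This is a back-of-the-envelope sketch: the small device is quoted as "greater than" 2.5 GFLOPS at "about" three-quarters of a watt, so its result is approximate, and the helper name is an illustrative assumption.

```python
# Back-of-the-envelope sketch using only the figures quoted in the text.
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Computational efficiency: higher is better."""
    return gflops / watts


if __name__ == "__main__":
    small = gflops_per_watt(2.5, 0.75)   # 13 x 13 mm device (lower bound)
    large = gflops_per_watt(32.0, 3.5)   # two-core 21 x 21 mm device at 1 GHz
    print(f"small device: {small:.2f} GFLOPS/W")
    print(f"large device: {large:.2f} GFLOPS/W")
```

Under these figures the small device delivers roughly 3.3 GFLOPS per watt and the two-core device roughly 9.1 GFLOPS per watt.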

To see the benefits of using DSPs to process analytics, consider the following video analytics solutions from AllGoVision, a solution provider in the video analytics industry. AllGoVision has video analytics solutions deployed in numerous fields, including consumer and security segments, and has multiple video analytics algorithms running on various platforms, including platforms using a DSP for acceleration of video analytics and non-DSP platforms. Having the same company implement the same algorithm on both a DSP and a non-DSP platform provides insight into the advantages a DSP-based solution provides.

The figure below shows the architecture of a networked video recorder (NVR)/digital video recorder (DVR) design using a DSP to accelerate the video analytics. The NVR/DVR is capable of running multiple algorithms simultaneously, including intrusion detection algorithms (tripwire, trespass, tailgating, camera tampering), suspicious incident algorithms (left object detection, missing object detection, loitering, crowding) and retail/business intelligence algorithms (counting).

Figure 3. Architecture of AllGoVision’s networked video recorder (NVR)/digital video recorder (DVR) using a DSP for video analytics processing.

In this type of configuration, video analytics algorithms can be offloaded to a DSP. The host gets video from the IP camera and passes it via PCIe to the DSP memory. The DSP then runs vision analytic algorithms on the video data and sends the alerts and the alert-overlaid video back to the host.

Running a video analytics framework on the DSP makes it straightforward to accelerate video analytics functions there.

Below is example code illustrating how easy it is to move analytics processing from a host PC over to a more efficient DSP platform. This first example shows how the analytics are called when executed on the PC.

The code snippet below describes the sequence of code that is executed on the DSP. The role of the host PC is to send the captured video frames from the camera to the DSP using a message mailbox. The DSP returns the alarm-overlaid video back to the host PC using a message mailbox.
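As a minimal sketch of this mailbox exchange (not AllGoVision's code), the flow can be simulated on a single machine. Every name here is a hypothetical stand-in: the two mailboxes are modeled with Python `queue.Queue` objects, the "DSP" is a worker thread, and `toy_analyze` replaces the real analytics. An actual system would move frames over PCIe into DSP memory using the vendor's messaging API.

```python
# Simulation sketch: all names are hypothetical; queues stand in for the
# host<->DSP message mailboxes and a thread stands in for the DSP.
import queue
import threading
from typing import Callable, Dict, List

Frame = List[int]


def dsp_worker(to_dsp: queue.Queue, to_host: queue.Queue,
               analyze: Callable[[Frame], List[str]]) -> None:
    """DSP side: receive a frame, run analytics, return frame plus alerts."""
    while True:
        frame = to_dsp.get()
        if frame is None:          # sentinel: host has no more frames
            to_host.put(None)
            break
        to_host.put({"frame": frame, "alerts": analyze(frame)})


def run_host(frames: List[Frame],
             analyze: Callable[[Frame], List[str]]) -> List[Dict]:
    """Host side: post captured frames, then collect the annotated results."""
    to_dsp: queue.Queue = queue.Queue()
    to_host: queue.Queue = queue.Queue()
    worker = threading.Thread(target=dsp_worker, args=(to_dsp, to_host, analyze))
    worker.start()

    for frame in frames:
        to_dsp.put(frame)
    to_dsp.put(None)

    results = []
    while True:
        message = to_host.get()
        if message is None:
            break
        results.append(message)
    worker.join()
    return results


def toy_analyze(frame: Frame) -> List[str]:
    """Placeholder analytics: flag any frame containing a bright pixel."""
    return ["motion"] if max(frame) > 128 else []


if __name__ == "__main__":
    for result in run_host([[0, 10], [0, 200]], toy_analyze):
        print(result)
```

The host code never calls the analytics directly; it only posts frames and reads results. That separation is what lets the same analytics move between a PC and a DSP with little change to the calling code.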

This table shows some of the performance and load characteristics of this system:


The DSP’s efficiency at running video algorithms, relative to the host processor, increases the efficiency of the system. The added DSP can be used to boost overall system performance, or it can be combined with a smaller/slower host processor to achieve the same performance at a lower power and cost point. The current host + DSP system can run six features simultaneously and support three channels per DSP core. Figure 4 shows the performance in terms of cost per channel and power per channel of running the video analytics algorithms on various host processors and various DSPs. As this is showing cost/power per channel, the lower the number, the more efficient the solution. In all cases, the DSP solution is at least 2x better than the host processor in terms of cost and power, two keys for embedded analytics systems.

Figure 4. Efficiency of various processors to implement video analytics algorithms. The four processors on the left are host processors. The three on the right are DSPs. The DSP can process analytics at a lower cost per channel and lower power per channel.
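The per-channel metric itself is simple arithmetic. The sketch below uses the three channels per DSP core stated above and, purely as an assumption, the two-core, roughly 3.5 W device described earlier. The article does not say that this is the device used in the measured system, so the resulting number is illustrative only.

```python
# Illustrative sketch: pairing the 3-channels-per-core figure with the
# two-core ~3.5 W device is an assumption, not a measured result.
def per_channel(total: float, cores: int, channels_per_core: int = 3) -> float:
    """Divide a device-level total (watts or cost) across its video channels."""
    return total / (cores * channels_per_core)


if __name__ == "__main__":
    watts_per_channel = per_channel(3.5, cores=2)
    print(f"{watts_per_channel:.2f} W per channel across 6 channels")
```

The same function applies to cost: pass the device price as `total` to get cost per channel.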

The above example showcases the efficiency gained by an NVR/DVR solution that leverages a DSP for the video analytics. The same efficiencies can be gained at the edge by adding a DSP to a camera. With a DSP in the camera, many of the same video analytics algorithms that run on the NVR/DVR can run inside the camera, reducing the amount of data sent to the NVR/DVR. This allows the same NVR/DVR to process more channels and use its processing to run more advanced algorithms, increasing the functionality of the system. The great thing is that code development can be leveraged from the server to the edge with a scalable DSP platform. In this configuration, a single-core DSP at the edge runs analytics in the camera while multicore DSPs in the NVR/DVR run higher-level analytics. Algorithms can run on both platforms with minimal additional development time, providing high levels of software and development reuse.

In conclusion, DSPs are being used to provide analytics processing in embedded processing solutions in security and surveillance, industrial and factory automation, military and defense, and automotive vision applications because they provide an efficient, cost-effective solution to the need for highly reliable real-time processing at low power and small size.
