Here's how you can implement both video compression and analytics on a single DSP. Topics covered include algorithm optimization, code optimization, and memory optimization.
By Nik Gagvani, PhD, and Steve MacLean, Cernium Corporation
Video analytics builds upon techniques in computer vision and pattern recognition to analyze and interpret video streams. It is currently used in a variety of applications including surveillance, traffic monitoring and retail. In an earlier article, we introduced the building blocks for video analytics. These consist of 1) segmentation, which detects changes in the scene; 2) classification, which qualifies those changes as objects; 3) tracking, which follows objects as they move through the scene; and 4) activity recognition, which interprets the tracking data to alert users to specific conditions. In this article, we discuss design and optimization strategies for a practical video analytics implementation running on a DSP.
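To make the four building blocks concrete, here is a deliberately small sketch of how they might chain together. It is an illustration only, not the algorithms used in any product: the function names, the thresholds, the "tall blob means person" rule and the single-object track are all assumptions made for the example, and frames are plain lists of grayscale rows.

```python
from dataclasses import dataclass


@dataclass
class Blob:
    x: int  # left column of the bounding box
    y: int  # top row of the bounding box
    w: int  # width in pixels
    h: int  # height in pixels


def segment(frame, background, threshold=30):
    """Segmentation: bounding box of all pixels that differ from the background."""
    changed = [(r, c)
               for r, row in enumerate(frame)
               for c, value in enumerate(row)
               if abs(value - background[r][c]) > threshold]
    if not changed:
        return []
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return [Blob(min(cols), min(rows),
                 max(cols) - min(cols) + 1, max(rows) - min(rows) + 1)]


def classify(blob):
    """Classification: toy rule (assumption), tall blobs are people."""
    return "person" if blob.h > blob.w else "vehicle"


def track(history, blob):
    """Tracking: append the blob centre to a single object's path."""
    history.append((blob.x + blob.w / 2, blob.y + blob.h / 2))
    return history


def recognize(history, label, max_frames=3, radius=1.0):
    """Activity recognition: flag a person who stays near one spot."""
    if label != "person" or len(history) < max_frames:
        return None
    recent = history[-max_frames:]
    x0, y0 = recent[0]
    if all(abs(x - x0) <= radius and abs(y - y0) <= radius for x, y in recent):
        return "loitering"
    return None


def run_pipeline(frames, background):
    """Run the four stages on each frame and collect (frame index, label, alert)."""
    history, alerts = [], []
    for index, frame in enumerate(frames):
        for blob in segment(frame, background):
            label = classify(blob)
            track(history, blob)
            alert = recognize(history, label)
            if alert:
                alerts.append((index, label, alert))
    return alerts
```

Feeding this sketch three identical frames containing a tall, stationary blob produces one loitering alert on the third frame. A practical implementation would replace every stage with something far more robust; the point here is only the order in which the stages hand data to one another.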
Design Considerations
Video analytics is commonly used in smart cameras for the purpose of automated surveillance. A smart camera uses analytics techniques to watch the scene and infer activities or behaviors of interest. Examples include the detection of a loitering person or an illegally parked vehicle. In addition to performing analytics, smart cameras are capable of streaming live video over IP networks. Video is compressed using either motion-JPEG (MJPEG) or MPEG-4 and streamed using a protocol such as RTP/RTSP.
A smart camera that both analyzes and streams video therefore runs two processes, one for analytics and the other for compression. Each process grabs raw video frames and processes them in real time, so the two can run independently. However, the situation is more complicated if analytics results need to be embedded into the compressed stream. In this case, the analytics results are provided as overlays indicating the location of objects of interest.
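The overlay case can be sketched as follows. This is one possible arrangement, assumed for illustration: analytics runs on the clean frame first, and the encoder then receives a copy with box outlines drawn in. The names `draw_overlay`, `process_frame`, `analyze` and `encode` are hypothetical, frames are lists of grayscale rows, and each box is assumed to lie fully inside the frame.

```python
def draw_overlay(frame, box, value=255):
    """Return a copy of frame with the outline of box = (x, y, w, h) set to value."""
    x, y, w, h = box
    out = [row[:] for row in frame]  # copy so the clean frame is left untouched
    for c in range(x, x + w):
        out[y][c] = value            # top edge
        out[y + h - 1][c] = value    # bottom edge
    for r in range(y, y + h):
        out[r][x] = value            # left edge
        out[r][x + w - 1] = value    # right edge
    return out


def process_frame(frame, analyze, encode):
    """Run analytics on the clean frame, then encode a copy carrying the overlays."""
    boxes = analyze(frame)           # hypothetical callable returning a list of boxes
    annotated = frame
    for box in boxes:
        annotated = draw_overlay(annotated, box)
    return encode(annotated)         # hypothetical callable standing in for the encoder
```

In this arrangement the encoder cannot start on a frame until analytics has finished with it, which is the coupling that makes the embedded-overlay case harder than two free-running processes.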
Video analytics is also used in smart video servers and digital video recorders (DVRs). These devices process multiple channels of video input, so they need to perform compression and analytics on multiple channels. Figure 1 shows the output from an analytics-enabled 4-channel server with overlays indicating the presence of people and cars.

Figure 1. Video analytics on a four-channel server. People and cars are indicated by yellow overlays. This is an optimized implementation running MPEG-4 compression and analytics on a single DSP.
A smart camera designer faces multiple challenges in implementing the video sub-system. Both video analytics and video coding are computationally demanding applications. Additionally, analytics can require a significant amount of memory to maintain historic information for purposes of tracking. There is also the practical need to reduce bill of materials (BOM) costs and maintain a compact form factor for the camera. These factors make it desirable to consolidate both analytics and compression into a single processor while maintaining a small memory footprint.
These goals can be achieved by using a processor with accelerators for compression or analytics processing. It is easy to find a processor with compression accelerators, but dedicated accelerators for analytics processing are uncommon. One solution to this problem is to use simpler analytics techniques which provide limited functionality. For example, the designer could limit themselves to motion detection, which cannot discriminate between people and shaking foliage. The designer looking for a high-quality analytics solution coupled with H.264 or MPEG-4 compression may turn to separate processors, one for analytics and another for compression.
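The limitation of plain motion detection is easy to see in a sketch. The function below uses simple frame differencing, which is one common way to detect motion and is assumed here for illustration; the name `motion_detected` and both thresholds are hypothetical. Frames are lists of grayscale rows.

```python
def motion_detected(previous, current, threshold=25, min_pixels=3):
    """Report motion when enough pixels change between two consecutive frames."""
    changed = sum(
        1
        for prev_row, curr_row in zip(previous, current)
        for p, c in zip(prev_row, curr_row)
        if abs(c - p) > threshold
    )
    return changed >= min_pixels


still = [[100] * 4 for _ in range(4)]

# A compact vertical blob, standing in for a person entering the scene.
person = [row[:] for row in still]
for r in (0, 1, 2):
    person[r][1] = 200

# Scattered flicker, standing in for shaking foliage.
foliage = [row[:] for row in still]
for r, c in ((0, 0), (1, 3), (3, 2)):
    foliage[r][c] = 40

person_alarm = motion_detected(still, person)
foliage_alarm = motion_detected(still, foliage)
```

Both `person_alarm` and `foliage_alarm` come out `True`: the detector only counts changed pixels, so it has no basis for telling the two scenes apart. Distinguishing them requires the classification and tracking stages that a fuller analytics solution adds.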
However, it is possible to create a no-compromise single-processor solution with the right choice of processor and analytics algorithms, along with an optimized implementation. The rest of this article describes the approach used in Cernium Edge, which is available as a licensable library that enables MPEG-4 or H.264 compression with analytics on a single processor. Figure 2 shows two possible configurations on a single DSP, such as the Texas Instruments DM6437. The first configuration is for a smart camera and operates at full D1 resolution for MPEG-4 coding with analytics running at 10 frames per second. The second configuration shows a 4-channel DVR implemented on the same processor, with 4 concurrent channels of analytics and compression.
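Running analytics at 10 frames per second while the encoder handles the full stream implies that analytics sees only a subset of frames. A minimal sketch of that idea follows. The 30 frames-per-second capture rate, the simple every-Nth-frame rule, and the names `runs_analytics` and `process_stream` are assumptions for illustration; the article does not say how the frames are actually selected.

```python
def runs_analytics(frame_index, capture_fps=30, analytics_fps=10):
    """Decide whether analytics should process this frame.

    Assumes capture_fps is an integer multiple of analytics_fps.
    """
    step = capture_fps // analytics_fps
    return frame_index % step == 0


def process_stream(frames, encode, analyze, capture_fps=30, analytics_fps=10):
    """Encode every frame, but hand only every Nth frame to analytics."""
    encoded, results = [], []
    for index, frame in enumerate(frames):
        if runs_analytics(index, capture_fps, analytics_fps):
            results.append(analyze(frame))  # hypothetical analytics callable
        encoded.append(encode(frame))       # hypothetical encoder callable
    return encoded, results
```

With the assumed rates, one second of video yields 30 encoded frames and 10 analytics results, so the analytics stage consumes roughly a third of the per-frame processing slots it would need at the full rate.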

Figure 2. Alternate configurations of analytics and encoding running concurrently on a single processor.