Building mobile and embedded consumer devices that can "see"
Eldad Melamed of CEVA provides some general guidelines for developing signal processing algorithms that will allow the use of real-time face detection applications on any mobile device.
In response to market demand for embedded vision capabilities, an industry alliance has been formed. Spearheaded by the market research firm BDTI, the Embedded Vision Alliance (EVA) consists of 16 initial members from the IC and embedded industries. Its mission is to "inspire and empower embedded system designers to incorporate vision capabilities into their products, by providing them with practical insights, information, and skills." The EVA hopes to facilitate the flow of high-quality information and insights on embedded vision technology and trends.
The emergence of the EVA validates the need for greater collaboration within the industry. However, much progress has already been made in improving the vision capabilities of electronic systems. One of the key requirements is a flexible processing architecture that can meet the considerable performance and power demands of mobile image detection and recognition features.
Much like the human visual system, embedded computer vision systems analyze and extract information from video, and they do so in a wide variety of products. In embedded portable devices such as smartphones, digital cameras, and camcorders, this high performance has to be delivered within tight limits on size, cost, and power.
Emerging high-volume embedded vision markets include automotive safety, surveillance, and gaming. Computer vision algorithms identify objects in a scene and, as a result, mark one region of an image as more important than the other regions. For example, object and face detection can be used to enhance the video conferencing experience, the management of public security files, content-based retrieval, and many other applications.
This paper presents an approach for real-time deployment of a face detection application on a programmable vector processor. The steps taken are general in the sense that they can be used to implement similar computer vision algorithms on any mobile device. Included is a method for cropping and resizing the image so that it is properly centered on a face (Figure 1).
The application can be used on a single image or on a video stream, and is designed to run in real time. Achieving real-time throughput for face detection on a mobile device requires appropriate implementation steps.
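As a rough illustration of what cropping and resizing around a detected face can involve, the following Python sketch computes a crop window centered on a face bounding box and then resizes a grayscale image with nearest-neighbor sampling. It is not the CEVA implementation; the function names, the margin and aspect parameters, and the (x, y, width, height) box format are all assumptions made for this example.

```python
def crop_around_face(img_w, img_h, face, margin=0.5, aspect=1.0):
    """Return (left, top, width, height) of a crop centered on a face box.

    face   -- hypothetical detector output: (x, y, w, h) in pixels
    margin -- extra border around the face, as a fraction of the face size
    aspect -- desired width / height ratio of the crop
    """
    x, y, w, h = face
    cx, cy = x + w / 2.0, y + h / 2.0  # face center

    # Grow the face box by the margin, then widen it to the requested aspect.
    crop_h = h * (1 + 2 * margin)
    crop_w = max(w * (1 + 2 * margin), crop_h * aspect)
    crop_h = crop_w / aspect

    # Shrink uniformly if the crop would not fit inside the image.
    scale = min(1.0, img_w / crop_w, img_h / crop_h)
    crop_w, crop_h = int(crop_w * scale), int(crop_h * scale)

    # Center on the face, then slide the window back inside the image.
    left = int(round(cx - crop_w / 2.0))
    top = int(round(cy - crop_h / 2.0))
    left = min(max(left, 0), img_w - crop_w)
    top = min(max(top, 0), img_h - crop_h)
    return left, top, crop_w, crop_h


def resize_nearest(pixels, out_w, out_h):
    """Nearest-neighbor resize of a grayscale image stored as a list of rows."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


if __name__ == "__main__":
    # A 40x40 face in a 640x480 frame yields an 80x80 crop around the face.
    print(crop_around_face(640, 480, (300, 200, 40, 40)))
```

A production version on a vector processor would more likely use bilinear or polyphase filtering for the resize step; nearest-neighbor sampling is used here only to keep the sketch short.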
Figure 1. CEVA face detection application
Challenges of face detection
While still image processing consumes a small amount of bandwidth and allocated memory, video can be considerably demanding on today’s memory systems.
At the other end of the spectrum, memory system design for computer vision algorithms can be extremely challenging because of the additional processing steps required to detect and classify objects. Consider a face pattern in a thumbnail of 19x19 pixels. There are 256^361 possible combinations of gray values (256 gray levels for each of the 361 pixels) for this tiny image alone, which implies an extremely high-dimensional space. Because of the complexity of face images, describing facial features explicitly is difficult; therefore, other methods based on statistical models have been developed. These methods treat the human face region as a single pattern, construct a classifier by training on a large number of "face" and "non-face" samples, and then determine whether an image contains a human face by analyzing the pattern of the detection region.
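To get a feel for the size of this search space, the short Python sketch below counts the distinct 19x19 images that 256 gray levels allow, that is, 256 raised to the power of 361. The variable names are illustrative only, and 8-bit grayscale pixels are assumed.

```python
import math

GRAY_LEVELS = 256      # assumption: 8-bit grayscale pixels
WIDTH = HEIGHT = 19    # thumbnail size from the text

num_pixels = WIDTH * HEIGHT               # 19 * 19 = 361 pixels
num_images = GRAY_LEVELS ** num_pixels    # exact count as a Python big integer

# Size of the count expressed in decimal digits and as a power of ten.
digits = len(str(num_images))
exponent10 = num_pixels * math.log10(GRAY_LEVELS)

print(f"pixels: {num_pixels}")
print(f"distinct images: a {digits}-digit number, about 10^{exponent10:.0f}")
```

The count is roughly 10^869, far too many patterns to enumerate, which is why learning a classifier from labeled samples is more practical than describing every valid face pattern explicitly.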