CMP EMBEDDED.COM

Automotive vision system recognizes road signs: Part 2 - Architecture and design tips
Pattern recognition and processors deal with the computational and I/O challenges of handling a video data stream.



Automotive DesignLine

Now that we have discussed the basic functions of traffic sign recognition in Part 1, we present an efficient software architecture that consolidates all the elements into a functional system.

Calling filter blocks and controlling DMA channels
Traffic sign recognition is performed on processor core 1 (core A, white, top center in the figure below). After initializing all the required interfaces, the processor starts to transfer an image via a video interface (PPI0) and the associated direct memory access (DMA) channel. This image is written into the external SDRAM memory (red) with the name "frame 0" (upper left). There is not enough space in the fast L1 on-chip memory (green) to store a complete picture, so an additional DMA channel reads the image, line by line, from the SDRAM and stores it in the internal L1 memory (uiL1_buffer_A_sub0, top green at left).

[Figure: block diagram of the software architecture. The left-hand side covers most of the discussion on this page; the right-hand side covers the short discussion at the end of this page.]

The processor can begin computing the Sobel edge detection filter as soon as the first three lines are resident in the L1 buffer. While this computation runs, the DMA channel transfers additional lines to a second area of the L1 memory (uiL1_buffer_A_sub1, below the sub0 memory in the diagram) for use in the next computation. The processor accesses the two memory areas alternately until it has processed the complete image or the image portion of interest.
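The alternating use of the two L1 areas can be sketched in plain C. This is only an illustration under several assumptions: `memcpy` stands in for the DMA channel (so nothing actually runs in parallel here), each sub-buffer holds exactly the three lines needed for one output line, and `WIDTH`, `fetch_window`, `sobel_line` and `sobel_ping_pong` are hypothetical names that do not come from the original design.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define WIDTH 8   /* hypothetical line length in pixels */

/* Stand-in for a DMA transfer of three image lines from SDRAM into L1.
   On real hardware the transfer would be started and then run in the
   background; here it is a plain, synchronous memcpy. */
static void fetch_window(uint8_t *dst, const uint8_t *frame, int top_row)
{
    memcpy(dst, frame + (size_t)top_row * WIDTH, 3 * WIDTH);
}

/* Sobel gradient magnitude, approximated as |gx| + |gy|, for one line. */
static void sobel_line(const uint8_t *w, uint16_t *out)
{
    const uint8_t *r0 = w, *r1 = w + WIDTH, *r2 = w + 2 * WIDTH;

    out[0] = out[WIDTH - 1] = 0;            /* border pixels stay zero */
    for (int x = 1; x < WIDTH - 1; x++) {
        int gx = (r0[x + 1] + 2 * r1[x + 1] + r2[x + 1])
               - (r0[x - 1] + 2 * r1[x - 1] + r2[x - 1]);
        int gy = (r2[x - 1] + 2 * r2[x] + r2[x + 1])
               - (r0[x - 1] + 2 * r0[x] + r0[x + 1]);
        out[x] = (uint16_t)(abs(gx) + abs(gy));
    }
}

/* Filters a frame of `height` lines; `edges` receives height - 2 lines. */
void sobel_ping_pong(const uint8_t *frame, int height, uint16_t *edges)
{
    static uint8_t l1_sub[2][3 * WIDTH];    /* models sub0 and sub1 */

    assert(height >= 3);
    fetch_window(l1_sub[0], frame, 0);      /* first three lines into sub0 */
    for (int y = 0; y < height - 2; y++) {
        int cur = y & 1;
        /* Queue the lines for the next output line into the other area. */
        if (y + 1 < height - 2)
            fetch_window(l1_sub[cur ^ 1], frame, y + 1);
        /* Compute from the area that is already filled. */
        sobel_line(l1_sub[cur], edges + (size_t)y * WIDTH);
    }
}
```

In this simplified form every window is fetched in full, so two of its three lines are transferred twice. A real implementation would more likely move larger strips per transfer and wait on a DMA completion flag before switching areas.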

Tip: Storing data and constant lists (Alpha LUT, LUT) in different sub-banks can reduce computing time, because two accesses can be served in the same processor cycle only if they go to different sub-banks (see the Blackfin® Processor Manual).

The results of the computation are written to a table (Sobel ROI) in the L2 on-chip memory (yellow). Subsequent reads from this memory require additional processor cycles. Before starting the Hough transformation, the processor starts the transfer of a new image ("Get new image," light blue). Again, writing to only two registers in the DMA controller is enough to ensure that the data for processing the next image will be available on time.

The Hough transformation is executed in the next step. The processor accesses the results of the previously executed Sobel filtering, a table with constants for the circle detection (Circle LUT, yellow) and the Hough space, which is located in the slow external SDRAM L3 memory (red, at left). Since the computed values have to be added to values in the Hough space, the processor must first read the memory location, add the new result to the value read, and then write the sum back. In this case, each read requires several processor cycles, but the subsequent write does not.
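The read-add-write pattern on the Hough space can be sketched as follows. The table contents, the size of the Hough space and the names `circle_lut`, `offset_t` and `hough_vote` are assumptions made for this illustration; the actual Circle LUT layout is not given in the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HS_W    16   /* hypothetical Hough space width  */
#define HS_H    16   /* hypothetical Hough space height */
#define LUT_LEN 8

typedef struct { int8_t dx, dy; } offset_t;

/* Hypothetical circle LUT: eight offsets roughly on a circle of radius 3. */
static const offset_t circle_lut[LUT_LEN] = {
    { 3, 0 }, { 2, 2 }, { 0, 3 }, { -2, 2 },
    { -3, 0 }, { -2, -2 }, { 0, -3 }, { 2, -2 }
};

/* Adds one edge pixel's votes to every candidate circle center. */
void hough_vote(uint16_t *hough, int ex, int ey, uint16_t weight)
{
    assert(hough != NULL);
    for (int i = 0; i < LUT_LEN; i++) {
        int cx = ex + circle_lut[i].dx;
        int cy = ey + circle_lut[i].dy;
        if (cx < 0 || cx >= HS_W || cy < 0 || cy >= HS_H)
            continue;                        /* candidate lies outside */
        uint16_t v = hough[cy * HS_W + cx];  /* read: the slow step in SDRAM */
        v = (uint16_t)(v + weight);          /* add the new result */
        hough[cy * HS_W + cx] = v;           /* write the sum back */
    }
}
```

When all edge pixels of a circle have voted, the cell at the circle's center collects the most votes, which is what the later blocks look for.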

Tip: Each write operation is supported by a writeback buffer. If there are enough cycles between write operations (e.g., those spent computing the subsequent results), the processor core does not wait for a write to complete but continues processing. Writing to external memory therefore appears to take only one processor core cycle.

Of course, this process could also be made more efficient through use of the DMA controller. For execution of the "Clustering" block (white, center), the data from the Hough space are read out line by line and made available in the L1 memory. Again, it is a DMA channel that performs this task. This filter block can be executed at full processor speed because only the fastest memory is being accessed. Since the contents of the Hough space are no longer needed afterward, the space must be reset to zero for the next sequence ("Clear Hough Space"). This task, too, can be performed by a DMA channel and does not burden the processor core.
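The line-by-line readout and the subsequent reset can be sketched as below. The article does not describe the clustering algorithm on this page, so a simple search for the strongest cell is used here as a stand-in; `memcpy` and `memset` take the place of the DMA transfers, and `peak_t` and `find_peak_and_clear` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HS_W 16   /* hypothetical Hough space width  */
#define HS_H 16   /* hypothetical Hough space height */

typedef struct { int x, y; uint16_t votes; } peak_t;

/* Scans the Hough space one line at a time through a small line buffer
   (standing in for L1), returns the strongest cell, then clears the space.
   If every cell is zero, the result is { -1, -1, 0 }. */
peak_t find_peak_and_clear(uint16_t *hough)
{
    uint16_t l1_line[HS_W];
    peak_t best = { -1, -1, 0 };

    assert(hough != NULL);
    for (int y = 0; y < HS_H; y++) {
        /* Stand-in for the DMA transfer of one line into L1. */
        memcpy(l1_line, hough + y * HS_W, sizeof l1_line);
        for (int x = 0; x < HS_W; x++) {
            if (l1_line[x] > best.votes)
                best = (peak_t){ x, y, l1_line[x] };
        }
    }
    /* Stand-in for the "Clear Hough Space" DMA job. */
    memset(hough, 0, sizeof(uint16_t) * HS_W * HS_H);
    return best;
}
```

The inner loop touches only the local line buffer, which mirrors the point made above: the computation itself never waits on the slow external memory.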

In the next step, the circle detector (white, center) accesses the results of the clustering and Sobel blocks and determines the radius and exact position of each circle. Since the positions of the possible traffic signs are now known, the signs can be fetched from the original image via a DMA channel. To do this, first one region with several traffic signs is transferred into the L2 memory (yellow). From this region, successive details from individual traffic signs are placed into the L1 memory (green).

Comparison patterns are needed for examining the traffic signs. Part of a pattern database is loaded into the L2 memory. From there, just like the signs to be identified, the comparison patterns are successively loaded into the L1 memory. The recognition block can then compare the located traffic signs with all patterns and make a list of recognized signs. Once this process is completed, processor core 1 (Core A) sends a message to processor core 2 (Core B, in the right side image) specifying which traffic sign was found. Processor core 1 then begins the computation of the second image ("frame 1").
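How a located sign might be compared against the loaded patterns can be sketched as follows. The comparison metric is not specified on this page, so the sum of absolute differences (SAD) used here is an assumption, as are the patch size and the names `sad_patch` and `best_match`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PATCH_PIXELS 16   /* hypothetical size of one sign patch */

/* Sum of absolute pixel differences between two patches. */
uint32_t sad_patch(const uint8_t *a, const uint8_t *b)
{
    uint32_t sum = 0;
    for (int i = 0; i < PATCH_PIXELS; i++)
        sum += (uint32_t)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Returns the index of the pattern closest to `sign`.
   `patterns` holds `count` patches stored back to back. */
int best_match(const uint8_t *sign, const uint8_t *patterns, int count)
{
    assert(count > 0);
    int best = 0;
    uint32_t best_score = sad_patch(sign, patterns);

    for (int i = 1; i < count; i++) {
        uint32_t s = sad_patch(sign, patterns + (size_t)i * PATCH_PIXELS);
        if (s < best_score) {
            best_score = s;
            best = i;
        }
    }
    return best;
}
```

A real recognizer would also need a rejection threshold so that a circle which matches no pattern well is not reported as a sign.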
