Speed up small form factor machine vision with integrated GPU/X86 SoCs - Embedded.com

Machine vision technology is evolving quickly, fueled by dramatic gains in processing performance from heterogeneous architectures that pair FPGAs, DSPs, or GPUs with a microprocessor: the accelerator speeds up image processing functions, while the microprocessor handles data transfer and I/O.

The relatively recent arrival of PC-based ‘smart cameras’ that forgo conventional DSP- and FPGA-based processing platforms marks another significant advance in intelligent vision system technology, as the industry shifts away from specialized legacy processors and narrowly supported imaging software in favor of the more versatile x86 platform.

The arrival of x86 accelerated processing units (APUs) enabled another major leap forward for machine vision technology. An APU integrates a low-power x86 CPU and a programmable, discrete-class general-purpose graphics processing unit (GPGPU) on a single device, delivering the high-speed parallel processing that is essential for high-performance machine vision.

Placing a GPU core on the same die as the CPU lets the system offload computation-intensive pixel data processing from the CPU to the massively multicore GPU, which distributes the workload across its cores in parallel to improve the real-time performance of the whole system.

This can yield an order-of-magnitude increase in image processing performance versus serial task execution on a CPU alone, while retaining the simplified hardware architecture of a standard PC platform.
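The offload pattern described here is data parallelism: the same per-pixel operation is applied independently to different regions of a frame. The following is a minimal sketch in plain Python, not an implementation for any particular APU. It uses CPU threads as a stand-in for GPU cores, so it illustrates only how the work is partitioned and reassembled, not the speedup itself. The function names, the band count and the threshold level are all hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def threshold_rows(rows, level):
    """Binarize a band of rows: pixels >= level become 255, the rest 0."""
    return [[255 if p >= level else 0 for p in row] for row in rows]


def parallel_threshold(image, level=128, bands=4):
    """Split the image into row bands and threshold each band in its own worker.

    Threads stand in for GPU cores here (an assumption for illustration).
    """
    if not image:
        return []
    bands = max(1, min(bands, len(image)))
    step = -(-len(image) // bands)  # ceiling division: rows per band
    chunks = [image[i:i + step] for i in range(0, len(image), step)]
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # map() returns results in the same order as the input chunks
        results = list(pool.map(threshold_rows, chunks, [level] * len(chunks)))
    return [row for band in results for row in band]


# Synthetic 8x8 grayscale test pattern (hypothetical data)
image = [[(x * 32 + y * 4) % 256 for x in range(8)] for y in range(8)]
serial = threshold_rows(image, 128)
parallel = parallel_threshold(image, 128, bands=4)
print(parallel == serial)
```

Because each band depends only on its own pixels, the parallel result matches the serial one regardless of how many bands are used, which is the property that makes this kind of workload a good fit for a many-core GPU.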

Recently introduced integrated system-on-chip (SOC) platforms go a step further, collapsing the APU’s two-chip architecture (the APU and its companion I/O controller hub) into a single device by integrating the I/O controller hub at the silicon level.

With between 85 and 185 single-precision GFLOPS of compute performance, such an architecture can help eliminate the need for FPGAs or DSPs to accelerate image processing. And with a footprint of only 24.5 mm x 24.5 mm, the SOC reduces design complexity, helping machine vision system designers shorten design times and achieve aggressive form factor goals without sacrificing processing performance.
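To put the 85 to 185 GFLOPS range in perspective, it can be turned into a rough per-pixel compute budget. The sketch below assumes a 1920x1080 stream at 60 frames per second; that workload is an illustrative assumption, not a figure from this article, and the helper name `flops_per_pixel` is hypothetical. The result is a theoretical ceiling, since sustained throughput is typically well below peak.

```python
def flops_per_pixel(peak_gflops, width, height, fps):
    """Upper bound on floating-point operations available per pixel per frame."""
    pixels_per_second = width * height * fps
    return peak_gflops * 1e9 / pixels_per_second


# Assumed workload: 1920x1080 at 60 frames/s (not from the article)
low = flops_per_pixel(85, 1920, 1080, 60)
high = flops_per_pixel(185, 1920, 1080, 60)
print(f"{low:.0f} to {high:.0f} FLOPs per pixel at peak")
```

Under these assumptions the budget works out to roughly 683 to 1487 floating-point operations per pixel, which suggests why filtering and feature-extraction stages that once needed a dedicated FPGA or DSP may fit on the integrated GPU.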

Offering PC-caliber performance and application agility complemented by a robust ecosystem of industry-standard, x86-optimized software, applications and development environments, x86 machine vision systems are unlocking myriad development, deployment and management efficiencies.

The x86 architecture provides smooth interoperability with the growing IP-based factory infrastructure to help facilitate improved data management capabilities and provide tight integration with IT networks and x86-based distributed control systems. This introduces additional benefits for the applications hosted on these networks such as the ability to leverage the same applications for database management, security and remote management.

Collectively, these efficiencies can help yield leaner cost structures for integrators and end users alike, and give them an opportunity to overcome the hardware and software incompatibilities and cumbersome software maintenance processes that can result from deploying different processor architectures throughout a factory.

OpenCV/OCL for fast processing and code portability
To take full advantage of the parallel processing performance that heterogeneous architectures provide, machine vision system designers must write programs that scale, running on the widest possible range of systems without code modification. Open development tools like OpenCV and OpenCL are playing a major role in this effort.

The free, cross-platform OpenCV (Open Source Computer Vision) programming library has emerged as a key enabler for high-performance, parallel processing-driven computer vision applications.

It provides real-time responsiveness and advanced intelligence for modern smart camera systems spanning applications including automated inspection and measurement, security and surveillance, and image detection and identification. OpenCV’s machine vision-optimized algorithms are commonly used today to detect and recognize faces, identify objects, classify actions in videos, track camera movements and moving objects, and even extract 3D models of objects.

Meanwhile OpenCL, the open and royalty-free programming standard for maximizing parallel compute utilization on heterogeneous systems, gives machine vision system designers a cross-platform, non-proprietary solution for accelerating their applications across mainstream processing platforms including APUs, SOCs and multicore CPUs and GPUs.

OpenCL allows developers to focus on applications rather than chip architectures via a single, portable source code base, providing a unified tool chain and language to target all of the parallel processors currently in use.

This is done by presenting the developer with an abstract platform model that conceptualizes all of these architectures in a similar way, as well as an execution model supporting data and task parallelism across heterogeneous architectures.
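The data-parallel side of that execution model can be sketched as follows: a kernel is written once for a single element, and the runtime launches one instance (a work-item) per point in an index space, with each instance identifying its element by a global id. The code below is a plain-Python emulation of the idea, not the OpenCL API. The names `invert_kernel` and `enqueue_nd_range` are hypothetical and only mirror the concepts behind `get_global_id()` and `clEnqueueNDRangeKernel` in real OpenCL.

```python
def invert_kernel(gid, src, dst):
    """Kernel body: one work-item inverts one 8-bit pixel."""
    dst[gid] = 255 - src[gid]


def enqueue_nd_range(kernel, global_size, *args):
    """Emulate a 1-D index space by running the kernel once per global id.

    A real device would run these work-items in parallel; this loop is
    serial and exists only to show the programming model.
    """
    for gid in range(global_size):
        kernel(gid, *args)


src = [0, 64, 128, 255]   # hypothetical 8-bit pixel values
dst = [0] * len(src)
enqueue_nd_range(invert_kernel, len(src), src, dst)
print(dst)  # [255, 191, 127, 0]
```

Because the kernel never refers to the number of cores or the type of device, the same kernel source can in principle be scheduled on a CPU, an integrated GPU or a discrete GPU, which is the portability argument made above.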

The recent introduction of the OpenCV OCL module gives machine vision system designers the best of both worlds, combining OpenCV and OpenCL in a unified framework. Designers can use OpenCL to accelerate select OpenCV functionality on OpenCL-compatible devices, including APUs, SOCs and discrete GPUs, and so exploit the high compute throughput that these processors provide.

The OpenCV OCL module is designed to be easy to use: it doesn’t require prior OpenCL knowledge or experience to get started, and it lets designers integrate built-in functionality with custom OpenCL kernels within the OpenCV framework.

OpenCV OCL provides machine vision system designers with a rich repository of ready-to-use functions that aid the development and application of advanced vision algorithms. It includes utility functions, low-level vision primitives, and high-level algorithms.

The utility functions and low-level primitives provide a strong foundation for developing real-time vision algorithms that take advantage of OCL, and the high-level functionality includes sophisticated algorithms for achieving high-precision capabilities like face detection.

Scaling from low-power to high-performance
With the combination of OpenCV and OpenCL via the OpenCV OCL module, machine vision system designers are afforded an elegant, non-proprietary programming platform to accelerate parallel processing performance across a wide range of compute-intensive systems, from PC-based smart cameras to compact vision systems and machine vision servers.

The ability to develop and maintain a single, portable source code base that can be applied to the SOCs and APUs within these systems helps developers achieve significant programming efficiency gains while preserving the value of their resource-intensive source code.

APUs and the new generation of SOCs offer similarly compelling scalability advantages, supporting a wide range of performance and power profiles and thereby eliminating the need to bifurcate underlying processing platforms to accommodate disparate low-power and high-performance systems.

Providing x86 compatibility and the ability to scale from low-end to high-end system support, these SOCs and APUs can help provide consistent, high-speed image processing performance across diverse system platforms, allowing for design and operation efficiencies via a consistent heterogeneous architecture.

Using open development tools like OpenCV OCL to unlock the full processing performance of APU- and/or new SOC-based machine vision systems, designers are equipped to develop parallel processing-driven systems that can surpass the speed limits of earlier generation vision systems without compromising on system form factor.

Cameron Swen is the strategic marketing manager for Industrial Controls and Automation in AMD’s Embedded Solutions Division. He joined AMD in 2003 as the Manager of Technical Marketing for AMD’s Innovative Solutions Group. He started his career 20 years ago as an engineer working with embedded computer systems and has held a variety of technical marketing positions at National Semiconductor and AMD for the last 13 years. He holds a degree in Engineering from Colorado State University.
