FPGA coprocessors accelerate the performance of 3-D stereo image processing
Image processing is mathematically intensive to begin with, and the added dimension of 3-D processing greatly increases the workload. The best way to deal with the load is a balanced approach: a DSP processor supplemented with the parallel capabilities inherent in FPGA-based DSP coprocessors.

Three-dimensional (3-D) image processing systems have reached the mainstream and are now embedded in a wide range of products, including security and surveillance devices, industrial robotics, and autonomous vehicles. These systems, which are offered in single- and multi-camera configurations, use a diverse array of techniques to generate 3-D information, including projecting structured light to calculate topographical information and using lasers for reflected-angle or time-of-flight calculations. The most widely used technique is stereo image processing with two cameras, which mimics the way human eyes gather depth information and avoids the need for special lighting or expensive lasers.
Additionally, the availability of standard interfaces such as Gigabit Ethernet, IEEE-1394 (FireWire), and USB 2.0 means an entire system can be assembled simply by plugging the cameras into a commercially available desktop PC. However, this simplicity comes at a price: drastically higher digital signal processing performance is needed to execute complex algorithms. Such enhanced performance can be realized with a combination of discrete DSP processors and FPGA coprocessors in a 3-D stereo image processing system implementation.

Implementation
DSP processors can achieve performance levels comparable to common desktop PCs, even when running at slower clock speeds. The Altera Stratix FPGA family further extends the performance of DSP processor-based systems and, at the same time, provides the flexibility needed to support variant system architectures. FPGAs provide 10 times greater efficiency than DSP processor-based implementations, especially for data flow algorithms with minimal control processing. This efficiency stems from the inherent parallelism available in an FPGA, as opposed to the serial fetch, compute, and store processing of a DSP processor. A pure DSP processor-based implementation requires many clock cycles to execute the multiple computations that FPGA coprocessors accomplish in a single cycle, as shown in Figure 1.
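The cycle-count gap described above can be illustrated with an idealized model. The sketch below assumes a hypothetical N-tap multiply-accumulate kernel (such as an FIR filter), a processor that issues a fixed number of MACs per cycle, and an FPGA datapath with one multiplier per tap that accepts a new sample every cycle. The function names and figures are illustrative assumptions, not vendor data.

```python
def serial_cycles(num_taps, num_samples, macs_per_cycle=1):
    """Cycles for a processor that issues `macs_per_cycle` MACs each cycle.

    Idealized: ignores fetch/store overhead, so real counts would be higher.
    """
    total_macs = num_taps * num_samples
    # Ceiling division: a partially filled cycle still costs a full cycle.
    return -(-total_macs // macs_per_cycle)


def parallel_cycles(num_taps, num_samples, pipeline_latency=0):
    """Cycles for a fully unrolled datapath with one multiplier per tap.

    Idealized: one result per cycle once the pipeline is full, regardless
    of `num_taps`, because all taps are multiplied simultaneously.
    """
    return num_samples + pipeline_latency


if __name__ == "__main__":
    taps, samples = 16, 1000
    print("serial cycles:  ", serial_cycles(taps, samples))
    print("parallel cycles:", parallel_cycles(taps, samples))
```

Under these assumptions, a 16-tap kernel over 1,000 samples takes 16,000 cycles on a single-MAC processor and 1,000 cycles on the unrolled datapath, before any difference in clock rate is taken into account.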
For this reason, the DSP performance of FPGA coprocessors can exceed that of discrete DSP processors by an order of magnitude or more. Some FPGAs include dedicated hardwired DSP blocks that achieve even higher performance for computationally intensive functions. For example, Altera’s Stratix II family offers up to 384 18 x 18-bit multipliers running at 450 MHz, for a peak throughput of about 173 giga multiply-accumulates (GMACs) per second (384 x 450 MHz = 172.8 GMACs per second).

The ideal solution is a combination of a DSP processor and an FPGA to reap the benefits of both. By itself, the DSP processor provides coding simplicity and impressive performance. Teamed with the parallel processing of an FPGA, it delivers the highest level of performance. A product that exemplifies these capabilities is the Valde Systems VS1502 stereovision processor, which combines the power and flexibility of a DSP processor with the coprocessing power of an Altera Stratix II FPGA. The VS1502 accepts video input streams through two GbE digital camera interfaces and provides an industry-standard 10/100BASE-T Ethernet port for interfacing, plus discrete I/O for interacting with external devices, to meet typical image processing system requirements.

Lens distortion

Remapping is achieved with either a Look-Up Table (LUT) of predefined coordinates or by calculating the position based on a geometric algorithm. The LUT method is faster, but as image resolution increases, the memory required to store the table becomes prohibitive. Calculating the remapping in real time is straightforward and requires no table memory, but when performed across two images, it consumes significant processing time. The most cost-effective way to achieve real-time performance with this algorithm is to implement it in an FPGA that executes many operations simultaneously.

Image correspondence

A common method for determining correspondence is the traditional Sum-of-Squared-Differences (SSD) algorithm, which sums the squared intensity differences between a window of pixels in one image and a candidate window in the other. The candidate with the smallest sum is taken as the match.
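In its commonly stated form, the SSD cost for a pixel (x, y) and a candidate disparity d is

$$\mathrm{SSD}(x, y, d) = \sum_{(i,j) \in W} \bigl[ I_L(x+i,\, y+j) - I_R(x+i-d,\, y+j) \bigr]^2$$

The notation is an assumption made for illustration: I_L and I_R are the left and right images, W is a square window of offsets, and a positive d shifts the window to the left in the right image. A minimal Python sketch of this cost, with hypothetical function names and a toy image pair, might look like this:

```python
def ssd(left, right, x, y, d, half):
    """SSD between a (2*half+1)^2 window centred at (x, y) in `left`
    and the same window shifted d pixels to the left in `right`.

    Images are lists of rows of intensities; the caller must keep the
    window inside both images.
    """
    total = 0
    for j in range(-half, half + 1):
        for i in range(-half, half + 1):
            diff = left[y + j][x + i] - right[y + j][x + i - d]
            total += diff * diff
    return total


def best_disparity(left, right, x, y, max_d, half):
    """Return the disparity in 0..max_d with the lowest SSD cost."""
    return min(range(max_d + 1),
               key=lambda d: ssd(left, right, x, y, d, half))


if __name__ == "__main__":
    # Toy data (assumed): the right image is the left image shifted
    # two pixels to the left, so the true disparity is 2.
    row = [0, 0, 0, 10, 50, 90, 20, 0, 0, 0]
    left = [row[:] for _ in range(3)]
    right = [row[2:] + [0, 0] for _ in range(3)]
    print(best_disparity(left, right, x=4, y=1, max_d=2, half=1))
```

Each window comparison is independent of the others, which is why the calculation maps naturally onto parallel hardware.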
Correspondence is another function best performed by an FPGA in the image processing system. Although the calculation is relatively simple, it is often performed millions of times across the pixels in the image, a challenging task for a DSP processor that executes operations serially.

Epipolar geometry
The epipolar line Ep1 passes through two points: E1 and p1. E1 and E2, called the epipolar points, are the points where the baseline joining the camera centers C1 and C2 intersects each of the image planes. For a point in the first image, its corresponding point in the second image must lie on the epipolar line in the second image. This property, called the epipolar constraint, reduces the correspondence search from two dimensions to one. Calculating the epipolar lines involves geometric projections that an FPGA implementation can perform at high speed.

Disparity maps
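A disparity map records, for every pixel in one image, the offset d to its matching pixel in the other image. For rectified cameras with parallel optical axes, the standard pinhole-model relation Z = f B / d converts that offset to depth, where f is the focal length in pixels and B is the baseline length. The sketch below applies that relation; the function names and sample values are assumptions chosen for illustration.

```python
def disparity_to_depth(disparity_px, focal_length_px, baseline_m):
    """Depth in metres from a disparity in pixels, using Z = f * B / d.

    A zero or negative disparity has no finite depth under this model,
    so it is reported as infinity.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


def depth_map(disparity_map, focal_length_px, baseline_m):
    """Apply disparity_to_depth to every entry of a 2-D disparity map."""
    return [[disparity_to_depth(d, focal_length_px, baseline_m)
             for d in row]
            for row in disparity_map]


if __name__ == "__main__":
    # Assumed values: 800-pixel focal length, 0.25 m baseline.
    disparities = [[8, 16],
                   [0, 4]]
    for row in depth_map(disparities, focal_length_px=800, baseline_m=0.25):
        print(row)
```

Because depth is inversely proportional to disparity, nearby objects produce large disparities and distant objects produce small ones.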
A current example of this trend is the Valde Systems VS1502 customizable stereo image processor, which leverages FPGAs to perform computationally intensive functions, thus reducing the processing load on the DSP. This approach enables high performance, low power consumption, and a compact form factor, making the VS1502 ideal for industrial automation applications in inspection, vision-guided robotics, facial recognition, and surveillance cameras.
©MMVIII DSP-FPGA.com. An OpenSystems Publishing, LLC publication.