DSP-FPGA.com


Turbo charge video and imaging applications with FPGA coprocessors

By Alex Soohoo
Altera


As video, image, and signal processing design challenges become more complex, designers are turning to the next generation of FPGA-based system architectures to boost DSP performance and lower overall costs.

As video and imaging equipment deployment shifts toward high definition and next-generation video compression standards, the computational performance required has outstripped what standalone DSPs can provide. In addition, DSPs often lack the flexibility to implement custom video interfaces and peripherals, forcing designers toward difficult and expensive alternatives such as Application Specific Integrated Circuits (ASICs) or Application Specific Standard Products (ASSPs). These solutions are inflexible and offer compromised feature sets, so growing feature requirements or a need for higher performance can force an entire system redesign.

FPGAs can bridge this flexibility gap. With a growing number of embedded hard multipliers and high memory bandwidth, the latest generation of FPGAs can enable customized video system designs while offering up to a 30-fold performance improvement over the fastest available standalone DSPs. The emergence of new DSP-centric design flows has made FPGA coprocessor architectures an attractive option for video and image processing systems. Using a state-of-the-art FPGA coprocessor design flow, designers can implement high-performance DSP video and image processing applications and gain the price and performance benefits of FPGAs in a system architecture that is more scalable and powerful than traditional DSP-only designs.

These comprehensive flows combine traditional C-language-based development environments for programmable DSPs and Hardware Description Language (HDL) tools for FPGAs with powerful system integration capabilities such as Altera’s SOPC Builder tool. (See Figure 1.) Through careful system partitioning, designers can reuse a legacy DSP code base and off-load the most computationally intensive algorithms to an FPGA, creating systems optimized for both price/performance and time to market.

Figure 1: Combined DSP design flow

DSP design flow
While engineers have refined integrated development environments for DSPs over the years, there are numerous choices for implementing FPGA coprocessors. DSP system design with FPGAs can use both high-level algorithm tools and HDL development tools (refer to Figure 2). The most straightforward technique is to create an entire design from scratch, writing custom DSP functions in HDL and then using standard FPGA design software. This approach can produce high-performance, optimized designs, but it is often time-consuming and labor-intensive. FPGA suppliers and third-party vendors now offer highly optimized, parameterizable, off-the-shelf Intellectual Property (IP) cores, typically for common video and image processing functions, as part of a DSP library. Designers can integrate these IP cores directly into a system design, enabling shorter design cycles and accelerated time to market.

Model-based design environments, such as Simulink® from The MathWorks, enable designers to develop, simulate, and verify a DSP processing data path for an FPGA coprocessor. Designers can build models using a mix of proprietary and off-the-shelf DSP building blocks. In addition, FPGA design software can integrate with this environment, combining its capabilities with standard FPGA HDL synthesis, simulation, and customized development tools.

Figure 2: FPGA coprocessor flow

Finally, system-integration design software enables rapid development of custom solutions and regeneration of existing solutions to add new capabilities and improve system performance for FPGA coprocessors. By automating the integration of system components and peripherals, this software lets users focus on system-level requirements instead of the mundane, manual task of integrating individual blocks with varying requirements. For example, creating and verifying the interface between an FPGA and a DSP can be complex. System integration tools let the designer drop in a FIFO-based IP core to interface to the external processor without having to manage the specific pin-out details. This capability provides a powerful platform for composing systems defined at the block or component level.
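To make the FIFO-based interface concrete, the following is a minimal behavioral sketch in Python of a fixed-depth FIFO that decouples a writer (standing in for the DSP) from a reader (standing in for the FPGA data path). It is a software model only, not the Altera IP core or anything SOPC Builder generates; the class name `Fifo`, its methods, and the depth of 4 are assumptions chosen for illustration.

```python
from collections import deque


class Fifo:
    """Behavioral model of a fixed-depth FIFO with full/empty flags (illustrative)."""

    def __init__(self, depth):
        self.depth = depth
        self._words = deque()

    @property
    def empty(self):
        return len(self._words) == 0

    @property
    def full(self):
        return len(self._words) == self.depth

    def write(self, word):
        """Push one word; return False (word not accepted) when the FIFO is full."""
        if self.full:
            return False
        self._words.append(word)
        return True

    def read(self):
        """Pop the oldest word, or return None when the FIFO is empty."""
        if self.empty:
            return None
        return self._words.popleft()


fifo = Fifo(depth=4)
accepted = [fifo.write(w) for w in range(6)]  # writer side offers six words
drained = [fifo.read() for _ in range(5)]     # reader side attempts five reads
print(accepted)  # [True, True, True, True, False, False]
print(drained)   # [0, 1, 2, 3, None]
```

The full and empty flags are what allow each side to run at its own rate: the writer stalls when the FIFO is full, and the reader waits when it is empty.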

FPGA coprocessing for high-performance video and image processing
At the core of the FPGA coprocessor design flow is the device itself. A properly architected system can off-load the DSP by executing the computationally intensive blocks of an algorithm in parallel on the FPGA. This capability is especially attractive for emerging video and image processing applications, where DSP performance requirements are growing at the fastest rates.

Consider a simple video filtering example. A 7 x 7, two-dimensional, preprocessing filter kernel applied to broadcast HDTV 1080p video at 1920 x 1080 resolution, 30 frames per second, and 24 bits per pixel requires more than 9 billion multiply-accumulate operations (MACs) per second. This is more performance than the fastest commercially available DSP can provide. A designer can implement the same function on a low-cost FPGA with headroom to spare.
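The arithmetic behind that figure can be checked in a few lines. The sketch below assumes one multiply-accumulate per kernel tap, per color component, per output pixel, and reads 24 bits per pixel as three 8-bit components filtered independently. Both are assumptions made for this example, and the function name is illustrative.

```python
def filter_mac_rate(width, height, fps, kernel_w, kernel_h, components):
    """Return multiply-accumulates per second for a direct 2-D FIR filter.

    Assumes one MAC per kernel tap, per color component, per output pixel,
    with no use of kernel symmetry or separability to reduce the count.
    """
    pixels_per_second = width * height * fps
    macs_per_pixel = kernel_w * kernel_h * components
    return pixels_per_second * macs_per_pixel


# 1080p at 30 frames/s, 7 x 7 kernel, 24 bits per pixel taken as 3 x 8-bit components
rate = filter_mac_rate(1920, 1080, 30, 7, 7, 3)
print(f"{rate / 1e9:.2f} GMAC/s")  # 9.14 GMAC/s
```

Under these assumptions the filter needs 49 x 3 = 147 MACs per pixel at roughly 62 million pixels per second, which is where the figure of more than 9 billion MACs per second comes from.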

For video compression systems, FPGA coprocessing architectures can create especially cost-effective solutions compared to platforms based on multiple DSPs. Designers can implement high-definition, broadcast-quality encoding using the MPEG-2, MPEG-4, and H.264 video codecs with a solution built from a single FPGA and a single DSP.

Figure 3 shows a sample FPGA coprocessor partition of the H.264 encoding standard. The FPGA implements the sections of the algorithm that would require the most cycles on the DSP, including the motion estimation block, entropy coding, and the de-blocking filter. The DSP can focus on the remaining parts of the algorithm, which are more control oriented and better mapped to a C-code implementation. Newer entropy coding techniques, such as Context-Adaptive Variable Length Coding (CAVLC) and Context-Adaptive Binary Arithmetic Coding (CABAC), do not map well to DSP instructions and are best realized as hardware-accelerated blocks on the FPGA.

Figure 3: FPGA coprocessor partition of H.264 encoding

In the case of the newest video compression standards, which are constantly changing, the FPGA coprocessor architecture provides a number of advantages. When a standard is relatively new or in flux, many system developers prefer to keep some flexibility in the design so that the system can be modified later. The motion estimation block, in particular, leaves room to incorporate a range of different techniques for motion vector search. From the equipment vendor’s point of view, this flexibility enables customization and differentiation that is not possible when the only choice is a fixed ASSP.
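As one illustration of a motion vector search technique, the sketch below performs an exhaustive (full) search using the sum of absolute differences (SAD) as the matching cost. This is a common textbook approach chosen here as an example, not a method specified by the article; the function names, block size, and search range are assumptions. The candidate positions are independent of one another, which is why this kind of search lends itself to parallel evaluation in hardware.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between an n x n block in cur and one in ref."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, cx, cy, n, search_range):
    """Return (dx, dy, cost) of the best match for the block at (cx, cy) in cur."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def pattern(x, y):
    """Synthetic image content used only to build the test frames."""
    return x * x + 3 * y * y


# The current frame shows the reference content displaced by (2, 1).
ref = [[pattern(x, y) for x in range(12)] for y in range(12)]
cur = [[pattern(x + 2, y + 1) for x in range(12)] for y in range(12)]
print(full_search(cur, ref, 4, 4, 4, 3))  # (2, 1, 0)
```

A different search strategy can replace `full_search` without touching the rest of the encoder, which is the kind of modification a reprogrammable motion estimation block permits.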

Shorter time to market with FPGA coprocessor designs
Video and image processing performance requirements are growing as OEMs and design engineers adopt new standards for higher resolution video systems. Given these increasing demands, the parallel processing capabilities of FPGAs make them an attractive option for the highly repetitive tasks found in video and image processing. Leading-edge FPGA coprocessor design flows let designers realize the price/performance benefits of FPGAs as coprocessors for DSP video and image processing applications, implementing complex, high-performance designs while also shortening time to market.

