TECH FEATURE
Reconfigurable Computing
FPGAs and Multicomputers: A Formidable Blend
There are unique benefits to mixing algorithm-specific FPGA schemes with general-purpose multicomputing. An effective solution must marry the design flows of these diverse architectural approaches.
MARK LITTLEFIELD, MERCURY COMPUTER SYSTEMS
In embedded multicomputer systems, the use of field-programmable gate arrays (FPGAs) alongside more traditional microprocessors and digital signal processors (DSPs) has moved from novelty to necessity. For certain classes of problems, FPGAs deliver dramatically better performance than microprocessors. And while custom ASICs also offer this performance advantage, FPGAs deliver additional flexibility because they are programmable.
High performance and operational reconfiguration form a compelling argument for the use of FPGAs in modern embedded systems. However, the difficulties of writing code for FPGAs and integrating FPGA-based modules into larger multicomputer systems tend to temper developers' desire to use them. There remains a dichotomy between FPGAs' high performance and flexibility on the one hand, and their integration and ease-of-use problems on the other. System designers also face numerous architectural issues when integrating FPGAs into embedded multicomputer systems.
Heterogeneous Multicomputing
In the embedded realm a constant battle rages between increased performance, lower power consumption, lower cost and faster time-to-market or deployment. In many applications one or more of these market forces is driven beyond what Moore's law can compensate for. As a result, developers are on a constant search for ways to improve one or more of these dimensions. With that in mind, interest in heterogeneous multicomputing is on the rise. By mixing computing resources of different types (general-purpose microprocessors, special-purpose processors, DSPs, ASICs, FPGAs, and so on), developers can derive the maximum from a project's power/size/fiscal budget.
One problem with heterogeneous multicomputing is that rarely are all of the necessary components for solving a problem available from a single vendor. Developers are therefore often faced with a jumble of incompatible parts that must be integrated to form a system, and the cost or size/power benefits of heterogeneous multicomputing are often offset by increased development costs and time lost during implementation. There can also be performance costs when incompatible components from different vendors are combined. When FPGAs are added to the mix, the general problem is compounded by the relative difficulty of developing for an FPGA, to say nothing of integrating the FPGA-based application into the larger system.
Multicomputing Problems
Many problems in real-time multicomputing are computationally challenging and often require tens or even hundreds of state-of-the-art microprocessors working in concert. Some of these difficult problems, such as convolution, rebinning, backprojection, and synthetic aperture radar (SAR) signal formation and range/azimuth compression, can be implemented in FPGAs with a 5:1 to 50:1 performance improvement over a single general-purpose microprocessor. That said, some algorithms are not well suited for implementation on an FPGA, such as those that perform different types of processing on different types of data. Rarely is an FPGA a good fit for all the algorithms in an application.
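Because an FPGA rarely fits every algorithm in an application, the speedup of a single kernel overstates the gain for the system as a whole. The sketch below applies Amdahl's law to the 5:1 and 50:1 figures quoted above. The 80 percent accelerated fraction is an assumed value chosen only for illustration, and the function name is hypothetical.

```python
def overall_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only part of the
    run time is accelerated by `kernel_speedup`."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    unaccelerated = 1.0 - accelerated_fraction
    return 1.0 / (unaccelerated + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    # Assumption: 80% of run time sits in kernels that suit an FPGA.
    for kernel_speedup in (5.0, 50.0):
        total = overall_speedup(0.8, kernel_speedup)
        print(f"kernel {kernel_speedup:.0f}:1 -> application {total:.2f}:1")
```

Under that assumption, a 5:1 kernel improvement yields about 2.78:1 for the application and a 50:1 improvement about 4.63:1, which is one reason the remaining algorithms still need capable general-purpose processors.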
For those algorithms that do perform well on an FPGA, there is still a catch: FPGAs are not easy to work with. Implementing an algorithm on an FPGA is roughly 10 to 30 times more difficult, in terms of hours of effort, than programming on a general-purpose device such as a RISC processor. And, after the required algorithm is running on an FPGA, there is still the task of creating the interfaces so the FPGA can communicate with the rest of the computing system: I/O, memory, and other processors.
This complex design situation can be divided into three groups of problems. First, an effective design must provide a simple, flexible way to partition an application for optimal performance, running some algorithms on FPGAs and others on different devices such as RISC processors. Second, to keep well-matched algorithms running very fast on an FPGA, the design needs equally fast memory and I/O access. And third, even though programming and integrating FPGAs is difficult, development projects must adhere to competitive schedules.
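The first of these problems, partitioning, can be pictured with a small sketch. The greedy heuristic below is not the method described in this article; it is one simple way to decide which kernels go to an FPGA, assuming each kernel comes with an estimated processor run time, an estimated FPGA speedup, and the share of FPGA logic it needs. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kernel:
    name: str
    cpu_seconds: float   # assumed run time on a RISC processor
    fpga_speedup: float  # assumed estimate; 1.0 or less means a poor FPGA fit
    fpga_area: float     # assumed fraction of one FPGA's logic required


def partition(kernels, fpga_capacity=1.0):
    """Greedily place the kernels that save the most time per unit of
    FPGA area on the FPGA; everything else stays on the processor."""
    def time_saved(k):
        return k.cpu_seconds - k.cpu_seconds / k.fpga_speedup

    candidates = sorted(
        (k for k in kernels if k.fpga_speedup > 1.0 and k.fpga_area > 0.0),
        key=lambda k: time_saved(k) / k.fpga_area,
        reverse=True,
    )
    on_fpga, used = [], 0.0
    for k in candidates:
        if used + k.fpga_area <= fpga_capacity:
            on_fpga.append(k.name)
            used += k.fpga_area
    on_cpu = [k.name for k in kernels if k.name not in on_fpga]
    return on_fpga, on_cpu


if __name__ == "__main__":
    kernels = [
        Kernel("convolution", cpu_seconds=10.0, fpga_speedup=20.0, fpga_area=0.5),
        Kernel("backprojection", cpu_seconds=8.0, fpga_speedup=10.0, fpga_area=0.6),
        Kernel("control_logic", cpu_seconds=2.0, fpga_speedup=0.5, fpga_area=0.3),
    ]
    print(partition(kernels))
```

With these made-up figures, convolution goes to the FPGA, backprojection no longer fits in the remaining logic, and control logic stays on the processor because it is a poor FPGA match. A real partitioning decision would also have to weigh the memory and I/O bandwidth and the development effort discussed above.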
A general approach to solving the first two groups of problems is to link FPGAs with other types of processing devices via a switch fabric as seen in Figure 1. This approach affords application developers the flexibility to execute different types of algorithms on different types of processing nodes. I/O can be implemented directly to an FPGA or to another specialized device. Systems with this type of architecture can be adapted to a variety of application implementations.
