FPGA BOARD SOLUTIONS
Recent Trends in FPGAs Drive Adoption in Mil Apps
A new generation of development tools and an FPGA-friendly mezzanine design, along with continuing advances in FPGAs themselves, are pushing FPGAs into ever more powerful applications.
JOHN WRANOVICS, CURTISS-WRIGHT CONTROLS EMBEDDED COMPUTING
The adoption of FPGAs into rugged applications continues to increase at a rapid pace thanks to a number of factors. These factors, which make FPGAs more attractive and easier to integrate into high-end multiprocessor systems, include enhanced support for high-speed I/O, improved tools for multiprocessor system development, and the emergence of the FMC (VITA 57) module, which defines improved high-speed I/O mezzanines ideal for use in FPGA-based applications.
In military applications such as radar and signal intelligence (SIGINT), FPGAs are being used to augment, and sometimes even replace, general-purpose processors (GPPs) or DSPs, thanks to their vastly larger gate counts, specialized DSP units, embedded processors and high-speed serial links. Their flexibility and computational performance per watt also make them an attractive choice for systems with tough size, weight and power (SWaP) constraints. One alternative to GPPs is provided by the recent introduction of the platform FPGA, such as Xilinx’s Virtex-5 FXT family, which enhances the programmable logic typically found in FPGAs with DSP engines, embedded processor hard cores and other specialized features. The FXT’s XtremeDSP MAC (multiply-accumulate) engines provide the computational horsepower for high-performance signal processing applications like radar and SIGINT while mitigating SWaP concerns (Figure 1).
New FPGA designs, such as the FXT, provide an opportunity for integrating signal acquisition and processing into a single, compact package suitable for deployment at the sensor in SIGINT applications. This approach delivers a significant advantage because performing the analog-to-digital conversion, down-conversion and initial signal processing at the sensor maintains an end-to-end digital form of the incoming signal, thus avoiding signal loss or the introduction of noise.
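To make the down-conversion step concrete, the following is a minimal software sketch in Python of what happens after the analog-to-digital conversion: the sampled signal is mixed to baseband, low-pass filtered and decimated. The function name, the 8 MHz sample rate, the 1 MHz tone and the crude block-average filter are all illustrative assumptions, not details of any particular FPGA design, where this logic would live in the fabric and use a properly designed filter.

```python
import cmath
import math


def down_convert(samples, fs_hz, f_center_hz, decimation):
    """Mix real samples to baseband, then block-average and decimate."""
    # Multiply by a complex local oscillator at -f_center to shift
    # the band of interest down to 0 Hz.
    mixed = [
        s * cmath.exp(-2j * math.pi * f_center_hz * n / fs_hz)
        for n, s in enumerate(samples)
    ]
    # Crude low-pass filter: average each block of `decimation` samples
    # and keep one output per block (filtering and decimation in one step).
    out = []
    for start in range(0, len(mixed) - decimation + 1, decimation):
        block = mixed[start:start + decimation]
        out.append(sum(block) / decimation)
    return out


# Hypothetical input: a 1 MHz tone sampled at 8 MHz.
fs = 8e6
tone = [math.cos(2 * math.pi * 1e6 * n / fs) for n in range(64)]
baseband = down_convert(tone, fs, 1e6, 8)
```

With these assumed numbers the tone lands at 0 Hz, so every output sample is a constant of magnitude 0.5, and the data rate leaving the stage is one eighth of the rate entering it.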
Formerly, one factor that slowed the adoption of FPGAs was that FPGA-based systems were considered relatively difficult to develop. Typically they required longer development cycles and specialized development talent. In recent years, though, FPGAs have become much easier to develop for. New, higher-level language-based development flows, leveraging C or Matlab for example, enable developers to approach FPGA development from a more familiar, abstract perspective. Advanced tools such as The MathWorks’ Simulink have the potential to provide a graphical, object-oriented environment for FPGA development as well as the development of hybrid FPGA/GPP-based systems.
Another advancement in FPGA system development is the introduction of tools that simplify the integration of multiprocessor systems. Complex applications like radar often require large numbers of processors. To test and develop these applications, system designers are faced with repetitive tasks just to get started. There are numerous software development tools for developing embedded real-time applications on single processor systems. Unfortunately, these tools do not scale well for multiprocessor real-time applications. And the challenge only becomes more difficult as the number of processors in the system increases.
In a typical cross-hosted embedded software environment, with one SBC and one processor, the frequently recurring cycle of edit/compile/load/debug involves a number of manual steps, either with traditional command line control or with graphical Integrated Development Environments (IDEs) such as Eclipse. In comparison, with a 64-processor multicomputer, one would have to repeat the same manual steps 64 times before the system could be booted and tested. Without a tool that understands multiprocessor systems, the process is error-prone, time-consuming and labor-intensive.
Today, tools such as Curtiss-Wright’s Continuum Insights enable the programmer to treat the system at a hierarchical level (Figure 2). This enables the developer to define groups of processors that run the same code. Once these groups have been defined, it is possible to operate on all the processors within a group at once. For example, one could define a signal processing group that includes all the nodes of a particular quad-processor DSP board, where all those processors run the same kernel and the same application. With a single keystroke it would be possible to download the application not only to one processor but to all the processors within that group. Similar operations can be performed on the other groups that have been defined to comprise the system.
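The group idea can be illustrated with a small Python sketch. This is not the Continuum Insights interface; the class names, the `download` method, the processor names and the image file name are all hypothetical, and the "load" is a stand-in assignment rather than a real transfer to hardware.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Processor:
    name: str
    image: Optional[str] = None  # application image currently loaded


@dataclass
class System:
    groups: Dict[str, List[Processor]] = field(default_factory=dict)

    def add_group(self, group_name, processor_names):
        """Define a named group of processors that run the same code."""
        self.groups[group_name] = [Processor(n) for n in processor_names]

    def download(self, group_name, image):
        """Load one image onto every processor in the group."""
        for proc in self.groups[group_name]:
            proc.image = image  # stand-in for the real load step
        return len(self.groups[group_name])


# Hypothetical system: one quad-processor DSP board plus a control SBC.
system = System()
system.add_group("signal_processing", [f"dsp0_cpu{i}" for i in range(4)])
system.add_group("control", ["sbc0_cpu0"])

# One operation reaches all four DSP processors; the control group is untouched.
loaded = system.download("signal_processing", "beamformer.out")
```

The point of the sketch is the shape of the operation: the developer names a group once, and every later load, boot or halt is issued per group rather than per processor.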
Another recent improvement in real-time multiprocessor development tools is a GUI providing an icon-based representation of the system, eliminating the need for the programmer to write scripts. The GUI provides different views of the system (chassis level, card level, component level and group level) that significantly ease and speed the process of working with the system.
The debugging of multiprocessor systems has also been simplified. A popular tool for single processor real-time debugging is Wind River’s System Viewer, which provides time-stamped event analysis of what happens within the application. For multiprocessor systems, the difficulty is getting that information time-aligned across multiple processors. Developers must first find a way to ensure that timestamps are common across all the processors. They would then need to correlate that timing information, by hand, into some method of analysis that shows when events occur on one processor versus another. Without multiprocessor-aware tools, this would typically be done in a non-visual fashion, and would become increasingly difficult when scaled to larger and larger systems.
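The time-alignment problem can be sketched in a few lines of Python. The approach shown, subtracting each processor's local counter value at a shared synchronization event, is only one simple possibility and is an assumption of this example; it also assumes all counters tick at the same rate with no drift. The function name and event data are hypothetical.

```python
def align_to_common_time(events_by_cpu, sync_tick_by_cpu):
    """Shift each CPU's local timestamps so the shared sync event is t = 0."""
    aligned = {}
    for cpu, events in events_by_cpu.items():
        offset = sync_tick_by_cpu[cpu]  # local counter value at the sync event
        aligned[cpu] = [(tick - offset, label) for tick, label in events]
    return aligned


# Hypothetical event traces, each stamped with an unrelated local counter.
events = {
    "cpu0": [(1000, "sync"), (1250, "dma_start")],
    "cpu1": [(5000, "sync"), (5100, "fft_done")],
}
sync = {"cpu0": 1000, "cpu1": 5000}

aligned = align_to_common_time(events, sync)
```

After alignment it is visible that `fft_done` on cpu1 (100 ticks after sync) precedes `dma_start` on cpu0 (250 ticks after sync), a relationship the raw counters hide.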
One common method for gaining visibility into the state of software execution is to log messages to a console, the familiar printf approach. This approach, while time-proven, does not scale well for large multiprocessor systems. In the case of a 64-processor system, for example, it would require 64 serial cables and 64 instances of a terminal emulation program running on the development host. Today, new tools consolidate the log messages from all of the processors and provide a means to organize and delineate the different messages by processor and by time.
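As a rough illustration of log consolidation, the Python sketch below merges per-processor message streams into a single time-ordered view tagged by processor. It assumes each stream is already sorted and stamped against a common timebase; the function name and log contents are hypothetical.

```python
import heapq


def merge_logs(logs_by_cpu):
    """Merge per-CPU (time, message) streams into one (time, cpu, message) list."""
    # Tag every entry with its CPU so the origin survives the merge.
    streams = [
        [(t, cpu, msg) for t, msg in entries]
        for cpu, entries in logs_by_cpu.items()
    ]
    # heapq.merge lazily combines already-sorted inputs in time order.
    return list(heapq.merge(*streams))


# Hypothetical logs, timestamps in seconds on a shared timebase.
logs = {
    "cpu0": [(0.001, "boot"), (0.010, "frame 0 done")],
    "cpu1": [(0.002, "boot"), (0.004, "link up")],
}

merged = merge_logs(logs)
cpu1_only = [entry for entry in merged if entry[1] == "cpu1"]
```

The same merged list can then be filtered by processor, as in `cpu1_only`, or by time window, which replaces the 64 separate terminal sessions with one searchable view.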
Another improved tool for real-time multiprocessor development is a scaled-up source-level debugger that supports breakpoints across multiple processors. The entire system, or selected processors, can be halted, allowing examination of the state of the entire system at the point of interest. Traditional source-level debuggers were incapable of working beyond the confines of a single processor.
Improved High-Speed I/O
Thanks to their support of high-speed serial I/O and interface blocks such as Serial RapidIO and PCIe, new FPGAs are well suited for use in heterogeneous FPGA/processor multicomputing systems. Additionally, integration challenges are being eased through support for communications middleware and IP libraries.
The latest generation of FPGAs incorporates a much larger complement of high-speed serial I/O and embedded hard cores and/or vendor-supplied standard interface blocks such as PCI Express and Serial RapidIO. This makes them ideal for use within heterogeneous (mixed FPGA and processor) multi-computing systems. While FPGAs are ideally suited to repetitive fixed-point algorithms such as convolution, filtering and decimation, the resultant data streams often need to be distributed to a further level of processing. In a large system, this might be, for example, an array of multicore Power Architecture processors with AltiVec vector processing enhancements.
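The kind of repetitive fixed-point work described above can be sketched in Python as a FIR filter followed by decimation. The Q15 coefficient format, the four-tap averaging filter and the function name are assumptions chosen for illustration; in an FPGA the inner multiply-accumulate loop would map onto dedicated DSP blocks running in parallel.

```python
def fir_decimate_q15(samples, coeffs_q15, decimation):
    """Fixed-point FIR filter with decimation (integer samples, Q15 taps)."""
    taps = len(coeffs_q15)
    out = []
    # Compute only every `decimation`-th output; the rest would be discarded.
    for n in range(taps - 1, len(samples), decimation):
        acc = 0
        for k in range(taps):
            acc += coeffs_q15[k] * samples[n - k]  # multiply-accumulate
        out.append(acc >> 15)  # scale the Q15 product back to sample units
    return out


# Hypothetical filter: four taps of 0.25 each (8192 / 32768 in Q15).
coeffs = [8192, 8192, 8192, 8192]
data = [100] * 16

result = fir_decimate_q15(data, coeffs, 4)
```

Because the taps sum to 1.0, a constant input passes through unchanged, and 16 input samples produce 4 output samples that are then ready to hand off to the next level of processing.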
In the military embedded market, the migration of standards-based connectivity from parallel standards such as PCI and PCI-X to the serial connectivity of PCI Express, Serial RapidIO and Ethernet has been facilitated by the introduction of the VPX (VITA 46) standard and the widespread use of advanced processors, such as Freescale Semiconductor’s 8641D dual core Power Architecture processor. The latest FPGA devices–such as Xilinx’s Virtex-5 or Altera’s Stratix IV–incorporate multi-GHz, high-speed I/O signaling to satisfy the new serial fabric and networked vision of system connectivity. For example, the Virtex-5 includes up to four scalable PCI Express endpoints, configurable from x1 to x8 lanes, and Serial RapidIO soft cores, as well as up to eight 10/100/1000 Mbit/s Ethernet Media Access Controllers (MACs).
Practical heterogeneous computing architectures, augmented with a serial standards connectivity framework, a common interprocessor communications layer and a set of IP cores and tools, provide a way to resolve the SWaP vs. performance challenge. FPGA devices such as the Virtex-5 or Stratix IV, when properly supported by COTS vendors’ IP and tools, have the logical, arithmetic and I/O capability to perform front-end DSP operations. In addition, they can slot into heterogeneous, serial standards-based computing systems, satisfying the military’s complex future multicomputing applications.
FMC Mezzanine Cards
A problem in the industry has been that the I/O interfaces are tightly coupled to FPGAs. This has limited the amount of reuse in FPGA designs because boards have to be designed specifically to handle a particular type of I/O. It has also limited the availability of COTS FPGA boards because it is difficult to design an FPGA board with the right I/O for a wide range of customers. The new VITA 57 standard, also known as the FPGA Mezzanine Card (FMC), addresses these issues. The FMC standard defines an I/O mezzanine module that works intimately with an FPGA (Figure 3).
For years, the PMC, and more recently the XMC, have provided an industry standard mechanism for a modular and flexible I/O design, primarily for use with 3U and 6U SBCs. The PMC/XMC form factors have been used extensively in the embedded computing realm, but they aren’t the optimal solution for modular FPGA designs. PMCs are much larger than an FPGA I/O mezzanine needs to be, they have too many connectors and of the wrong type, and the interface between the PMC/XMC and the baseboard (PCI, PCI-X, PCIe, Serial RapidIO, and so on) is much more complex and resource intensive than is required for an FPGA I/O mezzanine to interface to an FPGA.
The FMC (VITA 57) standard was developed to provide an industry standard mezzanine form factor in support of a flexible, modular I/O interface to an FPGA located on a baseboard or carrier card. It allows the physical I/O interface to be decoupled from the FPGA design while maintaining a close coupling between a physical I/O interface and an FPGA. This approach separates FPGA board designs into two pieces–a carrier and a mezzanine. The carrier contains one or more FPGAs and the associated functionality that will always be common to any variation of the board design. The mezzanine contains the functionality that can be variable within a board design, such as the I/O portion of the design.
The standard defines two widths: single and double. The single-width module measures 69 x 76.5 mm, approximately half the size of a PMC module, and supports a single connector, P1, to the carrier. The double-width module measures 139 x 76.5 mm and can support one or two connectors to the carrier, P1 and P2.
There is a choice of two different connectors to interface the FMC to an FPGA on a carrier: a Low Pin Count (LPC) connector with 160 pins and a High Pin Count (HPC) connector with 400 pins. The VITA 57 connector was chosen to ensure developers have the functionality and performance they need to allow them to move their I/O to a mezzanine card. The connector is designed to support single-ended and differential signaling up to 2 Gbits/s and signaling to an FPGA’s Multi-Gigabit Transceivers (MGTs) up to 10 Gbits/s. The LPC connector provides 68 single-ended user-defined signals or 34 user-defined differential pairs. The HPC provides 160 single-ended user-defined signals or 80 user-defined differential pairs.
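The connector figures above lend themselves to a quick sizing check, sketched here in Python. The table values are taken from the text; the helper name is hypothetical, and the rule that a differential pair consumes two single-ended signal positions is an assumption inferred from the 68/34 and 160/80 ratios rather than a statement of the standard's pin assignment rules.

```python
# User-defined signal counts per connector, as given in the text.
CONNECTORS = {
    "LPC": {"pins": 160, "single_ended": 68, "diff_pairs": 34},
    "HPC": {"pins": 400, "single_ended": 160, "diff_pairs": 80},
}


def smallest_connector(diff_pairs_needed=0, single_ended_needed=0):
    """Return the smallest connector that covers the need, or None."""
    # Assumption: each differential pair uses two single-ended positions.
    used = 2 * diff_pairs_needed + single_ended_needed
    for name in ("LPC", "HPC"):
        if used <= CONNECTORS[name]["single_ended"]:
            return name
    return None


choice_small = smallest_connector(diff_pairs_needed=30, single_ended_needed=8)
choice_large = smallest_connector(diff_pairs_needed=48)
```

Under these assumptions a design needing 30 pairs plus 8 single-ended signals fits the LPC connector, while one needing 48 pairs requires the HPC connector.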
The FMC connector can support very high bandwidths; a single differential pair can provide 2 Gbits/s of bandwidth when clocked at 1 GHz, since data can be transferred between the FMC and the carrier on both the rising and falling edges of the clock. As an example, consider a mezzanine carrying four 12-bit ADCs. Utilizing 48 differential pairs (12 bits/ADC x 4) of the HPC connector, clocked at 107.5 MHz (one half of the ADC sampling rate), would provide the required bandwidth to move the data from the four ADCs into the FPGA on the carrier.
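The arithmetic behind this example can be checked with a short Python calculation. The 215 MHz ADC sample rate is not stated directly; it is inferred here from the 107.5 MHz clock being one half of the sampling rate. The function names are hypothetical.

```python
def ddr_pair_rate_bps(clock_hz):
    """Bits per second on one differential pair using both clock edges."""
    return 2 * clock_hz


def adc_payload_bps(num_adcs, bits_per_sample, sample_rate_hz):
    """Raw data rate produced by a bank of ADCs."""
    return num_adcs * bits_per_sample * sample_rate_hz


# One pair clocked at 1 GHz carries 2 Gbits/s, matching the text.
single_pair_bps = ddr_pair_rate_bps(1e9)

# Four 12-bit ADCs; sample rate inferred as twice the 107.5 MHz clock.
clock_hz = 107.5e6
sample_rate_hz = 2 * clock_hz
link_bps = 48 * ddr_pair_rate_bps(clock_hz)
needed_bps = adc_payload_bps(4, 12, sample_rate_hz)
```

Both `link_bps` and `needed_bps` come to 10.32 Gbits/s, so the 48 pairs exactly match the ADC output with each pair carrying one bit of one converter per clock edge.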
The FMC module’s MGT interfaces support multi-gigabit serial links. Moving the copper connectors or fiber optic transceivers from the base FPGA design to an FMC mezzanine card makes it much easier for a single FPGA design to support various physical interfaces. Next-generation ADC and DAC chips that support the JEDEC JESD204 standard (Serial Interface for Data Converters) will connect directly to one or more FPGA MGT ports.