TECH FEATURE
Switched Fabrics
Increasing Demands on Display Wall Controllers Lead to Fabric-Based Architecture
Migrating from PCI to StarFabric increased bus speeds from 1 Gbit/s to over 40 Gbits/s, allowing up to 100 wall projectors.
ERIC WOGSBERG, PRESIDENT AND CO-FOUNDER, JUPITER SYSTEMS
In many military, government and private industry situations there is a need to display large amounts of dynamic data so numerous people can collaboratively view, analyze and respond to it. The amount of data involved in such an operation can be substantial and often will not fit in a window on a desktop display or even on the entire display. The data to be displayed may include numbers, charts, graphics, video signals, satellite images, weather maps and sometimes even the entire desktop displays of one or more users.
This data must be updated frequently. Live video must look live, and application programs must respond to user input as fast as they would on a desktop computer. This places heavy bandwidth demands on any system designed to handle these disparate data sources in an integrated manner. To provide the needed display area, wall projectors with a resolution of 1 to 2 million pixels each are assembled into a tiled array, creating a wall that can be used as a shared display. The projectors are driven by a display wall controller so that the entire display surface can be treated as a single logical display.

These controllers are basically powerful computers with the ability to drive multiple outputs and to capture and display both video and high-resolution RGB signals. The advantage to this integrated approach is that video and RGB signals appear in windows, just like application programs, and can be resized and moved anywhere on the display wall. But as modern demands for data increase, the size and capabilities of the video wall must increase. Here’s how Jupiter Systems migrated a PCI-based controller system to one linked by PICMG 2.17 and the serial scheme StarFabric, and in the process realized an order of magnitude increase in performance, with lots of capability headroom.
Typical Application Environment
Figure 1 shows a modern electric utility control room. This center, installed by HB Communications of North Haven, Connecticut, contains three display walls and numerous operator consoles. Each display wall consists of eight rear-projection cubes and provides over 10 million pixels of display space. In addition to running local and networked applications on the wall, up to 16 video signals and eight high-resolution RGB signals can also be displayed.
Display walls are usually fixed installations, but they can also be small and portable. Figure 2 shows such a system with three front projectors mounted on overhead struts in a military tent setting. This tactical operations center (TOC) configuration can be set up in one hour and taken down in half that time.
Existing System Architecture
The goal was to create a next-generation controller architecture that would support more output displays, higher resolution per display and multiple live video overlays. In addition, the entire configuration needed to be scalable to as-yet-undefined configurations. In 2002, Jupiter’s display wall controllers, the Fusion 930, Fusion 950 and Fusion 970, were based on the 32-bit/33 MHz PCI bus (in a CompactPCI form factor in the Fusion 970).

The 32-bit/33 MHz PCI bus provides 1 Gbit/s of bandwidth for CPU access to the graphics controllers and for peer-to-peer communication between the various devices on the bus. This bandwidth wouldn’t have been a problem if CPU activity were the only traffic on the bus, but the bus is also used for pixel traffic for some video signals and all of the captured RGB signals that are displayed on the wall.
The amount of bus bandwidth required to update a window containing a captured RGB signal is substantial. An SXGA signal has a resolution of 1280x1024, or 1.3 Mpixels. At 16-bit color depth and an update rate of 30 frames per second, a single signal requires roughly 630 Mbits/s of bus bandwidth, about 60% of the theoretical PCI limit. Handling multiple signals at that update rate is not possible on one bus. In an ideal and static world, the bus could be segmented to localize such traffic, but in the real world that isn’t feasible. It was apparent that the existing architecture had reached its performance ceiling, and that further demands would result in inadequate graphics performance and undermine the very purpose of the display wall.
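The arithmetic behind that estimate can be checked with a short sketch. It counts raw pixel payload only and ignores bus protocol overhead; the function name and the 33 MHz nominal clock used for the PCI peak are assumptions made for illustration, not details from Jupiter's design.

```python
def pixel_stream_mbits(width, height, bits_per_pixel, fps):
    """Raw pixel payload in Mbit/s (decimal), ignoring bus overhead."""
    return width * height * bits_per_pixel * fps / 1e6


# Theoretical peak of a 32-bit bus clocked at a nominal 33 MHz (assumption).
PCI_32_33_MBITS = 32 * 33.0  # 1056 Mbit/s

# One captured SXGA signal at 16-bit color and 30 frames per second.
sxga = pixel_stream_mbits(1280, 1024, 16, 30)
share = sxga / PCI_32_33_MBITS

print(f"One SXGA stream: {sxga:.0f} Mbit/s ({share:.0%} of PCI peak)")
print(f"Two SXGA streams fit on the bus: {2 * sxga <= PCI_32_33_MBITS}")
```

Two such streams already exceed the theoretical peak of the bus, before any CPU or video traffic is counted.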

In addition, the market was changing. Customers were building bigger display walls and really liked the ability to bring video and high-resolution RGB signals onto the wall and were using those features more heavily. The shared PCI bus was getting hammered with all the traffic and performance was compromised, especially with larger configurations.
Quandary
The decision was made to revamp the entire product line: moving from P3 and dual-P3 CPUs to P4 and dual-Xeon processors, increasing available memory to multiple gigabytes, using the latest generation of ATI Radeon graphics processors, adding Gigabit Ethernet interfaces and doubling the RGB display capability. Bus bandwidth also needed improvement. One obvious alternative was to go to a wider and/or faster PCI bus. This approach would work well for the mid-range product, but was not so easy for the high-end product.
Jupiter was committed to providing hot-swap capability on this product because it is used in many mission-critical applications. That meant a Eurocard format with a passive backplane. But increasing the speed of the PCI bus to 66 MHz cut in half the number of slots that could be driven in one segment of the bus, so more bridges would be needed. Using up slot space for bridges was undesirable, and putting bridges on the backplane would have made the backplane a potential point of failure, one that could not be quickly or easily replaced. Furthermore, a single shared bus, no matter how fast, would still bog down under heavy use.
A separate bus for pixel traffic was considered, using the HyperTransport interconnect because of its high bandwidth, simultaneous bi-directional communication, low pin-count interconnect and potential for multi-node (tunneling) operation. However, this would have made the system architecture very complex, played havoc with the need for hot-swap capability and made it harder to extend signals into an expansion cabinet (keeping in mind the future goal of scalability).
What was required was an advanced type of interconnect that would allow node A to talk to node B without getting in the way of traffic between all the other nodes. These interconnects, called switch fabrics, are common in the telecommunications industry but are new to computer architectures. In addition, the switch fabric needed a PCI interface in order to maintain some legacy interoperability with existing hardware and software. There seemed to be only one viable alternative: the open-standard StarFabric from StarGen. Figure 3 contrasts the shared bus architecture of the Fusion 970 with the fabric-based architecture of the Fusion 980.
StarFabric Implementation
Switch fabrics have an interesting characteristic in that the inherent bandwidth of the fabric increases in proportion to the number of nodes connected to it. Thus additional nodes don’t compete for a fixed bandwidth but rather increase the total bandwidth available. This is exactly the opposite of the shared bus scenario, where each additional node competes for a fixed amount of available bandwidth.
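The contrast can be illustrated with a toy model. This is only a sketch: it assumes an ideal non-blocking switch, and the per-link rate `LINK_GBITS` is an illustrative placeholder, not a figure taken from this article. The function names are hypothetical as well.

```python
SHARED_BUS_GBITS = 1.0  # 32-bit/33 MHz PCI, as quoted in the article
LINK_GBITS = 2.5        # ASSUMED per-node fabric link rate, illustrative only


def shared_bus_per_node(nodes):
    """Fair share of a fixed-bandwidth shared bus, in Gbit/s per node."""
    return SHARED_BUS_GBITS / nodes


def fabric_aggregate(nodes, links_per_node=1):
    """Total fabric bandwidth in Gbit/s, assuming an ideal non-blocking
    switch in which every node can drive its own link at the same time."""
    return nodes * links_per_node * LINK_GBITS


for n in (2, 4, 8, 16):
    print(f"{n:2d} nodes: shared bus {shared_bus_per_node(n):.3f} Gbit/s per node, "
          f"fabric {fabric_aggregate(n):.1f} Gbit/s aggregate")
```

In this model, doubling the node count halves each node's share of the shared bus but doubles the aggregate bandwidth of the fabric.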
The StarFabric-based system architecture is shown on the right of Figure 3. Each slot connects to both switches. Each switch provides independent data paths from each slot to every other slot, increasing bandwidth and providing a form of fault tolerance.
Jupiter wanted to provide up to 80 display channels and 32 RGB capture channels in the high-end system and needed a high-density chassis or else large configurations would require multiple racks. The decision was made to house the CPU and disk drives in a separate 4U chassis, and a modified PICMG 2.17 chassis was designed with four power supplies, 15 peripheral slots and two switch slots. Using new quad-graphics and quad-RGB cards, the maximum configuration could be achieved with the 30 slots available in two 8U cabinets (Figure 4).
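The slot budget for that maximum configuration can be checked with a few lines. This sketch assumes that each quad card carries exactly four channels and that only graphics and RGB capture cards are counted; the article does not say how any remaining slots are used. The helper name is hypothetical.

```python
import math


def slots_needed(display_channels, rgb_channels, channels_per_card=4):
    """Return (graphics cards, RGB capture cards, total peripheral slots)."""
    graphics = math.ceil(display_channels / channels_per_card)
    rgb = math.ceil(rgb_channels / channels_per_card)
    return graphics, rgb, graphics + rgb


# Target from the article: 80 display channels and 32 RGB capture channels.
graphics, rgb, total = slots_needed(80, 32)

# Two chassis with 15 peripheral slots each.
available = 2 * 15

print(f"{graphics} graphics cards + {rgb} RGB cards = {total} slots "
      f"of {available} available")
```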
Each peripheral slot has redundant StarFabric connections to the two switch card slots, and if need be the system runs fine with one switch card. Adding a second switch card doubles the interconnect bandwidth and allows either switch card to be removed while the system keeps running on the remaining one, which provides hot-swap capability for the switches themselves.
Success
Performance is excellent. In a fully configured system, 20 captured SXGA signals can be displayed at 45 frames per second over the entire wall (compared with 30 frames per second in some previous PCI-based configurations). This represents 40 Gbits/s of data throughput and is an amazing improvement over the performance of any system built on a shared bus architecture.
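As a rough cross-check of the throughput figure, the raw pixel payload for 20 SXGA signals at 45 frames per second can be computed directly. The article does not state the color depth behind the 40 Gbits/s number, so this sketch tries two assumed depths and ignores any fabric overhead; the function name is hypothetical.

```python
def aggregate_gbits(signals, width, height, bits_per_pixel, fps):
    """Raw pixel payload in Gbit/s (decimal) for identical streams."""
    return signals * width * height * bits_per_pixel * fps / 1e9


# 20 captured SXGA signals at 45 frames per second, two ASSUMED color depths.
for depth in (16, 32):
    total = aggregate_gbits(20, 1280, 1024, depth, 45)
    print(f"{depth}-bit color: {total:.1f} Gbit/s")
```

Under these assumptions the 32-bit case comes to roughly 38 Gbit/s, which is close to the quoted figure, while the 16-bit case is about half that.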
The choice of interconnect schemes gives Jupiter good product differentiation within its product range and a significant advantage over competitors. The StarFabric interconnect has worked out so well that Jupiter is considering the expansion of the Fusion 980’s display capability to drive 100 or more projectors.
Jupiter Systems
San Leandro, CA.
(510) 667-9000.
[www.jupiter.com].

