Switched Fabrics and Publish-Subscribe Middleware Combine for a Robust Communications Architecture

Designers of complex, distributed systems stand to gain the most from the emergence
of switched fabrics. Table 1 shows some of the common requirements that these
designers are faced with and how those requirements are typically met with existing
bus backplane technologies. The complexity of software needed to implement current
solutions to these common requirements increases system development and post-deployment
maintenance costs. In some cases, the required complexity results in projects
that fail to complete development.

Switched fabrics such as StarFabric, PCI Express Advanced Switching (AS) and Serial
RapidIO give designers new freedom in how they implement their systems. In the case
of StarFabric, the solutions are available today. For others, solutions are just
around the corner. On VME, the new VITA 41 spec calls for the fabric interconnect
to be implemented on the P0 connector. This leaves the backplane backward compatible
with earlier specifications.

As the “switched fabric wars” rage and the switched fabric trade
associations race to establish the physical and electrical specifications for
their technologies, niceties such as the development of abstracted, full-featured
communications protocols are left to others to define, and some would argue
that this is for the best. An exception to this is InfiniBand, which defines
a very rich API. One software solution for leveraging switched fabrics to move
messages or large amounts of data with ultra-low latencies in closely coupled,
multiprocessor systems was described in an earlier issue (see “Software
Solutions for Interprocessor Communications,” p. 56, RTC, August 2003).

However, for applications requiring a more loosely coupled architecture supporting
dynamic loading, hot failover and other advanced features, we suggest that Publish-Subscribe
protocols—already popular with users of Ethernet interconnects—are
an ideal choice. Switched fabric features such as Scalability, Quality of Service
and High Availability fit very well with the way publish-subscribe operates.

Publish-Subscribe Communications

There are three common communication models: point-to-point, client-server and
more recently, publish-subscribe. Point-to-point is like a phone call. You know
the address of the remote node, establish the connection, and then communicate.
A phone call and a TCP/IP socket connection are examples of point-to-point or
connection-oriented communications.

The client-server model was created to help scale the point-to-point model. Multiple
client nodes can establish connections to a known address where a server waits
to establish connections with each client. Clients can then make requests of the
server and get replies. Overlaid on the concept of software objects, client-server
underlies remote method invocations. This model is well established and is the
basis of Microsoft’s DCOM and the Object Management Group’s (OMG)
Common Object Request Broker Architecture (CORBA) standards.

For developers building real-time distributed applications, the client-server
model has some distinct disadvantages: 1) The server represents a bottleneck and
potential single point of failure; 2) The request-reply semantics require two
messages to get the data for each client, which increases bandwidth load and transaction
latency; and, 3) It is often based around a remote method invocation or “object-centric”
design that is not suitable for many distributed real-time applications that simply
need to communicate data and not objects. Shoehorning object-centric communication
models into “data-centric” systems frequently leads to unnecessarily
complex system designs and significantly degraded performance.

Publish-subscribe excels at real-time data distribution. Publish-subscribe is
characterized by a set of data producers and data consumers. Where client-server
has a request-reply form, publish-subscribe is more a “push” model.
That is, after the publishers and subscribers have identified themselves on the
network, the data is pushed onto the network by the publishers. Subscribers can
then pull the data off the network anonymously—no requests or polling are
required.

Another advantage is anonymous communications—publishers and subscribers
don’t need to know each other’s physical address. This is in direct
contrast to the connection-oriented communications models. The middleware keeps
track of which subscribers want which data from which publishers. This makes complex
data distribution patterns quite simple to program. This anonymity also makes
it simpler to set up redundant publishers for fault-tolerant systems. It’s
also straightforward for nodes to come into and leave the network and for applications
to be moved from node to node as required in load-balanced systems.
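
The anonymous push model described above can be illustrated with a minimal in-process sketch. This is not DDS or any vendor's API; the TopicBus class, its method names and the "temperature" topic are hypothetical names chosen only for illustration. The point it shows is that the publisher addresses a topic, never a subscriber, so subscribers can be added without touching the publishing code.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class TopicBus:
    """Toy stand-in for publish-subscribe middleware (hypothetical, in-process only)."""

    def __init__(self) -> None:
        # Maps a topic name to the callbacks of its current subscribers.
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        """Register interest in a topic; the subscriber never names a publisher."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, sample: Any) -> int:
        """Push a sample to every current subscriber and return how many received it."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(sample)
        return len(callbacks)


bus = TopicBus()
display_log: List[float] = []
recorder_log: List[float] = []

# Two independent consumers subscribe to the same topic.
bus.subscribe("temperature", display_log.append)
bus.subscribe("temperature", recorder_log.append)

# The publisher pushes one sample without knowing who, if anyone, is listening.
delivered = bus.publish("temperature", 21.5)
```

Adding a third consumer would be one more subscribe call; the publish line stays unchanged, which is the property that makes redundant publishers and load balancing simpler to arrange.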

The OMG (which manages the CORBA standard) recognized the need for publish-subscribe
communications. In June 2003, the OMG adopted the new Data Distribution Service
for Real-Time Systems (DDS) standard. Now there is a publish-subscribe standard
for developers to use that is tailored specifically for real-time distributed
systems.

At the network level, Ethernet uses the carrier sense
multiple access/collision detect (CSMA/CD) algorithm for arbitrating transport
access. This algorithm’s non-deterministic handling of contention,
or “collisions,” is well known. TCP/IP addresses this issue
by providing a reliable transport protocol. However, TCP/IP is also problematic
in real-time systems. Its reliability algorithm introduces non-deterministic delays.
Also, it is a connection-oriented protocol that doesn’t scale well and is
hard to use when you need the flexibility that connectionless protocols provide.

This tradeoff between reliability on the one hand and determinism and scalability
on the other is simply not acceptable to many real-time distributed system designers.
One solution is to implement a replacement Ethernet transport for use with real-time
publish-subscribe middleware. Another is to leverage the capabilities of switched
fabrics.

Switched Fabrics and Publish-Subscribe

Switched fabrics have been designed from the ground up to provide scalable, reliable
and high-availability communications. Buses scale poorly. They are restricted
by physical size and bandwidth. Using networking technologies such as Ethernet
helps but introduces its own limitations such as the need to trade reliability
for determinism. Switched fabrics, on the other hand, are highly scalable both in
the number of nodes and in the bandwidth between nodes without the determinism
and reliability constraints. For example, StarFabric can support thousands of
nodes with interconnecting links supporting 2.5 Gbits/s in each direction. With
support for transmission distances of 10+ meters over standard Cat 5e cable, StarFabric
also allows designers to physically scale systems to room size.

Systems designed using publish-subscribe protocols are naturally scalable.
With anonymous messaging, designers can change the number of subscribers to
published data without affecting the publishing application code by simply duplicating
the subscribing code on the added nodes. Publish-subscribe also simplifies bandwidth
upgrades like those needed to improve, say, a control loop’s resolution
in an industrial automation system to support a faster sensor. The designer
simply adds the new sensor and increases the sensor’s publishing rate.
The controller node receiving the publications will be notified of new data
at the faster rate.

Publish-subscribe is inherently multicasting because it can efficiently publish
data to any node that may potentially be subscribed to the data. Unlike IP, which
relies on the stack to perform the multicast function, switched fabrics implement
multicasting in the switches. The result is that protocol stack overhead is minimized.

Quality of Service (QoS) features are essentially lacking in Ethernet protocols.
Switched fabrics, however, offer rich QoS features that help designers develop
reliable, hard real-time systems.

For example, the credit-based flow control mechanisms used by StarFabric and PCI
Express AS permit bandwidth-reserved isochronous transactions across the fabric.
Isochronous transactions occur at a fixed periodic interval and fixed latency.
The result is guaranteed messaging with deterministic behavior. For hard real-time
applications with strict latency requirements, isochronous messaging support combined
with the matching periodic publish-subscribe messages makes for a communications
architecture that is both robust and easy to program.

Just as switched fabrics offer deterministic latencies at the transport level,
DDS publish-subscribe middleware makes determinism possible at the application
level. For example, the DDS specification allows application developers to specify
a Latency_Budget QoS policy. The middleware can use this Latency_Budget policy
to better manage how it aggregates data for sending from multiple applications
running on one node to multiple applications on another node. In this manner,
a publisher can ensure the middleware expedites its data ahead of the data of other
publishers on the node.
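
One way to picture how a latency budget could guide aggregation is the sketch below. It is an assumption about how a middleware might use the hint, not the behavior the DDS specification mandates or any product's implementation. The Batcher class and its methods are hypothetical, and times are integer milliseconds to keep the arithmetic exact.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Batcher:
    """Hypothetical per-node send queue that batches samples until the most urgent one is due."""

    # Each entry is (send-by time in ms, payload).
    pending: List[Tuple[int, str]] = field(default_factory=list)

    def write(self, now_ms: int, payload: str, latency_budget_ms: int) -> None:
        """Queue a sample; its send-by time is the write time plus the writer's budget."""
        self.pending.append((now_ms + latency_budget_ms, payload))

    def next_flush_ms(self) -> Optional[int]:
        """Earliest send-by time among queued samples, or None if the queue is empty."""
        return min((due for due, _ in self.pending), default=None)

    def flush_if_due(self, now_ms: int) -> List[str]:
        """Send everything in one batch once the tightest budget has run out."""
        due = self.next_flush_ms()
        if due is None or now_ms < due:
            return []
        batch = [payload for _, payload in self.pending]
        self.pending.clear()
        return batch


batcher = Batcher()
batcher.write(now_ms=0, payload="log entry", latency_budget_ms=100)
batcher.write(now_ms=10, payload="control command", latency_budget_ms=5)

early = batcher.flush_if_due(now_ms=12)    # tightest budget expires at 15 ms, so nothing is sent
on_time = batcher.flush_if_due(now_ms=15)  # both samples leave together in one batch
```

In this sketch the publisher with the small budget pulls the flush time forward, while the publisher with the generous budget rides along in the same batch at no extra cost.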

The ability to replace processor blades in a powered and running system is becoming
a common requirement. Switched fabric specifications provide for physical layer
hot plug capability. DDS publish-subscribe also provides “virtual”
hot plug capability at the application level that complements switched fabrics’
support. Because publish-subscribe messaging is anonymous, the unannounced removal
of data-reading nodes from the network will not cause the errors that would occur
under a connection-based client-server model.

Switched fabrics provide rich error management features designed to support High-Availability
requirements. For instance, failures in PCI Express AS fabric paths are reported
to a Fabric Manager (FM) node that identifies the failed paths and reroutes traffic
to avoid the failure.

At the application level, DDS-compliant middleware users can set a Deadline QoS
policy on their subscribers. If the publisher doesn’t publish a new update
to the subscriber within the specified Deadline time duration, the subscribing
application is notified that new data was not available. This could indicate a
failed application on the publishing node, allowing appropriate application-specific
error recovery to take place.
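
The Deadline behavior can be sketched on the subscriber side as a simple timer check. The DeadlineMonitor class below is a hypothetical illustration, not the DDS API; a real middleware would deliver the notification through its own listener mechanism. Times are integer milliseconds supplied by the caller.

```python
class DeadlineMonitor:
    """Hypothetical subscriber-side check that a new sample arrives within each deadline period."""

    def __init__(self, deadline_ms: int, start_ms: int = 0) -> None:
        self.deadline_ms = deadline_ms
        self._period_start_ms = start_ms  # start of the current deadline period
        self.missed_count = 0

    def on_sample(self, now_ms: int) -> None:
        """A new sample arrived, so the deadline period restarts."""
        self._period_start_ms = now_ms

    def check(self, now_ms: int) -> bool:
        """Return True if the deadline was missed; the period then restarts."""
        if now_ms - self._period_start_ms > self.deadline_ms:
            self.missed_count += 1
            self._period_start_ms = now_ms
            return True
        return False


monitor = DeadlineMonitor(deadline_ms=100)
monitor.on_sample(now_ms=50)

ok = monitor.check(now_ms=120)      # 70 ms since the last sample: within the deadline
missed = monitor.check(now_ms=151)  # 101 ms since the last sample: deadline missed
```

An application would react to a True result with its own recovery logic, for example by switching to a safe state or raising an alarm.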

Support for redundant fabric paths is also a key feature of switched fabrics.
For example, StarFabric’s support for a distributed switch topology results
in each node having multiple, redundant paths to other nodes in the fabric (Figure
1). Combined with the error management features, this gives the designer simple-to-implement
physical interconnect redundancy.

At the application level, DDS provides for redundant publishers. Redundant applications
can be created that publish the exact same data onto the fabric, but with different
“strengths”. Subscribers to the data topic will receive data from
the higher “strength” publisher (the higher strength publication “masks”
the lower strength publication). As shown in Figure 2, if the higher strength
publisher fails or is removed from the network, the middleware automatically switches
the subscribers to the lower strength or backup publisher without skipping a beat.
This provides for “hot failover” redundancy.
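
The strength-based masking and failover just described can be sketched as follows. The ExclusiveTopic class and the publisher names are hypothetical, and the sketch removes a failed publisher explicitly; how a real middleware detects that a publisher has failed is outside what this example shows.

```python
from typing import Any, Dict, Optional


class ExclusiveTopic:
    """Hypothetical topic on which only the strongest live publisher's data reaches subscribers."""

    def __init__(self) -> None:
        self._strengths: Dict[str, int] = {}  # publisher name -> strength

    def add_publisher(self, name: str, strength: int) -> None:
        self._strengths[name] = strength

    def remove_publisher(self, name: str) -> None:
        """Models a publisher failing or being removed from the network."""
        self._strengths.pop(name, None)

    def owner(self) -> Optional[str]:
        """Name of the highest-strength publisher, or None if there are no publishers."""
        if not self._strengths:
            return None
        return max(self._strengths, key=lambda name: self._strengths[name])

    def deliver(self, publisher: str, sample: Any) -> Optional[Any]:
        """Return the sample if it comes from the current owner; otherwise it is masked."""
        return sample if publisher == self.owner() else None


topic = ExclusiveTopic()
topic.add_publisher("primary", strength=10)
topic.add_publisher("backup", strength=5)

from_primary = topic.deliver("primary", "p1")  # delivered: primary is the owner
from_backup = topic.deliver("backup", "b1")    # masked by the stronger publisher

topic.remove_publisher("primary")              # primary fails
after_failover = topic.deliver("backup", "b2") # backup's data now gets through
```

The subscriber code does not change across the failover; it simply keeps receiving samples on the topic.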

Designers of distributed, embedded computing systems with one or more of the
following requirements should consider the advantages of using publish-subscribe
atop a switched fabric interconnect:

  • Deterministic messaging
  • Fault-tolerance
  • High-availability requirements such as hot swap
  • Load balancing—either dynamically in the deployed system or simply
    as part of an iterative software development cycle
  • Support for scaling up the number of processor cards in the future

By employing publish-subscribe, the designer can extend the features of switched
fabric interconnects to the application layer, thus providing not just physical
but software redundancy and determinism as well. As long as the physical distribution
of the interconnect is limited to about 10 meters, copper media-based, high-speed
switched fabric interconnects can meet these needs. For more widely distributed
systems, a mixture of both a switched fabric and Ethernet could be used—the
former for the more physically co-located, hard real-time processor cards and
the latter for remote, soft real-time subsystems. The publish-subscribe middleware
provides a common API and communications model over both networks. The combination
of switched fabrics and publish-subscribe middleware provides a robust, real-time
communications platform that greatly simplifies developing scalable, fault-tolerant,
field-maintainable distributed systems.

Dy 4
Kanata, Ontario, Canada.
(613) 599-9191.
[www.dy4.com].

Real-Time Innovations
Sunnyvale, CA.
(408) 734-4200.
[www.rti.com].