INTERCONNECT STRATEGIES

Fault Tolerance Using RapidIO

To achieve five nines availability at the system level, a combined procedural, software and hardware approach is required. RapidIO was developed with the most demanding levels of availability in mind and offers a robust switched interconnect for high-performance embedded applications.

VICTOR MENASCE, TUNDRA SEMICONDUCTOR

System failure is an expensive business. If part of a mobile phone network fails for several hours during a peak period, for instance, the combined cost to the operator in lost revenue and damage to reputation can dwarf the cost of rectifying the fault itself. Yet the cost of failure isn't always just financial. The "cost" of a hardware or software failure on an aircraft, for example, can include the lives of its passengers and crew.

The problem is that no electronics system in the real world can be made 100% reliable; it can only strive to approach that figure. Moreover, the more complex (and usually the more useful) the system, the more potential modes of failure exist, and the more failure-prone it becomes.

"Availability" is a measure of system reliability in the broadest sense as it encompasses both planned and unplanned system failures and outages. Planned outages include necessary hardware and software upgrades and routine network maintenance. Unplanned outages primarily include hardware and software failures, and system failures due to operator error.

Economic viability and competitive markets demand system availability of 99.999%, the so-called "five nines". This equates to roughly five minutes of downtime per year per system for all sources of outage, planned and unplanned. Unfortunately for hardware designers, five nines at the system level demands an even higher figure at the hardware level, six nines or better, because software and human faults consume most of the downtime budget. The good news is that this level of capability is already here in the form of robust fault tolerance technology built into the latest generation of embedded interconnect architecture, called RapidIO.
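The relationship between a count of "nines" and the downtime it allows can be checked with a few lines of Python. This is a minimal sketch, not part of the original article; the function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored (assumption)


def downtime_minutes_per_year(nines: int) -> float:
    """Return the downtime per year allowed by an availability of `nines` nines."""
    unavailability = 10.0 ** -nines  # e.g. five nines -> 1e-5
    return unavailability * MINUTES_PER_YEAR


for n in (3, 4, 5, 6, 7):
    minutes = downtime_minutes_per_year(n)
    print(f"{n} nines: {minutes:.4f} min/year ({minutes * 60:.2f} s/year)")
```

Five nines works out to about 5.26 minutes per year, which the article rounds to five minutes; each additional nine divides the allowance by ten.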

System Failure Incident Rates

Recent research by the author, using data maintained by the U.S. Federal Communications Commission (FCC) in its ARMIS database, revealed that the majority of complete system-level failures are caused by software-related errors (around 63%), followed by operator error (22%), with hardware failure third (15%).

Measured by share of total outage downtime, however, the split is 45% for software, 35% for operator and 20% for hardware failures. The two breakdowns differ because the time a system takes to recover or be repaired depends on the type of error. These figures reflect the general fact that although software errors are more common, they are generally easier and quicker to fix than operator- or hardware-induced ones.
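Dividing each cause's share of downtime by its share of incidents gives a rough measure of how long a typical outage of that type lasts relative to the average outage. The sketch below is illustrative only; it assumes the second set of percentages is a share of total downtime, and the names are hypothetical.

```python
# Percentages quoted in the article.
incident_share = {"software": 63, "operator": 22, "hardware": 15}
downtime_share = {"software": 45, "operator": 35, "hardware": 20}


def relative_outage_length(cause: str) -> float:
    """Downtime share divided by incident share for one cause.

    Below 1.0: outages of this type are shorter than the average outage.
    Above 1.0: outages of this type are longer than the average outage.
    """
    return downtime_share[cause] / incident_share[cause]


for cause in incident_share:
    print(f"{cause}: {relative_outage_length(cause):.2f}x the average outage length")
```

Under that assumption, software outages come out shorter than average, while operator and hardware outages come out longer, which matches the article's observation that software faults are quicker to fix.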

If one estimates that planned outages account on average for three minutes per system per year, only two minutes remain for unplanned outages if five nines availability is to be maintained. Of this, hardware failure can account for no more than 24 seconds of outage per year (20% of 2 minutes), which equates to an availability of about 99.99992%, or six nines.
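The downtime budget above can be worked through step by step. The following is a sketch under the article's stated estimates (five minutes total, three minutes planned, a 20% hardware share) plus an assumed 365-day year; the function names are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s; leap years ignored (assumption)


def hardware_budget(total_min=5.0, planned_min=3.0, hardware_share=0.20):
    """Return (hardware downtime in s/year, required hardware availability)."""
    unplanned_s = (total_min - planned_min) * 60  # what is left after planned outages
    hardware_s = unplanned_s * hardware_share     # hardware's slice of that remainder
    availability = 1.0 - hardware_s / SECONDS_PER_YEAR
    return hardware_s, availability


def count_nines(availability: float) -> int:
    """Number of leading nines in an availability figure (0.99999 -> 5)."""
    return math.floor(-math.log10(1.0 - availability))


hardware_s, availability = hardware_budget()
print(f"hardware budget: {hardware_s:.1f} s/year")
print(f"required availability: {availability * 100:.6f}%")
print(f"nines: {count_nines(availability)}")
```

With these inputs the hardware budget is 24 seconds per year and the required availability lands in the six nines range. Changing the planned-outage estimate or the hardware share shifts the result accordingly.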

In other words, five nines of availability at the system level demands that the hardware be more than an order of magnitude more reliable than the system as a whole, to compensate for software failures and human error. RapidIO was designed to meet this extreme reliability requirement by including built-in error detection and correction from the outset.

An Overview of RapidIO

RapidIO is a chip-to-chip (on-board) and board-to-board (backplane) packet-switched interconnect that is built into the hardware data path fabric of a system. It is based on a broad, open standards protocol backed by a dedicated standards body, the RapidIO Trade Association. This presently comprises over 50 active member companies including Motorola, Alcatel, Ericsson, IBM, Cisco Systems, Lucent Technologies and Nortel Networks.

RapidIO was developed to eliminate in-system bottlenecks, primarily in high-performance embedded, networking and communications applications, by replacing existing bridged hierarchies of shared bus structures such as PCI and PCI-X (Figures 1 and 2). PCI and PCI-X have served the industry well, but are approaching their practical limits in terms of bandwidth, reliability and scalability.

For example, a 64-bit PCI-X bus running at 133 MHz has a peak data transfer rate of about 1 Gbyte/s (8 bytes per clock at 133 MHz). Contemporary demands, however, are for speeds well in excess of the 1 Gbyte/s barrier, together with the ability to handle more devices with fewer pins and to work with existing PCI and CPU hardware and software.
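The 1 Gbyte/s figure follows from the width and clock rate of a parallel bus. A minimal sketch of that arithmetic is shown here; the function name is hypothetical, and the result is a theoretical peak that ignores arbitration and protocol overhead.

```python
def peak_bandwidth_bytes_per_s(width_bits: int, clock_hz: float,
                               transfers_per_clock: int = 1) -> float:
    """Theoretical peak rate of a parallel bus: bytes per transfer x transfers per second."""
    return (width_bits / 8) * clock_hz * transfers_per_clock


# 64-bit bus clocked at 133 MHz, one transfer per clock.
rate = peak_bandwidth_bytes_per_s(64, 133e6)
print(f"{rate / 1e9:.3f} Gbyte/s")  # 1.064 Gbyte/s
```

Because every device on a shared bus contends for this single peak figure, the usable rate per device is lower in practice.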
