Four Ways to Arm Linux for Real-Time Duty


The Linux kernel serves as an extremely high-quality and stable core for many
widely used Linux distributions, and it has proven itself as an OS for Internet
servers and workstations. Using Linux as a real-time operating system, however,
is problematic, because its kernel was designed to optimize average response
rather than worst-case response. Linux was built with limited kernel
pre-emption; it runs the entire kernel and its interrupt handlers at high
priority, and it makes widespread use of FIFO queues. The kernel also uses a
software timer that limits application timer resolution. Moreover, it lacks
priority inheritance for both application and kernel locks.

All those design choices contribute to a worst-case response time that’s
typically several orders of magnitude longer than the average response time.
For example, the time taken on a 500 MHz Pentium to start an application in
response to an interrupt is about 25µs on average, but can occasionally take
100,000µs or more. A typical RTOS, on the other hand, ensures that those two
times differ only slightly, with a worst-case time of perhaps 50µs or less.

A growing number of embedded systems developers are beginning to consider and
use some form of Linux in their devices. Most embedded apps require some level
of predictable latency and response time. Developers have historically chosen
among these four approaches to cast Linux in a real-time embedded role:

1) Run a standard Linux kernel, with all of its applications, as a single low-priority
thread under control of a small external RTOS.
2) Use standard Linux with little or no modification, and hope that it can meet
the unspecified, unanalyzed requirements of some “soft” real-time application.
3) Convert Linux to an RTOS. This involves analyzing the Linux kernel itself,
making the minimum strategic modifications that permit its latency and response
time to be precisely bounded, thus converting Linux into a true RTOS.
4) Make a Linux-compatible RTOS. Place a Linux API layer over an existing UNIX-like
RTOS so that Linux applications can be executed with minimal change.

A Separate RTOS to Run Linux

The earliest strategy used for real-time Linux, this approach runs a small
RTOS on the processor, with all application threads that require predictable
response running directly under its control (Figure 1). Alongside those
real-time application threads, the entire Linux kernel, including all the
application threads running on it, is treated as a single thread competing at
low priority. This approach requires new device drivers for the RTOS; the
Linux drivers will work only for devices that are exclusively under Linux
control.

Using a separate RTOS in combination with a general-purpose OS is not new.
For its part, VenturCom’s RTX kernel allows real-time applications to
run alongside, for example, Windows 2000. This approach works especially well
with operating systems that can’t be changed to support predictable response.
It requires little or no change in the target OS.

Using this approach successfully in an embedded application means dividing
applications into two separate component groups. One group runs on the RTOS
and can meet predictability requirements, and the other runs on Linux. A
drawback of this scheme is that there’s little or no memory, deadlock or
overrun protection between the real-time application threads, or between the
Linux and real-time application threads. Two examples of this approach are the
Real-Time Application Interface (RTAI) and RTLinux (from FSMLabs).

Running Standard Linux

To avoid the difficulties of the separate RTOS approach, some developers
prefer to leverage the generally good performance of Linux to run “soft
real-time” applications. For some applications, the intrinsic speed of Linux,
running on a high-speed processor, provides acceptable response times (Figure
2). Some Linux distributions go a step further by incorporating kernel patches
to address a few of the most obvious Linux predictability shortcomings. Two
examples of this approach are the Linux distributions from Red Hat and MontaVista
Software. While this technique is suited for soft real-time applications, the
“soft real-time” label is frequently misleading. The problem is that there’s no clear way
to analyze the behavior of such systems in the presence of a general-purpose
operating system such as standard Linux.

The standard Linux approach carries a significant risk of performance anomalies
late in the development lifecycle. That’s when timing anomalies almost
always first become visible. It is during the final integration phase when enough
load can be placed on the system to see the resulting performance problems.
When such anomalies occur late in the lifecycle, it is usually difficult to
fix the problem without making major changes. This increases the likelihood
of project failure.

Converting Linux to an RTOS

This approach involves analyzing each operating system resource management
component and ensuring that all resource usage and allocation is done within
a bounded time frame (Figure 3). This is the most difficult approach for the
Linux vendor. On the up side, it’s also the easiest approach for application
designers, since virtually every Linux application can be run without change, and standard
Linux device drivers can be used.

The first step in this approach is making the kernel fully pre-emptible.
That’s followed by microsecond-precision timers and extensive use of
individually prioritizable kernel threads for interrupt, communications and
device management components. Completing Linux’s conversion to an RTOS
requires priority inversion avoidance mechanisms, including priority
inheritance and priority ceiling protocol emulation, used both in the kernel
and available to the application designer.

A benefit of this strategy is that it helps eliminate the risk of timing problems
emerging late in the development lifecycle during integration and final test.
Moreover, over time this approach lets application developers take advantage
of the rapid advances in Linux being continuously made by the open source community.
TimeSys Linux represents an example of this approach.

A Linux-Compatible RTOS

This approach avoids the response-time issues of Linux by not actually using
Linux at all, and it permits an existing, well-tested RTOS to be used by many
Linux applications. The prime example of this approach is LynxOS from LynuxWorks.
To enable this, the RTOS vendor has to make one operating system fully compatible
with another by adding a Linux-compatible API layer (Figure 4). Such a complex feat has
seldom seen success. A drawback is that the extensive library of Linux device
drivers cannot be used in such an RTOS. This approach permits low development
risk, but doesn’t take advantage of the rapid advances in Linux that continue
to be made by the open source community.

All of the approaches described have been applied to practical systems, but
with varying levels of complexity, lifecycle cost and risk. Developers must
carefully assess each of these approaches for their applications. This involves
addressing key questions such as: What are the performance requirements for
the application? What level of performance risk is acceptable, given the time-to-market
requirements and cost limits? What is the long-term roadmap for the product?

In regard to the performance question, application requirements span a fairly
wide range. If response times less than about 20µs are required, probably
only the separate RTOS approach will work. If response times less than about
30,000µs are required, the separate RTOS, converting Linux to an RTOS
and the Linux-compatible approach are the only ones likely to work. If response
times above about 30,000µs are required, any of the four approaches will
probably work.

Performance risk is minimized in the separate RTOS, converting Linux to an
RTOS and the Linux-compatible approaches. Virtually any embedded application
can use them, regardless of whether they are soft or hard real-time applications.
It’s even possible to use the breakthrough reservations technology (see
sidebar) with the converting Linux to an RTOS approach to guarantee CPU and
network responsiveness.

Examining the long-term roadmap for a real-time Linux solution should reveal
what new features require additional resource management and more stringent
performance. If extensibility is an issue, converting Linux to an RTOS is very
strong, because it uses standard Linux device drivers and it supports the standard
Linux APIs. That significantly reduces the long-term cost.

Linux has clearly already emerged as the fastest-growing operating system for
embedded systems. And now that there are clear ways to handle real-time predictability,
it is poised to become the de facto standard for robust embedded systems because
of its performance, stability, extensibility and extensive functionality.


Reservations Technology Takes Aim at Real-Time

While most embedded systems divide into real-time or non-real-time categories,
nearly all actually combine a range of performance requirements from high
throughput to soft and/or hard real-time. For example, an industrial control
application might need to simultaneously handle an event recording function,
user configuration management interfaces (soft real-time) and several
heartbeat failure detections (hard real-time). Usually the mix consists
of a small number of hard response requirements and a much larger number
of soft real-time and/or throughput requirements.

In most implementations, virtually the only mechanism provided to
control application performance, in both general-purpose and real-time
operating systems, is priorities. While priorities can be used in many
kinds of systems to meet time constraints, they have some major limitations
under dynamic loads. For example, telecommunication systems must continue
to provide acceptable service on Mother’s Day or in the aftermath
of an earthquake, even though the incoming load is not well bounded. A
next-generation set-top box must continue to process an incoming video
stream without a hiccup even while it is downloading software and files
or responding to user requests.

These dynamic load situations are difficult to handle using only the
priority mechanism. Priorities do not inherently reflect dynamic timing
requirements; they merely indicate timing needs of one thread relative
to another. Under changing load conditions, priority changes are frequently
attempted using heuristics, with unpredictable results.

A technology called Reservations, previously only described in academic
real-time literature, provides a step beyond simple priorities for embedded
applications needing bounded Quality of Service (QoS) or real-time response.
It consists of the ability to reserve, in the same sense as airline seats
or hotel rooms, a certain level of CPU and/or network bandwidth.

Instead of an application thread requesting a very high priority, for
which it is difficult to predict the resulting response or throughput
performance under heavy load, the thread might request that it be guaranteed
to get 31 milliseconds of CPU time out of every 188 millisecond period,
from now on. And it wants it regardless of the level of CPU load in the
system, or the priorities of other (non-reserved) threads. Or, the thread
might want the OS to guarantee access to 2.4 Kbytes of incoming IP packets,
or 1.1 Kbytes of outgoing IP packets, to or from the network interface
card every 188 milliseconds, or all of the above.

An OS providing such reservations would first determine whether it can
meet the request, then either accept or reject the request. If accepted,
it would dynamically manage the underlying priorities (computed automatically
by the OS) such that the guarantee would be met. If the reservation isn’t
completely used, other non-reserved threads can use it, so the resource
needn’t be idle.

This means that an overloaded processor could still guarantee that certain
critical threads or groups of threads will meet their deadlines, or would
still provide the QoS that was required. It becomes possible to completely
change the designers’ thinking about managing overloads! Just have
the critical threads request the appropriate reservations with parameters
taken specifically from the critical threads’ measured performance
characteristics, and the requirements will be met.

Further, this means that systems in which some components, left to themselves,
would consume unbounded amounts of CPU or network bandwidth, such as X-Windows
or relational databases, can now, for the first time, combine such components
with time-constrained or bounded QoS threads. Just request a reservation
for the unbounded components, which can be done without changing or recompiling
those components if necessary, and the remaining components can be protected
from them.

Reservations act as a temporal firewall in both directions. They can limit
otherwise unbounded components, or they can protect critical components,
or both in the same system. And, if implemented correctly, all of this
can be done without changing the OS’s API, other than adding the
interface to request, modify or delete the reservations.