Most embedded systems today need some form of operating system. The question is, should it be a real-time operating system (RTOS) or a general-purpose operating system (GPOS)? In many cases, there is little argument. When designing a flight control system, medical instrument, or process control application, for example, most system designers will choose an RTOS, simply because these systems must meet absolute deadlines. To quote software developer Bill Gallmeister, “…it doesn’t do you any good if the signal that cuts fuel to the jet engine arrives a millisecond after the engine has exploded.”
Sometimes, however, the answer isn’t so obvious. For example, do Internet routers, automotive infotainment systems, software-defined radios, and consumer appliances also need an RTOS? Often, the answer is yes, even though these systems can miss deadlines without causing loss of life or limb.
To understand why, consider a system that must satisfy quality of service requirements, such as a device that presents live video. If the device depends on software for any part of its content delivery, it can experience dropped frames at a rate that users perceive as unacceptable. Likewise, consider a system where users need or expect immediate response to input. Any delay will create user annoyance, loss of confidence in the manufacturer, or, in the case of an in-car telematics system, driver distraction. By using an RTOS, the system designer can ensure that media rendering or user feedback executes in preference to other system activities.
Some would argue that faster processors and peripherals allow a GPOS to handle such timeliness requirements. However, for reasons of cost, power consumption or heat dissipation, many systems cannot use faster hardware. In systems that ship in the thousands or millions of units, for instance, even a small reduction in per-unit hardware costs can save the manufacturer a small fortune. Case in point: In the automotive telematics market, the typical 32-bit processor runs at about 600 MHz—far slower than the desktop-class processors for which most GPOSs have been tuned.
In any case, faster hardware doesn’t always guarantee predictable performance. It can improve overall performance significantly, but often fails to improve worst-case response times to events. As a further complication, most embedded systems are moving from single-purpose designs to multi-function designs, with more applications competing for resources and greater levels of concurrency and task interaction. This rising complexity makes it difficult to design systems that behave predictably under all usage scenarios and system loads.
Security also comes into play. Many people are familiar with the principles of security (for instance, complete mediation) formalized by Anderson, Saltzer and Schroeder in the 1970s. Since then, the security community has extended these principles; one such extension is the requirement that high-priority threads or operations proceed without undue delay caused by lower-priority subjects or operations. Many RTOSs address this requirement with a strict prioritization protocol, which ensures that, if system resources become scarce, higher-priority requests float to the top of the processing queue and are never “starved” out by lower-priority events.
Honor Thy Priorities
The need for predictable response times—and for RTOSs that enable them—remains common in embedded systems. The question is, what does an RTOS have that a GPOS doesn’t? And how useful are the real-time extensions now available for some GPOSs?
Let’s begin with task scheduling. In a GPOS, the scheduler typically uses a “fairness” policy to dispatch threads and processes onto the CPU. Such a policy enables the high overall throughput required by desktop and server applications, but offers no assurances that high-priority, time-critical threads will execute in preference to lower-priority threads.
For instance, a GPOS may decay the priority assigned to a high-priority thread, or otherwise dynamically adjust the priority in the interest of fairness to other threads in the system. A high-priority thread can, as a consequence, be preempted by threads of lower priority. In addition, most GPOSs have unbounded dispatch latencies: the more threads in the system, the longer it takes for the GPOS to schedule a thread for execution. Any one of these factors can cause a high-priority thread to miss its deadlines, even on a fast CPU.
In an RTOS, on the other hand, threads execute in order of their priority. If a high-priority thread becomes ready to run, it can, within a small and bounded time interval, take over the CPU from any lower-priority thread that may be executing. Moreover, the high-priority thread can run uninterrupted until it has finished what it needs to do—unless, of course, it is preempted by an even higher-priority thread. This approach, known as priority-based preemptive scheduling, allows high-priority threads to meet their deadlines consistently, even when many other threads are competing for CPU time.
In most GPOSs, the OS kernel isn’t preemptible. Consequently, a high-priority user thread can never preempt a kernel call, but must instead wait for the entire call to complete—even if the call was invoked by the lowest-priority process in the system. Moreover, all priority information is typically lost when a driver or other system service (usually performed in a kernel call) executes on behalf of a client thread. Such behavior causes unpredictable delays and prevents critical activities from completing on time.
In an RTOS, on the other hand, kernel operations are preemptible. There are still windows of time in which preemption may not occur, but in a well-designed RTOS, those intervals are extremely brief, often on the order of hundreds of nanoseconds. Moreover, the RTOS will impose an upper bound on how long preemption is held off and interrupts disabled; this allows developers to ascertain worst-case latencies.
To realize this goal, the RTOS kernel must be as simple and elegant as possible. The best way to achieve this simplicity is to design a kernel that only includes services with a short execution path. By excluding work-intensive operations, such as process loading, from the kernel and assigning them to external processes or threads, the RTOS designer can help ensure that there is an upper bound on the longest non-preemptible code path through the kernel.
In a few GPOSs, some degree of preemptibility has been added to the kernel. However, the intervals during which preemption may not occur are still much longer than those in a typical RTOS; the length of any such preemption interval will depend on the longest critical section of any modules (for instance, networking) incorporated into the GPOS kernel. Moreover, a preemptible GPOS kernel doesn’t address other conditions that can impose unbounded latencies, such as the loss of priority information that occurs when a client invokes a driver or other system service.
Mechanisms to Avoid Priority Inversion
Even in an RTOS, a lower-priority thread can inadvertently prevent a higher-priority thread from accessing the CPU—a condition known as priority inversion. When an unbounded priority inversion occurs, the system can miss critical deadlines, resulting in outcomes that range from unusual system behavior to outright failure. Many examples of priority inversion exist, including one that plagued the Mars Pathfinder project in July 1997.
Generally speaking, priority inversion occurs when two tasks of differing priority share a resource, and the higher-priority task cannot obtain the resource from the lower-priority task. To prevent this condition from exceeding a bounded interval of time, an RTOS may provide a choice of mechanisms unavailable in a GPOS, including priority inheritance and priority ceiling emulation. We couldn’t possibly do justice to both mechanisms here, so let’s focus on an example of priority inheritance.
Let’s say two jobs are running, Job 1 and Job 2, and that they share a resource controlled by a mutual exclusion lock. If Job 1 is ready to execute but Job 2 is using the resource, Job 1 must wait until Job 2 has unlocked it. For Job 1 to meet its deadline, the time Job 2 holds the lock must be bounded; it cannot be allowed to vary with system load or any other parameter.
Now let’s introduce a third job—Job 3—that has a higher priority than Job 2, but a lower priority than Job 1 (Figure 1). If Job 3 becomes ready to run while Job 2 is executing, it will preempt Job 2, and Job 2 won’t be able to run again until Job 3 blocks or completes. This, of course, will further delay Job 1 from executing. The total delay introduced by the preemption is a priority inversion. In fact, multiple jobs can preempt Job 2 in this way, yielding an unbounded priority inversion and causing Job 1 to fail to meet any of its deadlines.
Priority inheritance prevents this scenario by allowing Job 2 to temporarily inherit the priority of Job 1. This mechanism prevents Job 3 from preempting Job 2 and thereby avoids the resulting priority inversion (Figure 2).
GPOSs—including Linux, Windows and various flavors of Unix—typically lack the real-time mechanisms discussed thus far. Nonetheless, vendors have developed a number of real-time extensions and patches in an attempt to fill the gap. There is, for example, the dual-kernel approach, in which the GPOS runs as a task on top of a dedicated real-time kernel (Figure 3). Any tasks that require deterministic scheduling run in this kernel, but at a higher priority than the GPOS. These tasks can thus preempt the GPOS whenever they need to execute and will yield the CPU to the GPOS only when their work is done.
Nonetheless, tasks running in the real-time kernel can make only limited use of existing system services in the GPOS—file systems, networking, and so on. In fact, if a real-time task calls out to the GPOS for any service, it will be subject to the same preemption problems that prohibit GPOS processes from behaving deterministically. As a result, new drivers and system services must be created specifically for the real-time kernel, even when equivalent services already exist for the GPOS. Also, unlike most modern RTOSs, the real-time kernel doesn’t typically provide a robust memory-protected environment for real-time tasks. Instead, the tasks run unprotected in kernel space. Consequently, a real-time task that contains a common coding error, such as a corrupt C pointer, can easily cause a fatal kernel fault. That is a problem, since most systems that need real time also demand a very high degree of reliability.
To complicate matters, different implementations of the dual-kernel approach use different APIs. In most cases, services written for the GPOS can’t easily be ported to the real-time kernel, and tasks written for one vendor’s real-time extensions may not run on another’s.
Such solutions point to the difficulty of making a GPOS capable of supporting predictable behavior. This isn’t a matter of “RTOS good, GPOS bad,” however. GPOSs such as Linux, Windows and the various Unixes all function extremely well as desktop or server OSs. They fall short, however, when forced into environments where users expect or need consistently predictable response times.
Still, there are benefits to using a GPOS, such as support for widely used APIs and, in the case of Linux, the open source model. With open source, a developer can customize OS components for application-specific demands and save considerable time troubleshooting. To maintain these benefits, the RTOS vendor should make its source code easily accessible, under commercial-friendly licensing terms. QNX Software Systems, for example, not only publishes its source code on a community portal, but also uses a transparent development model, in which source code to the QNX Neutrino RTOS and other software components is published as it is being developed.
QNX Software Systems