SOFTWARE & DEVELOPMENT TOOLS
Achieving RAMS Objectives in Hard Real-Time Java Software
RAMS (Reliability, Availability, Maintainability and Safety) represents software quality objectives that are relevant to most embedded systems development. Disciplined use of real-time Java technologies can help developers achieve RAMS objectives.
KELVIN NILSEN, AONIX
Creating reliable software is largely a matter of applying careful and disciplined software development and maintenance practices. However, the choice of programming language also plays an important role. Two language properties that bear directly on reliability are simplicity and portability.
Use of a simpler programming language is more likely to result in reliable software because programmers are less likely to misunderstand its semantics and misuse it as a result. Programmers who are dealing with less complexity are less likely to make programming errors. C++, for example, is a very complex language, and certain of its features, such as overloading standard operators, are often misunderstood by programmers. Teams that have used C++ for large development efforts generally reach the conclusion that every team member must be an expert C++ programmer, or the whole team will suffer the mistakes made by less experienced developers.
One area in which C and C++ are much more complex than Java is the way that they manage the allocation and deallocation of temporary memory objects. C programmers use the malloc() and free() function calls. C++ programmers use the new and delete operators. In both C and C++, program reliability will suffer if memory becomes fragmented, and preventing fragmentation is largely outside the control of the application developer. Further complications with dynamic memory management in C and C++ programs are the risks that the memory for allocated objects will never be reclaimed, and that program components will retain pointers to objects that have already been reclaimed. These common programming errors are known respectively as a memory leak and a dangling pointer.
Memory management in traditional Java is much simpler, and thus much more reliable, than the techniques used in C and C++. Java's automatic garbage collection system reduces memory leaks by automatically reclaiming objects that are no longer in use, and garbage collection totally eliminates the possibility of dangling pointer errors. Typical Java garbage collection implementations also defragment memory in order to further enhance reliability. The draft safety-critical Java profile provides the same benefits, but does so without the use of traditional tracing garbage collection. Instead, it allocates temporary objects on the run-time stack, and the memory for these objects is instantly reclaimed when control leaves the currently executing method. All temporary memory allocation is organized as a stack to prevent memory fragmentation. Additionally, special compile-time analysis of the application programs guarantees the absence of dangling pointers.
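The stack discipline described above can be pictured with a small simulation. This is only a sketch of the bookkeeping, not the profile's allocator: the class name StackRegion and the methods mark(), allocate() and release() are hypothetical names chosen for illustration, and offsets stand in for real memory addresses.

```java
/** Illustrative model of stack-organized temporary memory (hypothetical names, not a profile API). */
final class StackRegion {
    private final int capacity;  // total bytes in the region
    private int top = 0;         // offset of the first free byte

    StackRegion(int capacity) {
        this.capacity = capacity;
    }

    /** Records the current top of the stack, as would happen on method entry. */
    int mark() {
        return top;
    }

    /** Reserves size bytes and returns their starting offset. */
    int allocate(int size) {
        if (size < 0 || size > capacity - top) {
            throw new IllegalStateException("region exhausted");
        }
        int offset = top;
        top += size;
        return offset;
    }

    /** Reclaims everything allocated since the mark, as would happen on method exit. */
    void release(int mark) {
        if (mark < 0 || mark > top) {
            throw new IllegalArgumentException("invalid mark");
        }
        top = mark;
    }

    /** Number of bytes currently in use. */
    int used() {
        return top;
    }

    public static void main(String[] args) {
        StackRegion region = new StackRegion(1024);
        int entry = region.mark();   // method entry
        region.allocate(100);        // temporary object A
        region.allocate(200);        // temporary object B
        System.out.println("in use inside method: " + region.used());
        region.release(entry);       // method exit reclaims A and B together
        System.out.println("in use after return: " + region.used());
        // The free space is a single contiguous block again, so a full-size request succeeds.
        System.out.println("offset of next allocation: " + region.allocate(1024));
    }
}
```

Because memory is only ever released from the top, the free space is always one contiguous block. That is why this organization cannot fragment, at the cost of tying each object's lifetime to the method that allocated it.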
Figure 1 illustrates the organization of temporary memory after a main thread has spawned three child threads, setting aside portions of its run-time stack to serve the temporary memory needs of each of the spawned threads' run-time stacks. The safety conventions allow pointers from inner-nested objects to outer-nested objects, as illustrated with the solid-filled arrowheads in this figure. Pointers in the other direction are disallowed. Enforcement of this protocol is performed at compile time by a special hard real-time byte-code verifier based on annotations provided in the source code.
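The direction rule for pointers can be stated as a simple predicate over nested scopes. The sketch below is an illustration only: the Scope class and the mayReference() method are hypothetical names, and the check runs at run time here, whereas the profile enforces the rule at compile time. Treating references between sibling scopes as disallowed is an assumption borrowed from the scoped-memory assignment rules of the RTSJ; the text above only rules out outer-to-inner pointers.

```java
/** Illustrative model of the nesting rule for temporary-memory scopes (hypothetical names). */
final class Scope {
    private final Scope parent;  // enclosing (outer) scope, or null for the outermost scope

    Scope(Scope parent) {
        this.parent = parent;
    }

    /** True if this scope is the given scope or is nested somewhere inside it. */
    boolean isNestedWithin(Scope outer) {
        for (Scope s = this; s != null; s = s.parent) {
            if (s == outer) {
                return true;
            }
        }
        return false;
    }

    /**
     * An object in 'holder' may point to an object in 'target' only if 'target'
     * is guaranteed to live at least as long, that is, if 'holder' is nested within it.
     */
    static boolean mayReference(Scope holder, Scope target) {
        return holder.isNestedWithin(target);
    }

    public static void main(String[] args) {
        Scope mainThread = new Scope(null);
        Scope child1 = new Scope(mainThread);
        Scope child2 = new Scope(mainThread);
        System.out.println("child1 -> main:   " + mayReference(child1, mainThread));
        System.out.println("main   -> child1: " + mayReference(mainThread, child1));
        System.out.println("child1 -> child2: " + mayReference(child1, child2));
    }
}
```

The rule guarantees that a pointer never outlives the object it refers to: an inner scope is always reclaimed before the outer scope that contains it.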
Reliability benefits from portable development methodologies in two important ways. First, programmers who are able to target a portable platform are less likely to make programming errors because they misunderstand or overlook incompatibilities between platforms. Second, programming languages that support portability between different platforms make possible far greater reuse of software components that were developed for one platform and are later deployed on another. Such software can be reused exactly as is, without any changes required for porting. For typical deployments, this means a far greater percentage of the complete software system consists of mature, time-proven software components.
Within the domain of real-time programming, the Real-Time Specification for Java (RTSJ) does not support portable real-time semantics. Compliant implementations of this specification may differ in significant and incompatible ways, preventing real-time Java code that was developed and tested on one RTSJ implementation from running correctly on another. Compliant RTSJ implementations may differ in the scheduling of real-time threads, the times at which task deadline and CPU cost overruns are detected and handled, access to scope-allocated objects within the standard libraries, and allocation of objects within scoped-memory contexts by the standard libraries. The proposed safety-critical Java profile addresses these portability issues by subsetting from the full set of RTSJ capabilities, and carefully specifying the required semantics of the subset to ensure that all compliant implementations represent the same portable real-time programming platform.
Availability means that high-integrity software must always be ready to perform its function. Availability is often measured in terms of a quantity of nines. For example, five nines availability means the system is running reliably 99.999% of the time. Clearly, reliability contributes to availability by extending the mean time between system failures. But high availability also means reducing the time required to recover from failures when they do occur.
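The arithmetic behind these figures is easy to work through. The sketch below uses the common steady-state approximation, availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to recover. The class and method names are illustrative, and the MTBF and MTTR values in main are made-up inputs, not measurements from any system.

```java
/** Availability arithmetic; the MTBF and MTTR figures in main are made-up inputs. */
final class Availability {
    static final double MINUTES_PER_YEAR = 365.0 * 24.0 * 60.0;

    /** Steady-state availability from mean time between failures and mean time to recover. */
    static double availability(double mtbfHours, double mttrHours) {
        return mtbfHours / (mtbfHours + mttrHours);
    }

    /** Expected downtime per non-leap year, in minutes, for an availability in [0, 1]. */
    static double downtimeMinutesPerYear(double availability) {
        return (1.0 - availability) * MINUTES_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.printf("five nines: %.2f minutes of downtime per year%n",
                downtimeMinutesPerYear(0.99999));
        // Same failure rate, two recovery times: a faster restart alone raises availability.
        System.out.printf("MTBF 1000 h, MTTR 1 h:    %.5f%n", availability(1000.0, 1.0));
        System.out.printf("MTBF 1000 h, MTTR 0.01 h: %.5f%n", availability(1000.0, 0.01));
    }
}
```

Five nines leaves a budget of roughly five minutes of downtime per year. With these illustrative inputs, cutting recovery time from an hour to well under a minute moves the same system from about three nines to about five, without any change in how often it fails.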
Providing fast, deterministic restart of a failed system is an obvious requirement for high-availability applications. Achieving this is not trivial. In typical Java environments, startup is especially troublesome because the startup process includes dynamic loading and JIT compilation of byte code. The draft safety-critical Java profile requires static compilation, initialization and linking of components. In traditional Java, the initial values of many shared variables, even of so-called constant variables, depend on the order in which certain non-deterministic startup activities are performed. The safety-critical profile's byte-code verifier, on the other hand, enforces fully deterministic initialization of shared static variables. The safety-critical Java linker binds all of the components together and initializes shared memory in the static load image. The large majority of this load image can be burned into ROM and accessed directly out of ROM.
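The order dependence of static initialization in traditional Java can be reproduced in a few lines. The sketch below is a contrived illustration with hypothetical class names: two pairs of classes contain identical code with mutually dependent static final fields, and the only difference is which class of each pair the program touches first. In a real system the triggering order is decided by whatever startup activity happens to run first.

```java
/** Shows that static final values can depend on class initialization order (contrived example). */
final class InitOrderDemo {
    /** A method call keeps the initializers from being folded into compile-time constants. */
    static int next(int v) {
        return v + 1;
    }

    // First pair: the program touches Left1 before Right1.
    static class Left1  { static final int VALUE = next(Right1.VALUE); }
    static class Right1 { static final int VALUE = next(Left1.VALUE); }

    // Second pair: identical code, but the program touches Right2 before Left2.
    static class Left2  { static final int VALUE = next(Right2.VALUE); }
    static class Right2 { static final int VALUE = next(Left2.VALUE); }

    /** Returns {left, right} when the left class is initialized first. */
    static int[] leftFirst() {
        int left = Left1.VALUE;   // starts Left1 initialization, which in turn starts Right1
        int right = Right1.VALUE;
        return new int[] { left, right };
    }

    /** Returns {left, right} when the right class is initialized first. */
    static int[] rightFirst() {
        int right = Right2.VALUE; // starts Right2 initialization, which in turn starts Left2
        int left = Left2.VALUE;
        return new int[] { left, right };
    }

    public static void main(String[] args) {
        int[] a = leftFirst();
        int[] b = rightFirst();
        System.out.println("left first:  left=" + a[0] + " right=" + a[1]);
        System.out.println("right first: left=" + b[0] + " right=" + b[1]);
    }
}
```

Whichever class is initialized second sees its partner's field while it still holds the default value of zero, so the same source code yields left=2, right=1 in one order and left=1, right=2 in the other. Static, deterministic initialization of the load image removes this class of surprise.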
Occasionally, highly available systems experience hardware failures. When hardware must be replaced, it may also be necessary to replace software device drivers. Few real-time operating systems provide direct support for dynamic replacement of device drivers. Larger, desktop operating systems usually support plug-and-play devices, but the protocols for use of plug-and-play technologies are not especially reliable. Often, conflicts between device drivers supplied by different vendors result in unreliable operation of the newly configured environment. A goal for the safety-critical Java profile is to support reliable and deterministic reconfiguration of device drivers, both for situations in which the device drivers are replaced without down time, and for situations in which hardware replacement requires system reboot.
Figure 2 illustrates the ability to safely and efficiently reconfigure temporary memory in order to support on-the-fly replacement of device drivers. Here, the three threads that had been previously spawned by the main thread have been terminated and replaced with four new threads that occupy the same memory. Note that the last-in, first-out process of reclaiming memory from the three original threads has the beneficial side effect of eliminating all memory fragmentation within those thread scopes.
As with many other issues, the safety-critical Java profile tackles this challenge using a combination of programmer annotations, special byte-code verification and reliable run-time memory management services. This makes it possible to guarantee at compile time that a given device driver is a suitable replacement for another.
Redundant computation and failover processing are not directly supported by the safety-critical Java profile. It is worth mentioning, however, that the Java platform was originally introduced as an Internet programming language. As such, there is considerable experience using Java for networked applications. Since the draft safety-critical Java profile establishes a strong foundation for reuse of portable hard real-time software components, it will be straightforward to develop portable libraries to support safety-critical networked communications in support of fault-tolerant and high-availability redundant computations.
Maintaining real-time software is particularly difficult because traditional interface declarations do not reflect all of the constraints required for reliable composition of the real-time components. This means developers who are called upon to make changes to existing software cannot determine by looking at the component interface alone what rules they must follow in order for their maintenance changes to integrate reliably with other existing software. Maintainers of real-time software must search for all the contexts in which particular components reside in order to determine what sort of changes they may make to those components without compromising the reliability of the system.
Compared with C and C++, Java has shown tremendous strengths as a platform to support easy maintenance and integration. This is because Java software is very portable, and because strong object-oriented abstractions mean that independently developed components integrate cleanly, without compromising the integrity of each other's encapsulation boundaries. C, in contrast, offers very little to help manage the complexity of ever-expanding software systems. With its object-oriented features, C++ does much better than C at separating the concerns of independent software development teams in order to facilitate software maintenance and scalability. However, the lack of true portability, the inherent complexity of the language itself, and its lack of automatic garbage collection make C++ a much more difficult tool than Java in its support for software maintenance and scalability.
The proposed safety-critical Java profile contributes to maintainability by requiring real-time interface constraints to be represented in source code using standard Java annotations, enforcing the consistency of these annotated interface requirements with a special byte-code verifier, and providing a static analysis tool to automatically determine the memory and CPU-time requirements of particular components. Tools to automate the required consistency checking and resource needs analysis are not generally available for C and C++ development.
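The idea of carrying resource constraints in the interface itself can be sketched with ordinary Java annotations. Everything named below is hypothetical: the @MaxTempBytes annotation, the Sensor interface and the honorsBudgets() check are invented for illustration and are not taken from the draft profile. The reflective check is a run-time stand-in for what the profile's byte-code verifier would establish before the program runs, and it compares declared budgets only; it does not measure actual memory use.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Sketch of interface-level resource budgets expressed as annotations (all names hypothetical). */
final class BudgetCheckDemo {
    /** Hypothetical annotation: upper bound on the temporary memory a method may use. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MaxTempBytes {
        int value();
    }

    interface Sensor {
        @MaxTempBytes(256)
        int read();
    }

    static final class FrugalSensor implements Sensor {
        @Override
        @MaxTempBytes(128)  // stays within the interface's budget
        public int read() { return 42; }
    }

    static final class GreedySensor implements Sensor {
        @Override
        @MaxTempBytes(512)  // claims more than the interface allows
        public int read() { return 42; }
    }

    /** True if every budgeted interface method is implemented with a declared budget no larger. */
    static boolean honorsBudgets(Class<?> iface, Class<?> impl) {
        for (Method declared : iface.getMethods()) {
            MaxTempBytes limit = declared.getAnnotation(MaxTempBytes.class);
            if (limit == null) {
                continue;  // no constraint on this method
            }
            try {
                Method actual = impl.getMethod(declared.getName(), declared.getParameterTypes());
                MaxTempBytes claimed = actual.getAnnotation(MaxTempBytes.class);
                if (claimed == null || claimed.value() > limit.value()) {
                    return false;
                }
            } catch (NoSuchMethodException e) {
                return false;  // implementation does not provide the method at all
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("FrugalSensor ok: " + honorsBudgets(Sensor.class, FrugalSensor.class));
        System.out.println("GreedySensor ok: " + honorsBudgets(Sensor.class, GreedySensor.class));
    }
}
```

With the constraint visible on the interface, a maintainer who swaps one implementation for another can see, and a tool can check, whether the replacement still fits the budget that callers were promised.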
DO-178B Level-A certification requires that all code coverage analysis and testing be performed on the native machine language, and that responsibility for every machine code instruction and for every test case be traceable from original system requirements to architecture and design, to source code and test plans, and finally to machine code. If certain machine instructions are not exercised sufficiently by the existing test cases, developers are required to analyze whether the code is really necessary to satisfy the system requirements. If that code is not necessary, the corresponding source code should be removed or restructured to make it consistent with the system requirements. If the code is necessary, the test plan must be modified to make the test plan consistent with the original system requirements. In some cases, failure of test cases to cover all machine code reveals inaccuracies or inconsistencies in the original system requirements. In this case, the original system requirements must be revised. A complete traceability audit trail must be maintained at all times.
Note that the traditional Java execution model is entirely inconsistent with this requirement for traceability from source code to machine code. Traditional Java virtual machines hide the translation of byte code to machine code within a JIT compiler that is part of the run-time environment. Some sophisticated virtual machines actually produce multiple native-code translations for each byte-code method, optimizing the code differently in each translation based on run-time profiling information. The safety-critical Java profile supports deterministic compilation, linking and initialization. The entire safety-critical application is translated to native machine code and linked into a ROM-loadable image prior to execution. Since this technology is designed to support safety-critical development, tools will facilitate mappings between machine code and the corresponding source code.
When measured in terms of RAMS objectives, the draft safety-critical Java specification offers important benefits over alternative approaches based on C, C++, traditional Java and RTSJ Java. Commercial availability of the proposed safety-critical Java standard to developers of safety-critical systems will enable economical creation of high-quality hard real-time software that satisfies the RAMS objectives discussed in this article.