INDUSTRY INSIGHT

Software Analysis Tools

Analysis Tools Get to the Heart of Software Performance

So now the code runs. Or does it? Only analysis of many factors can reveal hidden problems and provide confidence for time- and mission-critical systems.

TOM WILLIAMS, EDITOR-IN-CHIEF

Getting code to run requires programming skill. Making sure that it runs correctly requires that it be exercised and debugged. Assuring that it runs efficiently, optimally, reliably and safely requires in-depth analysis. The tools that achieve these latter goals start from the concepts of debugging but are used to examine code from myriad different aspects. For embedded projects that must work, and on which human life and safety often depend, they are not optional.

When applying analysis to running code, the developer is very often confronted with a kind of “Heisenberg dilemma.” That is, “What level of intrusiveness into the actual execution can I tolerate and still be confident that what I’m seeing is what will actually happen in the deployed system?” That question applies mostly to timing issues. Among the other issues are assessing the overall performance, making sure that memory is being used efficiently and reliably and finding intermittent glitches that may not always show up in a standard debugging session.

In addition to the level of intrusiveness, one must also consider the specificity of the tool. Generally speaking, and of necessity, the deeper a tool delves into the inner workings of a system, the more intrusive it tends to become, as well as the more specific to the underlying operating system. There was a time when tools loaded instrumentation tags into the source code, which then produced instructions that sent analysis data off to a connected development host or an attached probe. That approach has since largely fallen out of favor, at least for embedded development, although a certain amount of instrumentation may still be needed in some cases.

Generally, however, the overhead on the target is caused by a relatively non-intrusive monitor or interface that buffers and sends execution data to the host. The least intrusive and deepest analysis is possible using tools that have a hardware assist, such as a JTAG probe, and take advantage of on-chip debug facilities such as trace ports or embedded trace macrocells (ETMs). When not relying on instrumentation, a tool must be able to reproduce the timing characteristics as they would be without the overhead introduced by the tool.

Thus, when running a profiler, for example, the code may run slower overall than the deployed system, but the results displayed must reflect the actual execution times of the various functions. Most profilers can generate accurate results whether the code is running on the target or under an instruction set simulator on the host. In the latter case, it certainly runs slower, but is able to get an accurate instruction count and deliver performance measurements that can be used to identify areas that may be bottlenecks. It is these routines that the developer will want to zero in on to try to make more efficient.
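
To make the mechanics concrete, the following sketch (plain C with made-up names; a real profiler such as ProfileScope collects this kind of data automatically for every function) shows the bookkeeping behind function-level profiling: each call to a routine is timed and the cost charged to that routine so that averages can be compared across the program.

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for an on-chip cycle counter; an embedded
 * profiler would read the CPU timebase or use trace hardware instead. */
static uint64_t read_cycle_counter(void) { return (uint64_t)clock(); }

static uint64_t ticks_in_filter;   /* time accumulated in the routine */
static uint32_t calls_to_filter;   /* number of calls observed        */

/* Example workload whose cost is being attributed. */
static void filter_block(int *buf, int n)
{
    for (int i = 1; i < n; i++)
        buf[i] = (buf[i] + buf[i - 1]) / 2;
}

/* Wrapper that charges each call's elapsed time to the routine. */
static void profiled_filter_block(int *buf, int n)
{
    uint64_t start = read_cycle_counter();
    filter_block(buf, n);
    ticks_in_filter += read_cycle_counter() - start;
    calls_to_filter++;
}

int main(void)
{
    int buf[1024] = { 0 };
    for (int i = 0; i < 10000; i++)
        profiled_filter_block(buf, 1024);
    printf("filter_block: %u calls, %llu ticks total\n",
           (unsigned)calls_to_filter, (unsigned long long)ticks_in_filter);
    return 0;
}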

A rich set of analysis tools, known as ScopeTools, is available from both Wind River Systems and Real-Time Innovations. Targeted at Wind River platforms using the Tornado development environment, the suite consists of five tools that are representative of the kinds of analysis tasks that need to be done for quality embedded software. They examine the code and its behavior from different aspects.

StethoScope is a tool that can monitor a running system and watch a set of variables or any memory location. It lets you trigger data collection on specific events, change variables in a running program, see peak values and save all the data to disk. The purpose of StethoScope is to provide live data analysis on a running system without interfering with the code.

ProfileScope, on the other hand, is used to diagnose the execution speed of a program on a function-by-function basis. The profiler produces histograms of the execution times of the various routines so that you can zoom in on those that represent bottlenecks and concentrate on the areas that appear to be taking the most CPU resources in an effort to improve the overall performance of the application (Figure 1).

MemScope is a visual memory analysis tool that helps manage memory use efficiently by identifying memory leaks as they occur, checking memory consistency and finding errors that may occur in the memory pool. It offers Aggregate, Tree, Time and Fragmentation views and can track the allocation and deallocation of memory. This includes a view of the full allocation call stack to help figure out why memory was allocated. The tool can be used on a running system with no need for instrumentation or special compilation.
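
As a rough illustration of what such a tool tracks, the hypothetical C sketch below wraps allocation and deallocation so that any imbalance shows up as outstanding blocks and bytes; MemScope collects the equivalent information, including the allocating call stack, from a live target without wrappers or recompilation.

#include <stdio.h>
#include <stdlib.h>

/* Every allocation is counted and every free is matched against it,
 * so a nonzero balance at a checkpoint points to a leak. */
static long live_blocks;   /* allocations not yet freed   */
static long live_bytes;    /* bytes currently outstanding */

static void *tracked_malloc(size_t n)
{
    /* Over-allocate to remember the block size alongside the data. */
    size_t *p = malloc(n + sizeof(size_t));
    if (!p)
        return NULL;
    *p = n;
    live_blocks++;
    live_bytes += (long)n;
    return p + 1;
}

static void tracked_free(void *ptr)
{
    if (!ptr)
        return;
    size_t *p = (size_t *)ptr - 1;
    live_blocks--;
    live_bytes -= (long)*p;
    free(p);
}

int main(void)
{
    void *a = tracked_malloc(128);
    void *b = tracked_malloc(256);   /* never freed: a deliberate leak */
    tracked_free(a);
    printf("outstanding: %ld blocks, %ld bytes\n", live_blocks, live_bytes);
    (void)b;
    return 0;
}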

TraceScope lets the developer follow program execution by recording calls to a user-specified set of functions in the running system. Every time a specified routine is traced, the tool records what routine was called, what task called it and what arguments were used. This tool does not record the execution of every instruction, but lets the user zero in on routines of interest and see them in the context of calling sequences.
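
The hypothetical sketch below shows the kind of record such a trace keeps, one entry per call to a selected routine noting the routine name, the calling task and an argument; TraceScope gathers this automatically rather than through hand-written wrappers like the one shown.

#include <stdio.h>

/* One entry per traced call: which routine, which task, what argument. */
typedef struct {
    const char *routine;   /* name of the traced routine  */
    int         task_id;   /* task that made the call     */
    int         arg;       /* first argument, for context */
} trace_record;

#define TRACE_DEPTH 64
static trace_record trace_buf[TRACE_DEPTH];
static int trace_count;

static void trace_call(const char *routine, int task_id, int arg)
{
    trace_buf[trace_count % TRACE_DEPTH] =
        (trace_record){ routine, task_id, arg };
    trace_count++;
}

/* Routine of interest; the wrapper logs each call before doing the work. */
static int set_motor_speed(int task_id, int rpm)
{
    trace_call("set_motor_speed", task_id, rpm);
    return rpm;   /* stand-in for the real work */
}

int main(void)
{
    set_motor_speed(1, 1200);
    set_motor_speed(3, 900);
    for (int i = 0; i < trace_count && i < TRACE_DEPTH; i++)
        printf("%s called by task %d with arg %d\n",
               trace_buf[i].routine, trace_buf[i].task_id, trace_buf[i].arg);
    return 0;
}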

CoverageScope is used in conjunction with testing to show what sections of code have actually been executed during testing and, therefore, what areas have yet to be exercised. The tool provides a color-coded scheme that can indicate different levels of coverage: function, block, decision and condition. While viewing results, the user can also use the source window to browse the corresponding sections of the source code. A coverage tool lets a developer establish a level of confidence even if 100 percent of the code hasn't been covered. In some cases, cost and time considerations may indicate that further testing is finding no more errors and the code can be released at, say, 90 percent coverage. At that point, at least one knows where one stands.
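
A small, hypothetical C example makes the distinction between coverage levels concrete: the two test cases below exercise both outcomes of the decision (decision coverage), yet the second condition is never individually true, exactly the kind of gap that condition coverage exposes.

#include <stdio.h>

/* A decision made up of two conditions. */
static int needs_shutdown(int temp, int pressure)
{
    if (temp > 100 || pressure > 50)
        return 1;
    return 0;
}

int main(void)
{
    /* (80, 10) makes the decision false; (120, 10) makes it true.
     * Decision coverage is complete, but pressure > 50 has never been
     * true on its own, so condition coverage is not. */
    printf("%d %d\n", needs_shutdown(80, 10), needs_shutdown(120, 10));
    return 0;
}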

Massive Tracing

In some cases, developers need to delve even more deeply into the workings of timing relationships, intermittent glitches and the interaction of the applications with both the hardware and the operating system. In the past, in-circuit emulators were used to gather execution trace data of every cycle. With today’s highly integrated CPUs, tracing requires on-chip support. Green Hills Software has recently introduced a trace probe and analysis tool that can work in tandem on up to a whole gigabyte of recorded trace data.

The SuperTrace probe works with ARM processors, such as the ARM7, ARM9 and ARM10 that have the embedded trace macrocell (ETM), and with processors with a more generic JTAG-like trace port such as the PowerPC 405/440. With a full gigabyte of trace memory, it is able to record up to 1.7 billion cycles from a PowerPC 405, for example, running at 600 MHz. Additional target support is under development. This allows the probe to record several seconds of execution, or if using software-assisted branch analysis, up to minutes of program execution (Figure 2).

In addition, when used with Green Hills’ Integrity RTOS, the probe can support virtual memory. It does this, according to Green Hills’ VP David Kleidermacher, by understanding where all the mapping tables are and the address translation. “We can detect whenever there’s an address switch so we can tell for any point in the code what address space we’re running in. The mapping information gives us exactly what’s running at any time.”

With such massive trace capability, it was possible to apply a new kind of analysis tool called the TimeMachine, which, in effect, lets you run the program backward and forward as many times as you wish. What is actually happening, of course, is that the tool is following the recorded instruction sequences back and forth—and displaying the source code—as if the actual code were running, and stepping backward and forward as fast or slowly as the developer likes.

What such a combination of deep trace and analysis enables is an improved ability to catch and find the causes of intermittent glitches that may occur only under unusual circumstances. Such events are difficult to reproduce and may be caused by an error that occurred much earlier in the program, such as a corrupted pointer. Using the TimeMachine, one can simply let the program run until it hits a glitch, set a watchpoint on a suspected variable and then run the program backward, set to break when the variable changes. This is what Kleidermacher calls “catching the bug in the box.” It means you don’t have to try to reproduce the possibly rare conditions that caused the bug but have a better chance of finding out what they were.
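
A hypothetical C fragment shows the class of bug in question: an overflow silently corrupts a neighboring pointer long before the program finally uses it and fails. With a full trace recorded, a watchpoint on the pointer, run backward from the failure, lands directly on the offending copy.

#include <stdio.h>
#include <string.h>

/* Deliberately buggy example: the 12-byte copy overruns the 8-byte
 * scratch field and corrupts config_name, but nothing fails until the
 * pointer is finally used much later. */
struct device {
    char  scratch[8];
    char *config_name;    /* clobbered by the overflow below */
};

int main(void)
{
    static char name[] = "sensor-a";
    struct device dev = { .config_name = name };

    memcpy(dev.scratch, "calibration", 12);        /* the real defect  */

    /* ... arbitrarily later in the run ... */
    printf("configuring %s\n", dev.config_name);   /* may crash here   */
    return 0;
}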

A tool like the TimeMachine can be used for non-kernel-specific diagnosis, but can also be linked with tools that are very specific to the Integrity RTOS, such as the EventAnalyzer. The EventAnalyzer shows interactions with the kernel by showing graphical displays of operating system events including interrupts, context switches, service calls and exceptions with a display that resembles a traditional logic analyzer (Figure 3). This lets you look at the virtual-to-physical mappings, even in systems that switch between multiple threads.

Kernel Awareness

In order to verify real-time operation, it is necessary to drill down to the specifics of the individual operating system. While this does yield vital information, it also limits a tool to that particular OS. However, if you need to verify the schedulability of code, a tool such as the RTXC Quadros kernel awareness tool from Quadros Systems is what is needed. The tool provides profiles of all execution entities: threads, interrupt service routines and the kernel. It measures execution times, preemption times and latency to give you a worst-case timing for each entity, including how often and how long it was preempted or interfered with. This information allows you to set priorities for deadline monotonic scheduling (Figure 4).

Under deadline monotonic scheduling, the tasks with the shortest deadlines are scheduled first. It is necessary to measure under real-world conditions because, as Quadros president Tom Barrett says, “Things are not always what they seem. For example, was a task’s release time stable or did it hang around a long time before it got control?” Once you have the information, the theory is that if you are sure that all the tasks can meet their deadlines, then the system is schedulable by definition.
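
As a simple sketch with made-up task names and numbers, deadline monotonic priority assignment amounts to sorting tasks by relative deadline and giving the shortest deadline the highest priority; the measured worst-case figures from a kernel awareness tool are what make such an assignment trustworthy.

#include <stdio.h>
#include <stdlib.h>

typedef struct {
    const char *name;
    int deadline_ms;   /* relative deadline             */
    int priority;      /* assigned below; 0 is highest  */
} task_t;

/* Sort tasks so that the shortest deadline comes first. */
static int by_deadline(const void *a, const void *b)
{
    return ((const task_t *)a)->deadline_ms - ((const task_t *)b)->deadline_ms;
}

int main(void)
{
    task_t tasks[] = {
        { "logger",  100, 0 },
        { "motor",     5, 0 },
        { "display",  50, 0 },
    };
    int n = (int)(sizeof tasks / sizeof tasks[0]);

    qsort(tasks, (size_t)n, sizeof tasks[0], by_deadline);
    for (int i = 0; i < n; i++) {
        tasks[i].priority = i;   /* shortest deadline -> priority 0 */
        printf("%-8s deadline %3d ms -> priority %d\n",
               tasks[i].name, tasks[i].deadline_ms, tasks[i].priority);
    }
    return 0;
}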

Another scheduling tool, called Rapid RMA from Tri-Pacific Software, supports rate monotonic scheduling as well as deadline monotonic. With rate monotonic scheduling, you set the priorities of your tasks according to the rate at which they have to run, the frequency of their duty cycle. It works with Wind River’s Tornado environment and interfaces with that company’s WindView tool, which is similar to the Green Hills EventAnalyzer. Other versions of Rapid RMA interface with object-oriented UML-based graphical development tools such as Rational Rose Real Time and the Rhapsody tool from iLogix.
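
For rate monotonic scheduling, the classic Liu and Layland utilization bound offers a quick schedulability check; the sketch below, with hypothetical task figures, sums each task's worst-case execution time over its period and compares the total against n(2^(1/n) - 1). Tools such as Rapid RMA perform far more exact analyses, but the bound captures the underlying idea.

#include <math.h>
#include <stdio.h>

/* Rate monotonic utilization-bound check (link with -lm).
 * If total utilization <= n(2^(1/n) - 1), the task set is schedulable
 * under rate monotonic priorities; above the bound, an exact analysis
 * is needed before drawing any conclusion. */
int main(void)
{
    /* hypothetical worst-case execution times and periods, in ms */
    double c[] = { 1.0, 2.0, 4.0 };
    double t[] = { 5.0, 20.0, 50.0 };
    int n = (int)(sizeof c / sizeof c[0]);

    double u = 0.0;
    for (int i = 0; i < n; i++)
        u += c[i] / t[i];

    double bound = n * (pow(2.0, 1.0 / n) - 1.0);
    printf("utilization %.3f vs. bound %.3f: %s\n",
           u, bound, u <= bound ? "schedulable" : "needs exact analysis");
    return 0;
}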

Looking at Java

With Java moving ever deeper into embedded systems, there is a need to analyze and verify its performance. To that end, Aonix has adopted a tool called OptimizeIt from the desktop environment that is basically a profiling tool for Java. It helps identify which pieces of the system are consuming the most CPU time. It gives a breakdown of how much time is spent in each method and can also narrow its view to blocks within a method and identify which pieces of the system are allocating memory.

OptimizeIt works with Aonix’ PERC real-time Java implementation, but can theoretically work with any Java virtual machine that supports the profiling interface API defined by Sun. The tool runs mostly on the host system and works at the VM level by connecting to the byte code and communicating with the VM in terms of ranges of byte code and mapping those back to the source code level. This makes it fairly intrusive in terms of the speed at which the system runs while OptimizeIt is gathering data. It does not, however, alter any of the code by inserting instrumentation.

Keeping track of memory is vital for getting Java to perform in embedded applications. Programmers are encouraged to think at a high level of abstraction and often do not adequately consider performance issues. In Java, memory, once allocated, is not explicitly deallocated, but is returned to the system via the garbage collector. In embedded systems, it is important to keep track of how much memory is allocated and to ensure that when garbage collection does run, it does not interfere with vital tasks.

To help decide when to run garbage collection, PERC 4.1 has a pacing agent, a thread that runs on top of the VM management API and schedules garbage collection. Rate monotonic analysis helps characterize the real-time work load so that a certain amount of CPU time can be reserved for certain priorities that will not be interfered with by the garbage collector. The garbage collector can then only run in the leftover CPU time and thus does not interfere with the application’s deadlines. The pacing agent is used during development to characterize the schedulability of both priority tasks and the garbage collector. It also runs in the deployed system and looks specifically at how to add garbage collection to the rate monotonic workload.

One effect of applying analysis tools to Java, according to Aonix’ Kelvin Nielsen, is that programmers learn to develop a discipline about the effects of certain kinds of programming on the system’s resources. For example, they may learn that it’s not a good idea to allocate a lot of objects in time-critical loops, or that there are certain things, like string concatenation, that you can do in Java without realizing that you’re allocating memory. The result, according to Nielsen, is that, “having paid the price of education, the next time they write code, they will be thinking, ‘Oh, I know what happens when I write this kind of code.’”

That effect can probably be expanded to apply to the discipline of analyzing software in general. After all, the goal of analysis is to understand. Once we understand what’s going on (and realize how often we haven’t), we can take action to improve the code and, perhaps most important, gain confidence in it. That then feeds back into coding practice and in that sense, investment in good software analysis tools, detailed and time-consuming as they may be to use, has rewards not only in programs that run better but also in programmers who program better.

Aonix
San Diego, CA.
(800) 972-6649.
[www.aonix.com].

Green Hills Software
Santa Barbara, CA.
(805) 965-6044.
[www.ghs.com].

Quadros Systems
Houston, TX.
(832) 351-2830.
[www.quadros.com].

Real-Time Innovations
Sunnyvale, CA.
(408) 734-4200.
[www.rti.com].

Tri-Pacific Software
Alameda, CA.
(510) 814-1770.
[www.tripac.com].

Wind River Systems
Alameda, CA.
(510) 748-4100.
[www.windriver.com].