TECHNOLOGY CONNECTED
Computing vs. Electrical Power
Use Power Debug to Optimize Software for Minimal Power Consumption
With power debugging, developers can synchronize and optimize their source code to minimize energy requirements. This ensures that their project is as energy-efficient as possible without compromising the performance of the application.
ANDERS LUNDGREN AND LOTTA FRIMANSON, IAR SYSTEMS
Long battery lifetime is an essential characteristic for embedded systems in market segments ranging from medical to consumer electronics. Traditionally, minimizing power consumption has fallen to the hardware engineers. In active systems, however, power consumption depends not only on the design of the hardware, but also on how it is used—which, in turn, is controlled by the system software.
Using a technique called power debugging, software developers can map the power consumption of their systems to the program’s instruction sequence, allowing them to discover and remove errors in source code that increase power consumption. These debugging capabilities result in code that is as energy-efficient as possible without compromising the performance of the application.
The IAR C-SPY debugger visualizes the power consumption data both statically and dynamically in different views and provides both power profiling and debugging opportunities for an application. In this article we will show how to use the various views and interlinked features of the debugger to improve the performance of a sample system by better than an order of magnitude.
How Does Power Debugging Work?
The technology for power debugging is based on the ability to sample the power consumption and correlate each sample with the program’s instruction sequence and hence with the source code. One difficulty is to achieve high precision with sampling. The ideal would be to sample the power consumption with the same frequency the system clock uses, but power system capacitances tend to compromise temporal resolution, making it difficult to isolate power consumption to any one instruction.
From the software developer’s perspective this isn’t necessarily a problem since it is more interesting to correlate the power consumption with the source code and various events in the program execution than with individual instructions. The resolution needed therefore is much lower than one sample per instruction.
The IAR I-jet debug probe measures the voltage drop across a small resistor in series with the supply power to the device (see “Power Debugging—Minimizing Power Consumption by Tuning the Code”). The voltage drop is measured by a differential amplifier and then sampled by an A/D converter.
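To make the measurement chain concrete, the following is a minimal sketch of how a raw A/D reading could be turned back into a supply current using Ohm's law. The resistor value, amplifier gain, reference voltage and converter resolution are all illustrative assumptions, not I-jet specifications, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* All constants are illustrative assumptions, not I-jet specifications. */
#define SHUNT_MILLIOHM 1000u /* assumed 1 ohm sense resistor */
#define AMP_GAIN       10u   /* assumed differential amplifier gain */
#define ADC_REF_MV     3300u /* assumed A/D reference voltage in millivolts */
#define ADC_MAX_COUNT  4095u /* assumed 12-bit converter */

/* Convert a raw A/D reading into supply current in microamps (hypothetical helper). */
uint32_t adc_to_microamps(uint16_t raw)
{
    assert(raw <= ADC_MAX_COUNT);
    /* Voltage seen at the A/D input, in microvolts. */
    uint64_t adc_uv = (uint64_t)raw * ADC_REF_MV * 1000u / ADC_MAX_COUNT;
    /* Undo the amplifier gain to recover the drop across the resistor. */
    uint64_t shunt_uv = adc_uv / AMP_GAIN;
    /* Ohm's law: microvolts / milliohms gives milliamps, so scale to microamps. */
    return (uint32_t)(shunt_uv * 1000u / SHUNT_MILLIOHM);
}
```

With these assumed values, a full-scale reading corresponds to 330 mA, which shows why the resistor and gain have to be chosen to match the current range of the target device.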
The key to accurate power debugging is a good correlation between the instruction trace and the power samples. The best correlation can be done if complete instruction trace is available, as is the case for ARM MCUs with embedded trace module (ETM) support. The drawback with using ETM is that it requires a special debug probe and ETM support in the device itself.
A less accurate approach, but one that still gives good correlation, uses the program counter (PC) sampling facility available in the ARM Cortex-M3/M4 cores. The data watchpoint and trace (DWT) module implements the PC sampler, sampling the PC at around 10 kHz and triggering an instrumentation trace macrocell (ITM) packet for each sample taken. The ITM is the formatter for events originating from the DWT. It packetizes the events and time stamps them.
The debug probe samples the power consumption of the device using an A/D converter. By time stamping the sampled power values and the PC samples, the debugger can present power data on the same time axis as graphs like interrupt log and variable plots, and correlate power data to source code (Figure 1).
Figure 1
By time stamping the sampled power values and the PC samples it is possible for the debugger to correlate power data to the source code.
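The correlation step can be sketched as a merge of two time-ordered sample streams. This is only an illustration of the idea, not the C-SPY implementation: the type names, the microsecond time base and the rule "use the latest PC sample at or before the power sample" are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t t_us; uint32_t pc; } PcSample;           /* time-stamped PC sample */
typedef struct { uint32_t t_us; uint32_t microamps; } PowerSample; /* time-stamped power sample */

/* For each power sample, store the PC of the latest PC sample taken at or
 * before it. Both arrays must be sorted by time stamp. Power samples that
 * precede the first PC sample get PC 0, meaning "unknown". */
void correlate(const PcSample *pcs, size_t n_pc,
               const PowerSample *pw, size_t n_pw,
               uint32_t *pc_out)
{
    assert(pc_out != NULL);
    size_t j = 0;
    for (size_t i = 0; i < n_pw; i++) {
        /* Advance while the next PC sample is not later than this power sample. */
        while (j + 1 < n_pc && pcs[j + 1].t_us <= pw[i].t_us)
            j++;
        pc_out[i] = (n_pc > 0 && pcs[j].t_us <= pw[i].t_us) ? pcs[j].pc : 0;
    }
}
```

Because both streams share one time axis, a single pass is enough. A lower PC sampling rate simply means that more power samples map to the same address, which is the loss of precision described above.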
Power Debugging in Action
Let’s investigate the kinds of efficiency improvements that power debugging can deliver by running through a few examples. We will be using the EFM32 Gecko development kit from Energy Micro and the IAR Embedded Workbench for ARM. Our application is a burst waveform generator for a marine sonar system.
First, we download the application and then we open up the Power Log window. The Power Log window displays a log of all collected power samples. This window can be useful to find peaks in the power sampling. Since the power samples are correlated with the executed code, it is possible to double-click on a value in the Power Log window to get to the corresponding code. Depending on the power sampling frequency, the precision will be different, but there is a good chance that you will find the code sequence that caused the peak.
A few seconds into execution we see that the application is drawing almost 82 mA (Figure 2). Somewhere at the start of the program we have set up the application to consume a lot of power. To find the location, we reset the debugger and begin stepping through the code. The Power Log window shows us that initialization of the GP I/O port increases current draw to nearly 70 mA. GP I/O port D is only used for diagnostic outputs, which we don’t require at this point, so we will comment out that portion of the code. After reloading the application and running it, the Power Log window shows that consumption has dropped to less than 10 mA.
Figure 2
The Power Log window shows an average power consumption of almost 70 mA after initialization prior to debugging (top). That value drops by an order of magnitude once an unnecessary GP I/O port is deactivated (bottom).
Another way to cut power consumption is to reduce the CPU frequency of the device. Let’s investigate what happens when we reduce the frequency from 32 MHz to 14 MHz (Figure 3). First, we open up the Timeline window. In the Timeline window the power samples are displayed in a time scale together with interrupt activities and up to four user-selected application variables. Also, the Timeline window is correlated to both the Power Log window and the Source Code windows, so you are just a double-click away from the source code that corresponds to the values you see on the time line.
Figure 3
Cutting CPU frequency from 32 MHz to 14 MHz reduces peak power use from 3.5 mA to 2.5 mA.
For this application we’ll choose the Power Log and the Data Log plots. The Data Log is a log of the data breakpoints we have defined in our application. We set up one for the wave trace to show our sonar as we run our application. After a startup interval of roughly four seconds, our sonar bursts begin. If we drop down to the Power Log window, we can zoom in to look at individual peaks (Figure 4). Hovering over a peak brings up a text box that displays the statistics; in this case it shows that current usage ranges from 1.6 mA to 10 mA. Now we change the CPU frequency to 14 MHz and run the application again. The minimum current remains at 1.6 mA but the maximum current drops from 10 mA to 5 mA. If we check the debug logs, the application appears to be running just fine.
Figure 4
The Timeline window allows us to zoom in to resolve individual power peaks; hovering brings up a text box with specifics. Double clicking takes us to the block of source code responsible for this performance.
Function Profiling
In a task-oriented system, it is probably more interesting to see how a particular function affects power consumption than to see statement-by-statement how the power consumption changes. The Function Profiler will help uncover the functions where most time is spent during execution for a given stimulus. In this way, we can discover sections in the application that present potential for power consumption optimization.
The Function Profiler window lists the number of samples per function, the percent of energy usage, and also the average values together with maximum and minimum current values. In general, optimizing for power is very similar to optimizing for speed—the faster a task is executed, the more time can be spent in a low power mode. By maximizing the idle time, we can reduce the power consumption. Let’s use our test system as an example.
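The link between execution time and power can be illustrated with a small duty-cycle calculation. The sketch below assumes a task that repeats with a fixed period and a device that draws one current while active and another while asleep; the function name and all numbers used with it are hypothetical, not measurements from the test system.

```c
#include <assert.h>
#include <stdint.h>

/* Average current over one period for a task that is active for active_us
 * microseconds and sleeps for the rest of period_us (hypothetical helper). */
uint32_t average_microamps(uint32_t active_ua, uint32_t sleep_ua,
                           uint32_t active_us, uint32_t period_us)
{
    assert(period_us > 0 && active_us <= period_us);
    /* Charge drawn in each phase, in microamp-microseconds. */
    uint64_t charge = (uint64_t)active_ua * active_us
                    + (uint64_t)sleep_ua * (period_us - active_us);
    return (uint32_t)(charge / period_us);
}
```

As an example with made-up figures, a device drawing 5000 uA while active and 100 uA while asleep averages 1080 uA if it is active for 2 ms of every 10 ms. Halving the active time to 1 ms brings the average down to 590 uA, even though the peak current is unchanged.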
As we can see in Figure 3, power consumption in our waveform generator rises to 5 mA almost as soon as the device begins generating a waveform, even when the generator is between pulses. We need to find a way to correct this. The EFM32 Gecko features multiple low-power modes that offer the potential to save a significant amount of power. What we will do is modify the code to put the EFM32 into Energy Mode 2 when exiting the main loop, and we will put it into Energy Mode 1 at the end of each sonar burst.
We can check the energy savings of this behavior using power profiling. In power profiling, we combine function profiling with power sampling to measure the power consumption per function and display that in the Function Profiler window.
In the Timeline window, we highlight an area of interest (in this case, three bursts), then bring up the function profile selection. Now we can sort to discover the functions that consume the most power within the selection (Figure 5). The list shows that our modifications worked: the system now spends more than 95% of its energy in idle mode.
Figure 5
The Power Profiler sorts the function list by greatest energy usage, showing that the bulk of system energy is consumed by idle operations
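The kind of bookkeeping behind such a per-function list can be sketched as follows. This is an illustration rather than the debugger's actual implementation: it assumes each power sample has already been attributed to a function, that the supply voltage and sampling interval are constant (so energy is proportional to summed current), and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FUNCS 8 /* illustrative limit on the number of profiled functions */

typedef struct { uint8_t func_id; uint32_t microamps; } Sample; /* one attributed power sample */

typedef struct {
    uint32_t samples;        /* number of samples that landed in the function */
    uint64_t sum_ua;         /* summed current, proportional to energy */
    uint32_t min_ua, max_ua; /* minimum and maximum current seen */
} FuncStats;

/* Accumulate per-function statistics from a list of attributed samples. */
void profile(const Sample *s, size_t n, FuncStats stats[MAX_FUNCS])
{
    for (size_t f = 0; f < MAX_FUNCS; f++)
        stats[f] = (FuncStats){0, 0, UINT32_MAX, 0};
    for (size_t i = 0; i < n; i++) {
        assert(s[i].func_id < MAX_FUNCS);
        FuncStats *st = &stats[s[i].func_id];
        st->samples++;
        st->sum_ua += s[i].microamps;
        if (s[i].microamps < st->min_ua) st->min_ua = s[i].microamps;
        if (s[i].microamps > st->max_ua) st->max_ua = s[i].microamps;
    }
}

/* Share of the total energy used by function f, in whole percent (rounded down). */
uint32_t energy_percent(const FuncStats stats[MAX_FUNCS], size_t f)
{
    uint64_t total = 0;
    for (size_t i = 0; i < MAX_FUNCS; i++)
        total += stats[i].sum_ua;
    return total ? (uint32_t)(stats[f].sum_ua * 100u / total) : 0;
}
```

The average current per function is sum_ua divided by samples. Sorting the functions by sum_ua would give an ordering by energy like the one described for the profiler list.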
Waiting for Device Status
One common mistake that can waste power is to use a poll loop to wait for a status change in, for example, a peripheral device. Code constructions like the examples below execute without interruption until the status value changes into the expected state.
while (USBD_GetState() < USBD_STATE_CONFIGURED);
while ((BASE_PMC->PMC_SR & PMC_MCKRDY) != PMC_MCKRDY);
Another related code construction is the implementation of a software delay as a for or a while loop, as in the following example:
i = 10000; // SW Delay
do i--;
while (i != 0);
This piece of code keeps the CPU very busy executing instructions that do not do anything except pass time.
Consider another example with our EFM32 Gecko connected to an LCD display. We download the code and start the application. If we check the Function Profiler window, we see that the functions that handle the LCD are at the top when it comes to power consumption. The Timeline window reveals current spikes that indicate heavy power usage. Because the current values are correlated to the executed code, we can easily switch to the corresponding source code by double-clicking on one of the values.
The code that we reach is a while loop where the processor waits for the LCD to be ready for update. Because the CPU is running at full speed while waiting for the LCD, the power consumption goes up. Polling of a device status change should be solved with interrupts if possible, or by using a timer interrupt so that the CPU can sleep between the polls.
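An interrupt-driven wait might look like the sketch below. It is written to run on a host PC, so the sleep function is a stand-in that simulates the LCD interrupt arriving on every third wake-up. On a Cortex-M device that stand-in would typically be replaced by the wait-for-interrupt instruction (the CMSIS __WFI() intrinsic). The handler name, the flag and the simulation are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static volatile bool lcd_ready = false; /* set by the hypothetical LCD interrupt handler */
static uint32_t wakeups = 0;            /* counts how often the CPU woke up */

/* Hypothetical interrupt handler: the LCD signals that it can be updated. */
void lcd_ready_isr(void)
{
    lcd_ready = true;
}

/* Stand-in for sleeping until any interrupt occurs. On a host there is no
 * real interrupt, so we simulate the LCD interrupt on every third wake-up. */
static void sleep_until_interrupt(void)
{
    wakeups++;
    if (wakeups % 3u == 0)
        lcd_ready_isr();
}

/* Wait for the LCD without busy polling; returns the number of wake-ups needed. */
uint32_t wait_for_lcd(void)
{
    uint32_t start = wakeups;
    while (!lcd_ready)
        sleep_until_interrupt(); /* CPU is idle here instead of spinning */
    assert(lcd_ready);           /* the loop only exits once the handler has run */
    lcd_ready = false;           /* consume the event */
    return wakeups - start;
}
```

The loop body now runs once per interrupt rather than continuously, so the CPU spends the waiting time asleep. A production version would also need to close the window between testing the flag and going to sleep, for example by masking interrupts around the check.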
The power debugging methodology described above gives the embedded developer the opportunity to examine the application program code and flow with regard to power consumption. In our primary example, system power usage dropped from 82 mA to less than 3 mA. Power debugging can detect other opportunities for improving efficiency, including optimal use of peripherals and eliminating time spent in loops.
IAR Systems
Uppsala, Sweden.
+46 18 16 78 00.
[www.iar.com].






