Efficient Memory Protection for Embedded Systems

Embedded systems, like desktops and servers, are vulnerable to programming errors
that can cause unintended behavior and even system failure. A bug in one program
can overwrite data in another program or in the kernel, threatening the entire
system.

Many techniques exist for protecting programs from the errant actions of other
programs on the same system. In desktops and servers, this is done through
memory management hardware and a multi-user operating system that gives each
user an independent virtual machine environment. This isolation prevents direct
access to the resources of other users or of the operating system itself, where
an errant write could do great damage (Figure 1). However, desktop and server
approaches require more resources and may be impractical for an embedded system,
which typically has demanding performance requirements and memory constraints.

Another approach, called Embedded Memory Protection (EMP), is specifically designed
to provide memory protection for embedded systems. EMP provides the benefits of
memory protection without the typical overhead or complexity of desktop approaches.

Memory Protection Background

In desktop and server applications, memory protection has traditionally involved
virtual memory and a process model programming environment in which each process
has an independent virtual address range mapped to physical memory. This approach
introduces overhead for the complex bookkeeping of page tables, requires additional
memory to hold those tables, imposes performance penalties for accesses outside of
cached table entries, and adds further penalties for copying data during transfers
between processes. Because of this overhead, such memory protection approaches have
been limited to desktop and server systems, which enjoy ample processor and memory
capacity and have no demanding real-time response requirements.

Embedded systems are vulnerable to program bugs as well, even though they are
not multi-user oriented. Some embedded operating systems have employed process
model memory protection schemes as an effective defense. Typically, though, such
systems have ample memory and processor resources that can tolerate the inherent
overhead of such an approach. Such resource-rich embedded systems are generally
found in military, aerospace and high-end telecommunications applications.

In a process model approach, each process is provided with a complete virtual
address range that is mapped, page by page, to pages of physical memory. Physical
addresses are constructed by appending the address offset within the page to the
page address itself, resulting in a 32-bit address (in 32-bit architectures).
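
The address construction described above can be sketched in C. This is only an illustration under stated assumptions: the 4 KB page size, the tiny eight-page table, and the names page_table_t and translate are hypothetical and do not come from any particular MMU or operating system.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                    /* assumed 4 KB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1u)
#define NUM_PAGES  8u                     /* toy address space for the sketch */

/* Hypothetical per-process table: the index is the virtual page number,
   the value is the page-aligned physical address of the mapped page. */
typedef struct {
    uint32_t frame_base[NUM_PAGES];
} page_table_t;

/* Build a 32-bit physical address from a 32-bit virtual address. */
static uint32_t translate(const page_table_t *pt, uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;   /* virtual page number */
    uint32_t offset = vaddr & PAGE_MASK;     /* offset within the page */

    assert(vpn < NUM_PAGES);                 /* the toy table is small */
    return pt->frame_base[vpn] | offset;     /* page address plus offset */
}
```

With virtual page 1 mapped to physical page address 0x0004A000, the virtual address 0x00001234 splits into page number 1 and offset 0x234, and the two are recombined into 0x0004A234.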

Each process can only reference addresses within its own virtual address range,
preventing it from accessing any physical memory not explicitly mapped to its
virtual address range. Mapped physical memory can be accessed only according
to the permissions for that page, which the operating system stores in the Memory
Management Unit (MMU) upon initialization (Figure 2).

A process model approach offers benefits to the developer by providing an independent
virtual machine to each process. Each process gets an independent, complete 32-bit
(or 64-bit on certain architectures) virtual address range that is mapped to physical
memory one page at a time. This provides inherent protection of one process from
the actions of another process. It also allows noncontiguous physical memory
pages to hold a process’s code and data, which is often convenient in dynamic-loading
systems where memory fragmentation might otherwise block program loading and memory
allocation.

The Cost of Virtual Memory

While process model virtual memory systems deliver added security for embedded
applications, they also consume more memory, reduce processor performance and
make programming more complex for the developer. These benefits therefore come
at a cost.

Hardware cost is affected first. Since a sacrifice in performance might prevent an
embedded system from performing its intended tasks, a process model approach can
require a more powerful, more expensive processor to make up for the increased
overhead. The system also needs enough additional memory to hold the page tables
and still perform its intended functions, with the attendant increase in package
size and power consumption.

The increased program complexity also lengthens development time, which depends
directly on the ease with which programmers can produce correct code. Process model
architectures generally require more complex programming approaches, costing
additional development time and delaying time-to-market.

These costs might be tolerable in a large military or telecommunications system
where ample memory and fast, expensive processors are common, or where cost is
not an overriding concern. But in a typical embedded system, where resources
often are constrained to reduce cost, such drawbacks may prove prohibitive.

As a result, most embedded applications have had to choose between living without
the benefits of MMU support in the OS and bearing the additional costs of an OS
that offers a process model approach. But there is a way to enjoy the benefits
of memory protection in an embedded system without significant additional cost
in performance overhead, increased memory size or added programming complexity.

A New Approach

Memory protection can be provided for embedded applications while retaining the
small size, ease of use and high efficiency of the underlying real-time operating
system (RTOS). This protects user threads and the kernel from inadvertent
overwrites caused by bugs in application code, one of the most common reasons
why embedded developers consider using memory protection. It’s possible
to meet this objective in a straightforward, efficient manner appropriate for
resource-constrained embedded systems, without introducing the additional overhead
and complexity of a process model solution.

A simple mechanism is all that’s needed to enable developers to place boundaries
around critical data structures, critical threads and the kernel, preventing unintended
access. Much like the way watertight compartments protect a ship from being flooded
by a single break in its hull, simple MMU services can prevent bugs in an application
thread from causing unintended actions outside of that thread, restricting the
malfunction and “keeping the ship afloat” until an orderly recovery
can be performed.

This approach also automatically identifies certain bugs during development, since
any attempt to cross these boundaries activates a processor trap and the offending
instruction is identified by an error handler routine. This response streamlines
test and validation, speeding time-to-market.
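
The trap-and-report flow described above might look like the following C sketch. It simulates the hardware in software so the logic can be followed on any host. The guarded address range, the fault_info_t record, and the function names are assumptions made for illustration, not the services of any specific RTOS or processor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guarded range [GUARD_LO, GUARD_HI) that no thread may write. */
#define GUARD_LO 0x1000u
#define GUARD_HI 0x2000u

/* What the error handler learns about a violation. */
typedef struct {
    uint32_t pc;        /* address of the offending instruction */
    uint32_t addr;      /* address it tried to access */
    bool     is_write;  /* true for a store, false for a load */
} fault_info_t;

static fault_info_t last_fault;
static unsigned     fault_count;

/* Error handler routine: record the offender so it can be reported. */
static void protection_fault_handler(const fault_info_t *info)
{
    assert(info != NULL);
    last_fault = *info;
    ++fault_count;
}

/* Software stand-in for the MMU check on a store. Returns true if the
   store may proceed, false if it was trapped and reported instead. */
static bool simulated_store(uint32_t pc, uint32_t addr)
{
    if (addr >= GUARD_LO && addr < GUARD_HI) {
        fault_info_t info = { pc, addr, true };
        protection_fault_handler(&info);
        return false;   /* the write never reaches memory */
    }
    return true;
}
```

On real hardware the check and the trap are performed by the MMU and the processor's exception mechanism. The sketch only shows what information reaches the handler and that the errant write is suppressed.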

Memory protection also plays a role in the field, after the system has been deployed,
by protecting it against latent bugs that escaped testing. Rather than bringing
down the entire system, such bugs are automatically trapped and reported to the
system, which keeps them from causing further damage to application resources or
the kernel.

The key to implementation of simple memory protection is to use the permissions
capability of the MMU and not the virtual-to-physical address translation capability.
All kernel and application memory pages are mapped directly to physical memory
pages in a one-to-one manner. Each virtual page address is mapped to the identical
physical page address. No provision for virtual demand-paging is needed since
embedded systems rarely, if ever, employ the mass storage required for such
operation (Figure 3).
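
A one-to-one mapping of this kind can be sketched as a table in which every entry points back at its own page and only the permission field varies. The entry layout, the 4 KB page size, and the names mmu_entry_t, identity_map and lookup_phys are hypothetical; a real MMU defines its own register or descriptor format.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                   /* assumed 4 KB pages */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)
#define NUM_PAGES  16u                   /* toy address space for the sketch */

typedef enum { PERM_NO_ACCESS, PERM_READ_ONLY, PERM_READ_WRITE } perm_t;

/* Hypothetical MMU page entry: a physical page address plus permissions. */
typedef struct {
    uint32_t phys_base;
    perm_t   perm;
} mmu_entry_t;

/* Map every virtual page to the identical physical page address. */
static void identity_map(mmu_entry_t table[], uint32_t num_pages,
                         perm_t default_perm)
{
    for (uint32_t page = 0; page < num_pages; ++page) {
        table[page].phys_base = page << PAGE_SHIFT;  /* virtual == physical */
        table[page].perm      = default_perm;
    }
}

/* Translation still happens, but it always returns the input address. */
static uint32_t lookup_phys(const mmu_entry_t table[], uint32_t vaddr)
{
    uint32_t page = vaddr >> PAGE_SHIFT;

    assert(page < NUM_PAGES);
    return table[page].phys_base | (vaddr & PAGE_MASK);
}
```

Because the mapping never changes, only the perm field has to be managed at run time, which is the part of the MMU this approach relies on.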

This can be implemented with a few simple RTOS services that set MMU protection
registers to permit or prevent access to designated data areas and to activate
or deactivate those protections. Eliminating the complexity of multiple virtual
memory spaces saves memory and simplifies application programming, while still
providing protection from errant writes.

Access to each mapped page is specified as Read-Only, Read/Write or No Access,
according to the permission bits of each MMU page register. Each application
thread can set up as many boundaries as desired to keep it from reading or
overwriting memory outside those boundaries. By setting protection boundaries
around each thread, or around certain portions of a thread, the developer ensures
that such threads cannot unintentionally overwrite other memory or the kernel.
This protection exposes some of the most difficult bugs to find, and does so
without significant overhead.
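
As a rough sketch of how the three access settings and a protection boundary interact, the following C code models the per-page permission bits in an array. The service name protect_region, the page size, and the choice to start every page as Read/Write are assumptions for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12u    /* assumed 4 KB pages */
#define NUM_PAGES  16u    /* toy address space for the sketch */

/* Read/Write is value 0 so zero-initialized pages start unprotected
   (an assumption of this sketch). */
typedef enum { ACCESS_READ_WRITE = 0, ACCESS_READ_ONLY, ACCESS_NONE } access_t;

static access_t page_perm[NUM_PAGES];

/* Hypothetical service: apply one permission to every page that
   overlaps the byte range [start, start + len). */
static void protect_region(uint32_t start, uint32_t len, access_t perm)
{
    assert(len > 0u);
    uint32_t first = start >> PAGE_SHIFT;
    uint32_t last  = (start + len - 1u) >> PAGE_SHIFT;

    assert(last < NUM_PAGES);
    for (uint32_t page = first; page <= last; ++page) {
        page_perm[page] = perm;
    }
}

/* Software model of the check the MMU makes on each access. */
static bool access_allowed(uint32_t addr, bool is_write)
{
    uint32_t page = addr >> PAGE_SHIFT;

    if (page >= NUM_PAGES) {
        return false;                      /* outside the mapped range */
    }
    switch (page_perm[page]) {
    case ACCESS_READ_WRITE: return true;
    case ACCESS_READ_ONLY:  return !is_write;
    default:                return false;  /* No Access */
    }
}
```

Note that protection is applied per page, so a boundary that starts or ends in the middle of a page covers the whole page. Data that needs its own boundary would have to be page-aligned.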

How This Differs from Other Protection Approaches

From the outset, this approach takes the opposite tack to traditional memory
protection. The rule here is that protection should be used only when needed.
Protection is left off by default, and developers decide where to turn it on.
Most virtual memory management schemes protect everything by default,
and the developer must take explicit action to turn protection off when it gets
in the way. The simple approach offers several advantages over process model approaches
to memory protection, while still providing added reliability and security.

For one thing, it reduces the need for additional memory. Very little code is
added to provide memory protection services, keeping the RTOS small enough to
satisfy the demanding memory requirements often found in embedded systems. In
fact, these services can be removed from an application upon deployment, enabling
the manufacturer to recover even the small amount of memory they require.

Memory protection using EMP also keeps programming simple. Only a handful of
service calls, exposed through a simple API, are needed, so programming simplicity
and operational efficiency are retained. Compared with the multiple address ranges
of most virtual memory approaches, this is far simpler and much more appropriate
for embedded use.

Finally, EMP results in more efficient performance. Because a single “logical
equals physical” address space is used, no overhead is introduced at context
switch time. Page translation is done completely in hardware, within
the address-generation stage of the processor pipeline. Also, because of the single,
unified address range, messages can be sent from one thread to another without
the need to copy the data, as is required when multiple virtual address ranges
are used. Zero-copy data transfers are performed at all times, further increasing
performance.
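
To illustrate the zero-copy point, here is a minimal C sketch of a message queue that passes only a pointer and a length between threads. It assumes a single shared address space, as described above. The queue_t type and its functions are hypothetical, and locking is omitted, so this is not a thread-safe implementation.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_DEPTH 4u

/* A message is a reference to the sender's buffer, not a copy of it. */
typedef struct {
    const void *data;
    size_t      len;
} msg_t;

typedef struct {
    msg_t  slots[QUEUE_DEPTH];
    size_t head;    /* next slot to receive from */
    size_t tail;    /* next slot to send into */
    size_t count;   /* messages currently queued */
} queue_t;

/* Queue a reference to the caller's buffer. Returns 0, or -1 if full. */
static int queue_send(queue_t *q, const void *data, size_t len)
{
    assert(q != NULL);
    if (q->count == QUEUE_DEPTH) {
        return -1;
    }
    q->slots[q->tail].data = data;   /* only the pointer is stored */
    q->slots[q->tail].len  = len;
    q->tail = (q->tail + 1u) % QUEUE_DEPTH;
    ++q->count;
    return 0;
}

/* Hand the oldest reference to the receiver. Returns 0, or -1 if empty. */
static int queue_receive(queue_t *q, msg_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0u) {
        return -1;
    }
    *out = q->slots[q->head];
    q->head = (q->head + 1u) % QUEUE_DEPTH;
    --q->count;
    return 0;
}
```

The receiver ends up with the same address the sender supplied, which is only meaningful because both threads see the same mapping. With separate virtual address ranges, the payload would have to be copied into memory the receiver can reach.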

Express Logic
San Diego, CA.
(888) 847-3239.