SOFTWARE & DEVELOPMENT TOOLS
Multicore Software
Multicore – What’s the Big Deal?
Multicore is being talked about a lot and is getting plenty of attention in the press. Why do we need it, what is the state of multicore and how do we make it work?
SVEN BREHMER, POLYCORE SOFTWARE
Why do we need multicore? The simple answer is that with multicore we can continue pushing the performance/power envelope through parallel processing rather than by increasing the clock speed, as we have done for many years. With multiple cores, the processors can run at lower frequencies and lower supply voltages, and cores can be turned on and off based on the system load. This means that higher MIPS/watt can be achieved, but only if the software executes in parallel.
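A rough worked example shows why lower frequency and supply voltage can pay off. It uses the standard first-order approximation for dynamic CMOS power. The symbols (activity factor \alpha, switched capacitance C, supply voltage V, clock frequency f) and the 0.8 voltage scaling factor are illustrative assumptions, not figures from any particular part, and leakage power is ignored.

```latex
\begin{aligned}
P_{\text{dyn}} &\approx \alpha C V^{2} f \\
P_{\text{one core}} &= \alpha C V^{2} f \\
P_{\text{two cores}} &= 2 \cdot \alpha C \,(0.8V)^{2} \cdot \frac{f}{2} = 0.64\,\alpha C V^{2} f
\end{aligned}
```

Under these assumptions, two cores at half the clock rate supply the same total number of cycles per second for roughly 36% less dynamic power. The saving is only realized if the workload actually splits across both cores.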
The availability of multicore-enabled applications will drive multicore silicon revenue, but existing applications are mostly single-threaded. The current software infrastructure is almost entirely aimed at single-threaded, single-processor applications. Even though some embedded applications are multi-threaded, it is not easy to migrate them to multicore. Threads that executed one at a time on a single-processor system may now execute at the same time, so precautions are required to keep them from corrupting shared data. The lack of multicore software and standards presents a significant barrier to entry for multicore, which must be lowered to enable broad adoption and continued silicon revenue growth.
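The shared-data hazard can be made concrete with a minimal sketch. It assumes a toolchain with the C++ standard library threading facilities (std::thread and std::mutex, C++11 and later). The names SharedCounter and run_workers are hypothetical, chosen only for illustration. Without the lock, the increments from different cores could interleave and updates could be lost.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state, for illustration only.
struct SharedCounter {
    long value = 0;
    std::mutex lock;  // guards value

    void add(long n) {
        // Only one thread at a time may run the read-modify-write below.
        std::lock_guard<std::mutex> guard(lock);
        value += n;
    }
};

// Starts `threads` workers that each add 1 to one shared counter
// `iterations` times, then returns the final count.
long run_workers(int threads, int iterations) {
    SharedCounter counter;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                counter.add(1);
            }
        });
    }
    for (auto& worker : pool) {
        worker.join();  // wait for every worker before reading the result
    }
    return counter.value;
}
```

With the lock in place, a call such as run_workers(4, 100000) returns exactly threads times iterations regardless of how the threads are scheduled. The lock also serializes the update, which is one reason heavily shared data limits the benefit of adding cores.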
While parallel processing has been used in high-end applications (aerospace, defense, industrial and high-performance computing) for many years, it is a poorly understood concept in the broader markets now being “exposed” by the proliferation of multicore silicon. In PCs, which are moving from two to four cores, it is currently relatively simple to take advantage of multicore by running multiple (contained) applications in parallel (SMP). This may become more challenging as the number of cores increases.
What Is the State of Multicore?
There is a variety of multicore silicon, from commercially available “standard” parts to custom SoCs. For simplicity, let’s take a look at standard parts. There are:
• Homogeneous cores
• Heterogeneous cores
• Wide-ranging number of cores: two to many
• Shared memory
• Local memory
• Different types of interconnect
Homogeneous is sometimes associated with Symmetric Multi-Processing (SMP) and heterogeneous with Asymmetric Multi-Processing (AMP). While there is some logic to that labeling, as SMP requires homogeneous cores and shared memory, I believe it is more relevant to look at application requirements when we talk about SMP and AMP as programming models; more on that below.
The number of cores on a die will influence the approach to parallel processing, as two cores can be quite powerful and have substantial memory per core, whereas many cores imply simpler cores and fewer resources per core.
Shared memory, commonly used in today’s multicore chips, has the advantage of being visible to all the cores, but the disadvantage of being shared between the cores, which creates contention for memory access, and limits scalability. Local memory has the opposite properties and will be more common as the number of cores per die goes up. A combination of shared and local memory can provide flexibility and scalability.
The bus is currently the most common interconnect; it is simple as all the cores are connected to the bus and can “see” one another. However, the bus doesn’t scale very well and will over time be combined with or replaced by other types of interconnects such as direct links between cores making up a network on chip (NoC) and others. A combination of a bus and other interconnect(s) can make migration from legacy systems easier and provide higher throughput and performance.
Software
Since the current software “infrastructure” is primarily targeted at single-processor systems, there is a lot of work to be done to make the software multicore “friendly.”
Some of the challenges are:
• Single-threaded applications – Lots of them!
• Choice of programming model
• Lack of multicore-enabled system software
• Lack of multicore software tools
• Lack of standards
Migrating single-threaded applications to multicore can be quite a challenge since C and C++, the most popular languages in embedded systems, are sequential (no parallel concepts), so the partitioning has to be done at the system level (not counting parallelizing compilers, which operate primarily on data parallelism).
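As a sketch of what manual partitioning can look like for a simple data-parallel task, the following splits a sequential sum into contiguous chunks, one per worker thread. It assumes the C++ standard library thread support (std::thread, C++11 and later). The function name partitioned_sum and its parameters are hypothetical, used only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` by splitting it into `parts` contiguous chunks, one per thread.
// Hypothetical helper for illustration; `parts` must be at least 1.
long long partitioned_sum(const std::vector<int>& data, std::size_t parts) {
    std::vector<long long> partial(parts, 0);  // one private result slot per worker
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + parts - 1) / parts;  // ceiling division

    for (std::size_t p = 0; p < parts; ++p) {
        const std::size_t begin = std::min(p * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, &partial, p, begin, end] {
            // Each worker reads only its own chunk and writes only its own
            // slot, so no lock is needed.
            partial[p] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) {
        w.join();
    }

    // Sequential combine step, run after all workers have finished.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

The chunk boundaries, the per-worker result slots and the final combine step are all written by hand here; nothing in the sequential source expresses them. Giving each worker its own slot, rather than one shared total, avoids the need for a lock.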
