SOFTWARE & DEVELOPMENT TOOLS

Multicore Software

Multicore – What’s the Big Deal?

Multicore is being talked about a lot and is getting plenty of attention in the press. Why do we need it, what is the state of multicore and how do we make it work?

SVEN BREHMER, POLYCORE SOFTWARE


Why do we need multicore? The simple answer is that multicore lets us keep pushing the performance/power envelope through parallel processing rather than by raising the clock speed, as we have done for many years. With multiple cores, the processors can run at lower frequencies and lower supply voltages, and cores can be turned on and off based on the system load. This means that higher MIPS/watt can be achieved, but only if the software actually executes in parallel.
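To make the frequency and voltage argument concrete, here is a rough worked example using the standard first-order model of CMOS dynamic power. The model itself is textbook material rather than something the article states, and the 0.8 voltage scaling factor is purely an illustrative assumption, as is the idea that the workload splits perfectly across two cores.

```latex
\begin{align*}
P_{\text{dyn}} &\approx \alpha C V^{2} f
  && \text{(activity factor } \alpha \text{, switched capacitance } C \text{)} \\
P_{\text{one core}} &= \alpha C V^{2} f \\
P_{\text{two cores}} &= 2 \cdot \alpha C \,(0.8\,V)^{2} \cdot \frac{f}{2}
  = 0.64 \,\alpha C V^{2} f
\end{align*}
```

Under these assumptions the two slower cores deliver the same aggregate cycles per second for roughly 64% of the dynamic power. Leakage power and any parallelization overhead are ignored here, so real gains will differ.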

The availability of multicore-enabled applications will drive multicore silicon revenue, but existing applications are mostly single-threaded. The current software infrastructure is almost entirely aimed at single-threaded, single-processor applications. Even where embedded applications are multi-threaded, migrating them to multicore is not easy: threads that executed one after another on a single-processor system may now execute at the same time, so precautions are needed to avoid corrupting shared data. The lack of multicore software and standards presents a significant barrier to entry for multicore, and it must be lowered to enable broad adoption and continued silicon revenue growth.
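The shared-data hazard described above can be sketched in a few lines of C with POSIX threads. This is only an illustration, assuming a pthreads environment; the names `shared_state`, `worker` and `run_shared_counter` are hypothetical and not taken from any product or standard mentioned in this article.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

/* Hypothetical shared state: one counter updated by every thread. */
typedef struct {
    pthread_mutex_t lock;
    long counter;
    long increments_per_thread;
} shared_state;

static void *worker(void *arg)
{
    shared_state *s = arg;
    for (long i = 0; i < s->increments_per_thread; i++) {
        /* Without the lock, two cores could read the same old value,
         * both add one, and one of the updates would be lost. */
        pthread_mutex_lock(&s->lock);
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Runs num_threads workers and returns the final counter value. */
long run_shared_counter(int num_threads, long increments_per_thread)
{
    pthread_t threads[MAX_THREADS];
    shared_state s;
    int rc;

    assert(num_threads > 0 && num_threads <= MAX_THREADS);
    s.counter = 0;
    s.increments_per_thread = increments_per_thread;
    rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);

    for (int i = 0; i < num_threads; i++) {
        rc = pthread_create(&threads[i], NULL, worker, &s);
        assert(rc == 0);
    }
    for (int i = 0; i < num_threads; i++) {
        pthread_join(threads[i], NULL);
    }
    pthread_mutex_destroy(&s.lock);
    (void)rc;
    return s.counter;
}
```

Calling `run_shared_counter(4, 100000)` should return 400000 because every increment is serialized by the mutex. On a single processor with cooperative scheduling the unprotected version might appear to work, which is exactly why such code can break when it is moved to multicore.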

While parallel processing has been used in high-end applications (aerospace, defense, industrial and high-performance computing) for many years, it is a poorly understood concept in the broader markets now being “exposed” to it by the proliferation of multicore silicon. In PCs, which are moving from two to four cores, it is currently relatively simple to take advantage of multicore by running multiple self-contained applications in parallel (SMP), but this may become more challenging as the number of cores increases.

What Is the State of Multicore?

There is a variety of multicore silicon, from commercially available “standard” parts to custom SoCs. For simplicity, let’s look only at standard parts. Among them we find:

• Homogeneous cores

• Heterogeneous cores

• A wide-ranging number of cores, from two to many

• Shared memory

• Local memory

• Different types of interconnect

Homogeneous is sometimes associated with Symmetric Multi-Processing (SMP) and heterogeneous with Asymmetric Multi-Processing (AMP). While there is some logic to that labeling, as SMP requires homogeneous cores and shared memory, I believe it is more relevant to look at application requirements when we talk about SMP and AMP as programming models; more on that below.

The number of cores on a die will influence the approach to parallel processing: two cores can each be quite powerful and have substantial memory per core, whereas many cores imply simpler cores and fewer resources per core.

Shared memory, commonly used in today’s multicore chips, has the advantage of being visible to all the cores, but the disadvantage of being shared between the cores, which creates contention for memory access, and limits scalability. Local memory has the opposite properties and will be more common as the number of cores per die goes up. A combination of shared and local memory can provide flexibility and scalability.
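The contention trade-off can be loosely illustrated in software. In the sketch below, which assumes POSIX threads, each worker accumulates into a thread-private variable (standing in for core-local memory) and writes to the shared structure only once at the end. The names `slice`, `sum_slice` and `parallel_sum` are hypothetical, and a thread-local variable is only an analogy for hardware local memory, not a model of it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* One slice of the input per worker, plus a slot for its result. */
typedef struct {
    const int *data;
    size_t begin;
    size_t end;
    long partial;
} slice;

static void *sum_slice(void *arg)
{
    slice *s = arg;
    long local = 0;              /* private to this thread */
    for (size_t i = s->begin; i < s->end; i++) {
        local += s->data[i];
    }
    s->partial = local;          /* single write to shared memory */
    return NULL;
}

/* Sums data[0..n) using the given number of workers. */
long parallel_sum(const int *data, size_t n, int workers)
{
    pthread_t threads[MAX_WORKERS];
    slice slices[MAX_WORKERS];
    long total = 0;
    int rc;

    assert(workers > 0 && workers <= MAX_WORKERS);
    size_t chunk = n / (size_t)workers;

    for (int i = 0; i < workers; i++) {
        slices[i].data = data;
        slices[i].begin = (size_t)i * chunk;
        /* The last worker also takes any remainder. */
        slices[i].end = (i == workers - 1) ? n : slices[i].begin + chunk;
        slices[i].partial = 0;
        rc = pthread_create(&threads[i], NULL, sum_slice, &slices[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < workers; i++) {
        pthread_join(threads[i], NULL);
        total += slices[i].partial;   /* combine after each worker ends */
    }
    (void)rc;
    return total;
}
```

Because the workers never touch the same variable while running, no lock is needed, and the only shared traffic is the read-only input and one result per worker. That is the software analogue of doing most of the work in local memory and using shared memory sparingly.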

The bus is currently the most common interconnect; it is simple, as all the cores are connected to the bus and can “see” one another. However, the bus doesn’t scale very well and will over time be combined with or replaced by other types of interconnect, such as direct links between cores that form a network on chip (NoC). A combination of a bus and other interconnects can make migration from legacy systems easier and provide higher throughput and performance.

Software

Since the current software “infrastructure” is primarily targeted at single-processor systems, there is a lot of work to be done to make the software multicore “friendly.”

Some of the challenges are:

• Single-threaded applications – Lots of them!

• Choice of programming model

• Lack of multicore-enabled system software

• Lack of multicore software tools

• Lack of standards

Migrating single-threaded applications to multicore can be quite a challenge. C and C++, the most popular languages in embedded systems, are sequential (they have no built-in parallel concepts), so the partitioning has to be done at the system level. The exception is parallelizing compilers, which operate primarily on data parallelism.
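As one possible picture of system-level partitioning, the sketch below splits a sequential compute-then-accumulate loop into two stages that run on separate threads and exchange values through a one-slot mailbox. It assumes POSIX threads, and every name in it (`mailbox`, `stage1`, `stage2`, `run_pipeline`) is hypothetical; it is not the API of any particular multicore framework or standard.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* One-slot mailbox connecting the two stages. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    int value;
    bool full;
    bool done;
} mailbox;

static void mailbox_put(mailbox *m, int v)
{
    pthread_mutex_lock(&m->lock);
    while (m->full) {
        pthread_cond_wait(&m->changed, &m->lock);
    }
    m->value = v;
    m->full = true;
    pthread_cond_broadcast(&m->changed);
    pthread_mutex_unlock(&m->lock);
}

static void mailbox_close(mailbox *m)
{
    pthread_mutex_lock(&m->lock);
    m->done = true;
    pthread_cond_broadcast(&m->changed);
    pthread_mutex_unlock(&m->lock);
}

/* Returns false once the mailbox is closed and empty. */
static bool mailbox_get(mailbox *m, int *out)
{
    pthread_mutex_lock(&m->lock);
    while (!m->full && !m->done) {
        pthread_cond_wait(&m->changed, &m->lock);
    }
    bool got = m->full;
    if (got) {
        *out = m->value;
        m->full = false;
        pthread_cond_broadcast(&m->changed);
    }
    pthread_mutex_unlock(&m->lock);
    return got;
}

typedef struct { mailbox *out; int count; } stage1_args;
typedef struct { mailbox *in; long total; } stage2_args;

/* Stage 1: the "compute" half of the original loop (squares 1..count). */
static void *stage1(void *arg)
{
    stage1_args *a = arg;
    for (int i = 1; i <= a->count; i++) {
        mailbox_put(a->out, i * i);
    }
    mailbox_close(a->out);
    return NULL;
}

/* Stage 2: the "accumulate" half of the original loop. */
static void *stage2(void *arg)
{
    stage2_args *a = arg;
    int v;
    while (mailbox_get(a->in, &v)) {
        a->total += v;
    }
    return NULL;
}

long run_pipeline(int count)
{
    mailbox m;
    pthread_t t1, t2;
    int rc;

    m.full = false;
    m.done = false;
    m.value = 0;
    rc = pthread_mutex_init(&m.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&m.changed, NULL);
    assert(rc == 0);

    stage1_args a1 = { &m, count };
    stage2_args a2 = { &m, 0 };
    rc = pthread_create(&t1, NULL, stage1, &a1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, stage2, &a2);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_cond_destroy(&m.changed);
    pthread_mutex_destroy(&m.lock);
    (void)rc;
    return a2.total;
}
```

Here `run_pipeline(10)` should return 385, the sum of the squares from 1 to 10. The point of the design is that the language gives no help: the programmer decides where to cut the program and builds the communication explicitly, which is the kind of work that multicore-aware system software and tools would be expected to take over.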
