

## Configurable Multiprocessing

Cmpware, Inc.

Copyright (c) 2004 Cmpware, Inc.



#### Introduction

- **Multiprocessing** is the new trend in hardware architecture and design
- **Microprocessors**: *all* newly announced desktop CPUs are multiprocessor.
- **FPGAs**: multiple processor cores found in FPGAs. "Soft" processors increasingly popular.
- **ASICs**: recent designs using several processors, with hundreds being reported.



#### Microprocessors

| CPU              | Year | Transistors | Clock Speed |
|------------------|------|-------------|-------------|
| Intel 8008       | 1972 | 3,500       | 0.5 MHz     |
| Intel 386        | 1985 | 275,000     | 33 MHz      |
| Intel Pentium    | 1993 | 3.1 M       | 66 MHz      |
| Intel Pentium II | 1997 | 7.5 M       | 300 MHz     |
| Intel Pentium 4  | 2000 | 42 M        | 1 GHz       |
| Intel Itanium 2  | 2004 | 400 M       | 1.6 GHz     |

- 30 years of improvements (Moore's Law)
- This trend has "hit the wall"
  - Clock speeds no longer increasing
  - Power consumption cannot increase

#### **Multicore Microprocessors**

- Microprocessors going multicore
- All major desktop processor vendors have announced multicore processors
  - Hyperthreaded (HT) Intel Pentium 4
  - Sun SPARC (Gemini, Niagara, Rock)
  - IBM PowerPC 970MP
  - Sony / IBM / Toshiba Cell
- All future high-performance processor designs will be multicore

### Field Programmable Gate Arrays

- FPGAs have same clock speed and power problems as microprocessors
- Microprocessor cores now found in FPGAs
- Multiple PowerPC cores in Xilinx Virtex II Pro
- Altera NIOS "soft" processor cores becoming increasingly popular
- Newer ALU arrays point toward increasing cell complexity



#### **Application Specific ICs**

- ASIC vendors reporting multiple processor core designs
- Customers average six cores per design
- Designs of over a hundred cores reported
- Popular in networking, Digital Signal Processing and multimedia
- Small startups offering multiprocessor devices (*picoChip*, Cradle, QuickSilver, Icera, 3plus1, etc.)

# Configurable Multiprocessing

- Multiple CPU Cores
- On-chip interconnect network
- Convergence of CPU, ASIC and FPGA trends



# The Hardware Design Crisis

- Hardware design challenges:
  - Managing up to 1 billion transistors requires large teams and high levels of coordination
  - New silicon processes: difficulties with yields, signal integrity, etc.
  - Power: power limitations becoming the primary design constraint
  - Verification: as much as 70% of the design effort is now in verification, and re-spins of silicon are still common

# The Configurable Multiprocessing Solution

- Design process greatly simplified
  - Uses large IP blocks (processor cores)
  - Can fill up even the largest die
- Verification all but eliminated
  - CMP uses pre-verified IP (processor cores and interconnection networks)
- Excellent power efficiency and performance
- Provides a highly programmable solution

# Multiprocessing by the Numbers

- Example: Arc Cores Inc. Arc600
  - 32-bit RISC CPU for embedded SoC
  - Base core: 27k gates, 8 mW @ 200 MHz
- Multiprocessing with an Arc core:
  - 3,000 Arc600 cores on a die (assuming 5T per gate @ 500M T)
  - → 600K MIPS (600 GIPS) raw performance
  - → 24 W total power consumption
  - → 25,000 MIPS / Watt (!)
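The arithmetic behind these figures can be reproduced in a few lines. This is a sketch using only the numbers on this slide; the one added assumption is that each core retires one instruction per clock cycle (1 MIPS per MHz), which is what the 600K MIPS figure implies.

```python
# Figures taken from the slide.
GATES_PER_CORE = 27_000
TRANSISTORS_PER_GATE = 5
DIE_TRANSISTORS = 500_000_000
CORE_POWER_MW = 8          # per core at 200 MHz
CLOCK_MHZ = 200
CORES_USED = 3_000         # the slide's round figure

# Assumption (not stated on the slide): one instruction per cycle.
MIPS_PER_MHZ = 1

transistors_per_core = GATES_PER_CORE * TRANSISTORS_PER_GATE
max_cores = DIE_TRANSISTORS // transistors_per_core   # upper bound on cores

total_mips = CORES_USED * CLOCK_MHZ * MIPS_PER_MHZ
total_power_w = CORES_USED * CORE_POWER_MW / 1000
mips_per_watt = total_mips / total_power_w

print(f"Transistors per core: {transistors_per_core:,}")
print(f"Cores that fit on the die: {max_cores:,}")
print(f"Raw performance: {total_mips:,} MIPS")
print(f"Total power: {total_power_w:.0f} W")
print(f"Efficiency: {mips_per_watt:,.0f} MIPS / Watt")
```

The transistor budget allows somewhat more than 3,000 base cores, so the round figure leaves part of the die for the interconnect and memory.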

# A Reprogrammable Solution

- System defined by software
- No hardware re-designs or re-spins
- Field upgradable for bug fixes and enhancements
- All design uses standard software development tools (compilers)
- None of the restrictions of earlier "reconfigurable" architectures



#### **CMP** Programming

- Use of processors permits standard High Level Languages (HLLs) such as C or Java
- Large, slow and expensive hardware design tools not a part of the programming flow
- CMP communication an architectural / hardware decision that will define the programming model
- Important fact: very high communication bandwidth paths available on-chip
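One possible programming model, shown purely as an illustration, is a message-passing pipeline in which each core runs ordinary high-level code and hands results to its neighbor. In the sketch below, threads stand in for processor cores and `queue.Queue` objects stand in for on-chip links; the names `stage` and `run_pipeline` are hypothetical and do not come from any Cmpware tool.

```python
import queue
import threading


def stage(func, inbox, outbox):
    """One 'core': apply func to each incoming item until a None sentinel."""
    while True:
        item = inbox.get()
        if item is None:
            outbox.put(None)   # pass the shutdown signal downstream
            return
        outbox.put(func(item))


def run_pipeline(funcs, inputs):
    """Chain one worker per function, connected by FIFO links."""
    links = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, links[i], links[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()
    for x in inputs:
        links[0].put(x)
    links[0].put(None)         # end of input

    results = []
    while True:
        item = links[-1].get()
        if item is None:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results


results = run_pipeline([lambda x: x * 2, lambda x: x + 1], [1, 2, 3])
print(results)
```

A different interconnect (shared memory, for example) would suggest a different model; the point is only that the code on each core is plain software.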

# Communication / Computation

- Ratio of computation to communication defines which algorithms will benefit from a parallel architecture
- System level multiprocessors have relatively powerful processors and relatively slow communication links (10,000:1 ratio)
- CMP has essentially 1:1 ratio
- CMP characteristics similar to hardware
- Lots of parallelism exposed; easy to exploit
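A simple way to see why this ratio matters is to estimate how much of the time a processor spends computing rather than waiting on data. The model below is a deliberate simplification and not taken from the slides: it assumes communication is not overlapped with computation, and it measures an algorithm by the number of operations it performs per word communicated. The function name `busy_fraction` is made up for this sketch.

```python
def busy_fraction(ops_per_word, compute_to_comm_ratio):
    """Fraction of time spent computing.

    ops_per_word: operations the algorithm performs per word it communicates.
    compute_to_comm_ratio: operations a processor could execute in the time
        it takes to transfer one word (10,000 for a system-level
        multiprocessor, 1 for CMP, per the ratios quoted above).
    """
    compute_time = ops_per_word            # in units of one operation
    comm_time = compute_to_comm_ratio      # time to move one word
    return compute_time / (compute_time + comm_time)


SYSTEM_LEVEL_RATIO = 10_000
CMP_RATIO = 1

for ops in (1, 100, 10_000):
    system = busy_fraction(ops, SYSTEM_LEVEL_RATIO)
    cmp_chip = busy_fraction(ops, CMP_RATIO)
    print(f"{ops:>6} ops/word: system-level {system:.2%}, CMP {cmp_chip:.2%}")
```

Under this model, a fine-grained algorithm doing 100 operations per word keeps a system-level processor busy about 1% of the time, but keeps a CMP core busy about 99% of the time.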



#### **CMP** Design

- Use cheap MIPS to do everything (or almost everything) in software
- Actually reduces system complexity
- Power / performance tradeoffs available
- Processing resources can be easily redeployed and re-used
- Reuse offers further size and power advantages



# Configurable Multiprocessing from Cmpware, Inc.


