PROGRAMMING FINE-GRAINED
RECONFIGURABLE
ARCHITECTURES

by

STEVEN ANTHONY GUCCIONE, B.S.E.E., M.S.E.E.

DISSERTATION
Presented to the Faculty of the Graduate School of
The University of Texas at Austin
in Partial Fulfillment
of the Requirements
for the degree of
DOCTOR OF PHILOSOPHY

THE UNIVERSITY OF TEXAS AT AUSTIN
May 1995

PROGRAMMING FINE-GRAINED
RECONFIGURABLE
ARCHITECTURES

Publication No. ______________________

Steven Anthony Guccione, Ph.D.
The University of Texas at Austin, 1995

Supervisor: Mario J. Gonzalez, Jr.

Recently, several systems based on reconfigurable logic have been designed and built. These systems permit arbitrary digital logic functions to be configured in hardware. This ability to dynamically configure such circuits promises to provide the flexibility of a software based system with the performance of custom hardware.

This dissertation proposes a high level language approach to programming reconfigurable logic based machines. This approach uses a data parallel variant of the C language. To support this high level language approach, a novel reconfigurable logic device is described. These devices are in turn interfaced to a memory system and used to perform computation.

To demonstrate the validity of this approach, several computationally intensive algorithms are implemented and simulated. Among these algorithms are cellular automata, image processing, neural networks, the Mandelbrot set and the Fourier transform. In addition, selected portions of the Livermore FORTRAN kernels are simulated. Estimates of performance and required system resources are reported for each algorithm.


Chapter 1

Introduction

Traditionally, there have been two major methods for implementing algorithms. Most commonly, an algorithm is coded in software and implemented on a general purpose processor. In other cases, custom hardware is designed and built to implement the given algorithm. The custom hardware solution is usually characterized by its high performance, but relatively high cost and inflexibility. The software solution, on the other hand, has typically been characterized by its high degree of flexibility, but at a lower cost and lower performance. The choice of creating a custom hardware solution to a problem as opposed to a software-based solution has traditionally been based on cost and performance versus flexibility.

A relatively new technology, reconfigurable logic, provides a new way to implement algorithms that promises to change this traditional tradeoff between cost, performance and flexibility. Reconfigurable logic permits custom digital circuits to be dynamically created and modified via software. This ability to create and modify digital logic without physically altering the hardware provides a more flexible and lower cost solution to the implementation of custom hardware. This represents a significant divergence from the traditional hardware / software tradeoff.

Reconfigurable logic devices were first introduced commercially in the mid-1980s. These initial devices had a relatively low density and could be used to create custom circuits of several hundred equivalent logic gates. These early devices were used primarily by hardware designers to group random logic into a single package. This approach to logic design has two advantages. First, the combining of several integrated circuits into a single package reduces the size and cost of the circuit board. Perhaps more important, however, is the ability to modify the hardware design with a simple software change. Because the function of the reconfigurable logic device is defined by software, design errors can be corrected without having to fabricate new hardware. Existing system hardware may also be modified and upgraded without any physical modifications. Only a change to the software used by the reconfigurable logic device is required.

As the density of these devices has increased, it has became apparent that entire systems can be constructed using reconfigurable logic. This type of architecture can provide a flexible approach to producing customizable hardware. Several such systems based on reconfigurable logic have been designed and implemented. All have been used successfully to accelerate selected algorithms. Larger systems have reported performance surpassing that of conventional supercomputers while using modest amounts of hardware running at relatively low clock speeds.

While impressive levels of performance have been reported, none of these architectures are poised to challenge more traditional high-performance systems. Currently, the main limitation in the use of systems based on reconfigurable logic is software. The task of programming these systems is typically based on hardware design rather than traditional high level language programming.

It is the goal of the research described in this dissertation to describe a system which:

Chapter 2 begins with a short history of programmable logic. The uses of this technology, particularly reconfigurable logic, is outlined. Special emphasis is given to machines which use reconfigurable logic to perform general purpose computations. Other related work is also examined.

Chapter 3 explores some of the issues involved in the large scale use of reconfigurable logic. Of particular interest is the high overhead, both in terms of hardware and software, associated with the use of reconfigurable logic. Conclusions based on these architectural features are drawn.

Chapter 4 proposes a parallel programming methodology based on the computational characteristics of reconfigurable logic. Compilation and optimization techniques are presented. Finally, a novel construct for providing data sequencing is presented.

Chapter 5 examines existing reconfigurable logic devices and proposes a novel microarchitecture geared toward high-level computation. This microarchitecture directly supports the software methodology proposed in Chapter 4.

Chapter 6 discusses the system level architecture of a machine based on reconfigurable logic. This architecture is designed specifically to make use of the programming methodology described in Chapter 4 as well as the microarchitecture described in Chapter 5.

Chapter 7 describes the implementation of selected algorithms. These algorithms represent a set of popular computationally intensive algorithms typically executed on parallel or vector supercomputers. A discussion of which sorts of algorithms are not appropriate for this architecture is also presented.

Chapter 8 discusses the suitability of fine-grained reconfigurable architectures as a high-performance computing medium. Possible future directions for this technology are mentioned.