A Systolic Array Optimizing Compiler by Monica S. Lam


This book is a revision of my Ph.D. dissertation submitted to Carnegie Mellon University in 1987. It documents the research and results of the compiler technology developed for the Warp machine. Warp is a systolic array built out of custom, high-performance processors, each of which can execute up to 10 million floating-point operations per second (10 MFLOPS). Under the direction of H. T. Kung, the Warp machine matured from an academic, experimental prototype to a commercial product of General Electric. The Warp machine demonstrated that the scalable architecture of high-performance, programmable systolic arrays represents a practical, cost-effective solution to present and future computation-intensive applications. The success of Warp led to the follow-on iWarp project, a joint effort with Intel to develop a single-chip 20 MFLOPS processor. The availability of the highly integrated iWarp processor will have a significant impact on parallel computing.

One of the major challenges in the development of Warp was to build an optimizing compiler for the machine. First, the processors in the array cooperate at a fine granularity of parallelism; interaction between processors must be considered in the generation of code for individual processors. Second, the individual processors themselves derive their performance from a VLIW (Very Long Instruction Word) instruction set and a high degree of internal pipelining and parallelism. The compiler includes optimizations pertaining to the array level of parallelism, as well as optimizations for the individual VLIW processors.



Best international books

Membrane Computing: 12th International Conference, CMC 2011, Fontainebleau, France, August 23-26, 2011, Revised Selected Papers

This book constitutes the thoroughly refereed post-conference proceedings of the 12th International Conference on Membrane Computing, CMC 2011, held in Fontainebleau, France, in August 2011. The 19 revised selected papers presented were carefully reviewed and selected from 27 papers and 5 posters presented at the conference.

Foundations of Augmented Cognition. Directing the Future of Adaptive Systems: 6th International Conference, FAC 2011, Held as Part of HCI International 2011, Orlando, FL, USA, July 9-14, 2011. Proceedings

This book constitutes the refereed proceedings of the 6th International Conference on Augmented Cognition, FAC 2011, held in Orlando, FL, USA, in July 2011, within the framework of the 14th International Conference on Human-Computer Interaction, HCII 2011, together with 11 other thematically similar conferences.

Logic, Algebra, and Computation: International Summer School

The Marktoberdorf Summer Schools on Informatics were started in 1970 in order to convene, every second or third year, a group of top researchers in computing, devoted to presenting their latest results to an elite of advanced students - young and most promising people - and prepared to face their questions, criticism, and suggestions.

International Handbook of Self-Study of Teaching and Teacher Education Practices

The International Handbook of Self-Study of Teaching and Teacher Education Practices is of interest to teacher educators, teacher researchers, and practitioner researchers. This volume: - offers an encyclopaedic review of the field of self-study; - examines in detail self-study in a range of teaching and teacher education contexts; - outlines a full understanding of the nature and development of self-study; - explores the development of a professional knowledge base for teaching through self-study; - purposefully represents self-study through research and practice; - illustrates examples of self-study in teaching and teacher education.

Extra info for A Systolic Array Optimizing Compiler

Example text

In a receive operation, the third argument is the variable to which the received value is assigned; in a send operation, the third argument is the value sent. The above cell program is executed by all the cells in the array. The first loop shifts in the coefficients; the second loop computes the polynomials. In the second loop, each cell picks up a pair of xdata and yin, updates yin, and forwards both values to the next cell. By the definition of asynchronous communication, the computation of the second cell is blocked until the first cell sends it the first result.
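
The following is a minimal sketch in C (assumed for illustration, not the book's W2 cell code) that simulates, cell by cell, the two-loop program described above for polynomial evaluation by Horner's rule. The queues are replaced by plain arrays and the coefficient-shifting loop by a direct lookup, so the blocking behavior of asynchronous communication is not modeled; the names xdata, yin, and the coefficient ordering are illustrative assumptions.

```c
/*
 * Sequential simulation of the cell program described above.  Each simulated
 * cell holds one coefficient, then for every (xdata, yin) pair computes
 * yin = yin * xdata + coeff (Horner's rule) and forwards both values to the
 * next cell.
 */
#include <stdio.h>

#define NCELLS 4   /* one cell per coefficient        */
#define NDATA  5   /* number of x values to evaluate  */

int main(void) {
    /* Polynomial 2x^3 + 3x^2 + 5x + 7, highest-order coefficient first. */
    const double coeffs[NCELLS] = {2.0, 3.0, 5.0, 7.0};
    const double xin[NDATA] = {0.0, 1.0, 2.0, 3.0, 4.0};
    double x[NDATA], y[NDATA];

    /* Input stream to cell 0: x values paired with y initialized to 0. */
    for (int i = 0; i < NDATA; i++) { x[i] = xin[i]; y[i] = 0.0; }

    /* Each iteration of this outer loop plays the role of one cell. */
    for (int cell = 0; cell < NCELLS; cell++) {
        double coeff = coeffs[cell];  /* coefficient held by this cell */

        /* Second loop of the cell program: receive (xdata, yin),
           update yin, and forward both values to the next cell. */
        for (int i = 0; i < NDATA; i++) {
            double xdata = x[i];      /* "receive" from the X queue */
            double yin   = y[i];      /* "receive" from the Y queue */
            yin = yin * xdata + coeff;
            x[i] = xdata;             /* "send" to next cell's X queue */
            y[i] = yin;               /* "send" to next cell's Y queue */
        }
    }

    for (int i = 0; i < NDATA; i++)
        printf("p(%.1f) = %.1f\n", xin[i], y[i]);
    return 0;
}
```

In the real array, each receive in the inner loop would block until the neighboring cell has sent the corresponding value, which is the asynchronous behavior noted above.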

The figure gives an illustration of the interaction between cells. The micro-operations RecX and RecY receive data from the X and Y queues in the current cell, respectively; SendX and SendY send data to the X and Y queues of the next cell, respectively. The communication activity of each cell is captured by two time lines, one for each neighbor. The data items received or sent are marked on these lines. The solid lines connecting the time lines of neighboring cells represent data transfers on the X channel, whereas the dashed lines represent data transfers on the Y channel.

More importantly, the cut theorem can be applied only to simple systolic algorithms, in which all cells repeat the same operation all the time. Any complication such as time-variant computation, heterogeneous cell programs, or conditional statements would render this technique inapplicable. In summary, synchronous computation models in which the user completely specifies all timing relationships between different cells are inadequate for high-performance arrays. The reason is that they are hard to program and hard to compile into efficient code.

