(Size: 596.52 KB / Downloads: 96)
Theory and Techniques of Compiler Construction
This book has emerged from my lecture notes for an introductory course in compiler design at ETH
Zürich. Several times I have been asked to justify this course, since compiler design is considered a
somewhat esoteric subject, practised only in a few highly specialized software houses. Because
nowadays everything which does not yield immediate profits has to be justified, I shall try to explain why
I consider this subject as important and relevant to computer science students in general.
It is the essence of any academic education that not only knowledge, and, in the case of an engineering
education, know-how is transmitted, but also understanding and insight. In particular, knowledge about
system surfaces alone is insufficient in computer science; what is needed is an understanding of contents.
Every academically educated computer scientist must know how a computer functions, and must
understand the ways and methods in which programs are represented and interpreted. Compilers convert
program texts into internal code. Hence they constitute the bridge between software and hardware.
Now, one may interject that knowledge about the method of translation is unnecessary for an
understanding of the relationship between source program and object code, and even much less relevant
is knowing how to actually construct a compiler. However, from my experience as a teacher, genuine
understanding of a subject is best acquired from an in-depth involvement with both concepts and details.
In this case, this involvement is nothing less than the construction of an actual compiler.
Of course we must concentrate on the essentials. After all, this book is an introduction, and not a
reference book for experts. Our first restriction to the essentials concerns the source language. It would
be beside the point to present the design of a compiler for a large language. The language should be
small, but nevertheless it must contain all the truly fundamental elements of programming languages. We
have chosen a subset of the language Oberon for our purposes. The second restriction concerns the target
computer. It must feature a regular structure and a simple instruction set. Most important is the
practicality of the concepts taught. Oberon is a general-purpose, flexible and powerful language, and our
target computer reflects the successful RISC-architecture in an ideal way. And finally, the third
restriction lies in renouncing sophisticated techniques for code optimization. With these premisses, it is
possible to explain a whole compiler in detail, and even to construct it within the limited time of a course.
Chapters 2 and 3 deal with the basics of language and syntax. Chapter 4 is concerned with syntax
analysis, that is the method of parsing sentences and programs. We concentrate on the simple but
surprisingly powerful method of recursive descent, which is used in our exemplary compiler. We
consider syntax analysis as a means to an end, but not as the ultimate goal. In Chapter 5, the transition
from a parser to a compiler is prepared. The method depends on the use of attributes for syntactic
After the presentation of the language Oberon-0, Chapter 7 shows the development of its parser
according to the method of recursive descent. For practical reasons, the handling of syntactically
erroneous sentences is also discussed. In Chapter 8 we explain why languages which contain
declarations, and which therefore introduce dependence on context, can nevertheless be treated as
syntactically context free.
Up to this point no consideration of the target computer and its instruction set has been necessary. Since
the subsequent chapters are devoted to the subject of code generation, the specification of a target
becomes unavoidable (Chapter 9). It is a RISC architecture with a small instruction set and a set of
registers. The central theme of compiler design, the generation of instruction sequences, is thereafter
distributed over three chapters: code for expressions and assignments to variables (Chapter 10), for
conditional and repeated statements (Chapter 11) and for procedure declarations and calls (Chapter 12).
Together they cover all the constructs of Oberon-0.
The subsequent chapters are devoted to several additional, important constructs of general-purpose
programming languages. Their treatment is more cursory in nature and less concerned with details, but
they are referenced by several suggested exercises at the end of the respective chapters. These topics are
further elementary data types (Chapter 13), and the constructs of open arrays, of dynamic data structures,
and of procedure types called methods in object-oriented terminology (Chapter 14).
Chapter 15 is concerned with the module construct and the principle of information hiding. This leads to
the topic of software development in teams, based on the definition of interfaces and the subsequent,
independent implementation of the parts (modules). The technical basis is the separate compilation of
modules with complete checks of the compatibility of the types of all interface components. This
technique is of paramount importance for software engineering in general, and for modern programming
languages in particular.
Finally, Chapter 16 gives a brief overview of problems of code optimization. It is necessary because of
the semantic gap between source languages and computer architectures on the one hand, and our desire to
use the available resources as well as possible on the other.