1
Introduction |
This document defines the architecture of the Mesa processor. It specifies the processor's virtual memory structure, its instruction interpreter, and the Mesa instruction set. (The organization of the input/output system is described separately.) The Mesa processor is part of a larger plan for the development and construction of integrated Office Information Systems.
The Principles of Operation does not discuss the implementation of any particular Mesa processor, of which there are several models, differing primarily in underlying technology, configuration, and performance. It does specify the overall design that must be followed to ensure software compatibility at the instruction set level. The architecture allows common software systems to be constructed that operate on all versions of the processor; it also allows re-implementation of the processor when it is technically or economically advantageous to do so.
The Principles of Operation, often called the PrincOps (pronounced prince ops), is composed of nine additional chapters, an Appendix, a list of references, and four indices.
DATA TYPES: Basic data types: UNSPECIFIED, BIT, NIBBLE, BYTE; basic logical and arithmetic operators; numeric types: CARDINAL, INTEGER, REAL; long types; pointer types; type conversion.
MEMORY ORGANlZATlON: Virtual memory; memory mapping; memory map instructions; major data structures; main data spaces; control transfer data structures; local and global frames; processor registers; evaluation stack; register instructions.
INSTRUCTION INTERPRETER: Instruction formats; instruction fetch; address calculation; instruction execution; opcode dispatch; exceptions: traps, faults, interrupts; initial state.
STACK INSTRUCTIONS: Stack primitives; check instructions; unary operations; logical operations; arithmetic operations; comparison operations; floating point operations.
JUMP INSTRUCTIONS: Unconditional jumps; equality jumps; signed inequality jumps; unsigned inequality jumps; indexed jumps.
CONTROL TRANSFERS: Control links: frame links, indirect links, procedure descriptors; frame allocation; control transfer primitive (XFER); control transfer instructions: local calls, external calls, nested calls, returns, coroutine transfers; traps; breakpoints; xfer traps.
PROCESSES: Process data structures; process instructions; process management; scheduling; faults; interrupts; timeout queue
ASSIGNMENT INSTRUCTIONS: Immediate instructions; frame instructions: local access, global access; pointer instructions: direct, indirect; string instructions; field instructions.
BLOCK TRANSFERS: Word boundary block transfers, checksum; byte boundary block transfers; bit boundary block transfers. bit transfer utilities, bit block transfer, text block transfer.
APPENDIX: Values of constants; register indexes; fixed memory locations; fault queue indexes; system data indexes; opcode assignments.
Section 1.1 lists the major technical characteristics of the processor; §1.2 defines some frequently used and often confusing terminology; §1.3 explains the coding conventions used in describing the operation of the processor.
All Mesa processors have the following characteristics that distinguish them from other computers:
The Mesa architecture is designed to efficiently execute high-level languages in the style of Algol, Mesa, and Pascal. That is, constructs of the programming language such as modules, procedures, and processes all have concrete representations in the processor and main memory, and the instruction set includes opcodes to implement these language constructs efficiently (for example, in procedure call and return). The processor does not "directly execute" any particular high-level programming language, however.
The Mesa instruction set is designed primarily for a compact, dense representation of programs. Instructions are variable in length. The most frequently used operations and operands are encoded in a single-byte opcode; less frequently used combinations are encoded in two bytes, and so on. The instructions themselves are chosen based on their frequency of use, and this design principle leads to an asymmetrical instruction set. For example, there are eight different instructions that can be used to store variables into local frames in memory, but only four that load local variables onto the stack from memory; this occurs because typical programs perform many more stores than loads. In some cases, a particular operation is performed so infrequently that it is not provided as an instruction, and must therefore be programmed in software (for example, a quad word add). There are other cases in which an instruction is provided for an infrequently used operation because the function performed is required at the instruction set level for technical or efficiency reasons (such as for disable interrupts or checksum calculations).
The instruction set of the processor includes a wide variety of operations for accessing partial and multiword fields of the memory's basic information unit, the sixteen-bit word. Except for system data structures defined by the architecture, there are no alignment restrictions on the allocation of variables, and data structures are generally assumed to be tightly packed in memory.
The instructions associated with a program are organized separately from the data it declares in a structure called a code segment. All code segments are entirely read-only. They are also relocatable without modification, since no information in a code segment depends on its location in virtual memory.
The Mesa processor is a stack machine; it has no general-purpose registers. (It does include special-purpose registers for maintaining processor status and state.) The evaluation stack is used as the destination for load instructions, the source for store instructions, and as both the source and the destination for arithmetic instructions. It is also used for parameter passing. The primary motivation for a stack is not to simplify code-generation, but to achieve compact program representation. Since the stack is assumed as the source and destination of one or more operands, specifying operand locations requires no bits in the instruction - they are implied by the opcode.
The architecture is designed to support modular programming. It therefore suitably optimizes the transfers of control between modules. The Mesa processor implements all transfers with a single primitive called XFER, which is a generalization of the notion of a procedure or subroutine call. All of the standard procedure-calling conventions (such as call by value, call by reference (result), and call by name) and all transfers of control between contexts (procedure call and return, nested procedure calls, coroutine transfers, traps, and process switches) are implemented using the XFER primitive. To support arbitrary control-transfer disciplines, activation records (called frames)are allocated by XFER from a heap rather than a stack; this method also allows the heap to be shared by several processes.
The architecture is designed for applications that expect a large amount of concurrent activity. The Mesa processor provides for the simultaneous execution of up to one thousand asynchronous, preemptable processes on a single processor. The process mechanism implements monitors and condition variables to control the synchronization and mutual exclusion of processes along with the sharing of resources and information among them. Scheduling is event-driven, rather than time-sliced. Interrupts, timeouts, and communication with I/O devices are also supported by the process mechanism. Support for multiple processors is under development.
The Mesa processor provides a single large, uniformly-addressed virtual memory, shared by all processes The memory is addressed linearly as an array of 232 sixteen bit words, and, for mapping purposes, is further organized as an array of 224 pages of 256 words each; it has no other programmer-visible substructure, Each page can be individually write-protected, and the processor records the fact that a page has been written into or referenced.
The architecture is designed for the execution of cooperating, not competing, processes. There is no protection mechanism (other than the write-protected page) to limit the sharing of resources among processes There is no "supervisor mode" nor "privileged" instructions.
In this section, the terms architecture, processor, and programmer are defined. Several stylistic conventions used throughout the PrincOps are also explained in this and the following sections.
Note: Paragraphs beginning with the word "Note" contain comments intended for the reader of this manual. They generally point out or explain an important convention concerning the document or the code contained within it; they do not describe the Mesa processor itself.
As used in the PrincOps, the term architecture refers to the characteristics of the processor, as seen by a programmer writing executable instructions for the machine. The term does not refer to the way the processor is integrated with other hardware and software components to form a computer system. Nor does it refer to the way hardware and firmware components might be integrated to form any specific implementation of the processor.
Note: Paragraphs beginning with the phrase "Design Note" contain important points about the design of the processor architecture. They will be of interest both to implementors and to programmers.
The term processor actually refers to a particular implementation of the Mesa processor (or more likely, to all such implementations). A processor is the collection of hardware and microcode that behaves in a manner consistent with the description contained in this manual. Each Mesa processor must achieve the same result as the code appearing here, although the implementation and in many cases even the algorithm, may be different.
For example, as mentioned above, every Mesa processor includes an evaluation stack whose behavior is described in §3.3.2. There are several ways to implement a stack, varying in such details as where the stack pointer points (to the top element? to the next available entry?) and how the checks for underflow and overflow are made. The processor may use any implementation so long as the stack pointer as seen by the programmer always points to the next available entry on the stack, as described in §3.
Note: Paragraphs beginning with the phrase "Implementation Note" describe implementation techniques. Generally, these suggest efficiency improvements or point out restrictions in addition to those contained in the code. They are directed primarily at the microcoder and the hardware designer who will implement the processor.
The term programmer refers to the person writing instructions to be executed by the processor. Because the Mesa processor is designed for use in conjunction with high-level languages, the programmers in question are usually authors of compiler code generators or low-level operating system functions. A "typical" applications programmer sees the processor not at the instruction set level, but rather at the programming language level.
Note: Paragraphs beginning with the phrase "Programming Note " are intended primarily for the programmer. Techniques for exploiting a feature of the processor are described in such paragraphs. Also, they often begin with the phrase "It is the programmer's responsibility to ensure that ... It is illegal for a program to . ..". As discussed above, these notes are often directed to the authors of the compiler or the operating system, rather than to a "typical" programmer.
Note: The statements contained in programming notes concerning the legality of a program and the conditions that it (and the programmer) must satisfy all have the same intention; they mean that if the condition is violated, the program may produce undefined results, and further, that the results obtained during execution may be different on different implementations of the processor.
The PrincOps describes the processor using Mesa itself; the data structures and algorithms are written in the Mesa language. Familiarity with the Mesa Language Manual [2] is assumed. The term code is used to refer to the Mesa source code in this document that describes the behavior of the processor; it does not refer to the programs that execute on the machine. Likewise, the term routine is used to refer to a part of the code (usually a procedure), whereas program and procedure refer to the instructions being executed by the processor. These distinctions must constantly be kept in mind.
Coding conventions beyond standard Mesa style are described in this section. The code in this document makes several assumptions that are not generally known to a Mesa programmer; these assumptions are outlined below. Certain language features normally available (notably those involving pointer dereferencing, variant records, coroutines, and processes) are not allowed in PrincOps code, primarily to simplify and clarify the descriptions. These omissions are also identified.
One of the primary reasons for describing the Mesa processor in Mesa is to bring type checking to bear on the specification; in particular, all of the code contained in this document has been compiled by the Mesa compiler to verify its syntactic and type correctness. However, because main memory is inherently typeless, it has not been possible to utilize the language's type system fully, while keeping the description simple and short. In particular, all references to main memory yield values of (some variant of) type UNSPECIFIED, and there are many pointers to UNSPECIFIED in the code. When possible, the values are assigned to temporary variables to make their types clear, but the temporaries may not have a concrete realization in the processor.
The code assumes that the underlying representation of the basic types is binary, that unsigned numeric quantities are represented in true binary notation, and that signed quantities use two's-complement representation.
Except for INTEGER, all subranges used in the code have a lower bound of zero. As a result, there is no automatic biasing of subranges. Subranges occupy only the number of bits required to store their range of values when represented as binary numbers. With one exception (the stack pointer; see §3.3.2), subranges occupy exactly an integral number of bits; that is, the upper bound of all subranges is one less than a power of two.
Enumerated types are represented by assigning (binary) zero to the first value, one to the second value, two to the third, and so on; all enumerated types in this document are declared MACHINE DEPENDENT. As with subranges, enumerated types occupy only the number of bits required to represent all of their values. All enumerations occupy an integral number of bits (that is, the number of elements in each enumeration is a power of two).
Both virtual- and real-memory addresses are represented by LONG POINTER types; short POINTERS are converted to long pointers before dereferencing. The dereferencing operator ( ) is used to follow a pointer and obtain the word(s) it addresses in real memory. This operator is applied only to LONG POINTERS that have been converted from virtual into real addresses.
Arrays and records are used to define the structures used by the processor. Their semantics are as described in the Mesa Language Manual[l], except that their components are restricted to the types defined in §2. Two additional conventions are used in the code.
First, each array element is at least one word long and is always a multiple of the word size. No additional packing is specified; that is, there are no packed arrays that would lead to hidden addressing calculations below the level of a word address. In addition the fields of a record always account for all of the bits in the words occupied by the structure, whether or not they are used. No extra bits remain free. All records are declared MACHINE DEPENDENT so that this will be checked by the compiler.
Second, two distinguished field names are used in defining record structures. The name reserved, usually accompanied by an initial value, indicates that neither the processor nor the programmer uses the field. If an initial value is given, the processor may assume that the field always has that value, and the programmer must ensure its integrity. On the other hand, the name available indicates that the field may be used by the programmer, and therefore must be left undisturbed by the processor.
Conversion between types is modeled using the Mesa operators INTEGER, CARDINAL, and LONG, together with a number of built-in routines. Details of the conversions between specific types are contained in §2.4.
The code assumes the implementation of a number of basic routines for which a more detailed description is not given; for example, the And and Or routines are used as implementations of the standard logical operators, and are not described further. A complete list of these routines is contained in §2.
Although the code is written as a collection of routines, there are several cases where procedure calls and returns can be replaced by jumps in the implementation. For example, simple opcodes are described as single routines, but the calls that invoke them could be replaced by jumps (or more likely, dispatches) from the main loop of the instruction interpreter. Their returns could be replaced by jumps back to opcode fetch. Similarly, certain utility routines are always invoked at the logical end of opcode processing, and therefore need not actually be called as procedures. A jump to their beginning addresses could be used instead. These routines could return by jumping back to the location in the interpreter to which the opcode would have returned, had a true procedure call been used.
Signals are used for global transfers of control across one or more procedure calls. There is only one signal declared in the code; Abort (§4.1). It is used to describe exception processing. This signal is raised in exactly two places: by the trap and the fault routines. It is caught by the main loop of the instruction interpreter, and is never restarted.
An unnamed ERROR is used to indicate conditions that must be established by the programmer but need not be verified by the processor (for example, the Setup routine in §8.2.2.3).
On some occasions, these two conventions are used together; a few routines catch Abort and raise ERROR (for example, the SaveProcess and LoadProcess routines in §10.4.2.1). This combination indicates that it is illegal for the routine to suffer a trap or fault (as when all the memory it references must be resident).
Implementation Note: None of the statements in the code that contain an unnamed ERROR need be included in an implementation of the processor. If a program could cause the code to generate an ERROR, the program is illegal. It may produce undefined results.
Each instruction is defined as a separate routine. The name of the routine is the same as the mnemonic for the opcode. Complex instructions are broken down into multiple routines, which are assumed to be nested in their parent opcode routine unless they are shared by more than one instruction. The generic form of an instruction description is illustrated below.
Details not covered in the summary of the instruction class appear here.
OPCODE: PROCEDURE = BEGIN ... END;
Fine points and notes appear here.
There are four indexes in the PrincOps: the Primary Index, Mesa Code Index, which is an index of the types, constants, and routines contained in the code, and the Opcode Names Index and Opcode Mnemonics Index, which are indexes of the opcode routines organized by instruction names and mnemonics. In these indexes, bold face page numbers indicate where the primary, defining information can be found; plain page numbers designate further references.
[last edited 05 March 1999]