Lecture 23

Speculation:  Allow the execution of one or more instructions before the processor "knows" they should be executed.

A simple example of program-controlled speculation is the annulling branch of the SPARC Instruction Set Architecture.

Pentiums and AMD K6 do not support software controlled speculation.
Newer processors (the Intel Itanium or IA-64) do support software controlled speculation.

The software must mark certain instructions as speculative.

The tough requirement for hardware that executes instructions out of program order or speculatively is the requirement to make interrupts precise.   The last non-speculative instruction whose execution affected the architectural (programmer visible) state must be well defined.

See page 306.
Suppose the job is to set R14 from the contents of memory word B if memory word A contains zero, and otherwise to set R14 to the result of an ALU operation on the data loaded from A.  (The example below is from the text. The class example was slightly different.)

The IA-64 architecture has a "load ahead" and an explicit "check load" instruction.   If the check instruction does not take an exception, then the loaded data is safe to use and is no longer speculative.   The registers have extra bits to mark when their contents are speculative.  These are the "poison" bits in PH's terminology.
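
The load-ahead / check-load idea can be sketched with a toy register model.  This is only an illustration under stated assumptions: the class Machine, the methods load_speculative and check, the scratch register R1, and the OR used as the ALU operation are all hypothetical names chosen for this sketch, not IA-64 mnemonics and not the code from the text.  A missing address stands in for a load that would take an exception.

```python
class Fault(Exception):
    """Raised only when a poisoned register is actually checked."""


class Machine:
    def __init__(self, memory):
        self.memory = memory   # address -> word; a missing address models a faulting load
        self.regs = {}         # register name -> value
        self.poison = {}       # register name -> True if its speculative load faulted

    def load_speculative(self, reg, addr):
        # "Load ahead": never raises. A faulting access only sets the poison bit.
        if addr in self.memory:
            self.regs[reg] = self.memory[addr]
            self.poison[reg] = False
        else:
            self.regs[reg] = None
            self.poison[reg] = True

    def check(self, reg):
        # "Check load": the deferred exception is taken here, if at all.
        if self.poison.get(reg, False):
            raise Fault("deferred fault on " + reg)
        return self.regs[reg]


def set_r14(m, addr_a, addr_b):
    m.load_speculative("R1", addr_b)    # load of B hoisted above the test of A
    a = m.memory[addr_a]                # ordinary, non-speculative load of A
    if a == 0:
        m.regs["R14"] = m.check("R1")   # B's value stops being speculative here
    else:
        m.regs["R14"] = a | 1           # stand-in ALU operation on the data from A
    return m.regs["R14"]
```

If A is nonzero, the speculative load of B is never checked, so a bad address for B causes no exception.  If A is zero and B's load faulted, the exception appears at the check, which is where the non-speculative program would have taken it.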

To extend Tomasulo's algorithm to enable speculative execution, a reorder buffer is added between the Common Data Bus and both the interface that stores data in memory and the path that updates the architectural registers.  See figure 4.34 in PH, page 311.

Speculative instruction execution steps:
(1) Issue:  A functional unit or reservation station PLUS a reorder (result) buffer slot is chosen for the current instruction (or "RISC operation combination", called a "Quad" in the AMD K6).  The instructions are issued in program order.  The architectural name of the result register is replaced by the particular reorder buffer slot.   A future instruction with a source that is the old architectural name will be issued with that source renamed to the reorder buffer slot.

(2) Execute:  (NOT done in program order!!)  Perform the operation AFTER all the input data it needs is available.  This enables issued operations to proceed as soon as all RAW hazards caused by the operation reading data have been cleared.

In general, out-of-order execution and speculative execution enable hardware to perform operations by following a DATA FLOW MODEL of the computation.  Each operation can be performed as soon as all the operations that source data to it have been completed.
Data flow diagram:  Nodes are operations.  Arrows are true data dependencies.
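
A small sketch of the data flow view, assuming unlimited functional units and made-up latencies: each operation starts as soon as the last of its producers finishes.  The function name dataflow_schedule and the example operation names are hypothetical, chosen only for illustration.

```python
def dataflow_schedule(ops):
    """ops maps an operation name to (latency, [names of producer operations]).

    Returns a dict mapping each operation to the cycle in which it finishes,
    assuming an operation starts the moment all its producers are done.
    """
    finish = {}

    def finish_time(name):
        if name not in finish:
            latency, sources = ops[name]
            # Start time is the latest finish among the true data dependencies.
            start = max((finish_time(s) for s in sources), default=0)
            finish[name] = start + latency
        return finish[name]

    for name in ops:
        finish_time(name)
    return finish


example = {
    "load_a": (2, []),
    "load_b": (2, []),
    "mul":    (3, ["load_a", "load_b"]),
    "add":    (1, ["load_a"]),
    "store":  (1, ["mul", "add"]),
}
schedule = dataflow_schedule(example)
```

Here "add" finishes before "mul" even if it appears later in program order, because its only input is ready sooner.  Program order plays no part in the schedule; only the arrows of the data flow diagram do.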

(3) Write Result:  (NOT done in program order!)  Save the result of a completed operation in a temporary or reorder register (the one allocated to the operation by the issue stage) so that operations waiting on it can use the value before it is committed.

(4) Commit:  (DONE in program order!)   Make the effect of an instruction permanent.

Steps (3) and (4) are separated so that exceptions can be precise.
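
The four steps can be sketched as a toy reorder buffer.  This is a simplified model under stated assumptions, not the hardware organization in PH: the class ReorderBuffer and its methods are hypothetical, slots are never recycled, reservation stations and memory are left out, and a Python exception stands in for a hardware exception.

```python
class ReorderBuffer:
    def __init__(self, regs):
        self.regs = dict(regs)  # committed (architectural) register values
        self.slots = []         # one entry per issued instruction, in program order
        self.rename = {}        # architectural register -> slot that will produce it
        self.head = 0           # oldest instruction not yet committed

    def issue(self, dest, fn, sources):
        # (1) Issue, in program order: sources are renamed to reorder buffer slots.
        srcs = [("slot", self.rename[s]) if s in self.rename else ("reg", s)
                for s in sources]
        self.slots.append({"dest": dest, "fn": fn, "srcs": srcs,
                           "done": False, "value": None, "error": None})
        self.rename[dest] = len(self.slots) - 1
        return len(self.slots) - 1

    def ready(self, i):
        # An operation may execute once every renamed source has been written.
        return all(kind == "reg" or self.slots[ref]["done"]
                   for kind, ref in self.slots[i]["srcs"])

    def execute(self, i):
        # (2) Execute and (3) Write Result: any order, once operands are ready.
        slot = self.slots[i]
        assert self.ready(i) and not slot["done"]
        args = [self.regs[ref] if kind == "reg" else self.slots[ref]["value"]
                for kind, ref in slot["srcs"]]
        try:
            slot["value"] = slot["fn"](*args)
        except Exception as exc:
            slot["error"] = exc     # recorded in the slot, not taken yet
        slot["done"] = True

    def commit(self):
        # (4) Commit, in program order. Returns the index of a faulting
        # instruction, or None if no exception reached the head.
        while self.head < len(self.slots) and self.slots[self.head]["done"]:
            slot = self.slots[self.head]
            if slot["error"] is not None:
                faulting = self.head
                del self.slots[self.head:]  # squash it and everything younger
                self.rename.clear()
                return faulting
            self.regs[slot["dest"]] = slot["value"]
            self.head += 1
        return None


rob = ReorderBuffer({"R1": 10, "R2": 0})
div = rob.issue("R3", lambda a, b: a // b, ["R1", "R2"])  # will divide by zero
inc = rob.issue("R4", lambda a: a + 1, ["R1"])
rob.execute(inc)            # the younger instruction finishes first
first = rob.commit()        # nothing commits: the head has not finished
rob.execute(div)
second = rob.commit()       # the exception is taken at the head, in order
```

After the second commit the architectural registers still hold only R1 and R2.  The result of the younger instruction sat in its reorder buffer slot and was discarded, so the state seen by the exception handler is exactly the state before the faulting instruction.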