

# **RADIX – a commonly-used signed number notation (also called 2's complement)** Glossary of Common Terms

- **OPERAND – a binary number involved in an arithmetic or logical operation**
- **HALF ADDER – logic circuit that adds two binary bits to produce carry and sum outputs**
- **FULL ADDER – logic circuit that adds three binary bits to produce carry and sum outputs**
- **ADDER/SUBTRACTOR – logic circuit that adds and subtracts pairs of binary operands**
- **MAGNITUDE COMPARATOR – logic circuit that determines which binary operand is greater/less than a second binary operand**

### Glossary of Common Terms

- **CARRY LOOK-AHEAD – a means of generating the carry functions needed for addition in parallel**
- **MULTIPLIER – logic circuit that multiplies pairs of binary operands**
- **COMPUTER – device that sequentially executes a stored program**
- **PROGRAM – a series of instructions that direct the processing activity of a computer**
- **INSTRUCTION – a unit of processing activity ("line of code") executed by a computer**
- **OPCODE – the "operation code" field of an instruction**

### Glossary of Common Terms

- **MEMORY – array of D latches used to store instructions, operands, and results**
- **PROGRAM COUNTER – register that points to the next instruction to execute**
- **INSTRUCTION REGISTER – register used to "stage" the instruction fetched from memory**
- **ALU – arithmetic logic unit, performs arithmetic and logic operations on binary operands**
- **INSTRUCTION DECODER & MICROSEQUENCER – state machine that orchestrates the activities of a computer's functional blocks**

### Glossary of Common Terms

- **MICROSEQUENCE the "minute" phases of instruction processing by a computer**
- **TRANSFER OF CONTROL continue execution of program at a location different than the next consecutive instruction**
- **I/O – data input and output operations performed by a computer**
- **STACK last in, first out data structure used to support expression evaluation and subroutine linkage**
- **STACK POINTER register used to point to the top stack item (or next available location)**



# Reading Assignment:

*DDPP* 4th Ed. pp. 34-39, 5th Ed. pp. 44-48

#### Learning Objectives:

- **Compare and contrast three different signed number notations: sign and magnitude, diminished radix, and radix**
- **Convert a number from one signed notation to another**
- **Describe how to perform sign extension of a number represented using any of the three notation schemes**

### **Outline**

#### **Overview**

- **Signed number notation**
	- **Sign and magnitude**
	- **Diminished radix**
	- **Radix**
- **Comparison chart**
- **Simplifications for binary numbers**
- **Sign extension**

### **Overview**

- **In order to represent positive and negative numbers as a series of digits without "+" and "–" signs, various** *signed number notations* **have been devised**
- **We will discuss the three most commonly used signed number notations:**
	- *sign and magnitude* **(SM)**
	- *diminished radix* **(DR)**
	- *radix* **(R)**

### Sign and Magnitude

- **The original signed number convention employed by "vacuum tube vintage digijocks" was** *sign and magnitude* **notation**
- **Here the** *left-most digit* **(or "sign bit") indicates whether the number is positive or negative:**
	- **"0" positive**
	- **"R–1" negative (where R is the** *radix* **or** *base* **of the number)**

### Sign and Magnitude

- **Examples:**
	- $(+123)_{10}$  = SM(0123)<sub>10</sub>  $\overline{(-123)}_{10}$  = SM(9123)<sub>10</sub>  $(+144)_{5}$  = SM(0144)<sub>5</sub>  $(-144)_{5}$  = SM(4144)<sub>5</sub>

**SM(0123)<sub>10</sub> and SM(9123)<sub>10</sub> are referred to as the** *sign and magnitude complements* **of one another**

### Sign and Magnitude

- **Negation Method: If N is a number in base R with**  sign digit n<sub>s</sub>, such that
	- $(N)_R = n_S n_3 n_2 n_1 n_0$
	- *then*
	- $-(N)_{R} = (R-1-n_{S})n_{3}n_{2}n_{1}n_{0}$
- **Examples:**  $(+1101)_2 = SM(01101)_2$  $(-1101)_2$  **= SM(11101)**<sub>2</sub>

# Diminished Radix

- The negation (or complement) of a number represented<br>in *diminished radix* (DR) notation can be found by<br>subtracting each digit (including the sign digit) from<br>(R–1), i.e., the "radix minus one" or the "radix **diminished by one"**
- **Examples:**  $(+123)_{10}$  = DR(0123)<sub>10</sub>  $(-123)_{10} = (9999 - 0123)_{10} = DR(9876)_{10}$

 $DR(0123)<sub>10</sub>$  and  $DR(9876)<sub>10</sub>$  are referred to as the *diminisher* 

### Diminished Radix

- **Negation Method: If N is a number in base R,**  $-(N)_R = (R^n - 1)_R - (N)_R$
- **Examples:**  $(+1101)_2$  **= DR(01101)**<sub>2</sub>  $(-1101)_2$  **= DR(10010)**<sub>2</sub>

**Note that** *positive* **DR numbers have the same representation as** *positive* **SM numbers;** *negative* **DR and SM numbers, however, have different representations**

# Radix

- The negation (or complement) of a number represented<br>in radix  $(R)$  notation can be found by adding one to the<br>least significant position of the *diminished radix*<br>negation of that number
- **Examples:**  $(1123)_{10} = R(0123)_{10}$  $(-123)_{10} = (9999 - 0123 + 1)_{10} = R(9877)_{10}$

 $R(0123)_{10}$  and  $R(9877)_{10}$  are referred to as the *radix complements* **of one another**

### Radix

- **Negation Method: If N is a number in base R,**  $-(N)_R = (R^n)_R - (N)_R$
- **Examples:**  $(+1101)_2$  **= R(01101)**<sub>2</sub>  $(-1101)$ <sub>2</sub> = R(10011)<sub>2</sub>

**Note that** *positive* **R, DR, and SM numbers all have the same representation;** *negative* **R, DR, and SM numbers, however, all have different representations**





– **DR (also called** *1's complement* **): complement each** 











### **Practice**

- $\bullet$  Extend SM(09345)<sub>10</sub> to 8 digits
- Extend DR(76500)<sub>8</sub> to 8 digits
- $\bullet$  Extend  $R(01100)_2$  to 8 digits
- $\bullet$  Extend  $R(11100)_2$  to 8 digits



**SM(00009345)<sub>10</sub>** 

### Comparison/Observations

- **SM and DR notation have a** *balanced set* **of positive and negative numbers, and have** *two representations* **for zero**
- **R notation has an** *unbalanced* **set of positive and negative numbers (there is an "extra negative number"), and has a** *single* **representation for zero**
- **99% of all computers use** *radix* **notation; our discussion on addition and subtraction will therefore focus on** *radix arithmetic* **(we will assume a prefix of "R" on all numbers subsequently used)**







**28**

#### Reading Assignment: *DDPP* 4th Ed. pp. 39-43, 5th Ed. pp. 48-52

Learning Objectives:

- **perform radix addition and subtraction**
- **describe the various conditions of interest following an arithmetic operation: overflow, carry, negative, zero**

### **Outline**

- **Radix Addition**
- **Overflow Detection Radix Subtraction**
- 

# Radix Arithmetic – Addition

- **Method: Add all digits, including the sign digits;** *ignore* **any carry out of the sign position**
- **Problem: Since we are working with numbers of fixed length, the result of an addition can yield a number which is** *too large* **to represent in the same number of digits – this error condition is called** *overflow*
- **Important: When overflow occurs, there is** *no valid numeric result*

### Overflow Detection

- **Summarization: Overflow occurs if two positive numbers are added and a negative result is obtained, or if two negative numbers are added and a positive result is obtained (or, if numbers of** *like sign* **are added and a result with the** *opposite sign* **is obtained)**
- **Overflow** *cannot* **occur when adding numbers of the**  *opposite sign*
- **Another way to detect overflow: If the** *carry in* **to the sign position is** *different* **than the** *carry out* **of the sign position, then overflow has occurred**

























**48**

- **2. When subtracting the five-bit signed numbers (10111)<sub>2</sub> - (11001)<sub>2</sub> using radix arithmetic, the result obtained is:**
	- $A. (10000)_2$
	- **B.** (11000)<sub>2</sub>
	- **C.** (11110)<sub>2</sub>
	- **D. overflow** *(invalid result)*
	- **E. none of the above**



#### Reading Assignment:

*DDPP* 4th Ed. pp. 458-466, 469-478; 5th Ed. pp. 331-339, 341-345, 372-375

- Learning Objectives:
- **Describe the operation of a half-adder and write equations for its sum (S) and carry (C) outputs**
- **Describe the operation of a full adder and write equations for its sum (S) and carry (C) outputs**
- **Design a "population counting" or "vote counting" circuit using an array of half-adders and/or full-adders**
- **Design an N-digit radix adder/subtractor circuit with condition codes**
- **Design a (signed or unsigned) magnitude comparator circuit that determines if A=B, A<B, or A>B**

# **Outline**

**49**

- **Overview**
- **Half Adders**
- **Full Adders**
- **Radix Adder/Subtractors**
- **Comparators**

#### **Overview**

- **Addition is the most commonly performed arithmetic operation in digital systems**
- **An** *adder* **combines two arithmetic operands using the addition rules described previously**
- **The same addition rules (and circuits) are used for both**  *signed* **(two's complement) and** *unsigned* **numbers**
- **Subtraction can be performed by taking the** *complement*  **of the subtrahend and adding it to the minuend**

### Half Adders

 $\bullet$  A half adder is used to add two binary digits,  $X_i$  and  $Y_i$ , **to form a sum digit, Si , and a carry digit, Ci**

**Xi Yi Ci Si 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 0 Si = Xi Yi Ci = Xi • Yi**













- **B. 32**
- 
- **C. 64 D. 128**

**59**

**E. none of the above** 

**60**

**2. Implemented using a 22V10 PLD, a circuit that finds the sum of three 2-bit unsigned numbers would require no more than \_\_\_ macrocells. A. 2**

- **B. 4**
- **C. 8**
- **D. 16**
- **E. none of the above**

### Multi-Digit Adder/Subtractor Circuits

- **Two binary words, each with** *n* **bits, can be added using a** *ripple* **adder – a cascade of** *n* **full-adder stages, each of which handles one bit (also called an** *iterative* **circuit)**
- **The word** *ripple* **describes the flow of the carries from one full adder cell to the next**
- **Subtraction is performed by taking the** *diminished radix* **complement of the subtrahend (using XOR gates) and**  setting the *least significant bit carry-in* (LSB C<sub>in</sub>) to "1" **(effectively forming the** *radix* **complement of the subtrahend)**

### Review of Radix Addition

- **Method: Add all digits, including the sign digits; ignore any carry out of the sign position**
- **Problem: Since we are working with numbers of fixed length, the result of an addition can yield a number which is** *too large* **to represent in the same number of digits – this error condition is called** *overflow*
- **Important: When overflow occurs, there is** *no valid numeric result*

### Overflow Detection

**61**

- **Summarization: Overflow occurs if two positive numbers are added and a negative result is obtained, or if two negative numbers are added and a positive result is obtained (or, if numbers of** *like sign* **are added and a result with the** *opposite sign* **is obtained)**
- **Overflow** *cannot* **occur when adding numbers of the**  *opposite sign*
- **Another way to detect overflow: If the** *carry in* **to the sign position is** *different* **than the** *carry out* **of the sign position, then overflow has occurred**















### Looking Ahead…

- **The "C" (carry) flag serves multiple purposes**
- Extended precision add/subtract ("extended" = integer multiples of 32 **bits, if using a 32-bit processor)**
- **Add "C" bit (***being set***) represents the presence of a** *carry propagated forward*  **(from least significant word to most significant word of extended number) Subtract – "C" bit (***being set***) represents the absence of a** *borrow propagated forward* **(i.e., the** *complement* **of a borrow propagated forward, which means that**
- **indicates that no borrow is propagated forward) So…***regardless of whether an add or subtract is performed***, the "C" bit is simply So...** *rega* set to the
- **Conditional execution (change of flow)**
- **Flags set based on subtract of operands being compared (e.g. CMP instruction)**







**77**





### **Comparators**

- **Comparing two binary words for equality is a commonly used operation in computer systems – a circuit that does this is called a** *comparator*
- **XOR and XNOR gates may be viewed as** *one-bit comparators* **(from which larger comparators can be built)**
- **Circuits that determine an arithmetic relationship between two operands (greater or less than) are called** *magnitude comparators*

#### Example – 4-bit Magnitude Comparator

- **Design a 4-bit (signed) magnitude comparator that determines if A=B, A<B, or A>B**
- **Solution: Calculate (A–B) and examine the produced for each case**
	- **ZERO ("Z" for zero flag)**
	- **NEGATIVE ("N" for negative flag)**
	- **CARRY ("C" for carry/borrow flag)**
	- **OVERFLOW ("V" for overflow flag)**

**Need to know how the condition codes are affected for all possible results generated**



#### Example – 4-bit Magnitude Comparator

Step 1: Determine condition codes produced for all possible (2-bit) cases of A–B





















# Reading Assignment:

*DDPP* 4th Ed. pp. 478-482, 484-487, 490-491; 5th Ed. pp. 376-383, 384-386

Instructional Objectives:

- **Describe the operation of a carry look-ahead (CLA) adder circuit, and compare its performance to that of a ripple adder circuit**
- **Define the CLA propagate (P) and generate (G) functions, and show how they can be realized using a half-adder**
- **Write the equation for the carry out function of an arbitrary CLA bit position**
- **Draw a diagram depicting the overall organization of a CLA**
	- **Determine the worst case propagation delay incurred by a practical (PLD-based) realization of a CLA**
- **Describe how a "group ripple" adder can be constructed using N-bit CLA blocks**

### **Outline**

- **Overview**
- **CLA derivation**
- **CLA organization**
- **Sample CLA realization in Verilog**
- **Observations**

#### **Overview**

- **Previously we looked at one method of constructing an n-digit binary adder from** *n* **full adders: connecting the**  *carry out* **from one stage to the** *carry in* **of the next in a**  *ripple* **fashion**
- **For large values of** *n***, the propagation delay of a ripple adder can be excessive**
- A significant speed-up could be obtained by calculating<br>the carries in parallel, rather than *iteratively* a design<br>that accomplishes this goal is the carry look-ahead<br>(CLA) adder circuit (look-ahead anticipated)

### CLA Derivation

**Consider the 4-bit binary adder:**



### CLA Derivation

**Definition:** The *generate function* **G**<sub>i</sub> = 1 if there is a *carry out* **of stage i regardless of whether or not there is a** *carry in* **to stage** *i* **(i.e., both**  $X_i$  **and**  $Y_i$  **are 1)** 

### $G_i = X_i \cdot Y_i$

- Definition: The *propagate function* **P**<sub>i</sub> = 1 if a carry in to **stage i will cause a carry out of stage i (i.e.,** *either* **Xi = 0 and Y<sub>i</sub> = 1** *or* **X<sub>i</sub> = 1 and Y<sub>i</sub> = 0)**  $P_i = X_i \oplus Y_i$
- **NOTE: Another valid definition of Pi is Xi +Yi the "XOR" definition leads to some CLA circuit simplifications, however**





# CLA Derivation

- **Next we would like to write our basic full adder equations in terms of propagate and generate functions**
- **To do this, we will need to reexamine the K-maps for the sum and carry equations of the full adder**







# CLA Derivation

 **Rewriting the equations for our 4-bit binary adder, we obtain the following:**

 $C_{-1} = C_{in}$  $C_0 = G_0 + C_{in} \cdot P_0$  $C_1 = G_1 + C_0 \cdot P_1$ 

- $C_2 = G_2 + C_1 \cdot P_2$
- $C_3 = C_{\text{out}} = G_3 + C_2 \cdot P_3$

# CLA Derivation

 **We would like to write these equations in terms of**  *available inputs* (P's, G's, and C<sub>in</sub>) rather than the<br>*intermediate carries* (C<sub>0</sub>, C<sub>1</sub>, etc.) – the key to doing<br>this is *successive expansion* of the previous equations in terms of the equation for C<sub>0</sub>:  $C_1 = G_1 + C_0 \cdot P_1$ 

$$
= G_1 + (G_0 + C_{in} \cdot P_0) \cdot P_1
$$



# CLA Derivation – Exercise **Write the remaining 4-bit CLA adder carry equations:**  $C_2 =$  $C_3 =$

# CLA Derivation – Exercise



 $C_3 =$ 

# CLA Derivation – Exercise

 **Write the remaining 4-bit CLA adder carry equations:**  $C_2 = G_2 + G_1 \cdot P_2 + G_0 \cdot P_1 \cdot P_2 + C_{IN} \cdot P_0 \cdot P_1 \cdot P_2$ 

 $C_3 = G_3 + G_2 \cdot P_3 + G_1 \cdot P_2 \cdot P_3 + G_0 \cdot P_1 \cdot P_2 \cdot P_3$  $+ C_{IN} \cdot P_0 \cdot P_1 \cdot P_2 \cdot P_3$ 











### **Observations**

- Note that *regardless* of the adder length  $(n)$ , the time required<br>to produce any sum digit is the same i.e., all sum digits are<br>produced *in parallel*
- **Large CLA adders are difficult to build in practice because of the "product term explosion" that occurs as the carry equations are expanded**
- **A reasonable compromise is to make a** *group ripple adder* **by cascading m-bit CLA blocks together to make a kxm-bit adder (where k is the number of CLA blocks)**
- **The "+" operator in Verilog synthesizes most suitable adder realization based on** *optimization constraints* **(area, fmax, etc.)**



#### Reading Assignment: *DDPP* 4th Ed. pp. 45-47, 494-496, 503; 5th Ed. pp. 54-56, 416-419

Learning Objectives:

- **Describe the operation of an unsigned multiplier array constructed using full adders**
- **Determine the full adder arrangement and organization (rows/diagonals) needed to construct an NxM-bit unsigned multiplier array**
- **Determine the worst case propagation delay incurred by a practical (PLD-based) realization of an NxM-bit unsigned multiplier array**

# **Outline**

- **Overview**
- **Product components**
- **Example circuit**
- **Critical path analysis**
- **Generalizations**
- **Realizations in Verilog**

#### **Overview**

**Consider a 3x3 unsigned binary multiplication:**



### Product Components

- **Most approaches to (unsigned) combinational binary multiplication are based on the "paper-and-pencil"**  *shift and add* **algorithm**
- **Each row is called a** *product component*  **a shifted multiplicand that is multiplied by 0 or 1 depending on the corresponding multiplier bit**
- **Each xi yj term represents a** *product component bit* **(the logical AND of multiplicand bit xi with multiplier bit yj )**
- The *product*  $P = p_5p_4p_3p_2p_1p_0$  is obtained by adding **together all the product components**







# Generalizations

- **Generalizations for an NxM multiplier** – **N = number of bits in multiplicand (top)**
	- **M = number of bits in multiplier (bottom)**
	- **produces an N+M digit result**
	- **requires NxM AND gates to generate the product components**
	- **requires N–1** *diagonals* **of full adders**





















**133**

**135**

**137**



5. Assuming a large 10 ns PLD was used to generate each product component bit and implement each full adder cell, the worst case propagation delay of a 6x4 unsigned binary multiplier array would be \_\_\_ ns **A. 80 B. 90 C. 100 D. 110 E. none of the above**



7. A 4x6 unsigned binary multiplier array would require \_\_\_ "diagonals" of full adder cells **A. 3 B. 4 C. 5 D. 6 E. none of the above**



9. A 4x6 unsigned binary multiplier array would require \_\_\_ AND gates to generate the product component bits **A. 10 B. 18 C. 20 D. 24 E. none of the above**

**139**



### Realizations in Verilog

**Keys**

- **use** *expressions* **to define product components** – **use the** *addition operator* **(+) to form unsigned sum of product components**
- **Example: 4x4 multiplier realization**







**142**



#### Reading Assignment: *DDPP* 4th Ed. pp. 48-51; 5th Ed. pp. 58-60

Learning Objectives:

- **Describe the operation of a binary coded decimal (BCD) "correction circuit"**
- **Design a BCD full adder circuit**
- **Design a BCD N-digit radix (base 10) adder/subtractor circuit**

### **Outline**

- **Overview**
- **General BCD adder circuit model**
- **Decimal addition and correction**
- **Decimal adder circuits**
- **Nine's complement circuit**

#### **Overview**

- **Even though binary numbers are the most appropriate for the internal computations of a digital system, most people still prefer to deal with decimal numbers**
- **External interfaces of a digital system may need to read or display decimal numbers, and therefore need to perform arithmetic on decimal numbers directly**
- **The most commonly used decimal code is** *binary-coded decimal* **(BCD)**
- **Some computers place two BCD digits in an 8-bit byte ("packed-BCD format")**

#### **Overview**

- **Consider the problem of adding a pair of BCD digits – the objective is to design a circuit that adds the two 4 bit codes along with a carry in to produce a 4-bit coded sum digit plus a carry out**
- **We would like to use standard 4-bit binary adder modules (with which we are already familiar) as basic building blocks**
- **Note that because there are six "unused combinations" in BCD, a** *correction* **must be performed if the direct sum of the two 4-bit codes exceeds 1001**

# General BCD Adder Circuit Model



**the 4-bit adder**







### Decimal Adder Circuits

**• Summary of rules** 

- **If the sum of the two BCD digits is less than or equal to nine (1001),** *no correction* **is needed**
- **If the sum is greater than nine, the result obtained directly from the 4-bit adder must be** *corrected* **in order to represent the proper BCD digit**
- **Some microprocessors include a "decimal adjust" (DAA) instruction for performing this correction following the addition of BCD operands**















- **Thought questions:**
	- **How could an** *n-digit* **BCD adder be constructed using the "decimal full adder" circuit just designed?**
	- **How could this n-digit BCD adder be made into an n-digit BCD** *adder/subtractor***?**

**Hint: How could the** *radix complement* **of a BCD digit (where the radix or base is 10) be generated?**





#### Reading Assignment: *Meyer Supplemental Text*, pp. 1-18

Learning Objectives:

- **Define computer architecture, programming model, and instruction set Describe the top-down specification, bottom-up implementation strategy as it pertains to the design of a computer**
- **Describe the characteristics of a "two address machine"**
- **Describe the contents of memory: program, operands, results of calculations**
- **Describe the format and fields of a basic machine instruction (opcode and address)**
- **Describe the purpose/function of each basic machine instruction (LDA, STA, ADD, SUB, AND, HLT)**
- **Define what is meant by "assembly-level" instruction mnemonics Draw a diagram of a simple computer, showing the arrangement and**
- **interconnection of each functional block <sup>165</sup>**

### **Outline**

- **•** Introduction
- **Top-Down, Bottom-Up Design Methodology**
- **Simple Computer "Big Picture"**
- **A Simple Instruction Set**
- **A Simple Programming Example**
- **System Block Diagram**

### Introduction

- **The focus thus far has been on a number of digital system "building blocks"**
	- **state machines**
	- **latches and flip-flops**
	- **arithmetic logic circuits**
	- **decoders, encoders, multiplexers**
	- **basic gates (NAND, NOR, XOR, etc.)**
- **Question: What is the primary utility of these building blocks?**

**Building a computer or interfacing to an existing one**

### Introduction

- **Question: What is a computer?**
- **A device that sequentially executes a stored program**
- **(or, a device that stores and manipulates state)**
- **Question: What is a microprocessor?**
- **A single-chip embodiment of the major functional blocks of**
- **a computer**
- **Question: What is a microcontroller?**
- **A microprocessor with a number of integrated peripherals typically used in control-oriented applications**

### Introduction

- **Question: How can we apply what we know about various digital system building blocks to design a simple computer?**
- **We need a** *structured approach* **that enables us to transform a "word" description of what a computer does into a block diagram**
- **Definition: The** *architecture* **of a computer is the**  *arrangement* **and** *interconnection* **of its functional blocks**



### Introduction

 **Analogy: Designing and building a house - 2** – **then develop a "floor plan"...**



#### Introduction

 **Analogy: Designing and building a house - 3** – **then embellish the floor plan with details (outlets, lights, plumbing, HVAC, etc.)**



### Introduction

 **Analogy: Designing and building a house - 4** – **When you're ready to build, how do you proceed?**

**From the** *ground-up* **– start with the foundation, create basic structure, embellish with finishing details** 

 **Question: What would you call the basic procedure we have just described?**

*Top-down specification* **of functionality,**  *bottom-up implementation* **of basic blocks**

# Simple Computer "Big Picture"

- **We wish to apply this "top-down, bottom-up" methodology to the design of a simple 8-bit computer (or, microprocessor")**
- **First step: Draw the** *big picture*



### Simple Computer "Floor Plan"

 **Question: In the design of a computer, what is analogous to the** *floor plan* **of a house?**

**The computer's** *programming model* **and** *instruction set*

- **Definition: The** *programming model* **of a computer is the set of "user" registers available to the programmer**
- **Definition: A collection of two or more flip-flops with a common clock (and generally a common purpose) is called a** *register*

**In the simple computer designed here, the programming model will contain a single** *data register* **"where the result accumulates" (the** *accumulator***, or "A register" for short), plus** *condition codes*  **or** *f* 

# Simple Computer "Floor Plan"

- **Definition: The** *instruction set* **of a computer is the set of operations the computer can be programmed to perform on data**
- **Instructions typically consist of several fields that indicate the** *operation* **to be performed ("operation code", or** *opcode***) and the** *data* **on which the operation is to be performed (specified using an** *addressing mode***)**
- **Our 8-bit computer will utilize a 3-bit opcode field (thus allowing 8 different kinds of instructions to be implemented) and a 5-bit address field (thus allowing 32 locations)**

# Simple Computer "Floor Plan"

**Instruction format:**

# **X X X Y Y Y Y Y**

**XXX – indicates operation to perform ("opcode")**

**YYYYY – indicates location of operand ("address")**

**Called a "two address machine" since one operand will be the accumulator ("A") register and the other operand will be obtained from the specified location in memory**

# Simple Computer "Floor Plan"

**Instruction set:**



### Simple Programming Example











### Simple Computer Block Diagram

- **Functional blocks required:**
	- **a place to store the program, operands, and computation results –** *memory*
	- **a way to keep track of which instruction is to be executed next –** *program counter (PC)*
	- **a place to temporarily "stage" an instruction while it is being executed –** *instruction register (IR)*
	- **a way to perform arithmetic and logic operations –** *arithmetic logic unit (ALU)*
	- **a way to coordinate and sequence the functions of the machine –** *instruction decoder and micro-sequencer (IDMS)*













- **Each functional block is "self-contained" (which means each block can be designed and tested** *independently***)**
- **Additional instructions can be added by increasing the number of** *opcode* **bits**
- **Additional memory can be added by increasing the number of** *address* **bits**
- **The numeric range can be expanded by increasing the number of** *data* **bits**



**194**

**196**



- A. accumulator
- B. program counter
- C. instruction register D. microsequencer
- E. none of the above
- 

#### Q2. The place where an instruction fetched from memory is "staged" while it is being decoded and executed is the:

- A. accumulator
- B. program counter C. instruction register

**193**

**195**

- D. microsequencer
- E. none of the above

Q3. If two additional address bits were added to the Simple Computer, the *number of memory locations* the machine could access would increase:

- A. by two locations
- B. by four locations
- C. by two times the original number of locations
- D. by four times the original number of locations
- E. none of the above

#### Q4. The expression  $(10110) \leftarrow (A) + (10110)$  means:

- A. replace the contents of the accumulator with the sum of its current contents plus the contents of memory location 10110
- B. replace the contents of the accumulator with the sum of its current contents plus the constant 10110
- C. replace the contents of memory location 10110 with the sum of its current contents plus the contents of the accumulator
- D. add the constant 10110 to the contents of the accumulator and store the result in memory location 10110
- E. none of the above



#### Reading Assignment: *Meyer Supplemental Text*, pp. 18-24

Learning Objectives:

- **Trace the execution of a computer program, identifying each step of an instruction's microsequence (fetch and execute cycles)**
- **Distinguish between synchronous and combinational system control signals**

### **Outline**

- **Review of top-down specification phase of design process**
	- **Big picture**
	- **Floor plan (instruction set)**
- **Block diagram**
- **Instruction execution tracing**

# Simple Computer "Floor Plan"

**Instruction set:**



# Simple Computer Block Diagram

- **Functional blocks required:**
	- **a place to store the program, operands, and computation results –** *memory*
	- **a way to keep track of which instruction is to be executed next –** *program counter (PC)*
	- **a place to temporarily "stage" an instruction while it is being executed –** *instruction register (IR)*
	- **a way to perform arithmetic and logic operations –** *arithmetic logic unit (ALU)*
	- **a way to coordinate and sequence the functions of the machine –** *instruction decoder and micro-sequencer (IDMS)*



#### **Addr Instruction Comments**  00000 LDA 01011<br>00001 ADD 01100 **00001 ADD 01100** Add contents of location **01100** to A **00010 STA 01101** Store contents of A at location **01101 00011 LDA 01011** Load A with contents of location **01011 00100 AND 01100** AND contents of **01100** with contents of A **00101 STA 01110** Store contents of A at location **01110 00110 LDA 01011** Load A with contents of location **01011 00111 SUB 01100** Subtract contents of location **01100** from A **01000 STA 01111** Store contents of A at location **01111 01001 HLT** Stop – discontinue execution Simple Programming Example

















### Notes About Instruction Tracing

- **The clock edges drive the** *synchronous* **functions of the computer (e.g., increment program counter, load instruction register)**
- **The decoded states (here, fetch and execute) enable the**  *combinational* **functions of the computer (e.g., turn on tristate buffers)**



### Reading Assignment:

*Meyer Supplemental Text*, pp. 24-42

- Learning Objectives: **Describe the operation of memory and the function of its control signals: MSL, MOE, and MWE**
- **Describe the operation of the program counter (PC) and the function of its control signals: ARS, PCC, and POA**
- **Describe the operation of the instruction register (IR) and the function of its control signals: IRL and IRA**
- **Describe the operation of the ALU and the function of its control signals: ALE, ALX, ALY, and AOE Describe the operation of the instruction decoder/microsequencer and derive the**
- **system control table**
- **Describe the basic hardware-imposed system timing constraints**
- Discuss how the instruction register can be loaded with the contents of the memory<br>location pointed to be the program counter *and* the program counter can be<br>incremented on the same clock edge

### **Outline**

- **Bottom-up Realization Phase of Design Process**
	- **Memory**
	- **Program Counter**
	- **Instruction Register**
	- **Arithmetic Logic Unit**
	- **Instruction Decoder and Microsequencer**
- **System Data Flow and Timing Analysis**

### Bottom-Up Implementation

- **Having finished the "top-down"** *specification* **phase of the design process, we are now ready to** *implement* **each block identified from the "bottom-up"**
- **Note that, in practice, an important aspect of this process is to** *independently test* **(and debug) each block (or module) of the system** *as it is implemented*
- **If each module is independently tested and verified as it is implemented, then – when the modules are assembled together into a system – there is a much higher probability that it will "work the first time"!**

#### Simple Computer Block Diagram **Functional blocks required:**

- **a place to store the program, operands, and computation results –** *memory*
- **a way to keep track of which instruction is to be executed next –** *program counter (PC)*
- **a place to temporarily "stage" an instruction while it is being executed –** *instruction register (IR)*
- **a way to perform arithmetic and logic operations –** *arithmetic logic unit (ALU)*
- **a way to coordinate and sequence the functions of the machine –** *instruction decoder and micro-sequencer (IDMS)*



### Read/Write Memory (RWM)

- **The name** *read/write memory* **(RWM) is given to memory arrays in which we can store and retrieve information at any time**
- **Most of the RWMs used in digital systems are** *randomaccess memories* **(RAMs), which means that the time it takes to read or write a bit of memory is independent of the bit's location in the RAM**
- **In a static RAM (SRAM), once data is written to a given location, it remains stored as long as** *power* **is applied to the chip**
- **If power is removed, data is lost this is referred to as a**  *volatile memory*

### Read/Write Memory (RWM)

- **An SRAM has three (typically** *active low***) control inputs:** – **a** *chip select* **(CS) signal that serves as the overall** 
	- **enable for the memory chip** – **an** *output enable* **(OE) signal that tells the memory chip**
	- **to drive the data output lines with the contents of the memory location specified on its address lines**
	- **a** *write enable* **(WE) signal that tells the memory chip to write the data supplied on its data input lines at the memory location specified on its address lines**

# Read/Write Memory (RWM)

- **SRAM normally has two** *access operations:*
	- **READ: An address is placed on the address lines while CS and OE are asserted; the latch outputs for the selected location are output on the data lines**
	- **WRITE: An address is placed on the address lines, data is placed on the data lines, then CS and WE are asserted; the latches of the selected location open, and the data is stored**

# Read/Write Memory (RWM)

 **Each bit of memory (or SRAM cell) in a static RAM behaves as the circuit depicted below**



- **When the SEL input is asserted, the stored data is placed on the cell's output**
- **When both SEL and WR are asserted, the latch is open and a new data bit is stored**
- **SRAM cells are combined in an array with additional control logic to form a complete static RAM**



# **Some things to note:** Read/Write Memory (RWM)

- **during** *read* **operations, the output data is a combinational function of the address inputs, so no "harm" is done by changing the address while CS and OE are asserted**
- **during** *write* **operations, data is stored in latches – this means that data must meet certain** *setup and hold times* **with respect to the** *negation* **of the WE signal**
- **also during** *write* **operations, the address lines must be stable for a certain setup time** *before* **WR is asserted internally and for a hold time** *after* **WR is negated**

# Read/Write Memory (RWM)

 **Most SRAMs utilize a** *bi-directional data bus* **(i.e., the same data pins are used for reading and writing data)**



# Simple Computer Memory

- **The memory for our simple computer will contain 32 locations (5-bit address), each 8 bits wide (i.e., a "32x8" memory)**
- **The memory subsystem will have three control signals:** – **MSL: Memory SeLect**
	- **MOE: Memory Output Enable**
	- **MWE: Memory Write Enable**

**NOTE: For simplicity (and clarity) all system control signals as well as address and data bus signals will be assumed to be ACTIVE HIGH**



Q1. When a set of control signals are said to be mutually exclusive, it means that:

- A. all the control signals may be asserted simultaneously
- B. only one control signal may be asserted at a given instant
- C. each control signal is dependent on the others
- D. any combination of control signals may be asserted at a given instant
- E. none of the above

**228**

#### A. MSL and MOE Q2. For the memory subsystem, the set of signals that are mutually exclusive is:

- B. MSL and MWE
- C. MOE and MWE
- D. MSL, MOE, and MWE
- E. none of the above

# Simple Computer Block Diagram

- **Functional blocks required:** – **a place to store the program, operands, and computation results –** *memory*
	- **a way to keep track of which instruction is to be executed next –** *program counter (PC)*
	- **a place to temporarily "stage" an instruction while it is being executed –** *instruction register (IR)*
	- **a way to perform arithmetic and logic operations –** *arithmetic logic unit (ALU)*
	- **a way to coordinate and sequence the functions of the machine –** *instruction decoder and micro-sequencer (IDMS)*



### Program Counter

**229**

- **The** *program counter* **(PC) is basically a binary "up" counter with tri-state outputs**
- **The functions and corresponding control signals required are as follows:**
	- **ARS: Asynchronous ReSet**
	- **PCC: Program Counter Count enable**
	- **POA: Program counter Output on Address bus tri-state buffer enable**







# Instruction Register

- **The** *instruction register* **(IR) is basically an 8-bit data register, with tri-state outputs on the lower 5 bits**
- **Note that the upper 3 bits (opcode field) are output directly to the instruction decoder and microsequencer**
- **The functions and corresponding control signals required are as follows:**
- **IRL: Instruction Register Load enable**
- **IRA: Instruction Register Address field tri-state output enable**





# Arithmetic Logic Unit

- **The** *arithmetic logic unit* **(ALU) is a multi-function register that performs all the arithmetic and logical (Boolean) operations necessary to implement the instruction set**
- **The functions and corresponding control signals required are as follows:**
	- **ALE: ALU Enable**
	- **ALX: ALU "X" function select**
	- **ALY: ALU "Y" function select**
	- **AOE: A register tri-state Output Enable**





















- **a place to store the program, operands, and computation results –** *memory*
- **a way to keep track of which instruction is to be executed next –** *program counter (PC)*
- **a place to temporarily "stage" an instruction while it is being executed –** *instruction register (IR)*
- **a way to perform arithmetic and logic operations –** *arithmetic logic unit (ALU)*
- **a way to coordinate and sequence the functions of the machine –** *instruction decoder and micro-sequencer (IDMS)*



### Instruction Decoder and Microsequencer

- **The** *instruction decoder and microsequencer* **(IDMS) is a state machine that orchestrates the activity of all the other functional blocks**
- **There are two basic steps involved in "processing" each instruction of a program (called a** *micro-sequence***):**
	- *fetching* **the instruction from memory (at the location pointed to by the PC), loading it into the IR, and incrementing the PC**
	- *executing* **the instruction staged in the IR based on the opcode field and the operand address field**

#### Instruction Decoder and Microsequencer

- **Since there are only** *two states* **(fetch and execute), a single flip-flop can be used to implement the** *state counter* **("SQ")**
- **The control signals that need to be asserted during the**  *fetch* **cycle include:**
	- **POA: turn on PC output buffers**
	- **MSL: select memory**
	- **MOE: turn on memory output buffers**
	- **IRL: enable IR load**
	- **PCC: enable PC count**

**NOTE: The** *synchronous* **functions (IRL and PCC) will take place on the clock edge that causes the** *state co* **transition from the FETCH state to the EXECUTE state <sup>253</sup>**

#### Instruction Decoder and Microsequencer

- **The control signals that need to be asserted during an**  *execute* **cycle for the synchronous ALU functions (ADD, SUB, LDA, AND) are:**
- **IRA: turn on operand address output buffers**
- **MSL: select memory**
- **MOE: turn on memory data output buffers**
- **ALE: enable ALU operation**
- **ALX, ALY: select ALU function**
- **The control signals that need to be asserted during an**  *execute* **cycle for STA are:**
	- **IRA: turn on operand address output buffers**
	- **MSL: select memory**
	- **MWE: enable memory write**
	- **AOE: turn on A register output buffers <sup>254</sup>**



#### Instruction Decoder and Microsequencer

- **In order to** *stop* **execution (i.e., disable all the functional blocks) when a HLT instruction is executed, an additional flip-flop will be used (called "RUN") , as follows:**
	- **when the START pushbutton is pressed, the RUN flipflop will be** *asynchronously set*
	- **when a HLT instruction is executed, the RUN flip-flop will be** *asynchronously cleared*
	- **the RUN signal will be ANDed with the synchronous system enable signals, thus effectively** *halting* **execution when a HLT instruction is executed**







# System Data Flow and Timing Analysis

# **General procedure**

- **Understand operation of individual functional units**
- **Memory**
	- **Program counter**
	- **Instruction register**
	- **Arithmetic logic unit**
	- **Instruction decoder and microsequencer**
	- **(New functional blocks to be added)**

# System Data Flow and Timing Analysis

- **General procedure**
	- **Understand function ("data processing") performed by each instruction**
	- **Identify address and data flow required to execute each instruction**
	- **Identify micro-operations required to execute each instruction**
	- **Identify control signals that need to be asserted to generate the required sequence of micro-operations for each instruction**
	- **Examine timing relationship of control signals**

### System Data Flow and Timing Analysis

- **Basic hardware-imposed constraints**
	- **Only** *one device* **is allowed to drive a bus during any machine cycle (i.e., "bus fighting" must be avoided)**
	- **Data cannot pass through** *more than one* **(edge-triggered) flip-flop or latch per cycle**







**267**

**269**

Q2. The program counter (PC) can be incremented on the same cycle that its value is used to fetch an instruction from memory because:

- A. the synchronous actions associated with the IRL and PCC control signals occur on different fetch cycle phases
- B. the IRL and PCC control signals are not asserted simultaneously by the IDMS
- C. the load of the instruction register is based on the data bus value prior to the system CLOCK edge, while the increment of the PC occurs after the CLOCK edge
- D. the load of the instruction register occurs on the negative CLOCK edge, while the increment of the PC occurs on the positive CLOCK edge
- E. none of the above

A. the memory will ignore the new address the PC places on the address bus Q3. Incrementing the program counter (PC) on the same clock edge that loads the instruction register (IR) does not cause a problem because:

- B. the output buffers in the PC will not allow the new PC value to affect the address bus until the next fetch cycle
- C. the IR will be loaded with the value on the data bus prior to the clock edge while the contents of the PC will increment after the clock edge
- D. the value in the PC will change in time for the correct value to be output on the address bus (and fetch the correct instruction), before the IR load occurs
- E. none of the above



- B. the system clock has limited driving capability
- C. the flip-flops that comprise a register do not change state simultaneously, so additional time must be provided before the register's output can be used
- D. for a D flip-flop with clocking period  $\Delta$ , Q(t+ $\Delta$ )=D(t)
- E. none of the above



**268**

#### Reading Assignment: *Meyer Supplemental Text*, pp. 42-50

#### Learning Objectives:

- **Modify a reference ALU design to perform different functions (e.g., shift and rotate)**
- **Describe how input/output (IN/OUT) instructions can be added to the base machine architecture**
- **Describe the operation of the I/O block and the function of its control signals: IOR and IOW**
- **Compare and contrast the operation of OUT instructions with and without a transparent latch as an integral part of the I/O block**
- **Compare and contrast "jump" and "branch" transfer-of-control instructions along with the architectural features needed to support them**
- **Distinguish between conditional and unconditional branches**
- **Describe the basis for which a conditional branch is "taken" or "not taken"**

# **Outline**

- **Overview**
- **Adding shift instructions**
- **Adding input/output (I/O) instructions**
- **Adding transfer of control instructions**

#### **Overview**

 **We will use the two available opcodes (110 and 111) to add new instructions to the basic machine, a pair at a time**



#### **Overview**

 **We will add** *rows* **to the system control table to add the new instructions and add** *columns* **to add new control signals**



### Adding Shift Instructions

- **Definition: A shift instruction** *translates* **the bits in a register (here, the "A" register) one place to the** *left* **or to the** *right*
- **Definition: An** *end off* **shift discards the bit that gets shifted out**
- **Definition: A** *preserving* **shift retains the bit shifted out (typically in the CF condition code bit)**
- **Definition: A** *logical* **shift is a "zero fill" shift**
- **Definition: An** *arithmetic* **shift is a "sign preserving" shift (i.e., the sign bit gets replicated as the data is shifted right)**

### Shift Instruction Examples

- **Given: (A) = 10011010**
- **After** *logical shift left:* **00110100 CF=1**
- **After** *logical shift right:* **01001101 CF=0**
- **After** *arithmetic shift left:* **00110100 CF=1**
- **After** *arithmetic shift right:* **11001101 CF=0**

**Note: An arithmetic left shift is identical to a logical left shift**



### Modified Instruction Set



Modified System Control Table

| S <sub>0</sub><br>S <sub>1</sub> | -<br><b>HLT</b> | н | н |   | н | н |   |   |   |   |   |   |  |
|----------------------------------|-----------------|---|---|---|---|---|---|---|---|---|---|---|--|
|                                  |                 |   |   |   |   |   | н |   |   |   |   |   |  |
|                                  |                 | L |   |   | L |   | L |   |   | L |   |   |  |
| S <sub>1</sub>                   | LDA addr        | H | H |   |   |   |   | H |   | H |   |   |  |
| S1                               | LSR             |   |   |   |   |   |   |   |   | н |   | н |  |
| S1                               | <b>ASL</b>      |   |   |   |   |   |   |   |   | H | н |   |  |
| S1                               | <b>ASR</b>      |   |   |   |   |   |   |   |   | H | н | н |  |
| S <sub>1</sub>                   | STA addr        | H |   | H |   |   |   | H | H |   |   |   |  |
| <b>S1</b>                        |                 |   |   |   |   |   |   |   |   |   |   |   |  |
| S1                               |                 |   |   |   |   |   |   |   |   |   |   |   |  |





























**296**

Q1. If the output port pins are latched, data written to the port will remain on its pins:

- A. only during the execute cycle of the OUT instruction
- B. only when the clock signal is high
- C. until another OUT instruction writes different data to the port
- D. until the next instruction is executed
- E. none of the above

#### Q2. If the output port pins are not latched, data written to the port will remain on its pins:

- A. only during the execute cycle of the OUT instruction
- B. only when the clock signal is high
- C. until another OUT instruction writes different data to the port
- D. until the next instruction is executed
- E. none of the above

**295**

### Adding Transfer of Control Instructions

- **There are two basic types of** *transfer-of-control* **instructions:**
	- **absolute: the operand field contains the address in memory at which execution should continue ("jump")**
	- **relative: the operand field contains the signed offset that should be added to the PC to determine the location at which execution should continue ("branch")**
- **Jumps or branches can be** *unconditional* **("always happen") or** *conditional* **("happen only when a specified condition is met" – which is usually a function of the condition codes)**

#### Adding Transfer of Control Instructions

- **For the purpose of illustration, we will add an** *unconditional jump* **("JMP") instruction to our simple computer along with a single** *conditional* **jump ("J***cond***")**
- **The conditional jump illustrated will be a** *jump if ZF set* **("JZF") instruction**
- **To execute a jump instruction, the operand field (from the IR) will be loaded into the PC via the** *address bus* **when PLA is asserted**
- **Note that, for the conditional jump instruction, that if the condition is not met, the execute cycle will be a "***no-operation***" (or, "NOP") cycle**











#### Q2. Whether or not a conditional branch is *taken* or *not taken* depends on:

- A. the value of the program counter
- B. the state of the condition code bits
- C. the cycle of the state counter
- D. the value in the accumulator
- E. none of the above

#### Thought Questions

**What would be required to add a** *jump if less than* **("JLT") or a** *jump if greater than or equal to* **("JGE") instruction?**



### Thought Questions

 **How could a** *compare* **("CMP") instruction be implemented? How is this different than a** *subtract* **("SUB")?**

**A "CMP" works the same as "SUB"** *except* **that the result of (A) – (***addr***) is** *not stored* **in the A register (i.e.,** *only* **the** *flags* **are affected)**



#### Reading Assignment: *Meyer Supplemental Text*, pp. 50-64

Learning Objectives:

- **describe the changes needed to the instruction decoder/microsequencer in order to dynamically change the number of instruction execute cycles based on the opcode**
- 
- **compare and contrast the machine's asynchronous reset ("START") with the synchronous state counter reset ("RST")**
- **describe the operation of a stack mechanism (LIFO queue)**
- **describe the operation of the stack pointer (SP) register and the function of its control signals: ARS, SPI, SPD, SPA**
- **compare and contrast the two possible stack conventions: SP pointing to the top stack item vs. SP pointing to the top stack item**
- **describe how stack manipulation instructions (PSH/POP) can be added to the base machine architecture**
- **discuss the consequences of having an unbalanced set of PSH and POP instructions in a given program**

#### Reading Assignment: *Meyer Supplemental Text*, pp. 50-64

307

Learning Objectives, continued:

- **discuss the reasons for using a stack as a subroutine linkage mechanism: arbitrary nesting of subroutine calls, passing parameters to subroutines, recursion, and reentrancy**
- **describe how subroutine linkage instructions (JSR/RTS) can be added to the base machine architecture**
- **analyze the effect of changing the stack convention utilized (SP points to top stack item vs. next available location) on instruction cycle counts**

### Outline

- **Modifications to state counter and instruction decoder to accommodate instructions with multiple execute cycles**
- **Introduction of a stack mechanism**
- **Use of a stack to implement "push" (PSH) and "pop" (POP) instructions**
- **Identification of micro-operations that can be overlapped**
- **Modifications to the system architecture necessary to implement subroutine linkage instructions**
- **Implementation of subroutine "call" (JSR) and "return" (RTS) instructions**

#### State Counter Modifications

- **All of the "simple computer" instructions discussed thus far have required only two states: a fetch cycle followed by a** *single* **execute cycle**
- **Many "real world" instructions require more than a single execute state to achieve their desired functionality**
- **We wish to extend the state counter (and the instruction decoder) to accommodate instructions that require** *multiple* **execute cycles (here, up to** *three***)**

### State Counter Modifications

- **In the process of adding this capability, we want to make sure our original "shorter" instructions do not incur a "penalty" (i.e., execute** *slower***)**
- **To accomplish this, we need to design the state counter so that the number of execute cycles can be** *dynamically changed* **based on the opcode of the instruction being executed**
- **To provide up to three execute cycles, we will replace the state counter flip-flop with a** *two-bit binary counter* **that has a** *synchronous reset* **input (in addition to an asynchronous clear)**

#### State Counter Modifications

#### **The "state names" will now be:**

- **S0 (fetch)**
- **S1 (first execute)**
- **S2 (second execute)**
- **S3 (third execute)**
- **We will also add a new system control signal "RST" (connected to the synchronous reset of the binary counter) that will be asserted on the** *last execute state* **of each instruction, thereby** *synchronously resetting* **the state counter to zero (so that the next cycle will be a "fetch")**









**320**

Q1. The state counter in the "extended" machine's instruction decoder and micro-sequencer needs both a synchronous reset (RST) and an asynchronous reset (ARS) because:

- A. we want to make sure the state counter gets reset
- B. the ARS signal allows the state counter to be reset to the "fetch" state when START is pressed, while the RST allows the state counter to be reset when the last execute cycle of an instruction is reached
- C. the RST signal allows the state counter to be reset to the "fetch" state when START is pressed, while ARS allows the state counter to be reset when the last execute cycle of an instruction is reached
- D. the state counter is not always clocked
- E. none of the above

Q2. Adding a **third bit** to the state counter would allow execute states:

- A. 3
- B. 5
- C. 7
- D. 8

**319**

E. none of the above

#### Stack Mechanism

- **Definition: A** *stack* **is a** *last-in, first-out* **(LIFO) data structure**
- **Primary uses of stacks in computers:**
	- **subroutine linkage**
		- *saving return address*
		- *parameter passing*
	- **saving machine** *context* **(or** *state***) especially when processing** *interrupts* **or** *exceptions*
	- **expression evaluation**

### Stack Mechanism

- **Conventions:**
	- **the stack area is generally placed at the "top" of memory (i.e., starting at the** *highest address* **in memory)**
	- **a stack pointer (SP) register is used to indicate the address of the** *top stack item*
	- **stack** *growth* **is toward** *decreasing* **addresses (note that this is in contrast to** *program growth***, which is toward**  *increasing addresses***)**
	- **Note: An** *alternate* **convention for the stack could also be used, namely, to have the SP register point to the**  *next available location*



### Illustration of Stack Growth

**After first item pushed onto stack:**





















#### Adding Stack Manipulation Instructions

- **The most common stack operations are "push" and "pop" – here, the value that will be** *pushed* **and** *popped* **is the A register**
	- **PSH save value in A register on stack**
		- **Step 1:** *Decrement SP register*
		- **Step 2: Store value in A register at the location pointed to by SP register**
	- **POP load A with value on stack**
	- **Step 1: Load A register from memory location pointed to by SP register**
	- **Step 2:** *Increment SP register*

**Note: The PSH and POP instructions can be used to**   $implement$  expression expression experts

### Adding Stack Manipulation Instructions

 **The opcodes that will be used for PSH and POP are as follows:**



### Adding Stack Manipulation Instructions

- **At first glance, it would appear that** *two execute cycles* **are required to implement both the PSH and POP instructions**
- **As** *astute computer engineers***, however, we always need to be on the lookout for operations that can be**  *overlapped* **(subject, of course, to the "rules" we learned earlier):** 
	- *Only one device is allowed to drive a bus during any machine cycle (i.e., "bus fighting" must be avoided)*
	- *Data cannot pass through more than one (edge-triggered) flip-flop or latch per cycle*

**Instruction Decoder** Program and Micro-Sequencer Counter Start <sup>O</sup>  $Clock$ Opcode Addre  $SP = SP-1$ **Bus Instruction** Address Register **Memory** Flags ALU  $\frac{a}{6}$ Data **Add** Data Bus Τ PSH: Step 1









# © 2019 by D. G. Meyer 57

### Adding Stack Manipulation Instructions

 **Implementation of POP requires only** *one* **execute cycle:** – **first execute cycle: output value of SP on address bus (SPA), enable a memory read operation (MSL and MOE), tell the ALU to load the A register with the value on the data bus (ALE and ALX), and tell the SP register to increment\* (SPI)**

 $\mathbf{F}$ : The SP register will be incremented *after* the  $\mathbf{F}$ **A register is loaded with the contents of the location pointed to by the SP register, i.e., the SP increment is** *overlapped* **with the fetch of the next instruction**

Adding Stack Manipulation Instructions **Modified system control table: Note: Recall that the RST signal is used to** *synchronously reset* **the state counter on the final example Decoded State Instruction Mnemonic**  $\mathbf{g}$ **MOE MWE PCC POA IRL**  $\mathbf{S}$ **AOE ALE ALX ALY** <u>ទី</u> <u>ទី</u> **SPA RST S0 – H H H H H S1 HLT L LL L S1 LDA** addr **H** H H H H **S1 ADD addr HH H H H S1 SUB addr HH H H H H S1 AND addr HH H HHH H S1 STA addr H H HH H S1 PSH H S1 POP HH HH H HH S2 PSH H H H HH**







**345**

### Adding Subroutine Linkage Instructions

- *Why use a stack as a subroutine linkage mechanism?*  **There are several important capabilities that a stack affords:**
	- *arbitrary nesting* **of subroutine calls**
	- *passing parameters* **to subroutines**
	- *recursion* **(the ability of a subroutine to call itself) – made possible by passing parameters via the stack**
	- *reentrancy* **(the ability of a code module to be shared among quasi-simultaneously executing tasks) – made possible by storing temporary local variables on the stack**

#### Adding Subroutine Linkage Instructions

 **The opcodes that will be used for JSR ("jump to subroutine") and RTS ("return from subroutine") are as follows:**







#### Adding Subroutine Linkage Instructions

- **The** *astute* **student will realize at this point that the program counter needs to be** *modified* **– a way to save its value on the stack, and subsequently restore it, must be provided**
- **Two new PC control signals need to be implemented:** – **POD: output value of PC on data bus** – **PLD: load PC with value on data bus**













### Adding Subroutine Linkage Instructions

- **Implementation of JSR requires** *three* **execute cycles:**
	- **first execute cycle: decrement SP register (SPD)**
	- **second execute cycle: output "new" value of SP on address bus (SPA), enable a memory write operation (MSL and MWE), and tell the PC register to output its value on the data bus (POD)**
	- **third execute cycle: tell IR register to output its operand field on the address bus (IRA), and tell the PC register to load the value on the address bus (PLA)**





#### Adding Subroutine Linkage Instructions

- **At first glance, it would appear that** *two execute cycles* **are required to implement the RTS instruction**
- **As** *astute* **students, however, we always need to be on the lookout for operations that can be** *overlapped* **(subject, of course, to the "rules" we learned earlier):**
- *Only one device is allowed to drive a bus during any machine cycle (i.e., "bus fighting" must be avoided)*
- *Data cannot pass through more than one (edge-triggered) flip-flop or latch per cycle*



 **Implementation of RTS requires only** *one* **execute cycle:** – **first execute cycle: output value of SP on address bus (SPA), enable a memory read operation (MSL and MOE), tell the PC to load the value on the data bus (PLD), and tell the SP register to increment\* (SPI)**

**\*Note: The SP register will be incremented** *after* **the PC register is loaded with the contents of the location pointed to by the SP register, i.e., the SP increment is**  *overlapped* **with the fetch of the next instruction**







**369**

# Q2. If a program contains **more RTS instructions than JSR instructions,** the following is likely to occur:

- A. stack overflow (stack collides with end of program space) B. stack underflow (stack collides with beginning of program
- space) C. program counter overflow (program counter wraps to beginning of program space)
- D. program counter underflow (program counter wraps to end of program space)
- E. none of the above

### Fun Things to Think About...

- **What kinds of new instructions would be useful in writing "real" programs?**
- **What new kinds of registers would be good to add to the machine?**
- **What new kinds of addressing modes would be nice to have?**
- **What would we have to change if we wanted "branch" transfer-of-control instructions instead of "jump" instructions?**

**These are all good reasons to "continue your 'digital life' beyond this course"! 370**