# Instruction Execution

---

CS 130 // 2025-11-05

## Assignment 6 Reminder

- [Assignment 6](../../assignments/assignment-6/): do these too
    - `and`, `or`, `xor`, `andi`, `ori`, `xori`, `slt`, `slti`
    - Extra Credit: do `lw`, `sw`, `beq`, and/or `bne`
    - Or... design another instruction set and architecture of your choosing

# Datapath and Control Review

## Example: `addi`

```mips
addi $8, $0, 5
```

001000 00000 01000 0000000000000101
Using this picture, highlight every datapath line that is used and indicate the value it should carry.

Show the value each control line should have.
```mips
addi $8, $0, 5
```

001000 00000 01000 0000000000000101
- ALUop:
- ALUSrc:
- Branch:
- MemRead:
- MemWrite:
- MemtoReg:
- RegWrite:
- RegDst:
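As a quick check on the bit pattern shown for `addi $8, $0, 5`, the I-type layout (6-bit opcode, 5-bit `rs`, 5-bit `rt`, 16-bit immediate) can be packed in a few lines of Python. This is only a minimal sketch: `encode_i_type` is a hypothetical helper name, and the opcode constant is read straight off the bit pattern above.

```python
def encode_i_type(opcode: int, rs: int, rt: int, imm: int) -> str:
    """Pack I-type fields into 32 bits, returned as a space-separated bit string."""
    if not (0 <= opcode < 64 and 0 <= rs < 32 and 0 <= rt < 32):
        raise ValueError("opcode/register field out of range")
    if not (-(1 << 15) <= imm < (1 << 16)):
        raise ValueError("immediate does not fit in 16 bits")
    imm16 = imm & 0xFFFF  # negative immediates wrap to two's complement
    return f"{opcode:06b} {rs:05b} {rt:05b} {imm16:016b}"


ADDI = 0b001000  # opcode bits taken from the example above

# addi $8, $0, 5  ->  rt = 8 (destination), rs = 0 (source), imm = 5
print(encode_i_type(ADDI, rs=0, rt=8, imm=5))
# 001000 00000 01000 0000000000000101
```

Note that in the I-type format the destination register `$8` sits in the `rt` field, not in a separate `rd` field.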
# Supporting More Instructions: Load

```mips
lw $8, 4($9)
```

100011 01001 01000 0000000000000100
- ALUop:
- ALUSrc:
- Branch:
- MemRead:
- MemWrite:
- MemtoReg:
- RegWrite:
- RegDst:
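For `lw $8, 4($9)`, the ALU's job is to add the sign-extended 16-bit offset to the base register to form the memory address. The sketch below illustrates that one step in Python. The function names are hypothetical, and the value placed in `$9` (`0x10010000`) is made up purely for illustration.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret the low 16 bits of imm16 as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def lw_effective_address(word: int, regs: list[int]) -> tuple[int, int]:
    """Return (destination register number, memory address) for an lw word."""
    rs = (word >> 21) & 0x1F               # base register
    rt = (word >> 16) & 0x1F               # destination register
    offset = sign_extend16(word & 0xFFFF)  # 16-bit immediate
    address = (regs[rs] + offset) & 0xFFFFFFFF
    return rt, address


regs = [0] * 32
regs[9] = 0x10010000  # assumed base address, not from the slides

word = int("100011 01001 01000 0000000000000100".replace(" ", ""), 2)
rt, address = lw_effective_address(word, regs)
print(rt, hex(address))  # 8 0x10010004
```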
# Exercises: Store and Branch

```mips
sw $8, 4($9)
```

101011 01001 01000 0000000000000100
- ALUop:
- ALUSrc:
- Branch:
- MemRead:
- MemWrite:
- MemtoReg:
- RegWrite:
- RegDst:
```mips
beq $8, $9, 5 # branch ahead 5 instructions (counted from the next instruction, PC + 4)
```

000100 01000 01001 0000000000000101
- ALUop:
- ALUSrc:
- Branch:
- MemRead:
- MemWrite:
- MemtoReg:
- RegWrite:
- RegDst:
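The immediate in `beq` counts instructions, not bytes, and is relative to the instruction after the branch. A small sketch of the taken-branch target calculation, assuming 4-byte instructions and a made-up starting `pc` of `0x00400000` (`branch_target` is a hypothetical name):

```python
def branch_target(pc: int, imm16: int) -> int:
    """Target address of a taken beq/bne located at address pc."""
    # Sign-extend the 16-bit immediate
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    # Shift left by 2 to convert an instruction count to bytes, then add to PC + 4
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


pc = 0x00400000  # assumed address of the beq, for illustration only
print(hex(branch_target(pc, 5)))  # 0x400018
```

With an immediate of 5, the target is 5 × 4 = 20 bytes past `PC + 4`. This sum is computed by a separate adder in the usual single-cycle design, while the main ALU compares the two registers.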
# Performance Issues

## Performance Issues

- Longest delay determines clock period
- Some stages of the datapath are idle waiting for others to finish
- Can improve performance by **pipelining**

# Pipelining

#### Breaking down instruction execution

- Five stages:
    1. **IF**: Instruction Fetch - read it from instruction memory
    2. **ID**: Instruction Decode and Register Read - split instruction into parts, read register data
    3. **EX**: Execute - ALU calculates result
    4. **MEM**: Memory Access - read from or write to memory
    5. **WB**: Write back - put new data back into a register

## Pipeline Analogy

- Suppose you need to do four loads of laundry
- Each load of laundry needs to be
    1. Washed via the washing machine
    2. Dried via the dryer
    3. Folded
    4. Put away in the closet
- For simplicity, assume that each task takes 30 mins

## Pipeline Analogy

- How long does it take to complete four loads?
- One approach uses only one stage at a time and does nothing in parallel
- Notice that the washer is unused 3/4 of the time

## Pipeline Analogy

- Another approach is harnessing parallelism by running independent stages simultaneously
- How much of a speedup does this approach give us?
    + $8/3.5 = 2.3\times$ speedup (8 hours one load at a time vs. 3.5 hours overlapped)
    + $2n/0.5n = 4\times$ speedup if running continuously
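The laundry numbers can be reproduced with a short calculation. This sketch assumes every stage advances in lockstep at the pace of the slowest stage, which matches the 30-minute assumption in the analogy. The function names are hypothetical.

```python
def sequential_time(n_jobs: int, stage_times: list[float]) -> float:
    """Total time when each job finishes every stage before the next job starts."""
    return n_jobs * sum(stage_times)


def pipelined_time(n_jobs: int, stage_times: list[float]) -> float:
    """Total time when stages overlap and all advance at the slowest stage's pace."""
    slot = max(stage_times)
    # The first job needs len(stage_times) slots; each later job adds one slot.
    return (len(stage_times) + n_jobs - 1) * slot


stages = [0.5, 0.5, 0.5, 0.5]  # wash, dry, fold, put away (hours)

seq = sequential_time(4, stages)
pipe = pipelined_time(4, stages)
print(seq, pipe, round(seq / pipe, 1))  # 8.0 3.5 2.3

# As the number of loads grows, the speedup approaches the number of stages (4)
n = 1_000_000
print(sequential_time(n, stages) / pipelined_time(n, stages))
```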
## Pipelined Datapath

- Five stages:
    1. **IF**: Instruction Fetch
    2. **ID**: Instruction Decode
    3. **EX**: Execute
    4. **MEM**: Memory access
    5. **WB**: Write back

## Pipeline Performance

- Assume time for stages is:
    + `$100\text{ps}$` for register read/write
    + `$200\text{ps}$` for other stages

## Without a Pipeline

- Why must the clock period be set to `$800\text{ps}$` when some instructions like `beq` could be completed in `$500\text{ps}$`?
    + Clock speed is limited by the **slowest** instruction: `lw`

## With a Pipeline

- How much of a speedup does this approach give us?
    + `$2400/1400 = 1.7\times$` speedup for three instructions (`$3 \times 800\text{ps}$` unpipelined vs. `$(5 + 2) \times 200\text{ps}$` pipelined)
    + `$800n/200n = 4\times$` if running continuously

## Pipeline Performance

- Does using a pipeline speed up the execution of **individual** instructions?
    + No, it slows them down from `$800\text{ps}$` to `$1000\text{ps}$` (five stages at `$200\text{ps}$` each)
    + Performance benefits come from increased **throughput** due to the parallelism

## Why MIPS is Good for Pipelining

- All MIPS instructions are the **same length**
    + Easy to fetch instruction in cycle 1
    + Easy to decode instruction in cycle 2
- MIPS has only **a few instruction formats**
    + Registers will always be in same location
    + Easy to decode instructions
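To illustrate the "same location" point, the sketch below slices the same bit positions out of all four example instructions from this lecture without first checking which instruction it is. `decode_fields` is a hypothetical helper. The `imm` field is only meaningful for I-type instructions such as these (R-type instructions use those low 16 bits for other fields).

```python
def decode_fields(word: int) -> dict:
    """Extract the fields that sit at fixed bit positions in a 32-bit MIPS word."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31-26
        "rs": (word >> 21) & 0x1F,      # bits 25-21
        "rt": (word >> 16) & 0x1F,      # bits 20-16
        "imm": word & 0xFFFF,           # bits 15-0 (I-type only)
    }


# Bit patterns copied from the examples earlier in the lecture
examples = {
    "addi $8, $0, 5": "001000 00000 01000 0000000000000101",
    "lw $8, 4($9)":   "100011 01001 01000 0000000000000100",
    "sw $8, 4($9)":   "101011 01001 01000 0000000000000100",
    "beq $8, $9, 5":  "000100 01000 01001 0000000000000101",
}

for asm, bits in examples.items():
    word = int(bits.replace(" ", ""), 2)
    print(f"{asm:16} {decode_fields(word)}")
```

Because the register fields never move, the register file can begin reading `rs` and `rt` during decode, before the control unit has finished working out what the instruction is.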