CS 458 Notes

Table of Contents

1 CS 458

1.1 Day 1 (May 31, 2017)

  • Class discusses single processor system, not Fifth generation.
  • Chapter 1 handout
  • Discuss the chip manufacturing process

1.1.1 Historical Survey

  1. Fst Gen: Vac Tube Tech (1945-1953)
    1. ENIAC - Electronic Numerical Integration Cmptr
      • Made of 1800 vaccum tubes & 1500 relays
      • Performed 5000 addition or subtraction per second
      • used 1000 bits of core memory
      • power usage: 170kw of power (v hot)
      • wired up for specific computations.
        • It had no programming or operating system.
        • The idea of having a "program stored" in memory is attributed to "von neumann" where and the originators of ENIAC designed the first program-stored computer
  2. Second Generation: Transistor Technology (1954-65)
    • Made of transistors
      • Source, Drain, Ground
      • G=0 - open, G=1 - closed
    • Vaccuum tube replaced by transistor As a result we see computer becomes smaller, faster, uses less power, & is more reliable (does not fail as often).
  3. Third Generation: Integrated Circuits Technology (1965-1980)
    • Also called Microchips or ICs
    • Early ICs allowed a dozen transistors on a single chip
    • Impact was computers became faster, more reliable, smaller, used less power.
  4. Fourth Generation: VLSI Technology (1980-now)
    • The trend to include more transistors on the same microchip microchip continued over time for making faster, more reliable, smaller computers that also used less power.
    • This is because each gate/circuit has propagation time (delta t). So placing them closer to one another reduces this delay.
  5. Fifth Generation: All non-von neumann parallel archictecture.
    • Examples
      • Multicore Archictecture
      • grid comuting
      • and so on
    • System made up of many processors, memory modules, and I/O devices
    • Not just a simple bus, but an interconnection network
  6. Beyond Fifth Generation: Low power processor design
    • power = Ps + Pd
      • V = voltage, F = frequency, C = capacitance, I = current
      • static power = Ps = V*I
      • Dynamic power = Pd = C*V2*F

1.1.2 Von neumann Architecture

  • Data line vs control line
  • Bus
  • Parts
    • CPU
      • Control Unit
        • Has most/all the control signals (control lines), so it directs all the components to do things like load/store/input/output device control
      • ALU (+-*/)
    • Input
    • Output
    • Memory (load/store into CPU)

1.1.3 Integrated Circuits

  • (VLSI microchips) Are cut from silicon wafers, circuits arranged in a grid.
  • Classes of ICs (sorted by density)
    • SSI (Small Scale Integration): 10-100 transistors on a microchip, 1x1 inch
    • MSI (Medium Scale Integration): 100-1000 transistors on a microchip
    • LSI (Large Scale Integration): 1000 to 10000 transistors on a microchip
    • VLSI (Very Large Scale Integration): More than 10000 transistors on a microchip
    • ULSI (Ultra Large Scale Integration): More and more transistors
  • Moore's Law
    • Intel founder - Gordon Moore
    • Stated number of transistors on a microchip doubles every 18 months.

1.1.4 Digital Logic Review

  • Logical gates
    • AND F = A*B
    • OR F = A+B
    • NOT F = ~A
    • NAND F = ~(A*B)
    • NOR F = ~(A+B)
    • XOR F = A⊕B
    • XNOR F = ~(A⊕B)
  • Circuits
    • Decoders (take binary code set one output bit line according to code)
      • 2x4 means 2 input, 4 output
      • n inputs
      • 2n outputs
    • Encoders (opposidet of decoder)
      • 2n inputs
      • n outputs
      • At any time only one of the inputs is one
    • Multiplexers
      • 2n inputs
      • n select lines
      • one output
    • Demultiplexor
      • Single input
      • 2n outputs
      • n select lines
  • Flip-Flops
    • A flip-flop stores one bit of information
    • Types
      • D Flip-Flop
        • D input
        • Q and Q' outputs
        • CP - clock pulse
        • Clear and inputs
          • When P=1 then Q=1
          • When C=1 Q=0
      • T Flip-Flop
        • T Input
        • Q and Q' outputs
        • CP - clock pulse
      • SR Flip-Flop
        • S, R inputs
        • Q and Q' outpus
        • CP - clock pulse
      • JK Flip-Flop
        • J, K inputs
        • Q and Q' outputs
        • CP - clock pulse
  • Registers
    • A n-bit register stores a n-bitt data.
    • Types
      1. Register with parallel input & parallel output Loads and stores all at once-
      2. Register with serial input & serial output (shift register) Moves lower bit to higher bit on every clock pulse
  • Memory
    • Simply: made up of a decoder that selects the address line from the address registers (AR), in conjunction with data registers (DR)
    • Types
      1. ROM - Read Only Memory
      2. RAM - Random Access Memory

1.2 Day 2 (June 05, 2017)

1.2.1 RAM

  • Memory Cell A memory cell stores one bit of information.
  • Read/write memory
  1. RAM Types
    1. Dynamic RAM
      • Denser (more bits per chip)
      • Cheaper
      • Slower
      • Requires refresh logic for periodic refresh. Has a capacitance that needs to be charged periodically. Adds to the slowness.
      • Used to make main memory
    2. Static RAM
      • Less dense
      • Expensive
      • Faster
      • Used for cache memory

1.2.2 Applications of ROM and RAM

  • N.B. DRAM uses one transistor, SRAM uses four transistors.
  1. ROM
    • System software
    • Things that won't change over time
  2. RAM
    • User program and data is stored here.
    • Things that will change over time

1.2.3 Timing

  • Memory Access Time (ta) - The time between initiation of a memory read or write request to the availability of the data
  • Memory Cycle Time (tc)
  • Memory Recovery Time (tc)
  • Memory Bandwidth w=1/tc (number of memory reads or writes per second)

1.2.4 Chapter 4 - Register Transfer Language (RTL)

  • RTL is part of HDL (Hardware Description Language)
  • Used to describe a hardware block diagram at the register level (level above logic gate level) (level below system level)
  • Hardware Description Languages
    • Verilog HDL
    • VHDL
  • To describe a system at register level
    1. Identify its registers and their functions Examples:
      • PC (program counter)
      • SP (stack pointer)
      • et al.
    2. Identify micro-operations performed on those registers Example: PC <- PC+1 (increment)
    3. Identify control functions that initiate those micro-operations Example: control signal (c): PC <- PC+1 When C=1, then increment PC
  • Register Transfer modes
    1. Parallel mode P: A <- B (when control signal P=1, then transfer content P=1, then transfer content of B into A.) (a.k.a. mov)
    2. Serial mode Refer to notes
  • Logical shift operations
    1. Logical Shift Left Operation L: Shl A When control signal L=1, then do logical shift left of register A
    2. Logical Shift Right Operation R: Shr A When contral signal R=1, then do logical shift right of register A
    3. Rotate left (Circular shift left) operation q: Cil A When control signal q=1, then rotate left the contents of register A
    4. Rotate right (Circular shift right) operation q: Cir A When control signal q=1, then rotate right contents of register A
  • Tri-State Device
  • Bus Transfer
    • Set of lines between registers
    • Often uses a common set of lines to connect several registers
    • Refer drawing
    • Bus <- Register transfer
      1. Through Tri-State devices
      2. Through multiplexors
    • Register <- Bus transfer
      1. Through Decoders
      2. Through Demultiplexor
  • Arithmetic Operations
    • Addition P: EA <- A+B (when control signal P=1, then EA=A+B E is a carry bit register.
  • Logical Operations Important: at any time, only one of the control signals can be one, but all can be zero.

    P1: A<-0 P2: A<-not(A) P3: A<-xor(A,B) P4: A<-A*B

  • Memory Operations
    1. Read operation
      • R: MBR<-M or R: MBR<-M[MAR] When read signal R=1, then transfer contens of memory word into MBR.
      • MAR - Memory address register
      • MBR - memory buffer register
    2. Write operation
      • W: M<-MBR or W: M[MAR]<-MBR When W=1, then write into memory word.
  • Generation of Control Functions

    Example: Increment Program Counter (PC) when (R=0 and T1=1) or (F=1 and T2=0)

    not(R)*T1+F*not(T2): pc <- pc+1

  • Vonn Neumann Architecture

    We talked about the purpose of program counter and how the system loads the word from memory at pc into an instruction register. And then the control unit sends out control signals.

1.3 Day 3 (June 07, 2017)

1.3.1 Instruction Set

  • Properties
    1. Complete Instruction set should be complete such that can write an assembly to evaluate any function
    2. Efficient Instruction set should be efficient such that frequently needed instructions can be implemeted by one or few instructions
    3. Compatible Instruction Set should be complete such that old programs can still on on new versions of the processor
  • Types In order for ISA to be complete it must implement most of the following instructions. Refer to handout on D2L.
    1. Data transfer instructions Move data from and to memory, between registers
      • move
      • load
      • store
    2. Arithemtic Instructions Perform basic arithmetic operations
      • Add
      • Subtract
      • Multiply
      • Divide
    3. Logical Instructions These instructions perform boolean operations
      • AND
      • OR
      • NOT
      • NAND
      • NOR
      • XOR
    4. Program control Instructions These instructions change the control of flow in the program
      • Branch
        • Unconditional (goto)
        • Conditional (if-else,loop)
      • Subroutine calls
      • Flags
        • Carry
        • Sign Bit
        • Overfloppppw
        • Z-Flag
      • Do loops
      • if/then/else
    5. I/O Instructions
      • Input
      • Output
    6. No-operation
      • wait/delay loops
      • Debugging
    7. Special
      • Conversion
        • translate - translate value based on a table correspondence
        • Convert - convert contents of a word from one to another (packed decimal to binary)
  • RISC vs CISC
    • RISC - Reduced Instruction Set Computer
      • Only includes frequently needed instructions
      • Spends less time on decoding
    • CISC - Complete Instruction Set Computer
      • Includes frequently and less-frequently needed instructions
      • Spends more time on decoding, which means more time to run every instruction, which means lower throughput

1.3.2 Instruction Format

  • Fields
    1. Op-code Specifies the type of operation
    2. Mode It specifies the addressing mode (immediate, direct, indirect, register, PC-register, relative)
    3. Operand field(s) Gives data or address of the data in memory or a register
  • Types
    1. Three-address instruction format Three fields Ex: Op C,B,A; C=B Op A Ex: ADD C,B,A; C=B + A
      1. Opcode
      2. Mode
      3. Destination (C)
      4. Source2 (B)
      5. Source1 (A)
    2. Two-address format Two fields Ex: Op B,A; B=B Op A Ex: ADD B,A; B=B + A
      1. Opcode
      2. Mode
      3. Destination/Source2 (B)
      4. Source1 (A)
    3. One-address instruction format One field Ex: Op A; AC=AC Op A Ex: Add A; AC=AC + A N.B. AC is accumulator
      1. Opcode
      2. Mode
      3. Operand (A)
    4. Zero-address instruction format No fields Get the top two elements of the stack and manipulate them, store back onto stack Ex: Op; SP=SP-1 Op SP Ex: Add; SP=SP-1 + SP N.B. SP is stack pointer
      1. Opcode
  • Stack
    • Stack pointer (SP)
    • SP+/-1 refers to one before the first depending on architecture
  • Example X = A+B
    • Three-address ADD X,A,B (ADD|Mode|X|A|B)
    • Two-address MOV X,A (MOVE|MODE|X|A) ADD X,B
    • One-address LOAD A (LOAD|MODE|A) (into AC) ADD B (into AC) STORE X (from AC)
    • Cover zero-address at chapter 8

1.3.3 Instruction Encodings

N.B. N = number of instructions

  1. Fixed-length instructions Faster, used in more modern istruction sets
    • Vertical Format n = ceiling(log2(N))
      Instruction Opcode
      Add 00
      Sub 01
      Mult 10
      Div 11

      Run through a 2x4 decoder with the outputs the opcode, inputs the bits.

    • Horizontal Format Needs N bits, one bit per instruction
      Instruction Opcode
      Add 1000
      Sub 0100
      Mult 0010
      Div 0001
  2. Variable-length instructions This instruction decoding takes more time, because it has to decode sequentially.
    Instructions Opcode
    Add 1
    Sub 01
    Mult 001
    Div 0001

1.3.4 Addressing modes

  1. Implied mode The register/output is implied by the opcode
    AC

    Example: INC; AC=AC+1

  2. Immediate mode
    Opcode Mode Data

    Example: Load #5; AC=5

  3. Direct mode
    Opcode Mode Address

    N.B. M[x] means memory contents at address x Example: LOAD ADR; AC=M[ADR]

  4. Indirect Mode
    Opcode Mode Indirect Address

    Ex: LOAD @ADR; AC=M[M[ADR]] So you get an address from memory, then go to that address. Ex: LOAD @5; AC=M[M[ 5 ]]

  5. Register Mode
    Opcode Mode Register

    Example: LOAD R1; AC=R1

  6. Register Indirect Mode
    Opcode Mode Register

    Example: Load (R2) ; AC=M[R2]

  7. Index Register Mode N.B. Your arch needs an index register (RX). Displacement is added to index register to get memory address.
    Opcode Mode Displacement

    Example: LOAD +DISP(RX); AC=M[ RX + DISP ] Example: LOAD +5(RX); Ac=M[ RX + 5]

  8. Index Register Indirect Mode
    Opcode Mode Displacement

    Example: LOAD (DISP(RX)); AC=M[M[ RX + DISP ]] Example: LOAD (5(RX)); AC=M[M[ RX + DISP ]]

  9. PC Relative Mode Add PC and displacement.
    Opcode Mode Displacement

    LOAD DISP(PC); AC=M[DISP+PC]

  10. PC Relative Indirect Mode Add PC and displacement, get memory twice.
    Opcode Mode Displacement

    LOAD (DISP(PC)); AC=M[M[DISP+PC]]

1.3.5 Putting it all together

Example: Design a processor with the following features

  • Insructions: ADD, SUB, MULT, DIV
  • Addressing modes: register and direct
  • Registers: R0, R1, …, R7
  • Memory size: 1 KBytes
  1. Abstract Design Example

    Identify instruction format to be used (two, one, zero operand instructions). Sppecify the fields, number bits per field, opcodes and codes for modes.

    1. One-address instruction (13-bit)
      Opcode Mode Operand
      2 Bits 1 Bit 10 Bits

      Fixed length vertical format

      Instruction Opcode
      ADD 00
      SUB 01
      MULT 10
      DIV 11
      Addressing modes Binary code
      Register 0
      Direct 1
      Registers Binary Code
      R0 000
      R1 001
      R2 010
      R3 011
      R4 100
      R5 101
      R6 110
      R7 111

      Memory size: 1K words, 1024 words, 210, 10-bit memory address

      Ex: find binary code for following instruction

      • ADD R5
        Opcode Mode Operand
        00 0 0000000101
      • ADD 5; AC=M[ 5 ] + AC
        Opcode Mode Operand
        00 1 0000000101
    2. Two-address (24-bit)
      Opcode Mode Operand 1 Operand 2
      2 bits 2 bits 10 bits 10 bits
      • Add R4, R5; R4 = R4 + R5
        Opcode Mode Operand 1 Operand 2
        00 00 0000000100 0000000101
      • Add 4, 5; M[ 4 ] = M[ 4 ] + M[ 5 ]
        Opcode Mode Operand 1 Operand 2
        00 11 0000000100 0000000101
               
    3. Three-address (35-bit)
      Opcode Mode Operand 1 (Dest) Operand 2 (Src 1) Operand 2 (Src 2)
      2 bits 3 bits 10 bits 10 bits 10 bits
      • Add R3, 5, R1; R3 = m[ 5] + R1
        Opcode Mode Operand 1 (Dest) Operand 2 (Src 1) Operand 2 (Src 2)
        00 010 0000000011 0000000101 0000000001

1.3.6 Instruction Execution Cycle

  1. Fetch the instruction
  2. Decode instruction
  3. Calculate effective address(es)
  4. Fetch the operand(s) (data)
  5. Execute instruction wih data
  6. Decide next instruction address (then go to step #1)

1.3.7 Chapter 5 - Processor Design (concept into practice)

Example design

Registers:

  • PC (program counter), 12 bits
  • A (address register), 12 bits
  • IR (instruction register), 16 bits
  • TR (temporary data register), 16 bits
  • DR (data register), 16 bits
  • AC (general purpose & accumulator), 16 bits
  • INPR (input register), 8 bits
  • OUTR (output register), 8 bits

One address Instruction format

1 bit 3 bits 12 bits
15 14,13,12 11 to 0
I opcode address

Addressing modes:

Direct 0
Indirect 1

Size of memory: 4K words, 4 * 1024 words, 22*210, 212 = 12-bit memory address

Instruction cycle:

  1. Fetch cycle
  2. Indirect cycle
  3. Execute cycle
  4. Interrupt cycle (then go to #1)

Questions:

  1. What page is this handout on or is it available on D2L (chapter 8)
  2. Is this instruction set complete? no, need OR
  3. Are we missing some registers? we'll cover them as we get to them
  4. Notation of number next to line with line through it? means N number lines corresponding to source and dest hookups

1.3.8 Review last homework

Got handout. Refer to it.

Created: 2017-06-07 Wed 20:32

Emacs 25.2.1 (Org mode 8.2.10)

Validate