In this lab series you will design a level 1 cache hierarchy consisting of a direct-mapped instruction cache and a two-way set-associative data cache. The caches provide fast access to addresses that exhibit the properties of ‘locality’. Each cache will have a data store (the cache itself) and a control component that operates the data store.

Design

lab9__1.png
Figure 1. Cache

The cache diagram shows the general operation of a cache. The address (addr) is used to index into the data array; if an item is found to match the tag, a hit is registered. If no item in the data store matches the tag, the control component must retrieve the item from RAM. While the item is being retrieved, the cache tells the processor that it does not yet have the requested item by not registering a hit. Once the processor sends the halt signal, the cache should flush all dirty values back to memory to bring it up to date.

Tip
The cache size refers to the amount of data stored. It does not include tag bits or state bits.

Please refer to the lecture slides for the cache structure and terminology. A brief glossary is provided here:

  • Block : The smallest granularity of data transferred between the cache and memory. A block may contain a single word or multiple words. Each time you fill the cache from memory or write back to memory, you must do so for the whole block.

  • Frame : A frame contains a block, tag, a valid bit and possibly a dirty bit.

  • Set : The cache entries corresponding to a particular index. A set may contain multiple frames, depending on the set associativity of the cache.

The frames in a set share only the same index; everything else is independent. This means you must compare the tag from the input address against the tag fields of all frames in the indexed set.
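
As a behavioral illustration only (not the required HDL, and with hypothetical field and function names), a set lookup that compares the input tag against every valid frame in a set can be sketched in Python:

```python
# Hypothetical sketch of a set lookup: the input tag is compared against
# every valid frame in the indexed set. Field names are illustrative.

def lookup(cache_set, tag):
    """Return the matching way index on a hit, or None on a miss."""
    for way, frame in enumerate(cache_set):
        if frame["valid"] and frame["tag"] == tag:
            return way
    return None

# Example two-way set: both frames share the same index but hold
# independent tags, valid bits, and data.
two_way_set = [
    {"valid": True, "tag": 0x3A, "data": [0, 0]},
    {"valid": True, "tag": 0x1F, "data": [0, 0]},
]
```

On a hit the matching way is used; on a miss the control component would begin a fill from memory.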

The value of the address is used to access the cache; its format can be seen in the following diagram.

lab9__2.png
Figure 2. Address fields

The lower two bits are the byte offset, which selects specific bytes within a word. Since the smallest granularity our processor operates on, for both data and instructions, is a word, the lower two bits will always be 00 and can be ignored.

Moving from right to left, the block offset selects which word of the cache block to use. The index selects the correct set of the cache. Finally, the tag is compared against the stored tags in that set to register a hit or a miss.
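
As a hedged illustration, the field extraction can be modeled in Python. The bit widths below are an assumption derived from the data-cache specification later in this document (8 sets → 3 index bits, 2 words per block → 1 block-offset bit, 2 byte-offset bits, leaving a 26-bit tag); they are for illustration only, not a required implementation:

```python
# Hypothetical sketch: splitting a 32-bit address into the fields of
# Figure 2, using widths derived from this lab's data-cache spec.

BYTE_OFF_BITS = 2   # always 00 for word-aligned accesses
BLOCK_OFF_BITS = 1  # two words per block
INDEX_BITS = 3      # eight sets

def split_addr(addr):
    """Return (tag, index, block_offset) for a word-aligned address."""
    word = addr >> BYTE_OFF_BITS
    block_off = word & ((1 << BLOCK_OFF_BITS) - 1)
    index = (word >> BLOCK_OFF_BITS) & ((1 << INDEX_BITS) - 1)
    tag = word >> (BLOCK_OFF_BITS + INDEX_BITS)
    return tag, index, block_off

tag, index, off = split_addr(0x0000004C)  # -> (1, 1, 1)
```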

Design Specification

The dcache and the icache will require state machines. The dhit and ihit signals, however, should be asynchronous (combinational): a hit in your design should not take more than a cycle to report.

You are required to use the following interfaces and packages:

Packages
  • CPU types: This contains data types for your processor design.

Interfaces
  • CPU Ram: Connects your cpu to ram.

  • Datapath Cache: Connects your datapath to the caches.

  • Cache Control: Connects your caches to memory_control.

  • System: Connects the system to the testbench and fpga wrapper.

The use of these packages and interfaces is required in your design. They may not be modified in any way by you, the student.

Important
Only the course staff may make changes to the interfaces and provided types. Should changes be necessary, you will be instructed to pull from the git repository to merge these changes.

A few policies control the operation of a cache; you will implement the following:

Policies
  • The write policy will be ‘write-back’.

  • The allocation policy will be ‘allocate on miss’.

  • The replacement policy will be ‘least recently used’.
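
To make the interaction of these three policies concrete, here is a hypothetical Python behavioral model of a single two-way set; the names and structure are illustrative only and are not the required SystemVerilog:

```python
# Behavioral sketch of the three policies for one two-way set:
# write-back (writes set a dirty bit; memory is updated only on eviction),
# allocate on miss (read and write misses both bring the block in), and
# LRU replacement (the least recently used way is the victim).

class TwoWaySet:
    def __init__(self):
        self.frames = [
            {"valid": False, "dirty": False, "tag": None, "data": None}
            for _ in range(2)
        ]
        self.lru = 0  # way to evict next (least recently used)

    def access(self, tag, write, memory):
        # Hit: use the matching way; the other way becomes LRU.
        for way, frame in enumerate(self.frames):
            if frame["valid"] and frame["tag"] == tag:
                if write:
                    frame["dirty"] = True    # write-back: defer memory update
                self.lru = 1 - way
                return "hit"
        # Miss: allocate on miss, evicting the LRU way.
        victim = self.frames[self.lru]
        if victim["valid"] and victim["dirty"]:
            memory[victim["tag"]] = victim["data"]   # write back dirty block
        victim.update(valid=True, dirty=write, tag=tag,
                      data=memory.get(tag))
        self.lru = 1 - self.lru
        return "miss"
```

Note that a dirty block reaches memory only when it is evicted (or, per the specification above, when the processor halts).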

Cache Specifications
Instruction Cache
  • 512 bits in size.

  • Direct mapped.

  • One word per block.

Data Cache
  • 1 Kbit (1024 bits) in size.

  • Two-way set-associative.

  • Two words per block.

  • Invalidate cache blocks on halt.

  • 32-bit hit counter, written to address 0x3100, to validate against the simulator. Your design will not pass without writing the hit counter.

    • Only initial hits should be counted; a miss that later becomes a hit is still considered a miss.

    • Do not count the same hit multiple times.

    • This count should be written to address 0x3100.
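
As a sanity check on these specifications (assuming a 32-bit word, and counting data bits only per the earlier tip), the implied geometry can be computed:

```python
# Sketch of the cache geometry implied by the specs above.
# Assumes a 32-bit word; sizes count data bits only.

WORD_BITS = 32

def geometry(size_bits, words_per_block, ways):
    """Return (frames, sets) for a cache of the given shape."""
    block_bits = words_per_block * WORD_BITS
    frames = size_bits // block_bits
    sets = frames // ways
    return frames, sets

icache = geometry(512, 1, 1)    # direct-mapped, one word per block
dcache = geometry(1024, 2, 2)   # two-way, two words per block
```

This works out to 16 frames (and 16 sets) for the instruction cache, and 16 frames in 8 two-way sets for the data cache.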

Setup

For this design you will branch from your pipelined processor.

To do this issue the following commands:

git checkout pipeline

git checkout -b caches

Note
There is no cache branch to pull from on the course repo.

You should now have your processor files for use.

Files

The following files contain the package and interfaces that are required in this design.

  • packages: cpu_types_pkg.vh

  • interfaces: cpu_ram_if.vh, datapath_cache_if.vh, cache_control_if.vh, system_if.vh

You should also have the following component files. These files are templates to guide you in the design of your processor. They contain no functionality.

Processor Components
  • caches.sv

  • memory_control.sv

Testing

Use sim -c to simulate the core with caches. This will generate the correct memsim.hex file to compare against. Add -t for a trace.

For testasm, use testasm -c for source and testasm -c -s for mapped.

Deliverables

For the first installment, you must have the block diagrams for both caches and the HDL implementation, as well as a testbench with documented test cases. You can find the evaluation sheet for lab 8 here.

The second installment requires you to integrate the caches into your pipelined processor design. You can find the evaluation sheet for lab 9 here.

The deliverables for the cache labs:

  • Block diagram of your caches.

    • Electronically generated with diagramming software.

    • All signals and detail present for your design.

  • HDL code for both instruction and data caches.

  • Testbench for both caches.

    • Document test cases in testbench.

    • Comprehensive test cases for design usage.

  • Completed evaluation sheets for the respective labs.

  • Electronic submission of your design.

ABET Objective

Failure to satisfy the ABET Objective for this lab (Lab Objective 4) via at least one of the following methods will result in failing this objective.

  • Completion of the appropriate lab 9 sign-offs (on-time)

  • Remediation of the appropriate lab 9 sign-offs by the end of week 12