
PSA Cache

In the Predictive Sequential Associative Cache (PSA-cache), the mechanism used to select probe order is separated from the mechanism used to guide replacement. Each pair of cache blocks uses an MRU entry to implement LRU replacement, so the PSA-cache has the same miss rate as the MRU-cache and other caches that use LRU replacement. A steering bit table is used to guide data access. When fetching a cache line, the effective address is used to index into the actual cache. Similarly, a prediction index is used to select a particular steering bit. As Kessler et al. [6] indicated, steering bits need to be accessed prior to the cache access for any real benefit. Using the effective address to select a steering bit may lengthen the cache access time; arguably, if the effective address were available at an earlier stage, cache accesses would simply be initiated earlier in the pipeline. Since the effective address is not available earlier, the prediction index has to be computed from data that is available earlier and that approximates the effective address. Various sources could be used for such a calculation.

Separating the replacement mechanism from the prediction mechanism offers immediate benefits, even for the MRU-Cache design proposed by Kessler et al. [6]. Consider an 8 KByte cache split into two banks with 128 pairs of 32-byte lines. The MRU-Cache would use a 128-bit table to indicate the most recently used block in each pair. The PSA-cache uses a similar table to implement an LRU replacement policy; however, a much larger table can be used to determine the block that should be probed first when searching for an address. Each entry "steers" references to the appropriate cache block. With a 256-entry steering bit table (SBT), data references that alternate between the two blocks of a pair encounter no penalty in the PSA-cache, provided that they use different steering bit entries.

In certain situations, it is also useful to use a rehash bit in the PSA-cache. As in the CA cache, this bit is used to avoid examining another line when that line cannot possibly contain the requested address, but it is not used to guide the replacement policy, since the MRU bit provides more accurate information. Thus three data structures are used to implement three cache mechanisms. The steering bit table determines which block in a set should be probed first, increasing the number of references found during the first probe. The rehash bits reduce the number of probes, allowing misses to be started earlier or simply reducing the time the cache is busy, which is important for architectures that issue multiple loads per cycle. The MRU bit provides a true LRU replacement policy, improving the overall miss rate.
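The separation of probe order from replacement can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the simulator used for this work: the class name `PSACacheModel`, the helper `run_alternating`, and the two example addresses are hypothetical, the effective address is assumed as the prediction source, each pair is modeled as a two-element list, and rehash bits are omitted for brevity.

```python
LINE_BYTES = 32      # line size from the example above
NUM_PAIRS = 128      # 8 KByte / 32 bytes = 256 lines = 128 pairs


class PSACacheModel:
    """Toy model: steering bits pick probe order, MRU bits pick the victim."""

    def __init__(self, sbt_entries):
        self.tags = [[None, None] for _ in range(NUM_PAIRS)]  # None = invalid
        self.mru = [0] * NUM_PAIRS      # one MRU bit per pair (replacement only)
        self.sbt = [0] * sbt_entries    # steering bit table (probe order only)
        self.first_probe_hits = 0
        self.second_probe_hits = 0
        self.misses = 0

    def access(self, address):
        block = address // LINE_BYTES
        pair = block % NUM_PAIRS
        tag = block // NUM_PAIRS
        # Assumption: the effective address itself is the prediction source.
        entry = block % len(self.sbt)
        first = self.sbt[entry]
        second = 1 - first
        if self.tags[pair][first] == tag:
            self.first_probe_hits += 1
            used = first
        elif self.tags[pair][second] == tag:
            self.second_probe_hits += 1
            used = second
        else:
            self.misses += 1
            used = 1 - self.mru[pair]   # replace the least recently used block
            self.tags[pair][used] = tag
        self.mru[pair] = used           # remember the most recently used block
        self.sbt[entry] = used          # steer the next reference to this block


def run_alternating(sbt_entries, rounds=10):
    """Alternate between two addresses that map to the same pair."""
    cache = PSACacheModel(sbt_entries)
    for _ in range(rounds):
        cache.access(0x0000)   # block 0,   pair 0
        cache.access(0x1000)   # block 128, pair 0
    return cache.first_probe_hits, cache.second_probe_hits, cache.misses


print(run_alternating(128))  # (0, 18, 2): every hit needs a second probe
print(run_alternating(256))  # (18, 0, 2): every hit is found on the first probe
```

In this model the two addresses share one steering bit in the 128-entry table but use different entries in the 256-entry table, so the larger table removes the second probe while the miss count stays the same.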

The SBT is indexed by a predicted address, so the choice of prediction source is important in the PSA-cache. Some of the specific prediction sources are:

1. Effective Address. The effective address is the most accurate prediction source; however, there may not be enough time in some designs to compute the effective address and index the steering bits before the cache access completes. When using the effective address, the PSA-Cache is a simple extension to the MRU-Cache with improved performance from a larger steering bit table.

2. Register Contents and Offset. Computing the effective address involves a full add. Functions without carry propagation take less time, leaving time to index the steering bit table before the cache access completes. The exclusive-or of the register contents and the offset is used to form the prediction address.
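The two prediction sources above can be compared with a short Python sketch. The function names, the 32-bit address width, the 256-entry table, and the choice of index bits (the line-address bits just above the 32-byte line offset) are assumptions made for illustration; the text does not specify them.

```python
LINE_BYTES = 32           # assumed line size
SBT_ENTRIES = 256         # assumed steering bit table size
ADDR_MASK = 0xFFFFFFFF    # assumption: 32-bit addresses


def sbt_index_effective(base, offset):
    """Index from the true effective address (needs a full add)."""
    address = (base + offset) & ADDR_MASK
    return (address // LINE_BYTES) % SBT_ENTRIES


def sbt_index_xor(base, offset):
    """Index from base XOR offset (no carry propagation)."""
    predicted = (base ^ (offset & ADDR_MASK)) & ADDR_MASK
    return (predicted // LINE_BYTES) % SBT_ENTRIES


# No overlapping set bits: XOR equals the sum, so both indexes agree.
print(sbt_index_effective(0x1000, 0x20), sbt_index_xor(0x1000, 0x20))  # 129 129

# Overlapping set bits: the add carries and the XOR prediction differs.
print(sbt_index_effective(0x1030, 0x30), sbt_index_xor(0x1030, 0x30))  # 131 128
```

The XOR prediction matches the effective address whenever the register contents and the offset have no set bits in common; when a carry occurs, the two may select different steering bit entries, which is why XOR is the less accurate of the two sources.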

Prediction could also be based on the register number and offset, or on the instruction and previous references. Because these sources are less accurate, simulations were performed only for the effective address and XOR methods.



Annamalai Ramanathan
Fri Apr 4 19:37:16 EST 1997