Decoupled Prediction and Fall Throughs

A BTB-based architecture has a low misprediction rate, but branches that miss in the BTB must still be predicted, and earlier architectures fall back on simple static prediction rules for them. In [1], Calder et al. look for better ways of handling these BTB misses than the default rules. They use the information in the PHT to predict such branches more accurately, avoiding some branch mispredict penalties. If a single pattern history register is used, as originally proposed by Pan et al. [11], the PHT can predict a branch whether or not it is in the BTB. Yeh et al. [13] compared this method, which they termed `GAg' (the name is also used here), with other prediction methods. They found that storing prediction registers in the BTB gave higher prediction accuracy, but they did not account for the difference between a misfetch and a misprediction.
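The single-history-register scheme described above can be sketched in a few lines. This is a minimal illustration, not the simulator used in [1] or [13]: the class name, the 2-bit saturating counters, their initial value, and the update order are all assumptions made for the example. What it shows is that the prediction depends only on the global history register, never on the branch address, which is why it stays available when the branch misses in the BTB.

```python
class GAgPredictor:
    """Hypothetical GAg sketch: one global history register indexes one PHT."""

    def __init__(self, history_bits: int) -> None:
        self.mask = (1 << history_bits) - 1
        self.history = 0  # global pattern history register (last outcomes, newest in bit 0)
        # One 2-bit saturating counter per history pattern.
        # Starting at 1 (weakly not-taken) is an assumption of this sketch.
        self.pht = [1] * (1 << history_bits)

    def predict(self) -> bool:
        """Predict taken when the selected counter is in a taken state (2 or 3)."""
        return self.pht[self.history] >= 2

    def update(self, taken: bool) -> None:
        """Train the selected counter, then shift the outcome into the history."""
        counter = self.pht[self.history]
        self.pht[self.history] = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor: GAgPredictor, outcomes: list) -> float:
    """Fraction of outcomes predicted correctly while training along the way."""
    correct = 0
    for taken in outcomes:
        correct += predictor.predict() == taken
        predictor.update(taken)
    return correct / len(outcomes)
```

With `history_bits=11` the table has 2048 counters and with `history_bits=12` it has 4096, which matches the GAg(11) and GAg(12) table sizes quoted in this section.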

In [1], Calder et al. give more detailed metrics for two organizations: the one found to have the best prediction accuracy in [13], PAs(6,16), and the simpler GAg method, which can use the pattern history register even when the branch is not in the BTB. In their simulation, a branch that is not in the BTB is predicted with a static backward-taken/forward-not-taken rule. The GAg method was simulated with the same history table size as the PAs method (GAg(11), 2048 entries) and with a larger table (GAg(12), 4096 entries). The sum of misfetched and mispredicted branches is higher for the GAg methods, which makes them look worse. However, the GAg methods misfetch more often than they mispredict, and a misprediction costs more than a misfetch. As a result, Calder et al. found that the branch execution penalty for their reference architecture was actually smaller for the GAg methods.
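The argument above is a weighted sum, and a small calculation makes it concrete. Everything numeric here is invented for illustration: the rates are not the results reported in [1], and the penalties of 1 cycle per misfetch and 4 cycles per misprediction are assumed values, as is the function name. The sketch only shows how a design with more total misfetches plus mispredictions can still pay a smaller penalty.

```python
def branch_execution_penalty(misfetch_rate, mispredict_rate,
                             misfetch_cycles=1.0, mispredict_cycles=4.0):
    """Average penalty cycles per executed branch (cycle costs are assumed)."""
    return misfetch_rate * misfetch_cycles + mispredict_rate * mispredict_cycles


# Invented rates as fractions of executed branches: (misfetch, mispredict).
designs = {
    "PAs-like": (0.02, 0.08),  # fewer total errors, but mostly mispredictions
    "GAg-like": (0.06, 0.05),  # more total errors, but mostly misfetches
}

for name, (misfetch, mispredict) in designs.items():
    total = misfetch + mispredict
    penalty = branch_execution_penalty(misfetch, mispredict)
    print(f"{name}: misfetch + mispredict = {total:.2%}, "
          f"penalty = {penalty:.2f} cycles per branch")
```

In this made-up case the GAg-like design has the higher combined error rate yet the lower penalty, because most of its errors are the cheaper kind.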

The PowerPC 604 uses a BTB-GAg organization very similar to the model described above. It has a 64-entry fully associative BTB that holds the target addresses of the most recently taken branches, and it uses a separate 512-entry PHT to predict the direction of conditional branches [2].
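A decoupled organization of this kind, with targets in a BTB and directions in a separate table, can be modelled roughly as follows. This is a simplified sketch and not a description of the PowerPC 604 pipeline: the class and function names are hypothetical, LRU replacement is an assumption, the direction prediction is passed in rather than read from a modelled PHT, and branches are assumed to be direct so that a cached target never goes stale. A misfetch is taken here to mean that the direction was predicted correctly but the target was not available from the BTB at fetch time.

```python
from collections import OrderedDict


class TakenBranchBTB:
    """Hypothetical fully associative BTB holding targets of recently taken branches."""

    def __init__(self, entries: int = 64) -> None:
        self.entries = entries
        self.targets = OrderedDict()  # pc -> target, least recently used first

    def lookup(self, pc):
        """Return the cached target, or None on a BTB miss."""
        target = self.targets.get(pc)
        if target is not None:
            self.targets.move_to_end(pc)  # mark as most recently used
        return target

    def record_taken(self, pc, target) -> None:
        """Insert or refresh a taken branch, evicting the LRU entry when full."""
        self.targets[pc] = target
        self.targets.move_to_end(pc)
        if len(self.targets) > self.entries:
            self.targets.popitem(last=False)


def classify(btb, pc, predicted_taken, actual_taken, actual_target) -> str:
    """Classify one branch as 'correct', 'misfetch', or 'mispredict'."""
    cached_target = btb.lookup(pc)
    if predicted_taken != actual_taken:
        outcome = "mispredict"  # wrong direction: the expensive case
    elif actual_taken and cached_target is None:
        outcome = "misfetch"    # right direction, target only known after decode
    else:
        outcome = "correct"
    if actual_taken:
        btb.record_taken(pc, actual_target)  # only taken branches are allocated
    return outcome
```

Because the direction comes from outside the BTB, a taken branch that misses in the BTB costs only a misfetch when the direction prediction is right, which is the effect the penalty comparison above relies on.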

Annamalai Ramanathan
Fri Apr 4 20:04:23 EST 1997