Micro-Architecture

 

Software for simulations

Branch predictors packaged for the 1st Championship Branch Prediction

 

Branch predictors packaged for the 2nd Championship Branch Prediction

Our research in architecture covers

 

Cache Architecture

Skewed associative caches

The skewed-associative cache is a new organization for multi-bank caches. Skewed-associative caches have been shown to have two major advantages over conventional set-associative caches. First, at equal associativity, a skewed-associative cache typically has the same hardware complexity as a set-associative cache but a lower miss ratio. This is particularly significant for BTBs and L2 caches, where a significant fraction of the misses are conflict misses even on 2-way set-associative caches. Second, the behavior of skewed-associative caches is quite insensitive to the precise data placement in memory. Recently, we have shown that the skewed-associative structure offers a unique opportunity to build TLBs supporting multiple page sizes.
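The indexing idea can be illustrated with a minimal sketch. It assumes a 2-bank cache with 64-byte lines and 16 sets per bank, and the hash used for the second bank (a rotate-and-XOR of the upper address bits) is hypothetical, chosen only for illustration; it is not the mapping function from the publications.

```python
LINE_BITS = 6                  # 64-byte lines (assumed)
INDEX_BITS = 4                 # 16 sets per bank (assumed)
NUM_SETS = 1 << INDEX_BITS
SET_MASK = NUM_SETS - 1


def conventional_index(addr, bank):
    """Set-associative cache: every bank is indexed with the same address bits."""
    return (addr >> LINE_BITS) & SET_MASK


def skewed_index(addr, bank):
    """Skewed cache: each bank uses a different mapping (hypothetical hash)."""
    block = addr >> LINE_BITS
    low = block & SET_MASK
    high = (block >> INDEX_BITS) & SET_MASK
    if bank == 0:
        return low             # bank 0 keeps the conventional mapping
    # Other banks XOR the low bits with a rotation of the next-higher bits.
    rotated = ((high << bank) | (high >> (INDEX_BITS - bank))) & SET_MASK
    return low ^ rotated


# Three blocks that all fall into set 5 of a conventional cache.
addrs = [(k << (LINE_BITS + INDEX_BITS)) | (5 << LINE_BITS) for k in (1, 2, 3)]
print([conventional_index(a, 1) for a in addrs])  # [5, 5, 5]
print([skewed_index(a, 1) for a in addrs])        # [7, 1, 3]
```

In a 2-way set-associative cache the three blocks compete for the two ways of one set, so they cannot all be resident. With the skewed mapping they still collide in bank 0 but are spread over three different sets in bank 1.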

Minimizing tag implementation costs

Most newly announced microprocessors manipulate 64-bit virtual addresses and the width of physical addresses is also growing. As a result, the relative size of the address tags in caches is increasing. We have proposed hardware solutions to limit the implementation cost of these address tags.
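The growth in tag cost can be made concrete with a little arithmetic. The sketch below assumes a conventional cache whose size, line size and associativity are powers of two, so that the tag is what remains of the address after the set index and line offset are removed; the cache parameters are illustrative, not taken from any specific processor.

```python
def tag_bits(address_bits, cache_bytes, line_bytes, ways):
    """Tag width for a cache with power-of-two geometry (assumed)."""
    sets = cache_bytes // (line_bytes * ways)
    offset_bits = line_bytes.bit_length() - 1   # log2(line size)
    index_bits = sets.bit_length() - 1          # log2(number of sets)
    return address_bits - index_bits - offset_bits


# 32 KB, 2-way, 64-byte lines: 256 sets, 8 index bits, 6 offset bits.
print(tag_bits(32, 32 * 1024, 64, 2))   # 18
print(tag_bits(64, 32 * 1024, 64, 2))   # 50
```

Relative to the 512 data bits of a 64-byte line, the tag grows from about 3.5% of the line with 32-bit addresses to about 9.8% with full 64-bit addresses.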

Related publications:

Other works on cache architecture


Processor Organization

The CAPS team is working on pipeline and superscalar organization in processors. We address the complexity of instruction scheduling through prescheduling. Our work on WSRS architectures (register Write Specialization, register Read Specialization) addresses the complexity of the register file, the bypass network and instruction scheduling.

Related publications:


Sequencing and branch prediction

Multiple-block ahead branch prediction

Multiple-block ahead branch prediction is a new branch prediction mechanism. It provides an efficient way to predict the addresses of two instruction blocks in a single cycle. Such an approach would be very useful for wide-dispatch superscalar processors. Recently, we have explored in detail the effective design of a complete instruction fetch mechanism.

Related publications:

Skewed branch predictors

Between 1996 and 2000, we investigated the use of majority vote as a means of limiting the impact of aliasing on global history branch predictors. The 2bcgskew predictor was the basis of the branch predictor of the cancelled Alpha EV8. 2bcgskew is often considered in the literature to be the most efficient conventional predictor (i.e. based on 2-bit counters), as opposed to neural predictors. Its accuracy is also often underestimated in comparative studies. The parameters described in "An optimized 2bcgskew branch predictor" should be used in comparative studies.
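The majority-vote principle can be sketched as follows. This is a simplified three-bank illustration, not the 2bcgskew predictor itself: the table size and the three index hashes are hypothetical, every bank is updated on every branch, and the bimodal and meta components of 2bcgskew are omitted.

```python
TABLE_BITS = 10                      # 1024 counters per bank (assumed)
MASK = (1 << TABLE_BITS) - 1


def rotl(value, amount):
    """Rotate a TABLE_BITS-wide value left by `amount` bits."""
    return ((value << amount) | (value >> (TABLE_BITS - amount))) & MASK


class MajorityVoteSketch:
    def __init__(self):
        # Three banks of 2-bit counters, initialised to weakly not-taken (1).
        self.banks = [[1] * (1 << TABLE_BITS) for _ in range(3)]

    def _indices(self, pc, history):
        # Hypothetical hashes: each bank mixes pc and history differently,
        # so two branches that alias in one bank rarely alias in the others.
        x, h = pc & MASK, history & MASK
        return [x ^ h, x ^ rotl(h, 1), rotl(x, 1) ^ h]

    def predict(self, pc, history):
        votes = [self.banks[b][i] >= 2
                 for b, i in enumerate(self._indices(pc, history))]
        return sum(votes) >= 2       # majority of the three banks

    def update(self, pc, history, taken):
        for b, i in enumerate(self._indices(pc, history)):
            c = self.banks[b][i]
            self.banks[b][i] = min(3, c + 1) if taken else max(0, c - 1)


p = MajorityVoteSketch()
for _ in range(3):
    p.update(0x40, 0b1011, True)
# Simulate destructive aliasing: another branch overwrites the bank-0 entry.
p.banks[0][p._indices(0x40, 0b1011)[0]] = 0
print(p.predict(0x40, 0b1011))       # True: the other two banks outvote bank 0
```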

Related publications:

Understanding global history branch predictors

Improving the perceptron branch predictor

In 2003-2004, we began to explore the potential of the perceptron predictor. Our preliminary work showed that this potential was largely underestimated in the pioneering work by Jiménez and Lin. We then proposed the MAC-RHSP (Multiply Accumulate Contribution Redundant History Skewed Perceptron) predictor, which has lower hardware complexity than the perceptron predictor but much better prediction accuracy.
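For reference, the baseline perceptron predictor of Jiménez and Lin can be sketched as below. This is a minimal illustration of the original scheme, not of MAC-RHSP: it models a single perceptron rather than a table indexed by the branch address, represents history bits as +1 (taken) and -1 (not taken), and leaves the weights unsaturated for simplicity.

```python
class PerceptronSketch:
    def __init__(self, history_length):
        # Training threshold suggested by Jimenez and Lin: 1.93 * h + 14.
        self.theta = int(1.93 * history_length + 14)
        # weights[0] is the bias; weights[1:] pair with the history bits.
        self.weights = [0] * (history_length + 1)

    def output(self, history):
        return self.weights[0] + sum(
            w * x for w, x in zip(self.weights[1:], history))

    def predict(self, history):
        return self.output(history) >= 0     # True means predicted taken

    def train(self, history, taken):
        y = self.output(history)
        t = 1 if taken else -1
        # Update on a misprediction or when the output is not confident enough.
        if (y >= 0) != taken or abs(y) <= self.theta:
            self.weights[0] += t
            for i, x in enumerate(history):
                self.weights[i + 1] += t * x


# A branch whose outcome simply repeats history bit 2.
patterns = [[1 if (n >> k) & 1 else -1 for k in range(4)] for n in range(16)]
p = PerceptronSketch(4)
for _ in range(50):
    for h in patterns:
        p.train(h, h[2] == 1)
print(all(p.predict(h) == (h[2] == 1) for h in patterns))   # True
```

The weight attached to history bit 2 grows with every update while the other weights stay small, which is how the perceptron identifies the history bits a branch correlates with.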

Related publications:

Pushing limits on global history predictors

In 2004-2005, we proposed new global history predictors that exploit very long global histories, in the hundred-bit range. The OGEHL and PPM-like predictors were selected for the 1st CBP contest. They both feature a limited number of tables and exploit very long global histories. The PPM-like predictor uses partial tag matching as the prediction selection function, while the OGEHL predictor uses an adder tree. OGEHL uses a geometric series of history lengths. TAGE combines partial tag matching (with an optimized update policy) with the use of a geometric series of history lengths. TAGE won the 2nd CBP contest in the realistic predictors category in 2006.
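A geometric series of history lengths can be computed as in the sketch below. It assumes the form L(i) = round(alpha^(i-1) * L(1)), with alpha derived from the shortest and longest history lengths; the function name and the example parameters are illustrative, and any table indexed without history is not counted.

```python
def geometric_history_lengths(num_tables, min_len, max_len):
    """History length for each history-indexed table, shortest first."""
    # Common ratio chosen so the last table gets max_len bits of history.
    alpha = (max_len / min_len) ** (1.0 / (num_tables - 1))
    return [int(min_len * alpha ** i + 0.5) for i in range(num_tables)]


print(geometric_history_lengths(5, 2, 32))   # [2, 4, 8, 16, 32]
```

Most tables are devoted to short histories, where most of the correlation lies, while the last few tables still capture correlation with very old branches.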

To explore the limits of branch prediction, the GTL predictor was defined. GTL essentially combines a TAGE predictor and an OGEHL predictor. GTL won the 2nd CBP contest in the idealistic predictors category in 2006.

Related publications:

Simultaneous multithreading and multicore processors

Simultaneous multithreading

Simultaneous multithreading (SMT) is an interesting way of maximizing performance by enhancing processor utilization. We have investigated various issues raised by SMT: branch prediction, memory hierarchy behavior, and out-of-order versus in-order execution. SMT has proven to be quite complex to implement (e.g. Alpha EV8). Recently, we have been exploring an intermediate design point between SMT and CMP, the CASH architecture (for Cmp And Smt Hybrid).

Related publications:


Multicore processors