A dynamic branch prediction simulator built using Intel PIN Tool for analyzing and comparing multiple branch prediction strategies on real program execution traces.
This project instruments binaries at runtime and evaluates how different branch predictors behave for loops, conditional branches, and real execution patterns.
Modern processors use deep pipelines to execute instructions efficiently.
Whenever a branch instruction appears (if, while, for, function returns, etc.), the processor must guess the next instruction before the branch is resolved.
A wrong prediction causes:
- Pipeline flush
- Wasted cycles
- Performance degradation
- Lower IPC (Instructions Per Cycle)
Branch predictors are therefore critical for improving processor performance.
This project simulates multiple branch prediction techniques and compares their prediction accuracy on real workloads using Intel PIN dynamic instrumentation.
Traditional bimodal predictor using:
- Strongly Taken (ST)
- Weakly Taken (WT)
- Weakly Not Taken (WN)
- Strongly Not Taken (SN)
Each branch PC maintains its own 2-bit state machine.
File:
base2bit.cpp
Extends the base predictor using:
- Global History Register (GHR)
- XOR of
PC ^ GHR - Global branch correlation
This improves prediction for correlated branch patterns.
File:
base2bitGshare.cpp
Specialized predictor for loop branches.
Features:
- Detects backward branches
- Learns loop trip counts
- Predicts loop exits accurately
This reduces loop mispredictions significantly.
File:
base2bitloop.cpp
Final combined predictor implementing:
- GShare Predictor
- Loop Predictor
- Branch Target Buffer (BTB)
This gives the highest overall accuracy by combining:
- Global history correlation
- Loop behavior learning
- Target prediction
File:
predictor.cpp
GHR stores outcomes of recently executed branches.
Example:
- Taken ->
1 - Not Taken ->
0
If recent history is:
1011
then future branches can use this history for prediction.
In GShare:
index = PC ^ GHRThis helps identify correlated branch patterns.
BTB stores previously seen branch target addresses.
When a branch prediction says TAKEN:
- BTB predicts where execution should jump.
This reduces target computation delay.
Implemented in:
predictor.cpp
Contains runtime branch prediction information.
Example:
401245 -> 401210 | LOOP | Pred: T | Actual: T (BTB extension for predictor file)
Includes:
- Branch PC
- Target Address
- Branch Type
- Prediction
- Actual Outcome
Generated by:
- All predictors
Stores executed instruction trace.
Example:
401230 mov eax, ebx
401234 cmp eax, 5
Useful for:
- Execution analysis
- Instruction flow study
- Debugging instrumentation
Generated by:
- All predictors
Stores prediction accuracy for every branch PC.
Example:
PC: 401245 - 96.42 %
Useful for:
- Identifying difficult branches
- Comparing branch behaviors
- Microarchitectural analysis
Generated by:
- All predictors
Instruction tracing is disabled by default because it generates very large logs.
To enable it, uncomment this line in the predictor source files:
// INS_AddInstrumentFunction(Trace, 0);Change it to:
INS_AddInstrumentFunction(Trace, 0);This line exists in:
base2bit.cppbase2bitGshare.cppbase2bitloop.cpppredictor.cpp
Best example demonstrating GShare optimization.
Why?
Simple predictors fail when branch behavior alternates:
T N T N T N
GShare uses global history to learn these repeating patterns effectively.
This demonstrates how history-based predictors outperform simple bimodal predictors.
Iterative merge sort benchmark used to compare predictors.
Different implementations produce different branch patterns:
- Loop-heavy behavior
- Nested conditions
- Repeated correlations
Used for comparing:
- Base predictor
- GShare
- Loop predictor
- Hybrid predictor
- Intel PIN Tool Intel PIN Tool Download
- g++
- Linux Environment
make obj-intel64/($predictorVarient).so TARGET=intel64./pin -t obj-intel64/predictor.so -- ./($yourCppExecutable)Where:
$predictorVarient= PIN instrumentation tool$yourCppExecutable= target executable being analyzed
The PIN tool dynamically instruments branch instructions during execution.
For every branch:
- Branch PC is captured
- Predictor generates prediction
- Actual branch outcome is recorded
- Predictor state is updated
- Statistics are logged
This simulates real processor branch prediction behavior.
| Predictor | Accuracy |
|---|---|
| Base 2-Bit Predictor | 89.00% |
| GShare Predictor | 90.69% |
| Loop Predictor | 92.10% |
| Hybrid Predictor (GShare + Loop + BTB) | 94.35% |
Possible extensions:
- Tournament Predictor
- Per-PC Local History
- TAGE Predictor
- Return Address Stack (RAS)
- Perceptron Predictor
- Hybrid Confidence Predictor
- C++
- Intel PIN Tool
- Branch Prediction Algorithms
- Shubham
- Shobhit Gupta