Reorder statements to improve spatial locality #43
Looks good. However, I would appreciate more explanation of why prioritizing row-wise checks improves spatial locality.
I have not yet conducted a performance benchmark for this optimization. The improvement was motivated by insights from the following references:

After printing the board indices checked by check_win and check_line_segment_win, we can observe the access patterns:
---- Line type: COL ----
---- Line type: ROW ----
---- Line type: PRIMARY ----
---- Line type: SECONDARY ----

Since the chances of winning via ROW and COLUMN are the same, consider a board that already contains a winning sequence such as ROW XXX or ROW OOO. Even in this situation, the current implementation still performs a complete column-major traversal of the board before identifying a winner.

Given that the board is only 4×4, the performance loss from accessing non-contiguous memory might not be significant at this scale. However, the access pattern is still worth considering for potential optimization.
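To make the access patterns behind the COL, ROW, PRIMARY, and SECONDARY labels concrete, here is a minimal sketch of how each line type steps through a 1D row-major board. The names `line_type`, `line_stride`, and `line_index` are hypothetical and are not taken from the project's `check_win` code; `N = 4` is assumed from the board size mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define N 4 /* board side; assumed from the 4x4 board discussed above */

/* Hypothetical line types mirroring the COL/ROW/PRIMARY/SECONDARY labels. */
enum line_type { LINE_ROW, LINE_COL, LINE_PRIMARY, LINE_SECONDARY };

/* Distance, in elements, between consecutive cells of a line when the
 * board is a 1D row-major array, i.e. cell (r, c) lives at r * N + c. */
static size_t line_stride(enum line_type t)
{
    switch (t) {
    case LINE_ROW:       return 1;     /* (r, c) -> (r, c + 1)     */
    case LINE_COL:       return N;     /* (r, c) -> (r + 1, c)     */
    case LINE_PRIMARY:   return N + 1; /* (r, c) -> (r + 1, c + 1) */
    case LINE_SECONDARY: return N - 1; /* (r, c) -> (r + 1, c - 1) */
    }
    return 0;
}

/* Index of the k-th cell of a line that starts at (row, col).
 * The caller must choose a start cell that keeps the line on the board. */
static size_t line_index(enum line_type t, size_t row, size_t col, size_t k)
{
    assert(row < N && col < N && k < N);
    return row * N + col + k * line_stride(t);
}
```

Only the ROW case visits adjacent indices (0, 1, 2, ...); the other three jump by roughly one row per step, which is the non-contiguous pattern the comment refers to.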
Maybe consider including them in the commit message using the Link: tag?
I agree that, in theory, prioritizing row-wise checks can lead to better cache locality. However, the current commit message only states that spatial locality improves, without explaining why. Maybe consider tweaking the commit message to clarify the relationship between row-wise access, memory layout, spatial locality, and potential performance benefits?
Since row-wise and column-wise checks are logically equivalent in terms of win probability, this change prioritizes row-wise checks to better align with row-major memory layout and improve cache locality.

In C, 2D arrays are stored in row-major order, meaning that accessing elements across rows (e.g., [0][0], [0][1], [0][2]) is more cache-friendly than accessing down columns (e.g., [0][0], [1][0], [2][0]). Reordering the checks to evaluate row directions first results in more sequential memory access patterns, which may lead to better spatial locality and improved performance.

This optimization does not alter functional behavior, but aligns better with how memory is physically accessed in most systems. While the current board size is only 4×4, the locality benefit of row-wise access patterns becomes more pronounced as the board size increases, making this change more valuable in scalable scenarios.

References supporting this optimization:
Link: https://hackmd.io/@sysprog/CSAPP-ch6
Link: https://en.wikipedia.org/wiki/Row-_and_column-major_order
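The scaling argument in this commit message can be illustrated with a small model that counts how many cache lines a row walk and a column walk touch. This is a sketch under stated assumptions, not a measurement: `lines_touched` is a hypothetical helper, the 64-byte line size is an assumption (common, but not universal), and the array is assumed to start on a cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64 /* assumed cache-line size in bytes */

/* Count the distinct cache lines touched when reading `count` elements of
 * `elem_size` bytes each, starting at byte offset 0 and advancing `stride`
 * elements per step. Assumes the array is line-aligned and that elem_size
 * divides CACHE_LINE, so no element straddles two lines. */
static size_t lines_touched(size_t count, size_t stride, size_t elem_size)
{
    assert(elem_size > 0 && CACHE_LINE % elem_size == 0);

    size_t touched = 0;
    size_t last_line = (size_t)-1; /* sentinel: no line seen yet */

    for (size_t k = 0; k < count; k++) {
        size_t line = (k * stride * elem_size) / CACHE_LINE;
        /* Offsets only grow, so a new line number is always a new line. */
        if (line != last_line) {
            touched++;
            last_line = line;
        }
    }
    return touched;
}
```

Under this model, a 4×4 board of one-byte cells occupies 16 bytes, so both `lines_touched(4, 1, 1)` (row walk) and `lines_touched(4, 4, 1)` (column walk) return 1. For a hypothetical 64×64 board of 4-byte cells, the row walk `lines_touched(64, 1, 4)` returns 4 while the column walk `lines_touched(64, 64, 4)` returns 64, which is where the row-first ordering would be expected to matter.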
ddf7e45 to c77acdb
Instead of appending the references supporting this proposed change, show experimental evidence.
Experiment 1: Fairly Generated Test Data
To eliminate bias from board position and access order, I applied the Fisher–Yates shuffle to randomly permute the entire board after inserting the winning condition. This ensures that the spatial location of the win does not favor any particular access pattern and provides a fair baseline for performance comparison. Since all potential [...]

Experiment 2: Biased Sample Distribution (Favoring Rows)
Under this biased condition, the row-major implementation clearly outperforms column-major, with a much wider performance gap. Perf analysis summary:
These results demonstrate a clear performance benefit for row-major storage when handling row-aligned winning conditions. Therefore, if the AI algorithm is designed to favor row-based placements, it can reduce the cost of victory checks and improve overall efficiency.

Experiment 3: Simulating Real Gameplay
The previous experiments assume "guaranteed wins," which is unrealistic in actual gameplay, where wins often occur only after many turns. For example:

O | O | O | O | O
---+---+---+---+---
O |   |   | X |
---+---+---+---+---
O |   | X |   |
---+---+---+---+---
O |   | X |   | X
---+---+---+---+---
X |   |   | X |

In this scenario, scanning from the top-left may require multiple invalid checks before identifying a win at the bottom-left, which can be costly for column-major access. To simulate this, I created two 5 × 5 boards that are transposes of each other:
Both require several invalid checks before locating a winning pattern, representing a more realistic edge-case scenario.

O | O |   |   |
---+---+---+---+---
O | O |   |   |
---+---+---+---+---
  |   |   |   | O
---+---+---+---+---
O | O |   | O | O
---+---+---+---+---
O | O |   | O | O

O | O |   | O | O
---+---+---+---+---
O | O |   | O | O
---+---+---+---+---
  |   |   |   |
---+---+---+---+---
  |   |   | O | O
---+---+---+---+---
  |   | O | O | O

After running 1,000K iterations with perf, the metrics were as follows:
Even under near-realistic conditions, row-major consistently outperformed column-major across almost all metrics. Although its cache miss rate is higher, this is because it makes only about half as many cache references in total; the overall execution time is still significantly lower.

Summary & Outlook
Across all three experiments, row-major storage clearly benefits from better spatial locality, especially when the game favors horizontal wins or involves frequent win-checking logic. As a result, future AI algorithms for board games could benefit from encouraging row-oriented placement strategies. This not only simplifies decision-making but also enhances performance, particularly on large boards or in scenarios requiring frequent win evaluations.
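For reference, the Fisher–Yates permutation mentioned in Experiment 1 might look like the following sketch. The benchmark code itself is not shown in this thread, so `shuffle_board`, the `char` cell type, and the use of `rand()` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Shuffle `n` board cells in place with the Fisher-Yates algorithm.
 * Hypothetical helper: walks from the last cell down to the second and
 * swaps each cell with a randomly chosen cell at or before it.
 * rand() % i has a slight modulo bias, which is tolerable in a sketch. */
static void shuffle_board(char *board, size_t n)
{
    assert(board != NULL || n == 0);

    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i; /* 0 <= j < i */
        char tmp = board[i - 1];
        board[i - 1] = board[j];
        board[j] = tmp;
    }
}
```

A caller would seed the generator once with `srand()` and then call `shuffle_board(board, N * N)`. The shuffle changes only the positions of the cells; the number of X, O, and empty cells stays the same.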
[...]
I'm not sure if "column-major storage" here refers to actually storing the board in column-major order. If so, I'm unclear about the point of this experiment, since we're currently using a 1D array in row-major order and aren't planning to change that.
[...]
I'm also unsure about the purpose of this experiment. For fairness, shouldn't we also test column-based wins and prioritize scanning columns?
[...]
This experiment seems more reasonable, but I'm a bit confused - I expected spatial locality to help by reducing cache misses, but the results show more cache misses. The faster execution seems to come from fewer instructions instead.
I didn't really see a clear benefit from the first experiment.
@visitorckw, just wanted to follow up on this PR when you have time. Appreciate your thoughts!
Since row-wise and column-wise checks are logically equivalent in terms of win probability, evaluating row-wise directions first is more favorable due to better spatial locality.
This reordering does not alter functional behavior but may lead to more efficient execution in practice.