This repository was archived by the owner on Apr 18, 2025. It is now read-only.

FEAT: parallelize witness assignment of bytecode circuit using assign_regions api #530

Open: 13 commits to be merged into base branch develop


@Velaciela commented Jun 9, 2023

Type of change

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Description

Split and parallelize the time-consuming part of the bytecode circuit assignment by using assign_regions().
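
The general idea can be sketched in plain Rust, independent of halo2. This is a minimal illustration and not the code in this patch: RegionRows, assign_one, assign_sequential and assign_parallel are hypothetical names, the "rows" are placeholder values, and std::thread::scope is only a stand-in for whatever scheduling assign_regions() performs internally.

```rust
use std::thread;

/// Hypothetical stand-in for the cells assigned for one bytecode.
#[derive(Debug, PartialEq)]
struct RegionRows {
    index: usize,
    rows: Vec<u64>,
}

/// Hypothetical per-bytecode assignment: one placeholder row per byte.
/// Each call only reads its own bytecode, so calls are independent.
fn assign_one(index: usize, bytecode: &[u8]) -> RegionRows {
    let rows = bytecode
        .iter()
        .enumerate()
        .map(|(i, b)| (i as u64) * 256 + u64::from(*b))
        .collect();
    RegionRows { index, rows }
}

/// Baseline: assign every bytecode one after another.
fn assign_sequential(bytecodes: &[Vec<u8>]) -> Vec<RegionRows> {
    bytecodes
        .iter()
        .enumerate()
        .map(|(i, b)| assign_one(i, b))
        .collect()
}

/// Split the work into one task per bytecode, run the tasks on scoped
/// threads, then merge the results back in the original order.
fn assign_parallel(bytecodes: &[Vec<u8>]) -> Vec<RegionRows> {
    thread::scope(|s| {
        let handles: Vec<_> = bytecodes
            .iter()
            .enumerate()
            .map(|(i, b)| s.spawn(move || assign_one(i, b)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("assignment worker panicked"))
            .collect()
    })
}

fn main() {
    let bytecodes: Vec<Vec<u8>> = vec![vec![0x60, 0x00, 0x56], vec![], vec![0x5b; 8]];
    let merged = assign_parallel(&bytecodes);
    println!("assigned {} regions", merged.len());
}
```

Because each task touches only its own bytecode, the parallel result matches the sequential one, which is the property a test of the parallel path would check.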


Unit test:

cargo test --package zkevm-circuits --lib bytecode_circuit::test::bytecode_circuit_parallel_assignment --features scroll,parallel_syn -- --nocapture

The scroll and parallel_syn features are required, so this test cannot be executed together with the regular unit tests.
This patch adds a "Run parallel assignment tests" step to the CI workflow.


Super circuit benchmarking in scroll-zkevm:

Before: 21.000s

····Start: assign region: assign bytecode with poseidon hash extension
······Start: assign region 1st pass: assign bytecode with poseidon hash extension
······End: assign region 1st pass: assign bytecode with poseidon hash extension 10.638s
······Start: assign region 2nd pass: assign bytecode with poseidon hash extension
······End: assign region 2nd pass: assign bytecode with poseidon hash extension 10.361s
····End: assign region: assign bytecode with poseidon hash extension ...........21.000s

Now: 0.952s

1st pass: 771.092µs (calculate shape & return)
2nd pass: 0.951350769s (parallel assignment)

Speedup: 22x

Since the number of phases is 3, the above assignment is executed three times,
so this patch reduces the witness generation time of the super circuit by ~60s (20s x 3).
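
The figures above can be re-derived with a few lines of Rust. This is only a back-of-the-envelope check using the timings quoted in this description; the constant and function names are made up for illustration.

```rust
/// Timings quoted in this description, in seconds.
const BEFORE_S: f64 = 21.000;
const AFTER_S: f64 = 0.952;
/// Number of times the assignment runs (one per phase).
const RUNS: f64 = 3.0;

/// Ratio of the old wall time to the new wall time.
fn speedup(before: f64, after: f64) -> f64 {
    before / after
}

/// Total wall time saved when the assignment is executed `runs` times.
fn total_saved(before: f64, after: f64, runs: f64) -> f64 {
    (before - after) * runs
}

fn main() {
    println!("speedup: {:.1}x", speedup(BEFORE_S, AFTER_S));
    println!("saved:   {:.1}s", total_saved(BEFORE_S, AFTER_S, RUNS));
}
```

Both results agree with the rounded values reported here: roughly 22x per execution and roughly 60s over three executions.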


Notes

In the super circuit (v0.4), the bytecode circuit has 28 witness instances.
When assign_regions() is used for parallel execution, the load is unbalanced (bytecode length min: 1, max: 24498),
but further optimization would yield little additional gain, so the profiling log is included here for reference only.

bytecode.len(): [2801, 17777, 24385, 46, 3943, 4936, 1322, 1822, 3177, 1360, 2164, 3060, 9268, 22143, 22143, 1614, 2152, 22143, 16663, 5333, 1, 16663, 24498, 16663, 1653, 10506, 6379, 12417]
assign bytecode with poseidon hash extension
 CS forked into 28 subCS took 65.971µs
 region assign(regions) bytecode with poseidon hash extension_20 2nd pass synthesis took 16.31µs
 region assign(regions) bytecode with poseidon hash extension_3 2nd pass synthesis took 1.809796ms
 region assign(regions) bytecode with poseidon hash extension_6 2nd pass synthesis took 49.93415ms
 region assign(regions) bytecode with poseidon hash extension_9 2nd pass synthesis took 51.426171ms
 region assign(regions) bytecode with poseidon hash extension_15 2nd pass synthesis took 60.946351ms
 region assign(regions) bytecode with poseidon hash extension_24 2nd pass synthesis took 75.451983ms
 region assign(regions) bytecode with poseidon hash extension_16 2nd pass synthesis took 81.610603ms
 region assign(regions) bytecode with poseidon hash extension_10 2nd pass synthesis took 94.575643ms
 region assign(regions) bytecode with poseidon hash extension_7 2nd pass synthesis took 99.249481ms
 region assign(regions) bytecode with poseidon hash extension_0 2nd pass synthesis took 105.976199ms
 region assign(regions) bytecode with poseidon hash extension_11 2nd pass synthesis took 115.976876ms
 region assign(regions) bytecode with poseidon hash extension_8 2nd pass synthesis took 120.148077ms
 region assign(regions) bytecode with poseidon hash extension_4 2nd pass synthesis took 149.78719ms
 region assign(regions) bytecode with poseidon hash extension_5 2nd pass synthesis took 186.413146ms
 region assign(regions) bytecode with poseidon hash extension_19 2nd pass synthesis took 201.365304ms
 region assign(regions) bytecode with poseidon hash extension_26 2nd pass synthesis took 288.769832ms
 region assign(regions) bytecode with poseidon hash extension_12 2nd pass synthesis took 350.249662ms
 region assign(regions) bytecode with poseidon hash extension_25 2nd pass synthesis took 397.082566ms
 region assign(regions) bytecode with poseidon hash extension_27 2nd pass synthesis took 502.59244ms
 region assign(regions) bytecode with poseidon hash extension_21 2nd pass synthesis took 630.980547ms
 region assign(regions) bytecode with poseidon hash extension_18 2nd pass synthesis took 631.629386ms
 region assign(regions) bytecode with poseidon hash extension_23 2nd pass synthesis took 631.664026ms
 region assign(regions) bytecode with poseidon hash extension_1 2nd pass synthesis took 722.695257ms
 region assign(regions) bytecode with poseidon hash extension_13 2nd pass synthesis took 839.83569ms
 region assign(regions) bytecode with poseidon hash extension_17 2nd pass synthesis took 840.486889ms
 region assign(regions) bytecode with poseidon hash extension_14 2nd pass synthesis took 841.069388ms
 region assign(regions) bytecode with poseidon hash extension_2 2nd pass synthesis took 926.174553ms
 region assign(regions) bytecode with poseidon hash extension_22 2nd pass synthesis took 930.444835ms
 Merge 28 subCS back took 6.24µs
 28 sub_regions of assign(regions) bytecode with poseidon hash extension 2nd pass synthesis took 932.071649ms
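
The imbalance in the log above can be put into numbers. The sketch below assumes that the cost of a region is proportional to its bytecode length and that every region gets its own thread; both are simplifying assumptions, not measurements, and the function name is hypothetical. Under those assumptions the wall time is set by the longest bytecode, so the expected speedup of the 2nd pass is about total length / max length.

```rust
/// Bytecode lengths copied from the profiling log above.
const BYTECODE_LENS: [u64; 28] = [
    2801, 17777, 24385, 46, 3943, 4936, 1322, 1822, 3177, 1360, 2164, 3060, 9268, 22143,
    22143, 1614, 2152, 22143, 16663, 5333, 1, 16663, 24498, 16663, 1653, 10506, 6379, 12417,
];

/// Rough speedup estimate with one thread per region:
/// sequential cost (sum of lengths) over parallel cost (largest length).
fn estimated_speedup(lens: &[u64]) -> f64 {
    let total: u64 = lens.iter().sum();
    let max = lens.iter().copied().max().unwrap_or(0);
    if max == 0 {
        return 1.0; // nothing to assign, nothing to gain
    }
    total as f64 / max as f64
}

fn main() {
    let min = BYTECODE_LENS.iter().copied().min().unwrap_or(0);
    let max = BYTECODE_LENS.iter().copied().max().unwrap_or(0);
    println!("regions: {}, min: {}, max: {}", BYTECODE_LENS.len(), min, max);
    println!("estimated speedup: {:.1}x", estimated_speedup(&BYTECODE_LENS));
}
```

The estimate comes out at a little over 10x, in the same range as the measured 2nd pass (10.361s before, about 0.95s now). This suggests the remaining time is dominated by the largest bytecodes, which rebalancing the smaller regions would not shorten.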

@kunxian-xia added the enhancement label and removed the crate-circuit-benchmarks and T-bench labels Jun 9, 2023
@Velaciela force-pushed the parallel_bytecode_assign branch from a62b586 to ec5d89f June 12, 2023 18:01
@github-actions bot added the CI label Jun 12, 2023
@kunxian-xia changed the title from "FEAT: parallelize bytecode assignment" to "FEAT: parallelize witness assignment of bytecode circuit using assign_regions api" Jun 13, 2023
@Velaciela (author) commented:

We need to merge scroll-tech/halo2#50 before all unit tests can pass.
