feat: Poseidon2 gates for Ultra arithmetisation
Add Poseidon2 gates to the UltraCircuitBuilder, which ensures that a recursive verifier instantiated with the Ultra arithmetisation produces the correct number of constraints.
Updates required:
- change the verification key length and the constant proof length constants across the codebase (the new gate adds two selectors whose commitments need to be in the vk, and the Poseidon relation becomes the one with the highest degree); update Prover.toml accordingly
- ensure the Ultra recursive verifier stays constant size now that hashing produces gates
- small modification to the Solidity verifier to reflect the changes in cpp, with the caveat that the UltraKeccak flavor still doesn't support the Poseidon gate (changes coming in a follow-up PR)
Change in the tube circuit's number of gates (post finalisation):
- number of gates before this change, in master: 13947018
- number of gates after this change: 14038982
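To put the two tube circuit gate counts in perspective, the following minimal sketch computes the absolute and relative increase. The variable names are illustrative only; the two numbers are the ones quoted above.

```python
# Gate counts for the tube circuit, copied from the list above.
gates_before = 13_947_018  # in master, before this change
gates_after = 14_038_982   # after this change

# Absolute and relative growth in the number of gates.
delta = gates_after - gates_before
relative_increase = delta / gates_before

print(f"additional gates: {delta:,}")                  # additional gates: 91,964
print(f"relative increase: {relative_increase:.2%}")   # relative increase: 0.66%
```

The increase is under one percent of the tube circuit's total size.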
Closes https://github.com/AztecProtocol/barretenberg/issues/1041
Changes to circuit sizes
Generated at commit: 8d6a989ea79f0c445b115239c3507c703bd15e85, compared to commit: 70e61f973de063a972c726303f579ef34441d85f
🧾 Summary (100% most significant diffs)
| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| public_kernel_setup | 0 ➖ | 0.00% | +87,604 ❌ | +4.88% |
| public_kernel_app_logic | 0 ➖ | 0.00% | +87,605 ❌ | +4.88% |
| public_kernel_teardown | 0 ➖ | 0.00% | +87,604 ❌ | +4.88% |
| rollup_merge | 0 ➖ | 0.00% | +94,698 ❌ | +3.63% |
| parity_root | 0 ➖ | 0.00% | +185,085 ❌ | +3.55% |
| rollup_root | 0 ➖ | 0.00% | +100,051 ❌ | +2.53% |
| public_kernel_tail | 0 ➖ | 0.00% | -5,242,812 ✅ | -56.27% |
| rollup_base | 0 ➖ | 0.00% | -5,453,580 ✅ | -60.19% |
Full diff report 👇
| Program | ACIR opcodes (+/-) | % | Circuit size (+/-) | % |
|---|---|---|---|---|
| public_kernel_setup | 214,252 (0) | 0.00% | 1,882,602 (+87,604) | +4.88% |
| public_kernel_app_logic | 216,339 (0) | 0.00% | 1,882,864 (+87,605) | +4.88% |
| public_kernel_teardown | 217,364 (0) | 0.00% | 1,883,586 (+87,604) | +4.88% |
| rollup_merge | 319 (0) | 0.00% | 2,703,744 (+94,698) | +3.63% |
| parity_root | 374 (0) | 0.00% | 5,393,026 (+185,085) | +3.55% |
| rollup_root | 744 (0) | 0.00% | 4,051,186 (+100,051) | +2.53% |
| public_kernel_tail | 947,390 (0) | 0.00% | 4,073,918 (-5,242,812) | -56.27% |
| rollup_base | 347,730 (0) | 0.00% | 3,607,548 (-5,453,580) | -60.19% |
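The percentage column in the report appears to be the delta divided by the circuit size before the change, that is, the current size minus the delta. This is an inference from the numbers, not a documented property of the report generator. A short sketch that recomputes a few rows under that assumption (`percent_change` is a hypothetical helper):

```python
def percent_change(current: int, delta: int) -> float:
    """Percentage change relative to the size before the change.

    Assumes the reported delta is (current - previous).
    """
    previous = current - delta
    return 100.0 * delta / previous


# (current circuit size, delta) pairs copied from the full diff report.
rows = {
    "public_kernel_setup": (1_882_602, 87_604),
    "rollup_merge": (2_703_744, 94_698),
    "public_kernel_tail": (4_073_918, -5_242_812),
    "rollup_base": (3_607_548, -5_453_580),
}

for program, (current, delta) in rows.items():
    print(f"{program}: {percent_change(current, delta):+.2f}%")
```

Under this reading, the previous sizes of `public_kernel_tail` and `rollup_base` were 9,316,730 and 9,061,128 respectively, so both circuits shrink by more than half.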
Benchmark results
Metrics with a significant change:
- protocol_circuit_simulation_time_in_ms (private-kernel-inner): 225 (+46%)
- protocol_circuit_simulation_time_in_ms (private-kernel-tail-to-public): 673 (-39%)
- avm_simulation_time_ms (Token:transfer_public): 18.9 (-31%)
- protocol_circuit_proving_time_in_ms (undefined): 78,488 (+18%)
Detailed results
All benchmarks are run on txs on the Benchmarking contract on the repository. Each tx consists of a batch call to create_note and increment_balance, which guarantees that each tx has a private call, a nested private call, a public call, and a nested public call, as well as an emitted private note, an unencrypted log, and public storage read and write.
The source data for this benchmark is available in JSON format on S3.
Proof generation
Each column represents the number of threads used in proof generation.
| Metric | 1 thread | 4 threads | 16 threads | 32 threads | 64 threads |
|---|---|---|---|---|---|
| proof_construction_time_sha256_ms | 5,780 (-1%) | 1,565 | 709 | 750 | 779 (+2%) |
| proof_construction_time_sha256_30_ms | 11,552 (-3%) | 3,080 (-4%) | 1,375 (-2%) | 1,431 | 1,462 |
| proof_construction_time_sha256_100_ms | 44,163 (-3%) | 11,803 (-1%) | 5,416 | 5,401 (-1%) | 5,682 (+6%) |
| proof_construction_time_poseidon_hash_ms | 79.0 | 34.0 | 34.0 | 59.0 | 88.0 |
| proof_construction_time_poseidon_hash_30_ms | 1,534 (-1%) | 424 | 204 | 235 (+6%) | 270 (+1%) |
| proof_construction_time_poseidon_hash_100_ms | 5,664 (-1%) | 1,517 | 677 | 738 (+1%) | 751 (+1%) |
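One way to read the proof generation table is in terms of speedup over the single-threaded run and the resulting per-thread efficiency. The sketch below does this for two rows; the timings are copied from the table, while the `speedups` helper and the efficiency definition (speedup divided by thread count) are illustrative choices rather than part of the benchmark tooling.

```python
threads = [1, 4, 16, 32, 64]

# Proof construction times in ms, copied from the table above.
timings_ms = {
    "sha256": [5780, 1565, 709, 750, 779],
    "poseidon_hash_100": [5664, 1517, 677, 738, 751],
}


def speedups(times):
    """Speedup of each run relative to the first (single-threaded) entry."""
    base = times[0]
    return [base / t for t in times]


for name, times in timings_ms.items():
    for n, s in zip(threads, speedups(times)):
        # Efficiency is the fraction of ideal linear scaling achieved.
        print(f"{name} @ {n} threads: speedup {s:.2f}x, efficiency {s / n:.0%}")
```

For both rows the fastest run is at 16 threads, with a speedup of a little over 8x; the 32 and 64 thread runs are slightly slower.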
L2 block published to L1
Each column represents the number of txs on an L2 block published to L1.
| Metric | 4 txs | 8 txs | 16 txs |
|---|---|---|---|
| l1_rollup_calldata_size_in_bytes | 4,324 | 7,844 | 14,852 |
| l1_rollup_calldata_gas | 49,720 | 92,414 | 177,632 |
| l1_rollup_execution_gas | 1,373,748 | 2,107,338 | 3,892,612 |
| l2_block_processing_time_in_ms | 254 (-2%) | 440 (-9%) | 798 (-4%) |
| l2_block_building_time_in_ms | 8,894 (-1%) | 17,355 (-1%) | 34,822 |
| l2_block_rollup_simulation_time_in_ms | 8,893 (-1%) | 17,354 (-1%) | 34,821 |
| l2_block_public_tx_process_time_in_ms | 7,505 | 15,859 (-1%) | 33,271 |
L2 chain processing
Each column represents the number of blocks on the L2 chain where each block has 8 txs.
| Metric | 3 blocks | 5 blocks |
|---|---|---|
| node_history_sync_time_in_ms | 2,939 (-6%) | 3,883 (-5%) |
| node_database_size_in_bytes | 12,636,240 | 16,719,952 |
| pxe_database_size_in_bytes | 16,254 | 26,813 |
Circuits stats
Stats on running time and I/O sizes collected for every kernel circuit run across all benchmarks.
| Circuit | simulation_time_in_ms | witness_generation_time_in_ms | input_size_in_bytes | output_size_in_bytes | proving_time_in_ms |
|---|---|---|---|---|---|
| private-kernel-init | 91.1 (+1%) | 401 (+6%) | 21,755 (+1%) | 44,860 | N/A |
| private-kernel-inner | :warning: 225 (+46%) | 702 (+4%) | 72,566 (+1%) | 45,007 | N/A |
| private-kernel-reset-tiny | 464 (-1%) | 869 (+4%) | 65,675 | 44,846 | N/A |
| private-kernel-tail | 195 (-1%) | 157 (+3%) | 50,686 (+1%) | 52,257 | N/A |
| base-parity | 5.62 (+2%) | N/A | 160 | 96.0 | N/A |
| root-parity | 35.2 (+7%) | N/A | 73,948 (+7%) | 96.0 | N/A |
| base-rollup | 2,728 (-1%) | N/A | 189,136 (+1%) | 664 | N/A |
| root-rollup | 40.2 (+5%) | N/A | 58,173 (+7%) | 716 | N/A |
| public-kernel-setup | 83.9 (+2%) | N/A | 105,085 (+1%) | 71,222 | N/A |
| public-kernel-app-logic | 94.7 | N/A | 104,911 (+1%) | 71,222 | N/A |
| public-kernel-tail | 551 | N/A | 410,534 | 16,414 | N/A |
| private-kernel-reset-small | 467 (+2%) | N/A | 66,341 | 45,629 | N/A |
| private-kernel-tail-to-public | :warning: 673 (-39%) | 646 (+2%) | 460,796 | 1,825 (+8%) | N/A |
| public-kernel-teardown | 82.6 (+2%) | N/A | 105,349 (+1%) | 71,222 | N/A |
| merge-rollup | 19.9 (+4%) | N/A | 38,174 (+7%) | 664 | N/A |
| undefined | N/A | N/A | N/A | N/A | :warning: 78,488 (+18%) |
Stats on running time collected for app circuits
| Function | input_size_in_bytes | output_size_in_bytes | witness_generation_time_in_ms |
|---|---|---|---|
| ContractClassRegisterer:register | 1,344 | 11,731 | 342 |
| ContractInstanceDeployer:deploy | 1,408 | 11,731 | 18.3 (+1%) |
| MultiCallEntrypoint:entrypoint | 1,920 | 11,731 | 405 |
| FeeJuice:deploy | 1,376 | 11,731 | 388 (+2%) |
| SchnorrAccount:constructor | 1,312 | 11,731 | 74.4 (-1%) |
| SchnorrAccount:entrypoint | 2,304 | 11,731 | 410 |
| Token:privately_mint_private_note | 1,280 | 11,731 | 104 (+3%) |
| FPC:fee_entrypoint_public | 1,344 | 11,731 | 28.1 (+3%) |
| Token:transfer | 1,312 | 11,731 | 232 (+3%) |
| Benchmarking:create_note | 1,344 | 11,731 | 88.7 |
| SchnorrAccount:verify_private_authwit | 1,280 | 11,731 | 27.5 |
| Token:unshield | 1,376 | 11,731 | 524 (+1%) |
| FPC:fee_entrypoint_private | 1,376 | 11,731 | 705 (+1%) |
AVM Simulation
Time to simulate various public functions in the AVM.
| Function | time_ms | bytecode_size_in_bytes |
|---|---|---|
| FeeJuice:_increase_public_balance | 54.6 (-2%) | 7,739 |
| FeeJuice:set_portal | 10.0 (-21%) | 2,354 |
| Token:constructor | 79.5 (-2%) | 26,051 |
| FPC:constructor | 52.7 (+1%) | 18,001 |
| FeeJuice:mint_public | 39.3 (+3%) | 5,877 |
| Token:mint_public | 69.7 (+2%) | 10,917 |
| Token:assert_minter_and_mint | 40.9 (+3%) | 7,512 |
| AuthRegistry:set_authorized | 38.8 (+9%) | 4,391 |
| FPC:prepare_fee | 231 (+1%) | 7,043 |
| Token:transfer_public | :warning: 18.9 (-31%) | 39,426 |
| FPC:pay_refund | 52.6 (+3%) | 10,234 |
| Benchmarking:increment_balance | 928 (-1%) | 6,563 |
| Token:_increase_public_balance | 40.6 (+6%) | 8,433 |
| FPC:pay_refund_with_shielded_rebate | 63.3 (-4%) | 10,783 |
Public DB Access
Time to access various public DBs.
| Function | time_ms |
|---|---|
| get-nullifier-index | 0.155 (-5%) |
Tree insertion stats
The duration to insert a fixed batch of leaves into each tree type.
| Metric | 1 leaf | 16 leaves | 64 leaves | 128 leaves | 256 leaves | 512 leaves | 1024 leaves |
|---|---|---|---|---|---|---|---|
| batch_insert_into_append_only_tree_16_depth_ms | 2.17 (-4%) | 3.86 (-3%) | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_count | 16.8 | 31.7 | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_16_depth_hash_ms | 0.112 (-4%) | 0.109 (-3%) | N/A | N/A | N/A | N/A | N/A |
| batch_insert_into_append_only_tree_32_depth_ms | N/A | N/A | 11.5 | 17.5 (-5%) | 30.4 (-3%) | 58.4 (-10%) | 112 (-3%) |
| batch_insert_into_append_only_tree_32_depth_hash_count | N/A | N/A | 95.9 | 159 | 287 | 543 | 1,055 |
| batch_insert_into_append_only_tree_32_depth_hash_ms | N/A | N/A | 0.110 | 0.102 (-5%) | 0.0992 (-2%) | 0.101 (-10%) | 0.100 (-3%) |
| batch_insert_into_indexed_tree_20_depth_ms | N/A | N/A | 14.5 (-1%) | 25.2 (-8%) | 43.0 (-3%) | 81.4 (-14%) | 159 (-3%) |
| batch_insert_into_indexed_tree_20_depth_hash_count | N/A | N/A | 109 | 207 | 355 | 691 | 1,363 |
| batch_insert_into_indexed_tree_20_depth_hash_ms | N/A | N/A | 0.110 | 0.102 (-8%) | 0.104 (-2%) | 0.100 (-16%) | 0.102 (-3%) |
| batch_insert_into_indexed_tree_40_depth_ms | N/A | N/A | 16.4 (-5%) | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_count | N/A | N/A | 132 | N/A | N/A | N/A | N/A |
| batch_insert_into_indexed_tree_40_depth_hash_ms | N/A | N/A | 0.105 (-5%) | N/A | N/A | N/A | N/A |
Miscellaneous
Transaction sizes based on how many contract classes are registered in the tx.
| Metric | 0 registered classes | 1 registered class |
|---|---|---|
| tx_size_in_bytes | 64,779 | 668,997 |
Transaction size based on fee payment method