[Spike] Run opcode benchmark, re-calculate opcode costs
Objective: To run opcode benchmarks and recalculate opcode costs on the Casper Network to ensure optimal performance and accurate gas costs.
To maintain the efficiency and fairness of gas consumption on the Casper Network, we need to run benchmarks for all supported opcodes and recalculate their associated costs. This process will ensure that the gas fees accurately reflect the computational resources used by each opcode, improving network performance and reducing the likelihood of under- or overcharging for operations.
Hi @mpapierski, as part of the backlog clean-up on the 13th of Sep, Ed advised me to assign this ticket to you.
Sidebar for tomorrow.
Results of the most basic benchmark (executing 1000 × 7000 nops), compared across configurations:
| Gas injector | Stack limiter | Time | Throughput | Comment |
|---|---|---|---|---|
| false | false | 71.655 µs | 90.024 Gelem/s | baseline, nops only |
| false | true | 81.540 µs | 85.848 Gelem/s | compared to baseline: time +8%, thrpt −7.9% |
| true | false | 224.49 µs | 31.181 Gelem/s | compared to baseline: time +148.02%, thrpt −59.680% |
| true | true | 212.79 µs | 32.896 Gelem/s | time +135.09%, thrpt −57.463%; change within noise compared to row above |
Observation: the stack height limiter changes the outcome much less in this benchmark because of the nature of the modification: it instruments function calls, not blocks of code, whereas the gas injector instruments blocks of code, which makes its overhead far more noticeable here.
I think the issue with the stack limiter is that it is hard to benchmark: its cost is amortized over the number of executed opcodes, but when running on real Wasm it becomes more noticeable due to the depth of calls.
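To make the amortization point concrete, here is a toy cost model (the unit costs and workload counts are entirely made-up illustrative numbers, not measurements): gas injection pays a cost per executed basic block, while the stack limiter pays a cost per function call, so a nop-heavy microbenchmark with almost no calls exaggerates the injector's overhead and hides the limiter's.

```python
# Toy instrumentation-overhead model (hypothetical numbers, for intuition only).
# Gas injection adds work per basic block executed; the stack height limiter
# adds work per function call entered.

def overhead(blocks_executed, calls_executed,
             per_block_cost=1.0, per_call_cost=1.0):
    """Total instrumentation overhead in arbitrary time units."""
    return blocks_executed * per_block_cost + calls_executed * per_call_cost

# Nop microbenchmark: millions of blocks, essentially one call,
# so the gas injector's per-block cost dominates completely.
nop_bench = overhead(blocks_executed=7_000_000, calls_executed=1)

# Real Wasm with deep call trees: far more calls relative to blocks,
# so the stack limiter's share of the total overhead grows.
deep_calls = overhead(blocks_executed=100_000, calls_executed=50_000)

print(nop_bench, deep_calls)
```

Under this model the limiter contributes a negligible fraction of `nop_bench` but a third of `deep_calls`, which matches the intuition that the nop benchmark understates its real-world cost.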
The code has been moved into a separate repo so it can be reproduced easily and tweaked further. It also includes calculated opcode costs for both the wasmi and wasmer VMs. The validation logic checks the set of CPU-intensive Wasms for run time and gas cost (3.3k CSPR + 16384 ms); the run times are below the target time, but they do not hit the target exactly as expected (i.e. 15s != 16.3s), etc. @AlexanderLimonov @fizyk20 we should meet and review the methodology together.
@mpapierski I should generally be available for meetings anytime during regular US Central working hours next week, with some exceptions
After meeting with @fizyk20
- Zug should handle slower validator nodes just fine
- We should calculate opcode costs for a 16 s budget. Consensus has an additional timeout for proposals of ~30 s.
- If the concern is that the benchmark is imperfect and the pessimistic validation Wasms run for different times (i.e. 10s, 15s, etc.), we should measure gas per second, also for happy paths (i.e. fib(1), fib(2), ..., fib(50)), and analyze the distribution. Measured and calculated opcodes should be fine as long as the median of the gas-per-second metric is stable.
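The stability check described in the last point can be sketched as follows. The gas and timing values below are placeholder numbers invented for illustration (not real measurements), and the 10% tolerance is an assumed threshold for "stable":

```python
import statistics

# Hypothetical (gas charged, wall-clock seconds) pairs for a mix of
# happy-path runs (fib(n)) and pessimistic validation Wasms.
# These values are made up for illustration only.
runs = [
    (3_300_000, 1.02),
    (6_500_000, 2.01),
    (13_000_000, 3.95),
    (26_000_000, 8.10),
]

gas_per_second = [gas / secs for gas, secs in runs]
median = statistics.median(gas_per_second)

# "Stable" here means every run's gas/s lies within 10% of the median;
# the real analysis would look at the full distribution.
stable = all(abs(g - median) / median < 0.10 for g in gas_per_second)

print(round(median), stable)
```

If `stable` holds across both happy paths and pessimistic workloads, the calculated opcode costs translate consistently between gas and wall-clock time.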
Closing this spike story as the objective is achieved. A follow-up user story #4951 has been created to proceed with the analysis and design based on the findings.