numaflow
Performance Analysis for Map and Reduce
Performance testing strategy
- latency of pipeline with source, map, reduce and sink vertices
- number of records that can be sent from the source with no sustained back pressure
- 1 CPU core and 4 GB of memory
- UDF in map (identity function that extracts the key from the payload and returns it)
- UDF in reduce (computes the sum per key)
- source is tickgen (vary the pod count according to the requirement)
- sink is blackhole (no output)
- reduce at parallelism of 2 pods
- map at max pod count of 2 pods
- linearity of throughput as the map and reduce pod counts increase
- scale up JetStream as needed
- repeatable performance test suite (check in the pipeline configuration)
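
One way to judge the linearity item above is to compare measured throughput at each pod count against an ideal straight line extrapolated from the smallest pod count. The sketch below assumes throughput has already been collected as records per second per pod count; the function name and the sample numbers are hypothetical placeholders, not measured results.

```python
from typing import Dict


def scaling_efficiency(throughput_by_pods: Dict[int, float]) -> Dict[int, float]:
    """Return measured throughput divided by ideal linear throughput, per pod count.

    The ideal line is extrapolated from the smallest pod count in the input,
    so that entry always has an efficiency of 1.0.
    """
    base_pods = min(throughput_by_pods)
    per_pod_baseline = throughput_by_pods[base_pods] / base_pods
    return {
        pods: rate / (per_pod_baseline * pods)
        for pods, rate in sorted(throughput_by_pods.items())
    }


# Placeholder numbers for illustration only (pods -> records per second).
example = {1: 1000.0, 2: 1900.0, 4: 3400.0}
efficiency = scaling_efficiency(example)
for pods, ratio in efficiency.items():
    print(f"{pods} pods: {ratio:.2f} of linear")
```

A ratio close to 1.0 at every pod count suggests throughput scales roughly linearly; a ratio that falls as pods are added points to a shared bottleneck worth investigating.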
Performance Data Capture
- Grafana dashboards based on Prometheus metrics
- capture the timestamps of performance tests
- download raw data through the Grafana APIs from the metric charts on the pre-canned dashboards
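
As a rough illustration of turning captured test timestamps into a raw-data download, the sketch below builds a request URL for the Prometheus HTTP API `query_range` endpoint. This swaps in a direct Prometheus query rather than the Grafana API named above, on the assumption that the dashboards are backed by a reachable Prometheus server. The base URL and the metric name `example_read_total` are hypothetical placeholders.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def query_range_url(base_url: str, promql: str, start: datetime,
                    end: datetime, step_seconds: int = 15) -> str:
    """Build a Prometheus /api/v1/query_range URL for one test window."""
    params = {
        "query": promql,
        "start": start.timestamp(),   # Unix timestamp in seconds
        "end": end.timestamp(),
        "step": f"{step_seconds}s",   # resolution of the returned series
    }
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


# Hypothetical test window and metric; replace with the captured timestamps
# and the metric behind the dashboard chart of interest.
test_start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
test_end = datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc)
url = query_range_url(
    "http://prometheus.example:9090",
    "rate(example_read_total[1m])",
    test_start,
    test_end,
)
print(url)
```

Fetching the URL with any HTTP client returns JSON time series that can be stored next to the pipeline configuration for each run.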
Checklist of performance tests
- [ ] throughput benchmark for map for a given cpu/memory combination
- [ ] throughput benchmark for reduce for a given cpu/memory combination
  - [ ] single key throughput
  - [ ] multi key throughput (100 keys)
  - [ ] vary the number of keys (increase up to 100K in logarithmic steps) and observe the throughput
- [ ] replay throughput for reduce benchmark (focus is on WAL)
  - [ ] with 100 keys (max throughput)
  - [ ] with single key (max throughput)
  - [ ] with 10K keys or whatever the max key throughput is (max throughput)
- [ ] gRPC throughput tests
  - [ ] map gRPC throughput max
  - [ ] reduce gRPC throughput max (vary the density of the stream per key)
    - [ ] 5 rpu per key
    - [ ] 100 rpu per key
    - [ ] 1000 rpu per key
    - [ ] up to max rpu for a single key
    - [ ] repeat it with multiple keys
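
The key-count and per-key density sweeps in the checklist can be enumerated up front so that every run is repeatable. The sketch below is one possible way to generate that matrix. It reads `rpu` as the generator source's records-per-unit rate setting and assumes the source must emit `keys * rpu_per_key` records per unit to reach a given per-key density; both are interpretations, not statements from the checklist. All function and field names are hypothetical.

```python
from itertools import product
from typing import Dict, List


def log_steps(maximum: int, base: int = 10) -> List[int]:
    """Return 1, base, base**2, ... for every power of base not exceeding maximum."""
    steps = []
    value = 1
    while value <= maximum:
        steps.append(value)
        value *= base
    return steps


def build_matrix(key_counts: List[int], rpu_per_key: List[int]) -> List[Dict[str, int]]:
    """Combine every key count with every per-key density into one test case."""
    return [
        # total_rpu is the assumed source rate needed for this density
        {"keys": keys, "rpu_per_key": rpu, "total_rpu": keys * rpu}
        for keys, rpu in product(key_counts, rpu_per_key)
    ]


KEY_COUNTS = log_steps(100_000)   # single key up to 100K keys
RPU_PER_KEY = [5, 100, 1000]      # densities listed in the checklist
matrix = build_matrix(KEY_COUNTS, RPU_PER_KEY)

for case in matrix:
    print(case)
```

Cases whose `total_rpu` exceeds what the source can sustain can be dropped or capped once the single-key maximum is known.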