feat(transform): Add redis enrichment transformer
Summary
Adds a transform that renders a templated key from each event, looks that key up in Redis, and enriches the event with the data stored under it.
Vector configuration
Example Configurations:
No Local Caching
sources:
  s3_logs:
    type: aws_s3
    region: us-west-2
    compression: "gzip"
    decoding:
      codec: "json"
    sqs:
      queue_url: https://sqs.us-west-2.amazonaws.com/*/tmp-dkutta
transforms:
  enrich_data:
    inputs: ["s3_logs"]
    type: redis
    url: "redis://127.0.0.1:6379/0"
    key: "{{ application }}"
    output_field: "app_metadata"
sinks:
  black_hole:
    type: blackhole
    inputs:
      - enrich_data
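To make the intended behavior of the configuration above concrete, here is a minimal Python sketch of the enrichment step. It is not the transform's actual Rust implementation: the `render_key` and `enrich` helpers are hypothetical, a plain dict stands in for Redis, and both the JSON encoding of stored values and the pass-through behavior on a miss are assumptions.

```python
import json
import re

# Matches a simple "{{ field }}" placeholder. Real Vector templates support
# more syntax; this pattern is a simplification for illustration.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_]+)\s*\}\}")


def render_key(template, event):
    """Render a key template against an event; return None if a field is missing."""
    missing = False

    def substitute(match):
        nonlocal missing
        value = event.get(match.group(1))
        if value is None:
            missing = True
            return ""
        return str(value)

    key = _PLACEHOLDER.sub(substitute, template)
    return None if missing else key


def enrich(event, store, key_template, output_field):
    """Return a copy of the event with the looked-up value under output_field."""
    enriched = dict(event)
    key = render_key(key_template, event)
    raw = store.get(key) if key is not None else None
    if raw is not None:
        # Assumption: values are stored as JSON strings.
        enriched[output_field] = json.loads(raw)
    return enriched


if __name__ == "__main__":
    # A dict stands in for the Redis server in this sketch.
    fake_redis = {"checkout": '{"team": "payments", "tier": 1}'}
    for app in ("checkout", "unknown"):
        event = {"application": app, "message": "ok"}
        print(enrich(event, fake_redis, "{{ application }}", "app_metadata"))
```

In this sketch an event whose key is absent from the store is returned unchanged, which mirrors the test setup where only some events had matching keys.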
Enable LRU Caching:
transforms:
  enrich_data_cache:
    inputs: ["s3_logs"]
    type: redis
    url: "redis://127.0.0.1:6379/0"
    key: "{{ application }}"
    output_field: "app_metadata"
    cache_max_size: 10000
    cache_ttl: 10000
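As a rough illustration of what a size-bounded LRU cache with expiry does in front of Redis, here is a small Python sketch. It is not the PR's implementation: the `LruTtlCache` class and `cached_lookup` helper are hypothetical, a dict stands in for Redis, and the sketch assumes `cache_max_size` is an entry count. The unit of `cache_ttl` is not stated above, so the sketch simply uses whatever unit the injected clock returns.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Bounded LRU cache whose entries also expire after `ttl` clock units."""

    def __init__(self, max_size, ttl, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self.clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used


def cached_lookup(cache, store, key):
    """Check the cache first and fall back to the backing store on a miss."""
    value = cache.get(key)
    if value is None:
        value = store.get(key)  # stands in for a Redis round trip
        if value is not None:
            cache.put(key, value)
    return value


if __name__ == "__main__":
    cache = LruTtlCache(max_size=10000, ttl=10000)
    fake_redis = {"checkout": '{"team": "payments"}'}
    print(cached_lookup(cache, fake_redis, "checkout"))  # fetched from the store
    print(cached_lookup(cache, fake_redis, "checkout"))  # served from the cache
```

Note that this sketch caches hits only; whether misses should also be cached is a design choice that depends on how often keys are absent.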
How did you test this PR?
Test Setup:
- Local machine: M3 MacBook
- Local Redis running in Docker
- ~70% of log events had associated keys in Redis with content to enrich the event with.
I ran local builds of Vector to validate functionality and performance.
Results:
Using cargo run:
- No Transform: ~100k logs/s
- Remap Transform: ~90k logs/s (simple; just adds a property to each event)
- Redis Transform no Cache: ~70k logs/s
- Redis Transform w/ Cache: ~65k logs/s
  - This surprised me; I expected higher throughput from minimizing network I/O. The cache will likely be more useful for Redis clusters where lookups involve more network I/O.
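For a quick sense of scale, the approximate throughput figures above can be turned into relative drops against the no-transform baseline. The `relative_drop` helper below is purely illustrative arithmetic on those rounded numbers, not part of the benchmark itself.

```python
# Approximate throughput figures reported above, in logs per second.
throughput = {
    "No Transform": 100_000,
    "Remap Transform": 90_000,
    "Redis Transform no Cache": 70_000,
    "Redis Transform w/ Cache": 65_000,
}


def relative_drop(baseline, measured):
    """Percentage decrease in throughput relative to a baseline."""
    return (baseline - measured) / baseline * 100


if __name__ == "__main__":
    baseline = throughput["No Transform"]
    for name, rate in throughput.items():
        print(f"{name}: {relative_drop(baseline, rate):.0f}% below baseline")
```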
Using the release-built artifact:
- No Transform: (~90% CPU | 315 MB)
- Remap Transform: (~130% CPU | 560 MB)
- Redis Transform no cache: (~280% CPU | 618 MB)
- Redis Transform w/ Cache: (~250% CPU | 670 MB)
In release mode, the performance differences became negligible against a locally hosted Redis server. CPU utilization for the Redis transform was roughly double that of a simple remap.
Change Type
- [ ] Bug fix
- [x] New feature
- [ ] Non-functional (chore, refactoring, docs)
- [ ] Performance
Is this a breaking change?
- [ ] Yes
- [x] No
Does this PR include user facing changes?
- [x] Yes. Please add a changelog fragment based on our guidelines.
- [ ] No. A maintainer will apply the no-changelog label to this PR.
References
Notes
- Please read our Vector contributor resources.
- Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
- Some CI checks run only after we manually approve them.
  - We recommend adding a pre-push hook, please see this template.
  - Alternatively, we recommend running the following locally before pushing to the remote branch:
    - make fmt
    - make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
    - make test
- After a review is requested, please avoid force pushes to help us review incrementally.
  - Feel free to push as many commits as you want. They will be squashed into one before merging.
  - For example, you can run git merge origin master and git push.
- If this PR introduces changes Vector dependencies (modifies Cargo.lock), please run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.
Hi there, thanks for this PR. Adding a "do not merge" from the docs team until the Vector team approves these changes. Let us know once this is ready :)