feat(transform): Add redis enrichment transformer

Open akutta opened this issue 2 months ago • 1 comments

Summary

Adds a redis transform that renders a templated key from each event, looks that key up in Redis, and writes the returned data to a configurable output field on the event.

Vector configuration

Example Configurations:

No Local Caching

sources:
  s3_logs:
    type: aws_s3
    region: us-west-2
    compression: "gzip"
    decoding:
      codec: "json"
    sqs:
      queue_url: https://sqs.us-west-2.amazonaws.com/*/tmp-dkutta

transforms:
  enrich_data:
    inputs: ["s3_logs"]
    type: redis
    url: "redis://127.0.0.1:6379/0"
    key: "{{ application }}"
    output_field: "app_metadata"

sinks:
  black_hole:
    type: blackhole
    inputs:
      - enrich_data
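For reviewers, the lookup flow that the key and output_field options describe can be sketched roughly as below. This is a Python illustration, not the Rust code in this PR: render_key and enrich are hypothetical names, a plain dict stands in for Redis, only top-level `{{ field }}` placeholders are handled, and leaving the event unchanged on a missing field or missing key is an assumption rather than confirmed behavior.

```python
import re
from typing import Any, Callable, Dict, Optional

# Matches simple `{{ field }}` placeholders (top-level fields only, an assumption).
_TEMPLATE_FIELD = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_key(template: str, event: Dict[str, Any]) -> Optional[str]:
    """Fill placeholders from the event; return None if any field is absent."""
    missing = False

    def substitute(match: "re.Match[str]") -> str:
        nonlocal missing
        name = match.group(1)
        if name not in event:
            missing = True
            return ""
        return str(event[name])

    rendered = _TEMPLATE_FIELD.sub(substitute, template)
    return None if missing else rendered


def enrich(
    event: Dict[str, Any],
    key_template: str,
    output_field: str,
    lookup: Callable[[str], Optional[Any]],
) -> Dict[str, Any]:
    """Return a copy of the event with the looked-up value under output_field."""
    key = render_key(key_template, event)
    if key is None:
        return event  # assumed: events without the templated field pass through
    value = lookup(key)
    if value is None:
        return event  # assumed: a missing key leaves the event unchanged
    return {**event, output_field: value}


# A dict stands in for the Redis server in this sketch.
store = {"checkout": {"team": "payments"}}
event = {"application": "checkout", "message": "ok"}
print(enrich(event, "{{ application }}", "app_metadata", store.get))
```

With a real server, lookup would be replaced by a Redis GET against the configured url.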

Enable LRU Caching:

transforms:
  enrich_data_cache:
    inputs: ["s3_logs"]
    type: redis
    url: "redis://127.0.0.1:6379/0"
    key: "{{ application }}"
    output_field: "app_metadata"
    cache_max_size: 10000
    cache_ttl: 10000
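As a rough illustration of what cache_max_size and cache_ttl control, here is a minimal Python sketch of an LRU cache with per-entry expiry placed in front of a lookup function. It is not the implementation in this PR: TtlLruCache and cached_lookup are hypothetical names, the TTL is expressed in whatever units the injected clock returns (the unit of cache_ttl is not stated above), and skipping the cache for missing keys is an assumption.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional, Tuple


class TtlLruCache:
    """Bounded cache that evicts the least recently used entry and expires old ones."""

    def __init__(self, max_size: int, ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock
        # key -> (expires_at, value); insertion order tracks recency of use.
        self._entries: "OrderedDict[str, Tuple[float, Any]]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used


def cached_lookup(cache: TtlLruCache, key: str,
                  fetch: Callable[[str], Optional[Any]]) -> Optional[Any]:
    """Serve from the cache when possible; otherwise fetch and remember the value."""
    value = cache.get(key)
    if value is None:
        value = fetch(key)
        if value is not None:  # assumed: misses are not cached
            cache.put(key, value)
    return value
```

In this sketch every key that is absent from the backing store still triggers a fetch, so with only part of the events having matching keys, a cache of this shape would not remove all round trips.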

How did you test this PR?

Test Setup:

Local machine: M3 MacBook
Local Redis running in Docker
- ~70% of log events had matching keys in Redis with content to enrich the event.

I ran local builds of Vector to validate functionality and performance.

Results:

using cargo run

No Transform: ~100k logs/s
Remap Transform: ~90k logs/s (simple; it just adds a property to each event)
Redis Transform no Cache: ~70k logs/s
Redis Transform w/ Cache: ~65k logs/s
- This surprised me: I expected the cache to raise throughput by reducing network I/O. The cache will likely be more useful with Redis clusters that incur higher network I/O.
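To put the cargo run numbers on a common scale, the short Python calculation below expresses each throughput as a percentage drop from the no-transform baseline. The inputs are the approximate figures quoted above, so the results are approximate too; relative_drop is a helper name introduced only for this illustration.

```python
# Approximate throughput figures (logs/s) from the cargo run results above.
BASELINE = 100_000
RESULTS = {
    "Remap Transform": 90_000,
    "Redis Transform no Cache": 70_000,
    "Redis Transform w/ Cache": 65_000,
}


def relative_drop(baseline: int, measured: int) -> float:
    """Percentage throughput lost relative to the baseline."""
    return (baseline - measured) * 100 / baseline


for name, measured in RESULTS.items():
    print(f"{name}: ~{relative_drop(BASELINE, measured):.0f}% below baseline")
```

On these figures the remap costs about 10%, the uncached Redis transform about 30%, and the cached one about 35%.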

using release built artifact

No Transform: (~90% cpu | 315 MB)
Remap Transform: (~130% cpu | 560 MB)
Redis Transform no cache: (~280% cpu | 618 MB)
Redis Transform w/ Cache: (~250% cpu | 670 MB)

In release mode, the performance differences became negligible against a locally hosted Redis server. CPU utilization for the Redis transform was roughly double that of a simple remap.
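The "roughly double" figure can be checked directly from the release-build numbers. The Python snippet below uses only the approximate CPU and memory values listed above; the variable names are introduced for this illustration.

```python
# Approximate (cpu_percent, memory_mb) pairs from the release-build results above.
REMAP = (130, 560)
REDIS_NO_CACHE = (280, 618)
REDIS_WITH_CACHE = (250, 670)


def cpu_ratio(candidate: tuple, reference: tuple) -> float:
    """CPU utilization of the candidate relative to the reference."""
    return candidate[0] / reference[0]


def memory_delta_mb(candidate: tuple, reference: tuple) -> int:
    """Extra memory (MB) used by the candidate compared with the reference."""
    return candidate[1] - reference[1]


for name, run in [("no cache", REDIS_NO_CACHE), ("w/ cache", REDIS_WITH_CACHE)]:
    print(
        f"Redis {name}: {cpu_ratio(run, REMAP):.2f}x remap CPU, "
        f"+{memory_delta_mb(run, REMAP)} MB"
    )
```

This gives about 2.15x the remap CPU without the cache and about 1.92x with it, with 58 MB and 110 MB of additional memory respectively.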

Change Type

[ ] Bug fix
[x] New feature
[ ] Non-functional (chore, refactoring, docs)
[ ] Performance

Is this a breaking change?

[ ] Yes
[x] No

Does this PR include user facing changes?

[x] Yes. Please add a changelog fragment based on our guidelines.
[ ] No. A maintainer will apply the no-changelog label to this PR.

References

Notes

- Please read our Vector contributor resources.
- Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
- Some CI checks run only after we manually approve them.
  - We recommend adding a pre-push hook, please see this template.
  - Alternatively, we recommend running the following locally before pushing to the remote branch:
    - make fmt
    - make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
    - make test
- After a review is requested, please avoid force pushes to help us review incrementally.
  - Feel free to push as many commits as you want. They will be squashed into one before merging.
  - For example, you can run git merge origin master and git push.
- If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

Dec 05 '25 01:12 akutta

Hi there, thanks for this PR. Adding a do-not-merge from the docs team until the Vector team approves these changes. Let us know once this is ready :)

Dec 05 '25 21:12 iadjivon