Amazon S3: Mismatch when reading HTTP header from GCS

Open gouyelliot opened this issue 1 year ago • 0 comments

Bug Report

Describe the bug

While configuring our FluentBit instance, I hit a case where the Amazon S3 Output Plugin blocks for several minutes each time it tries to upload a file.

I'm using Google Cloud Storage with an HMAC key (set with env vars), with the endpoint option set to https://storage.googleapis.com.

Here is the log from when the plugin tries to send the data:

fluent-bit[1343]: [2024/05/03 03:23:21] [debug] [upstream] KA connection #28 to storage.googleapis.com:443 is connected
fluent-bit[1343]: [2024/05/03 03:23:21] [debug] [http_client] not using http_proxy for header
fluent-bit[1343]: [2024/05/03 03:23:21] [debug] [aws_credentials] Requesting credentials from the env provider..
---
Here FluentBit blocks for 4 minutes...
---
fluent-bit[1343]: [2024/05/03 03:27:21] [error] [http_client] broken connection to storage.googleapis.com:443 ?
fluent-bit[1343]: [2024/05/03 03:27:21] [debug] [upstream] KA connection #28 to storage.googleapis.com:443 is now available
fluent-bit[1343]: [2024/05/03 03:27:21] [debug] [output:s3:s3-bids] PutObject http status=200
fluent-bit[1343]: [2024/05/03 03:27:21] [ info] [output:s3:s3-bids] Successfully uploaded object /source=sspengine/type=improvedigital_bids/year=2024/month=05/day=03/fluentd-aggregator-1-ams-testing-03-N3Fki4QI.json.gz

After digging into the source code, I found that the problem comes from the header_lookup function, which gets the value of a header from the HTTP response.

It turns out that Google returns a custom HTTP header named x-goog-stored-content-length, which header_lookup matches instead of the real Content-Length header. As a result, the client tries to read from the socket again and only gives up when the read times out after 4 minutes.

Here is an example of an HTTP response from GCS:

HTTP/1.1 200 OK
ETag: "f75bc68bd2645e669b5208da00ea3e02"
x-goog-generation: 1714721454001939
x-goog-metageneration: 1
x-goog-hash: crc32c=HBNrRA==
x-goog-hash: md5=91vGi9JkXmabUgjaAOo+Ag==
x-amz-checksum-crc32c: HBNrRA==
x-goog-stored-content-length: 1973
x-goog-stored-content-encoding: gzip
Vary: Origin
X-GUploader-UploadID: ABPtcPrl0G_stANkY8LXdqEaWL9nGpZjkCYHNFAyBZYlpvHDqJ0gfRAEkKsEM79BWkfhnoMC56g
Content-Length: 0
Date: Fri, 03 May 2024 07:30:54 GMT
Server: UploadServer
Content-Type: text/html; charset=UTF-8
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
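
To make the mismatch concrete, here is a minimal Python sketch of how a lookup that searches for the header name anywhere in the raw response can pick up the wrong header. This is only an illustration: naive_header_lookup is a hypothetical function written for this report, not fluent-bit's actual C code, and the header block is shortened from the response above.

```python
# Hypothetical illustration only; this is NOT fluent-bit's header_lookup.
# A shortened copy of the GCS response headers shown above.
RESPONSE_HEADERS = (
    "HTTP/1.1 200 OK\r\n"
    "x-goog-stored-content-length: 1973\r\n"
    "x-goog-stored-content-encoding: gzip\r\n"
    "Content-Length: 0\r\n"
    "Content-Type: text/html; charset=UTF-8\r\n"
    "\r\n"
)


def naive_header_lookup(raw, name):
    """Return the value following the first case-insensitive occurrence of
    `name`, even when that occurrence is inside a longer header name."""
    pos = raw.lower().find(name.lower())
    if pos == -1:
        return None
    start = pos + len(name)
    end = raw.find("\r\n", start)
    # Drop the ':' separator and surrounding whitespace.
    return raw[start:end].lstrip(": ").strip()


# The first match is inside "x-goog-stored-content-length", so the value
# returned is the stored object size, not the real Content-Length of 0.
naive_content_length = naive_header_lookup(RESPONSE_HEADERS, "Content-Length")
print(naive_content_length)
```

With a non-zero length like this, a client would reasonably keep waiting for a body that the server never sends.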

To Reproduce

Here is my current config:

[SERVICE]
    flush        5
    grace        30
    daemon       Off
    log_level    debug
    parsers_file /path/to/parsers.conf

[INPUT]
    Name   tail
    Parser test
    Path   /path/to/test.log

[OUTPUT]
    Name            s3
    Alias           s3-bids
    Match           *
    bucket          my-bucket
    compression     gzip
    upload_timeout  1m
    store_dir       /tmp/fluentbit/log
    use_put_object  On
    retry_limit     3
    total_file_size 250M
    content_type    application/json
    region          auto
    storage_class   STANDARD
    endpoint        https://storage.googleapis.com
    s3_key_format   /source=test/year=%Y/month=%m/day=%d/test-log-%H-$UUID.json.gz

Expected behavior

The HTTP client should not use the x-goog-stored-content-length header as the content length of the response.
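
As a rough sketch of the expected behavior (in Python, and not the actual fix for fluent-bit's C code), a lookup could compare the complete field name of each header line instead of searching for a substring. The name strict_header_lookup and the shortened sample response are assumptions made for illustration.

```python
# Hypothetical illustration only; this is NOT fluent-bit's header_lookup.
SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "x-goog-stored-content-length: 1973\r\n"
    "Content-Length: 0\r\n"
    "Server: UploadServer\r\n"
    "\r\n"
)


def strict_header_lookup(raw, name):
    """Return the value of the header whose full field name equals `name`
    (case-insensitive), or None if no such header exists."""
    head = raw.split("\r\n\r\n", 1)[0]
    # Skip the status line, then inspect one header line at a time.
    for line in head.split("\r\n")[1:]:
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == name.lower():
            return value.strip()
    return None


strict_content_length = strict_header_lookup(SAMPLE_RESPONSE, "content-length")
print(strict_content_length)
```

Because the whole field name is compared, x-goog-stored-content-length no longer shadows Content-Length, while it can still be looked up under its own name.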

I'll try to create a PR next week; the bug is actually not hard to fix!

Your Environment

  • Version used: 3.0.3
  • Configuration: See above
  • Environment name and version (e.g. Kubernetes? What version?):
  • Server type and version: x64 Intel CPU
  • Operating System and version: AlmaLinux 9
  • Filters and plugins: No filters; Amazon S3 output plugin

gouyelliot · May 03 '24 10:05