fluent-plugin-opensearch

SSL_read: unexpected eof while reading (OpenSSL::SSL::SSLError)

Open ecerulm opened this issue 1 year ago • 2 comments

(check apply)

  • [ ] read the contribution guideline
  • [ ] (optional) already reported 3rd party upstream repository or mailing list if you use k8s addon or helm charts.

Steps to replicate

With the following /etc/fluent/fluentd.conf:

<source>
  @type forward
  @id input_forward
</source>
<source>
  @type unix
</source>

<match debug.**>
  @type stdout
  @id output_stdout
</match>

<match microservices.**>
  @type opensearch_data_stream
  data_stream_name logs-microservices
  data_stream_template_name logs-template
  ssl_verify true
  ca_file /etc/ssl/certs/ca-certificates.crt
  with_transporter_log true
  reload_on_failure true
  <buffer>
    flush_interval 5s
  </buffer>

  <endpoint>
    url https://opensearch.xxxxxx.com
    region us-east-1
    assume_role_arn arn:aws:iam::xxxxxxx:role/MasterUserRole
    assume_role_session_name microservices-ec2
    refresh_credentials_interval 5h # default is 5h (five hours).
  </endpoint>
</match>

On startup I get the following error:

2024-02-22 22:52:52 +0000 [warn]: #0 Could not communicate to OpenSearch, resetting connection and trying again. SSL_read: unexpected eof while reading (OpenSSL::SSL::SSLError)

Sometimes it resolves itself after a few retries, but sometimes it gets stuck on startup for a long time (I never waited more than 10 minutes, though).

If I run sudo systemctl stop fluentd.service, fluentd receives the graceful shutdown signal but takes a long time to actually shut down. It seems that it can't shut down while a retry is pending.

After the shutdown, if I start it again, the SSL_read problem disappears, which suggests to me that the problem is not on the AWS OpenSearch side.

Expected Behavior or What you need to ask

I would expect the connection to succeed, or the plugin to provide a better description of what is going on.

...

Using Fluentd and OpenSearch plugin versions

  • OS version: Ubuntu 22.04 LTS
  • Bare Metal or within Docker or Kubernetes or others?
  • Fluentd v1.0 or later
    • fluent-package 5.0.2 fluentd 1.16.3 (d3cf2e0f95a0ad88b9897197db6c5152310f114f)
  • OpenSearch plugin version
2024-02-22 23:21:03 +0000 [info]: gem 'fluentd' version '1.16.3'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-calyptia-monitoring' version '0.1.3'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '5.4.0'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-flowcounter-simple' version '0.1.0'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-kafka' version '0.19.2'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-metrics-cmetrics' version '0.1.2'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-opensearch' version '1.1.4'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-prometheus' version '2.1.0'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-prometheus_pushgateway' version '0.1.1'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.1.1'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.4.0'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-s3' version '1.7.2'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-sd-dns' version '0.1.0'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.5'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-td' version '1.2.0'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-utmpx' version '0.5.0'
2024-02-22 23:21:03 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.5.0'
fluent-gem list

*** LOCAL GEMS ***

abbrev (default: 0.1.1)
addressable (2.8.5)
async (1.31.0)
async-http (0.61.0)
async-io (1.38.0)
async-pool (0.4.0)
aws-eventstream (1.2.0)
aws-partitions (1.785.0)
aws-sdk-core (3.178.0)
aws-sdk-kms (1.71.0)
aws-sdk-s3 (1.129.0)
aws-sdk-sqs (1.61.0)
aws-sigv4 (1.6.0)
base64 (0.2.0, default: 0.1.1)
benchmark (default: 0.2.1)
bigdecimal (default: 3.1.3)
bindata (2.4.15)
bundler (default: 2.4.10, 2.3.26)
cgi (default: 0.3.6)
cmetrics (0.3.3)
concurrent-ruby (1.2.2)
console (1.23.2)
cool.io (1.8.0)
csv (default: 3.2.6)
date (default: 3.3.3)
debug (1.7.1)
delegate (default: 0.3.0)
did_you_mean (default: 1.6.3)
digest (default: 3.1.1)
digest-crc (0.6.5)
digest-murmurhash (1.1.1)
drb (default: 2.1.1)
elastic-transport (8.3.0)
elasticsearch (8.8.0)
elasticsearch-api (8.8.0)
english (default: 0.7.2)
erb (default: 4.0.2)
error_highlight (default: 0.5.1)
etc (default: 1.4.2)
excon (0.109.0, 0.104.0)
faraday (2.7.12)
faraday-excon (2.1.0)
faraday-net_http (3.0.2)
faraday_middleware-aws-sigv4 (1.0.1)
fcntl (default: 1.0.2)
ffi (1.15.5)
fiber-annotation (0.2.0)
fiber-local (1.0.0)
fiddle (default: 1.1.1)
fileutils (1.7.2, default: 1.7.0)
find (default: 0.1.1)
fluent-config-regexp-type (1.0.0)
fluent-diagtool (1.0.3)
fluent-logger (0.9.0)
fluent-plugin-calyptia-monitoring (0.1.3)
fluent-plugin-elasticsearch (5.4.0)
fluent-plugin-flowcounter-simple (0.1.0)
fluent-plugin-kafka (0.19.2)
fluent-plugin-metrics-cmetrics (0.1.2)
fluent-plugin-opensearch (1.1.4)
fluent-plugin-prometheus (2.1.0)
fluent-plugin-prometheus_pushgateway (0.1.1)
fluent-plugin-record-modifier (2.1.1)
fluent-plugin-rewrite-tag-filter (2.4.0)
fluent-plugin-s3 (1.7.2)
fluent-plugin-sd-dns (0.1.0)
fluent-plugin-systemd (1.0.5)
fluent-plugin-td (1.2.0)
fluent-plugin-utmpx (0.5.0)
fluent-plugin-webhdfs (1.5.0)
fluentd (1.16.3)
forwardable (default: 1.3.3)
getoptlong (default: 0.2.0)
hirb (0.7.3)
http_parser.rb (0.8.0)
httpclient (2.8.3)
io-console (default: 0.6.0)
io-nonblock (default: 0.2.0)
io-wait (default: 0.3.0)
ipaddr (default: 1.2.5)
irb (default: 1.6.2)
jmespath (1.6.2)
json (default: 2.6.3)
linux-utmpx (0.3.0)
logger (default: 1.5.3)
ltsv (0.1.2)
matrix (0.4.2)
mini_portile2 (2.8.2)
minitest (5.16.3)
msgpack (1.7.2)
multi_json (1.15.0)
mutex_m (default: 0.1.2)
net-ftp (0.2.0)
net-http (default: 0.3.2)
net-imap (0.3.4)
net-pop (0.1.2)
net-protocol (default: 0.2.1)
net-smtp (0.3.3)
nio4r (2.6.1)
nkf (default: 0.1.2)
observer (default: 0.1.1)
oj (3.16.1)
open-uri (default: 0.3.0)
open3 (default: 0.1.2)
opensearch-api (2.2.0)
opensearch-ruby (2.1.0)
opensearch-transport (2.1.0)
openssl (default: 3.1.0)
optparse (default: 0.3.1)
ostruct (default: 0.5.5)
parallel (1.20.1)
pathname (default: 0.2.1)
power_assert (2.0.3)
pp (default: 0.4.0)
prettyprint (default: 0.1.1)
prime (0.1.2)
prometheus-client (2.1.0)
protocol-hpack (1.4.2)
protocol-http (0.25.0)
protocol-http1 (0.16.0)
protocol-http2 (0.15.1)
pstore (default: 0.1.2)
psych (default: 5.0.1)
public_suffix (5.0.4)
racc (default: 1.6.2)
rake (13.1.0, 13.0.6)
rbs (2.8.2)
rdkafka (0.12.0)
rdoc (default: 6.5.0)
readline (default: 0.0.3)
readline-ext (default: 0.1.5)
reline (default: 0.3.2)
resolv (default: 0.2.2)
resolv-replace (default: 0.1.1)
rexml (3.2.6, 3.2.5)
rinda (default: 0.1.1)
rss (0.2.9)
ruby-kafka (1.5.0)
ruby-progressbar (1.13.0)
ruby2_keywords (default: 0.0.5)
rubyzip (1.3.0)
securerandom (default: 0.2.2)
serverengine (2.3.2)
set (default: 1.0.3)
shellwords (default: 0.1.0)
sigdump (0.2.5)
singleton (default: 0.1.1)
stringio (default: 3.0.4)
strptime (0.2.5)
strscan (default: 3.0.5)
syntax_suggest (default: 1.0.2)
syslog (default: 0.1.1)
systemd-journal (1.4.2)
td (0.17.1)
td-client (1.0.8)
td-logger (0.3.28)
tempfile (default: 0.1.3)
test-unit (3.5.7)
time (default: 0.2.2)
timeout (default: 0.3.1)
timers (4.3.5)
tmpdir (default: 0.1.3)
traces (0.11.1)
tsort (default: 0.1.1)
typeprof (0.21.3)
tzinfo (2.0.6)
tzinfo-data (1.2023.3)
un (default: 0.2.1)
uri (0.12.2, default: 0.12.1)
weakref (default: 0.1.2)
webhdfs (0.10.2)
webrick (1.8.1)
yajl-ruby (1.4.3)
yaml (default: 0.2.1)
zip-zip (0.3)
zlib (default: 3.0.0)
  • OpenSearch version : AWS OpenSearch engine_version = "OpenSearch_2.11"
  • OpenSearch template(s) (optional)

ecerulm, Feb 22 '24 23:02

I've been trying with

      reconnect_on_error true
      reload_on_failure true
      resurrect_after 5s
      ssl_version TLSv1_3

and I still see the problem during startup, I would say about 50% of the time.

But using typhoeus makes the problem disappear.

I installed faraday-typhoeus with

sudo apt install libcurl4-openssl-dev
sudo fluent-gem install faraday-typhoeus

and then added the following to the configuration

      http_backend typhoeus

and that seems to work: I restarted fluentd 10 times and haven't seen the SSL_read: unexpected eof while reading error.
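
For anyone who wants to repeat the restart test above, a rough way to check each run is to count the warning lines in the fluentd log. This is only a sketch: count_ssl_eof is a hypothetical helper, not something shipped with fluentd or the plugin, and the log path in the example comment is an assumption that may differ on your installation.

```shell
#!/bin/sh
# Hypothetical helper (not part of fluentd or fluent-plugin-opensearch):
# count how many lines of a log file contain the SSL_read warning.
count_ssl_eof() {
  # -c prints the number of matching lines; -F treats the pattern as a fixed string.
  # grep exits non-zero when nothing matches, so "|| true" keeps the
  # function usable in scripts that run under "set -e".
  grep -c -F 'SSL_read: unexpected eof while reading' "$1" || true
}

# Example (the log path is an assumption; adjust it to your installation):
#   count_ssl_eof /var/log/fluent/fluentd.log
```

A count that stays at 0 across several restarts would be consistent with the typhoeus result described above, while a count that grows after a restart would suggest the error is still occurring.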

Configuration:

    <match wordpress.**>
      @type opensearch_data_stream
      data_stream_name logs-wordpress
      data_stream_template_name logs-template

      ssl_verify true
      ssl_version TLSv1_3
      ca_file /etc/ssl/certs/ca-certificates.crt

      with_transporter_log true
      @log_level debug

      reconnect_on_error true
      reload_on_failure true
      resurrect_after 5s

      http_backend typhoeus

      <buffer>
        flush_interval 5s
        retry_type periodic
        retry_wait 10s
        retry_max_times 15
      </buffer>

      <endpoint>
        url https://opensearch.rubenlaguna.com
        region us-east-1
        assume_role_arn arn:aws:iam::625310699998:role/MasterUserRole
        assume_role_session_name wordpress-ec2
        refresh_credentials_interval 5h # default is 5h (five hours).
      </endpoint>
    </match>

ecerulm, Feb 23 '24 18:02