ansible/openwhisk.yml fails while waiting for the kafka server to start up
Environment details:
- local deployment
- Ubuntu 16.04
- Docker version 19.03.13, build 4484c46d9d
Steps to reproduce the issue:
- cd tools/ubuntu-setup && ./all.sh
- ansible-playbook setup.yml ; ansible-playbook prereq.yml (with environment variables set up for CouchDB)
- ./gradlew distDocker
- ansible-playbook initdb.yml ; ansible-playbook wipe.yml
- ansible-playbook openwhisk.yml
Provide the actual results and outputs:
TASK [kafka : wait until the kafka server started up] ***********************************************************************************************************
Tuesday 01 December 2020 14:03:16 -0600 (0:00:27.886) 0:00:49.298 ******
FAILED - RETRYING: wait until the kafka server started up (10 retries left).
FAILED - RETRYING: wait until the kafka server started up (9 retries left).
FAILED - RETRYING: wait until the kafka server started up (8 retries left).
FAILED - RETRYING: wait until the kafka server started up (7 retries left).
FAILED - RETRYING: wait until the kafka server started up (6 retries left).
FAILED - RETRYING: wait until the kafka server started up (5 retries left).
FAILED - RETRYING: wait until the kafka server started up (4 retries left).
FAILED - RETRYING: wait until the kafka server started up (3 retries left).
FAILED - RETRYING: wait until the kafka server started up (2 retries left).
FAILED - RETRYING: wait until the kafka server started up (1 retries left).
fatal: [kafka0]: FAILED! => {"attempts": 10, "changed": true, "cmd": "(echo dump; sleep 1) | nc 172.17.0.1 2181 | grep /brokers/ids/0", "delta": "0:00:01.005511", "end": "2020-12-01 14:04:20.335370", "msg": "non-zero return code", "rc": 1, "start": "2020-12-01 14:04:19.329859", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
[FAILED]
> (echo dump; sleep 1) | nc 172.17.0.1 2181 | grep /brokers/ids/0
non-zero return code
PLAY RECAP ******************************************************************************************************************************************************
kafka0 : ok=9 changed=3 unreachable=0 failed=1
Additional information you deem important:
- docker ps (not sure why kafka keeps restarting??)
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
3980b79c4ad0 wurstmeister/kafka:2.12-2.3.1 "start-kafka.sh" 9 minutes ago Restarting (1) Less than a second ago kafka0
471187e2ba20 zookeeper:3.4 "/docker-entrypoint.…" 10 minutes ago Up 10 minutes 0.0.0.0:2181->2181/tcp, 0.0.0.0:2888->2888/tcp, 0.0.0.0:3888->3888/tcp zookeeper0
- Tried this on a fresh Ubuntu 18.04 with same setup steps, no problem found.
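The failing check can be reproduced outside Ansible by wrapping the command from the task output in a small retry loop. This is a minimal sketch: `retry` and `check_broker` are hypothetical helper names, and the address 172.17.0.1:2181 is copied from the output above.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep only between attempts, not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Same pipeline as the Ansible task: ask ZooKeeper for its dump and look
# for the broker registration (address taken from the task output).
check_broker() {
  (echo dump; sleep 1) | nc 172.17.0.1 2181 | grep -q /brokers/ids/0
}

# Usage:
#   retry 10 5 check_broker || echo "broker 0 never registered"
```

If the loop never succeeds while `docker ps` shows kafka0 restarting, the broker is probably exiting before it can register itself in ZooKeeper.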
Any chance you're out of disk space?
You can check the kafka logs. Another possible reason is that kafka isn't able to reach zookeeper, which would mean a networking issue.
Try sudo ifconfig lo0 alias 172.17.0.1/24.
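The first two checks suggested here can be done roughly as follows. This is only a sketch: `disk_usage_over` is a hypothetical helper, the 90% threshold is arbitrary, and the container name kafka0 is taken from the docker ps output in the report.

```shell
# disk_usage_over THRESHOLD: read `df -P` output on stdin and print the
# mount point and usage of every filesystem at or above THRESHOLD percent.
disk_usage_over() {
  local threshold="$1"
  awk -v t="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)           # strip the percent sign from the Capacity column
    if (use + 0 >= t) print $6 " " $5
  }'
}

# Usage:
#   df -P | disk_usage_over 90        # any output means a nearly full filesystem
#   docker logs --tail 50 kafka0      # last lines logged before the container restarts
```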
I am getting this same error, and it seems to be a problem of kafka not being able to keep a stable connection to zookeeper. Using Ubuntu 16.01
Relevant kafka log section:
[2021-04-01 17:54:53,847] INFO Initiating client connection, connectString=172.17.0.1:2181 sessionTimeout=6000 watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@7e0b85f9 (org.apache.zookeeper.ZooKeeper)
[2021-04-01 17:54:53,892] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2021-04-01 17:54:53,898] INFO Opening socket connection to server 172.17.0.1/172.17.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2021-04-01 17:54:59,896] INFO [ZooKeeperClient Kafka server] Closing. (kafka.zookeeper.ZooKeeperClient)
[2021-04-01 17:54:59,902] WARN Client session timed out, have not heard from server in 6012ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-04-01 17:55:00,009] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
[2021-04-01 17:55:00,012] INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-04-01 17:55:00,014] INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient)
[2021-04-01 17:55:00,019] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:258)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:254)
at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:112)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1826)
at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:364)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:387)
at kafka.server.KafkaServer.startup(KafkaServer.scala:207)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:84)
at kafka.Kafka.main(Kafka.scala)
[2021-04-01 17:55:00,022] INFO shutting down (kafka.server.KafkaServer)
[2021-04-01 17:55:00,032] INFO shut down completed (kafka.server.KafkaServer)
[2021-04-01 17:55:00,034] ERROR Exiting Kafka. (kafka.server.KafkaServerStartable)
[2021-04-01 17:55:00,039] INFO shutting down (kafka.server.KafkaServer)
I didn't dig much into this case since I found no issues on Ubuntu 18.04 (with the same scripts). Maybe you can try a more up-to-date OS.
Mian
@rabbah I have the same issue.
I hit this error when trying to alias lo. Do you know how to fix it?
:/$ sudo ifconfig lo alias 172.17.0.1/24
alias: Host name lookup failure
ifconfig: `--help' gives usage information.
OS: Ubuntu 22.04.1 LTS
ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 18592 bytes 2881713 (2.8 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 18592 bytes 2881713 (2.8 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
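The `lo0 alias` form looks like BSD/macOS ifconfig syntax. The Linux net-tools ifconfig appears to treat `alias` as an address to resolve, which would explain the "Host name lookup failure". On a Linux host Docker normally puts its bridge gateway address on docker0, so it may be worth checking whether 172.17.0.1 is already assigned before adding anything to lo. A minimal sketch, where `iface_for_addr` is a hypothetical helper:

```shell
# iface_for_addr ADDR: read `ip -o -4 addr show` output on stdin and print
# the name of every interface that carries ADDR (prefix length ignored).
iface_for_addr() {
  local addr="$1"
  awk -v a="$addr" '{
    split($4, parts, "/")       # $4 looks like 172.17.0.1/16
    if (parts[1] == a) print $2
  }'
}

# Usage:
#   ip -o -4 addr show | iface_for_addr 172.17.0.1
```

If this prints docker0, the address already exists on the host and no alias should be needed; the connection problem is then more likely a firewall or routing issue between the containers and the host.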
According to the logs, you need to check the health of zookeeper first. Is your zookeeper accessible from other containers?
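One way to test that from another container is to send ZooKeeper's `ruok` four-letter command and look for the `imok` reply. This is a sketch under assumptions: the busybox image is just a convenient throwaway container with nc, `zk_ok` is a hypothetical helper, and four-letter commands are assumed to be enabled on the zookeeper:3.4 image.

```shell
# zk_ok: succeed only if the reply read from stdin is ZooKeeper's "imok".
zk_ok() {
  local reply
  reply=$(cat)
  [ "$reply" = "imok" ]
}

# Usage (address taken from the kafka log above):
#   docker run --rm busybox sh -c '(echo ruok; sleep 1) | nc 172.17.0.1 2181' | zk_ok \
#     && echo "zookeeper reachable from a container" \
#     || echo "zookeeper NOT reachable from a container"
```

An empty reply here, combined with the session timeout in the kafka log, would point at the network path between containers and the host rather than at kafka itself.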