
`No alive nodes. All the 1 nodes seem to be down.` in `fulltextsearch:reset` but `fulltextsearch:stop` is working

cdhermann opened this issue 1 year ago

As described in https://github.com/nextcloud/all-in-one/discussions/4481, my Nextcloud logs are flooded with error messages like `Retry 0: cURL error 6: Could not resolve host: nextcloud-aio-fulltextsearch (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://elastic:XXX@nextcloud-aio-fulltextsearch:9200/nextcloud-aio/_search?from=0&size=5&_source_excludes=content`.

BUT, the search itself works perfectly fine.
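
For reference, the failing lookup can be reproduced directly from inside the Nextcloud container (a diagnostic sketch, assuming the curl CLI is available in the image; the container and host names are taken from the log line above):

# run on the Docker host; any HTTP response (even a 401) proves DNS and the
# TCP connection work, while "Could not resolve host" reproduces the bug
sudo docker exec nextcloud-aio-nextcloud curl -v http://nextcloud-aio-fulltextsearch:9200/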

I followed the instructions on https://github.com/nextcloud/all-in-one/discussions/2509 to reset the Elasticsearch container. Unfortunately, the command `php occ fulltextsearch:reset` gives the following error:

# sudo docker exec --user www-data -it nextcloud-aio-nextcloud bash
0aed8be8812d:/var/www/html$ php occ fulltextsearch:stop
stopping all running indexes
0aed8be8812d:/var/www/html$ php occ fulltextsearch:reset
WARNING! You are about to reset your indexed documents:
- provider: ALL
- collection: ALL

Do you really want to reset your indexed documents ? (y/N) y

WARNING! This operation is not reversible.
Please confirm this destructive operation by typing 'reset ALL ALL': reset ALL ALL

In SimpleNodePool.php line 77:

  No alive nodes. All the 1 nodes seem to be down.


fulltextsearch:reset [--output [OUTPUT]] [--provider PROVIDER] [--collection COLLECTION]

The search is still working perfectly fine.

  • How can I make the fulltextsearch app itself believe it is working fine?
  • Why does the stop command work if a follow-up reset command doesn't find any alive nodes?
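
For reference, once a reset eventually succeeds, the index would normally be rebuilt with the fulltextsearch app's standard indexing command:

php occ fulltextsearch:index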

cdhermann avatar Apr 02 '24 13:04 cdhermann

I have also had these two messages in the log file since Nextcloud AIO v8.0.0 or maybe later:

Update: For me the search doesn't work.

"Warning – fulltextsearch – platform seems down. we will update index next cron tick"

"Error – fulltextsearch_elasticsearch – Retry 0: cURL error 28: Failed to connect to nextcloud-aio-fulltextsearch port 9200 after 133728 ms: Couldn't connect to server (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://elastic:34f24033d69c0c6f926da89ccc316385d69f983115480922@nextcloud-aio-fulltextsearch:9200/nextcloud-aio/_doc/files%3A19974"

The messages have not disappeared with Nextcloud AIO v8.1.0 either.

Debian v12.5 as a Linux container in Proxmox v8.1.10, 4 cores, 4 GB RAM

lachermeierm avatar Apr 08 '24 06:04 lachermeierm

Related: https://github.com/nextcloud/all-in-one/discussions/4290

karthikiyengar avatar Apr 20 '24 01:04 karthikiyengar

Elasticsearch doesn't like special characters in the password. Try using only uppercase and lowercase letters and numbers. For me, it solved the `All the 1 nodes seem to be down` issue.
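
(For reference: the credentials the fulltextsearch app uses are embedded in its elastic_host URL, the same app setting the script later in this thread rewrites, so the current value can be inspected with occ:)

sudo docker exec --user www-data nextcloud-aio-nextcloud php occ config:app:get fulltextsearch_elasticsearch elastic_host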

HajasDS avatar Apr 22 '24 20:04 HajasDS

@HajasDS Thank you for the tip. Which password do I need to change? I have only created one user, and that password consists only of upper- and lower-case letters and numbers. Elasticsearch has also worked with a complex password before; I tested it, and unfortunately it didn't work.

lachermeierm avatar Apr 23 '24 07:04 lachermeierm

I'm having the same issue (discussed here: https://github.com/nextcloud/all-in-one/discussions/5064#discussioncomment-10190373). So far, no solution.

qudiqudi avatar Aug 02 '24 12:08 qudiqudi

I tried to install nextcloud-aio manually, following the instructions at https://github.com/nextcloud/all-in-one/tree/main/manual-install, but I got the same error:

(screenshot of the same `No alive nodes` error)

After googling, I found a similar issue at https://github.com/curl/curl/issues/11104.

In the compose file, I set the `FULLTEXTSEARCH_HOST` variable to `nextcloud-aio-fulltextsearch.nextcloud-aio` (`nextcloud-aio` is the name of the network to which the containers are connected).

After that, Nextcloud was able to successfully connect to the Elasticsearch container.
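
The relevant excerpt of the compose file might look like this (a sketch; the variable name comes from the manual-install compose file, while the service and network names are assumed to match the AIO defaults):

services:
  nextcloud-aio-nextcloud:
    environment:
      - FULLTEXTSEARCH_HOST=nextcloud-aio-fulltextsearch.nextcloud-aio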

Lumpiness avatar Sep 11 '24 22:09 Lumpiness

Setting the FULLTEXTSEARCH_HOST variable worked for me, too.

Tepitus avatar Sep 19 '24 18:09 Tepitus

Looks like this isn't going to work with a full AIO install (docker-in-docker), only with a manual install.

qudiqudi avatar Sep 20 '24 06:09 qudiqudi

I was able to resolve through the UI with a full AIO install.

Under Admin -> Full Text Search, append `.nextcloud-aio` to the end of the URL in the "Address of the Servlet" setting:

@nextcloud-aio-fulltextsearch.nextcloud-aio:9200

mlh5599 avatar Sep 21 '24 02:09 mlh5599

> I was able to resolve through the UI with a full AIO install.
>
> Under Admin -> Full Text Search, append `.nextcloud-aio` to the end of the URL in the "Address of the Servlet" setting:
>
> @nextcloud-aio-fulltextsearch.nextcloud-aio:9200

Yes, that works. However, the change is reset by a container restart, i.e. at the next update, so we really need ~~the devs to step in here~~ to fix it in the container definition. I PR'ed something; not sure if that is enough. Let's see.

qudiqudi avatar Sep 21 '24 10:09 qudiqudi

In the meantime I created a script to detect and correct the setting; I'm simply running it hourly. It is a hack, but a simple one, and it works until the root problem is fixed.

#!/bin/bash

# Read the current Elasticsearch host URL from the fulltextsearch app config
current_setting=$(docker exec --user www-data nextcloud-aio-nextcloud php occ config:app:get fulltextsearch_elasticsearch elastic_host)

# If the URL still uses the short host name, rewrite it to the fully-qualified one
if [[ $current_setting == *"nextcloud-aio-fulltextsearch:9200"* ]]; then
    corrected_setting="${current_setting/nextcloud-aio-fulltextsearch:9200/nextcloud-aio-fulltextsearch.nextcloud-aio:9200}"
    echo "Replacing substring 'nextcloud-aio-fulltextsearch:9200' with 'nextcloud-aio-fulltextsearch.nextcloud-aio:9200'"
    echo "Before replacement: $current_setting"
    echo "After replacement: $corrected_setting"

    # Quote the value: the URL contains characters the shell would otherwise mangle
    docker exec --user www-data nextcloud-aio-nextcloud php occ config:app:set fulltextsearch_elasticsearch elastic_host --value="$corrected_setting"
fi
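
To run it hourly, a crontab entry along these lines works (the script path here is an assumption):

# run the fix every hour, on the hour
0 * * * * /usr/local/bin/fix-fulltextsearch-host.sh >/dev/null 2>&1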

mlh5599 avatar Sep 22 '24 12:09 mlh5599

Hi, just to be clear: this is a bug in Alpine's curl implementation: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15690

szaimen avatar Sep 25 '24 10:09 szaimen

There is an existing workaround which, run as a cron job, simply fixes the running container: https://gist.github.com/cdhermann/e9ca864c2653b84982c090e20153771a

cdhermann avatar Sep 25 '24 11:09 cdhermann