
3.4.15 etcd fails to come up after it restarts due to OOM

Open hasethuraman opened this issue 3 years ago • 4 comments

What happened?

The pod restarted due to OOM (caused by repeated reads of the keys), but it could not come up because of the error below on the first restart. As it kept restarting, it hit error #2 and then error #3. [Image attached in the original issue showing the contents of the data folder.]

Error#1

 etcdserver: another etcd process is using "/bitnami/etcd/data/member/snap/db"

Error#2

2022-08-16 06:13:01.651461 I | etcdserver: recovered store from snapshot at index 9996582
2022-08-16 06:13:11.843463 W | etcdserver: another etcd process is using "/bitnami/etcd/data/member/snap/db" and holds the file lock, or loading backend file is taking >10 seconds
2022-08-16 06:13:11.843492 W | etcdserver: waiting for it to exit before starting...
2022-08-16 06:14:16.000464 I | etcdserver: restarting member a56f244b7c49cab4 in cluster 84df8dbe8cd6f08d at commit index 9996590
raft2022/08/16 06:14:16 INFO: a56f244b7c49cab4 switched to configuration voters=(3480202197398638468 11777633691344020509 11920786645299350196)
raft2022/08/16 06:14:16 INFO: a56f244b7c49cab4 became follower at term 12
raft2022/08/16 06:14:16 INFO: newRaft a56f244b7c49cab4 [peers: [304c2975aa7acf84,a3728f71133ba41d,a56f244b7c49cab4], term: 12, commit: 9996590, applied: 9996582, lastindex: 9996590, lastterm: 12]
2022-08-16 06:14:16.000696 I | etcdserver/api: enabled capabilities for version 3.4
2022-08-16 06:14:16.000713 I | etcdserver/membership: added member a56f244b7c49cab4 [http://etcd-0.etcd-headless.svc.cluster.local:2380] to cluster 84df8dbe8cd6f08d from store
2022-08-16 06:14:16.000719 I | etcdserver/membership: added member 304c2975aa7acf84 [http://etcd-1.etcd-headless.svc.cluster.local:2380] to cluster 84df8dbe8cd6f08d from store
2022-08-16 06:14:16.000724 I | etcdserver/membership: added member a3728f71133ba41d [http://etcd-2.etcd-headless.svc.cluster.local:2380] to cluster 84df8dbe8cd6f08d from store
2022-08-16 06:14:16.000731 I | etcdserver/membership: set the cluster version to 3.4 from store
2022-08-16 06:14:17.111662 C | etcdmain: database file (/bitnami/etcd/data/member/snap/db) of the backend is missing

Error#3

2022-08-16 06:17:04.493452 I | etcdserver: recovered store from snapshot at index 9996582
2022-08-16 06:17:04.622643 C | etcdserver: recovering backend from snapshot error: failed to find database snapshot file (snap: snapshot file doesn't exist)
panic: recovering backend from snapshot error: failed to find database snapshot file (snap: snapshot file doesn't exist)
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0xc0050e]

goroutine 1 [running]:
go.etcd.io/etcd/etcdserver.NewServer.func1(0xc0002d8e30, 0xc0002d6d80)
        /tmp/etcd-release-3.4.15/etcd/release/etcd/etcdserver/server.go:334 +0x3e
panic(0xeeed20, 0xc00003acd0)
        /usr/local/go/src/runtime/panic.go:522 +0x1b5
github.com/coreos/pkg/capnslog.(*PackageLogger).Panicf(0xc0001a3aa0, 0x10cb052, 0x2a, 0xc0002d6e50, 0x1, 0x1)
        /root/go/pkg/mod/github.com/coreos/[email protected]/capnslog/pkg_logger.go:75 +0x135
go.etcd.io/etcd/etcdserver.NewServer(0xc00004018a, 0xd, 0x0, 0x0, 0x0, 0x0, 0xc000242c00, 0x1, 0x1, 0xc000242d80, ...)
        /tmp/etcd-release-3.4.15/etcd/release/etcd/etcdserver/server.go:464 +0x433c
go.etcd.io/etcd/embed.StartEtcd(0xc000292000, 0xc0002b2000, 0x0, 0x0)
        /tmp/etcd-release-3.4.15/etcd/release/etcd/embed/etcd.go:214 +0x988
go.etcd.io/etcd/etcdmain.startEtcd(0xc000292000, 0x109fe59, 0x6, 0x1, 0xc0001e5140)
        /tmp/etcd-release-3.4.15/etcd/release/etcd/etcdmain/etcd.go:302 +0x40
go.etcd.io/etcd/etcdmain.startEtcdOrProxyV2()
        /tmp/etcd-release-3.4.15/etcd/release/etcd/etcdmain/etcd.go:144 +0x2ef9
go.etcd.io/etcd/etcdmain.Main()
        /tmp/etcd-release-3.4.15/etcd/release/etcd/etcdmain/main.go:46 +0x38
main.main()
        /tmp/etcd-release-3.4.15/etcd/release/etcd/main.go:28 +0x20

What did you expect to happen?

etcd should have auto-recovered and come up.

How can we reproduce it (as minimally and precisely as possible)?

  1. Insert around 1 million key-values, enough to take the DB size to 6.5 GB (with the backend quota bytes set to the 8 GiB maximum).
  2. Read all keys using the etcdctl command `ETCDCTL_API=3 ./etcdctl --endpoints=$POD get / --prefix` (a paginated alternative is sketched after this list).
  3. The pod will hit OOM and restart. You will then see the enclosed log sequence after every restart.
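
The OOM in step 3 comes from serving the entire ~6.5 GB keyspace in a single range response. For comparison, a paginated read keeps each response bounded. Below is a minimal sketch using the Go clientv3 client; the endpoint and page size are illustrative, and the import path assumes the 3.5 client module layout:

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://127.0.0.1:2379"}, // illustrative endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	const pageSize = 1000 // illustrative page size
	key := "/"
	end := clientv3.GetPrefixRangeEnd("/") // exclusive end of the "/" prefix
	total := 0

	for {
		// Fetch one bounded page of the range instead of the whole keyspace.
		resp, err := cli.Get(context.Background(), key,
			clientv3.WithRange(end),
			clientv3.WithSort(clientv3.SortByKey, clientv3.SortAscend),
			clientv3.WithLimit(pageSize))
		if err != nil {
			panic(err)
		}
		total += len(resp.Kvs)
		if !resp.More || len(resp.Kvs) == 0 {
			break
		}
		// Resume just after the last key of this page.
		last := string(resp.Kvs[len(resp.Kvs)-1].Key)
		key = last + "\x00"
	}
	fmt.Println("keys read:", total)
}
```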

Anything else we need to know?

I will try to repro in 3.5.

Etcd version (please run commands below)

3.4.15

Etcd configuration (command line flags or environment variables)

(not provided)

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

(not provided)

Relevant log output

No response

hasethuraman avatar Aug 16 '22 07:08 hasethuraman

Looks like the previous etcd wasn't terminated when the new one was running, so it (the new one) can't open the backend db file because only one instance can open it.
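
For context, the warning in error #2 is etcd waiting for an exclusive file lock on its bbolt backend file. A minimal sketch of that locking behaviour using go.etcd.io/bbolt directly (the file path is illustrative, not the real etcd data dir):

```go
// Demonstrates the exclusive file lock bbolt (etcd's backend) takes on db files.
package main

import (
	"fmt"
	"time"

	bolt "go.etcd.io/bbolt"
)

func main() {
	const path = "/tmp/demo.db" // illustrative path

	// First open succeeds and holds the file lock.
	db1, err := bolt.Open(path, 0600, &bolt.Options{Timeout: time.Second})
	if err != nil {
		panic(err)
	}
	defer db1.Close()

	// A second open against the same file cannot acquire the lock while the
	// first handle is held; with a timeout it should fail instead of blocking,
	// which is what etcd reports after waiting 10 seconds.
	_, err = bolt.Open(path, 0600, &bolt.Options{Timeout: time.Second})
	fmt.Println("second open:", err)
}
```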

ahrtr avatar Aug 18 '22 02:08 ahrtr

No, there is no other process running, and the last error is also different.

hasethuraman avatar Aug 19 '22 08:08 hasethuraman

Please try to reproduce this issue on main or release-3.5, thx

ahrtr avatar Aug 22 '22 01:08 ahrtr

@ahrtr when I try to repro this in 3.5, I am hitting another issue which is blocking me.

I have a 3-node cluster (one of the nodes has ID 304c2975aa7acf84) and inserted 10M key-values.

I don't know when the issue started; the etcd instance with the above-mentioned ID is restarting continuously. The reason is that it cannot find itself in the cluster, since it was removed. I can't find any clue in the other instances' logs, and this particular instance's log has rolled over.

// log from the restarting instance
{"level":"warn","ts":"2022-09-01T15:55:25.002Z","caller":"rafthttp/http.go:426","msg":"ignored streaming request; ID mismatch","local-member-id":"304c2975aa7acf84","remote-peer-id-stream-handler":"304c2975aa7acf84","remote-peer-id-header":"5e660eb091633ac","remote-peer-id-from":"5d89364353d73c1e","cluster-id":"84df8dbe8cd6f08d"}

I am trying to figure out how to recover the cluster at this point. Is there a way to do it without rebuilding the cluster, e.g. by updating the ID or something similar?
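
For reference, the standard member-replacement flow for a member that was removed but still has stale data is: remove its old ID from the cluster, wipe that member's data directory, and re-add it so it rejoins with a fresh ID. A sketch with the Go client follows; the endpoints, hostnames, and the hex member ID are placeholders taken from the logs and must be substituted for your own, and the rejoining member should be restarted with `--initial-cluster-state=existing` after its data dir is cleared:

```go
package main

import (
	"context"
	"fmt"
	"strconv"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Connect through a member that is still healthy (placeholder endpoint).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://etcd-0.etcd-headless.svc.cluster.local:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Drop the stale member (hex ID from the logs, parsed to uint64).
	id, err := strconv.ParseUint("304c2975aa7acf84", 16, 64)
	if err != nil {
		panic(err)
	}
	if _, err := cli.MemberRemove(ctx, id); err != nil {
		fmt.Println("member remove:", err) // it may already have been removed
	}

	// Re-add the member under its peer URL; it will be assigned a new ID.
	resp, err := cli.MemberAdd(ctx,
		[]string{"http://etcd-1.etcd-headless.svc.cluster.local:2380"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("re-added member, new ID: %x\n", resp.Member.ID)
}
```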

hasethuraman avatar Sep 01 '22 16:09 hasethuraman

Please provide detailed steps on how to reproduce the issue you mentioned in 3.5 above.

ahrtr avatar Oct 13 '22 05:10 ahrtr

I've been watching the issues here as I would like to use etcd for a new project, and it seems to me that it is expected that memory usage increases with load/get activity.

This is not reasonable behaviour, as it is a sure-fire way to make whatever service relies on etcd susceptible to DoS by OOM. It's not a small bug you're looking for, but a design/architecture flaw (perhaps allocating memory on each get request is unsound). This was confirmed some time ago when the developers argued that the GC would eventually free the memory, and therefore it doesn't matter how much memory etcd allocates before the GC event. It almost sounded like a challenge.

I would expect an upper bound on memory use, regardless of load, that scales with the database size (for a given dataset, memory use should never grow above, e.g., 2x the dataset size). I'm unfamiliar with Go, but in C this would be accomplished by mmap'ing the data and keeping the indexes on the heap. The data would be sent to the socket directly from the mmap, with no malloc.
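
As a rough illustration of the pattern described above (map the data file and write to the socket straight from the mapping, with little per-request heap allocation), here is a minimal Go sketch. It is not how etcd serves ranges today; serveFile and the use of net.Pipe are illustrative only:

```go
package main

import (
	"io"
	"net"
	"os"

	"golang.org/x/sys/unix"
)

// serveFile maps path read-only and writes the mapping to conn without first
// copying it onto the Go heap. (Illustrative helper, not an etcd API.)
func serveFile(conn net.Conn, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		return err
	}

	data, err := unix.Mmap(int(f.Fd()), 0, int(fi.Size()),
		unix.PROT_READ, unix.MAP_SHARED)
	if err != nil {
		return err
	}
	defer unix.Munmap(data)

	_, err = conn.Write(data) // pages are faulted in from the mapping on demand
	return err
}

func main() {
	// In-memory connection pair just to exercise serveFile.
	client, server := net.Pipe()
	go func() {
		defer server.Close()
		_ = serveFile(server, os.Args[1]) // path to any non-empty file
	}()
	n, _ := io.Copy(io.Discard, client)
	println("bytes served:", n)
}
```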

My question is: has this behaviour been introduced recently? Is there an older etcd version that did not allocate memory in its main message loop?

drok avatar Dec 03 '22 20:12 drok

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 18 '23 09:03 stale[bot]