Bug in the "This exclude may cause the total free space in the cluster to drop below 10%." check with the memory engine (FDB 6.2.20)
I believe we've found a bug in the "This exclude may cause the total free space in the cluster to drop below 10%." check when using the memory engine on FDB 6.2.20.
Background
Replacing 4 DBs with 4 new DBs with newer AMI but identical FDB version and configuration:
- Instances: 4x i3.2xlarge (x2 during replacement)
- FoundationDB version: fdbserver --version reports "FoundationDB 6.2 (v6.2.20) source version 77b5171e81754f2fda8869703d662e59d85b7f23 protocol fdb00b062010001"
When we attempt to exclude the first of the older 4 DBs, the exclude fails with this error:
db> exclude 10.19.0.157
ERROR: This exclude may cause the total free space in the cluster to drop below 10%.
Type `exclude FORCE <ADDRESS>*' to exclude without checking free space.
Status output before attempting the exclude:
Using cluster file `/etc/foundationdb/memory.cluster'.
Configuration:
Redundancy mode - double
Storage engine - memory-2
Coordinators - 3
Cluster:
FoundationDB processes - 24
Zones - 8
Machines - 8
Memory availability - 14.4 GB per process on machine with least available
Retransmissions rate - 1 Hz
Fault Tolerance - 1 machine
Server time - 04/12/21 21:04:17
Data:
Replication health - Healthy (Repartitioning)
Moving data - 0.008 GB
Sum of key-value sizes - 6.249 GB
Disk space used - 134.657 GB
Operating space:
Storage server - 12.1 GB free on most full server
Log server - 287.6 GB free on most full server
Workload:
Read rate - 1783 Hz
Write rate - 31174 Hz
Transactions started - 442 Hz
Transactions committed - 302 Hz
Conflict rate - 0 Hz
Backup and DR:
Running backups - 0
Running DRs - 0
Client time: 04/12/21 21:04:03
Where we think the bug is:
Code: https://github.com/apple/foundationdb/blob/6.2.20/fdbcli/fdbcli.actor.cpp#L2150-L2154
if( ssExcludedCount==ssTotalCount || (1-worstFreeSpaceRatio)*ssTotalCount/(ssTotalCount-ssExcludedCount) > 0.9 ) {
printf("ERROR: This exclude may cause the total free space in the cluster to drop below 10%%.\n"
"Type `exclude FORCE <ADDRESS>*' to exclude without checking free space.\n");
return true;
}
In the scenario described in the Background:
- With the memory engine:
- ssExcludedCount is 2 (the storage server processes on the node being excluded; we exclude one DB node at a time, and each node runs 2 storage processes)
- ssTotalCount is 8 (total storage server processes: 2 per node * 4 nodes = 8)
- which means that if worstFreeSpaceRatio is below 32.5%, we see this error message, since (1 - 0.325) * 8 / 6 = 0.9 is exactly the threshold.
But FDB calculates the free space ratio from disk usage, even for the memory storage engine:
worstFreeSpaceRatio = double(free_bytes) / total_bytes
i.e. free bytes on disk / total bytes on disk
When we checked the status json output, the disk section for this process gives free_bytes / total_bytes = 380603060224 / 1869103546368, roughly 20%, which is below the 32.5% threshold:
"command_line" : "/usr/sbin/fdbserver --class=storage --cluster_file=/etc/foundationdb/memory.cluster --datadir=/mnt/fdb/4600 --knob_max_shard_bytes=100000000 --listen_address=public --logdir=/mnt/logs/fdb --memory=22GiB --public_address=auto:4600 --storage_memory=14GiB",
"disk" : {
"busy" : 0.85438799999999993,
"free_bytes" : 380603060224,
"reads" : {
"counter" : 559764735,
"hz" : 1192.3800000000001,
"sectors" : 48120
},
"total_bytes" : 1869103546368,
"writes" : {
"counter" : 610911231,
"hz" : 1063.79,
"sectors" : 143504
}
},
Summary
- We'd suggest that for the memory storage engine, the check could take memory usage into consideration rather than disk usage.
- If this is changed/improved in a newer version, please let us know.
The memory storage engine also uses disk to store data, so it needs to make sure there is sufficient space on disk when excluding something. It does seem like a problem, though, that it doesn't account for available storage memory at all.
That said, there are some significant issues with the current disk calculation. One is that it assumes that every process in the cluster is using as much space as the most full process. Second, it's not accounting for how much space the processes are using, but only how much free space the disk has. That means if you are using disks that have other data on them, this calculation is going to be even more wrong. Third, files on disk can actually have a bunch of empty pages in them, meaning that when moved they will consume less space.
The net result is that this check is very pessimistic. There are many cases where it will warn about free space even though the situation isn't actually concerning. This isn't necessarily a show-stopping problem, since you can force the exclude after evaluating the situation manually, but it would be quite nice if this computation could be made more accurate.