Clients fail to send read requests to all replicas

Open. ajbeamon opened this issue 5 years ago • 1 comment

We recently encountered a situation where a two-storage-server team was hot with reads. At some point, one of these storage servers started experiencing noticeably more reads than the other. We decided to exclude the worse one, and after data movement completed we ended up in a state where only a single storage server was hot. A subsequent exclude of the new process had the same outcome.

Eventually we bounced the clients, and load started to distribute evenly between the two storage servers again.

ajbeamon, Apr 01 '20 21:04

Is there a way to reproduce this behavior in our simulation tests? I'm thinking out loud: if we record in a client which SS sent each reply, and check that the number of replies from each SS in a team is roughly the same, we might be able to reproduce it.

xumengpanda, Apr 02 '20 04:04
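
The check proposed above could be sketched roughly as follows. This is a standalone illustration, not FoundationDB code: the function name `replies_roughly_balanced`, the `tolerance` parameter, and the server labels are all hypothetical, and it assumes the client can log the source storage server of each read reply.

```python
from collections import Counter


def replies_roughly_balanced(reply_sources, team, tolerance=0.2):
    """Return True if every server in `team` sent about an equal share of replies.

    reply_sources: iterable of server ids, one entry per read reply received.
    team: the server ids that make up the storage team being checked.
    tolerance: allowed relative deviation from the even share (hypothetical default).
    """
    counts = Counter(reply_sources)
    total = sum(counts[server] for server in team)
    if total == 0:
        return True  # no replies recorded, so there is nothing to compare
    expected = total / len(team)
    # A server that sent no replies has a count of 0 and fails the check.
    return all(abs(counts[server] - expected) <= tolerance * expected for server in team)


team = ["ss1", "ss2"]
balanced = ["ss1", "ss2"] * 50
skewed = ["ss1"] * 90 + ["ss2"] * 10
print(replies_roughly_balanced(balanced, team))
print(replies_roughly_balanced(skewed, team))
```

A simulation workload could run a check like this at the end of a read-heavy phase and fail when one replica of a team receives a disproportionate share of reads. What counts as "roughly the same" would need tuning against normal load-balancing variance.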