
[DBNode] Peer Bootstrap: Metadata Fetching: Error in fetch request not reported to log anywhere

Open asafm opened this issue 5 years ago • 3 comments

In the Peer Bootstrapper, a goroutine is started to continuously request metadata, page by page, from a specific peer, for a specific shard, until no more metadata exists. In case of an error, metrics are incremented, but the error itself is never reported to the log, neither on the caller side nor on the server side. The problem is aggravated by the fact that the function will run forever if the error is retryable. Without the log, all you can do is helplessly watch the error metrics increase, with no way to diagnose or remedy the situation.

The fix should probably be rate-limited logging of the error, possibly rate-limited per type of error.

Location of problem:

			for condition() {
				var err error
				currPageToken, err = s.streamBlocksMetadataFromPeer(namespace, shardID,
					peer, start, end, currPageToken, metadataCh, resultOpts, progress)

at src/dbnode/client/session.go:2364

asafm avatar Jan 13 '21 20:01 asafm

@asafm -- sounds good, feel free to propose a fix and we'd be happy to take a look.

gibbscullen avatar Jan 28 '21 16:01 gibbscullen

@gibbscullen Before I sit down to write the fix, it seems right that a maintainer would confirm the approach, since I'm not yet proficient with the code base. Please keep this issue open if you can, as this is a real bug.

asafm avatar Oct 19 '21 07:10 asafm

@asafm Re-opening. We will look into this in the next couple of weeks.

gibbscullen avatar Oct 20 '21 20:10 gibbscullen