prometheus scrape + discovery: Required changes for serverless use

Rebases on top of @ridwanmsharif changes https://github.com/GoogleCloudPlatform/prometheus/pull/122 with following updates:

Simplified discovery, I think we can always skip wait and make it more robust.
Adjusted TestManagerStopAfterScrapeAttempt to simulate bit more how external users might use it.
Removed one test, we test it TestManagerStopAfterScrapeAttempt already
Cleaned up comments.
Fixed lock bug that was here already.

More details on locking:

Checked -race on TestManagerStopAfterScrapeAttempt and indeed without your locks.

It produced RACE, I would argue it's not possible as everything is guarded by mtxScrape:

// reload -> ApplyConfig mtxScrape // stop -> Stop/StopAfterScrapeAttempt/ApplyConfig mtxScrape // sync -> Sync -> manager reload mtxScrape

Turned out - there was bug in our locking (mtxScrape Unlock was too soon). Fixed, and now no more locks are needed.

BTW @ridwanmsharif epic work on DiscoveryReloadOnStartup and SkipInitialWait in discovery. Super nice solutions! I changed it only slightly, but feel free to reject 🤗

Nov 22 '23 14:11 bwplotka

Fuzzing issue seems unrelated, I would ignore.

Nov 22 '23 14:11 bwplotka

Looks like races on my discovery changes. They might also occur on your changes, just we don't test those with this special flag 🤔

Nov 22 '23 14:11 bwplotka

Merged https://github.com/GoogleCloudPlatform/prometheus/pull/122 for now as this needs some love.

The race is on test (using discoveryManager.targets without lock while .Run is till there), but I need to verify we won't have some other consequences for non sidecar cases... t

Nov 22 '23 18:11 bwplotka