bazel-remote Add support for debugging cache misses

I think we could add support for debugging cache misses between two machines to the remote cache.

1. Start the cache in debug mode.
2. It tells you to run the first build.
3. You tell it once finished.
4. It tells you to run the second build.
5. You tell it once finished.
6. It then finds the actions that had the same inputs but produced (one or more) different outputs and prints them in a human-readable format (i.e. the command, the environment, the input/output file names with hashes).
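
To make the last step concrete, here is a rough sketch of the comparison in Python. It is not bazel-remote code: `ActionRecord`, its fields, and `find_divergent_actions` are hypothetical names, and it assumes each build has already been recorded as a list of actions with their command, environment, inputs and outputs. It also assumes "same inputs" means the command and environment match too.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionRecord:
    # Hypothetical record of one executed action; not a bazel-remote type.
    command: str
    env: tuple      # sorted (name, value) pairs
    inputs: tuple   # sorted (path, hash) pairs
    outputs: tuple  # sorted (path, hash) pairs

    def key(self):
        # Assumption: two actions are "the same" if command, env and inputs match.
        return (self.command, self.env, self.inputs)


def find_divergent_actions(first_build, second_build):
    """Return (first, second) pairs whose inputs match but whose outputs differ."""
    by_key = {action.key(): action for action in first_build}
    divergent = []
    for action in second_build:
        earlier = by_key.get(action.key())
        if earlier is not None and earlier.outputs != action.outputs:
            divergent.append((earlier, action))
    return divergent


def describe(first, second):
    """Format one divergent pair in a human-readable way."""
    lines = ["command: " + first.command]
    lines += ["env: %s=%s" % pair for pair in first.env]
    lines += ["input: %s %s" % pair for pair in first.inputs]
    lines += ["output (1st build): %s %s" % pair for pair in first.outputs]
    lines += ["output (2nd build): %s %s" % pair for pair in second.outputs]
    return "\n".join(lines)
```

Actions that only appear in one of the two builds are ignored here; a real tool would probably want to report those as well.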

This can detect and help fix the following kinds of errors:

Find non-determinism in the build, either on the same or different machines.
Tell you exactly the command(s) that produced different outputs on different machines. Once you know which commands produce different outputs, it comes down to checking whether the versions of the tools match, etc.

Thoughts?

cc: @nicolov @jgavris @BenTheElder

Aug 14 '18 08:08 buchgr

SGTM

Aug 14 '18 17:08 BenTheElder

👍 Sounds great!

Aug 14 '18 17:08 jgavris

I suggest that the comparison should be done based on the output paths, rather than the inputs. By matching outputs across builds, we can:

Compare the input files, both contents and paths. For example, globs might pick up different files on different machines - ask me how I know.
If the inputs match, compare the output contents to find non-determinism in the execution itself.
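
A rough sketch of the output-path approach, in Python. The data layout is an assumption for illustration only: each build is a dict mapping an output path to a pair of (input path to hash dict, output hash), and `compare_by_output_path` is a hypothetical name.

```python
def compare_by_output_path(first_build, second_build):
    """Classify output paths that appear in both builds.

    Each build maps: output path -> (inputs, output_hash), where inputs is a
    dict of input path -> content hash. This layout is hypothetical.
    """
    report = {}
    for path in sorted(first_build.keys() & second_build.keys()):
        inputs_a, out_a = first_build[path]
        inputs_b, out_b = second_build[path]
        if inputs_a != inputs_b:
            # Input paths or contents differ, e.g. a glob matched other files.
            changed = sorted(
                p for p in inputs_a.keys() | inputs_b.keys()
                if inputs_a.get(p) != inputs_b.get(p)
            )
            report[path] = ("inputs differ", changed)
        elif out_a != out_b:
            # Same inputs, different output: the execution is non-deterministic.
            report[path] = ("non-deterministic", [])
    return report
```

Outputs that match in both inputs and contents are left out of the report, so an empty result means no difference was found for the shared paths.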

Aug 16 '18 07:08 nicolov

@nicolov great idea with output paths!

Aug 16 '18 12:08 buchgr

cc @petemounce

Aug 20 '18 16:08 buchgr

I am planning to start working on this this weekend. While this tool would use lots of code from the remote cache, I am thinking it should probably be its own binary. Thoughts?

Aug 23 '18 12:08 buchgr