Option to calculate the number of actions in execution phase

Open rahul-malik opened this issue 8 years ago • 20 comments

Description of the problem / feature request / question:

Feature Request: Add an option to calculate the number of actions that will be performed in the execution phase.

Problem: The progress indication from Bazel does not give you a sense of the current progress because the number of actions typically increases throughout the execution phase.

It would be helpful to have a way to know how many actions will be performed, so we can develop better UI/UX for progress that is less confusing for developers who work on a code base built with Bazel but are not familiar with its internals.

If possible, provide a minimal example to reproduce the problem:

Build any large project.

Environment info

  • Operating System: macOS

  • Bazel version (output of bazel info release): 0.5.3

Have you found anything relevant by searching the web?

(e.g. StackOverflow answers, GitHub issues, email threads on the bazel-discuss Google group) No

Anything else, information or logs or outputs that would be helpful?

(If they are large, please upload as attachment or provide link).

rahul-malik avatar Aug 18 '17 00:08 rahul-malik

A big +1 for this pain. Our developers say that the ever incrementing action count is really confusing.

ittaiz avatar Aug 18 '17 02:08 ittaiz

+1

softprops avatar Aug 18 '17 03:08 softprops

We could probably just change it to report a percentage, if that's less confusing? The "current actions / total actions" should be monotonically increasing already.

philwo avatar Sep 12 '17 13:09 philwo

We've tried converting it to a percentage, but it still seems strange: in my experience, the completed count nearly reaches the total before the total increases again, which gives the false impression that the build is almost complete.
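A small arithmetic sketch of that effect. The `(completed, known_total)` samples are hypothetical numbers chosen for illustration, not measurements from a real build, and `percent_complete` is an illustrative helper, not part of Bazel:

```python
def percent_complete(completed, known_total):
    """Percentage computed against the total known so far, not the final total."""
    return 100.0 * completed / known_total


# Hypothetical progress samples: (actions completed, actions discovered so far).
samples = [(90, 100), (95, 400), (300, 400)]

percentages = [percent_complete(done, total) for done, total in samples]

for (done, total), pct in zip(samples, percentages):
    print(f"{done}/{total} actions -> {pct:.2f}%")
```

The first sample reads as 90% even though, in this made-up sequence, three quarters of the work has not been discovered yet, and the percentage then drops when the total jumps.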

rahul-malik avatar Sep 12 '17 14:09 rahul-malik

I agree. The important part (if possible) is that the total number of actions will be fixed at the beginning.

ittaiz avatar Sep 12 '17 18:09 ittaiz

The problem is that Skyframe does not walk the action graph eagerly; it walks it lazily. The reason for that is performance, since the action graph can be rather large and walking it was previously a blocking operation (where Bazel would just hang for some time). The downside is that all threads that walk the action graph block on the actions that they execute, which delays discovery of the remaining actions. That's why the number keeps going up during the build.

I'd be interested in seeing an attempt at concurrent action discovery, i.e., a thread that has the sole purpose of walking the action graph in parallel to the (many) threads that do execution. That would make the action count go up to the 'final' number more quickly. The question is whether it would add too much code complexity or have a significant performance impact.

Also, even if we do that there'll still be increases due to flaky test re-runs, which we most likely wouldn't count unless we have to do them, and we only know that when a test actually fails.
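A toy model of the behavior described above, as a sketch only. `GRAPH`, `lazy_progress`, and `eager_progress` are hypothetical names and the graph is made up; none of this reflects Skyframe's real data structures or evaluation order. It only shows how the known total keeps growing when discovery is interleaved with execution, and stays fixed when the whole graph is walked first:

```python
from collections import deque

# Hypothetical toy graph: each action maps to the actions that only become
# known once that action has been processed by a worker.
GRAPH = {
    "app": ["lib_a", "lib_b"],
    "lib_a": ["gen_1", "gen_2"],
    "lib_b": ["gen_3"],
    "gen_1": [],
    "gen_2": [],
    "gen_3": [],
}


def lazy_progress(graph, root):
    """Return (completed, known_total) pairs when discovery and execution share a worker."""
    known = {root}
    queue = deque([root])
    completed = 0
    progress = []
    while queue:
        action = queue.popleft()
        completed += 1  # "execute" the action; nothing else is discovered meanwhile
        for nxt in graph[action]:
            if nxt not in known:
                known.add(nxt)  # discovery happens only after the action finishes
                queue.append(nxt)
        progress.append((completed, len(known)))
    return progress


def eager_progress(graph, root):
    """Return (completed, total) pairs when the whole graph is walked before executing."""
    known = set()
    stack = [root]
    while stack:
        action = stack.pop()
        if action in known:
            continue
        known.add(action)
        stack.extend(graph[action])
    total = len(known)
    return [(done, total) for done in range(1, total + 1)]


if __name__ == "__main__":
    print("lazy: ", lazy_progress(GRAPH, "app"))
    print("eager:", eager_progress(GRAPH, "app"))
```

With this graph the lazy variant reports 1/3, 2/5, 3/6 and then 4/6 through 6/6, while the eager variant reports a total of 6 from the first action onward. The eager walk is what used to block at the start of the build; a dedicated discovery thread would sit between the two extremes.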

ulfjack avatar Sep 14 '17 08:09 ulfjack

Another data point: https://github.com/tensorflow/tensorflow/issues/14294

The explanation makes complete sense, but the UI is unintuitive in the extreme. I'd rather have a single number with no indication of a proportion of progress than two numbers that strongly suggest a proportion of progress but are not one.

I feel like I'm doing a Windows file transfer in 2003.

pauldraper avatar Aug 25 '18 19:08 pauldraper

> I'd rather have a single number with no indication a proportion of progress than two numbers that strongly suggest a proportion of progress and are not.

The second number is better than nothing: at least you have a lower bound on how many actions are left. This may allow you to decide to go and do something else rather than watching the build.

mhsmith avatar Sep 17 '18 21:09 mhsmith

I have a patch that may address this by making action execution not block skyframe threads - at least, that should make the number go up to the max more quickly. (Well, the patch only adds infrastructure to do so, but should be straightforward to extend.)

ulfjack avatar Nov 19 '18 08:11 ulfjack

Did the patch you mentioned (to speed up discovery of the max number of actions) make it in?

m01 avatar Oct 30 '19 23:10 m01

There's no simple answer to that question, I'm afraid. It isn't a simple patch, but an extensive series of patches. While the new code can be enabled with a flag (--experimental_async_execution), it doesn't do anything in Bazel right now because neither local nor remote execution supports async execution, so it transparently falls back to legacy semantics. I know exactly what needs to be done, but I've had very little time to work on it for most of this year.

ulfjack avatar Oct 31 '19 09:10 ulfjack

@ulfjack just wondering if you had had any time to look at this this year. My organization is in the process of moving rather a lot of people over from cmake/ninja to bazel, and this is one of the frequent bits of feedback that we're getting, that the moving count is a pain point.

If you find that this isn't something you'll be able to get to, is it something that somebody else could take on? Even a community member like myself? I'm fairly competent at using bazel, and have made some stabs at hacking things about in the core bazel java code, but can make no great claims at being a master of the internals of bazel. All the same, I think this change would be greatly welcomed by a lot of people.

Thanks!

elklein avatar Sep 24 '20 20:09 elklein

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen (or ping me to reopen) if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.

sgowroji avatar Feb 17 '23 06:02 sgowroji

@sgowroji This would still be useful to have.

fmeum avatar Feb 17 '23 07:02 fmeum

@coeuvre's work on potentially re-adding async execution might help here as well.

meisterT avatar Feb 21 '23 11:02 meisterT

@coeuvre Do you still plan to add async execution back in some way? If not, it could make sense to explore alternatives to solve this problem.

fmeum avatar May 13 '23 10:05 fmeum

Yes, we are actively working to upgrade the embedded JDK to a modern JDK and will then work on adding async execution with Loom, see https://github.com/bazelbuild/bazel/issues/6394#issuecomment-1541396079

meisterT avatar May 15 '23 07:05 meisterT

The current state is that this can be enabled with --experimental_async_execution, but it comes with a performance penalty.

fmeum avatar Oct 16 '24 14:10 fmeum

It has not been used at Google at all, it's an experimental option, it requires complex code to support and it's very probable that the same functionality will instead be implemented by relying on Project Loom.

RELNOTES[INC]: the --experimental_async_execution flag is now a no-op.

PiperOrigin-RevId: 489938259 Change-Id: If3d23071d833c387998c1759daf5a8c970ee8fbc

Glad to see it happen :)

pauldraper avatar Oct 16 '24 17:10 pauldraper

@pauldraper This is an older commit, the flag has been reimplemented since.

fmeum avatar Oct 16 '24 18:10 fmeum