phobos std/path: make globMatch work with @nogc

Sep 29 '24 08:09 ljmf00-wekaio

Thanks for your pull request and interest in making D better, @ljmf00-wekaio! We are looking forward to reviewing it, and you should be hearing from a maintainer soon. Please verify that your PR follows this checklist:

My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
I have provided a detailed rationale explaining my changes
New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.

If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + phobos#9055"

Sep 29 '24 08:09 dlang-bot

@jmdavis Brought this to my attention. I think this code points out a fundamental flaw in how we use @nogc both in Phobos and in general library code. Specifically, this code is technically correct. It is @nogc insofar as the GC is no longer called, but if somebody were to ever use this code in their @nogc hot-path code they'd probably be surprised to find out that it's allocating underneath them.

This is because in D, @nogc has two meanings. The first is the technical meaning of "No GC Allocations". The second is a community-semantic meaning of "no allocations". So in one sense, this code is fibbing about what it does. It is technically correct, but for most users, they will see the @nogc and assume that the implementation does not allocate at all, which is incorrect, and they will only be able to determine the truth by actually reading the code, which of course, nobody does.

Of course, we can always fall-back to "you should just know that @nogc doesn't mean no allocations" but this is poor library design as this would require that we document all allocations regardless of allocation method and then tell everybody to RTFM. Furthermore, this code does not meet even the Phobos 3 standards for @nogc code. All @nogc code in Phobos 3 that requires allocation must request a buffer from the user (or an allocator, once we get those).

The overarching problem here is that @nogc has two different meanings that forces us to carefully consider our design choices. The design standards of Phobos 3 are intended to enforce a design where all @nogc methods are explicit about their allocation patterns in the call-site, which means that user is always aware of what they are doing. Given the present situation with @nogc this is likely the only realistic path forward for our @nogc library support.

As this code fails to meet the Phobos 3 design standards I would request that you either redesign this method to meet them or consider providing an alternative implementation via overload.

Sep 29 '24 21:09 LightBender

Calling a function allocates memory (on the stack).

I think you would need to define what "allocate memory" means more precisely before beginning to discuss whether @nogc allows it or not.

The first is the technical meaning of "No GC Allocations".

Actually that is not quite correct. @nogc means "does not use the GC", which includes other operations beyond allocating GC-owned memory. The point is that @nogc code can be linked into a program which does not include a garbage collector. Note that this is by itself useful in its own way, but quite far away from any perceived meaning of "does not allocate memory".

Sep 29 '24 22:09 CyberShadow

Calling a function allocates memory (on the stack).

I think you would need to define what "allocate memory" means more precisely before beginning to discuss whether @nogc allows it or not.

That's a valid point. We should try to be precise with what we mean. However, I think that it's pretty clear that he meant heap allocations, which is almost always what people mean when talking about allocations. Stack allocations happen all over the place whether you like it or not and aren't avoidable to the point that there's really no point in even discussing them with regards to APIs. The fact that they happen is a given, and they're a non-issue unless you start doing something like creating huge static arrays or run attempt really deep / infinite recursion.

The first is the technical meaning of "No GC Allocations".

Actually that is not quite correct. @nogc means "does not use the GC", which includes other operations beyond allocating GC-owned memory. The point is that @nogc code can be linked into a program which does not include a garbage collector. Note that this is by itself useful in its own way, but quite far away from any perceived meaning of "does not allocate memory".

From the standpoint of actually using @nogc, using it to determine whether you can avoid linking against the GC code is pretty pointless. You'll get that from the linker as soon as you try to link, and anyone looking to avoid the GC to that degree simply won't link in druntime. The whole point of @nogc is to be able to say that no GC allocations occur and that a GC collection can't be triggered, which enables you to reliably use that function in contexts where you can't afford for heap allocations to occur or for a GC collection to trigger. That's done by saying that no GC functions can be called, and in turn, that does technically mean that it will indicate that that function doesn't need symbols from the GC, but if that's your goal, @nogc is completely unnecessary.

Sep 29 '24 22:09 jmdavis

The second is a community-semantic meaning of "no allocations".

No, it's conceptually wrong and I don't think the community follows that either. The whole reason for GC being costly is because of the entire mechanism of tracking the memory, not the allocation mechanism alone. The same for nothrow but with the argument that the exception unwinding code is expensive.

So in one sense, this code is fibbing about what it does. It is technically correct, but for most users, they will see the @nogc and assume that the implementation does not allocate at all

I think those are mostly your assumptions, at least in current Druntime/Phobos. I can prove that with a simple grep in the code. See https://github.com/dlang/phobos/blob/59b371567b65e331b4d219e35c7b4da0540a5970/std/algorithm/sorting.d#L3188 . I would accept arguing this should be documented, because documentation is where you clear out assumptions made by the users.

Take another example, where user assumes X or Y method is pretty fast, but turns out the method does a syscall in the background? What about a specific syscall that is potentially more expensive to the user? I think its not sensible to assume that the method does or does not a syscall nor a specific one, unless its documented properly, and I guess the same goes with allocations and other specific behaviour.

Would be cool to have rules for this, like enforce documenting it? Yes, do we have it currently? No. Can we still document it, yes.

Of course, we can always fall-back to "you should just know that @nogc doesn't mean no allocations"

It's literally the meaning of @nogc, usage of GC and nothing about "no allocations". Would you say, with your last argument, then, that mmap being @nogc it doesn't "allocate" at all? No, of course not. Allocation is also a very broad term. It vastly depends on your environment. Like @CyberShadow mentioned, isn't calling every function that uses stack, allocating?

All @nogc code in Phobos 3 that requires allocation must request a buffer from the user (or an allocator, once we get those).

While I agree with the allocations (we need to redesign the current Phobos allocators tho), why you bringing this PR, which changes a Phobos v2 implementation?

The overarching problem here is that @nogc has two different meanings that forces us to carefully consider our design choices

Sorry, but you seem to fabricate the second meaning. There's only one meaning of the @nogc attribute specified in the language documentation. I can see your point but this is not the current situation, at all.

As this code fails to meet the Phobos 3 design standards I would request that you either redesign this method to meet them or consider providing an alternative implementation via overload.

Why are we trying to make Phobos v2 the Phobos v3 already? We can clearly see this has room for debate, but I don't understand why here. The effort here is to avoid calling a GC collection for such trivial method. I could rework the method to not even use any sort of allocation, which, btw, I think that is the most sensible way, but that's beyond this change. This change itself is an improvement over the previous situation.

Sep 30 '24 01:09 ljmf00-wekaio

I think that we should stop considering that @nogc code is synonymous to 'extremely fast code' (and randomly add restrictions like "no allocations" to its meaning on that basis). You can be @nogc and still have expensive calls to other @nogc functions. @nogc functions simply mean that this function does not allocate using the gc (which by the way, is not that expensive in most situations), however, other functions that call the @nogc function can still use the gc. I don't think that there is a guarantee that @nogc guards you against a "stop the world" event if functions up the call stack have allocated gc memory (but I might be wrong). Anyway, my point is that if performance is what you are interested in, then the best way to do it is to start by using the gc, profile and see where the bottlenecks are. It might be that the gc is not where the problem is. This @nogc misconception only propagates gcphobia, when in fact a better design would be to have the possibility of selecting from a wide range of available gc's, including custom ones. I suspect that that will make gc more attractive for more people. And if your mind is set to not use the gc then you might as well compile with -betterC now that most hooks are templated.

Oct 02 '24 13:10 RazvanN7

I think that we should stop considering that https://github.com/nogc code is synonymous to 'extremely fast code' (and randomly add restrictions like "no allocations" to its meaning on that basis).

Thank you for raising this. I think this is not even consensual among the community either. And I'm pretty sure, we, at Weka, doesn't consider @nogc to be "the very fast implementation" of a GC alternative implementation, but rather an implementation with some considerations regarding runtime behaviour, because, honestly, it doesn't even make any sense to say that.

The boundaries are so vast that we would have to imply an extensive list of rules. Just by saying that syscall's can be worse, tells a lot about the above argument.

To complete some of the Weka concerns on Phobos:

@LightBender There's implementations that are supposed to be @nogc, but we don't use from Phobos, e.g. filter and a lot of range primitives. We have pretty much our own Phobos implementation because:

range API is fundamentally broken and potentially expensive due to exponential inlining cost. Modern languages implement a next that does have all three calls into one. This could be an alternative solution that ranges optionally implement.
std.algorithm most of the times creates a closure when its not needed, that's mainly why we have our own filter,etc ...
a lot of GC usage
a lot of enforce usage that implies throw and again, the GC too
auto decoding is an unfortunate feature too

(which by the way, is not that expensive in most situations)

In our case, using GC, regardless of being a small or large allocation, its potentially costly. I think with the refactor presented in dconf of the GC, this might be way better, but, mainly:

it may trigger an allocation, but if not
it may trigger a collection, but if not
it may trigger a page fault, but if not
it may trigger some syscall

Other reasons to avoid the current GC implementation and some Phobos implementations, for very specific reasons for us, like, but not limited, usage of huge pages, we also need to forbid all kinds of voluntary context switch when possible or usage of fork syscall, which, btw, its on a lot of std.process (which I need to make a PR to use vfork or posix_spawn).

Allocations via malloc are much more different and controllable. It may do a better use of a slab allocator and the fact that you can simply replace it with a drop-in implementation, is far better than the Druntime GC, which assumes a lot of things (which btw, I also suggest us standardise these assumptions), and therefore, not really replaceable.

All of those, in a hot path is costly, for some more than others.

Oct 02 '24 14:10 ljmf00-wekaio

@RazvanN7 Nominally, I agree. However, we have a perception problem with nogc. But this is not the only, or even the most significant argument against replacing the GC with malloc in Phobos. See below.

@ljmf00-wekaio Now lets address your concerns.

First. If a non-allocating implementation exists, why not use that?

Second. The reason to do this to Phobos 3 standards is simple. We're going to have to do it eventually when we pull this code into V3. So why not do it now? This code hits a key V3 design standard on allocations and there is no harm in either rewriting it to not allocate or adding an overload that takes in a buffer per the V3 standard.

Lastly. Consider that the new GC is going to hijack malloc/free so that it can track malloc'd memory in the C standard library. Therefore calling malloc actually will be the same as calling new. This means that under the new GC @nogc is a literal lie. More specifically @nogc will code will silently exhibit GC behaviors. The basic problem here is that "@nogc only means the GC" makes an assumption that is not guaranteed to be true. And in fact anybody can hijack malloc/free and do things beyond just allocating.

This is why the Phobos V3 design standard is "no/reduced allocations" not "@nogc". Allocation/deallocation runtime performance is in fact undefined from the perspective of the standard library.

Honestly, given what is coming with the new GC, and the idea of hijacking malloc/free in general, I think the value of @nogc is extremely limited, to the point where I would advocate removing it from the language entirely. And we're already talking about weakening it to allow allocations when throwing.

As far as I know, Weka likely won't care about any of this, because IIRC you guys do other things and don't use the standard GC. Therefore, while this change actually does deliver what you want, in the general case it, in fact, will do something else entirely. Our job with Phobos is to design to that general case.

Oct 04 '24 08:10 LightBender

to the point where I would advocate removing it from the language entirely

Music to my ears.

Oct 04 '24 08:10 RazvanN7

First. If a non-allocating implementation exists, why not use that?

I'm not against using a non-allocating approach, simply would require a bigger refactor that I'm not addressing on this PR. Could be done later, but should this improvement be a blocker? I don't see a reason.

Consider that the new GC is going to hijack malloc/free so that it can track malloc'd memory in the C standard library.

[...]

Honestly, given what is coming with the new GC, and the idea of hijacking malloc/free in general, I think the value of @nogc is extremely limited, to the point where I would advocate removing it from the language entirely. And we're already talking about weakening it to allow allocations when throwing.

This is not what I heard of it and there's concerns doing it, a lot of concerns frankly. Replacing the GC, I agree 100%, hijacking malloc/free, no. Would malloc trigger an implicit collection? This would be a disaster, IMO. GC would still collect and therefore would still stop the world, but in a way better implementation that triggers less of them. I'm afraid of this approach.

GC is very useful, and to be clear, I don't advocate for GC-fobia, but I think you underestimate the warm that GC causes to performant applications.

As far as I know, Weka likely won't care about any of this, because IIRC you guys do other things and don't use the standard GC.

We use the standard GC, and we planned to fork the GC for reasons that the new GC implementation is going to address, in most cases, observability/logging/tracing.

Oct 23 '24 13:10 ljmf00-wekaio

First. If a non-allocating implementation exists, why not use that?

I'm not against using a non-allocating approach, simply would require a bigger refactor that I'm not addressing on this PR. Could be done later, but should this improvement be a blocker? I don't see a reason.

Consider that the new GC is going to hijack malloc/free so that it can track malloc'd memory in the C standard library. [...] Honestly, given what is coming with the new GC, and the idea of hijacking malloc/free in general, I think the value of @nogc is extremely limited, to the point where I would advocate removing it from the language entirely. And we're already talking about weakening it to allow allocations when throwing.

This is not what I heard of it and there's concerns doing it, a lot of concerns frankly. Replacing the GC, I agree 100%, hijacking malloc/free, no. Would malloc trigger an implicit collection? This would be a disaster, IMO. GC would still collect and therefore would still stop the world, but in a way better implementation that triggers less of them. I'm afraid of this approach.

@ljmf00-wekaio @ljmf00

According to @deadalnix the new GC does not currently collect on malloc, but eventually it will. Presumably to improve the GC's handling of memory in applications that use C libraries. Therefore we cannot absolutely rely on malloc being non-collecting in the future.

While I am not inherently against merging this. We are creating another future piece of lint for us to keep track of for Phobos 3. But I do agree that the future GC behavior creates a bigger discussion around the entire concept of @nogc than can be addressed here.

Oct 27 '24 08:10 LightBender

This PR uses malloc/free in similar ways to other places in druntime and phobos.

If there was a problem with this PR, we would have problems all over the place.

Furthermore, any overriding of malloc will likely be both opt-in and POSIX only.

Therefore I'm in favor of this PR, and I do not understand why we have talked so much about this.

Oct 27 '24 08:10 rikkimax

According to @deadalnix the new GC does not currently collect on malloc, but eventually it will.

Please don't. We really need to think about the consequences of that. I'm sure this is not one-sided by me, and I got to say, if that is assumed to be a default from now on, it would be catastrophic for so many high-performance programs, Weka very much included. You are underestimating what calling a/several syscalls or blocking all threads during important low-latency tasks really means. Allocations (e.g. malloc) alone doesn't do that often (in some applications not even close as often as the GC). Even if we think we could do a really great job on re-implementing the GC, the fundamental design principles doesn't let us reach nearly the same performance as, for example, Go GC (which is one of the top of the line GCs out there for a system language).

There's a reason for why people choose jemalloc, tcmalloc, mimalloc, etc... over the generic libc implementation. Saying, on top, @nogc will be eventually removed because of that decision is a very bold claim too.

Furthermore, any overriding of malloc will likely be both opt-in and POSIX only.

Being opt-in sounds better.

Oct 27 '24 18:10 ljmf00-wekaio

According to @deadalnix the new GC does not currently collect on malloc, but eventually it will. Presumably to improve the GC's handling of memory in applications that use C libraries. Therefore we cannot absolutely rely on malloc being non-collecting in the future.

I have (privately) given the feedback that this is never going to work. Our testing is showing signs that this is indeed problematic (because doing so requires knowing about all threads in the programs, so you also have to hijack pthread_create, which is its own can of worms). I would not make any decision based on this. However, I would love for scope array allocations to be @nogc, which we can do by lowering them to alloca.

Oct 27 '24 19:10 Geod24

hijack pthread_create, which is its own can of worms)

Right, I didn't thought of that, it can get worse easily if pthread_create on some linked code is hidden or putting it in very simple terms, people just using the clone syscall directly with sharing address space flag can cause problems. I think there's a lot of cans of worms there to open. It's simply not feasible to implement a drop-in replacement of malloc with the thread locking mechanism the GC has. And regardless, its catastrophic by the things I mentioned above.

scope array allocations to be @nogc, which we can do by lowering them to alloca.

LDC already sort of does that. https://github.com/ldc-developers/ldc/blob/master/gen/passes/GarbageCollect2Stack.cpp

Oct 27 '24 23:10 ljmf00-wekaio

LDC already sort of does that.

I'm aware, but it's an optimization pass, not something the frontend is aware of, and not something that can be enforced AFAIK.

Oct 28 '24 11:10 Geod24