runtime: Mildly over-allocate when growing slices
This refactors runtime.sliceAppend() and runtime.sliceGrow() so, not only sliceAppend() uses sliceGrow() if needed, sliceGrow() only mildly over-allocates a new buffer using a sequence similar to what CPython uses for list objects.
CC @dgryski
Some quick testing shows this makes the multiplication factor 1.125. However the expression doesn't seem to work for newCap <= 8 so I think we need to make sure the edge cases are handled properly.
I'm not sure this actually saves us anything. I threw together a quick simulation and it seems that the new algorithm requires more calls to "grow" (since we're not growing as much, we need to do it more frequently, requiring more CPU) and ends up generating more garbage, because the additional calls to grow means we have to throw more stuff away.
You can play with my sample code here: https://go.dev/play/p/QcHiMqnGlym
Yup, that seems to be the case from some of the testing I did last week to understand why some tests were failing. I'm going to send a new version that does pretty much the same thing as before but keeps the cleanup.
On Mon, Jul 22, 2024, at 11:53 AM, Damian Gryski wrote:
I'm not sure this actually saves us anything. I threw together a quick simulation and it seems that the new algorithm requires more calls to "grow" (since we're not growing as much, we need to do it more frequently, requiring more CPU) and ends up generating more garbage, because the additional calls to grow means we have to throw more stuff away.
You can play with my sample code here: https://go.dev/play/p/QcHiMqnGlym
— Reply to this email directly, view it on GitHub https://github.com/tinygo-org/tinygo/pull/4287#issuecomment-2243609999, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAADVGKW2UBXOAKSZRH5APLZNVIKBAVCNFSM6AAAAABI7FV6YCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENBTGYYDSOJZHE. You are receiving this because you authored the thread.Message ID: @.***>
Alright, @dgryski, pushed a new version, rebased on the current dev branch, that goes back to the "align to the next power of two" code, but using a branchfree version instead.
The failure inside the reflect test is very weird.
The Index field of "want" for the reflect field F looks incorrect, it is field 1, but want is 0,0.
The Index field of "want" for the reflect field
Flooks incorrect, it is field 1, but want is0,0.
Right, and I have no idea why! I've spent some time in the reflect code, and @dgryski too, but we don't yet know what's happening here. Our best guesses is that there's either memory corruption somewhere, or that, although the code seems correct, its behavior is different from before, and the reflect code is probably depending on that somehow.
Hmm, I was suggesting/agreeing with @dgryski that the test may be faulty and the old implementation matched. I think the expectation looks incorrect, though I have very little experience or deep knowledge of this code.
Hmm, I was suggesting/agreeing with @dgryski that the test may be faulty and the old implementation matched. I think the expectation looks incorrect, though I have very little experience or deep knowledge of this code.
If I understood the test correctly, it seems correct. The index is a slice of indices, starting with the current type, and following any nested type. So an index of []int{0} means the field is the first in that struct; and an index of []int{0, 1} means it's the second of the first of that struct.
The test defines two structs: Rec1 and Rec2; Rec1 has a pointer to Rec2, and Rec2 has a F string and a pointer to Rec1. To find F from Rec1, you need to go through the first element, and then find the first element ([]int{0, 0}), which seems to agree with the test.
I might be mistaken about how this works, though.
Reading more of #tinygo-dev, feel free to ignore me. I didn't look closely enough at the Index type in the test.
So I copied the failing test to its own file, and reflect.VisibleFields() gives the right output, agreeing with Big Go. So the problem seems to be in the actual test driver somewhere.
Fixed the reflect test; let's see if CI likes it.
Alright, archive/zip failing is probably the reflect.DeepEqual() calls inside the tests, most likely some unbounded recursion or something of the sort.
@lpereira Ok, so it does seem like the bug is somewhere in reflect then. I'm going to try to hunt down the bug looking at that function.
This fixes the reflect test:
diff --git a/src/reflect/type.go b/src/reflect/type.go
index 1356f67c..32bf68d7 100644
--- a/src/reflect/type.go
+++ b/src/reflect/type.go
@@ -774,7 +774,7 @@ func (t *rawType) rawFieldByNameFunc(match func(string) bool) (rawStructField, [
if match(name) {
found = append(found, result{
rawStructFieldFromPointer(descriptor, field.fieldType, data, flagsByte, name, offset),
- append(ll.index, int(i)),
+ append(ll.index[:len(ll.index):len(ll.index)], int(i)),
})
}
@@ -787,7 +787,7 @@ func (t *rawType) rawFieldByNameFunc(match func(string) bool) (rawStructField, [
nextlevel = append(nextlevel, fieldWalker{
t: embedded,
- index: append(ll.index, int(i)),
+ index: append(ll.index[:len(ll.index):len(ll.index)], int(i)),
})
}
I'll get this in as another PR and you can rebase against that with only the slice allocation changes.
https://github.com/tinygo-org/tinygo/pull/4367
I rebased this on top of #4387 (please merge it first!), and removed the commit changing reflect. CI is still failing with some error I was unable to track down.
I'll look at the archive/zip failure this evening.
The archive/zip failure seems to be related to the stepped doubling logic function.
Nope; failure seems intermittent.
This is due to memory fragmentation which (annoyingly) has different behaviour with the new growth algorithm. A smarter allocator that has free lists for small allocation sizes to reduce fragmentation would help here, but that's outside the scope of this PR.
A smarter allocator that has free lists for small allocation sizes to reduce fragmentation would help here, but that's outside the scope of this PR.
Agreed. I've rebased on the current dev branch and changed the code to have only one overallocation strategy until we have a better allocator. This won't reduce the amount of memory used by slices, but it simplifies the code quite a bit, which is already a gain IMHO.
fragmentation because of not taking headers/meta into account was my point in https://github.com/tinygo-org/tinygo/pull/4287#discussion_r1692116535
sizediff failure is unrelated.
Merging. Will close my PR that has the reflect fix.