tinygo runtime: Mildly over-allocate when growing slices

This refactors runtime.sliceAppend() and runtime.sliceGrow() so, not only sliceAppend() uses sliceGrow() if needed, sliceGrow() only mildly over-allocates a new buffer using a sequence similar to what CPython uses for list objects.

CC @dgryski

Jun 07 '24 19:06 lpereira

Some quick testing shows this makes the multiplication factor 1.125. However the expression doesn't seem to work for newCap <= 8 so I think we need to make sure the edge cases are handled properly.

Jun 26 '24 05:06 dgryski

I'm not sure this actually saves us anything. I threw together a quick simulation and it seems that the new algorithm requires more calls to "grow" (since we're not growing as much, we need to do it more frequently, requiring more CPU) and ends up generating more garbage, because the additional calls to grow means we have to throw more stuff away.

You can play with my sample code here: https://go.dev/play/p/QcHiMqnGlym

Jul 22 '24 18:07 dgryski

Yup, that seems to be the case from some of the testing I did last week to understand why some tests were failing. I'm going to send a new version that does pretty much the same thing as before but keeps the cleanup.

On Mon, Jul 22, 2024, at 11:53 AM, Damian Gryski wrote:

I'm not sure this actually saves us anything. I threw together a quick simulation and it seems that the new algorithm requires more calls to "grow" (since we're not growing as much, we need to do it more frequently, requiring more CPU) and ends up generating more garbage, because the additional calls to grow means we have to throw more stuff away.

You can play with my sample code here: https://go.dev/play/p/QcHiMqnGlym

— Reply to this email directly, view it on GitHub https://github.com/tinygo-org/tinygo/pull/4287#issuecomment-2243609999, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAADVGKW2UBXOAKSZRH5APLZNVIKBAVCNFSM6AAAAABI7FV6YCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENBTGYYDSOJZHE. You are receiving this because you authored the thread.Message ID: @.***>

Jul 22 '24 18:07 lpereira

Alright, @dgryski, pushed a new version, rebased on the current dev branch, that goes back to the "align to the next power of two" code, but using a branchfree version instead.

Jul 22 '24 20:07 lpereira

The failure inside the reflect test is very weird.

Jul 23 '24 20:07 lpereira

The Index field of "want" for the reflect field F looks incorrect, it is field 1, but want is 0,0.

Jul 25 '24 15:07 cee-dub

The Index field of "want" for the reflect field F looks incorrect, it is field 1, but want is 0,0.

Right, and I have no idea why! I've spent some time in the reflect code, and @dgryski too, but we don't yet know what's happening here. Our best guesses is that there's either memory corruption somewhere, or that, although the code seems correct, its behavior is different from before, and the reflect code is probably depending on that somehow.

Jul 25 '24 16:07 lpereira

Hmm, I was suggesting/agreeing with @dgryski that the test may be faulty and the old implementation matched. I think the expectation looks incorrect, though I have very little experience or deep knowledge of this code.

Jul 25 '24 20:07 cee-dub

Hmm, I was suggesting/agreeing with @dgryski that the test may be faulty and the old implementation matched. I think the expectation looks incorrect, though I have very little experience or deep knowledge of this code.

If I understood the test correctly, it seems correct. The index is a slice of indices, starting with the current type, and following any nested type. So an index of []int{0} means the field is the first in that struct; and an index of []int{0, 1} means it's the second of the first of that struct.

The test defines two structs: Rec1 and Rec2; Rec1 has a pointer to Rec2, and Rec2 has a F string and a pointer to Rec1. To find F from Rec1, you need to go through the first element, and then find the first element ([]int{0, 0}), which seems to agree with the test.

I might be mistaken about how this works, though.

Jul 25 '24 20:07 lpereira

Reading more of #tinygo-dev, feel free to ignore me. I didn't look closely enough at the Index type in the test.

Jul 25 '24 21:07 cee-dub

So I copied the failing test to its own file, and reflect.VisibleFields() gives the right output, agreeing with Big Go. So the problem seems to be in the actual test driver somewhere.

Jul 26 '24 16:07 lpereira

Fixed the reflect test; let's see if CI likes it.

Jul 26 '24 17:07 lpereira

Alright, archive/zip failing is probably the reflect.DeepEqual() calls inside the tests, most likely some unbounded recursion or something of the sort.

Jul 26 '24 17:07 lpereira

@lpereira Ok, so it does seem like the bug is somewhere in reflect then. I'm going to try to hunt down the bug looking at that function.

Jul 26 '24 18:07 dgryski

This fixes the reflect test:

diff --git a/src/reflect/type.go b/src/reflect/type.go
index 1356f67c..32bf68d7 100644
--- a/src/reflect/type.go
+++ b/src/reflect/type.go
@@ -774,7 +774,7 @@ func (t *rawType) rawFieldByNameFunc(match func(string) bool) (rawStructField, [
                                if match(name) {
                                        found = append(found, result{
                                                rawStructFieldFromPointer(descriptor, field.fieldType, data, flagsByte, name, offset),
-                                               append(ll.index, int(i)),
+                                               append(ll.index[:len(ll.index):len(ll.index)], int(i)),
                                        })
                                }

@@ -787,7 +787,7 @@ func (t *rawType) rawFieldByNameFunc(match func(string) bool) (rawStructField, [

                                        nextlevel = append(nextlevel, fieldWalker{
                                                t:     embedded,
-                                               index: append(ll.index, int(i)),
+                                               index: append(ll.index[:len(ll.index):len(ll.index)], int(i)),
                                        })
                                }

I'll get this in as another PR and you can rebase against that with only the slice allocation changes.

Jul 26 '24 18:07 dgryski

https://github.com/tinygo-org/tinygo/pull/4367

Jul 26 '24 19:07 dgryski

I rebased this on top of #4387 (please merge it first!), and removed the commit changing reflect. CI is still failing with some error I was unable to track down.

Jul 26 '24 22:07 lpereira

I'll look at the archive/zip failure this evening.

Jul 27 '24 02:07 dgryski

The archive/zip failure seems to be related to the stepped doubling logic function.

Jul 27 '24 06:07 dgryski

Nope; failure seems intermittent.

Jul 27 '24 22:07 dgryski

This is due to memory fragmentation which (annoyingly) has different behaviour with the new growth algorithm. A smarter allocator that has free lists for small allocation sizes to reduce fragmentation would help here, but that's outside the scope of this PR.

Jul 31 '24 18:07 dgryski

A smarter allocator that has free lists for small allocation sizes to reduce fragmentation would help here, but that's outside the scope of this PR.

Agreed. I've rebased on the current dev branch and changed the code to have only one overallocation strategy until we have a better allocator. This won't reduce the amount of memory used by slices, but it simplifies the code quite a bit, which is already a gain IMHO.

Jul 31 '24 18:07 lpereira

fragmentation because of not taking headers/meta into account was my point in https://github.com/tinygo-org/tinygo/pull/4287#discussion_r1692116535

Jul 31 '24 18:07 ldemailly

sizediff failure is unrelated.

Jul 31 '24 19:07 lpereira

Merging. Will close my PR that has the reflect fix.

Jul 31 '24 20:07 dgryski