go-git Bisected regression runs genOffsetHash leading to wasteful loop when processing ofs-delta

I noticed that go-git is using a ton of CPU and memory when doing simple operations like iterating through a tree on large repositories.

Test program

package main

import (
	"log"
	"os"

	"github.com/go-git/go-git/v5"
)

func main() {
	log.SetFlags(log.LstdFlags | log.Lmicroseconds)
	repo, err := git.PlainOpen(os.Args[1]); if err != nil { panic(err) }
	headRef, err := repo.Head(); if err != nil { panic(err) }
	commit, err := repo.CommitObject(headRef.Hash()); if err != nil { panic(err) }
	tree, err := commit.Tree(); if err != nil { panic(err) }
	log.Println("Iterating tree")
	for _, entry := range tree.Entries {
		size, err := tree.Size(entry.Name); if err != nil { panic(err) }
		log.Println(entry.Name, size)
	}
	log.Println("Done")
}

Test environment

This commit from Asahi Linux

go version go1.23.6 linux/arm64

M1 MacBook Air (2020)

Expected behavior

The entire process completes in about 0.1s.

2025/03/08 20:54:07.990596 Iterating tree
2025/03/08 20:54:07.990964 .get_maintainer.ignore 194
2025/03/08 20:54:07.990993 .gitattributes 105
2025/03/08 20:54:07.991031 .gitignore 2180
2025/03/08 20:54:07.991156 .mailmap 42849
2025/03/08 20:54:07.991184 .rustfmt.toml 369
2025/03/08 20:54:07.991209 COPYING 496
...
2025/03/08 20:54:07.992446 Done

Actual behvior

It takes about 1.3s.

2025/03/08 20:52:52.634226 Iterating tree
...
2025/03/08 20:52:52.635310 .get_maintainer.ignore 194
2025/03/08 20:52:52.635343 .gitattributes 105
2025/03/08 20:52:53.781688 .gitignore 2180  <-- spends a lot of time here
2025/03/08 20:52:53.783122 .mailmap 42849
2025/03/08 20:52:53.783330 .rustfmt.toml 369
2025/03/08 20:52:53.783372 COPYING 496
...
2025/03/08 20:52:53.809643 Done

In my scenario it's .gitignore that's being slow to tree.Size on.

Profiling

It looks like the main problem is how genOffsetHash uses maps. I'm no expert when it comes to the Go runtime so I'll leave it here.

First bad commit

commit bb5c35196ee55b82f6506d69257de0e59cd8c41d
Date:   Sun Nov 17 00:34:11 2024 +0000

    plumbing: packfile, Refactor Parser and Scanner logic
    The changes focus on increasing thread-safety, simplifying the code base,
    enabling support for sha256 and improving general time and space complexities.

    (snip: some statistics showing fewer allocs)

Call stack

Interestingly, genOffsetHash is only called occasionally, and it takes a long time whenever it is called.

github.com/go-git/go-git/v5/plumbing/format/idxfile.(*MemoryIndex).genOffsetHash(0x400016ed88)
	/home/runxiyu/go-git/plumbing/format/idxfile/idxfile.go:230 +0x308
github.com/go-git/go-git/v5/plumbing/format/idxfile.(*MemoryIndex).FindHash(0x400016ed88, 0x4ec70ae3)
	/home/runxiyu/go-git/plumbing/format/idxfile/idxfile.go:191 +0x1d0
github.com/go-git/go-git/v5/plumbing/format/packfile.(*Packfile).getByOffset(0x40000a3180, 0x4ec70ae3)
	/home/runxiyu/go-git/plumbing/format/packfile/packfile.go:175 +0x34
github.com/go-git/go-git/v5/plumbing/format/packfile.(*Packfile).getMemoryObject(0x40000a3180, 0x40001b4820)
	/home/runxiyu/go-git/plumbing/format/packfile/packfile.go:310 +0x224
github.com/go-git/go-git/v5/plumbing/format/packfile.(*Packfile).objectFromHeader(0x40000a3180?, 0xf357138a8184676e?)
	/home/runxiyu/go-git/plumbing/format/packfile/packfile.go:281 +0x44
github.com/go-git/go-git/v5/plumbing/format/packfile.(*Packfile).getByOffset(0x40000a3180, 0x73450f85)
	/home/runxiyu/go-git/plumbing/format/packfile/packfile.go:189 +0xb0
github.com/go-git/go-git/v5/plumbing/format/packfile.(*Packfile).GetByOffset(0x40000a3180, 0x73450f85)
	/home/runxiyu/go-git/plumbing/format/packfile/packfile.go:81 +0xd0
github.com/go-git/go-git/v5/plumbing/format/packfile.(*Packfile).GetSizeByOffset(0x40000a3180, 0x73450f85)
	/home/runxiyu/go-git/plumbing/format/packfile/packfile.go:91 +0x34
github.com/go-git/go-git/v5/storage/filesystem.(*ObjectStorage).encodedObjectSizeFromPackfile(0x4000160398, {0xa6, 0x1e, 0x47, 0x78, 0xd0, 0x11, 0xcf, 0x70, 0x6e, ...})
	/home/runxiyu/go-git/storage/filesystem/object.go:326 +0x228
github.com/go-git/go-git/v5/storage/filesystem.(*ObjectStorage).EncodedObjectSize(0x4000160398, {0xa6, 0x1e, 0x47, 0x78, 0xd0, 0x11, 0xcf, 0x70, 0x6e, ...})
	/home/runxiyu/go-git/storage/filesystem/object.go:340 +0xb0
github.com/go-git/go-git/v5/plumbing/object.(*Tree).Size(0x400015e410, {0x400011ce40?, 0x2?})
	/home/runxiyu/go-git/plumbing/object/tree.go:100 +0x88
main.main()
	/home/runxiyu/crap/test-go-git.go:25 +0x2d8

func (idx *MemoryIndex) genOffsetHash() error {
	defer idx.mu.Unlock()
	idx.mu.Lock()

	count, err := idx.Count()
	if err != nil {
		return err
	}

	idx.offsetHash = make(map[int64]plumbing.Hash, count)
	idx.offsetHashIsFull = true

	var hash plumbing.Hash
	i := uint32(0)
	for firstLevel, fanoutValue := range idx.Fanout {
		mappedFirstLevel := idx.FanoutMapping[firstLevel]
		for secondLevel := uint32(0); i < fanoutValue; i++ {
			fmt.Println("secondLevel", secondLevel)          <-- [1]
			copy(hash[:], idx.Names[mappedFirstLevel][secondLevel*objectIDLength:])
			offset := int64(idx.getOffset(mappedFirstLevel, int(secondLevel)))
			idx.offsetHash[offset] = hash
			secondLevel++
		}
	}

	return nil
}

The second-level loop at [1] seems to be called idx.Count() times, which is 10950545 in my case. Quite reasonably this slows things down.

genOffsetHash was only called for the file that slowed things down. It wasn't called at all before the problematic commit.

Now, I think setting a map 10950545 times sounds problematic... but this seems to have always been the behavior here. I'm looking into what caused genOffsetHash to be called.

func (idx *MemoryIndex) FindHash(o int64) (plumbing.Hash, error) {
	var hash plumbing.Hash
	var ok bool

	fmt.Println(o) // I added these random debug prints because I can't use debuggers properly
	fmt.Println(idx.offsetHash[o])

	idx.mu.RLock()
	if idx.offsetHash != nil {
		if hash, ok = idx.offsetHash[o]; ok {
			idx.mu.RUnlock()
			return hash, nil
		}
	}
	idx.mu.RUnlock()

	// Lazily generate the reverse offset/hash map if required.
	if !idx.offsetHashIsFull || idx.offsetHash == nil {
		if err := idx.genOffsetHash(); err != nil {
			return plumbing.ZeroHash, err
		}

		hash, ok = idx.offsetHash[o]
	}

	if !ok {
		return plumbing.ZeroHash, plumbing.ErrObjectNotFound
	}

	return hash, nil
}

This has previously returned

1933905797
a61e4778d011cf706e6784818a1357f392f3a669
2025/03/08 21:38:56.182156 .gitignore 2180

but after the problematic commit it returns

1933905797
a61e4778d011cf706e6784818a1357f392f3a669
1933905797
a61e4778d011cf706e6784818a1357f392f3a669
1321667299
0000000000000000000000000000000000000000

Previously FindHash is only called one time per file, but now it's being called at least two times for each file, and for .gitignore it's called three times (with the last one returning a null hash).

Notes

It's on main but not in any releases yet

Mar 08 '25 12:03 runxiyu

The 'assign to a map a million times' behaviour could be seen in commits as old as 83649a1c08e0dcc103de54652ed5b9c33d301326 when running genOffsetHash e.g. when fetching blob contents from packfiles

Additionally, memory usage also explodes

Mar 09 '25 12:03 runxiyu

I believe I'm experiencing the same issue. In my case I'm cloning a repository to obtain a file at a known path on a given revision (could be any tag, branch, or sha). It seems the only way to do do this is using go-git is to clone the repository in full (if there is a way to do so without a full checkout that would be awesome!) For example I attempted to clone this 145M repository running the following code. The process swelled to over 25GB of memory over the course of >5 minutes before I had kill it .

package main

import (
	"log"

	"github.com/go-git/go-billy/v5/memfs"
	"github.com/go-git/go-git/v5"
	"github.com/go-git/go-git/v5/storage/memory"
)

func main() {
	cloneOpts := &git.CloneOptions{
		URL: "https://github.com/red-hat-data-services/RHOAI-Build-Config.git",
		NoCheckout: true,
	}
	_, err := git.Clone(memory.NewStorage(), memfs.New(), cloneOpts)
	if err != nil {
		log.Fatalf("clone error: %w", err)
	}

}

It's worth noting as well, though this may warrant opening a separate issue, that git.CloneContext with a timeout does not allow users to stop unbound time/memory during cloning. The bulk of the memory and time are spent in Repository.updateReferences which does not receive the context passed to its caller. Given that fetchAndUpdateReferences can be and expensive and time-consuming operation, it's unfortunate that supplying a Context to use during cloning will only have any effect if the context is cancelled during the network IO.

Mar 17 '25 02:03 aThorp96

Not sure what you mean by a "full checkout"... it works with bare repos so that's definitely not required. Are you referring to shallow clones (which would produce a much smaller packfile)?

Mar 17 '25 02:03 runxiyu

@runxiyu yes, sorry for the confusion. “Full Checkout” was intended to mean CloneOptions.Depth: 0. “Full Clone” would have been less confusing, but as you said my issue is worsened by being unable to shallow clone (in case the user’s revision is not at the head of any branch/tag)

Mar 17 '25 02:03 aThorp96

@aThorp96 maybe a sparse-checkout would make things slightly better on your case, as you don't need to go through the entire worktree.

Mar 23 '25 11:03 pjbgf

Previously FindHash is only called one time per file, but now it's being called at least two times for each file, and for .gitignore it's called three times (with the last one returning a null hash).

@runxiyu this does not seem to be the case with the code you provided. Are you taking about a different code snippet? There is certainly room for improvement in the index handling, but in terms of processing time, taking ~1s to process the repository and loop the tree, considering this is a repository with over a million commits and over 80k files, does not sound too bad. Was it the same code snippet you used to time the execution before and after that commit or the 0.1s was a comparison against the git cli?

Mar 23 '25 11:03 pjbgf

This is the behavior that I observe on an aarch64 linux machine. The 0.1s was right before the mentioned commit, and the 1s is the mentioned commit

Mar 23 '25 11:03 runxiyu

@runxiyu I timed the specific ops in your code and tested with the latest released version (without the "bad commit") and against the latest v6. The difference in performance is neglegible:

Operation	v5.14.0	v6 (v6-transport)
PlainOpen	64.15µs	43.561µs
Head	35.646µs	29.424µs
CommitObject	58.574624ms	61.600751ms
Tree	1.188076258s	1.16559348s

As far as I can see, the changes pointed out do not have an impact on the execution path.

That being said, you mentioned aarch64. One thing that would certainly have a worse performance on that architecture is the sha1cd implementation, as that is not as optimised outside amd64. Can you try calling hash.RegisterHash(crypto.SHA1, sha1.New) as the first thing in your code and check whether there is an impact on the execution? Please note that this removes collision detection and should not be used against servers you don't trust.

Mar 23 '25 14:03 pjbgf

https://github.com/user-attachments/assets/67fc6936-4433-40fd-a527-22d7a801dd26

I haven't investigated sha1cd yet (though I probably don't need it, as I am a server myself, and I handle pushes through git-receive-pack instead of go-git) But here's a demo of how things seem to slow down between v5.4.0 and main

Mar 23 '25 22:03 runxiyu

@runxiyu thanks for sharing the video. This aligns with what I mentioned before as sha1cd was introduced on v5.5.0. Can you confirm what version of the sha1cd dependency you have in your go.mod please?

The last benchmarks results I have on arm64 are a few years old. Do you mind running this just to see whether you getting an even worse performance than expected?

Mar 24 '25 09:03 pjbgf

sha1cd benchmark:

go test -benchmem -run=^$ -bench ^Benchmark ./...
PASS
ok  	github.com/pjbgf/sha1cd	0.001s
?   	github.com/pjbgf/sha1cd/cgo	[no test files]
?   	github.com/pjbgf/sha1cd/internal	[no test files]
goos: linux
goarch: arm64
pkg: github.com/pjbgf/sha1cd/test
BenchmarkCalculateDvMask/generic-8         	1000000000	        0.0000004 ns/op	      0 B/op	       0 allocs/op
BenchmarkCalculateDvMask/cgo-8             	1000000000	        0.0000014 ns/op	      0 B/op	       0 allocs/op
BenchmarkHash8Bytes/sha1-8                 	18426642	       58.86 ns/op	135.92 MB/s	       0 B/op	       0 allocs/op
BenchmarkHash8Bytes/sha1cd_generic-8       	 712749	     1911 ns/op	  4.19 MB/s	     192 B/op	       2 allocs/op
BenchmarkHash8Bytes/sha1cd_cgo-8           	 236654	     4585 ns/op	  1.74 MB/s	    2688 B/op	       1 allocs/op
BenchmarkHash320Bytes/sha1-8               	6731809	      189.4 ns/op	1689.87 MB/s	       0 B/op	       0 allocs/op
BenchmarkHash320Bytes/sha1cd_generic-8     	 453511	     3388 ns/op	 94.46 MB/s	     192 B/op	       2 allocs/op
BenchmarkHash320Bytes/sha1cd_cgo-8         	 145783	     7714 ns/op	 41.48 MB/s	    2688 B/op	       1 allocs/op
BenchmarkHash1K/sha1-8                     	2288988	      507.4 ns/op	2018.08 MB/s	       0 B/op	       0 allocs/op
BenchmarkHash1K/sha1cd_generic-8           	 346070	     5330 ns/op	192.11 MB/s	     192 B/op	       2 allocs/op
BenchmarkHash1K/sha1cd_cgo-8               	 100228	    14346 ns/op	 71.38 MB/s	    2688 B/op	       1 allocs/op
BenchmarkHash8K/sha1-8                     	 339747	     3707 ns/op	2209.64 MB/s	       0 B/op	       0 allocs/op
BenchmarkHash8K/sha1cd_generic-8           	  43082	    29421 ns/op	278.44 MB/s	     192 B/op	       2 allocs/op
BenchmarkHash8K/sha1cd_cgo-8               	  76657	    23501 ns/op	348.57 MB/s	    2688 B/op	       1 allocs/op
BenchmarkHashWithCollision/sha1cd_generic-8         	 297627	     5786 ns/op	110.62 MB/s	     192 B/op	       2 allocs/op
BenchmarkHashWithCollision/sha1cd_cgo-8             	  91353	    16493 ns/op	 38.81 MB/s	    2688 B/op	       1 allocs/op
PASS
ok  	github.com/pjbgf/sha1cd/test	25.041s
?   	github.com/pjbgf/sha1cd/ubc	[no test files]

Mar 24 '25 11:03 runxiyu

https://github.com/user-attachments/assets/4d2ba4ef-4941-422b-9c0a-7f2ee61c198a

I think this pretty conclusively shows that this commit is the issue here

commit bb5c35196ee55b82f6506d69257de0e59cd8c41d
Author: Paulo Gomes <[email protected]>
Date:   Sun Nov 17 00:34:11 2024 +0000

    plumbing: packfile, Refactor Parser and Scanner logic
    The changes focus on increasing thread-safety, simplifying the code base,
    enabling support for sha256 and improving general time and space complexities.

Mar 24 '25 11:03 runxiyu

As far as I can see, it's something happening in packfile/indexfile handling rather than sha1cd

Mar 24 '25 12:03 runxiyu

@runxiyu can you please test #1523 again, I believe the latest changes should make an impact on the issue you are experiencing.

May 29 '25 08:05 pjbgf

@pjbgf Still stuck for about a second on the .gitignore file in that Asahi Linux repo, as of

commit f9f2ae49b16909af2adee3c26e03dc54d8e5fe10
tree 183fe4feacf2c9aebc8e4884e67544cb389e14f2
parent 0b2ab6d34260bc70733a67460da5f3bebbea9488
author Paulo Gomes <[email protected]> 1748507032 +0100
committer Paulo Gomes <[email protected]> 1748507032 +0100
gpgsig -----BEGIN SSH SIGNATURE-----
 U1NIU0lHAAAAAQAAAEoAAAAac2stc3NoLWVkMjU1MTlAb3BlbnNzaC5jb20AAAAgSDBGoe
 bYEec9307bC2ANGbdfPx+1JwqiGybGfpGHa7IAAAAEc3NoOgAAAANnaXQAAAAAAAAABnNo
 YTUxMgAAAGcAAAAac2stc3NoLWVkMjU1MTlAb3BlbnNzaC5jb20AAABAtJgwBjRlCxoYPF
 ASIYKxt1SeuIAdZbDac6vSRbSX5nOu1hFakGuHIkhcssKejqrgK/9iWwOn+9xoQsnJmv+q
 BgEAAKx9
 -----END SSH SIGNATURE-----

    plumbing: transport, Use KeepDescriptors by default
    In filesystem storage, the packfile caching is only used if KeepDescriptors
    is enabled. This should make overall operations using filesystem storage more
    efficient.
    
    Signed-off-by: Paulo Gomes <[email protected]>

May 31 '25 13:05 runxiyu