Keep single global object cache; Automatically repack all objects into the cloned repo
This project looks like it's almost exactly what I'm looking for in an object caching solution, but I have RTFS'd and found two problems (correct me if I'm wrong):
- You seem to use a separate repo per domain (so if I clone the linux kernel from kernel.org, and then clone android's repo, I still end up downloading duplicated blobs)
- Damaged caches will screw you up, as described in: http://randyfay.com/node/119
Assuming that I don't care about disk usage (only network), I would like to propose the following modifications (I will call the locally checked out repo LOCAL_REPO):
- Switch to using a single global `~/.git_cached/global_cache.git/`
- Automatically repack all objects into the locally checked out repo (as suggested by randyfay) so that it works as a standalone repo. (A sketch of the whole flow follows this list.)
- After cloning, in `~/.git_cached/global_cache.git/`, run `git remote add -f local_$(hash $LOCAL_REPO) $LOCAL_REPO`. This makes updating the global cache simply a case of running `git remote update` in `~/.git_cached/global_cache.git`.
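In shell terms, a minimal sketch of that clone path might look like the following. The paths are illustrative, and the `hash` helper in the proposal is a stand-in — here I approximate it with `git hash-object`:

```sh
#!/bin/sh
# Sketch of the proposed flow, not the project's actual code.
set -e

CACHE=~/.git_cached/global_cache.git
URL=$1
LOCAL_REPO=$2

# Create the single global cache on first use.
[ -d "$CACHE" ] || git init --bare "$CACHE"

# Clone, borrowing any objects already present in the global cache.
git clone --reference "$CACHE" "$URL" "$LOCAL_REPO"

# Repack everything into the clone and drop the alternates pointer, so
# the clone works as a standalone repo (randyfay's suggestion).
cd "$LOCAL_REPO"
git repack -a -d
rm -f .git/objects/info/alternates

# Register the clone as a remote of the cache; running
# `git remote update` in the cache now refreshes it from
# every registered local repo.
name=local_$(printf '%s' "$PWD" | git hash-object --stdin)
git --git-dir="$CACHE" remote add -f "$name" "$PWD"
```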
Do these improvements sound sane, or am I off my rocker?
Sorry for hijacking your issues page as a work log. I submitted:
https://github.com/alsuren/git/commit/d49a04e49255c70951fb01c343931f44c4e69566
to #git on freenode, and got the following feedback:
[21:19] <alsuren_> cmn: the eventual idea of the patch is to help you create a global object cache in ~/.git_object_cache or similar, so that you can clone the linux kernel from upstream, and then clone it from android and not have to think about whether you're downloading it too many times
[22:04] <alsuren_> grawity: cmn: so if I specify --object-cache or --reference, my client will send up to 256 * "have ${obj_id}\n" (in 32 line blocks) until it receives "ACK ${obj_id} ready", and then the server will send it a pack with all of the objects that the client doesn't have
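For anyone following along, the negotiation described above looks roughly like this on the wire (a sketch of the smart-protocol exchange; the object ids are made up):

```
C: want <tip-sha1> <capabilities>   # client asks for the branch tips
C: have <sha1-001>
   ...                              # "have" lines sent in blocks of 32,
C: have <sha1-256>                  #  up to 256 in total
S: ACK <sha1-nnn> ready             # server found enough common objects
C: done
S: <packfile>                       # only the objects the client lacks
```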
I'm thinking that if we added an extension to protocol-capabilities.txt that asks the server for a list of root commits in the initial handshake (as well as the tip refs), we could use it as the basis of an efficient object cache/lookup/negotiation mechanism.
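For context, "root commits" here means parentless commits, which stock git can already list locally; two unrelated clones of the same project share them, which is what would make them usable as a cache lookup key:

```sh
# List the root (parentless) commits reachable from any ref.
git rev-list --max-parents=0 --all
```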
Will sleep on it and tell you if I have any further brainwaves.
Hey, thanks for the interest, but keep in mind that I did this to scratch my own itch, which is this:
Check out multiple branches of a project at once without having to pull all the objects x number of times. My main concern was speed. Size is somewhat of a concern, but not as much as speed.
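(For anyone reading along: the effect being described is available in stock git via `--reference`, which lets later clones borrow objects from an earlier one; an illustrative example with a made-up URL:)

```sh
# The second clone borrows objects from the first
# instead of re-downloading them.
git clone https://example.com/project.git project-a
git clone --reference project-a https://example.com/project.git project-b
```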
Your conversation is way over my head. This is the first bash script I created and I looked up just enough to be able to accomplish the above points.
If you have ideas to make it even better, I'm all for it. As long as I can keep using it like I'm doing now.
I'd love to look into your suggestion, but I'll be very busy for a while. Keep the ideas coming, and maybe some code? :) Thanks!
> You seem to use a separate repo per domain (so if I clone the linux kernel from kernel.org, and then clone android's repo, I still end up downloading duplicated blobs)
I just did it to be safe. It's probably not necessary.
> Damaged caches will screw you up, as described in: http://randyfay.com/node/119
It can, but there are `repair` and `no-cache` commands; the latter simply repacks.
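A quick way to check with stock git whether a given clone is still tied to a cache, and whether its objects are intact after a repair:

```sh
# If this file exists and is non-empty, the repo still borrows
# objects from a cache and is not standalone.
cat .git/objects/info/alternates 2>/dev/null

# Sanity-check object connectivity after repairing/repacking.
git fsck --full
```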