Using IPFS in atomic data
IPFS (or other content-addressing protocols) is a very interesting technology, especially for linked data, as it helps make static resources highly available. Atomic Data Properties are an example of where this is very important: it is essential that these resolve, and it can be harmful if the owner decides to change a datatype, for example.
Relates to #64
Howdy, this question came up in my own protocol R&D. I do think content-addressing is important, for resilience, for cheaper faster hosting, for applications that require verifiability, and even for offline-first app development (I think it'll simplify things by allowing the creator of a datum to already know its global ID before having published it). Immutability isn't an issue. For mutable data, the content-address would link to an immutable document that describes its type, the initial state, the update permission rules, and the preferred hosts. Edits are stored as part of a chain linking back to that definition document.
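To make the definition-document idea concrete, here is a minimal Rust sketch. Everything in it is an assumption made for illustration: the `Definition` and `Edit` shapes, the text encoding, and especially `content_id`, which uses 64-bit FNV-1a as a stand-in where a real system would use a cryptographic hash such as SHA-256 or BLAKE3.

```rust
/// Stand-in content hash (64-bit FNV-1a). NOT cryptographic: a real system
/// would use a cryptographic hash here.
fn content_id(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Immutable definition document for a mutable datum (hypothetical shape).
struct Definition {
    type_id: String,
    initial_state: String,
    update_rules: String,
    preferred_hosts: Vec<String>,
}

impl Definition {
    /// Made-up canonical encoding; the ID depends on these bytes alone,
    /// so the creator can compute it before publishing anything.
    fn to_bytes(&self) -> Vec<u8> {
        format!(
            "type={}\ninitial={}\nrules={}\nhosts={}",
            self.type_id,
            self.initial_state,
            self.update_rules,
            self.preferred_hosts.join(",")
        )
        .into_bytes()
    }

    fn id(&self) -> u64 {
        content_id(&self.to_bytes())
    }
}

/// One edit. `prev` is the ID of the previous edit, or of the definition
/// document for the first edit in the chain.
struct Edit {
    prev: u64,
    new_state: String,
}

impl Edit {
    fn id(&self) -> u64 {
        let mut bytes = self.prev.to_be_bytes().to_vec();
        bytes.extend_from_slice(self.new_state.as_bytes());
        content_id(&bytes)
    }
}

/// Walks the edits in order and checks that each one links to its
/// predecessor. Returns the latest state, or `None` if the chain is broken.
fn resolve(def: &Definition, edits: &[Edit]) -> Option<String> {
    let mut head = def.id();
    let mut state = def.initial_state.clone();
    for edit in edits {
        if edit.prev != head {
            return None;
        }
        head = edit.id();
        state = edit.new_state.clone();
    }
    Some(state)
}
```

The sketch does not enforce the update permission rules; it only shows that the address of the mutable datum is the address of its immutable definition, and that edits hang off it as a hash-linked chain.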
But I doubt that IPFS, or even an IPLD (IPFS's standard formats) format, is what we want.
Here's a bsky thread where I lament the fact that IPLD dag-cbor keys can't be CIDs (IPFS links), meaning that you won't be able to use IPLD maps with links to the type definition of your properties as your keys/property names. So that already kind of settles it: you can't use IPLD.
There are ways of working around it, but why do we want IPLD, again? I had a brief look at the current top decentralized storage systems, and it wasn't clear that any of them used IPLD. So I was only considering using IPLD for the sake of ATProto compatibility, but I'm not sure what the point is in being just a little bit compatible with ATProto in that way given that it's not going to make us actually compatible with ATProto. And, if something like atomic data took off, people shouldn't really keep using ATProto! It would be a much weaker protocol!
I also sense that having a robust distributed type system/self-describing typed data might make some of the features and limitations of IPLD formats unnecessary. I'm tempted to standardize just type tags, with wasm implementations of the types. In that situation there wouldn't need to be a standard codec or even a link format: the data would be an opaque blob, and the host would use the type ID to fetch implementations of `trace(self) -> Iterator<Resource>` that tell it how to crawl the blob for links to the other blobs it needs to cache with it. That would be all the host needs to know about the data. But I'm not as sure about this. Whichever storage services crawl the data will run a lot slower if they have to fetch and run a menagerie of wasm implementations to trace the data. (But how fast does a crawler need to be?)
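A rough Rust sketch of that crawling model follows. It makes several assumptions: plain function pointers stand in for the wasm implementations, `ResourceId` and `TypeId` are simple integers rather than real content addresses, and the two example types (`trace_link_list`, `trace_leaf`) are invented for the demonstration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Content address of a blob (hypothetical; a real system would use a hash).
type ResourceId = u64;
/// Identifier of a type definition (hypothetical).
type TypeId = u64;

/// An opaque blob tagged with its type. The host never parses `bytes` itself.
struct Blob {
    type_id: TypeId,
    bytes: Vec<u8>,
}

/// Stand-in for a wasm-implemented `trace`: given the raw bytes,
/// yield the IDs of the blobs this blob links to.
type Tracer = for<'a> fn(&'a [u8]) -> Box<dyn Iterator<Item = ResourceId> + 'a>;

/// Tracer for a made-up type whose payload is a packed list of big-endian u64 links.
fn trace_link_list<'a>(bytes: &'a [u8]) -> Box<dyn Iterator<Item = ResourceId> + 'a> {
    Box::new(bytes.chunks_exact(8).map(|chunk| {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        u64::from_be_bytes(buf)
    }))
}

/// Tracer for a made-up leaf type that contains no links.
fn trace_leaf<'a>(_bytes: &'a [u8]) -> Box<dyn Iterator<Item = ResourceId> + 'a> {
    Box::new(std::iter::empty())
}

/// Collects every ID reachable from `root`, i.e. what a host would cache together.
/// IDs missing from `store`, or blobs with an unknown type, are recorded but not followed.
fn crawl(
    root: ResourceId,
    store: &HashMap<ResourceId, Blob>,
    tracers: &HashMap<TypeId, Tracer>,
) -> HashSet<ResourceId> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    seen.insert(root);
    queue.push_back(root);
    while let Some(id) = queue.pop_front() {
        let blob = match store.get(&id) {
            Some(b) => b,
            None => continue,
        };
        let tracer = match tracers.get(&blob.type_id) {
            Some(t) => *t,
            None => continue,
        };
        for link in tracer(blob.bytes.as_slice()) {
            // `insert` returns false for already-seen IDs, so cycles terminate.
            if seen.insert(link) {
                queue.push_back(link);
            }
        }
    }
    seen
}
```

The host only ever touches the type tag and the tracer's output, which is the point of the design. The cost raised above shows up in the `tracers.get` step: in a real system that lookup could mean fetching and instantiating a wasm module per type.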
As for using IPFS specifically, I've heard it's slow, and I don't expect a purely p2p storage protocol to ever be fast, so I would only want one as a fallback. But there are other protocols in the IPLD scene like Iroh that augment the base p2p protocol with paid hosting, and so aren't slow, which is the right way of doing it.
Hi @makoConstruct! Thanks for sharing your thoughts here!
Idea of how to add iroh support to AtomicServer
- Add the `iroh` libraries to `AtomicServer`.
- Use that storage engine as KV store for resources. Or perhaps as a duplicate / backup, depending on how big the perf hit will be. We still keep `sled` + `tantivy` for indexes!
- Also store files on `iroh` as default.
- How do we map atomic `Resources` to `iroh` constructs like `Blob` and `Document`? Since documents are where authorization happens, we should probably think of `Resources` as `Documents`. Every `iroh::Document` is a bundle of Key-Value combinations, similar to `atomic::Resource`.
- > Read access to a document is granted by sharing the document public key. Write access to a document is granted by sharing the document private key.

  This means that if we refer to an IPFS document using its public key, we make it public. This is very different from how Atomic Server deals with authentication. How do we deal with this? Well, one way is to only use the `publicKey` as a property inside the resource. That way, if you can read the Resource, you can view the Document.
- I'd like AtomicServer to be a `local-first` app. It would be nice if it worked without any server, yet still allowed sharing things through P2P. Ideally, we'd use iroh IDs for things like atomic data core properties etc.
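The mapping discussed in the list above could look roughly like the following Rust sketch. It deliberately does not use the iroh API: `KvDocument` is a hypothetical stand-in that only models "a bundle of key-value entries plus a public key", `Resource` is a simplified property map with the subject omitted, and `DOC_KEY_PROP` is a made-up property URL.

```rust
use std::collections::BTreeMap;

/// Simplified Atomic Data resource: property URL -> value (subject omitted).
struct Resource {
    propvals: BTreeMap<String, String>,
}

/// Hypothetical stand-in for a key-value document in a store such as iroh.
/// This is NOT the iroh API.
struct KvDocument {
    public_key: String,
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
}

/// Made-up property URL under which a resource keeps its document's public key.
const DOC_KEY_PROP: &str = "https://example.com/properties/documentPublicKey";

/// Turns each property/value pair into one document entry. The key property
/// itself is left out of the entries; it lives only on the resource.
fn to_document(resource: &Resource, public_key: &str) -> KvDocument {
    let entries = resource
        .propvals
        .iter()
        .filter(|(prop, _)| prop.as_str() != DOC_KEY_PROP)
        .map(|(prop, val)| (prop.as_bytes().to_vec(), val.as_bytes().to_vec()))
        .collect();
    KvDocument {
        public_key: public_key.to_string(),
        entries,
    }
}

/// Hands out the document key only to callers who may already read the
/// resource, so resource read access gates document read access.
fn document_key<'a>(resource: &'a Resource, caller_can_read: bool) -> Option<&'a str> {
    if !caller_can_read {
        return None;
    }
    resource.propvals.get(DOC_KEY_PROP).map(|s| s.as_str())
}
```

In this sketch `caller_can_read` would come from whatever authorization check the server already performs on the resource. Note that once a key has been handed out it cannot be taken back, which is the limitation of key-based access.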
> This means that if we refer to an IPFS document using its public key, we make it public. This is very different from how Atomic Server deals with authentication. How do we deal with this?
Mm, keypairs aren't a good auth system for most applications.
Access to documents should generally be revocable in the sense that new reads or writes can be prevented if access is abused. (For this reason, although object capability security might make sense as a paradigm on the implementation level, the UX for managing access will usually end up being an access control list.)
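As a minimal illustration of what "revocable" means here, the Rust sketch below models an access control list that the host consults on every request. The `Right` and `AccessControlList` names and the string agent identifiers are assumptions made for the example, not part of Atomic Server or iroh.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum Right {
    Read,
    Write,
}

/// Per-document access control list, checked by the host on each request.
#[derive(Default)]
struct AccessControlList {
    grants: HashMap<String, HashSet<Right>>,
}

impl AccessControlList {
    fn grant(&mut self, agent: &str, right: Right) {
        self.grants
            .entry(agent.to_string())
            .or_default()
            .insert(right);
    }

    /// Revoking takes effect for all future requests. A shared private key,
    /// by contrast, cannot be "un-shared".
    fn revoke(&mut self, agent: &str, right: Right) {
        if let Some(rights) = self.grants.get_mut(agent) {
            rights.remove(&right);
        }
    }

    fn allows(&self, agent: &str, right: Right) -> bool {
        self.grants
            .get(agent)
            .map_or(false, |rights| rights.contains(&right))
    }
}
```

Revocation only prevents new reads and writes; anything an agent has already copied stays copied, which matches the weaker sense of "revocable" described above.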
And using keypairs for identity in general is not great. Keys get lost or stolen. A solid auth system needs some sort of key rotation or identity recovery process. That can be done in a decentralized way, but it's currently quite rare. I don't know if I can see incompatibilities between keypair auth and the quite good recoverable migratory keyset identity systems like holochain::deepkey or did:keri. It's possible we'll need simple keypair stuff to implement the recoverable identity systems.