Big File Support Spec
Please provide the specification for the big file support feature. If you don't know where to start, here are some example topics to write about.
How big files are hashed. How merkle trees are made.
Piece size, hashing algorithm, number of leaf nodes, etc.
How a big file is represented in content.json
Piece field format, hashing algorithm, keywords, etc.
Does big file support introduce changes to the network protocol?
Are big files transmitted over the current network protocol? Is the network protocol changed?
What are the preferred ways to store an incomplete big file?
How did you do it?
The status of the current specification.
Draft, pending review, recently revised, final version, etc.
Relevant code snippets from BigfilePlugin.py
content["files_optional"][file_relative_path] = {
    "sha512": merkle_root,
    "size": upload_info["size"],
    "piecemap": piecemap_relative_path,
    "piece_size": piece_size
}
# ...
return {
    "merkle_root": merkle_root,
    "piece_num": len(piecemap_info["sha512_pieces"]),
    "piece_size": piece_size,
    "inner_path": inner_path
}
# ...
def hashBigfile(self, file_in, size, piece_size=1024 * 1024, file_out=None):
    # method source code...
Relevant code snippets from the unit tests.
merkle_root, piece_size, piecemap_info = site.content_manager.hashBigfile(...)
piecemap_info["sha512_pieces"][0].encode("hex")
msgpack.pack({file_name: piecemap_info}, stream)
assert file_node["piecemap"] == inner_path + ".piecemap.msgpack"
assert piecemap["sha512_pieces"][0].encode("hex") == \
    "a73abad9992b3d0b672d0c2a292046695d31bebdcb1e150c8410bbe7c972eff3"
Thanks for the suggestions, they really helped :) I have added the information to the docs: https://zeronet.readthedocs.io/en/latest/help_zeronet/network_protocol/#bigfile-plugin
How big files are hashed. How merkle trees are made.
https://zeronet.readthedocs.io/en/latest/help_zeronet/network_protocol/#bigfile-merkle-root
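To make the hashing step more concrete, here is a rough sketch of per-piece hashing followed by a merkle reduction. It is not the plugin's code. The names `hash_pieces` and `merkle_root` are made up for illustration, the 32-byte truncated SHA-512 digest is an assumption based on the 64-hex-character hash in the unit test, and the pairwise tree below may differ from what the real implementation builds. Only the 1 MiB default piece size comes from the `hashBigfile` signature.

```python
import hashlib
import io

PIECE_SIZE = 1024 * 1024  # default piece_size in the hashBigfile signature


def hash_pieces(file_in, piece_size=PIECE_SIZE):
    """Return one 32-byte digest per piece read from file_in (hypothetical helper)."""
    digests = []
    while True:
        piece = file_in.read(piece_size)
        if not piece:
            break
        # Assumption: SHA-512 truncated to 256 bits, which matches the
        # 64-hex-character digest length seen in the unit test.
        digests.append(hashlib.sha512(piece).digest()[:32])
    return digests


def merkle_root(leaves):
    """Illustrative pairwise reduction; not necessarily ZeroNet's exact tree."""
    if not leaves:
        return b""
    level = list(leaves)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                next_level.append(hashlib.sha512(pair[0] + pair[1]).digest()[:32])
            else:
                next_level.append(pair[0])  # odd node is carried up unchanged
        level = next_level
    return level[0]


# Usage: hash an in-memory "file" slightly larger than two pieces.
data = io.BytesIO(b"a" * (2 * PIECE_SIZE + 5))
pieces = hash_pieces(data)
root = merkle_root(pieces)
print(len(pieces), root.hex())
```

The per-piece digests are what would end up in the piecemap, while only the root needs to be signed in content.json.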
How a big file is represented in content.json
https://zeronet.readthedocs.io/en/latest/help_zeronet/network_protocol/#bigfile-piecemap
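For readers who want to see the shape of the entry without opening the docs, here is a small sketch that builds a `files_optional` entry with the same four keys as the BigfilePlugin.py snippet. `make_bigfile_entry` and `piece_count` are hypothetical helpers, the file path and root value are placeholders, and the piecemap name follows the `inner_path + ".piecemap.msgpack"` pattern from the unit test, assuming content.json sits at the site root.

```python
def make_bigfile_entry(inner_path, size, merkle_root_hex, piece_size=1024 * 1024):
    """Build a files_optional entry shaped like the BigfilePlugin.py snippet."""
    return {
        "sha512": merkle_root_hex,  # the merkle root, not a hash of the whole file
        "size": size,
        "piecemap": inner_path + ".piecemap.msgpack",
        "piece_size": piece_size,
    }


def piece_count(size, piece_size=1024 * 1024):
    """Number of pieces needed to cover `size` bytes (ceiling division)."""
    return (size + piece_size - 1) // piece_size


# Usage with placeholder values.
entry = make_bigfile_entry("data/video.mp4", 10 * 1024 * 1024 + 1, "<merkle root hex>")
print(entry["piecemap"])           # data/video.mp4.piecemap.msgpack
print(piece_count(entry["size"]))  # 11
```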
Does big file support introduce changes to the network protocol?
Yes: getPieceFields and setPieceFields were added.
What are the preferred ways to store an incomplete big file?
The ZeroNet client creates a sparse file with the final size of the big file at the beginning of the download process (on Windows this requires the fsutil command).
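A minimal sketch of that idea, not the client's actual code: preallocate the file at its final size, then write each piece at its own offset as it arrives. `create_placeholder` and `write_piece` are hypothetical names. On most Unix filesystems `truncate` leaves a hole, so the file is sparse; on Windows the file would need to be flagged as sparse separately, which is presumably where fsutil comes in.

```python
import os
import tempfile


def create_placeholder(path, size):
    """Create a file of the final size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)  # sparse on filesystems that support holes


def write_piece(path, piece_index, data, piece_size=1024 * 1024):
    """Write one downloaded piece at its offset inside the placeholder."""
    with open(path, "r+b") as f:
        f.seek(piece_index * piece_size)
        f.write(data)


# Usage: a 10 MiB placeholder, with piece 3 arriving first.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "bigfile.bin")
create_placeholder(path, 10 * 1024 * 1024)
write_piece(path, 3, b"x" * 1024 * 1024)
print(os.path.getsize(path))  # 10485760
```

Because every piece has a fixed offset, pieces can be written in any order and the file never needs to be rearranged.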
The status of the current specification.
It's already implemented and being used, but nothing is written in stone :)
Packed format: turns the string into a list of ints by counting the runs of repeating characters. The first count always refers to 1s, so a string that starts with 0 gets a leading 0. Example:
1110000001 to [3, 6, 1], 0000000001 to [0, 9, 1], 1111111111 to [10]
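The packing rule above can be sketched as a simple run-length encoder. The function names `pack_piecefield` and `unpack_piecefield` are made up for this example and are not taken from the plugin.

```python
def pack_piecefield(bits):
    """Run-length encode a string of '0'/'1'; the first count is for '1'."""
    packed = []
    current = "1"
    count = 0
    for ch in bits:
        if ch == current:
            count += 1
        else:
            packed.append(count)  # close the finished run (may be 0 at the start)
            current = ch
            count = 1
    if bits:
        packed.append(count)
    return packed


def unpack_piecefield(packed):
    """Inverse of pack_piecefield: expand the counts back into a bit string."""
    bits = []
    current = "1"
    for count in packed:
        bits.append(current * count)
        current = "0" if current == "1" else "1"
    return "".join(bits)


print(pack_piecefield("1110000001"))  # [3, 6, 1]
print(pack_piecefield("0000000001"))  # [0, 9, 1]
print(pack_piecefield("1111111111"))  # [10]
print(unpack_piecefield([0, 9, 1]))   # 0000000001
```

Long runs collapse to a single number, while a strictly alternating string such as 1010101010 packs to ten 1s and saves nothing.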
Checker-board pattern?
| Y | N | Y |
|---|---|---|
| 3 | 6 | 1 |

| Y | N | Y |
|---|---|---|
| 0 | 9 | 1 |

| Y |
|---|
| 10 |
Yes, it assumes most of the piecefield will not be fragmented and that users will download pieces in batches.