Support v2 and hybrid v1&v2 torrents

Open nl6720 opened this issue 3 years ago • 2 comments

From what I can see, archweb doesn't support v2 or hybrid v1&v2 torrents in release.json. I get a "not a valid bencoded string" error when adding a base64-encoded hybrid or v2 torrent as torrent_data in releng/fixtures/release.json.

See https://blog.libtorrent.org/2020/09/bittorrent-v2/ for details on BitTorrent protocol v2.

The SHA-256-based v2 info hash needs to be exposed in templates/releng/release_detail.html and included in magnet links.
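To illustrate what exposing the v2 hash could involve, here is a rough sketch, not archweb code. It assumes the info dict has already been decoded with bytes keys and that re-encoding it reproduces the original bytes. The helpers `bencode`, `info_hashes` and `magnet_link` are hypothetical names written for this example. The v2 hash is the SHA-256 of the bencoded info dict, and the magnet link carries it as a multihash under `urn:btmh:`.

```python
import hashlib


def bencode(value):
    """Minimal bencoder covering int, bytes, str, list and dict."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:" % len(value) + value
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hashes(info):
    """Return (v1_hex, v2_hex); an entry is None when that version is absent."""
    encoded = bencode(info)
    v1 = hashlib.sha1(encoded).hexdigest() if b"pieces" in info else None
    v2 = (
        hashlib.sha256(encoded).hexdigest()
        if info.get(b"meta version") == 2
        else None
    )
    return v1, v2


def magnet_link(info):
    """Build a magnet URI carrying every info hash the torrent provides."""
    v1, v2 = info_hashes(info)
    topics = []
    if v1:
        topics.append("xt=urn:btih:" + v1)
    if v2:
        # "1220" is the multihash prefix: 0x12 = sha2-256, 0x20 = 32-byte digest.
        topics.append("xt=urn:btmh:1220" + v2)
    return "magnet:?" + "&".join(topics)


# Toy hybrid info dict, for illustration only (not a valid torrent).
toy_info = {
    b"name": b"example.iso",
    b"piece length": 16384,
    b"pieces": b"\x00" * 20,
    b"meta version": 2,
    b"file tree": {},
}
print(magnet_link(toy_info))
```

A hybrid torrent ends up with both `xt` parameters, so v1-only clients can still use the link.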

nl6720 avatar Oct 07 '22 10:10 nl6720

Quick check, we should change: https://github.com/archlinux/archweb/blob/e864c90936157be5452e81521a717208761ca056/releng/models.py#L6

And use libtorrent instead, as it's more likely to be up to date. It also has libtorrent.bdecode(), which does a much better job (and is v2 compliant). The only downside is that it produces bytes objects for both the keys and the values of the dict, so if that is an issue we need something like:

import datetime
import json

import libtorrent

def jsonify(obj):
	"""
	Converts objects into json.dumps() compatible nested dictionaries.
	"""

	compatible_types = str, int, float, bool, bytes
	if isinstance(obj, dict):
		return {
			jsonify(key): jsonify(value)
			for key, value in obj.items()
			if isinstance(key, compatible_types)
		}
	if isinstance(obj, bytes):
		return obj.decode('UTF-8', errors='replace')
	if isinstance(obj, (datetime.datetime, datetime.date)):
		return obj.isoformat()
	if isinstance(obj, (list, set, tuple)):
		return [jsonify(item) for item in obj]

	return obj

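class JSON(json.JSONEncoder, json.JSONDecoder):
	def encode(self, obj) -> str:
		# Normalise bytes, dates and sets before handing off to the stock encoder.
		return super().encode(jsonify(obj))

# `torrent` is assumed to be an already-decoded torrent dict.
json.dumps(libtorrent.bdecode(libtorrent.bencode(torrent)), cls=JSON)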

The above is a crude snippet, and it might only solve the "not a valid bencoded string" error.

Torxed avatar Nov 12 '23 16:11 Torxed

Luckily we save the Release torrent_data as a base64-encoded file, so that should allow a switch. Another bencode user is:

    def torrent(self):
        try:
            data = b64decode(self.torrent_data.encode('utf-8'))
        except (TypeError, binascii.Error):
            return None
        if not data:
            return None
        data = bdecode(data)
        # transform the data into a template-friendly dict
        info = data.get('info', {})
        metadata = {
            'comment': data.get('comment', None),
            'created_by': data.get('created by', None),
            'creation_date': None,
            'announce': data.get('announce', None),
            'file_name': info.get('name', None),
            'file_length': info.get('length', None),
            'piece_count': len(info.get('pieces', '')) / 20,
            'piece_length': info.get('piece length', None),
            'url_list': data.get('url-list', []),
            'info_hash': None,
        }
        if 'creation date' in data:
            metadata['creation_date'] = datetime.fromtimestamp(data['creation date'], tz=timezone.utc)
        if info:
            metadata['info_hash'] = hashlib.sha1(bencode(info)).hexdigest()
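A v2-aware version of the property above would also need to tell the torrent flavours apart, because a v2-only torrent has no pieces key and the piece count computed above would come out as 0. A rough sketch, assuming the decoder returns bytes keys (as libtorrent.bdecode() is said to do earlier in this thread); `torrent_version` and `v1_piece_count` are hypothetical names, not archweb code:

```python
def torrent_version(info):
    """Classify a decoded info dict as 'v1', 'v2', 'hybrid' or None."""
    has_v1 = b"pieces" in info                 # v1 stores concatenated SHA-1 piece hashes
    has_v2 = info.get(b"meta version") == 2    # v2 marks itself with "meta version"
    if has_v1 and has_v2:
        return "hybrid"
    if has_v2:
        return "v2"
    if has_v1:
        return "v1"
    return None


def v1_piece_count(info):
    """Number of 20-byte SHA-1 piece hashes; 0 for v2-only torrents."""
    return len(info.get(b"pieces", b"")) // 20
```

Integer division keeps the count an int under Python 3, and the version string could decide which hashes the template shows.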

jelly avatar Nov 13 '23 10:11 jelly