Support v2 and hybrid v1&v2 torrents
From what I can see, archweb doesn't support v2 and hybrid v1&v2 torrents in release.json.
I get a "not a valid bencoded string" error when adding a base64-encoded hybrid or v2 torrent as torrent_data in releng/fixtures/release.json.
See https://blog.libtorrent.org/2020/09/bittorrent-v2/ for details on BitTorrent protocol v2.
The SHA256-based info hash v2 needs to be exposed in templates/releng/release_detail.html and as part of magnet links.
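For reference, a rough sketch of what exposing both hashes could look like. It assumes the info dict has already been decoded with bytes keys, and it uses a tiny stand-in bencoder so the example is self-contained. The helper names (`bencode_min`, `info_hashes`, `magnet_link`) are made up for illustration and are not archweb code. Per BEP 52, the v2 info hash is the SHA-256 of the bencoded info dict, and it goes into the magnet link as `urn:btmh:` with the `1220` multihash prefix.

```python
import hashlib
from urllib.parse import quote


def bencode_min(obj):
    """Minimal bencoder for int, bytes, str, list and dict (illustration only)."""
    if isinstance(obj, bool):
        raise TypeError("bool is not bencodable")
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, str):
        obj = obj.encode("utf-8")
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, (list, tuple)):
        return b"l" + b"".join(bencode_min(item) for item in obj) + b"e"
    if isinstance(obj, dict):
        # Dictionary keys are byte strings, sorted by their raw bytes.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in obj.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode_min(k) + bencode_min(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(obj).__name__}")


def info_hashes(info):
    """Return (v1_hex, v2_hex); either is None if that version is absent."""
    # Caveat: re-encoding only reproduces the original bytes if the torrent
    # was canonically encoded; hashing the raw info bytes would be safer.
    encoded = bencode_min(info)
    has_v1 = b"pieces" in info
    has_v2 = info.get(b"meta version") == 2
    v1 = hashlib.sha1(encoded).hexdigest() if has_v1 else None
    v2 = hashlib.sha256(encoded).hexdigest() if has_v2 else None
    return v1, v2


def magnet_link(info, name=None):
    """Build a magnet link carrying whichever info hashes the torrent has."""
    v1, v2 = info_hashes(info)
    parts = []
    if v1:
        parts.append("xt=urn:btih:" + v1)
    if v2:
        # Multihash prefix: 0x12 = sha2-256, 0x20 = 32-byte digest.
        parts.append("xt=urn:btmh:1220" + v2)
    if name:
        parts.append("dn=" + quote(name))
    return "magnet:?" + "&".join(parts)
```

A hybrid torrent would end up with both `xt` parameters in the same link, a v1-only torrent with just `urn:btih:`.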
After a quick check, we should change this line: https://github.com/archlinux/archweb/blob/e864c90936157be5452e81521a717208761ca056/releng/models.py#L6
and use libtorrent instead, as it's more likely to be up to date. It also has libtorrent.bdecode(), which does a much better job (and is v2 compliant). The only downside is that it produces bytes objects for both the keys and the values of the dict, so if that is an issue we need something like:
```python
import datetime
import json

import libtorrent


def jsonify(obj):
    """
    Converts objects into json.dumps() compatible nested dictionaries.
    """
    compatible_types = str, int, float, bool, bytes
    if isinstance(obj, dict):
        return {
            jsonify(key): jsonify(value)
            for key, value in obj.items()
            if isinstance(key, compatible_types)
        }
    if isinstance(obj, bytes):
        return obj.decode('UTF-8', errors='replace')
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    if isinstance(obj, (list, set, tuple)):
        return [jsonify(item) for item in obj]
    return obj


class JSON(json.JSONEncoder, json.JSONDecoder):
    def encode(self, obj) -> str:
        return super().encode(jsonify(obj))


# `torrent` is the torrent metadata dict to round-trip through libtorrent
json.dumps(libtorrent.bdecode(libtorrent.bencode(torrent)), cls=JSON)
```
The above is a crude snippet, and it might only solve the "not a valid bencoded string" error.
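One caveat with decoding everything as UTF-8: binary values such as `pieces` (and `piece layers` in v2) aren't text, so `errors='replace'` mangles them. If the raw bytes are still needed afterwards, for example to recompute info hashes, a variant that only converts keys might be safer. A minimal sketch; `decode_keys` is a hypothetical helper, not part of libtorrent or archweb, and it assumes that keys which fail strict UTF-8 decoding are binary and should stay as bytes:

```python
def _decode_key(key):
    """Decode a bytes key to str, keeping it as bytes if it is not valid UTF-8."""
    if isinstance(key, bytes):
        try:
            return key.decode("utf-8")
        except UnicodeDecodeError:
            # Likely a binary key, e.g. a merkle root inside 'piece layers'.
            return key
    return key


def decode_keys(obj):
    """Recursively convert dict keys to str; leave all values untouched."""
    if isinstance(obj, dict):
        return {_decode_key(key): decode_keys(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [decode_keys(item) for item in obj]
    return obj


# Example input shaped like a bdecode() result with bytes keys and values.
decoded = {b"info": {b"name": b"arch.iso", b"pieces": b"\xff\x00" * 10}}
friendly = decode_keys(decoded)
# Keys are now str, but the piece hashes are still the original raw bytes.
print(friendly["info"]["name"], len(friendly["info"]["pieces"]))
```

The result is not directly JSON-serialisable because values stay bytes, so this would suit template access better than `json.dumps()`.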
Luckily, we store the Release torrent_data as a base64-encoded file, so that should allow a switch. Another user of bencode is:
```python
def torrent(self):
    try:
        data = b64decode(self.torrent_data.encode('utf-8'))
    except (TypeError, binascii.Error):
        return None
    if not data:
        return None
    data = bdecode(data)
    # transform the data into a template-friendly dict
    info = data.get('info', {})
    metadata = {
        'comment': data.get('comment', None),
        'created_by': data.get('created by', None),
        'creation_date': None,
        'announce': data.get('announce', None),
        'file_name': info.get('name', None),
        'file_length': info.get('length', None),
        'piece_count': len(info.get('pieces', '')) / 20,
        'piece_length': info.get('piece length', None),
        'url_list': data.get('url-list', []),
        'info_hash': None,
    }
    if 'creation date' in data:
        metadata['creation_date'] = datetime.fromtimestamp(data['creation date'], tz=timezone.utc)
    if info:
        metadata['info_hash'] = hashlib.sha1(bencode(info)).hexdigest()
```
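Note that `length` and `pieces` in the info dict are v1 fields. A pure v2 torrent keeps its files in `file tree` instead, so `file_length` and `piece_count` would need another source there. A rough sketch of how that could be derived, assuming the dict layout described in BEP 52 and keys already converted to str (bytes keys are tolerated for the empty marker key only); `v2_file_entries` and `v2_metadata` are hypothetical names, not existing archweb functions:

```python
def v2_file_entries(tree, prefix=()):
    """Yield (path_tuple, length) for each file in a BEP 52 'file tree'."""
    for name, node in tree.items():
        if name in ("", b""):
            # An empty key marks the properties of the file itself.
            yield prefix, node.get("length", 0)
        else:
            yield from v2_file_entries(node, prefix + (name,))


def v2_metadata(info):
    """Collect template-friendly size fields from a v2 (or hybrid) info dict."""
    piece_length = info.get("piece length")
    files = list(v2_file_entries(info.get("file tree", {})))
    piece_count = None
    if piece_length:
        # In v2 every file starts on a piece boundary, so pieces are counted
        # per file, rounding up (integer ceiling division).
        piece_count = sum(-(-length // piece_length) for _, length in files)
    return {
        "file_name": info.get("name"),
        "file_length": sum(length for _, length in files),
        "piece_length": piece_length,
        "piece_count": piece_count,
    }


# Example with a single-file layout, as a release ISO torrent would have.
example_info = {
    "name": "arch.iso",
    "piece length": 16384,
    "meta version": 2,
    "file tree": {"arch.iso": {"": {"length": 40000, "pieces root": b"\x00" * 32}}},
}
print(v2_metadata(example_info))
```

For hybrid torrents the v1 fields are still present, so the existing lookups would keep working and this would only be needed as a fallback.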