umbra-scientia

2 issues from umbra-scientia

It's not clear how (or if) `tokenizers.models.BPE` is meant to be used for GPT-2 tokenization. We could not find an answer in the API documentation, so we developed an ugly...
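For context, GPT-2 tokenization is byte-level BPE: text is mapped to bytes, then adjacent token pairs are greedily merged according to a learned merge table. The sketch below is a minimal pure-Python illustration of that merge loop, not the `tokenizers` library's internals; the `merge_ranks` table and the toy merges are hypothetical stand-ins for the real `merges.txt` data.

```python
def bpe_encode(word, merge_ranks):
    """Greedy BPE: repeatedly merge the adjacent pair with the lowest rank.

    merge_ranks maps a pair of symbols to its priority; lower rank merges
    first. Pairs absent from the table are never merged. (GPT-2 runs this
    over bytes remapped to printable unicode characters; plain characters
    are used here for clarity.)
    """
    tokens = list(word)
    while len(tokens) > 1:
        # Rank every adjacent pair; unknown pairs get infinite rank.
        pairs = [(merge_ranks.get((a, b), float("inf")), i)
                 for i, (a, b) in enumerate(zip(tokens, tokens[1:]))]
        rank, i = min(pairs)
        if rank == float("inf"):
            break  # no mergeable pair left
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    return tokens

# Hypothetical toy merge table.
ranks = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
print(bpe_encode("lower", ranks))  # -> ['low', 'er']
```

With the real GPT-2 vocabulary, the equivalent setup in `tokenizers` would load `BPE` from the published `vocab.json` and `merges.txt` and attach a byte-level pre-tokenizer, but the exact wiring is what the issue above is asking about.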

Added a Python script under `scripts/xor-codec/xor_codec.py` for encoding and decoding model weight deltas, with optional gzip compression.

Encode: `xor_codec.py output_dir/ model_dir/ llama_dir/ --encode --compress`

Decode: `xor_codec.py output_dir/ delta_dir/ llama_dir/ --compress`
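The underlying idea: XOR is its own inverse, so `model ^ base = delta` and `delta ^ base = model`. Distributing only the delta avoids redistributing the base weights. A minimal single-file sketch of that round trip, assuming equal-size files (true for weights of the same architecture); the function names and file handling here are illustrative, not the actual script's:

```python
import gzip
from pathlib import Path

def xor_bytes(a: bytes, b: bytes) -> bytes:
    # XOR is its own inverse: applying it twice with the same key
    # recovers the original. Inputs must be the same length.
    return bytes(x ^ y for x, y in zip(a, b))

def encode(out_path, model_path, base_path, compress=False):
    # delta = model ^ base; optionally gzip the (often sparse) delta.
    delta = xor_bytes(Path(model_path).read_bytes(),
                      Path(base_path).read_bytes())
    Path(out_path).write_bytes(gzip.compress(delta) if compress else delta)

def decode(out_path, delta_path, base_path, compress=False):
    # model = delta ^ base; decompress first if the delta was gzipped.
    delta = Path(delta_path).read_bytes()
    if compress:
        delta = gzip.decompress(delta)
    Path(out_path).write_bytes(xor_bytes(delta,
                                         Path(base_path).read_bytes()))
```

The real script operates per-file over a directory tree, but the per-file transform reduces to this XOR-then-compress round trip.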