Load, save isn't idempotent on bundles
Code
bundle = UnityPy.load("path/to/StreamingAssets/mybundle_assets_all.bundle")
bundle.save("lz4", ".")
Error No error is thrown.
In the tested case, the original bundle file (Unity version 2021.3.35f1) is 114141 bytes, whereas the saved file differs substantially and is 115811 bytes.
Bug Both files are expected to be identical, since nothing was modified between loading and saving.
Observations
When reading back the two bundles, the DirectoryInfoFS entries also differ (in size):
original: [DirectoryInfoFS(offset=0, size=385396, flags=4, path='CAB-74024cf513ad89d33b8720f05cb38d47')]
saved: [DirectoryInfoFS(offset=0, size=385928, flags=4, path='CAB-74024cf513ad89d33b8720f05cb38d47')]
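The start offset (byte_start) of every object differs as well. Among the remaining layout fields, only the byte_size of the first MonoBehaviour changes (99320 in the original, 99844 in the saved file); _read_until simply follows from byte_start + byte_size: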
Original:
[{'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -3359372211164733253, 'byte_start_offset': (7307, 8), 'byte_start': 7488, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7315, 4), 'byte_size': 99320, 'type_id': 1, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=1, script_id=<memory at 0x7e0a4ecdd6c0>, old_type_hash=<memory at 0x7e0a4ecddb40>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c241bd0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -758580119899371814, 'byte_start_offset': (7331, 8), 'byte_start': 7296, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7339, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7e0a4ecddfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c2405f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': 
<BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1, 'byte_start_offset': (7355, 8), 'byte_start': 106808, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7363, 4), 'byte_size': 316, 'type_id': 2, 'serialized_type': SerializedType(class_id=142, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7e0a4ecdd180>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c243930>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 142, 'type': <ClassIDType.AssetBundle: 142>, '_read_until': 107124}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1366330914152756664, 'byte_start_offset': (7379, 8), 'byte_start': 7392, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7387, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7e0a4ecddfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c2405f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1737477282785665211, 'byte_start_offset': (7403, 8), 'byte_start': 107128, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7411, 4), 'byte_size': 278268, 
'type_id': 3, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=0, script_id=<memory at 0x7e0a4ecdd0c0>, old_type_hash=<memory at 0x7e0a4ecdda80>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c24ddb0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}]
Saved:
[{'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -3359372211164733253, 'byte_start_offset': (7307, 8), 'byte_start': 7296, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7315, 4), 'byte_size': 99844, 'type_id': 1, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=1, script_id=<memory at 0x7228b67ad6c0>, old_type_hash=<memory at 0x7228b67adb40>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3df9bd0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -758580119899371814, 'byte_start_offset': (7331, 8), 'byte_start': 107144, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7339, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7228b67adfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3df85f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': 
<BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1, 'byte_start_offset': (7355, 8), 'byte_start': 107240, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7363, 4), 'byte_size': 316, 'type_id': 2, 'serialized_type': SerializedType(class_id=142, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7228b67ad180>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3dfb930>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 142, 'type': <ClassIDType.AssetBundle: 142>, '_read_until': 107556}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1366330914152756664, 'byte_start_offset': (7379, 8), 'byte_start': 107560, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7387, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7228b67adfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3df85f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1737477282785665211, 'byte_start_offset': (7403, 8), 'byte_start': 107656, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7411, 4), 'byte_size': 
278268, 'type_id': 3, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=0, script_id=<memory at 0x7228b67ad0c0>, old_type_hash=<memory at 0x7228b67ada80>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3e09db0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}]
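To make the two dumps easier to compare, here is a small sketch that looks only at the layout fields. The (path_id, byte_start, byte_size) tuples are copied by hand from the dumps above; the helper names are made up for illustration and are not part of UnityPy.

```python
# (path_id, byte_start, byte_size), copied from the two dumps above.
ORIGINAL = [
    (-3359372211164733253, 7488, 99320),
    (-758580119899371814, 7296, 96),
    (1, 106808, 316),
    (1366330914152756664, 7392, 96),
    (1737477282785665211, 107128, 278268),
]
SAVED = [
    (-3359372211164733253, 7296, 99844),
    (-758580119899371814, 107144, 96),
    (1, 107240, 316),
    (1366330914152756664, 107560, 96),
    (1737477282785665211, 107656, 278268),
]


def physical_order(objects):
    """Return path_ids sorted by where the object data starts in the file."""
    return [path_id for path_id, _, _ in sorted(objects, key=lambda o: o[1])]


def size_changes(original, saved):
    """Map path_id -> (old_size, new_size) for objects whose byte_size changed."""
    saved_sizes = {path_id: size for path_id, _, size in saved}
    return {
        path_id: (size, saved_sizes[path_id])
        for path_id, _, size in original
        if saved_sizes[path_id] != size
    }


if __name__ == "__main__":
    print("original order:", physical_order(ORIGINAL))
    print("saved order:   ", physical_order(SAVED))
    print("size changes:  ", size_changes(ORIGINAL, SAVED))
```

For this bundle, the original stores the two MonoScript objects first, while the saved file stores the object data in the same order as the object list. Only the first MonoBehaviour changes size, by 524 bytes.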
To Reproduce Python 3.12.3 UnityPy 1.22.2
Tests
Tested with the bundles in tests/samples: the first load-save changes the file, so the result never matches the original, but every further load-save of the already saved file reproduces it exactly (same size and MD5).
Test script:
import hashlib
import os
import shutil

import UnityPy


def get_file_size_and_md5(path):
    # Return (size in bytes, MD5 hex digest) of the file at `path`.
    with open(path, "rb") as f:
        data = f.read()
    return len(data), hashlib.md5(data).hexdigest()


def test_idempotent(original_path):
    try:
        print("\nTesting:", original_path)
        print("Original file: ", *get_file_size_and_md5(original_path))
        # Work on a copy so the sample itself is never overwritten.
        save_path = "./idempotent_test.bin"
        shutil.copy(original_path, save_path)
        for i in range(1, 4):
            env = UnityPy.load(save_path)
            # save() writes to <out_path>/<basename of the loaded file>,
            # i.e. back onto save_path.
            env.save("lz4", ".")
            print(f"After {i} load-save:", *get_file_size_and_md5(save_path))
    except Exception:
        # Samples that fail to load or save are skipped silently.
        pass


if __name__ == "__main__":
    for root, _, files in os.walk("tests/samples"):
        for file in files:
            test_idempotent(os.path.join(root, file))
Before the test, remove the line marked with - from save() in environment.py, so that files not marked as changed are written too:
     def save(self, pack="none", out_path="output"):
         """Saves all changed assets.
         Mark assets as changed using `.mark_changed()`.
         pack = "none" (default) or "lz4"
         """
         for fname, fitem in self.files.items():
-            if getattr(fitem, "is_changed", False):
                 with open(
                     self.fs.sep.join([out_path, ntpath.basename(fname)]), "wb"
                 ) as out:
                     out.write(fitem.save(packer=pack))
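If you would rather not edit the installed package, the same loop can be run from the test script instead. This is only a sketch: save_all is a hypothetical helper, and it assumes that env.files and fitem.save(packer=...) behave as in the snippet above. Skipping entries without a save method is also an assumption on my side.

```python
import ntpath
import os


def save_all(env, pack="none", out_path="output"):
    """Write every file in env.files, whether or not it is marked as changed.

    Assumes env.files maps names to objects exposing save(packer=...),
    as in the save() snippet above. Entries without a save method are
    skipped. Returns the list of paths written.
    """
    written = []
    for fname, fitem in env.files.items():
        if not hasattr(fitem, "save"):
            continue
        target = os.path.join(out_path, ntpath.basename(fname))
        with open(target, "wb") as out:
            out.write(fitem.save(packer=pack))
        written.append(target)
    return written
```

In the test script, env.save("lz4", ".") would then become save_all(env, "lz4", ".").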
Test result:
Testing: tests/samples\atlas_test
Original file: 76771 a6cdf9b81decf06d7fbde25c51a40ceb
After 1 load-save: 76774 d8dbf4661e8cfe42ae04a2ab3207ddc1
After 2 load-save: 76774 d8dbf4661e8cfe42ae04a2ab3207ddc1
After 3 load-save: 76774 d8dbf4661e8cfe42ae04a2ab3207ddc1
Testing: tests/samples\banner_1
Original file: 34683 cc4277240ee933fb53a2ffc3f8515ee5
After 1 load-save: 34682 a071be83ebdf2e8b72d758ddd0a6e52c
After 2 load-save: 34682 a071be83ebdf2e8b72d758ddd0a6e52c
After 3 load-save: 34682 a071be83ebdf2e8b72d758ddd0a6e52c
Testing: tests/samples\char_118_yuki.ab
Original file: 704951 009f989fcf97c62ab56f6316ba2cfe01
After 1 load-save: 704953 a9cd34dbe70336d6a4f4788b0a4e4649
After 2 load-save: 704953 a9cd34dbe70336d6a4f4788b0a4e4649
After 3 load-save: 704953 a9cd34dbe70336d6a4f4788b0a4e4649
Testing: tests/samples\xinzexi_2_n_tex
Original file: 634283 34e62f953c0536db319c4319a28b4175
After 1 load-save: 854904 9b9f2edd801a7e8028e2a6609ea976a0
After 2 load-save: 854904 9b9f2edd801a7e8028e2a6609ea976a0
After 3 load-save: 854904 9b9f2edd801a7e8028e2a6609ea976a0
In my opinion
UnityPy does not currently export files the same way the Unity Editor does, and it will not attempt to do so in the future either. Byte-identical export has little practical use, and the export logic differs between versions of the Unity Editor.
Minor differences in file size are acceptable. If significant changes in file size are observed, it may be due to inconsistencies in the compression algorithms used.
It's basically as @isHarryh said. Going a bit further, I'd even say that it's not possible to achieve the same results as Unity, simply because different libraries are used for compressing the data. Even if Unity's own libraries were used, the compression results would likely only be identical for the specific Unity version the libs come from, as different versions of a compression library can produce different results at identical settings.
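To illustrate with something runnable: the sketch below uses zlib from the standard library purely as a stand-in (the bundles here are saved with lz4, so this is an analogy, not what UnityPy does). The same payload compressed with two different settings gives different bytes on disk, yet decompresses to identical content. payload_digest is a made-up helper name.

```python
import hashlib
import zlib


def payload_digest(compressed):
    """MD5 of the decompressed payload, independent of how it was compressed."""
    return hashlib.md5(zlib.decompress(compressed)).hexdigest()


# Arbitrary repetitive payload, standing in for a bundle's data blocks.
data = b"UnityFS block payload " * 1000

fast = zlib.compress(data, 1)  # lowest compression level
best = zlib.compress(data, 9)  # highest compression level

# The compressed streams differ, the decompressed content does not.
print("same compressed bytes:", fast == best)
print("same payload digest:  ", payload_digest(fast) == payload_digest(best))
```

So when checking a repack, comparing the decompressed content (or the parsed objects) is more meaningful than comparing the size or MD5 of the bundle file itself.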
I was able to get a repacked 1:1 match by setting compression to 12
https://github.com/K0lb3/UnityPy/blob/a52f70d1020ba0e1b5325626070b433c31c8d22a/UnityPy/helpers/CompressionHelper.py#L105
> I was able to get a repacked 1:1 match by setting compression to 12
That's just a coincidence that happens to hold for your resource file. :)