
Load, save isn't idempotent on bundles

Open vinceh121 opened this issue 9 months ago • 4 comments

Code

bundle = UnityPy.load("path/to/StreamingAssets/mybundle_assets_all.bundle")
bundle.save("lz4", ".")

Error No error is thrown.

In the tested case, the original bundle file (Unity version 2021.3.35f1) is 114141 bytes. The saved file differs substantially in content and is 115811 bytes.

Bug It is expected that both files should remain the same, as no operation has been done on them.

Observations

When reading back the two bundles, the DirectoryInfoFS object is also different:

original: [DirectoryInfoFS(offset=0, size=385396, flags=4, path='CAB-74024cf513ad89d33b8720f05cb38d47')]
saved: [DirectoryInfoFS(offset=0, size=385928, flags=4, path='CAB-74024cf513ad89d33b8720f05cb38d47')]

The start offsets of each object are also different (in the dumps below, the first object's byte_size changes as well, from 99320 to 99844):

Original:
[{'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -3359372211164733253, 'byte_start_offset': (7307, 8), 'byte_start': 7488, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7315, 4), 'byte_size': 99320, 'type_id': 1, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=1, script_id=<memory at 0x7e0a4ecdd6c0>, old_type_hash=<memory at 0x7e0a4ecddb40>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c241bd0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -758580119899371814, 'byte_start_offset': (7331, 8), 'byte_start': 7296, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7339, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7e0a4ecddfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c2405f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': 
<BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1, 'byte_start_offset': (7355, 8), 'byte_start': 106808, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7363, 4), 'byte_size': 316, 'type_id': 2, 'serialized_type': SerializedType(class_id=142, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7e0a4ecdd180>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c243930>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 142, 'type': <ClassIDType.AssetBundle: 142>, '_read_until': 107124}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1366330914152756664, 'byte_start_offset': (7379, 8), 'byte_start': 7392, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7387, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7e0a4ecddfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c2405f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7e0a4c22c1d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1737477282785665211, 'byte_start_offset': (7403, 8), 'byte_start': 107128, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7411, 4), 'byte_size': 278268, 
'type_id': 3, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=0, script_id=<memory at 0x7e0a4ecdd0c0>, old_type_hash=<memory at 0x7e0a4ecdda80>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7e0a4c24ddb0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}]

Saved:
[{'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -3359372211164733253, 'byte_start_offset': (7307, 8), 'byte_start': 7296, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7315, 4), 'byte_size': 99844, 'type_id': 1, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=1, script_id=<memory at 0x7228b67ad6c0>, old_type_hash=<memory at 0x7228b67adb40>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3df9bd0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': -758580119899371814, 'byte_start_offset': (7331, 8), 'byte_start': 107144, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7339, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7228b67adfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3df85f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': 
<BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1, 'byte_start_offset': (7355, 8), 'byte_start': 107240, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7363, 4), 'byte_size': 316, 'type_id': 2, 'serialized_type': SerializedType(class_id=142, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7228b67ad180>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3dfb930>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 142, 'type': <ClassIDType.AssetBundle: 142>, '_read_until': 107556}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1366330914152756664, 'byte_start_offset': (7379, 8), 'byte_start': 107560, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7387, 4), 'byte_size': 96, 'type_id': 0, 'serialized_type': SerializedType(class_id=115, is_stripped_type=False, script_type_index=-1, script_id=None, old_type_hash=<memory at 0x7228b67adfc0>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3df85f0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 115, 'type': <ClassIDType.MonoScript: 115>}, {'assets_file': <SerializedFile>, 'reader': <UnityPy.streams.EndianBinaryReader.EndianBinaryReader_Memoryview_LittleEndian object at 0x7228b3de41d0>, 'data': b'', 'version': (2021, 3, 35, 1), 'version2': 22, 'platform': <BuildTarget.StandaloneLinux64: 24>, 'build_type': BuildType(build_type='f'), 'path_id': 1737477282785665211, 'byte_start_offset': (7403, 8), 'byte_start': 107656, 'byte_header_offset': 7296, 'byte_base_offset': 175, 'byte_size_offset': (7411, 4), 'byte_size': 
278268, 'type_id': 3, 'serialized_type': SerializedType(class_id=114, is_stripped_type=False, script_type_index=0, script_id=<memory at 0x7228b67ad0c0>, old_type_hash=<memory at 0x7228b67ada80>, node=<UnityPy.helpers.TypeTreeNode.TypeTreeNode object at 0x7228b3e09db0>, m_ClassName=None, m_NameSpace=None, m_AssemblyName=None, type_dependencies=()), 'class_id': 114, 'type': <ClassIDType.MonoBehaviour: 114>}]
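The two dumps are hard to compare by eye. Below is a minimal sketch of a field-by-field diff keyed by path_id, using only the byte_start and byte_size values copied from the dumps above. The helper name diff_objects is hypothetical and not part of UnityPy.

```python
# Values copied from the two dumps above: path_id -> (byte_start, byte_size).
ORIGINAL = {
    -3359372211164733253: (7488, 99320),
    -758580119899371814: (7296, 96),
    1: (106808, 316),
    1366330914152756664: (7392, 96),
    1737477282785665211: (107128, 278268),
}
SAVED = {
    -3359372211164733253: (7296, 99844),
    -758580119899371814: (107144, 96),
    1: (107240, 316),
    1366330914152756664: (107560, 96),
    1737477282785665211: (107656, 278268),
}
FIELDS = ("byte_start", "byte_size")


def diff_objects(original, saved):
    """Return {path_id: {field: (original_value, saved_value)}} for fields that differ."""
    changes = {}
    for path_id, before in original.items():
        after = saved[path_id]
        changed = {
            name: (b, a)
            for name, b, a in zip(FIELDS, before, after)
            if b != a
        }
        if changed:
            changes[path_id] = changed
    return changes


if __name__ == "__main__":
    for path_id, changed in diff_objects(ORIGINAL, SAVED).items():
        print(path_id, changed)
```

On these values every object reports a different byte_start, and the first MonoBehaviour also reports a different byte_size (99320 versus 99844).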

To Reproduce Python 3.12.3, UnityPy 1.22.2

vinceh121 avatar Apr 18 '25 12:04 vinceh121

Tests

Testing with the bundles in tests/samples confirms that the load-save operation is not idempotent between the original file and the first saved file, but it is idempotent between the saved file and each later re-save.

Test script:

import UnityPy
import os, shutil, hashlib

def get_file_size_and_md5(path):
    with open(path, "rb") as f:
        data = f.read()
        return len(data), hashlib.md5(data).hexdigest()

def test_idempotent(original_path):
    try:
        print("\nTesting:", original_path)
        print("Original file:    ", *get_file_size_and_md5(original_path))
        save_path = "./idempotent_test.bin"
        shutil.copy(original_path, save_path)
        for i in range(1, 4):
            env = UnityPy.load(save_path)
            env.save("lz4", ".")
            print(f"After {i} load-save:", *get_file_size_and_md5(save_path))
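    except Exception as exc:
        # Report files that cannot be loaded or saved instead of hiding the failure.
        print("Skipped:", original_path, repr(exc))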

if __name__ == "__main__":
    for root, _, files in os.walk("tests/samples"):
        for file in files:
            test_idempotent(os.path.join(root, file))

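Before running the test, remove the is_changed check (the line marked with "-") in environment.py and dedent the block below it, so that every loaded file is written out: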

    def save(self, pack="none", out_path="output"):
        """Saves all changed assets.
        Mark assets as changed using `.mark_changed()`.
        pack = "none" (default) or "lz4"
        """
        for fname, fitem in self.files.items():
-            if getattr(fitem, "is_changed", False):
                with open(
                    self.fs.sep.join([out_path, ntpath.basename(fname)]), "wb"
                ) as out:
                    out.write(fitem.save(packer=pack))
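An alternative to editing the installed library is to run the same loop from the test script. This is a minimal sketch that assumes only what the quoted method shows: that env.files maps names to file objects, and that each one exposes save(packer=...). The function name force_save is hypothetical, and skipping entries without a save method is a precaution rather than documented behaviour.

```python
import ntpath
import os


def force_save(env, pack="lz4", out_path="."):
    """Write every loaded file, whether or not it is marked as changed.

    Mirrors the loop in the quoted Environment.save, minus the is_changed check.
    Returns the list of paths written.
    """
    written = []
    for fname, fitem in env.files.items():
        save = getattr(fitem, "save", None)
        if not callable(save):
            continue  # assumption: entries without save() cannot be repacked
        target = os.path.join(out_path, ntpath.basename(fname))
        with open(target, "wb") as out:
            out.write(save(packer=pack))
        written.append(target)
    return written
```

In the test script, env.save("lz4", ".") would then become force_save(env, "lz4", ".").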

Test result:

Testing: tests/samples\atlas_test
Original file:     76771 a6cdf9b81decf06d7fbde25c51a40ceb
After 1 load-save: 76774 d8dbf4661e8cfe42ae04a2ab3207ddc1
After 2 load-save: 76774 d8dbf4661e8cfe42ae04a2ab3207ddc1
After 3 load-save: 76774 d8dbf4661e8cfe42ae04a2ab3207ddc1

Testing: tests/samples\banner_1
Original file:     34683 cc4277240ee933fb53a2ffc3f8515ee5
After 1 load-save: 34682 a071be83ebdf2e8b72d758ddd0a6e52c
After 2 load-save: 34682 a071be83ebdf2e8b72d758ddd0a6e52c
After 3 load-save: 34682 a071be83ebdf2e8b72d758ddd0a6e52c

Testing: tests/samples\char_118_yuki.ab
Original file:     704951 009f989fcf97c62ab56f6316ba2cfe01
After 1 load-save: 704953 a9cd34dbe70336d6a4f4788b0a4e4649
After 2 load-save: 704953 a9cd34dbe70336d6a4f4788b0a4e4649
After 3 load-save: 704953 a9cd34dbe70336d6a4f4788b0a4e4649

Testing: tests/samples\xinzexi_2_n_tex
Original file:     634283 34e62f953c0536db319c4319a28b4175
After 1 load-save: 854904 9b9f2edd801a7e8028e2a6609ea976a0
After 2 load-save: 854904 9b9f2edd801a7e8028e2a6609ea976a0
After 3 load-save: 854904 9b9f2edd801a7e8028e2a6609ea976a0
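To put the four results side by side, the size change per sample can be computed from the numbers printed above. This is plain arithmetic on the reported sizes, with no UnityPy calls. The function name size_change is only illustrative.

```python
# (original size, size after load-save) in bytes, copied from the test output above.
RESULTS = {
    "atlas_test": (76771, 76774),
    "banner_1": (34683, 34682),
    "char_118_yuki.ab": (704951, 704953),
    "xinzexi_2_n_tex": (634283, 854904),
}


def size_change(original, saved):
    """Return (delta in bytes, delta as a percentage of the original size)."""
    delta = saved - original
    return delta, 100.0 * delta / original


if __name__ == "__main__":
    for name, (original, saved) in RESULTS.items():
        delta, percent = size_change(original, saved)
        print(f"{name:18} {delta:+8d} bytes ({percent:+.2f}%)")
```

Three samples change by 3 bytes or fewer, while xinzexi_2_n_tex grows by 220621 bytes, roughly 35% of its original size.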

In my opinion

UnityPy does not currently export bundles the same way the Unity Editor does, and it will not attempt to do so in the future. Matching the Editor's output byte for byte has little practical value, and the export logic itself varies between Unity Editor versions.

Minor differences in file size are acceptable. If the file size changes significantly, this may be due to differences in the compression algorithms used.

isHarryh avatar Apr 18 '25 14:04 isHarryh

It's basically as @isHarryh said. Going a bit further, I'd even say that it's not possible to achieve the same results as Unity, simply because different libraries are used for compressing the data. Even if Unity's own libraries were used, the compression results would likely only be identical for the specific Unity version the libs come from, as different versions of a compression library can produce different results at identical settings.
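The point about compressors can be illustrated with the standard library alone. The sketch below uses zlib purely as a stand-in (the bundles in this issue are packed with LZ4, which is not in the standard library), and the payload is made up for illustration. The same input compressed at two settings yields different bytes, yet both streams decompress to the identical payload.

```python
import zlib

# Made-up payload standing in for the uncompressed contents of a bundle.
payload = b"UnityFS block data " * 1000

fast = zlib.compress(payload, 1)  # fastest setting
best = zlib.compress(payload, 9)  # strongest setting

# The compressed streams differ, at minimum in the header's compression-level bits...
print(len(fast), len(best), fast == best)

# ...but both decode to exactly the same bytes.
assert zlib.decompress(fast) == payload
assert zlib.decompress(best) == payload
```

A byte-for-byte match of the compressed file is therefore a stricter condition than a match of the decompressed content.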

K0lb3 avatar Apr 18 '25 22:04 K0lb3

I was able to get a repacked 1:1 match by setting compression to 12

https://github.com/K0lb3/UnityPy/blob/a52f70d1020ba0e1b5325626070b433c31c8d22a/UnityPy/helpers/CompressionHelper.py#L105

0o120 avatar Sep 09 '25 06:09 0o120

> I was able to get a repacked 1:1 match by setting compression to 12

That's just a coincidence, regarding your resource file. :)

isHarryh avatar Sep 09 '25 09:09 isHarryh