Fix MP4 parsing for very large files with extended atom sizes

Open aw-was-here opened this issue 5 months ago • 1 comment

This patch fixes tinytag's inability to parse very large MP4 files by properly handling extended atom size formats defined in the ISO base media file format specification.

Changes:

  • Handle atom size = 1 (extended size): Read next 8 bytes as 64-bit size
  • Handle atom size = 0 (extends to EOF): Skip these atoms appropriately
  • Fix atom_size accounting for VERSIONED_ATOMS and FLAGGED_ATOMS by subtracting the 4-byte version/flags offset from the remaining data size
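The first two cases above can be sketched roughly as follows. This is a minimal, standalone illustration of the size rules in the ISO base media file format, not the code from this patch: `read_atom_header` and `walk_atoms` are hypothetical names, and the 4-byte version/flags adjustment for versioned atoms is not shown.

```python
import io
import struct


def read_atom_header(fh, file_size):
    """Return (atom_type, payload_size), or None when no full header is left.

    Hypothetical helper for illustration; not part of tinytag's API.
    """
    start = fh.tell()
    header = fh.read(8)
    if len(header) < 8:
        return None
    # Standard header: 32-bit big-endian size followed by a 4-byte type.
    size, atom_type = struct.unpack('>I4s', header)
    header_size = 8
    if size == 1:
        # Extended size: the real size is a 64-bit big-endian integer
        # stored in the 8 bytes that follow the type.
        ext = fh.read(8)
        if len(ext) < 8:
            return None
        size = struct.unpack('>Q', ext)[0]
        header_size = 16
    elif size == 0:
        # Size 0 means the atom runs to the end of the file.
        size = file_size - start
    if size < header_size:
        raise ValueError(f'invalid atom size {size} at offset {start}')
    return atom_type, size - header_size


def walk_atoms(fh, file_size):
    """List (type, payload_size) for each top-level atom, skipping payloads."""
    atoms = []
    while True:
        parsed = read_atom_header(fh, file_size)
        if parsed is None:
            break
        atom_type, payload_size = parsed
        atoms.append((atom_type, payload_size))
        fh.seek(payload_size, io.SEEK_CUR)
    return atoms


# Synthetic data: a normal atom, an extended-size atom, and a size-0 atom.
sample = (
    struct.pack('>I4s', 16, b'ftyp') + b'\x00' * 8
    + struct.pack('>I4sQ', 1, b'mdat', 20) + b'abcd'
    + struct.pack('>I4s', 0, b'free') + b'xyz'
)
print(walk_atoms(io.BytesIO(sample), len(sample)))
# [(b'ftyp', 8), (b'mdat', 4), (b'free', 3)]
```

Without the `size == 1` branch, a parser would treat the 8 bytes of the extended size as the start of the payload and lose alignment with the atom boundaries, which is consistent with the garbage atom names in the pre-patch log below.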

fixes #259

aw-was-here avatar Oct 05 '25 01:10 aw-was-here

Prior to this patch:

(wnp) aw-mbp-m1:tinytag aw$ ls -l patch/batman.mp4 
-rwxr--r--@ 1 aw  staff  499475085 Oct  4 17:31 patch/batman.mp4
(wnp) aw-mbp-m1:tinytag aw$ TINYTAG_DEBUG=1 python3 tinytag/__main__.py patch/batman.mp4 
     pos: 0 atom: b'ftyp' len: 24
     pos: 24 atom: b'free' len: 241808
     pos: 241848 atom: b'\x06\x00\t\x80' len: 29
     pos: 241877 atom: b'\x00\x00\xd7\xe9' len: 50333824
     pos: 50575701 atom: b'\xd9\xd9\xc3\x8f' len: 2392434868
     pos: 0 atom: b'ftyp' len: 24
     pos: 24 atom: b'free' len: 241808
     pos: 241848 atom: b'\x06\x00\t\x80' len: 29
     pos: 241877 atom: b'\x00\x00\xd7\xe9' len: 50333824
     pos: 50575701 atom: b'\xd9\xd9\xc3\x8f' len: 2392434868
{
  "filename": "patch/batman.mp4",
  "filesize": 499475085
}

After this patch:

(wnp) aw-mbp-m1:tinytag aw$ TINYTAG_DEBUG=1 python3 tinytag/__main__.py patch/batman.mp4 
     pos: 0 atom: b'ftyp' len: 24
     pos: 24 atom: b'free' len: 241808
     pos: 241840 atom: b'mdat' len: 498855323
     pos: 499097163 atom: b'moov' len: 377922
         pos: 499097171 atom: b'mvhd' len: 108
         pos: 499097279 atom: b'trak' len: 45267
         pos: 499142546 atom: b'trak' len: 75521
         pos: 499218067 atom: b'udta' len: 257018
             pos: 499218075 atom: b'meta' len: 257010
                 pos: 499218087 atom: b'ilst' len: 256998
                     pos: 499218095 atom: b'covr' len: 256091
                         pos: 499218103 atom: b'data' len: 256083
                         FIELD:  images.front_cover
....

aw-was-here avatar Oct 05 '25 02:10 aw-was-here

Superseded by #272

mathiascode avatar Dec 15 '25 01:12 mathiascode