php-serialize

Unable to unserialize this string

Open antifarben opened this issue 13 years ago • 4 comments

The following error is thrown:

TypeError: Unable to unserialize type ';'
php-serialize-1.1.0/lib/php_serialize.rb:308:in `do_unserialize'
a:9:{s:7:"xxxxxxx";s:12:"xxxxxxxxxäxx";s:11:"description";s:12:"xxxxxxxxxäxx";s:8:"xxxxxxxx";s:7:"xxxxxxx";s:7:"xxxxxxx";s:0:"";s:8:"xxxxxxxx";s:0:"";s:6:"xxxxxx";a:1:{i:1;s:15:"xxxxxxxxxxxxxxx";}s:7:"xxxxxxx";a:1:{i:0;N;}s:4:"xxxx";a:9:{i:0;s:7:"xxxxxxx";i:1;s:8:"xxxxxxxx";i:2;s:6:"xxxxxx";i:3;s:8:"xxxxxxxx";i:4;s:6:"xxxxxx";i:5;s:10:"xxxxxxxxxx";i:6;s:4:"xxxx";i:7;s:5:"xxxxx";i:8;s:6:"xxxxxx";}s:6:"xxxxxx";a:25:{s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:7:"xxxxxxx";a:4:{s:7:"xxxxxxx";s:11:"Üxxxxxxxxxx";s:6:"xxxxxx";s:4:"xxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:10:"subxxxxxxx";a:4:{s:7:"xxxxxxx";s:11:"xxxxxxxxxxx";s:6:"xxxxxx";s:4:"xxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:7:"xxxxxxx";a:4:{s:7:"xxxxxxx";s:4:"xxxx";s:6:"xxxxxx";s:8:"xxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:13:"xxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:15:"xxxxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:17:"xxxxxxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:10:"xxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:13:"xxxxxxxx-xxxx";s:6:"xxxxxx";s:17:"xxxxxxxxxxxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:11:"xxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:13:"xxxxxxxx-xxxx";s:6:"xxxxxx";
s:8:"xxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:10:"xxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:15:"xxxxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:16:"xxxx-xxxxxxxxxxx";s:6:"xxxxxx";s:4:"xxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:4:"xxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:11:"xxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:12:"xxxxxx-xxxxx";s:6:"xxxxxx";s:8:"xxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:9:"xxxxxxxxx";a:4:{s:7:"xxxxxxx";s:12:"xxxxxx-xxxxx";s:6:"xxxxxx";s:17:"xxxxxxxxxxxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:7:"xxxxxxx";a:4:{s:7:"xxxxxxx";s:14:"xxxxxxx-xxxxxx";s:6:"xxxxxx";s:6:"xxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:5:"xxxxx";a:4:{s:7:"xxxxxxx";s:5:"xxxxx";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}}}

antifarben avatar Jan 26 '13 12:01 antifarben

Ah - for other people having issues with php-serialize. The gem "php-serialization" / "ruby-php-serialization" (https://rubygems.org/gems/php-serialization) has no problem unserializing this.

antifarben avatar Jan 26 '13 19:01 antifarben

@antifarben I tried this with the "php-serialization" gem, but it still fails, at least with the current version.

The only one that seems to work is this: https://rubygems.org/gems/k-php-serialize

panaak avatar Sep 03 '18 13:09 panaak

@panaak okay. I had this issue more than five years ago and it isn't relevant for me anymore. Your hint might be relevant for others though.

antifarben avatar Sep 03 '18 14:09 antifarben

@antifarben for what it's worth: I got the same error and tried the other gems to no avail. However, .force_encoding(Encoding::ISO_8859_1) did the trick in my case.

stefankreitmayer avatar Mar 29 '20 19:03 stefankreitmayer

We use this gem every day. It would be nice for the maintainer to resolve both UTF-8 issues posted so we don't have to patch.

ddarbyson avatar Sep 04 '23 17:09 ddarbyson

The serialization you provided seems to be malformed in some critical ways; even PHP won't parse it as presented:

  1. There are duplicate entries for keys in some of the associative arrays. Specifically the a:25 array contains the key s:7:"xxxxxxx"; repeated 3 times by my count. Don't do that.

  2. As @stefankreitmayer noted, the string is very likely ISO-8859-1 encoded: the declared string sizes cannot be UTF-8 byte counts when the values include the ä and Ü characters. Each of these encodes as 2 bytes in UTF-8, but as a single byte in ISO-8859-1, which matches your serialization.
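
The byte counts in point 2 can be checked with a minimal Ruby sketch. The variable names are illustrative only, and the two values from the report are written with \u escapes so the snippet does not depend on the file's encoding:

```ruby
# "\u00E4" is ä and "\u00DC" is Ü.
a_value = "xxxxxxxxx\u00E4xx"   # declared as s:12 in the report
u_value = "\u00DCxxxxxxxxxx"    # declared as s:11 in the report

# For each value, compare its size in bytes as UTF-8 and as ISO-8859-1.
sizes = [a_value, u_value].map do |value|
  [value.bytesize, value.encode(Encoding::ISO_8859_1).bytesize]
end

p sizes
# => [[13, 12], [12, 11]]
```

Only the ISO-8859-1 sizes (12 and 11) agree with the lengths declared in the reported string.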

Here are the strings copied from your example and run through php's serialize:

echo(serialize("xxxxxxxxxäxx"));
# s:13:"xxxxxxxxxäxx";

echo serialize("Üxxxxxxxxxx");
# s:12:"Üxxxxxxxxxx";

Both report a length one byte longer when serialized by PHP, which suggests that the values in the post are now actually UTF-8, probably due to automatic transcoding when posted to the web.

a_umlaut_serialized = 's:12:"xxxxxxxxxäxx";'  # Copied from example
u_umlaut_serialized = 's:11:"Üxxxxxxxxxx";'   # Copied from example
assoc_serialized = 'a:1:{' << a_umlaut_serialized << u_umlaut_serialized << '}'

assoc_serialized.encoding.name
# => "UTF-8"

PHP.unserialize(assoc_serialized)
# => lib/php_serialize.rb:270:in `do_unserialize': Unable to unserialize type ';' (TypeError)

This reproduces the reported issue. We can then address it by recreating the characters as they were originally encoded, without mangling the encoding:

a_umlaut = 228.chr  # ä in ISO-8859-1 (byte 0xE4)
u_umlaut = 220.chr  # Ü in ISO-8859-1 (byte 0xDC)

a_umlaut_serialized = "s:12:\"xxxxxxxxx#{a_umlaut}xx\";".force_encoding("ISO-8859-1")
u_umlaut_serialized = "s:11:\"#{u_umlaut}xxxxxxxxxx\";".force_encoding("ISO-8859-1")
assoc_serialized = 'a:1:{' << a_umlaut_serialized << u_umlaut_serialized << '}'

assoc_serialized.encoding.name
# => "ISO-8859-1"

PHP.unserialize(assoc_serialized)
# => {"xxxxxxxxx\xE4xx"=>"\xDCxxxxxxxxxx"}

PHP.unserialize(assoc_serialized).map { |k, v| [k.encode("UTF-8"), v.encode("UTF-8")] }.to_h
# => {"xxxxxxxxxäxx"=>"Üxxxxxxxxxx"}

So I think you have ISO-8859-1 data that is being transcoded into UTF-8, and the transcoding makes the declared byte lengths too small for the strings that contain non-ASCII characters.
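
As a rough illustration of why a length that is off by one byte surfaces as "Unable to unserialize type ';'", here is a minimal sketch of a byte-counting reader. `read_string_token` is a hypothetical helper written for this comment, not the gem's code; it only assumes that the declared length is treated as a byte count:

```ruby
require "stringio"

# Hypothetical helper: consume one `s:N:"data";` token, trusting N as a
# byte count, and return the consumed token plus any unread bytes.
def read_string_token(serialized)
  io = StringIO.new(serialized.b)  # binary copy, so every read counts bytes
  io.read(2)                       # skip the "s:" type tag
  len = io.gets(":").to_i          # declared length, e.g. "12:" -> 12
  token = io.read(len + 3)         # opening quote + data + closing quote + ";"
  [token, io.read]                 # io.read returns "" when nothing is left
end

latin1 = "s:12:\"xxxxxxxxx\xE4xx\";".b    # ä as the single byte 0xE4
utf8   = "s:12:\"xxxxxxxxx\u00E4xx\";"   # ä as the two bytes 0xC3 0xA4

_latin1_token, latin1_rest = read_string_token(latin1)
_utf8_token, utf8_rest = read_string_token(utf8)

p latin1_rest
# => ""
p utf8_rest
# => ";"
```

With the UTF-8 copy the reader stops one byte early, so the next thing a parser sees where it expects a type tag is the leftover ";".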

To address this, you need to keep the original encoding of the data until you call PHP.unserialize; afterwards you can transcode the result to UTF-8 for typical modern usage.
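
Real payloads are nested, so the transcoding step has to walk the whole structure. A minimal sketch, assuming every string in the result really holds ISO-8859-1 bytes; `deep_to_utf8` is a hypothetical helper, and the literal hash below only stands in for the return value of PHP.unserialize:

```ruby
# Hypothetical helper: recursively transcode every String in a nested
# Hash/Array structure from a source encoding (assumed ISO-8859-1) to UTF-8.
def deep_to_utf8(value, from: Encoding::ISO_8859_1)
  case value
  when String
    # Label the raw bytes with the source encoding, then transcode a copy.
    value.dup.force_encoding(from).encode(Encoding::UTF_8)
  when Hash
    value.map { |k, v| [deep_to_utf8(k, from: from), deep_to_utf8(v, from: from)] }.to_h
  when Array
    value.map { |item| deep_to_utf8(item, from: from) }
  else
    value # integers, nil, booleans, etc. pass through unchanged
  end
end

# Stand-in for unserialized ISO-8859-1 data (0xE4 is ä, 0xDC is Ü).
unserialized = {
  "xxxxxxxxx\xE4xx".b => "\xDCxxxxxxxxxx".b,
  "xxxx" => [1, nil, "xxxx"]
}

utf8 = deep_to_utf8(unserialized)
```

The helper relabels each string before transcoding, so it would double-encode strings that are already UTF-8; it is only appropriate when the input is known to be ISO-8859-1 throughout.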

jqr avatar Mar 02 '24 04:03 jqr

BTW I did notice that unserialize modifies the encoding of its argument, which is #20 and should be fixed shortly.

jqr avatar Mar 02 '24 04:03 jqr