Unable to unserialize this string
The following error is thrown:
TypeError: Unable to unserialize type ';'
php-serialize-1.1.0/lib/php_serialize.rb:308:in `do_unserialize'
a:9:{s:7:"xxxxxxx";s:12:"xxxxxxxxxäxx";s:11:"description";s:12:"xxxxxxxxxäxx";s:8:"xxxxxxxx";s:7:"xxxxxxx";s:7:"xxxxxxx";s:0:"";s:8:"xxxxxxxx";s:0:"";s:6:"xxxxxx";a:1:{i:1;s:15:"xxxxxxxxxxxxxxx";}s:7:"xxxxxxx";a:1:{i:0;N;}s:4:"xxxx";a:9:{i:0;s:7:"xxxxxxx";i:1;s:8:"xxxxxxxx";i:2;s:6:"xxxxxx";i:3;s:8:"xxxxxxxx";i:4;s:6:"xxxxxx";i:5;s:10:"xxxxxxxxxx";i:6;s:4:"xxxx";i:7;s:5:"xxxxx";i:8;s:6:"xxxxxx";}s:6:"xxxxxx";a:25:{s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:7:"xxxxxxx";a:4:{s:7:"xxxxxxx";s:11:"Üxxxxxxxxxx";s:6:"xxxxxx";s:4:"xxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:10:"subxxxxxxx";a:4:{s:7:"xxxxxxx";s:11:"xxxxxxxxxxx";s:6:"xxxxxx";s:4:"xxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:7:"xxxxxxx";a:4:{s:7:"xxxxxxx";s:4:"xxxx";s:6:"xxxxxx";s:8:"xxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:13:"xxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:15:"xxxxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:17:"xxxxxxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:10:"xxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:13:"xxxxxxxx-xxxx";s:6:"xxxxxx";s:17:"xxxxxxxxxxxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:11:"xxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:13:"xxxxxxxx-xxxx";s:6:"xxxxxx";
s:8:"xxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:10:"xxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:15:"xxxxxxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:16:"xxxx-xxxxxxxxxxx";s:6:"xxxxxx";s:4:"xxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:4:"xxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:8:"xxxxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:11:"xxxxxxxxxxx";a:4:{s:7:"xxxxxxx";s:12:"xxxxxx-xxxxx";s:6:"xxxxxx";s:8:"xxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:9:"xxxxxxxxx";a:4:{s:7:"xxxxxxx";s:12:"xxxxxx-xxxxx";s:6:"xxxxxx";s:17:"xxxxxxxxxxxxxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:7:"xxxxxxx";a:4:{s:7:"xxxxxxx";s:14:"xxxxxxx-xxxxxx";s:6:"xxxxxx";s:6:"xxxxxx";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:5:"xxxxx";a:4:{s:7:"xxxxxxx";s:5:"xxxxx";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}s:6:"xxxxxx";a:4:{s:7:"xxxxxxx";s:0:"";s:6:"xxxxxx";s:0:"";s:5:"xxxxx";s:0:"";s:6:"xxxxxx";s:0:"";}}}
Ah - for other people having issues with php-serialize. The gem "php-serialization" / "ruby-php-serialization" (https://rubygems.org/gems/php-serialization) has no problem unserializing this.
@antifarben I tried this with the "php-serialization" gem, but it still fails, at least with the current version.
The only one which seems to work is this: https://rubygems.org/gems/k-php-serialize
@panaak okay. I had this issue more than five years ago and it isn't relevant for me anymore. Your hint might be relevant for others though.
@antifarben for what it's worth: I got the same error and tried the other gems to no avail.
However, .force_encoding(Encoding::ISO_8859_1) did the trick in my case.
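For anyone wondering why relabeling helps: a minimal sketch, assuming the bytes are already ISO-8859-1 and only carry the wrong encoding label. The sample value is taken from the report; the variable names are made up for illustration.

```ruby
# ISO-8859-1 bytes (0xE4 is "ä") that were wrongly labeled as UTF-8.
mislabeled = "xxxxxxxxx\xE4xx".b.force_encoding(Encoding::UTF_8)
puts mislabeled.valid_encoding?   # a lone 0xE4 byte is not valid UTF-8

# force_encoding only changes the label; the bytes stay untouched,
# so the byte count still matches the s:12: prefix.
relabeled = mislabeled.dup.force_encoding(Encoding::ISO_8859_1)
puts relabeled.valid_encoding?
puts relabeled.bytesize

# encode actually converts the bytes; "ä" becomes two bytes in UTF-8.
as_utf8 = relabeled.encode(Encoding::UTF_8)
puts as_utf8.bytesize
```

Note that force_encoding never rewrites bytes, so it only helps when the data itself was not transcoded along the way.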
We use this gem every day. It would be nice if the maintainer resolved both UTF-8 issues posted so we don't have to patch.
The serialization you provided seems to be malformed in some critical ways; even PHP won't parse it as presented:
- There are duplicate entries for keys in some of the associative arrays. Specifically, the a:25 array contains the key s:7:"xxxxxxx"; repeated 3 times by my count. Don't do that.
- As @stefankreitmayer noted, the string is very likely ISO-8859-1 encoded, because the strings show sizes which are clearly not UTF-8 when values include the ä and Ü characters. Both of these encode as 2 bytes in UTF-8; in ISO-8859-1 they are a single byte, which seems to match your serialization.
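The size difference can be checked directly in Ruby. This is a minimal sketch using the two sample values from the report; the variable names are only for illustration.

```ruby
# The two values from the report, written with \u escapes so the literals
# are UTF-8 regardless of how this file is saved.
samples = ["xxxxxxxxx\u00E4xx", "\u00DCxxxxxxxxxx"]

# Byte length of each value in both encodings.
byte_sizes = samples.map do |text|
  {
    text: text,
    utf8_bytes: text.bytesize,
    latin1_bytes: text.encode("ISO-8859-1").bytesize
  }
end

byte_sizes.each do |row|
  puts "#{row[:text]}: UTF-8=#{row[:utf8_bytes]} bytes, ISO-8859-1=#{row[:latin1_bytes]} bytes"
end
```

The ISO-8859-1 sizes (12 and 11) are the ones that appear in the s:N: prefixes of the posted data.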
Here are the strings copied from your example and run through php's serialize:
echo(serialize("xxxxxxxxxäxx"));
# s:13:"xxxxxxxxxäxx";
echo serialize("Üxxxxxxxxxx");
# s:12:"Üxxxxxxxxxx";
Both report a length one byte greater when serialized by PHP, suggesting that the values are now actually UTF-8, probably due to automatic transcoding when posted to the web.
require 'php_serialize'

a_umlat_serialized = 's:12:"xxxxxxxxxäxx";' # Copied from example
u_umlat_serialized = 's:11:"Üxxxxxxxxxx";' # Copied from example
assoc_serialized = 'a:1:{' << a_umlat_serialized << u_umlat_serialized << '}'
assoc_serialized.encoding.name
# => "UTF-8"
PHP.unserialize(assoc_serialized)
# => lib/php_serialize.rb:270:in `do_unserialize': Unable to unserialize type ';' (TypeError)
This reproduces the reported issue. We can then address it by recreating the characters as they were originally encoded, rather than mangling the encoding.
a_umlat = 228.chr # ä in ISO-8859-1
u_umlat = 220.chr # Ü in ISO-8859-1
a_umlat_serialized = "s:12:\"xxxxxxxxx#{a_umlat}xx\";".force_encoding("ISO-8859-1")
u_umlat_serialized = "s:11:\"#{u_umlat}xxxxxxxxxx\";".force_encoding("ISO-8859-1")
assoc_serialized = 'a:1:{' << a_umlat_serialized << u_umlat_serialized << '}'
assoc_serialized.encoding.name
# => "ISO-8859-1"
PHP.unserialize(assoc_serialized)
# => {"xxxxxxxxx\xE4xx"=>"\xDCxxxxxxxxxx"}
PHP.unserialize(assoc_serialized).map { |k, v| [k.encode("UTF-8"), v.encode("UTF-8")] }.to_h
# => {"xxxxxxxxxäxx"=>"Üxxxxxxxxxx"}
So I think you have ISO-8859-1 data which is being transcoded into UTF-8, and that transcoding causes the declared string sizes to be off.
To address this, you need to keep the encoding of the original data until you call PHP.unserialize; afterwards you can transcode to UTF-8 for typical modern usage.
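If the data has already been transcoded, converting it back before parsing restores the sizes. This is a rough sketch that does not call the gem; declared_and_actual_bytes is a hypothetical helper written only for this illustration, and it assumes every character fits into ISO-8859-1 (otherwise encode raises Encoding::UndefinedConversionError).

```ruby
# Hypothetical helper (not part of any gem): returns the length declared in
# the s:N: prefix and the actual byte length of the quoted payload.
def declared_and_actual_bytes(serialized)
  match = serialized.match(/\As:(\d+):"(.*)";\z/m)
  [match[1].to_i, match[2].bytesize]
end

# What arrives after an unwanted ISO-8859-1 -> UTF-8 transcoding.
transcoded = "s:12:\"xxxxxxxxx\u00E4xx\";"
mismatch = declared_and_actual_bytes(transcoded)

# Convert back to the original encoding before unserializing.
original = transcoded.encode("ISO-8859-1")
match_again = declared_and_actual_bytes(original)

# Transcode to UTF-8 only after parsing (simulated here on the raw payload).
payload_utf8 = original[/"(.*)"/m, 1].encode("UTF-8")

puts mismatch.inspect
puts match_again.inspect
puts payload_utf8
```

In a real application the safer route is to never transcode in the first place, for example by reading the stored value as binary or as ISO-8859-1 and passing it to PHP.unserialize unchanged.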
BTW I did notice unserialize is modifying the encoding of its argument, which is #20 and should be fixed shortly.