java-cookie Decoding sometimes fails when two encoded characters appear one after another.

The Cookies.decode( String encoded ) method sometimes fails to properly decode the given string when two encoded characters appear one after another.

For example, decoding fails for the string New%20York%2C%20NY, which outputs New York%2C NY instead of New York, NY.

The reason is that the decoding regex (%[0-9A-Z]{2})+ produces two matches: %20 and %2C%20. The method then replaces these matches in turn, first searching for %20 and replacing every occurrence with a space. It then looks for %2C%20 to replace it with a comma and a space, but since all instances of %20 have already been replaced by spaces, the match never occurs, and %2C is simply left undecoded.
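
To make the failure easier to see, here is a minimal sketch of the behavior described above, plus one possible fix. It is not the library's actual code: the class DecodeSketch and the methods decodeBuggy, decodeInPlace, and percentDecode are hypothetical names, and decodeBuggy is only a reconstruction from the description in this issue. The fix decodes each match at the position where it was found, using Matcher.appendReplacement, instead of doing a global search-and-replace.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DecodeSketch {
    // Same regex as quoted in this issue: one or more consecutive %XX escapes.
    private static final Pattern ENCODED = Pattern.compile("(%[0-9A-Z]{2})+");

    // Decodes a run such as "%2C%20" into the UTF-8 string it represents.
    static String percentDecode(String run) {
        byte[] bytes = new byte[run.length() / 3];
        for (int i = 0; i < bytes.length; i++) {
            String hex = run.substring(3 * i + 1, 3 * i + 3);
            bytes[i] = (byte) Integer.parseInt(hex, 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Hypothetical reconstruction of the failing approach: every match is
    // replaced globally, so an earlier replacement can destroy a later match.
    static String decodeBuggy(String encoded) {
        String decoded = encoded;
        Matcher matcher = ENCODED.matcher(encoded);
        while (matcher.find()) {
            String match = matcher.group();
            decoded = decoded.replace(match, percentDecode(match));
        }
        return decoded;
    }

    // Possible fix: replace each match where it was found, in a single pass.
    static String decodeInPlace(String encoded) {
        Matcher matcher = ENCODED.matcher(encoded);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String replacement = percentDecode(matcher.group());
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "New%20York%2C%20NY";
        System.out.println(decodeBuggy(input));   // New York%2C NY
        System.out.println(decodeInPlace(input)); // New York, NY
    }
}
```

Because the single-pass version never searches the partially decoded string, the order of the matches no longer matters.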

Oct 15 '16 01:10 davidklebanoff

Hey good catch. Want to open a Pull Request fixing it with tests? Thanks.

Oct 15 '16 04:10 FagnerMartinsBrack

Thoughts on simply replacing the decoding logic with Java's URLDecoder?

URLDecoder.decode(cookieValue, "UTF-8")

Oct 15 '16 06:10 davidklebanoff

The integration tests will probably fail, because URLDecoder.decode does not limit itself to the characters that are not allowed in the cookie-value according to RFC 6265; it decodes more than that.
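
One concrete difference, shown here only as an illustration and not necessarily the exact case the integration tests cover: URLDecoder follows the application/x-www-form-urlencoded rules, so it turns a literal + into a space. RFC 6265 allows + in a cookie-value, so such a value would be changed by URLDecoder. The class UrlDecoderSketch and the method urlDecode are hypothetical names used for this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UrlDecoderSketch {
    // Thin wrapper so callers do not have to handle the checked exception;
    // UTF-8 is always supported, so the catch branch is not expected to run.
    static String urlDecode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The '+' is a legal, unencoded cookie-value character,
        // yet URLDecoder rewrites it as a space.
        System.out.println(urlDecode("1+1%3D2")); // 1 1=2
    }
}
```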

The encoding part of the README explains how we handle encoding/decoding.

If you want to take a look at how we handle it in JavaScript, see the latest js-cookie version (2.1.3).

Oct 15 '16 06:10 FagnerMartinsBrack

I would recommend starting a PR with a failing test so that we can see the problem clearly first before implementing the fix.

Oct 15 '16 06:10 FagnerMartinsBrack

I have the same problem! En-/Decoding is obviously really hard. I will see what I can do here.

Feb 25 '19 17:02 tholu

@FagnerMartinsBrack See the PR I created. Please check, merge, and release a new version. Thanks!

Feb 25 '19 21:02 tholu