jackson-dataformats-binary

Unable to deserialize stringref-enabled CBOR with ignored properties

Open morokosi opened this issue 8 months ago • 6 comments

When deserializing CBOR with string references enabled using jackson-dataformats-binary, if the original string being referenced is in a field ignored by an annotation such as @JsonIgnoreProperties, the string reference cannot be resolved, causing an exception.

version: 2.17.1

record APub(String a, String b) {}
@JsonIgnoreProperties(ignoreUnknown = true)
record ASub(String b) {}

@Test
public void testCborDecode() throws IOException {
  var mapper = CBORMapper.builder().enable(CBORGenerator.Feature.STRINGREF).build();
  var aPub = new APub("foo", "foo"); // second occurrence of `foo` will be a stringref
  var aSer = mapper.writeValueAsBytes(aPub);
  var aSub = mapper.readValue(aSer, ASub.class); // <- throws com.fasterxml.jackson.core.JsonParseException: String reference (0) out of range:
}

morokosi avatar Jun 10 '25 01:06 morokosi

Sounds like a bug indeed.

/cc @here-abarany

cowtowncoder avatar Jun 10 '25 02:06 cowtowncoder

Ideally would be fixed in 2.18 branch if safe enough (that's our first Long-Term Support branch for backporting).

cowtowncoder avatar Jun 10 '25 02:06 cowtowncoder

Added test

com.fasterxml.jackson.dataformat.cbor.tofix.StringRef599Test

Test added in 2.19 branch (alas, 2.18 uses old JUnit4 and has no @JacksonTestFailureExpected) -- fix itself can be backported, I hope.

cowtowncoder avatar Jun 10 '25 02:06 cowtowncoder

My guess as to what's happening is the parser is skipping the fields entirely, which prevents any new strings from being registered for references. If the stringref extension is enabled on read, any byte or text string field will need to be parsed and registered rather than simply skipped.
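To make that guess concrete, here is a minimal toy model of a reader that keeps a table of strings it has seen. It is not Jackson's `CBORParser`; `ToyStringRefReader`, `Token`, and `skipNaively` are hypothetical names invented for illustration, and the real extension's minimum-length rule for registering strings is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a stringref-style reader (hypothetical, not Jackson's CBORParser). */
public class ToyStringRefReader {
    /** A token is either a literal string (ref == -1) or a back-reference to an earlier one. */
    public record Token(String literal, int ref) {
        public static Token ofLiteral(String s) { return new Token(s, -1); }
        public static Token ofRef(int index) { return new Token(null, index); }
    }

    // Strings registered so far, in the order they were read.
    private final List<String> seen = new ArrayList<>();

    /** Reads a token fully: literals are registered, references are looked up. */
    public String read(Token t) {
        if (t.literal() != null) {
            seen.add(t.literal());
            return t.literal();
        }
        if (t.ref() >= seen.size()) {
            throw new IllegalStateException("String reference (" + t.ref() + ") out of range");
        }
        return seen.get(t.ref());
    }

    /** Naive skip: discards the token without registering it. */
    public void skipNaively(Token t) {
        // Intentionally empty: nothing is added to `seen`.
    }

    public static void main(String[] args) {
        ToyStringRefReader reader = new ToyStringRefReader();
        reader.skipNaively(Token.ofLiteral("foo")); // ignored property "a"
        try {
            reader.read(Token.ofRef(0));            // property "b" refers back to "foo"
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the lookup fails only because the skipped literal never reached the table. Reading the same literal with `read` first lets the later reference resolve.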

here-abarany avatar Jun 10 '25 17:06 here-abarany

@here-abarany CBORParser will still iterate over tokens, regardless. But there may be some optimization for the case where "nextToken()" is called without accessing the String value of the preceding JsonToken.VALUE_STRING.

cowtowncoder avatar Jun 10 '25 20:06 cowtowncoder

Double checking the parser, there are multiple blocks such as this one:

        // For longer tokens (text, binary), we'll only read when requested
        if (_tokenIncomplete) {
            _skipIncomplete();
        }

I think what will need to be done here is to modify _skipIncomplete() and call _finishShortText()/_finishLongText() as appropriate when:

  1. The stringref stack is non-empty.
  2. The string isn't chunked (as chunked strings aren't added to the reference table).
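The two conditions above can be sketched as a skip routine that decodes and registers a pending string only when it has to. This is a toy model under stated assumptions, not the change that was actually made to `_skipIncomplete()`: `ToyRegisteringSkip`, `PendingString`, `enterStringRefNamespace`, and `resolve` are hypothetical names, and the real extension's minimum-length rule for registering strings is omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a skip that still registers strings (hypothetical, not Jackson's code). */
public class ToyRegisteringSkip {
    /** Hypothetical stand-in for a text or byte string value that has not been read yet. */
    public record PendingString(String value, boolean chunked) {}

    // One reference table per active stringref namespace; empty when the extension is not in use.
    private final Deque<List<String>> namespaces = new ArrayDeque<>();

    public void enterStringRefNamespace() {
        namespaces.push(new ArrayList<>());
    }

    /** Skips a pending string, registering it only when both conditions hold. */
    public void skip(PendingString s) {
        boolean namespaceActive = !namespaces.isEmpty(); // condition 1: stringref stack is non-empty
        boolean definiteLength = !s.chunked();           // condition 2: string is not chunked
        if (namespaceActive && definiteLength) {
            // Stands in for finishing the text instead of discarding it.
            namespaces.peek().add(s.value());
        }
        // Otherwise the value is simply dropped, as before.
    }

    /** Resolves a reference against the innermost namespace. */
    public String resolve(int index) {
        if (namespaces.isEmpty() || index >= namespaces.peek().size()) {
            throw new IllegalStateException("String reference (" + index + ") out of range");
        }
        return namespaces.peek().get(index);
    }
}
```

Keeping the cheap path for the other cases means input that does not use stringref pays nothing extra for skipped values.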

here-abarany avatar Jun 10 '25 20:06 here-abarany

Fixed via #627 in 2.21(.0) and 3.0.2 (both to be released).

May backport in 2.18 if feasible.

cowtowncoder avatar Oct 27 '25 16:10 cowtowncoder

Backported in

  • 2.18.5 (LTS)
  • 2.19 branch
  • 2.20 branch

cowtowncoder avatar Oct 27 '25 16:10 cowtowncoder