XMLCoder

Spaces removed when decoding strings with escaped elements

Open mxcl opened this issue 6 years ago • 7 comments

I am getting this kind of XML from an ancient SOAP server:

<aResponse>&lt;uesb2b:response xmlns:uesb2b=&quot;http://services.b2b.ues.ut.uhg.com/types/plans/&quot; xmlns=&quot;http://services.b2b.ues.ut.uhg.com/types/plans/&quot;&gt;&#xd;
  &lt;uesb2b:st cd=&quot;GA&quot; /&gt;&#xd;
  &lt;uesb2b:obligId val=&quot;01&quot; /&gt;&#xd;
  &lt;uesb2b:shrArrangementId val=&quot;00&quot; /&gt;&#xd;
  &lt;uesb2b:busInsType val=&quot;CG&quot; /&gt;&#xd;
  &lt;uesb2b:metalPlans typ=&quot;Array&quot;
…

When I decode with a struct like:

struct Response: Decodable {
    let aResponse: String
}

Using:

let decoder = XMLDecoder()
decoder.shouldProcessNamespaces = true
let rsp = try decoder.decode(Response.self, from: data)
print(rsp.aResponse)

I get:

<uesb2b:response xmlns:uesb2b="http://services.b2b.ues.ut.uhg.com/types/plans/"xmlns="http://services.b2b.ues.ut.uhg.com/types/plans/">
<uesb2b:st cd="GA"/>
<uesb2b:obligId val="01"/>
<uesb2b:shrArrangementId val="00"/>
<uesb2b:busInsType val="CG"/>
<uesb2b:metalPlans typ="Array"arrayTyp="metalPlan[110]">
<uesb2b:metalPlan cd="AUWJ"rx="286A"level="S"min="0"max="0"/>
<uesb2b:metalPlan cd="AUWK"rx="286A"level="G"min="0"max="0"/><uesb2b:metalPlan …

(Newlines added for legibility.) You can see that the spaces on either side of the &quot;s in the attribute-heavy nodes have been removed (see the last two lines).

This makes the inner XML invalid.

Happy to fix the bug, just point me at the right code, thanks.

mxcl • Sep 10 '19 14:09

Ping on a pointer. This is a real problem for us.

mxcl • Sep 30 '19 20:09

Hi @mxcl, sorry for the delay. Does setting trimValueWhitespaces to false on the XMLDecoder instance resolve the issue for you?

MaxDesiatov • Oct 05 '19 17:10

I've added a test for this in #137, but it didn't require any changes in the library code, just disabling the trimValueWhitespaces flag. Please let me know if anything's missing.

MaxDesiatov • Oct 05 '19 17:10
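
For readers following along, the suggested workaround can be sketched roughly as below. This is a minimal sketch, not the test from #137: the XML payload and its root element name are made up for illustration, and it assumes the XMLCoder package is available as a dependency. Only trimValueWhitespaces and shouldProcessNamespaces come from the thread itself.

```swift
import Foundation
import XMLCoder

struct Response: Decodable {
    let aResponse: String
}

// Hypothetical, shortened payload: an escaped XML fragment stored as
// the text content of <aResponse>, similar in shape to the SOAP reply.
let xml = """
<Response><aResponse>&lt;a x=&quot;1&quot; y=&quot;2&quot; /&gt;</aResponse></Response>
"""

let decoder = XMLDecoder()
decoder.shouldProcessNamespaces = true
// Disable per-chunk trimming so whitespace next to escaped
// characters is kept in the decoded string.
decoder.trimValueWhitespaces = false

let rsp = try decoder.decode(Response.self, from: Data(xml.utf8))
print(rsp.aResponse)
```

With trimming disabled, the decoded string should keep the spaces between attributes. The trade-off is that leading and trailing whitespace in other values is no longer stripped either.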

I'll check. I apologize that I didn't see this property before.

mxcl • Oct 07 '19 15:10

Can confirm this fixes it, thanks.

Not sure if the feature is working as intended, since it was trimming from the middle of my strings.

mxcl • Oct 07 '19 15:10

We trim any string chunk that Foundation's XMLParser passes us in its delegate function, which is usually just the whole content of an XML element. Oddly enough, XMLParser splits strings into chunks at every escaped element, and I agree this is unexpected behavior. I'll keep the issue open for a bit until I make it accumulate those chunks and trim the whole element content as expected.

MaxDesiatov • Oct 07 '19 15:10
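
The chunking described above can be observed with Foundation alone. The sketch below is illustrative only and is not XMLCoder's actual implementation: ChunkRecorder and the sample XML are hypothetical names and data. It records every string that XMLParser hands to its delegate, then compares trimming each chunk separately with accumulating the chunks first and trimming once.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives here on Linux
#endif

// Hypothetical helper: collects each string passed to foundCharacters.
final class ChunkRecorder: NSObject, XMLParserDelegate {
    private(set) var chunks: [String] = []

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        chunks.append(string)
    }
}

// Made-up sample: an escaped fragment with surrounding whitespace.
let xml = "<a>  &lt;b x=&quot;1&quot; y=&quot;2&quot; /&gt;  </a>"

let recorder = ChunkRecorder()
let parser = XMLParser(data: Data(xml.utf8))
parser.delegate = recorder
let parsed = parser.parse()

// Trimming every chunk on its own can drop spaces that sit next to
// escaped characters in the middle of the value.
let trimmedPerChunk = recorder.chunks
    .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
    .joined()

// Accumulating first and trimming once only touches the two ends.
let trimmedWhole = recorder.chunks
    .joined()
    .trimmingCharacters(in: .whitespacesAndNewlines)

print(recorder.chunks)
print(trimmedPerChunk)
print(trimmedWhole)
```

How the text is split into chunks is up to XMLParser and may differ between platforms, so the per-chunk result is not guaranteed to be the same everywhere. The accumulate-then-trim result does not depend on the chunk boundaries.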

Looks like this is already being addressed, but I just noticed the same issue with single quote values as well.

For example, the value:

What does ‘being verified’ actually mean?

was returned as:

What does'being verified' actually mean?

Setting trimValueWhitespaces to false fixed the problem for me as well. Thank you so much for all your work here!

ethan-kusters • Dec 27 '19 17:12