Merging CDATAs can produce an ending tag ]]> in the resulting CDATA, leading to an exception

Open siddhadev opened this issue 12 years ago • 4 comments

Imagine you have the following XML input:

     <message priority="info"><![CDATA[  expected:<[[D/0]]]]><![CDATA[> but was:<[null]>]]></message>

This is legal XML representing the following string (with two leading spaces):

expected:<[[D/0]]> but was:<[null]>
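
To check that claim without JDOM, here is a minimal sketch using only the JDK's DOM parser. The class and method names (`SplitCdataDemo`, `mergedText`) are made up for illustration; it assumes the default `DocumentBuilderFactory` settings, under which `getTextContent()` concatenates the text of all child nodes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class SplitCdataDemo {
    // The input from this report: "]]>" is split across two CDATA sections.
    static final String XML =
        "<message priority=\"info\"><![CDATA[  expected:<[[D/0]]]]><![CDATA[> but was:<[null]>]]></message>";

    // Parses the XML and returns the concatenated text of the root element.
    static String mergedText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // A parse failure here would mean the input is not well-formed.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = mergedText(XML);
        System.out.println("[" + text + "]");
        // The merged text contains the CDATA end delimiter, so it cannot be
        // stored in a single CDATA section.
        System.out.println("contains ]]> : " + text.contains("]]>"));
    }
}
```

The parser accepts the input, yet the merged text contains `]]>`, which is exactly what a single CDATA node cannot hold.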

Now, when SAXHandler.characters() gets called, it tries to merge consecutive CDATAs and calls setText(), which fails with:

Caused by: org.jdom.IllegalDataException: The data "  expected:<[[D/0]]> but was:<[null]>" is not legal for a JDOM CDATA section: CDATA cannot internally contain a CDATA ending delimiter (]]>).
at org.jdom.CDATA.setText(CDATA.java:121)
at org.jdom.CDATA.<init>(CDATA.java:95)
at org.jdom.DefaultJDOMFactory.cdata(DefaultJDOMFactory.java:97)
at org.jdom.input.SAXHandler.flushCharacters(SAXHandler.java:652)
at org.jdom.input.SAXHandler.flushCharacters(SAXHandler.java:623)
at org.jdom.input.SAXHandler.endElement(SAXHandler.java:678)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)

It's not a nice fix, but it's an easy workaround.

siddhadev, Nov 25 '13 12:11

Good catch. And appreciate the patch... but I am not particularly happy with the fix.... I think there's an issue in it:

if (previousCDATA != inCDATA || (ch[start] == '>' || (ch[start] == ']' && ch[start] == '>') ))

(ch[start] == ']' && ch[start] == '>') cannot possibly be true....

The situation is that the 'real' data contains a closing CDATA tag ']]>' and this is 'split' between two 'real' CDATA sections... so, the fix really needs to span what's been seen already, and what's just arrived...
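
One way to sketch a check that spans both sides of the join, purely as an illustration and not the fix that went into JDOM: look at the last two buffered characters together with the incoming chunk. `CdataBoundaryCheck` and `createsEndDelimiter` are hypothetical names, and the sketch assumes the buffered text does not already contain `]]>`.

```java
public class CdataBoundaryCheck {
    /**
     * Returns true if appending ch[start .. start+length) to the already
     * buffered text would produce "]]>" across the join or inside the chunk.
     */
    static boolean createsEndDelimiter(CharSequence buffered,
                                       char[] ch, int start, int length) {
        // Only the last two buffered characters can be part of a delimiter
        // that is completed by the new chunk.
        int tail = Math.min(2, buffered.length());
        StringBuilder window = new StringBuilder(tail + length);
        window.append(buffered, buffered.length() - tail, buffered.length());
        window.append(ch, start, length);
        return window.indexOf("]]>") >= 0;
    }

    public static void main(String[] args) {
        char[] next = "> but was:<[null]>".toCharArray();
        // The case from this issue: "]]" is buffered, ">" arrives next.
        System.out.println(
            createsEndDelimiter("  expected:<[[D/0]]", next, 0, next.length));
    }
}
```

A handler could use such a check to flush the buffered text as its own CDATA node before buffering the new chunk, instead of merging the two. It also covers the `]` + `]>` split, which a test on `ch[start]` alone misses.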

I realize that the bug is real, but this fix is incomplete... If you want to have another stab at it, feel free, and I'll pull it in, but as it stands it's not ready. Otherwise, I'll take a look at it myself in a day or two.

rolfl, Nov 25 '13 13:11

Oh, you are right, it was meant like this:

ch[start] == ']' && ch[start+1] == '>'

I updated the pull request, but you don't have to use it, a better fix should be possible. I guess a CDATA could get broken by more than just a closing tag, i.e. "]]>".

siddhadev, Nov 25 '13 14:11

I am going to contemplate this issue for a bit. I think I may have an alternative approach which is more reliable. I will have to take the time to play with the code though.

rolfl, Nov 25 '13 15:11

Just so you are aware, this issue is not a problem in JDOM 2.x, and I strongly recommend you upgrade to that. It is unlikely that there will be another JDOM 1.x release any time soon.

rolfl, Mar 30 '14 03:03