The default UTF CharacterEscapeHandler (MinimumEscapeHandler) eats carriage returns
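For some reason the default UTF CharacterEscapeHandler (MinimumEscapeHandler) silently drops carriage returns ("\r"): it never writes them to the output. NioEscapeHandler, an alternate CharacterEscapeHandler included in JAXB, does not drop them. In the code below, '\r' is one of the characters matched by the if condition, so the text before it is flushed and start is moved past it, but the switch has no case for '\r', so nothing is written in its place. Here's the relevant code: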
    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out) throws IOException {
        // avoid calling the Writer#write method too much by assuming
        // that the escaping occurs rarely.
        // profiling revealed that this is faster than the naive code.
        int limit = start+length;
        for (int i = start; i < limit; i++) {
            char c = ch[i];
            if(c == '&' || c == '<' || c == '>' || c == '\r' || (c == '\"' && isAttVal) ) {
                if(i!=start)
                    out.write(ch,start,i-start);
                start = i+1;
                switch (ch[i]) {
                    case '&':
                        out.write("&amp;");
                        break;
                    case '<':
                        out.write("&lt;");
                        break;
                    case '>':
                        out.write("&gt;");
                        break;
                    case '\"':
                        out.write("&quot;");
                        break;
                }
            }
        }
        if( start!=limit )
            out.write(ch,start,limit-start);
    }
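To see the effect without a JAXB dependency, here is a minimal reproduction sketch. It copies the logic of the listing above into a standalone class; the class name CrDropDemo and the helper escapeToString are made up for this illustration and are not part of JAXB.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical standalone copy of the escaping loop from the listing above.
final class CrDropDemo {

    // Same control flow as the quoted MinimumEscapeHandler#escape.
    static void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        int limit = start + length;
        for (int i = start; i < limit; i++) {
            char c = ch[i];
            if (c == '&' || c == '<' || c == '>' || c == '\r' || (c == '"' && isAttVal)) {
                // Flush the unescaped run that precedes the special character.
                if (i != start) {
                    out.write(ch, start, i - start);
                }
                start = i + 1;
                switch (ch[i]) {
                    case '&':
                        out.write("&amp;");
                        break;
                    case '<':
                        out.write("&lt;");
                        break;
                    case '>':
                        out.write("&gt;");
                        break;
                    case '"':
                        out.write("&quot;");
                        break;
                    // No case for '\r': the character is skipped and nothing replaces it.
                }
            }
        }
        if (start != limit) {
            out.write(ch, start, limit - start);
        }
    }

    // Illustration-only helper: escapes a whole string and returns the result.
    static String escapeToString(String text, boolean isAttVal) {
        StringWriter out = new StringWriter();
        try {
            escape(text.toCharArray(), 0, text.length(), isAttVal, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String result = escapeToString("a\r\nb", false);
        // Make control characters visible; prints a\nb (the \r is gone).
        System.out.println(result.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

The input contains "\r\n" but the output contains only "\n", which matches the behavior described in this report.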
Affected Versions
[2.2.6]
Reported by gredler
gredler said: Possibly relevant past issue (though I can't find a record in SVN of how it was addressed in the code): #449
Was assigned to yaroska
This issue was imported from java.net JIRA JAXB-959
This happens, for example, when you marshal with JAXB to a StringWriter. It affects the entire JAXB ecosystem: Java, Java EE, JAX-RS, and every project that depends on jaxb-core. And nobody notices that their XML documents are being modified.
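One possible way to keep the carriage return, sketched here under assumptions, is to write it as the character reference "&#13;". An XML parser normalizes a literal CR on input, while a character reference survives parsing. This is only an illustration of the idea and not necessarily how the JAXB maintainers addressed the issue. The class name CrPreservingEscape and the helper escapeToString are hypothetical, and wiring such logic into a Marshaller as a custom CharacterEscapeHandler is not shown.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical variant of the escaping loop that preserves '\r'.
final class CrPreservingEscape {

    static void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        int limit = start + length;
        for (int i = start; i < limit; i++) {
            String replacement;
            switch (ch[i]) {
                case '&':  replacement = "&amp;"; break;
                case '<':  replacement = "&lt;"; break;
                case '>':  replacement = "&gt;"; break;
                case '\r': replacement = "&#13;"; break; // assumption: emit CR as a character reference
                case '"':  replacement = isAttVal ? "&quot;" : null; break;
                default:   replacement = null;
            }
            if (replacement == null) {
                continue; // ordinary character, stays in the pending run
            }
            // Flush the pending run, then write the escaped form.
            if (i != start) {
                out.write(ch, start, i - start);
            }
            out.write(replacement);
            start = i + 1;
        }
        if (start != limit) {
            out.write(ch, start, limit - start);
        }
    }

    // Illustration-only helper: escapes a whole string and returns the result.
    static String escapeToString(String text, boolean isAttVal) {
        StringWriter out = new StringWriter();
        try {
            escape(text.toCharArray(), 0, text.length(), isAttVal, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The CR is written as &#13; instead of being dropped.
        System.out.println(escapeToString("a\r\nb", false).replace("\n", "\\n"));
    }
}
```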