Combining OWASP Sanitizer and Encoder
Hi,
Is it possible to combine the OWASP Sanitizer and the OWASP Encoder so that, instead of removing malicious code, the problematic parts of a given string are encoded? For example, a script tag would then do no harm and would simply be displayed as text. I am asking because I would like to handle texts where it is not certain whether they will be displayed as inner HTML or as "normal text".
Thank you very much for any answer ;)
I think this would be a great idea. Neither library is that large, so combining them would make sense. +1
If the content is data that you want to display safely, exactly as the user typed it, then I would use the encoder. If the content is user-authored HTML that you actually want to render, then you want to use the HTML sanitizer. Does that make sense to you?
-- Jim
(Reply to bmscodespace's post of Jan 19, 2024, 5:34 AM.)
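To make the distinction concrete, here is a minimal sketch of the "encode plain text, sanitize HTML" split. It uses only the Java standard library: `escapeHtml` is a hand-rolled stand-in for what `Encode.forHtml` from the OWASP Java Encoder would do, and the `sanitizer` parameter is a hypothetical hook where something like a `PolicyFactory.sanitize` call from the OWASP Java HTML Sanitizer would be plugged in. The class and method names are assumptions for illustration, not part of either library.

```java
import java.util.function.UnaryOperator;

// Illustrative sketch only; use the real OWASP libraries in production code.
final class RenderSketch {

    // Stand-in for an HTML text encoder: replaces the characters that are
    // significant in HTML so the input is displayed literally.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical helper: the treatment depends on how the text is displayed.
    // renderAsHtml == true  -> markup is kept, so it must go through a sanitizer.
    // renderAsHtml == false -> text is shown literally, so it is encoded.
    static String render(String raw, boolean renderAsHtml, UnaryOperator<String> sanitizer) {
        return renderAsHtml ? sanitizer.apply(raw) : escapeHtml(raw);
    }

    public static void main(String[] args) {
        String input = "A script in HTML starts with <script> and ends with </script> .";
        // Plain-text path: the sanitizer hook is never invoked here.
        UnaryOperator<String> notConfigured = s -> {
            throw new UnsupportedOperationException("plug in a real HTML sanitizer");
        };
        System.out.println(render(input, false, notConfigured));
    }
}
```

With the plain-text path, the sentence about script tags survives intact and is shown as text. The raw string is stored unchanged, and the choice between the two paths is made only when the display context is known.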
@jmanico It totally does. I just find that a lot of people use both of these libraries: one for sanitizing HTML input and the other for encoding output, such as JSON data, before it is sent back to the browser. I know that in PrimeFaces we use both libraries.
Hi,
Thank you for your comments. My question imagined a scenario where we don't know whether a text will be displayed as inner HTML, for example as formatted text with lots of p or b tags in it, or as ordinary plain text that a user simply typed in. If I sanitize the text, this might destroy a text such as:
A script in HTML starts with <script> and ends with </script> .
On the other hand, if I encode every string, an HTML string that we might want to display as formatted text will instead be displayed as raw HTML source, with possible code from an attacker in it ;).
Encoding must be done at the point of output. Otherwise you run into the problem of using the wrong encoding.
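A small sketch of why the encoding has to be chosen at the point of output: the same raw value needs a different encoding for each context it is written into, and encoding it early for one context corrupts it for another. This example uses only the Java standard library. `forHtmlText` is a hand-rolled stand-in for an HTML encoder such as `Encode.forHtml`, and `forQueryParam` uses `URLEncoder`, which produces form-style (`application/x-www-form-urlencoded`) encoding. The class and method names are assumptions for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative sketch only: one raw value, two output contexts.
final class OutputContextSketch {

    // Encoding for HTML text content (stand-in for a real HTML encoder).
    static String forHtmlText(String raw) {
        return raw.replace("&", "&amp;")   // must run first to avoid double-escaping
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    // Encoding for a URL query parameter value (form-style encoding).
    static String forQueryParam(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "Tom & Jerry <3";

        // Correct: keep the raw value and encode per context when writing output.
        System.out.println(forHtmlText(raw));    // Tom &amp; Jerry &lt;3
        System.out.println(forQueryParam(raw));  // Tom+%26+Jerry+%3C3

        // Wrong: HTML-encoding up front, then reusing the value in a URL,
        // leaks HTML entities into the query parameter.
        System.out.println(forQueryParam(forHtmlText(raw)));
    }
}
```

The last line shows the "wrong encoding" problem: a value that was HTML-encoded before its destination was known no longer round-trips correctly in a URL.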