tutorials Security Vulnerability - Action Required: XXE vulnerability in the newest version of the tutorials

I think the xml module may be vulnerable to Improper Restriction of XML External Entity Reference (XXE). The issue resembles two recently disclosed CVEs, CVE-2021-3878 and CVE-2021-3869, in the stanfordnlp/CoreNLP project. The vulnerable methods are as follows:

com.baeldung.xml.JaxenDemo.getAllTutorial() in the file xml/src/main/java/com/baeldung/xml/JaxenDemo.java
com.baeldung.xml.DefaultParser.getFirstLevelNodeList() in the file xml/src/main/java/com/baeldung/xml/DefaultParser.java
com.baeldung.xml.DefaultParser.getNodeById(String id) in the file xml/src/main/java/com/baeldung/xml/DefaultParser.java
com.baeldung.xml.DefaultParser.getNodeListByTitle(String name) in the file xml/src/main/java/com/baeldung/xml/DefaultParser.java

The source vulnerability information is as follows:

Vulnerability Detail:
CVE Identifier: CVE-2021-3878 and CVE-2021-3869
Description: CoreNLP is vulnerable to Improper Restriction of XML External Entity Reference
Reference: https://nvd.nist.gov/vuln/detail/CVE-2021-3869 for CVE-2021-3869 and https://nvd.nist.gov/vuln/detail/CVE-2021-3878 for CVE-2021-3878
Patch: https://github.com/stanfordnlp/corenlp/commit/5d83f1e8482ca304db8be726cad89554c88f136a for CVE-2021-3869 and https://github.com/stanfordnlp/corenlp/commit/e5bbe135a02a74b952396751ed3015e8b8252e99 for CVE-2021-3878

Vulnerability Description: The vulnerability stems from Improper Restriction of XML External Entity Reference. If an attacker compromises the XML schema files, a victim who processes them in the normal way may be exposed to an XML External Entity (XXE) injection attack.
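
To illustrate the description above, here is a minimal sketch of what such a document could look like and how an unconfigured parser would be asked to read it. The class XxeIllustration, its method names, and the tutorials/title element names are hypothetical and not taken from the project. Whether the entity is actually resolved depends on the JDK and parser defaults in use, so no output is claimed here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical illustration class; not part of the tutorials project.
class XxeIllustration {

    // Builds an XML document whose DOCTYPE declares an external entity
    // that points at the given local file.
    static String maliciousXml(Path target) {
        return "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE tutorials [<!ENTITY xxe SYSTEM \"" + target.toUri() + "\">]>\n"
            + "<tutorials><title>&xxe;</title></tutorials>";
    }

    // Parses the XML with a factory that has no XXE hardening applied
    // and returns the text content of the root element.
    static String parseWithDefaults(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document doc = dbf.newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // A harmless temporary file stands in for a sensitive local file.
        Path secret = Files.createTempFile("xxe-demo", ".txt");
        Files.write(secret, "local file contents".getBytes(StandardCharsets.UTF_8));
        try {
            System.out.println(parseWithDefaults(maliciousXml(secret)));
        } finally {
            Files.deleteIfExists(secret);
        }
    }
}
```

If the parser resolves the entity, the file contents end up inside the parsed title element, which is the data exposure the XXE description refers to.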

Recommended Actions: The fix is similar to the patches for CVE-2021-3878 and CVE-2021-3869. The function I defined below applies several settings that prevent external entities and DTDs from being loaded when a document builder parses XML. To avoid XXE attacks, call safeDocumentBuilderFactory instead of calling DocumentBuilderFactory.newInstance() directly to create a DocumentBuilderFactory object.

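  // Requires imports: javax.xml.XMLConstants, javax.xml.parsers.DocumentBuilderFactory,
  // javax.xml.parsers.ParserConfigurationException
  public static DocumentBuilderFactory safeDocumentBuilderFactory() {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
      // Reject any document that contains a DOCTYPE declaration.
      dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
      // Defense in depth: do not load external DTDs or external entities.
      dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
      dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
      dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
      dbf.setFeature("http://apache.org/xml/features/dom/create-entity-ref-nodes", false);
      dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    } catch (ParserConfigurationException e) {
      // Fail closed: logging and returning dbf here would hand back a
      // factory that may still resolve external entities.
      throw new IllegalStateException("Could not configure a secure XML parser", e);
    }
    return dbf;
  }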
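
As a rough check of the approach, the sketch below hardens a factory in a similar way and confirms that a document carrying a DOCTYPE is rejected while an ordinary document still parses. The class SafeParsingDemo and its method names are hypothetical, and the sketch assumes the JDK's built-in parser, which recognizes the disallow-doctype-decl feature.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of the tutorials project.
class SafeParsingDemo {

    // Creates a factory that refuses DOCTYPE declarations outright.
    // Throws if the underlying parser does not support the feature.
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // Returns true if the hardened parser refuses the document,
    // false if the document parses successfully.
    static boolean isRejected(String xml) throws Exception {
        try {
            hardenedFactory().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String withDoctype = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE tutorials [<!ENTITY xxe SYSTEM \"file:///etc/hostname\">]>"
            + "<tutorials><title>&xxe;</title></tutorials>";
        String plain = "<tutorials><title>ok</title></tutorials>";
        System.out.println("DOCTYPE document rejected: " + isRejected(withDoctype));
        System.out.println("Plain document rejected: " + isRejected(plain));
    }
}
```

Because the DOCTYPE is refused before any entity is processed, the file named in the entity declaration is never opened. Documents that legitimately need a DTD would also be refused, which is the trade-off of this setting.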

Considering the potential risks, I am willing to cooperate with you to verify, address, and report the identified vulnerability promptly and responsibly. If you require any further information or assistance, please do not hesitate to reach out to me. Thank you, and I look forward to hearing from you soon.

Jan 30 '24 15:01 Crispy-fried-chicken

Hey, @Crispy-fried-chicken.

Thanks for bringing this to our attention. We'll look into this.

This issue will remain open until then.

Feb 02 '24 16:02 ulisseslima

Addressed in this PR by creating a secureDocumentBuilderFactory instance - https://github.com/eugenp/tutorials/pull/17732

Oct 08 '24 03:10 rajat-garg

Hey @rajat-garg, thank you for your reply on this issue, which was detected by our tool. I would really like to hear your thoughts about the tool. When you have a chance, could you please take a look at it? Specifically, we're interested in understanding:

Do you feel the detection results from our tool help enhance the security of your project?
Would you be willing to let us regularly scan your project in the future to identify potential vulnerabilities?
Our tool works by collecting patches from publicly disclosed vulnerabilities in real time and scanning target projects for identical code or similar logic. Do you have any suggestions for improving this vulnerability detection approach?

Please feel free to share your thoughts. Your feedback is really important for improving our tool. Thank you!

Dec 24 '24 13:12 Crispy-fried-chicken