CodeQL documentation

XML external entity expansion

ID: py/xxe
Kind: path-problem
Severity: error
Precision: high
Tags:
   - security
   - external/cwe/cwe-611
   - external/cwe/cwe-827
Query suites:
   - python-code-scanning.qls
   - python-security-extended.qls
   - python-security-and-quality.qls

Click to see the query in the CodeQL repository

Parsing untrusted XML files with a weakly configured XML parser may lead to an XML External Entity (XXE) attack. This type of attack uses external entity references to access arbitrary files on a system, carry out denial-of-service (DoS) attacks, or server-side request forgery. Even when the result of parsing is not returned to the user, DoS attacks are still possible and out-of-band data retrieval techniques may allow attackers to steal sensitive data.

Recommendation

The easiest way to prevent XXE attacks is to disable external entity handling when parsing untrusted data. How this is done depends on the library being used. Note that some libraries, such as recent versions of the XML libraries in the standard library of Python 3, disable entity expansion by default, so unless you have explicitly enabled entity expansion, no further action needs to be taken.

We recommend using the defusedxml PyPI package, which has been created to prevent XML attacks (both XXE and XML bombs).

Example

The following example uses the lxml XML parser to parse a string xml_src. That string is from an untrusted source, so this code is vulnerable to an XXE attack, since the default parser from lxml.etree allows local external entities to be resolved.

from flask import Flask, request
import lxml.etree

app = Flask(__name__)

@app.post("/upload")
def upload():
    xml_src = request.get_data()
    doc = lxml.etree.fromstring(xml_src)
    return lxml.etree.tostring(doc)

To guard against XXE attacks with the lxml library, you should create a parser with resolve_entities set to false. This means that no entity expansion is undertaken, althuogh standard predefined entities such as >, for writing > inside the text of an XML element, are still allowed.

from flask import Flask, request
import lxml.etree

app = Flask(__name__)

@app.post("/upload")
def upload():
    xml_src = request.get_data()
    parser = lxml.etree.XMLParser(resolve_entities=False)
    doc = lxml.etree.fromstring(xml_src, parser=parser)
    return lxml.etree.tostring(doc)

References

  • © GitHub, Inc.
  • Terms
  • Privacy