CodeQL documentation

Incomplete HTML attribute sanitization

ID: js/incomplete-html-attribute-sanitization
Kind: path-problem
Severity: warning
Precision: high
Tags:
   - security
   - external/cwe/cwe-079
   - external/cwe/cwe-116
   - external/cwe/cwe-020
Query suites:
   - javascript-code-scanning.qls
   - javascript-security-extended.qls
   - javascript-security-and-quality.qls


Sanitizing untrusted input for HTML meta-characters is a common technique for preventing cross-site scripting attacks. Usually, this is done by escaping <, >, & and ". However, the context in which the sanitized value is used determines which characters need to be sanitized.

Some programs sanitize only < and >, since those are the most commonly recognized dangerous characters. The lack of sanitization for " is problematic when an incompletely sanitized value is used as an HTML attribute in a string that is later parsed as HTML: an unescaped " ends the attribute value, so the rest of the input can introduce additional attributes, such as event handlers.

Recommendation

Sanitize all relevant HTML meta-characters when constructing HTML dynamically, and pay special attention to where the sanitized value is used.

An even safer alternative is to design the application so that sanitization is not needed, for instance by using HTML templates that are explicit about the values they treat as HTML.
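As an illustration of sanitizing all relevant meta-characters, the following is a minimal sketch of an escaping helper. The names escapeHtml and HTML_ESCAPES are hypothetical and do not come from the query or from any library. The sketch assumes the value ends up in HTML text or in a quoted attribute value; other contexts, such as URLs or inline scripts, need different handling.

```javascript
// Hypothetical lookup table: each HTML meta-character maps to its entity.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: escapes the meta-characters instead of deleting them,
// so the original value is preserved when the browser decodes the entities.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// The attack string from the example no longer ends the attribute value.
const payload = '" onclick="alert(42)';
console.log(`<div data-id="${escapeHtml(payload)}"></div>`);
// prints: <div data-id="&quot; onclick=&quot;alert(42)"></div>
```

Escaping rather than removing characters keeps legitimate values that contain & or " intact. In practice, a well-tested templating or escaping library is usually preferable to a hand-written helper like this one.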

Example

The following example code writes part of an HTTP request (which is controlled by the user) to an HTML attribute of the server response. The user-controlled value is, however, not sanitized for ". This leaves the website vulnerable to cross-site scripting since an attacker can use a string like " onclick="alert(42) to inject JavaScript code into the response.

var app = require('express')();

app.get('/user/:id', function(req, res) {
	let id = req.params.id;
	id = id.replace(/<|>/g, ""); // BAD: only < and > are removed, so " survives
	let userHtml = `<div data-id="${id}">${getUserName(id) || "Unknown name"}</div>`;
	// ...
	res.send(prefix + userHtml + suffix);
});

Sanitizing the user-controlled data for " helps prevent the vulnerability:

var app = require('express')();

app.get('/user/:id', function(req, res) {
	let id = req.params.id;
	id = id.replace(/<|>|&|"/g, ""); // GOOD: " is removed as well, so the value cannot end the attribute
	let userHtml = `<div data-id="${id}">${getUserName(id) || "Unknown name"}</div>`;
	// ...
	res.send(prefix + userHtml + suffix);
});
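To see the difference between the two replacements without running a server, the following sketch applies both to the attack string from the example. The function names incompleteSanitize, completeSanitize and render are hypothetical; render is a simplified stand-in for the template literal in the handlers above and always uses a fixed user name.

```javascript
// Same replacement as the BAD handler: only < and > are removed.
function incompleteSanitize(id) {
  return id.replace(/<|>/g, "");
}

// Same replacement as the GOOD handler: <, >, & and " are removed.
function completeSanitize(id) {
  return id.replace(/<|>|&|"/g, "");
}

// Simplified stand-in for the template literal in the handlers.
function render(id) {
  return `<div data-id="${id}">Unknown name</div>`;
}

const payload = '" onclick="alert(42)';

console.log(render(incompleteSanitize(payload)));
// prints: <div data-id="" onclick="alert(42)">Unknown name</div>

console.log(render(completeSanitize(payload)));
// prints: <div data-id=" onclick=alert(42)">Unknown name</div>
```

In the first output the injected " closes data-id, so onclick becomes a separate attribute of the element. In the second output the whole input stays inside the data-id value. This holds because the attribute is delimited by double quotes; an attribute delimited by single quotes would also need ' to be handled.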
