CodeQL documentation

Data flow cheat sheet for JavaScript

This article describes the parts of the CodeQL libraries for JavaScript that are commonly used for variant analysis and in data flow queries.

Taint tracking path queries

Use the following template to create a taint tracking path query:

/**
 * @kind path-problem
 */
import javascript
import DataFlow
import DataFlow::PathGraph

class MyConfig extends TaintTracking::Configuration {
  MyConfig() { this = "MyConfig" }
  override predicate isSource(Node node) { ... }
  override predicate isSink(Node node) { ... }
  override predicate isAdditionalTaintStep(Node pred, Node succ) { ... }
}

from MyConfig cfg, PathNode source, PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "taint from $@.", source.getNode(), "here"

This query reports flow paths that:

  • Begin at a node matched by isSource.
  • Step through variables, function calls, properties, strings, arrays, promises, exceptions, and steps added by isAdditionalTaintStep.
  • End at a node matched by isSink.

See also: “Global data flow” and “Creating path queries.”
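As an illustration, the template above could be filled in as follows. This is a minimal sketch, not a production query: the name EvalConfig is hypothetical, and treating the first argument of a call to the global eval function as the sink is an assumption made for this example only.

```ql
/**
 * @kind path-problem
 */
import javascript
import DataFlow
import DataFlow::PathGraph

// Hypothetical configuration: remote user input flowing into eval().
class EvalConfig extends TaintTracking::Configuration {
  EvalConfig() { this = "EvalConfig" }

  // Any remote flow source counts as a source.
  override predicate isSource(Node node) { node instanceof RemoteFlowSource }

  // The sink is the argument of the call, not the call node itself.
  override predicate isSink(Node node) {
    node = globalVarRef("eval").getACall().getArgument(0)
  }
}

from EvalConfig cfg, PathNode source, PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "taint from $@.", source.getNode(), "here"
```

isAdditionalTaintStep is left out here because the built-in steps are assumed to be enough for this case.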

DataFlow module

Use data flow nodes to match program elements independently of syntax. See also: “Analyzing data flow in JavaScript and TypeScript.”

Predicates in the DataFlow:: module:

Classes and member predicates in the DataFlow:: module:

StringOps module

  • StringOps::Concatenation – string concatenation, using a plus operator, template literal, or array join call
  • StringOps::StartsWith – check if a string starts with something
  • StringOps::EndsWith – check if a string ends with something
  • StringOps::Includes – check if a string contains something
  • StringOps::RegExpTest – check if a string matches a RegExp
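A short sketch of how one of these classes might be used: the query below looks for prefix checks against a fixed string. The literal "http://" is an assumption chosen for the example.

```ql
import javascript

// Find checks such as `url.startsWith("http://")`, regardless of
// which syntactic form the check takes in the source code.
from StringOps::StartsWith check
where check.getSubstring().mayHaveStringValue("http://")
select check, "Checks whether $@ starts with 'http://'.", check.getBaseString(), "this string"
```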

Utility

System and Network

Untrusted data

Note: some RemoteFlowSource instances, such as input from a web socket, belong to none of the specific subcategories above.

Files

AST nodes

See also: “Abstract syntax tree classes for working with JavaScript and TypeScript programs.”

Conversion between DataFlow and AST nodes:

String matching

  • x.matches("escape%") – holds if x starts with "escape"
  • x.regexpMatch("escape.*") – holds if x starts with "escape"
  • x.regexpMatch("(?i).*escape.*") – holds if x contains "escape" (case insensitive)
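For instance, these string predicates can be applied to any string-valued expression in a query. The sketch below, which assumes you want to find functions by name, selects functions whose name contains "escape" in any letter case.

```ql
import javascript

// regexpMatch must match the whole string, hence the leading
// and trailing `.*` around the word being searched for.
from Function f
where f.getName().regexpMatch("(?i).*escape.*")
select f, "Function whose name contains 'escape'."
```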

Access paths

When multiple property accesses are chained together they form what’s called an “access path”.

To identify nodes based on access paths, use the following predicates in the AccessPath module:

getAReferenceTo and getAnAssignmentTo have a 1-argument version for global access paths, and a 2-argument version for access paths starting at a given node.
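A minimal sketch of the 1-argument versions follows. The access path MyApp.config is hypothetical and stands in for whatever global access path your code base uses.

```ql
import javascript

// "MyApp.config" is a made-up global access path used for illustration.
from DataFlow::Node node, string kind
where
  node = AccessPath::getAReferenceTo("MyApp.config") and kind = "Reference to"
  or
  node = AccessPath::getAnAssignmentTo("MyApp.config") and kind = "Assignment to"
select node, kind + " the global access path MyApp.config."
```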

Type tracking

See also: “Using type tracking for API modeling.”

Use the following template to define forward type tracking predicates:

import javascript
import DataFlow

SourceNode myType(TypeTracker t) {
  t.start() and
  result = /* SourceNode to track */
  or
  exists(TypeTracker t2 |
    result = myType(t2).track(t2, t)
  )
}

SourceNode myType() {
  result = myType(TypeTracker::end())
}
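As a concrete instance of the forward template, the sketch below tracks the value returned by calling the function exported by the express module. The predicate name expressApp and the choice of express are assumptions made for this example.

```ql
import javascript
import DataFlow

// Forward tracking: start at `require("express")()` and follow the value
// through calls, returns, and property stores and loads.
SourceNode expressApp(TypeTracker t) {
  t.start() and
  result = moduleImport("express").getACall()
  or
  exists(TypeTracker t2 |
    result = expressApp(t2).track(t2, t)
  )
}

SourceNode expressApp() {
  result = expressApp(TypeTracker::end())
}

from SourceNode app
where app = expressApp()
select app, "Possible reference to an Express application."
```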

Use the following template to define backward type tracking predicates:

import javascript
import DataFlow

SourceNode myType(TypeBackTracker t) {
  t.start() and
  result = (/* argument to track */).getALocalSource()
  or
  exists(TypeBackTracker t2 |
    result = myType(t2).backtrack(t2, t)
  )
}

SourceNode myType() {
  result = myType(TypeBackTracker::end())
}
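As a concrete instance of the backward template, the sketch below finds functions that may end up as the first argument of setTimeout. The predicate name timeoutCallback and the choice of setTimeout are assumptions made for this example.

```ql
import javascript
import DataFlow

// Backward tracking: start at the callback argument and walk back
// towards the places where that value was created.
SourceNode timeoutCallback(TypeBackTracker t) {
  t.start() and
  result = globalVarRef("setTimeout").getACall().getArgument(0).getALocalSource()
  or
  exists(TypeBackTracker t2 |
    result = timeoutCallback(t2).backtrack(t2, t)
  )
}

SourceNode timeoutCallback() {
  result = timeoutCallback(TypeBackTracker::end())
}

from FunctionNode f
where f = timeoutCallback()
select f, "Function that may be passed to setTimeout."
```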

Troubleshooting

  • Using a call node as a sink? Try using getArgument to get an argument of the call node instead.
  • Trying to use moduleImport or moduleMember as a call node? Try using getACall to get a call to the imported function, instead of the function itself.
  • Compilation fails due to incompatible types? Make sure AST nodes and DataFlow nodes are not mixed up. Use asExpr() or flow() to convert.
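The three tips above can be combined in one small sketch. It assumes you are interested in the first argument of calls to exec from the child_process module; that module and member are chosen only for illustration.

```ql
import javascript

from DataFlow::CallNode call, DataFlow::Node arg, Expr e
where
  // moduleMember gives the imported function; getACall gives calls to it.
  call = DataFlow::moduleMember("child_process", "exec").getACall() and
  // Use the argument, not the call node, as the interesting node.
  arg = call.getArgument(0) and
  // asExpr() converts a DataFlow node to an AST node; flow() goes the other way.
  e = arg.asExpr()
select e, "First argument of a call to child_process.exec."
```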