Data flow cheat sheet for JavaScript
This article describes parts of the JavaScript libraries commonly used for variant analysis and in data flow queries.
Taint tracking path queries
Use the following template to create a taint tracking path query:
/**
 * @kind path-problem
 */
import javascript

module MyConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node node) { ... }
  predicate isSink(DataFlow::Node node) { ... }
  predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ) { ... }
}

module MyFlow = TaintTracking::Global<MyConfig>;

// Exposes the edges relation that a path-problem query needs.
import MyFlow::PathGraph

from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "taint from $@.", source.getNode(), "here"
This query reports flow paths which:
Begin at a node matched by isSource.
Step through variables, function calls, properties, strings, arrays, promises, exceptions, and steps added by isAdditionalFlowStep.
End at a node matched by isSink.
See also: “Global data flow” and “Creating path queries.”
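As an illustration, the template might be filled in as shown here. This is a minimal sketch rather than a vetted security query. The choice of RemoteFlowSource as the source and the first argument of child_process.exec as the sink is an assumption made for the example, as are the names ExecConfig and ExecFlow, the query id, and the message text. isAdditionalFlowStep is left out on the assumption that no extra steps are needed.

```ql
/**
 * @name Remote input flowing to child_process.exec (illustrative)
 * @kind path-problem
 * @id js/example/remote-input-to-exec
 */

import javascript

// Hypothetical configuration: names and source/sink choices are assumptions.
module ExecConfig implements DataFlow::ConfigSig {
  // Any untrusted input recognized by the standard library.
  predicate isSource(DataFlow::Node node) { node instanceof RemoteFlowSource }

  // The command string passed to `require("child_process").exec(...)`.
  predicate isSink(DataFlow::Node node) {
    node = DataFlow::moduleMember("child_process", "exec").getACall().getArgument(0)
  }
}

module ExecFlow = TaintTracking::Global<ExecConfig>;

import ExecFlow::PathGraph

from ExecFlow::PathNode source, ExecFlow::PathNode sink
where ExecFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "Command built from $@.", source.getNode(), "remote input"
```

Note that the sink is the argument node, not the call node itself. The data flows into the argument.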
DataFlow module
Use data flow nodes to match program elements independently of syntax. See also: “Analyzing data flow in JavaScript and TypeScript.”
Predicates in the DataFlow:: module:
moduleImport – finds uses of a module
moduleMember – finds uses of a module member
globalVarRef – finds uses of a global variable
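The three predicates above can be tried out with a small exploratory query. The sketch below assumes the analyzed code base uses express, fs, and the browser document global. Those names are placeholders chosen for illustration and can be swapped for whatever the project actually imports.

```ql
import javascript

// Lists entry points matched by each predicate, with a short label.
from DataFlow::SourceNode ref, string description
where
  ref = DataFlow::moduleImport("express") and
  description = "import of the express module"
  or
  ref = DataFlow::moduleMember("fs", "readFile") and
  description = "reference to fs.readFile"
  or
  ref = DataFlow::globalVarRef("document") and
  description = "reference to the global variable document"
select ref, description
```

Each predicate returns a SourceNode, so member predicates such as getACall or getAPropertyRead can be chained directly onto the result.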
Classes and member predicates in the DataFlow:: module:
- Node – something that can have a value, such as an expression, declaration, or SSA variable
getALocalSource – find the node that this came from
getTopLevel – top-level scope enclosing this node
getFile – file containing this node
getIntValue – value of this node if it is an integer constant
getStringValue – value of this node if it is a string constant
mayHaveBooleanValue – check if the value is true or false
- SourceNode extends Node – function call, parameter, object creation, or reference to a property or global variable
getALocalUse – find nodes whose value came from this node
getACall – find calls with this as the callee
getAnInstantiation – find new-calls with this as the callee
getAnInvocation – find calls or new-calls with this as the callee
getAMethodCall – find method calls with this as the receiver
getAMemberCall – find calls with a member of this as the callee
getAPropertyRead – find property reads with this as the base
getAPropertyWrite – find property writes with this as the base
getAPropertySource – find nodes flowing into a property of this node
- InvokeNode, NewNode, CallNode, MethodCallNode extends SourceNode – call to a function or constructor
getArgument – an argument to the call
getCalleeNode – node being invoked as a function
getCalleeName – name of the variable or property being called
getOptionArgument – a “named argument” passed in through an object literal
getCallback – a function passed as a callback
getACallee – a function being called here
(MethodCallNode).getMethodName – name of the method being invoked
(MethodCallNode).getReceiver – receiver of the method call
- FunctionNode extends SourceNode – definition of a function, including closures, methods, and class constructors
getName – name of the function, derived from a variable or property name
getParameter – a parameter of the function
getReceiver – the node representing the value of this
getAReturn – get a returned expression
- ParameterNode extends SourceNode – parameter of a function
getName – the parameter name, if it has one
- ClassNode extends SourceNode – class declaration or function that acts as a class
getName – name of the class, derived from a variable or property name
getConstructor – the constructor function
getInstanceMethod – get an instance method by name
getStaticMethod – get a static method by name
getAnInstanceReference – find references to an instance of the class
getAClassReference – find references to the class itself
- ObjectLiteralNode extends SourceNode – object literal
getAPropertyWrite – a property in the object literal
getAPropertySource – value flowing into a property
- ArrayCreationNode extends SourceNode – array literal or call to Array constructor
getElement – an element of the array
- PropRef, PropRead, PropWrite – read or write of a property
getPropertyName – name of the property, if it is constant
getPropertyNameExpr – expression holding the name of the property
getBase – object whose property is accessed
(PropWrite).getRhs – right-hand side of the property assignment
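The member predicates listed above are designed to be chained. As a hedged sketch, the query below looks for route handlers registered with app.get(path, handler) on an object created by calling the express module. The express usage pattern is an assumption about the analyzed code and is not something the library requires.

```ql
import javascript

from DataFlow::MethodCallNode route, DataFlow::FunctionNode handler
where
  // express() creates the app; getAMethodCall follows local flow to app.get(...)
  route = DataFlow::moduleImport("express").getACall().getAMethodCall("get") and
  // The second argument (index 1) is assumed to be the handler callback.
  handler = route.getCallback(1)
select handler, "Handler for route " + route.getArgument(0).getStringValue()
```

Rows are produced only when the route path is a constant string, because getStringValue has no result otherwise.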
StringOps module
StringOps::Concatenation – string concatenation, using a plus operator, template literal, or array join call
StringOps::StartsWith – check if a string starts with something
StringOps::EndsWith – check if a string ends with something
StringOps::Includes – check if a string contains something
StringOps::RegExpTest – check if a string matches a RegExp
Utility
ExtendCall – call that copies properties from one object to another
JsonParserCall – call that deserializes a JSON string
JsonStringifyCall – call that serializes a value to a JSON string
PropertyProjection – call that extracts nested properties by name
System and Network
ClientRequest – outgoing network request
DatabaseAccess – query being submitted to a database
FileNameSource – reference to a filename
- FileSystemAccess – file system operation
FileSystemReadAccess – reading the contents of a file
FileSystemWriteAccess – writing to the contents of a file
PersistentReadAccess – reading from persistent storage, like cookies
PersistentWriteAccess – writing to persistent storage
SystemCommandExecution – execution of a system command
Untrusted data
- RemoteFlowSource – source of untrusted user input
isUserControlledObject – is the input deserialized to a JSON-like object? (as opposed to just being a string)
- ClientSideRemoteFlowSource extends RemoteFlowSource – input specific to the browser environment
getKind – is this derived from the path, fragment, query, url, or name?
- HTTP::RequestInputAccess extends RemoteFlowSource – input from an incoming HTTP request
getKind – is this derived from a parameter, header, body, url, or cookie?
- HTTP::RequestHeaderAccess extends RequestInputAccess – access to a specific header
getAHeaderName – the name of a header being accessed
Note: some RemoteFlowSource instances, such as input from a web socket, belong to none of the specific subcategories above.
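To see which untrusted inputs the library recognizes in a code base, a short exploratory query can help. The sketch below restricts the results to sources that may be deserialized objects, using isUserControlledObject from the list above. The message text is illustrative only.

```ql
import javascript

from RemoteFlowSource source
where source.isUserControlledObject()
select source, "Untrusted input that may be a JSON-like object rather than a plain string."
```

Dropping the where clause lists every RemoteFlowSource, which is a quick way to check whether a framework used by the project is modeled.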
Files
File, Folder extends Container – file or folder in the database
getBaseName – the name of the file or folder
getRelativePath – path relative to the database root
AST nodes
See also: “Abstract syntax tree classes for working with JavaScript and TypeScript programs.”
Conversion between DataFlow and AST nodes:
Node.asExpr() – convert node to an expression, if possible
Expr.flow() – convert expression to a node (always possible)
DataFlow::valueNode – convert expression or declaration to a node
DataFlow::parameterNode – convert a parameter to a node
DataFlow::thisNode – get the receiver node of a function
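The round trip between the two representations can be sketched as follows. The example starts from an AST-level CallExpr, moves to the data flow node of its first argument with flow(), and converts back with asExpr(). Matching on the callee name eval is just an assumption made to keep the example concrete.

```ql
import javascript

from CallExpr call, DataFlow::Node argNode
where
  // AST side: a syntactic call whose callee is named "eval".
  call.getCalleeName() = "eval" and
  // AST to DataFlow: every expression has a corresponding node.
  argNode = call.getArgument(0).flow()
// DataFlow to AST: asExpr() has a result here because the node came from an expression.
select argNode.asExpr(), "Expression passed to eval."
```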
String matching
x.matches("escape%") – holds if x starts with "escape"
x.regexpMatch("escape.*") – holds if x starts with "escape"
x.regexpMatch("(?i).*escape.*") – holds if x contains "escape" (case insensitive)
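These string predicates apply to any QL string, for example a function name. The following sketch, with an illustrative message, lists functions whose name contains "escape" in any letter case. Note that regexpMatch must match the whole string, which is why the pattern is wrapped in .* on both sides.

```ql
import javascript

from Function f
where f.getName().regexpMatch("(?i).*escape.*")
select f, "Function whose name mentions 'escape'."
```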
Access paths
When multiple property accesses are chained together, they form what’s called an “access path”.
To identify nodes based on access paths, use the following predicates in the AccessPath module:
AccessPath::getAReferenceTo – find nodes that refer to the given access path
AccessPath::getAnAssignmentTo – find nodes that are assigned to the given access path
AccessPath::getAnAliasedSourceNode – find nodes that refer to the same access path
getAReferenceTo and getAnAssignmentTo have a 1-argument version for global access paths, and a 2-argument version for access paths starting at a given node.
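As a small sketch of the 1-argument form, the query below looks up references to a global access path. The path document.location.href is an arbitrary example, and the dotted-string spelling of the path is an assumption about how global access paths are written.

```ql
import javascript

from DataFlow::Node ref
where ref = AccessPath::getAReferenceTo("document.location.href")
select ref, "Reference to document.location.href."
```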
Type tracking
See also: “Using type tracking for API modeling.”
Use the following template to define forward type tracking predicates:
import DataFlow

SourceNode myType(TypeTracker t) {
  t.start() and
  result = /* SourceNode to track */
  or
  exists(TypeTracker t2 |
    result = myType(t2).track(t2, t)
  )
}

SourceNode myType() {
  result = myType(TypeTracker::end())
}
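A filled-in version of the forward template might look like the sketch below. It tracks objects created by calling the express module and then reports listen() calls on them, even when the object has been passed through function calls or stored in properties. The predicate name expressApp and the choice of express are assumptions for illustration. The example uses qualified names with import javascript instead of import DataFlow.

```ql
import javascript

/** Gets a node that may refer to an Express application object (hypothetical helper). */
DataFlow::SourceNode expressApp(DataFlow::TypeTracker t) {
  // Start tracking at the call express().
  t.start() and
  result = DataFlow::moduleImport("express").getACall()
  or
  // Follow one more inter-procedural step from an already tracked node.
  exists(DataFlow::TypeTracker t2 | result = expressApp(t2).track(t2, t))
}

DataFlow::SourceNode expressApp() { result = expressApp(DataFlow::TypeTracker::end()) }

from DataFlow::MethodCallNode call
where call = expressApp().getAMethodCall("listen")
select call, "Call to listen() on an Express application."
```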
Use the following template to define backward type tracking predicates:
import DataFlow

SourceNode myType(TypeBackTracker t) {
  t.start() and
  result = (/* argument to track */).getALocalSource()
  or
  exists(TypeBackTracker t2 |
    result = myType(t2).backtrack(t2, t)
  )
}

SourceNode myType() {
  result = myType(TypeBackTracker::end())
}
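The backward template works from a use back to the values that may reach it. As a hedged sketch, the following tracks backwards from the first argument of setTimeout to find functions that may end up being scheduled. The predicate name timeoutCallback and the choice of setTimeout are assumptions made for the example.

```ql
import javascript

/** Gets a node that may flow to the first argument of a setTimeout call (hypothetical helper). */
DataFlow::SourceNode timeoutCallback(DataFlow::TypeBackTracker t) {
  // Start at the local source of the argument.
  t.start() and
  result = DataFlow::globalVarRef("setTimeout").getACall().getArgument(0).getALocalSource()
  or
  // Step backwards from an already tracked node.
  exists(DataFlow::TypeBackTracker t2 | result = timeoutCallback(t2).backtrack(t2, t))
}

DataFlow::SourceNode timeoutCallback() {
  result = timeoutCallback(DataFlow::TypeBackTracker::end())
}

from DataFlow::FunctionNode fn
where fn = timeoutCallback()
select fn, "Function that may be invoked by setTimeout."
```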
Troubleshooting
Using a call node as a sink? Try using getArgument to get an argument of the call node instead.
Trying to use moduleImport or moduleMember as a call node? Try using getACall to get a call to the imported function, instead of the function itself.
Compilation fails due to incompatible types? Make sure AST nodes and DataFlow nodes are not mixed up. Use asExpr() or flow() to convert.
Further reading
Exploring data flow with path queries in the GitHub documentation.