Migrating JavaScript Dataflow Queries
The JavaScript analysis used to have its own data flow library, which differed from the shared data flow library used by other languages. This library has now been deprecated in favor of the shared library.
This article explains how to migrate JavaScript data flow queries to use the shared data flow library, and some important differences to be aware of. Note that the article on analyzing data flow in JavaScript and TypeScript provides a general guide to the new data flow library, whereas this article aims to help with migrating existing queries from the old data flow library.
Note that the DataFlow::Configuration class is still backed by the original data flow library, but has been marked as deprecated. This means data flow queries using this class will continue to work, albeit with deprecation warnings, until the 1-year deprecation period expires in early 2026. It is recommended that all custom queries be migrated before this time, to ensure they continue to work in the future.
Data flow queries should be migrated to use DataFlow::ConfigSig-style modules instead of the DataFlow::Configuration class. This is identical to the interface found in other languages. When making this switch, the query will become backed by the shared data flow library instead. That is, data flow queries will only work with the shared data flow library when they have been migrated to ConfigSig-style, as shown in the following table:
API | Implementation |
---|---|
DataFlow::Configuration | Old library (deprecated, to be removed in early 2026) |
DataFlow::ConfigSig | Shared library |
A straightforward translation to DataFlow::ConfigSig-style is usually possible, although there are some complications that may cause the query to behave differently.
We’ll first cover some straightforward migration examples, and then go over some of the complications that may arise.
Simple migration example
A simple example of a query using the old data flow library is shown below:
/** @kind path-problem */
import javascript
import DataFlow::PathGraph

class MyConfig extends DataFlow::Configuration {
  MyConfig() { this = "MyConfig" }

  override predicate isSource(DataFlow::Node node) { ... }

  override predicate isSink(DataFlow::Node node) { ... }
}

from MyConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "Flow found"
In the new style, the same query looks like this:
/** @kind path-problem */
import javascript

module MyConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node node) { ... }

  predicate isSink(DataFlow::Node node) { ... }
}

module MyFlow = DataFlow::Global<MyConfig>;

import MyFlow::PathGraph

from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink, source, sink, "Flow found"
The changes can be summarized as:
- The DataFlow::Configuration class was replaced with a module implementing DataFlow::ConfigSig.
- The characteristic predicate was removed (modules have no characteristic predicates).
- Predicates such as isSource no longer have the override keyword (as they are defined in a module now).
- The configuration module is passed to DataFlow::Global, resulting in a new module, called MyFlow in this example.
- The query imports MyFlow::PathGraph instead of DataFlow::PathGraph.
- The MyConfig cfg variable was removed from the from clause.
- The hasFlowPath call was replaced with MyFlow::flowPath.
- The type DataFlow::PathNode was replaced with MyFlow::PathNode.
With these changes, we have produced an equivalent query that is backed by the new data flow library.
Taint tracking
For configuration classes extending TaintTracking::Configuration, the migration is similar but with a few differences:
- The TaintTracking::Global module should be used instead of DataFlow::Global.
- Some predicates originating from TaintTracking::Configuration should be renamed to match the DataFlow::ConfigSig interface:
  - isSanitizer should be renamed to isBarrier.
  - isAdditionalTaintStep should be renamed to isAdditionalFlowStep.
Note that there is no such thing as TaintTracking::ConfigSig. The DataFlow::ConfigSig interface is used for both data flow and taint tracking.
For example:
class MyConfig extends TaintTracking::Configuration {
  MyConfig() { this = "MyConfig" }

  override predicate isSanitizer(DataFlow::Node node) { ... }

  override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) { ... }
  ...
}
The above configuration can be migrated to the shared data flow library as follows:
module MyConfig implements DataFlow::ConfigSig {
  predicate isBarrier(DataFlow::Node node) { ... }

  predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) { ... }
  ...
}

module MyFlow = TaintTracking::Global<MyConfig>;
Flow labels and flow states
The DataFlow::FlowLabel class has been deprecated. Queries that relied on flow labels should use the new flow state concept instead.
This is done by implementing DataFlow::StateConfigSig instead of DataFlow::ConfigSig, and passing the module to DataFlow::GlobalWithState or TaintTracking::GlobalWithState. See using flow state for more details about flow state.
Some changes to be aware of:
- The 4-argument version of isAdditionalFlowStep now takes parameters in a different order. It now takes node1, state1, node2, state2 instead of node1, node2, state1, state2.
- Taint steps apply to all flow states, not just the taint flow label. See more details further down in this article.
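The new parameter order can be illustrated with a minimal sketch of a flow-state configuration. The module names MyStateConfig and MyStateFlow, the state names "raw" and "parsed", and the JSON.parse step are hypothetical and chosen only for illustration; isSource and isSink are left as none() placeholders that a real query would fill in.

```ql
import javascript

module MyStateConfig implements DataFlow::StateConfigSig {
  // Hypothetical flow states; a real query defines whatever states it needs.
  class FlowState extends string {
    FlowState() { this = ["raw", "parsed"] }
  }

  // Placeholder: no sources are defined in this sketch.
  predicate isSource(DataFlow::Node node, FlowState state) { none() }

  // Placeholder: no sinks are defined in this sketch.
  predicate isSink(DataFlow::Node node, FlowState state) { none() }

  // Note the order: node1, state1, node2, state2.
  predicate isAdditionalFlowStep(
    DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2
  ) {
    // Illustrative step: JSON.parse(x) moves the value from "raw" to "parsed".
    exists(DataFlow::CallNode call |
      call = DataFlow::globalVarRef("JSON").getAMemberCall("parse") and
      node1 = call.getArgument(0) and
      node2 = call
    ) and
    state1 = "raw" and
    state2 = "parsed"
  }
}

module MyStateFlow = DataFlow::GlobalWithState<MyStateConfig>;
```

When porting an old 4-argument isAdditionalFlowStep, the body can usually stay as it is; only the parameter list needs reordering so that each state follows its node.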
Barrier guards
The predicates isBarrierGuard and isSanitizerGuard have been removed.
Instead, the isBarrier predicate must be used to define all barriers. To do this, barrier guards can be reduced to a set of barrier nodes using the DataFlow::MakeBarrierGuard module.
For example, consider this data flow configuration using a barrier guard:
class MyConfig extends DataFlow::Configuration {
  override predicate isBarrierGuard(DataFlow::BarrierGuardNode node) {
    node instanceof MyBarrierGuard
  }
  ...
}

class MyBarrierGuard extends DataFlow::BarrierGuardNode {
  MyBarrierGuard() { ... }

  override predicate blocks(Expr e, boolean outcome) { ... }
}
This can be migrated to the shared data flow library as follows:
module MyConfig implements DataFlow::ConfigSig {
  predicate isBarrier(DataFlow::Node node) {
    node = DataFlow::MakeBarrierGuard<MyBarrierGuard>::getABarrierNode()
  }
  ...
}

class MyBarrierGuard extends DataFlow::Node {
  MyBarrierGuard() { ... }

  predicate blocksExpr(Expr e, boolean outcome) { ... }
}
The changes can be summarized as:
- The contents of isBarrierGuard have been moved to isBarrier.
- The node instanceof MyBarrierGuard check was replaced with node = DataFlow::MakeBarrierGuard<MyBarrierGuard>::getABarrierNode().
- The MyBarrierGuard class no longer has DataFlow::BarrierGuardNode as a base class. We simply use DataFlow::Node instead.
- The blocks predicate has been renamed to blocksExpr and no longer has the override keyword.
See using flow state for examples of how to use barrier guards with flow state.
Query-specific load and store steps
The predicates isAdditionalLoadStep, isAdditionalStoreStep, and isAdditionalLoadStoreStep have been removed. There is no way to emulate the original behavior.
Library models can still contribute such steps, but they will be applicable to all queries. Also see the section on jump steps further down.
Changes in behavior
When the query has been migrated to the new interface, it may seem to behave differently due to some technical differences in the internals of the two data flow libraries. The most significant changes are described below.
Taint steps now propagate all flow states
There’s an important change from the old data flow library when using flow state and taint-tracking together.
When using TaintTracking::GlobalWithState, all flow states can propagate along taint steps.
In the old data flow library, only the taint flow label could propagate along taint steps.
A straightforward translation of such a query may therefore result in new flow paths being found, which might be unexpected.
To emulate the old behavior, use DataFlow::GlobalWithState instead of TaintTracking::GlobalWithState, and manually add taint steps using isAdditionalFlowStep. The predicate TaintTracking::defaultTaintStep can be used to access the set of taint steps.
For example:
module MyConfig implements DataFlow::StateConfigSig {
  class FlowState extends string {
    FlowState() { this = ["taint", "foo"] }
  }

  predicate isAdditionalFlowStep(
    DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2
  ) {
    // Allow taint steps to propagate the "taint" flow state only
    TaintTracking::defaultTaintStep(node1, node2) and
    state1 = "taint" and
    state2 = state1
  }
  ...
}

module MyFlow = DataFlow::GlobalWithState<MyConfig>;
Jump steps across function boundaries
When a flow step crosses a function boundary, that is, it starts and ends in two different functions, it will now be classified as a “jump” step.
Jump steps can be problematic in some cases. Roughly speaking, the data flow library will “forget” which call site it came from when following a jump step. This can lead to spurious flow paths that go into a function through one call site, and back out of a different call site.
If the step was generated by a library model, that is, the step is applicable to all queries, this is best mitigated by converting the step to a flow summary.
For example, the following library model adds a taint step from x to y in foo.bar(x, y => {}):
class MyStep extends TaintTracking::SharedTaintStep {
  override predicate step(DataFlow::Node node1, DataFlow::Node node2) {
    exists(DataFlow::CallNode call |
      call = DataFlow::moduleMember("foo", "bar").getACall() and
      node1 = call.getArgument(0) and
      node2 = call.getCallback(1).getParameter(0)
    )
  }
}
Because this step crosses a function boundary, it becomes a jump step. This can be avoided by converting it to a flow summary as follows:
class MySummary extends DataFlow::SummarizedCallable {
  MySummary() { this = "MySummary" }

  override DataFlow::CallNode getACall() { result = DataFlow::moduleMember("foo", "bar").getACall() }

  override predicate propagatesFlow(string input, string output, boolean preservesValue) {
    input = "Argument[0]" and // x, the first argument
    output = "Argument[1].Parameter[0]" and // y, the first parameter of the callback
    preservesValue = false // taint step
  }
}
See customizing library models for JavaScript for details about the format of the input and output strings. The aforementioned article also provides guidance on how to store the flow summary in a data extension.
For query-specific steps that cross function boundaries, that is, steps added with isAdditionalFlowStep, there is currently no way to emulate the original behavior.
A possible workaround is to convert the query-specific step to a flow summary. In this case it should be stored in a data extension to avoid performance issues, although this also means that all other queries will be able to use the flow summary.
Barriers block all flows
In the shared data flow library, a barrier blocks all flows, even if the tracked value is inside a content.
In the old data flow library, only barriers specific to the data flow label blocked flows when the tracked value was inside a content.
This rarely has significant impact, but some users may observe some result changes because of this.
There is currently no way to emulate the original behavior.
Further reading
Analyzing data flow in JavaScript and TypeScript provides a general guide to the new data flow library.
Using flow state for precise data flow analysis provides a general guide on using flow state.