A node in the API graph, that is, a value that can be tracked interprocedurally.
The API graph is a graph for tracking values of certain types in a way that accounts for inheritance and interprocedural data flow.
API graphs are typically used to identify “API calls”, that is, calls to an external function whose implementation is not necessarily part of the current codebase.
Basic usage
The most basic use of API graphs is typically as follows:
- Start with API::getTopLevelMember for the relevant library.
- Follow up with a chain of accessors such as getMethod describing how to get to the relevant API function.
- Map the resulting API graph nodes to data-flow nodes, using asSource, asSink, or asCall.
The following examples demonstrate how to identify the expression x in various basic cases:
# API::getTopLevelMember("Foo").getMethod("bar").getArgument(0).asSink()
Foo.bar(x)
# API::getTopLevelMember("Foo").getMethod("bar").getKeywordArgument("foo").asSink()
Foo.bar(foo: x)
# API::getTopLevelMember("Foo").getInstance().getMethod("bar").getArgument(0).asSink()
Foo.new.bar(x)
Foo.bar do |x| # API::getTopLevelMember("Foo").getMethod("bar").getBlock().getParameter(0).asSource()
end
Data flow
The member predicates on this class generally take inheritance and data flow into account.
The following example demonstrates a case where data flow is used to find the sink x:
def doSomething f
  f.bar(x) # API::getTopLevelMember("Foo").getInstance().getMethod("bar").getArgument(0).asSink()
end
doSomething Foo.new
The call API::getTopLevelMember("Foo").getInstance() identifies the Foo.new call, and getMethod("bar") then follows data flow from there to find calls to bar where that object flows to the receiver.
This results in the f.bar call.
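The run-time flow that this search approximates can be observed by making the snippet runnable. The following is a minimal sketch, not the library's real code: the body of Foo#bar is a hypothetical placeholder, and :x stands in for the expression x.

```ruby
# Hypothetical stand-in for the external library class.
# Its real implementation is not part of the original example.
class Foo
  def bar(x)
    [:bar_called_with, x]
  end
end

# The Foo instance is created by the caller and reaches the
# receiver position of f.bar through the parameter f.
def doSomething(f)
  f.bar(:x)
end

# The object created by Foo.new here is the receiver of f.bar above.
doSomething(Foo.new)
```

The Foo.new call and the f.bar call are in different methods, so connecting them requires following the value through the parameter f, which is the interprocedural step that getMethod("bar") performs.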
Backward data flow
When inspecting the arguments of a call, the data flow direction is backwards.
The following example illustrates this when we match the x parameter of a block:
def doSomething &blk
  Foo.bar &blk
end
doSomething do |x| # API::getTopLevelMember("Foo").getMethod("bar").getBlock().getParameter(0).asSource()
end
When getParameter(0) is evaluated, the API graph backtracks the &blk argument to the block argument a few lines below. As a result, it eventually matches the x parameter of that block.
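The run-time behaviour being modelled here can be reproduced with a small runnable sketch. It assumes a hypothetical stand-in for Foo whose bar method yields a single value to its block; the original does not show the library's implementation, and run_example is a helper introduced only for illustration.

```ruby
# Hypothetical stand-in for the external library; assumed to yield one value.
class Foo
  def self.bar
    yield :value_from_library
  end
end

# Forwards its block to Foo.bar, as in the example above.
def doSomething(&blk)
  Foo.bar(&blk)
end

# Hypothetical helper: runs the example and returns what the block received.
def run_example
  received = nil
  doSomething do |x|
    # x is supplied by Foo.bar, although the block is written at this call site
    received = x
  end
  received
end
```

The block is written at the doSomething call site but is invoked by Foo.bar, so the value bound to x originates in the library. Starting from the block argument of Foo.bar and working backwards to the block literal is what lets getParameter(0) reach x.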
Inheritance
When a class or module object is tracked, inheritance is taken into account.
In the following example, a call to Foo.bar is found via a subclass of Foo, because classes inherit singleton methods from their base class:
class Subclass < Foo
  def self.doSomething
    bar(x) # API::getTopLevelMember("Foo").getMethod("bar").getArgument(0).asSink()
  end
end
Similarly, instance methods can be found in subclasses, or in ancestors of subclasses in cases of multiple inheritance:
module Mixin
  def doSomething
    bar(x) # API::getTopLevelMember("Foo").getInstance().getMethod("bar").getArgument(0).asSink()
  end
end
class Subclass < Foo
  include Mixin
end
The value of self in Mixin#doSomething is seen as a potential instance of Foo, and is thus found by getTopLevelMember("Foo").getInstance().
This eventually results in finding the call bar, due to its implicit self receiver, and finally its argument x is found as the sink.
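Both inheritance cases rest on ordinary Ruby method lookup, which the following runnable sketch makes visible. The bodies of Foo.bar and Foo#bar are hypothetical placeholders that only tag which method was reached, and :x stands in for the expression x.

```ruby
# Hypothetical stand-in for the external library class.
class Foo
  # Singleton (class-level) method.
  def self.bar(x)
    [:singleton_bar, x]
  end

  # Instance method.
  def bar(x)
    [:instance_bar, x]
  end
end

module Mixin
  def doSomething
    # Implicit self receiver: self is an instance of whichever class includes Mixin.
    bar(:x)
  end
end

class Subclass < Foo
  include Mixin

  def self.doSomething
    # Resolves to Foo.bar, because singleton methods are inherited.
    bar(:x)
  end
end
```

With these definitions, Subclass.doSomething returns [:singleton_bar, :x] and Subclass.new.doSomething returns [:instance_bar, :x]. In the second case the call is written inside Mixin, which has no relationship to Foo on its own; it reaches Foo#bar only because Subclass both inherits from Foo and includes Mixin.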
Backward data flow and classes
When inspecting the arguments of a call, and the value flowing into that argument is a user-defined class (or an instance thereof), uses of getMethod will find method definitions in that class (including inherited ones) rather than finding method calls.
This example illustrates how this can be used to model cases where the library calls a specific named method on a user-defined class:
class MyClass
  def doSomething
    x # API::getTopLevelMember("Foo").getMethod("bar").getArgument(0).getMethod("doSomething").getReturn().asSink()
  end
end
Foo.bar MyClass.new
When modeling an external library that is known to call a specific method on a parameter (in this case doSomething), this makes it possible to find the corresponding method definition in user code.
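The callback pattern described above can be sketched at run time as follows. The implementation of Foo.bar is an assumption made for illustration (a library that calls doSomething on its argument and uses the result), and :x stands in for the expression x.

```ruby
# Hypothetical stand-in for the external library: assumed to call
# doSomething on whatever object it is given and to use the result.
class Foo
  def self.bar(obj)
    obj.doSomething
  end
end

# User-defined class whose method is invoked by the library.
class MyClass
  def doSomething
    # The return value flows back into the library, which makes it a sink.
    :x
  end
end

Foo.bar(MyClass.new)
```

User code never calls doSomething itself, so there is no call site to match. The only place the value can be observed is the method definition, which is why getMethod("doSomething") followed by getReturn() yields a sink in this direction.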
Strict left-to-right evaluation
Most member predicates on this class are intended to be chained, and are always evaluated from left to right, which means the caller should restrict the initial set of values.
For example, in the following snippet, we always find the uses of Foo before finding calls to bar:
API::getTopLevelMember("Foo").getMethod("bar")
In particular, the implementation will never look for calls to bar and work backward from there.
Beware of using API graphs with an unrestricted receiver, which is an easy mistake to make:
API::Node barCall(API::Node base) {
  result = base.getMethod("bar") // Do not do this!
}
The above predicate does not restrict the receiver, and will thus perform an interprocedural data flow search starting at every node in the graph, which is very expensive.
Import path
import codeql.ruby.ApiGraphs
Direct supertypes
Known direct subtypes
Predicates
asCall | Gets the call referred to by this API node. |
asCallable | Gets a callable that can reach this sink. |
asModule | Gets a module or class referred to by this API node. |
asSink | Gets a data-flow node where this value potentially flows into an external library. |
asSource | Gets a data-flow node where this value enters the current codebase. |
getADescendentModule | Gets a module or class that descends from the module or class referenced by this API node. |
getAMember | Gets an access to a constant with this value as the base of the access. |
getAMethodCall | Gets a call to a method on the receiver represented by this API node. |
getAValueReachableFromSource | Gets a data-flow node where this value may flow interprocedurally. |
getAValueReachingSink | Gets a data-flow node that transitively flows to this value, provided that this value corresponds to a sink. |
getAnElement | Gets a representative for an arbitrary element of this collection. |
getAnInstantiation | Gets a |
getArgument | Gets the |
getArgumentAtPosition | Gets the argument passed in argument position |
getBlock | Gets the block argument to this call, or the block parameter of this callable. |
getBlockParameter | Gets the block parameter of a callable that can reach this sink. |
getContent | Gets a representative for the |
getContents | Gets a representative for the |
getField | Gets a representative for the instance field of the given |
getInducingNode | Gets the data-flow node that gives rise to this node, if any. |
getInstance | Gets a node that may refer to an instance of the module or class represented by this API node. |
getKeywordArgument | Gets the given keyword argument to this call. |
getKeywordParameter | Gets the given keyword parameter of this callable, or keyword argument to this call. |
getLocation | Gets the location of this node. |
getMember | Gets an access to the constant |
getMethod | Gets a call to |
getParameter | Gets the |
getParameterAtPosition | Gets the parameter at position |
getReturn | Gets the result of this call, or the return value of this callable. |
getReturn | Gets the result of a call to |
toString | Gets a textual representation of this element. |