CodeQL library for Python
codeql/python-all 1.0.6 (changelog, source)

Predicate getCallArg

Gets the argument arg of call at position apos, if any. Requires that we can resolve call to target with CallType type.

It might seem that knowing the CallType is enough to resolve arguments. We also need the target in order to avoid cross-talk. In the example below, assuming that Foo and Bar each define their own meth method, we might otherwise end up passing both foo and bar to both Foo.meth and Bar.meth, which is wrong. Since both attribute accesses use the same name, we must also distinguish on the resolved target to know which of the two objects to pass as the self argument.

foo = Foo()
bar = Bar()
if cond:
    func = foo.meth
else:
    func = bar.meth
func(42)

Note: If Bar.meth and Foo.meth resolve to the same function, we will end up sending both self arguments to that function, which is by definition the right thing to do.
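
The runtime behaviour that the analysis has to mirror can be checked with a small, self-contained sketch. The bodies of Foo.meth and Bar.meth and the call_via_alias helper are assumptions added for illustration; they are not part of the library. Each bound method carries its own receiver, so Foo.meth only ever sees foo and Bar.meth only ever sees bar.

```python
class Foo:
    def meth(self, x):
        # Report which function ran and which object arrived as self.
        return ("Foo.meth", type(self).__name__, x)


class Bar:
    def meth(self, x):
        return ("Bar.meth", type(self).__name__, x)


def call_via_alias(cond):
    # Hypothetical wrapper around the snippet above, so both branches can be run.
    foo = Foo()
    bar = Bar()
    if cond:
        func = foo.meth
    else:
        func = bar.meth
    return func(42)


result_true = call_via_alias(True)
result_false = call_via_alias(False)
print(result_true)
print(result_false)
```

A resolution keyed only on the call type would pair either receiver with either function; keying on the resolved target as well rules out the mixed pairs, which never occur at runtime.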

Bound methods

For bound methods, such as bm = x.m; bm(), it’s a little unclear whether we should still use the object in the attribute lookup (x.m) as the self argument in the call (bm()). We currently do this, but there might also be cases where we don’t want to do this.

In the example below, we want to clear taint from the list before it reaches the sink, but because there is no use of l in the clear() call, we currently have no way to achieve this. (Note that this is a contrived example.)

l = list()
clear = l.clear
l.append(tainted)
clear()
sink(l)
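
As a sanity check on the intended semantics, the sketch below runs the same statements and returns the list in place of calling sink. The run_example helper and the placeholder tainted value are assumptions for illustration only. At runtime the list is empty when it would reach the sink, so treating clear() as removing taint from l matches the actual behaviour.

```python
def run_example(tainted):
    # Same statements as the example above, returning l instead of calling sink(l).
    l = list()
    clear = l.clear      # bound method captured before the append
    l.append(tainted)
    clear()              # no syntactic use of l here, yet l is emptied
    return l


result = run_example("tainted value")
print(result)
```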

To make matters worse, bound methods have a __self__ attribute that refers to the object the method is bound to, so we can rewrite the code as:

l = list()
clear = l.clear
clear.__self__.append(tainted)
clear()
sink(l)

One idea to solve this is to track the object in a synthetic data-flow node every time the bound method is used, such that the clear() call would essentially be translated into l.clear(), and we can still have use-use flow.
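
A runtime analogue of that translation, as a rough sketch only: the explicit_call helper below is hypothetical and is not how the library implements anything. It looks the method up again on the object stored in __self__, so that the receiver is mentioned explicitly at the call site. It assumes the method is still reachable on the receiver under its __name__.

```python
def explicit_call(bound_method, *args, **kwargs):
    # Hypothetical helper: turn bm(...) into receiver.name(...).
    receiver = bound_method.__self__   # object the method is bound to
    name = bound_method.__name__       # e.g. "clear"
    return getattr(receiver, name)(*args, **kwargs)


l = list()
clear = l.clear
clear.__self__.append("tainted value")   # same object as l
same_object = clear.__self__ is l
explicit_call(clear)                     # behaves like l.clear()
print(same_object, l)
```

In the data-flow setting the receiver would be a synthetic node rather than a Python object, but the shape of the rewrite is the same: the call regains an explicit use of l.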

Import path

import semmle.python.dataflow.new.internal.DataFlowDispatch
predicate getCallArg(CallNode call, Function target, CallType type, Node arg, ArgumentPosition apos)