Module InvalidPointerToDereference
This file provides the second phase of the cpp/invalid-pointer-deref query that identifies flow from the out-of-bounds pointer identified by the AllocationToInvalidPointer.qll library to a dereference of the out-of-bounds pointer.
Consider the following snippet:
1. char* base = (char*)malloc(size);
2. char* end = base + size;
3. for(char *p = base; p <= end; p++) {
4. use(*p); // BUG: Should have been bounded by `p < end`.
5. }
This file identifies the flow from base + size to end. We call base + size the “dereference source” and end the “dereference sink”. Even though end is not actually dereferenced, we use this term because we perform dataflow to find a use of a pointer x such that x <= end which is dereferenced. In the above example, x is p on line 4.
Merely constructing a pointer that’s out-of-bounds is fine if the pointer is never dereferenced (in reality, the standard only guarantees that it is safe to move the pointer one element past the last element, but we ignore that here). So this step is about identifying which of the out-of-bounds pointers found by pointerAddInstructionHasBounds in AllocationToInvalidPointer.qll are actually being dereferenced. We do this using a regular dataflow configuration (see InvalidPointerToDerefConfig).
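As a point of comparison, the following is a minimal sketch of the loop from the snippet above with the bound the BUG comment asks for. The helper name count_in_bounds_derefs is hypothetical and is not part of the library; calloc is used instead of malloc only so that the bytes read are initialized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not part of the CodeQL library): walks a buffer of
// `size` bytes and returns how many elements were dereferenced.
std::size_t count_in_bounds_derefs(std::size_t size) {
    char* base = static_cast<char*>(std::calloc(size, 1));
    if (base == nullptr) {
        return 0;  // allocation failed (or size was 0): nothing to read
    }
    char* end = base + size;  // one past the end: fine to construct, never dereferenced
    std::size_t derefs = 0;
    for (char* p = base; p < end; p++) {  // strict bound, so p never equals end in the body
        assert(p < end);
        volatile char c = *p;  // in-bounds read
        (void)c;
        ++derefs;
    }
    std::free(base);
    return derefs;
}
```

Here end is still computed by the same pointer arithmetic, but no pointer x with x == end ever reaches a dereference, which is the distinction this phase of the query is looking for.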
The dataflow traversal defines the set of sources as any dataflow node n such that there exists a pointer-arithmetic instruction pai found by AllocationToInvalidPointer.qll such that n.asInstruction() = pai.
The set of sinks is defined as any dataflow node n such that addr <= n.asInstruction() + deltaDerefSinkAndDerefAddress for some address operand addr and constant difference deltaDerefSinkAndDerefAddress. Since an address operand is always consumed by an instruction that performs a dereference, this lets us identify a “bad dereference”. We call the instruction that consumes the address operand the “operation”.
For example, consider the flow from base + size to end above. The sink is end on line 3 because p <= end.asInstruction() + deltaDerefSinkAndDerefAddress, where p is the address operand in use(*p) and deltaDerefSinkAndDerefAddress >= 0. The load attached to *p is the “operation”. To ensure that the path makes intuitive sense, we only pick operations that are control-flow reachable from the dereference sink.
We use the deltaDerefSinkAndDerefAddress to compute how many elements the dereference is beyond the end position of the allocation. This is done in the operationIsOffBy predicate (which is the only predicate exposed by this file).
Handling false positives:
Consider the following snippet:
1. char *p = new char[size];
2. char *end = p + size;
3. if (p < end) {
4. p += 1;
5. }
6. if (p < end) {
7. int val = *p; // GOOD
8. }
This is safe because p is guarded to be strictly less than end on line 6 before the dereference on line 7. However, if we run the query on the above without further modifications, we would see an alert on line 7. This is because range analysis infers that p <= end after the increment on line 4, and thus the result of p += 1 is seen as a valid dereference source. This node then flows to p on line 6 (which is a valid dereference sink since it non-strictly upper bounds an address operand), and range analysis then infers that the address operand of *p (i.e., p) is non-strictly upper bounded by p, and thus reports an alert on line 7.
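To make the safety claim concrete, here is a minimal runnable sketch of the same snippet wrapped in a function. The name guarded_read and the returned flag are hypothetical additions for illustration; the extra base pointer exists only so the buffer can be released.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper around the snippet above (not part of the library).
// Returns true if the guarded dereference (line 7 of the snippet) executed.
bool guarded_read(std::size_t size) {
    char* p = new char[size]();  // value-initialized so the read is well defined
    char* const base = p;        // kept only so the buffer can be freed
    char* end = p + size;
    if (p < end) {
        p += 1;                  // p may now be equal to end
    }
    bool did_read = false;
    if (p < end) {               // the guard excludes the p == end case
        char val = *p;           // GOOD: p is strictly below end here
        (void)val;
        did_read = true;
    }
    delete[] base;
    return did_read;
}
```

With size equal to 1, the increment makes p equal to end, and the second guard skips the dereference. That is the only case in which p could be out of bounds, so an alert on the dereference would be a false positive.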
To handle the above false positive, we define a barrier that identifies guards such as p < end that ensure that a value is less than the pointer-arithmetic instruction that computed the invalid pointer. This is done in the InvalidPointerToDerefBarrier module. Since the node we are tracking is not necessarily equal to the pointer-arithmetic instruction, but rather satisfies node.asInstruction() <= pai + deltaDerefSourceAndPai, we need to account for the delta when checking if a guard is sufficiently strong to infer that a future dereference is safe. To do this, we check that the guard guarantees that a node n satisfies n < node + k where node is a node such that node <= pai. Thus, we know that any node m such that m <= n + delta where delta + k <= 0 will be safe because:
    m <= n + delta
      <  node + k + delta
      <= pai + k + delta
      <= pai
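The chain above can be sanity-checked by brute force. The sketch below is an illustration only: it models the values as plain integers rather than pointers, fixes pai at 0 (the inequalities are translation invariant), and uses a hypothetical function name, guard_implies_safe, that does not exist in the library. Because one step in the chain is strict, the property checked is m < pai.

```cpp
#include <cassert>

// Hypothetical check (integers stand in for pointer values).
// Searches [-range, range] for values satisfying
//   node <= pai,  n < node + k,  m <= n + delta,  delta + k <= max_delta_plus_k
// and returns false as soon as one of them has m >= pai.
bool guard_implies_safe(int range, int max_delta_plus_k) {
    assert(range >= 0);
    const int pai = 0;
    for (int node = -range; node <= pai; ++node) {             // node <= pai
        for (int k = -range; k <= range; ++k) {
            for (int delta = -range; delta <= range; ++delta) {
                if (delta + k > max_delta_plus_k) continue;    // side condition on delta + k
                for (int n = -range; n <= range; ++n) {
                    if (!(n < node + k)) continue;             // what the guard guarantees
                    for (int m = -range; m <= range; ++m) {
                        if (!(m <= n + delta)) continue;       // m is bounded by n + delta
                        if (m >= pai) return false;            // counterexample: m not below pai
                    }
                }
            }
        }
    }
    return true;
}
```

Calling guard_implies_safe(3, 0) finds no counterexample, which matches the derivation. Relaxing the side condition to delta + k <= 1 does admit one (for example node = 0, k = 0, delta = 1, n = -1, m = 0), which suggests why the barrier requires delta + k <= 0.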
Import path
import semmle.code.cpp.security.InvalidPointerDereference.InvalidPointerToDereference
Predicates
invalidPointerToDereferenceFieldFlowBranchLimit | Gets the virtual dispatch branching limit when calculating field flow while searching for flow from an out-of-bounds pointer to a dereference of the pointer. |
operationIsOffBy | Holds if |