Module InvalidPointerToDereference
This file provides the second phase of the cpp/invalid-pointer-deref query that identifies flow
from the out-of-bounds pointer identified by the AllocationToInvalidPointer.qll library to
a dereference of the out-of-bounds pointer.
Consider the following snippet:
1. char* base = (char*)malloc(size);
2. char* end = base + size;
3. for(char *p = base; p <= end; p++) {
4. use(*p); // BUG: Should have been bounded by `p < end`.
5. }
This file identifies the flow from base + size to end. We call base + size the “dereference source” and end
the “dereference sink”. Although end itself is not dereferenced, we use this term because we perform
dataflow to find a use of a pointer x such that x <= end and x is dereferenced. In the above example, x is p
on line 4.
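To make the off-by-one in the snippet concrete, the following sketch models the loop with indices instead of pointers, so nothing is dereferenced and it is safe to run. The helper name `outOfBoundsIterations` and its `inclusiveBound` flag are hypothetical and exist only for this illustration; they are not part of the CodeQL library.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper (illustration only): counts the loop iterations whose
// index falls outside an allocation of `size` elements.
// inclusiveBound == true  models `p <= end` (the buggy bound).
// inclusiveBound == false models `p < end`  (the intended bound).
std::size_t outOfBoundsIterations(std::size_t size, bool inclusiveBound) {
    // With the inclusive bound, `i <= size` would never become false if
    // `size` were the largest representable value.
    assert(size < std::numeric_limits<std::size_t>::max());

    std::size_t bad = 0;
    for (std::size_t i = 0; inclusiveBound ? i <= size : i < size; i++) {
        if (i >= size) {
            bad++;  // index `size` corresponds to dereferencing `end`
        }
    }
    return bad;
}
```

Under this model, the inclusive bound yields exactly one out-of-bounds iteration (the one where p equals end), and the strict bound yields none.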
Merely constructing a pointer that’s out-of-bounds is fine if the pointer is never dereferenced (in reality, the
standard only guarantees that it is safe to move the pointer one element past the last element, but we ignore that
here). So this step is about identifying which of the out-of-bounds pointers found by pointerAddInstructionHasBounds
in AllocationToInvalidPointer.qll are actually being dereferenced. We do this using a regular dataflow
configuration (see InvalidPointerToDerefConfig).
The dataflow traversal defines the set of sources as any dataflow node n such that there exists a pointer-arithmetic
instruction pai found by AllocationToInvalidPointer.qll such that n.asInstruction() = pai.
The set of sinks is defined as any dataflow node n such that addr <= n.asInstruction() + deltaDerefSinkAndDerefAddress
for some address operand addr and constant difference deltaDerefSinkAndDerefAddress. Since an address operand is
always consumed by an instruction that performs a dereference, this lets us identify a “bad dereference”. We call the
instruction that consumes the address operand the “operation”.
For example, consider the flow from base + size to end above. The sink is end on line 3 because
p <= end.asInstruction() + deltaDerefSinkAndDerefAddress, where p is the address operand in use(*p) and
deltaDerefSinkAndDerefAddress >= 0. The load attached to *p is the “operation”. To ensure that the path makes
intuitive sense, we only pick operations that are control-flow reachable from the dereference sink.
We use the deltaDerefSinkAndDerefAddress to compute how many elements the dereference is beyond the end position of
the allocation. This is done in the operationIsOffBy predicate (which is the only predicate exposed by this file).
Handling false positives:
Consider the following snippet:
1. char *p = new char[size];
2. char *end = p + size;
3. if (p < end) {
4. p += 1;
5. }
6. if (p < end) {
7. int val = *p; // GOOD
8. }
This is safe because p is guarded to be strictly less than end on line 6 before the dereference on line 7. However, if we
run the query on the above without further modifications, we would see an alert on line 7. This is because range analysis infers
that p <= end after the increment on line 4, and thus the result of p += 1 is seen as a valid dereference source. This
node then flows to p on line 6 (which is a valid dereference sink since it non-strictly upper bounds an address operand), and
range analysis then infers that the address operand of *p (i.e., p) is non-strictly upper bounded by p, and thus reports
an alert on line 7.
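As a sanity check that the snippet above really is safe at run time, the sketch below wraps it in a function that reports where the dereference on line 7 happens, if it happens at all. The function name `guardedDerefIndex` and its return convention (-1 when the guard fails) are assumptions made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper (illustration only) around the guarded snippet.
// Returns the index at which `*p` is read, or -1 if the second guard fails.
std::ptrdiff_t guardedDerefIndex(std::size_t size) {
    char* base = new char[size]();  // value-initialized so the read is well-defined
    char* p = base;
    char* end = p + size;
    std::ptrdiff_t index = -1;

    if (p < end) {
        p += 1;  // p may now equal end, but it is not dereferenced here
    }
    if (p < end) {
        assert(base <= p && p < end);  // the guard keeps p strictly in bounds
        char val = *p;                 // GOOD
        (void)val;
        index = p - base;
    }

    delete[] base;
    return index;
}
```

For sizes 0 and 1 the second guard fails and nothing is dereferenced; for larger sizes the read happens at index 1, which is inside the allocation.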
In order to handle the above false positive, we define a barrier that identifies guards such as p < end that ensure that a value
is less than the pointer-arithmetic instruction that computed the invalid pointer. This is done in the InvalidPointerToDerefBarrier
module. Since the node we are tracking is not necessarily equal to the pointer-arithmetic instruction, but rather satisfies
node.asInstruction() <= pai + deltaDerefSourceAndPai, we need to account for the delta when checking if a guard is sufficiently
strong to infer that a future dereference is safe. To do this, we check that the guard guarantees that a node n satisfies
n < node + k where node is a node such that node <= pai. Thus, we know that any node m such that m <= n + delta where
delta + k <= 0 will be safe because:
m <= n + delta
  <  node + k + delta
  <= pai + k + delta
  <= pai
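The chain of inequalities can be checked by brute force over a small integer range. The following is a minimal sketch, treating pointers as plain integers; the function name `countCounterexamples` and the `requireDeltaBound` flag are hypothetical and serve only to show that the condition delta + k <= 0 is what makes the argument go through.

```cpp
#include <cassert>

// Hypothetical checker (illustration only). Counts tuples in [lo, hi] where
//   m <= n + delta,  n < node + k,  node <= pai
// all hold but m < pai fails. When requireDeltaBound is true, only tuples
// with delta + k <= 0 are considered.
int countCounterexamples(int lo, int hi, bool requireDeltaBound) {
    assert(lo <= hi);
    int count = 0;
    for (int pai = lo; pai <= hi; pai++) {
        for (int node = lo; node <= pai; node++) {  // node <= pai
            for (int k = lo; k <= hi; k++) {
                for (int delta = lo; delta <= hi; delta++) {
                    if (requireDeltaBound && delta + k > 0) continue;
                    for (int n = lo; n <= hi; n++) {
                        if (!(n < node + k)) continue;  // guard: n < node + k
                        for (int m = lo; m <= hi; m++) {
                            if (m <= n + delta && !(m < pai)) count++;
                        }
                    }
                }
            }
        }
    }
    return count;
}
```

With the bound on delta + k enforced, no counterexample exists in the range, matching the derivation. Dropping the bound admits counterexamples (for instance pai = 0, node = 0, k = 1, delta = 1, n = 0, m = 1), which is why the barrier must account for the delta.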
Import path
import semmle.code.cpp.security.InvalidPointerDereference.InvalidPointerToDereferencePredicates
Predicates
| invalidPointerToDereferenceFieldFlowBranchLimit | Gets the virtual dispatch branching limit when calculating field flow while searching for flow from an out-of-bounds pointer to a dereference of the pointer. |
| operationIsOffBy | Holds if |