Module PrintIR
Outputs a representation of the IR as a control flow graph.
This file contains the actual implementation of PrintIR.ql
. For test cases and very small
databases, PrintIR.ql
can be run directly to dump the IR for the entire database. For most
uses, however, it is better to write a query that imports PrintIR.qll
, extends
PrintIRConfiguration
, and overrides shouldPrintDeclaration()
to select a subset of declarations
to dump.
Anatomy of a printed IR instruction
An instruction:
# 2281| v2281_19(void) = Call[~String] : func:r2281_18, this:r2281_17
The prefix # 2281|
specifies that this instruction was generated by the C++ source code on line 2281.
Scrolling up in the printed output, one will eventually find the name of the file to which the line
belongs.
v2281_19(void)
is the result of the instruction. Here, v
means this is a void result or operand (so
there should be no later uses of the result; see below for other possible values). The 2281_19
is a
unique ID for the result. This is usually just the line number plus a small integer suffix to make it
unique within the function. The type of the result is void
. In this case, it is void
, because
~String
returns void
. The type of the result is usually just the name of the appropriate C++ type,
but it will sometimes be a type like glval<int>
, which means result holds a glvalue, which at the
IR level works like a pointer. In other words, in the source code the type was int
, but it is really
more like an int*
. We see this, for example, in x = y;
, where x
is a glvalue.
Call
is the opcode of the instruction. Common opcodes include:
- Arithmetic operations:
Add
,Sub
,Mul
, etc. - Memory access operations:
Load
,Store
. - Function calls:
Call
. - Literals:
Constant
. - Variable addresses:
VariableAddress
. - Function entry points:
EnterFunction
. - Return from a function:
Return
,ReturnVoid
. Note that the value being returned is set separately by aStore
to a special#return
variable. - Stack unwinding for C++ function that throw and where the exception escapes the function:
Unwind
. - Common exit point for
Unwind
andReturn
:ExitFunction
. - SSA-related opcodes:
Phi
,Chi
.
[~String]
denotes additional information. The information might be present earlier in the IR, as is the case
for Call
, where it is the name of the called function. This is also the case for Load
and Store
, where it
is the name of the variable that loaded or stored (if known). In the case of Constant
, FieldAddress
, and
VariableAddress
, the information between brackets does not occur earlier.
func:r2281_18
and this:r28281_17
are the operands of the instruction. The func:
prefix denotes the operand
that holds the address of the called function. The this:
prefix denotes the argument to the special this
parameter of an instance member function. r2281_18
, r2281_17
are the unique IDs of the operands. Each of these
matches the ID of a previously seen result, showing where that value came from. The r
means that these are
“register” operands (see below).
Result and operand kinds:
Every result and operand is one of these three kinds:
r
“register”. These operands are not stored in any particular memory location. We can think of them as temporary values created during the evaluation of an expression. A register operand almost always has one use, often in the same block as its definition.m
“memory”. These operands represents accesses to a specific memory location. The location could be a local variable, a global variable, a field of an object, an element of an array, or any memory that we happen to have a pointer to. These only occur as the result of aStore
, the source operand of aLoad
or on the SSA instructions (Phi
,Chi
).v
“void”. Really just a register operand, but we mark register operands of type void with this special prefix so we know that there is no actual value there.
Branches in the IR:
The IR is divided into basic blocks. At the end of each block, there are one or more edges showing the possible control flow successors of the block.
# 44| v44_3(void) = ConditionalBranch : r44_2
#-----| False -> Block 4
#-----| True -> Block 3
Here we have a block that ends with a conditional branch. The two edges show where the control flows to depending on whether the condition is true or false.
SSA instructions:
We use Phi
instructions in SSA to create a single definition for a variable that might be assigned on multiple
control flow paths. The Phi
instruction merges the potential values of that variable from each predecessor edge,
and the resulting definition is then used wherever that variable is accessed later on.
When dealing with aliased memory, we use the Chi
instruction to create a single definition for memory that might
or might not have been updated by a store, depending on the actual address that was written to. For example, take:
int x = 5;
int y = 7;
int* p = condition ? &x : &y;
*p = 6;
return x;
At the point where we store to *p
, we do not know whether p
points to x
or y
. Thus, we do not know whether
return x;
is going to return the value that x
was originally initialized to (5), or whether it will return 6,
because it was overwritten by *p = 6;
. We insert a Chi
instruction immediately after the store to *p
:
r2(int) = Constant[6]
r3(int*) = <<value of p>>
m4(int) = Store : &r3, r2 // Stores the constant 6 to *p
m5(unknown) = Chi : total:m1, partial:m4
The partial:
operand represents the memory that was just stored. The total:
operand represents the previous
contents of all of the memory that p
might have pointed to (in this case, both x
and y
). The result of the
Chi
represents the new contents of whatever memory the total:
operand referred to. We usually do not know exactly
which parts of that memory were overwritten, but it does model that any of that memory could have been modified, so
that later instructions do not assume that the memory was unchanged.
Import path
import semmle.code.cpp.ir.implementation.aliased_ssa.PrintIR
Imports
IRConfiguration | Module used to configure the IR generation process. |
Predicates
Classes
PrintIRConfiguration | The query can extend this class to control which declarations are printed. |