Overflow-prone comparisons in Java and Kotlin¶
You can use CodeQL to check for comparisons in Java/Kotlin code where one side of the comparison is prone to overflow.
About this article¶
In this tutorial article you’ll write a query for finding comparisons between integers and long integers in loops that may lead to non-termination due to overflow.
To begin, consider this code snippet:
void foo(long l) {
for(int i=0; i<l; i++) {
// do something
}
}
If l
is bigger than 231- 1 (the largest positive value of type int
), then this loop will never terminate: i
will start at zero, being incremented all the way up to 231- 1, which is still smaller than l
. When it is incremented once more, an arithmetic overflow occurs, and i
becomes -231, which also is smaller than l
! Eventually, i
will reach zero again, and the cycle repeats.
More about overflow
All primitive numeric types have a maximum value, beyond which they will wrap around to their lowest possible value (called an “overflow”). For
int
, this maximum value is 231- 1. Typelong
can accommodate larger values up to a maximum of 263- 1. In this example, this means thatl
can take on a value that is higher than the maximum for typeint
;i
will never be able to reach this value, instead overflowing and returning to a low value.
We’re going to develop a query that finds code that looks like it might exhibit this kind of behavior. We’ll be using several of the standard library classes for representing statements and functions. For a full list, see “Abstract syntax tree classes for working with Java programs.”
Initial query¶
We’ll start by writing a query that finds less-than expressions (CodeQL class LTExpr
) where the left operand is of type int
and the right operand is of type long
:
import java
from LTExpr expr
where expr.getLeftOperand().getType().hasName("int") and
expr.getRightOperand().getType().hasName("long")
select expr
This query usually finds results on most codebases.
Notice that we use the predicate getType
(available on all subclasses of Expr
) to determine the type of the operands. Types, in turn, define the hasName
predicate, which allows us to identify the primitive types int
and long
. As it stands, this query finds all less-than expressions comparing int
and long
, but in fact we are only interested in comparisons that are part of a loop condition. Also, we want to filter out comparisons where either operand is constant, since these are less likely to be real bugs. The revised query looks like this:
import java
from LTExpr expr
where expr.getLeftOperand().getType().hasName("int") and
expr.getRightOperand().getType().hasName("long") and
exists(LoopStmt l | l.getCondition().getAChildExpr*() = expr) and
not expr.getAnOperand().isCompileTimeConstant()
select expr
Notice that fewer results are found.
The class LoopStmt
is a common superclass of all loops, including, in particular, for
loops as in our example above. While different kinds of loops have different syntax, they all have a loop condition, which can be accessed through predicate getCondition
. We use the reflexive transitive closure operator *
applied to the getAChildExpr
predicate to express the requirement that expr
should be nested inside the loop condition. In particular, it can be the loop condition itself.
The final conjunct in the where
clause takes advantage of the fact that predicates can return more than one value (they are really relations). In particular, getAnOperand
may return either operand of expr
, so expr.getAnOperand().isCompileTimeConstant()
holds if at least one of the operands is constant. Negating this condition means that the query will only find expressions where neither of the operands is constant.
Generalizing the query¶
Of course, comparisons between int
and long
are not the only problematic case: any less-than comparison between a narrower and a wider type is potentially suspect, and less-than-or-equals, greater-than, and greater-than-or-equals comparisons are just as problematic as less-than comparisons.
In order to compare the ranges of types, we define a predicate that returns the width (in bits) of a given integral type:
int width(PrimitiveType pt) {
(pt.hasName("byte") and result=8) or
(pt.hasName("short") and result=16) or
(pt.hasName("char") and result=16) or
(pt.hasName("int") and result=32) or
(pt.hasName("long") and result=64)
}
We now want to generalize our query to apply to any comparison where the width of the type on the smaller end of the comparison is less than the width of the type on the greater end. Let’s call such a comparison overflow prone, and introduce an abstract class to model it:
abstract class OverflowProneComparison extends ComparisonExpr {
Expr getLesserOperand() { none() }
Expr getGreaterOperand() { none() }
}
There are two concrete child classes of this class: one for <=
or <
comparisons, and one for >=
or >
comparisons. In both cases, we implement the constructor in such a way that it only matches the expressions we want:
class LTOverflowProneComparison extends OverflowProneComparison {
LTOverflowProneComparison() {
(this instanceof LEExpr or this instanceof LTExpr) and
width(this.getLeftOperand().getType()) < width(this.getRightOperand().getType())
}
}
class GTOverflowProneComparison extends OverflowProneComparison {
GTOverflowProneComparison() {
(this instanceof GEExpr or this instanceof GTExpr) and
width(this.getRightOperand().getType()) < width(this.getLeftOperand().getType())
}
}
Now we rewrite our query to make use of these new classes:
import java
// Return the width (in bits) of a given integral type
int width(PrimitiveType pt) {
(pt.hasName("byte") and result=8) or
(pt.hasName("short") and result=16) or
(pt.hasName("char") and result=16) or
(pt.hasName("int") and result=32) or
(pt.hasName("long") and result=64)
}
// Find any comparison where the width of the type on the smaller end of
// the comparison is less than the width of the type on the greater end
abstract class OverflowProneComparison extends ComparisonExpr {
Expr getLesserOperand() { none() }
Expr getGreaterOperand() { none() }
}
// Return `<=` and `<` comparisons
class LTOverflowProneComparison extends OverflowProneComparison {
LTOverflowProneComparison() {
(this instanceof LEExpr or this instanceof LTExpr) and
width(this.getLeftOperand().getType()) < width(this.getRightOperand().getType())
}
}
// Return `>=` and `>` comparisons
class GTOverflowProneComparison extends OverflowProneComparison {
GTOverflowProneComparison() {
(this instanceof GEExpr or this instanceof GTExpr) and
width(this.getRightOperand().getType()) < width(this.getLeftOperand().getType())
}
}
from OverflowProneComparison expr
where exists(LoopStmt l | l.getCondition().getAChildExpr*() = expr) and
not expr.getAnOperand().isCompileTimeConstant()
select expr