CodeQL documentation

CodeQL 2.16.0 (2024-01-16)

This is an overview of changes in the CodeQL CLI and relevant CodeQL query and library packs. For additional updates on changes to the CodeQL code scanning experience, check out the code scanning section on the GitHub blog, relevant GitHub Changelog updates, changes in the CodeQL extension for Visual Studio Code, and the CodeQL Action changelog.

Security Coverage

CodeQL 2.16.0 runs a total of 405 security queries when configured with the Default suite (covering 160 CWEs). The Extended suite enables an additional 128 queries (covering 33 more CWEs). Four security queries have been added with this release.

CodeQL CLI

Potentially Breaking Changes

  • The Python extractor will no longer extract dependencies by default. See https://github.blog/changelog/2023-07-12-code-scanning-with-codeql-no-longer-installs-python-dependencies-automatically-for-new-users/ for more context. In versions until 2.17.0, it will be possible to restore the old behavior by setting CODEQL_EXTRACTOR_PYTHON_FORCE_ENABLE_LIBRARY_EXTRACTION_UNTIL_2_17_0=1.

  • The --ram option to codeql database run-queries and other commands that execute queries is now interpreted more strictly. Previously it was mostly a rough hint for how much memory to use, and the actual memory footprint of the CodeQL process could be hundreds of megabytes higher. From this release, CodeQL tries harder to keep its total memory consumption during evaluation below the given limit.

    The new behavior yields more predictable memory use, but since it works by allocating less RAM, it can lead to more use of disk storage for intermediate results compared to earlier releases with the same --ram value, and consequently a slight performance loss. In rare cases, for large databases, analysis may fail with a Java OutOfMemoryError.

    The cure for this is to increase --ram to be closer to the amount of memory actually available for CodeQL. As a rule of thumb, it will usually be possible to increase the value of --ram by 700 MB or more without actually using more resources than release 2.15.x would with the old setting. An exact amount cannot be stated, however, since the actual memory footprint in earlier releases depended on factors, such as the size of the databases, that were not fully taken into account.

    If you use the CodeQL Action, you do not need to do anything unless you have manually overridden the Action’s RAM setting. The Action will automatically select a --ram setting that matches the version of the CLI it uses.
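
As a rough illustration of the --ram rule of thumb above, the following Python sketch derives a candidate value from the one used with 2.15.x. The helper name, the fixed 700 MB headroom, and the cap at available memory are assumptions made for this example; they are not part of the CodeQL CLI, and as noted above an exact amount cannot be stated.

```python
# Hypothetical helper for illustration only; not part of the CodeQL CLI.
RULE_OF_THUMB_HEADROOM_MB = 700  # "700 MB or more", per the note above


def suggest_ram_mb(old_ram_mb: int, available_mb: int) -> int:
    """Suggest a --ram value (in MB) for CodeQL 2.16.0 and later.

    Adds the rule-of-thumb headroom to the value used with 2.15.x, but never
    suggests more than the memory actually available to CodeQL.
    """
    if old_ram_mb <= 0 or available_mb <= 0:
        raise ValueError("memory sizes must be positive")
    return min(old_ram_mb + RULE_OF_THUMB_HEADROOM_MB, available_mb)


if __name__ == "__main__":
    # A machine with 16 GB where 2.15.x was run with --ram=6000.
    print(f"--ram={suggest_ram_mb(6000, 16384)}")
```

The suggested number is only a starting point; whether it is appropriate depends on the database and on what else is running on the machine.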

New Features

  • Users specifying extra tracing configurations may now use the GetRegisteredMatchers(languageId) Lua function to retrieve the existing table of matchers registered to a given language.

Improvements

  • The Experimental flag has been removed from all packaging and related commands.
  • The RA pretty-printer omits names of internal RA nodes and pretty-prints binary unions with nested internal unions as n-ary unions. VS Code extension v1.11.0 or newer is required to compute join order badness metrics in VS Code for the new RA format.

Query Packs

Bug Fixes

Java

  • The three queries java/insufficient-key-size, java/server-side-template-injection, and java/android/implicit-pendingintents accidentally had overly general extension points that allowed arbitrary string-based flow state. This has been fixed: the old extension points have been deprecated where possible, and otherwise updated.

Minor Analysis Improvements

C/C++

  • The cpp/badly-bounded-write query could report false positives when a pointer was first initialized with a literal and later assigned a dynamically allocated array. These false positives no longer occur.

C#

  • Fixed a Log forging false positive when using String.Replace to sanitize the input.
  • Fixed a URL redirection from remote source false positive when guarding a redirect with HttpRequestBase.IsUrlLocalToHost().

Golang

  • There was a bug in the query go/incorrect-integer-conversion which meant that upper bound checks using a strict inequality (<) and comparing against math.MaxInt or math.MaxUint were not considered correctly, which led to false positives. This has now been fixed.

Java

  • Modified the java/potentially-weak-cryptographic-algorithm query to include the use of weak cryptographic algorithms from configuration values specified in properties files.
  • The query java/android/missing-certificate-pinning should no longer alert about requests pointing to the local filesystem.
  • Removed some spurious sinks related to com.opensymphony.xwork2.TextProvider.getText from the query java/ognl-injection.

Swift

  • Added additional sinks for the “Cleartext logging of sensitive information” (swift/cleartext-logging) query. Some of these sinks are heuristic (imprecise) in nature.

New Queries

C/C++

  • Added a new query, cpp/use-of-unique-pointer-after-lifetime-ends, to detect uses of the contents of unique pointers that will be destroyed immediately.
  • The cpp/incorrectly-checked-scanf query has been added. This finds results where the return value of scanf is not checked correctly. Some of these were previously found by cpp/missing-check-scanf and will no longer be reported there.

Java

  • Added the java/insecure-randomness query to detect uses of weakly random values which an attacker may be able to predict. Also added the crypto-parameter sink kind for sinks which represent the parameters and keys of cryptographic operations.

Language Libraries

Bug Fixes

C/C++

  • Under certain circumstances a function declaration that is not also a definition could be associated with a Function that did not have the definition as a FunctionDeclarationEntry. This is now fixed when only one definition exists, and a unique Function will exist that has both the declaration and the definition as a FunctionDeclarationEntry.

Python

  • Previously, all captured variables were conflated into a single scope entry node. Now they each get their own node so they can be tracked properly.
  • The dataflow graph no longer contains SSA variables. Instead, flow is directed via the corresponding controlflow nodes. This should make the graph and the flow simpler to understand. Minor improvements in flow computation have been observed, but in general negligible changes to alerts are expected.

Major Analysis Improvements

Python

  • Added support for global data-flow through captured variables.
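
For readers unfamiliar with the term, a captured variable is a local variable that a nested function or lambda reads or writes from its enclosing scope. The Python snippet below only illustrates the code shape involved. The function names are hypothetical, and it makes no claim about which queries would report results on it.

```python
def make_handler(user_input):
    # `user_input` is captured by the nested function below.
    def handler():
        # The value reaches this return via the captured variable, not via
        # an argument passed to `handler`.
        return user_input

    return handler


def make_counter():
    count = 0

    def increment():
        # `nonlocal` makes this a write to the captured variable `count`.
        nonlocal count
        count += 1
        return count

    return increment


if __name__ == "__main__":
    handler = make_handler("value from the enclosing scope")
    print(handler())

    counter = make_counter()
    counter()
    print(counter())
```

In both functions the interesting value moves between scopes without appearing in any call's argument list, which is the kind of flow this improvement concerns.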

Minor Analysis Improvements

C/C++

  • Changed the output of Node.toString to better reflect how many indirections a given dataflow node has.
  • Added a new predicate Node.asDefinition on DataFlow::Nodes for selecting the dataflow node corresponding to a particular definition.
  • The deprecated DefaultTaintTracking library has been removed.
  • The Guards library has been replaced with the API-compatible IRGuards implementation, which has better precision in some cases.

C#

  • The Call::getArgumentForParameter predicate has been reworked to add support for arguments passed to params parameters.
  • The dataflow models for the System.Text.StringBuilder class have been reworked. New summaries have been added for Append and AppendLine. With the changes, we expect queries that use taint tracking to find more results when interpolated strings or StringBuilder instances are passed to Append or AppendLine.
  • Added additional support for the Amazon.Lambda SDK.

Golang

  • The diagnostic query go/diagnostics/successfully-extracted-files, and therefore the Code Scanning UI measure of scanned Go files, now considers any Go file seen during extraction, even one with some errors, to be extracted / scanned.
  • The XPath library, which is used for the XPath injection query (go/xml/xpath-injection), now includes support for Parser sinks from the libxml2 package.
  • CallNode::getACallee and related predicates now recognise more callees accessed via a function variable, in particular when the callee is stored into a global variable or is captured by an anonymous function. This may lead to new alerts where data-flow into such a callee is relevant.

Java

  • Added the Map#replace and Map#replaceAll methods to the MapMutator class in semmle.code.java.Maps.
  • Taint tracking now understands Kotlin’s Array.get and Array.set methods.
  • Added a sink model for the createRelative method of the org.springframework.core.io.Resource interface.
  • Added source models for methods of the org.springframework.web.util.UrlPathHelper class and removed their taint flow models.
  • Added models for the following packages:
    • com.google.common.io
    • hudson
    • hudson.console
    • java.lang
    • java.net
    • java.util.logging
    • javax.imageio.stream
    • org.apache.commons.io
    • org.apache.hadoop.hive.ql.exec
    • org.apache.hadoop.hive.ql.metadata
    • org.apache.tools.ant.taskdefs
  • Added models for the following packages:
    • com.alibaba.druid.sql.repository
    • jakarta.persistence
    • jakarta.persistence.criteria
    • liquibase.database.jvm
    • liquibase.statement.core
    • org.apache.ibatis.mapping
    • org.keycloak.models.map.storage

Python

  • Captured subclass relationships ahead of time for the most popular PyPI packages, so that subclass relationships can be resolved even when the packages are not installed. For example, we have captured that flask_restful.Resource is a subclass of flask.views.MethodView, so our Flask modeling will still consider a function named post on a class Foo(flask_restful.Resource): as an HTTP request handler.
  • Python now makes use of the shared type tracking library, exposed as semmle.python.dataflow.new.TypeTracking. The existing type tracking library, semmle.python.dataflow.new.TypeTracker, has consequently been deprecated.
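
The flask_restful example in the Python notes above can be pictured with the following sketch. The MethodView and Resource classes here are local stand-ins defined only so the snippet runs without either package installed. They are assumptions for illustration, not the real classes, and only the inheritance relationship is mirrored.

```python
# Stand-ins for flask.views.MethodView and flask_restful.Resource.
# They are NOT the real classes; only the inheritance chain is mirrored.
class MethodView:
    """Stand-in for flask.views.MethodView."""


class Resource(MethodView):
    """Stand-in for flask_restful.Resource."""


class Foo(Resource):
    def post(self):
        # In the real framework this method would handle HTTP POST requests;
        # here it simply returns a value.
        return {"status": "created"}


if __name__ == "__main__":
    # Foo never mentions MethodView, yet it is a transitive subclass of it.
    print(issubclass(Foo, MethodView))
    print([cls.__name__ for cls in Foo.__mro__])
```

Seeing that Foo is ultimately a MethodView requires knowing what Resource inherits from, which is the information the release note says is now captured ahead of time.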

Ruby

  • Parsing of division operators (/) at the end of a line has been improved. Previously, they were wrongly interpreted as the start of a regular expression literal (/.../), leading to syntax errors.
  • Parsing of case statements that are formatted with the value expression on a different line than the case keyword has been improved and should no longer lead to syntax errors.
  • Ruby now makes use of the shared type tracking library, exposed as codeql.ruby.typetracking.TypeTracking. The existing type tracking library, codeql.ruby.typetracking.TypeTracker, has consequently been deprecated.

Swift

  • Expanded flow models for UnsafePointer and similar classes.
  • Added flow models for non-member withUnsafePointer and similar functions.
  • Added flow models for withMemoryRebound, assumingMemoryBound and bindMemory member functions of library pointer classes.
  • Added a sensitive data model for SecKeyCopyExternalRepresentation.
  • Added imprecise flow models for append and insert methods, and initializer calls with a data argument.
  • Types for patterns are now included in the database and made available through the Pattern::getType() method.

Deprecated APIs

C/C++

  • The isUserInput, userInputArgument, and userInputReturned predicates from SecurityOptions have been deprecated. Use FlowSource instead.

Java

  • Imports of the old dataflow libraries (e.g. semmle.code.java.dataflow.DataFlow2) have been deprecated in the libraries under the semmle.code.java.security namespace.

New Features

C/C++

  • UserDefineLiteral and DeductionGuide classes have been added, representing C++11 user-defined literals and C++17 deduction guides.

Shared Libraries

Deprecated APIs

Dataflow Analysis
