Finding unhandled errors using CodeQL

By Fredrik Dahlgren

One of your developers finds a bug in your codebase—an unhandled error code—and wonders whether there could be more. He combs through the code and finds unhandled error after unhandled error. One lone developer playing whack-a-mole. It’s not enough. And your undisciplined team of first-year Stanford grads never learned software engineering. You’re doomed.

Good developers know that unhandled errors can be exploitable and cause serious problems in a codebase. Take CVE-2018-1002105, a critical vulnerability in Kubernetes allowing attackers to leverage incorrectly handled errors to establish a back-end connection through the Kubernetes API.

At Trail of Bits, we find issues like this all the time, and we know that there are better ways to find the rest than manually searching for them one by one. One particularly recent discovery of this problem motivated us to write this post. Rather than manually sifting through the codebase like our poor developer playing whack-a-mole, we used CodeQL to conduct a variant analysis (taking an existing vulnerability and searching for similar patterns). In this post, we’ll walk you through how we used CodeQL to whack all the moles at once.

Building a CodeQL database

To be able to run CodeQL queries against the codebase, we first needed to build a CodeQL database. Typically, this is done with the CodeQL CLI using the following command:

codeql database create -l <language> -c '<build command>' <database name>

In our case, the codebase under audit was developed on Windows using Visual Studio. Since CodeQL does not integrate with Visual Studio directly, we used MSBuild.exe to build solutions from the Windows command line using the following command:

codeql database create -l cpp -c 'MSBuild.exe <solution>.sln' <solution>.codeql

Setting up a custom query pack

To be able to run queries against the database, we defined a custom query pack, or a QL pack, which contains query metadata. Query packs can also be used to define custom query suites and query test suites. (If this makes your heart beat faster, see here and here.) In the same directory housing our custom queries, we created a file named qlpack.yml with the following content:

name: <some snazzy QL pack name>
version: 0.0.1
libraryPathDependencies: [codeql-cpp]

This file defines the custom query pack and its dependencies. The last line of the qlpack.yml file simply indicates that the query pack depends on the built-in CodeQL libraries for C and C++, which we need to get off the ground.

Finding unhandled errors

For this post, let’s say that the codebase used a custom error type called CustomErrorType to propagate errors. To locate all calls to functions returning CustomErrorType, we started by creating a new CodeQL type called CustomError:

class CustomError extends FunctionCall {
    CustomError() {
        this.getUnderlyingType().getName() = "CustomErrorType"
    }
}

Since return values are represented as function calls in CodeQL, it makes sense to extend the FunctionCall type (which is, in fact, a subtype of the more general Expr type used to model arbitrary expressions). Using this.getUnderlyingType() ensures that the name of the underlying type is CustomErrorType. This means that we capture all function calls in which the return type is either CustomErrorType or any typedef that resolves to CustomErrorType.

To test that the CustomError class does what we expect, we simply ran a query over the codebase and selected all CustomErrorType return values. To do so, we added the following select clause immediately below the class definition:

 
from 
    CustomError ce
select
    ce.getLocation(), 
    "Unhandled error code in ", ce.getEnclosingFunction().getName(), 
    "Error code returned by ", ce.getTarget().getName()

Here, ce.getEnclosingFunction() returns the function containing the CustomErrorType instance (i.e., the calling function), and ce.getTarget() returns the target function of the underlying FunctionCall (i.e., the called function).

We saved the file under a descriptive and colorful name—for this post, let’s call it UnhandledCustomError.ql. To run the query, we wrote the following:

codeql query run -d <database name here> UnhandledCustomError.ql

This query returns all call sites of functions in the codebase that return a value of type CustomErrorType, along with the names of the calling function and the called function.

Developing new queries iteratively in this way—by first over-approximating the vulnerability class you’re trying to model and then successively refining the query to prune false positives—makes it easier to catch mistakes as they happen since actually running a query on a codebase is a bit of a black box.

So what is an unhandled error?

To be able to restrict the results to unhandled errors, we need to define what it means to handle an error using CodeQL. Intuitively, handling an error means that the return value is acted upon and affects control flow in some way. This idea can be captured using CodeQL by checking whether the return value taints the condition of a branching statement, like an if statement, a while statement, or a switch statement. As CodeQL supports both local and global taint tracking, we had a choice of how to model this.

In our case, we were initially a bit concerned about how CodeQL’s global taint tracking engine would handle itself on a larger codebase, so we decided to try modeling the problem using local taint tracking. As a first approximation, we considered cases in which the returned error code directly affected the control flow of the calling function in some way. To capture these cases, we added the following predicate to the CustomError CodeQL type (thus, this below refers to an instance of CustomError):

 
// True if the return value is checked locally.
predicate isChecked() {
    // The return value flows into the condition of an if-statement.    	    
    exists (IfStmt is |
        TaintTracking::localTaint(
            DataFlow::exprNode(this),
            DataFlow::exprNode(is.getCondition().getAChild*())
        )
    ) or
    // The return value flows into the condition of a while-statement.
    exists (WhileStmt ws |
        TaintTracking::localTaint(
            DataFlow::exprNode(this),
            DataFlow::exprNode(ws.getCondition().getAChild*())
        )
    ) or
    // The return value flows into the condition of a switch-statement.
    exists (SwitchStmt ss |
        TaintTracking::localTaint(
            DataFlow::exprNode(this), 
            DataFlow::exprNode(ss.getExpr().getAChild*())
        )
    )
}

Since TaintTracking::localTaint only models local data flow, we did not need to require that the conditional statement (the sink) is located in the same function as the returned error (the source). With local taint tracking, we got this for free.

Our intuition told us that we wanted to model taint flowing into the condition of a branching statement, but looking at how this is modeled, we actually required taint to flow into a sub-expression of the condition. For example, is.getCondition().getAChild*() returns a sub-expression of the if statement condition. (The * indicates that the operation is applied 0 or more times. You can use + for 1 or more times.)

It may not be immediately obvious why we needed to use getAChild() here. If this taints a sub-expression of the if statement condition C, then it is natural to assume that this would also taint the entire condition. However, looking at the CodeQL taint tracking documentation, it is clear that taint propagates only from a source to a sink if a “substantial part of the information from the source is preserved at the sink.” In particular, boolean expressions (which carry only a single bit of information) are not automatically considered tainted by their individual sub-expressions. Thus, we needed to use getAChild() to capture that this taints a sub-expression of the condition.

It is worth mentioning that it would have been possible to use the DataFlow module to model local data flow: both DataFlow::localFlow and TaintTracking::localTaint can be used to capture local data flow. However, since DataFlow::localFlow is used to track only value-preserving operations, it made more sense in our case to use the more general TaintTracking::localTaint predicate. This allowed us to catch expressions like the following, in which the returned error is mutated before it is checked:

    if ( ((CustomErrorType)(response.GetStatus(msg) & 0xFF)) == NO_ERROR )
    {
    	[…]
    }

To restrict the output from the CodeQL select statement, we added a where clause to the query:

 
from 
    CustomError ce
where
    not ce.isChecked()
select
    ce.getLocation(), 
    "Unhandled error code in ", ce.getEnclosingFunction().getName(), 
    "Error code returned by ", ce.getTarget().getName()

Refining the query

Running the query again, we noticed that it found numerous locations in the codebase in which a returned error was not handled correctly. However, it also found a lot of false positives in which the return value did affect control flow globally in some way. Reviewing some of the results manually, we noticed three overarching classes of false positives:

The returned error was simply returned from the enclosing function and passed down the call chain.
The returned error was passed as an argument to a function (which hopefully acted upon the error in some meaningful way).
The returned error was assigned to a class member variable (which could then be checked elsewhere in the codebase).

The local behavior in all three of these cases could clearly be modeled using local taint tracking.

First, to exclude all cases in which the returned error was used to update the return value of the calling function, we added the following predicate to the CustomError class:

// The return value is returned from the enclosing function.
predicate isReturnValue() {
    exists (ReturnStmt rs |
        TaintTracking::localTaint(
            DataFlow::exprNode(this),
            DataFlow::exprNode(rs.getExpr())
        )
    )
}

Second, to filter out cases in which the return value was passed as an argument to some other function, we added the following predicate:

// The return value is passed as an argument to another function.
predicate isPassedToFunction() {
    exists (FunctionCall fc |
        TaintTracking::localTaint(
            DataFlow::exprNode(this),
            DataFlow::exprNode(fc.getAnArgument())
        )
    )
}

Again, since TaintTracking::localTaint only models local data flow, we did not need to require that the enclosing function of the FunctionCall node fc is identical to the enclosing function of this.

Finally, to model the case in which the returned error was used to update the value of a class member variable, we needed to express the fact that the calling function was a class method and that the return value was used to update the value of a member variable on the same class. We modeled this case by casting the enclosing function to a MemberFunction and then requiring that there is a member variable on the same object that is tainted by this.

 
// Test if the return value is assigned to a member variable.
predicate isAssignedToMemberVar() {
    exists (MemberVariable mv, MemberFunction mf |
        mf = this.getEnclosingFunction() and
        mf.canAccessMember(mv, mf.getDeclaringType()) and
        TaintTracking::localTaint(
            DataFlow::exprNode(this), 
            DataFlow::exprNode(mv.getAnAccess())
        )
    )
}

Note that it is not enough to require that data flows from this to an access of a member variable. If you do not restrict mv further, mv could be a member of any class defined in the codebase. Clearly, we also needed to require that the calling function is a method on some class and that the member variable is a member of the same class. We captured this requirement using the predicate canAccessMember, which is true when the enclosing method mf can access the member variable mv in the context of the mf.getDeclaringType() class.

Running the updated version of the query, we then noticed that some of the results were from unit tests. Since issues in unit tests are typically of less interest, we wanted to exclude them from the final result. This, of course, could easily be done using grep -v on the resulting output from codeql, but it could also be done by restricting the location of the call site using CodeQL itself.

To filter by file path, we defined a new class called IgnoredFile, which captured the type of files we wanted to exclude from the result. In this case, we excluded any file with an absolute path containing the word "test":

 
class IgnoredFile extends File {
    IgnoredFile() {
        this.getAbsolutePath().matches("%test%")
    }
}

We then added the following line to the where clause of the final query, which excluded all the locations that we were less interested in:

not ce.getFile() instanceof IgnoredFile

The final query resulted in slightly over 100 code locations that we were able to review and verify manually. For reference, the final query is located here.

But what about global data flow?

CodeQL supports global data flow and taint tracking through the DataFlow::Configuration and TaintTracking::Configuration classes. As explained earlier, we can use the DataFlow module to track value-preserving operations and the TaintTracking module to track more general flows in which the value may be updated along the flow path. While we were initially concerned that the codebase under review was too large for CodeQL’s global taint tracking engine to handle, we were also curious to see whether a global analysis would give us more accurate results than a local analysis could. As it turns out, the query using global data flow was easier to express, and it achieved more accurate results with the same running time as the query using local data flow!

Since we didn’t want to restrict ourselves to value-preserving operations, we needed to extend the TaintTracking::Configuration class. To do so, we defined what it means to be a source and a sink by overriding the isSource and isSink predicates as follows:

 
class GuardConfiguration extends TaintTracking::Configuration {
    GuardConfiguration() { this = "GuardConfiguration" }
  
    override predicate isSource(DataFlow::Node source) {
        source.asExpr().(FunctionCall).getUnderlyingType().getName() =
            "CustomErrorType"
    }
  
    override predicate isSink(DataFlow::Node sink) {
        exists (IfStmt is | sink.asExpr() = is.getCondition().getAChild*()) or
        exists (WhileStmt ws | sink.asExpr() = ws.getCondition().getAChild*()) or
        exists (SwitchStmt ss | sink.asExpr() = ss.getExpr().getAChild*())
    }
}

We then redefined the predicate CustomError::isChecked in terms of global taint tracking as follows:

 
class CustomError extends FunctionCall {
    CustomError() {
        this.getUnderlyingType().getName() = "CustomErrorType"
           }
  
    predicate isCheckedAt(Expr guard) {
        exists (GuardConfiguration config |
            config.hasFlow(
                DataFlow::exprNode(this), 
                DataFlow::exprNode(guard)
            )
        )
    }
  
    predicate isChecked() {
        exists (Expr guard | this.isCheckedAt(guard))
    }
}

That is, the return error is handled if it taints the condition of an if, while, or switch statement anywhere in the codebase. This actually made the entire query much simpler.

Interestingly, the run time for the global analysis turned out to be about the same as the run time for local taint tracking (about 20 seconds to compile the query and 10 seconds to run it on a 2020 Intel i5 MacBook Pro).

Running the query using global taint tracking gave us over 200 results. By manually reviewing these results, we noticed that error codes often ended up being passed to a function that created a response to the user of one of the APIs defined by the codebase. Since this was expected behavior, we excluded all such cases from the end result. To do so, we simply added a single line to the definition of GuardCondition::isSink as follows:

 
override predicate isSink(DataFlow::Node sink) {
    exists (ReturnStmt rs | sink.asExpr() = rs.getExpr()) or
    exists (IfStmt is | sink.asExpr() = is.getCondition().getAChild*()) or
    exists (WhileStmt ws | sink.asExpr() = ws.getCondition().getAChild*()) or
    exists (SwitchStmt ss | sink.asExpr() = ss.getExpr().getAChild*()) or
    exists (IgnoredFunctionCall fc | sink.asExpr() = fc.getExtParam())
}

Here, IgnoredFunctionCall is a custom type capturing a call to the function generating responses to the user. Running the query, we ended up with around 150 locations that we could go through manually. In the end, the majority of the locations identified using CodeQL represented real issues that needed to be addressed by the client. The updated file UnhandledCustomError.ql can be found here.

At Trail of Bits, we often say that we never want to see the same bug twice in a client’s codebase, and to make sure that doesn’t happen, we often deliver security tooling like fuzzing harnesses and static analysis tools with our audit reports. In this respect, tools like CodeQL are great, as they let us encode our knowledge about a bug class as a query that anyone can run and benefit from—in effect, ensuring that we never see that particular bug ever again.