-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Step 10: Data flow and taint tracking analysis
Great! You made it to the final step!
In step 9 we found expressions in the source code that are likely to have integers supplied from remote input, because they are being processed with invocations of ntoh, ntohll, or ntohs. These can be considered sources of remote input.
In step 6 we found calls to memcpy. These calls can be unsafe when their length arguments are controlled by a remote user. Their length arguments can be considered sinks: they should not receive user-controlled values without further validation.
Combining these pieces of information,
we know that code is vulnerable if tainted data flows from a network integer source to a sink in the length argument of a memcpy call.
However, how do we know whether data from a particular source might reach a particular sink? This is known as data flow or taint tracking analysis. Given the number of results (hundreds of memcpy calls and a large number of macro invocations), it would be quite a lot of work to triage all these cases manually.
To make our triaging job easier, we will have CodeQL do this analysis for us.
You will now write a query to track the flow of tainted data from network-controlled integers to the memcpy length argument. As a result you will find 9 real vulnerabilities!
To achieve this, we’ll use the CodeQL taint tracking library. This library allows you to describe sources and sinks, and its predicate hasFlowPath holds true when tainted data from a given source flows to a sink.