This lab is about finding vulnerabilities in glibc. It provides a CodeQL snapshot of glibc and step-by-step hints.
alloca allocates a buffer on the stack. It is usually implemented by simply subtracting the size parameter from the stack pointer and returning the new value of the stack pointer. This gives it two benefits:
The memory allocated by alloca is automatically freed when the function returns.
It is extremely fast.
But alloca is not safe, because it does not check whether there is enough stack space left for the buffer. If the requested buffer size is too big, alloca may return an invalid pointer, and the application can crash with a SIGSEGV when it reads from or writes to the buffer. Therefore alloca is only intended for allocating small buffers.
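As a minimal sketch of the behaviour described above, the hypothetical helper below (the name is_palindrome is not from glibc) builds a reversed copy of a string in a buffer obtained from alloca. Nothing is freed explicitly, and nothing checks the size either, which is exactly why this pattern is only reasonable for short inputs.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s reads the same in both directions. */
int is_palindrome(const char *s)
{
    size_t n = strlen(s);

    /* Stack allocation: no free() is needed, the buffer goes away when
       the function returns. Nothing checks that n + 1 bytes fit on the
       stack, so a very long s could crash here. */
    char *rev = alloca(n + 1);

    for (size_t i = 0; i < n; i++)
        rev[i] = s[n - 1 - i];
    rev[n] = '\0';

    return strcmp(s, rev) == 0;
}
```

Calling is_palindrome("level") returns 1 and is_palindrome("alloca") returns 0.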
The GNU C Library contains hundreds of calls to alloca. In this challenge, we use CodeQL to find those calls and then determine which of them are unsafe.
Step 0
Finding the definition of alloca
alloca is a macro. It expands to __builtin_alloca, which is a compiler builtin function.
CodeQL provides the FunctionCall class to represent all function calls in the program.
FunctionCall has a predicate getTarget, which returns the callee as a Function. Comparing the callee's name lets us select all calls to __builtin_alloca:
    import cpp

    from FunctionCall fc
    where fc.getTarget().getName() = "__builtin_alloca"
    select fc
Step 1
Filtering out small sizes
CodeQL provides the upperBound and lowerBound predicates (from the SimpleRangeAnalysis library) to analyze the range of an expression. We can use them to check the range of the alloca size parameter. I consider [0, 65535] the safe range: a call to alloca is unsafe if its size argument can fall outside [0, 65535].
    import cpp
    import semmle.code.cpp.rangeanalysis.SimpleRangeAnalysis

    from FunctionCall fc
    where fc.getTarget().getName() = "__builtin_alloca"
      and (
        upperBound(fc.getArgument(0).getFullyConverted()) > 65535
        or lowerBound(fc.getArgument(0).getFullyConverted()) < 0
      )
    select fc
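To make the range check concrete, here is a sketch of three hypothetical C functions (none of them from glibc) and how a bounds analysis would be expected to see their size arguments. It assumes that the range analysis takes both the argument type and a preceding bounds check into account.

```c
#include <alloca.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The size is a uint16_t, so its range is [0, 65535]: inside the safe range. */
size_t scratch_u16(uint16_t len)
{
    char *buf = alloca(len);
    memset(buf, 0xAB, len);
    return len;
}

/* Explicit bounds check: on the path that reaches alloca, len <= 1024. */
size_t scratch_checked(size_t len)
{
    if (len > 1024)
        return 0;               /* refuse oversized requests */
    char *buf = alloca(len);
    memset(buf, 0xAB, len);
    return len;
}

/* No type bound and no check: the upper bound is SIZE_MAX, so this call
   is expected to be reported by the Step 1 query. */
size_t scratch_unchecked(size_t len)
{
    char *buf = alloca(len);
    memset(buf, 0xAB, len);
    return len;
}
```

Only the last function has a size argument whose upper bound exceeds 65535.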
Step 2.0 & 2.1
The correct way to use alloca
First check that the allocation is safe by calling __libc_use_alloca.
If __libc_use_alloca returns true, call alloca; otherwise fall back to a heap allocation such as malloc.
Find all calls to __libc_use_alloca.
Find all guard conditions where the condition is a call to __libc_use_alloca.
A guard condition is a boolean condition that guards one or more basic blocks. CodeQL represents it with the GuardCondition class, which has a predicate called controls.
controls holds if the condition controls the basic block passed as its first argument; the second argument is the boolean value the condition must have for that block to be reached.
    import cpp
    import semmle.code.cpp.controlflow.Guards

    from FunctionCall fc, GuardCondition gc, FunctionCall fc2
    where fc.getTarget().getName() = "__builtin_alloca"
      and fc2.getTarget().getName() = "__libc_use_alloca"
      and gc.controls(fc.getBasicBlock(), _)
      and gc.getAChild*() = fc2
    select fc
gc.getAChild*() = fc2 makes sure the guard condition is based on a call to __libc_use_alloca.
* is the reflexive transitive closure: the predicate is applied zero or more times, so the result includes gc itself as well as all of its descendants.
For example:
__libc_use_alloca (alloca_used + key_len), used as a branch condition, is a guard condition, and it controls the block that calls alloca.
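The guarded pattern can be sketched in plain C as follows. __libc_use_alloca is internal to glibc, so this sketch uses a hypothetical stand-in, use_alloca, with an arbitrary 4096-byte threshold; copy_key is also an invented name.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for glibc's internal __libc_use_alloca.
   The 4096-byte threshold is an assumption for this sketch. */
static int use_alloca(size_t size)
{
    return size <= 4096;
}

/* Copies key into a scratch buffer and returns the copied length,
   or (size_t)-1 if the heap allocation fails. */
size_t copy_key(const char *key, size_t key_len)
{
    char *buf;
    int on_heap = 0;

    if (use_alloca(key_len + 1)) {
        buf = alloca(key_len + 1);      /* block controlled by the guard */
    } else {
        buf = malloc(key_len + 1);      /* too large for the stack */
        if (buf == NULL)
            return (size_t)-1;
        on_heap = 1;
    }

    memcpy(buf, key, key_len);
    buf[key_len] = '\0';
    size_t n = strlen(buf);

    if (on_heap)
        free(buf);
    return n;
}
```

Here the if condition is the guard, and the basic block containing the alloca call is reached only when the guard is true.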
Step 2.2
Sometimes the result of __libc_use_alloca is assigned to a variable, which is later used as the guard condition.
In this case, we use local data flow analysis to also catch these indirect uses of the __libc_use_alloca return value.
    import cpp
    import semmle.code.cpp.controlflow.Guards
    import semmle.code.cpp.dataflow.DataFlow

    from FunctionCall fc, FunctionCall fc2, GuardCondition gc,
      DataFlow::Node source, DataFlow::Node sink
    where fc.getTarget().getName() = "__builtin_alloca"
      and fc2.getTarget().getName() = "__libc_use_alloca"
      and gc.controls(fc.getBasicBlock(), _)
      and DataFlow::localFlow(source, sink)
      and source.asExpr() = fc2
      and sink.asExpr() = gc
    select gc
This way, the query also handles that situation.
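The indirect form looks roughly like the sketch below. As before, use_alloca is a hypothetical stand-in for the glibc-internal __libc_use_alloca, and sum_bytes is an invented example function.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for __libc_use_alloca (threshold is an assumption). */
static int use_alloca(size_t size)
{
    return size <= 4096;
}

/* Sums the bytes of data through a scratch copy; returns 0 if malloc fails. */
size_t sum_bytes(const unsigned char *data, size_t len)
{
    /* The result is stored first; the variable is the guard later on. */
    int on_stack = use_alloca(len);
    unsigned char *tmp;

    if (on_stack) {
        tmp = alloca(len);
    } else {
        tmp = malloc(len);
        if (tmp == NULL)
            return 0;
    }

    memcpy(tmp, data, len);
    size_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += tmp[i];

    if (!on_stack)
        free(tmp);
    return sum;
}
```

The guard condition is the variable on_stack, not the call itself, so the link back to the call has to be found by following the local data flow from the call into the variable.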
Step 2.3
Sometimes the call to __libc_use_alloca is wrapped in a call to __builtin_expect.
    import cpp
    import semmle.code.cpp.controlflow.Guards
    import semmle.code.cpp.dataflow.DataFlow

    from FunctionCall fc, FunctionCall fc2, GuardCondition gc,
      DataFlow::Node source, DataFlow::Node sink
    where fc.getTarget().getName() = "__builtin_alloca"
      and fc2.getTarget().getName() = "__libc_use_alloca"
      and gc.controls(fc.getBasicBlock(), _)
      and DataFlow::localFlow(source, sink)
      and source.asExpr() = fc2
      and sink.asExpr() = gc.getAChild*()
      // * is the reflexive transitive closure: apply the predicate zero or more times.
      // gc.getAChild*() means gc itself and all of gc's descendants.
      // In __builtin_expect(__libc_use_alloca(...), ...), the call to
      // __libc_use_alloca is a child of the __builtin_expect call.
    select gc
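A sketch of the wrapped form, for a compiler that provides __builtin_expect (GCC or Clang): use_alloca is again a hypothetical stand-in for __libc_use_alloca, and count_upper is an invented function.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for __libc_use_alloca (threshold is an assumption). */
static int use_alloca(size_t size)
{
    return size <= 4096;
}

/* Counts the upper-case ASCII letters in s, working on a scratch copy. */
size_t count_upper(const char *s)
{
    size_t n = strlen(s) + 1;
    int on_heap = 0;
    char *copy;

    /* The guard condition is the __builtin_expect call; the check we
       care about is its first argument, one level down in the AST. */
    if (__builtin_expect(use_alloca(n), 1)) {
        copy = alloca(n);
    } else {
        copy = malloc(n);
        if (copy == NULL)
            return 0;
        on_heap = 1;
    }

    memcpy(copy, s, n);
    size_t count = 0;
    for (size_t i = 0; copy[i] != '\0'; i++)
        if (copy[i] >= 'A' && copy[i] <= 'Z')
            count++;

    if (on_heap)
        free(copy);
    return count;
}
```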
Step 2.4
Sometimes the result of __libc_use_alloca is negated with the ! operator.
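    import cpp
    import semmle.code.cpp.controlflow.Guards
    import semmle.code.cpp.dataflow.DataFlow

    from FunctionCall fc, FunctionCall fc2, GuardCondition gc,
      DataFlow::Node source, DataFlow::Node sink
    where fc.getTarget().getName() = "__builtin_alloca"
      and fc2.getTarget().getName() = "__libc_use_alloca"
      and gc.controls(fc.getBasicBlock(), _)
      and DataFlow::localFlow(source, sink)
      // Changed here: the source is widened to any expression in the same
      // basic block as the __libc_use_alloca call, not just the call itself.
      and source.asExpr() = fc2.getBasicBlock().getANode()
      and sink.asExpr() = gc.getAChild*()
    select gc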
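The negated form can be sketched like this. The alloca call now sits in the branch that is reached when the guard evaluates to false, which is why the second argument of controls is left as a wildcard. use_alloca and xor_bytes are hypothetical names.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for __libc_use_alloca (threshold is an assumption). */
static int use_alloca(size_t size)
{
    return size <= 4096;
}

/* XORs all bytes of data together via a scratch copy; returns 0 if malloc fails. */
unsigned char xor_bytes(const unsigned char *data, size_t len)
{
    unsigned char *tmp;
    int on_heap = 0;

    if (!use_alloca(len)) {
        tmp = malloc(len);          /* too big for the stack */
        if (tmp == NULL)
            return 0;
        on_heap = 1;
    } else {
        tmp = alloca(len);          /* reached when the negated guard is false */
    }

    memcpy(tmp, data, len);
    unsigned char acc = 0;
    for (size_t i = 0; i < len; i++)
        acc ^= tmp[i];

    if (on_heap)
        free(tmp);
    return acc;
}
```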
Step 2.5
Summarize the safe alloca logic in a predicate
    import cpp
    import semmle.code.cpp.controlflow.Guards
    import semmle.code.cpp.dataflow.DataFlow

    // Holds if fc is controlled by a guard derived from __libc_use_alloca.
    predicate isSafeAllocaCall(FunctionCall fc) {
      exists(
        FunctionCall fc2, GuardCondition gc, DataFlow::Node source, DataFlow::Node sink
      |
        fc2.getTarget().getName() = "__libc_use_alloca"
        and gc.controls(fc.getBasicBlock(), _)
        and DataFlow::localFlow(source, sink)
        and source.asExpr() = fc2.getBasicBlock().getANode()
        and sink.asExpr() = gc.getAChild*()
      )
    }
Step 3
Combine Step 1 and Step 2 to select the set of safe alloca calls.
    import semmle.code.cpp.rangeanalysis.SimpleRangeAnalysis

    // Relies on isSafeAllocaCall (and its imports) from Step 2.5.
    from FunctionCall fc
    where fc.getTarget().getName() = "__builtin_alloca"
      and isSafeAllocaCall(fc)
      and upperBound(fc.getArgument(0).getFullyConverted()) <= 65535
      and lowerBound(fc.getArgument(0).getFullyConverted()) >= 0
    select fc
Step 4 Taint tracking
Find an unsafe call to alloca where the allocation size is controlled by a value read from a file. fopen is a macro, so I looked up the actual function, _IO_new_fopen. Then write a taint tracking query: the source is a call to fopen (that is, _IO_new_fopen) and the sink is the size argument of an unsafe call to alloca.
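To illustrate the kind of code such a query looks for, here is a sketch of the vulnerable pattern. read_record and its length-prefixed file layout are hypothetical and not taken from glibc; the point is only that a value read through the handle returned by fopen reaches the alloca size argument without any check.

```c
#include <alloca.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reader: the file starts with a 32-bit length, followed by
   that many payload bytes. Returns the number of payload bytes read,
   or -1 on error. */
long read_record(const char *path)
{
    FILE *f = fopen(path, "rb");        /* taint source: the file handle */
    if (f == NULL)
        return -1;

    uint32_t len;
    if (fread(&len, sizeof len, 1, f) != 1) {
        fclose(f);
        return -1;
    }

    /* Taint sink: len comes straight from the file and is neither
       range-checked nor guarded, so a crafted file can request up to
       4 GiB of stack. */
    char *buf = alloca(len);

    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return (long)got;
}
```

With a well-formed file whose length prefix is 5, the function returns 5; a file whose prefix is close to UINT32_MAX would be expected to crash it.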