GitHub Security Lab CTF 1: SEGV hunt

Introduction

This lab is about finding vulnerabilities in glibc. It provides a CodeQL snapshot of glibc and step-by-step hints.

alloca is used to allocate a buffer on the stack. It is usually implemented by simply subtracting the size parameter from the stack pointer and returning the new value of the stack pointer. This gives it two benefits.

The memory allocated by alloca is automatically freed when the function returns.
It is extremely fast.

But alloca is not safe, because it does not check whether there is enough stack space left for the buffer. If the requested buffer size is too big, alloca may return an invalid pointer, and the application can crash with a SIGSEGV when it attempts to read or write the buffer. Therefore alloca is only intended to be used for small buffers.
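As a small illustration of the intended use, here is a sketch of a function that takes a fixed, small scratch buffer from the stack. The helper name count_distinct_bytes is made up for this example and does not come from glibc.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from glibc): counts the distinct byte values
   in data[0..len) using a scratch table on the stack. */
size_t count_distinct_bytes(const unsigned char *data, size_t len)
{
    /* 256 bytes is a small, fixed size, so alloca is reasonable here.
       The table is released automatically when the function returns. */
    unsigned char *seen = alloca(256);
    size_t distinct = 0;

    memset(seen, 0, 256);
    for (size_t i = 0; i < len; i++) {
        if (!seen[data[i]]) {
            seen[data[i]] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

The risky variant would be something like alloca(len), where the size depends on the caller and nothing checks it against the remaining stack space.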

The GNU C Library contains hundreds of calls to alloca. In this challenge, we use CodeQL to find those calls and then determine which of them are unsafe.

Step 0

Finding the definition of alloca

alloca is a macro. It expands to __builtin_alloca, which is a builtin function.

CodeQL provides the FunctionCall class to represent all function calls in the program.

FunctionCall has a predicate getTarget, which returns the callee as a Function. We can then filter on the callee's name to select all calls to __builtin_alloca.


import cpp

from FunctionCall fc
where fc.getTarget().getName() = "__builtin_alloca"
select fc

Step 1

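Filtering out small sizes. CodeQL provides upperBound and lowerBound (from the SimpleRangeAnalysis library) to analyze the range of an expression. We can use these predicates to check the range of alloca's size parameter. I treat a size as safe if it is at least 0 and below 65535, so a call to alloca is reported as unsafe when its upper bound is 65535 or more, or its lower bound is negative.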
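To make the idea concrete, here is a sketch of two C functions that a range check of this kind should treat differently. Both functions are hypothetical and not taken from glibc, and the comments describe what the analysis is expected to conclude, not verified query output.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* n is an unsigned char, so its value is at most 255. The size
   expression is therefore at most 256, and range analysis should be
   able to prove that this allocation is small. */
size_t fill_small(unsigned char n)
{
    char *buf = alloca((size_t)n + 1);
    memset(buf, 'x', n);
    buf[n] = '\0';
    return strlen(buf);
}

/* n is a size_t with no visible bound, so the only known upper bound
   is the maximum of the type. A call like this should stay in the
   results of the query. */
size_t fill_any(size_t n)
{
    char *buf = alloca(n + 1);
    memset(buf, 'x', n);
    buf[n] = '\0';
    return strlen(buf);
}
```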

import cpp
import semmle.code.cpp.rangeanalysis.SimpleRangeAnalysis

from FunctionCall fc
where fc.getTarget().getName() = "__builtin_alloca" and
    (   upperBound(fc.getArgument(0).getFullyConverted()) >= 65535 or
        lowerBound(fc.getArgument(0).getFullyConverted()) < 0)
select fc

Step 2.0 & 2.1

The correct way to use alloca

Check that the allocation is safe by calling __libc_use_alloca.
If __libc_use_alloca returns true, call alloca; otherwise fall back to malloc.
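The two steps above can be sketched in C as follows. __libc_use_alloca is internal to glibc, so this example uses a stand-in named use_alloca with an assumed 4096-byte limit. The helper is_palindrome is also made up for illustration.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for glibc's internal __libc_use_alloca; the 4096-byte limit
   is an assumption for this sketch, not glibc's real threshold. */
static int use_alloca(size_t size)
{
    return size <= 4096;
}

/* Hypothetical helper: returns 1 if s reads the same backwards,
   0 if not, and -1 if the heap fallback fails. */
int is_palindrome(const char *s)
{
    size_t len = strlen(s);
    size_t size = len + 1;
    char *buf;
    int on_heap = 0;

    if (use_alloca(size)) {
        buf = alloca(size);          /* guarded: only for small sizes */
    } else {
        buf = malloc(size);          /* large sizes go to the heap */
        if (buf == NULL)
            return -1;
        on_heap = 1;
    }

    for (size_t i = 0; i < len; i++)
        buf[i] = s[len - 1 - i];     /* reversed copy */
    int result = memcmp(buf, s, len) == 0;

    if (on_heap)
        free(buf);
    return result;
}
```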

Find all calls to __libc_use_alloca.

Find all guard conditions where the condition is a call to __libc_use_alloca.

A guard condition is a boolean condition that guards one or more basic blocks. CodeQL represents it with the GuardCondition class, which has a predicate called controls.

controls holds if the condition controls the basic block passed as its first argument, meaning that the block is only reached when the condition evaluates to the boolean value given as the second argument.

import cpp
import semmle.code.cpp.controlflow.Guards

from FunctionCall fc, GuardCondition gc, FunctionCall fc2
where fc.getTarget().getName() = "__builtin_alloca"
    and fc2.getTarget().getName() = "__libc_use_alloca"
    and gc.controls(fc.getBasicBlock(), _)
    and gc.getAChild*() = fc2
select fc

gc.getAChild*() = fc2 makes sure the guard condition is based on a call to __libc_use_alloca.

* is the reflexive transitive closure: the predicate is applied zero or more times, so gc.getAChild*() covers gc itself and all of its descendants.

For example:

__libc_use_alloca (alloca_used + key_len) is a guard condition, and it controls the call to alloca.

Step 2.2

Sometimes the result of __libc_use_alloca is assigned to a variable, which is later used as the guard condition.

In this case, we use local data flow analysis to follow the return value of __libc_use_alloca to the guard condition, so that indirect uses are also recognized.
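A sketch of the C pattern this step is about: the result of the check is stored in a local variable first, and the variable is what appears in the if condition. use_alloca is a stand-in for glibc's internal __libc_use_alloca, and count_vowels is a hypothetical helper.

```c
#include <alloca.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for __libc_use_alloca; the limit is an assumption. */
static bool use_alloca(size_t size)
{
    return size <= 4096;
}

/* Hypothetical helper: counts vowels in s using a lower-cased scratch
   copy. Returns -1 if the heap fallback fails. */
long count_vowels(const char *s)
{
    size_t size = strlen(s) + 1;
    bool on_stack = use_alloca(size);   /* result stored first ... */
    char *lower;

    if (on_stack) {                     /* ... and used as the guard later */
        lower = alloca(size);
    } else {
        lower = malloc(size);
        if (lower == NULL)
            return -1;
    }

    for (size_t i = 0; i < size; i++)
        lower[i] = (char)tolower((unsigned char)s[i]);

    long vowels = 0;
    for (size_t i = 0; i + 1 < size; i++)   /* skip the terminator */
        if (strchr("aeiou", lower[i]) != NULL)
            vowels++;

    if (!on_stack)
        free(lower);
    return vowels;
}
```

Here the guard expression is just the variable on_stack, so a purely syntactic check on the condition would miss the connection to the use_alloca call.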

import cpp
import semmle.code.cpp.controlflow.Guards
import semmle.code.cpp.dataflow.DataFlow

from FunctionCall fc, FunctionCall fc2, GuardCondition gc,
    DataFlow::Node source, DataFlow::Node sink
where fc.getTarget().getName() = "__builtin_alloca"
    and fc2.getTarget().getName() = "__libc_use_alloca"
    and gc.controls(fc.getBasicBlock(), _)
    and DataFlow::localFlow(source, sink)
    and source.asExpr() = fc2
    and sink.asExpr() = gc
select gc

This way, the query also covers the indirect case.

Step 2.3

Sometimes the call to __libc_use_alloca is wrapped in a call to __builtin_expect.
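In C the wrapped form looks roughly like the sketch below. __builtin_expect is the GCC/Clang branch-prediction hint; use_alloca is a stand-in for glibc's internal __libc_use_alloca, and joined_length is a hypothetical helper.

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for __libc_use_alloca; the limit is an assumption. */
static int use_alloca(size_t size)
{
    return size <= 4096;
}

/* Hypothetical helper: joins a and b in a scratch buffer and returns
   the length of the result, or (size_t)-1 if the heap fallback fails. */
size_t joined_length(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t size = la + lb + 1;
    char *buf;
    int on_heap = 0;

    /* The if condition is the __builtin_expect call; the use_alloca
       call is only its first argument. */
    if (__builtin_expect(use_alloca(size), 1)) {
        buf = alloca(size);
    } else {
        buf = malloc(size);
        if (buf == NULL)
            return (size_t)-1;
        on_heap = 1;
    }

    memcpy(buf, a, la);
    memcpy(buf + la, b, lb + 1);     /* includes the terminator of b */
    size_t result = strlen(buf);

    if (on_heap)
        free(buf);
    return result;
}
```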

import cpp
import semmle.code.cpp.controlflow.Guards
import semmle.code.cpp.dataflow.DataFlow

from FunctionCall fc, FunctionCall fc2, GuardCondition gc,
    DataFlow::Node source, DataFlow::Node sink
where fc.getTarget().getName() = "__builtin_alloca"
    and fc2.getTarget().getName() = "__libc_use_alloca"
    and gc.controls(fc.getBasicBlock(), _)
    and DataFlow::localFlow(source, sink)
    and source.asExpr() = fc2
    // * is the reflexive transitive closure: getAChild is applied zero or more times,
    // so gc.getAChild*() means gc itself and all of its descendants.
    // In __builtin_expect(__libc_use_alloca(...), ...), the __libc_use_alloca call
    // is a child of the __builtin_expect call.
    and sink.asExpr() = gc.getAChild*()
select gc

Step 2.4

Sometimes the result of __libc_use_alloca is negated with the ! operator.
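A sketch of the negated form in C, where the stored value is the opposite of the check and alloca sits in the else branch. use_alloca is again a stand-in for glibc's internal __libc_use_alloca, and rot13_changed is a hypothetical helper.

```c
#include <alloca.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for __libc_use_alloca; the limit is an assumption. */
static bool use_alloca(size_t size)
{
    return size <= 4096;
}

/* Hypothetical helper: builds a ROT13 copy of s and returns how many
   characters changed, or -1 if the heap fallback fails. */
long rot13_changed(const char *s)
{
    size_t size = strlen(s) + 1;
    bool use_malloc = !use_alloca(size);   /* negated result */
    char *copy;

    if (use_malloc) {
        copy = malloc(size);
        if (copy == NULL)
            return -1;
    } else {
        copy = alloca(size);   /* reached only when use_alloca was true */
    }

    long changed = 0;
    for (size_t i = 0; i < size; i++) {
        char c = s[i];
        if (c >= 'a' && c <= 'z')
            c = (char)('a' + (c - 'a' + 13) % 26);
        else if (c >= 'A' && c <= 'Z')
            c = (char)('A' + (c - 'A' + 13) % 26);
        copy[i] = c;
        if (copy[i] != s[i])
            changed++;
    }

    if (use_malloc)
        free(copy);
    return changed;
}
```

The value that reaches the guard is the result of the ! expression and not the call itself, which is why plain local flow starting at the call is not enough for this case.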

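import cpp
import semmle.code.cpp.controlflow.Guards
import semmle.code.cpp.dataflow.DataFlow

from FunctionCall fc, FunctionCall fc2, GuardCondition gc,
    DataFlow::Node source, DataFlow::Node sink
where fc.getTarget().getName() = "__builtin_alloca"
    and fc2.getTarget().getName() = "__libc_use_alloca"
    and gc.controls(fc.getBasicBlock(), _)
    and DataFlow::localFlow(source, sink)
    // Over-approximation: the source is any expression in the same basic block as the
    // __libc_use_alloca call, which includes an enclosing ! expression.
    and source.asExpr() = fc2.getBasicBlock().getANode()
    and sink.asExpr() = gc.getAChild*()
select gc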

Step 2.5

Summarize the safe alloca logic in a predicate.

// Holds if fc is guarded by a condition derived from a call to __libc_use_alloca.
predicate isSafeAllocaCall(FunctionCall fc) {
    exists(FunctionCall fc2, GuardCondition gc, DataFlow::Node source, DataFlow::Node sink |
        fc2.getTarget().getName() = "__libc_use_alloca"
        and gc.controls(fc.getBasicBlock(), _)
        and DataFlow::localFlow(source, sink)
        and source.asExpr() = fc2.getBasicBlock().getANode()
        and sink.asExpr() = gc.getAChild*()
    )
}

Step 3

Combine Step 1 and Step 2 to select the alloca calls that are safe: guarded by __libc_use_alloca and with a size inside the safe range.

// isSafeAllocaCall is the predicate defined in Step 2.5.
from FunctionCall fc
where fc.getTarget().getName() = "__builtin_alloca"
    and isSafeAllocaCall(fc)
    and upperBound(fc.getArgument(0).getFullyConverted()) < 65535
    and lowerBound(fc.getArgument(0).getFullyConverted()) >= 0
select fc

Step 4 Taint tracking

Find an unsafe call to alloca where the allocation size is controlled by a value read from a file. fopen is a macro, so I looked up the function it actually resolves to, _IO_new_fopen. The task is to write a taint tracking query whose source is a call to fopen and whose sink is the size argument of an unsafe call to alloca.

import cpp
import semmle.code.cpp.rangeanalysis.SimpleRangeAnalysis
import semmle.code.cpp.dataflow.TaintTracking
import semmle.code.cpp.models.interfaces.DataFlow
import semmle.code.cpp.controlflow.Guards
import DataFlow::PathGraph

// Track taint through `__strnlen`.
class StrlenFunction extends DataFlowFunction {
  StrlenFunction() { this.getName().matches("%str%len%") }

  override predicate hasDataFlow(FunctionInput i, FunctionOutput o) {
    i.isParameter(0) and o.isReturnValue()
  }
}

// Track taint through `__getdelim`.
class GetDelimFunction extends DataFlowFunction {
  GetDelimFunction() { this.getName().matches("%get%delim%") }

  override predicate hasDataFlow(FunctionInput i, FunctionOutput o) {
    i.isParameter(3) and o.isParameterDeref(0)
  }
}

class Config extends TaintTracking::Configuration {
  Config() { this = "fopen_to_alloca_taint" }

  override predicate isSource(DataFlow::Node source) {
    exists(FunctionCall fc | 
        fc.getTarget().getName() = "_IO_new_fopen"
        and source.asExpr() = fc
        )
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(Expr sizeExpr, FunctionCall alloca |
        alloca.getTarget().getName() = "__builtin_alloca"
        // isSafeAllocaCall is the predicate from Step 2.5; it must be defined in this file.
        and not isSafeAllocaCall(alloca)
        and sizeExpr = alloca.getArgument(0).getFullyConverted()
        // Same bounds check as Step 1: upper bound too large or lower bound negative.
        and (upperBound(sizeExpr) >= 65535 or lowerBound(sizeExpr) < 0)
        and sink.asExpr() = sizeExpr
        )
  }
}

from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "fopen flows to alloca"

Finally, I found two vulnerable uses of alloca whose size parameter can be controlled by a user-supplied input file.
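For readers unfamiliar with this bug class, the sketch below shows its general shape in C. It is a made-up example with a hypothetical record format (a 4-byte length followed by that many payload bytes), not one of the glibc findings, and it is deliberately unsafe.

```c
#include <alloca.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical example: reads one record from fp and returns the sum
   of its payload bytes, or -1 on a short read. */
long sum_record(FILE *fp)
{
    uint32_t len;

    if (fread(&len, sizeof len, 1, fp) != 1)
        return -1;

    /* UNSAFE on purpose: len comes straight from the file and is never
       checked, so a large value can push the stack pointer past the
       end of the stack. */
    unsigned char *payload = alloca(len);

    if (fread(payload, 1, len, fp) != len)
        return -1;

    long sum = 0;
    for (uint32_t i = 0; i < len; i++)
        sum += payload[i];
    return sum;
}
```

With a small length in the file the function behaves normally. A file that declares a very large length can make the first write to payload crash with a SIGSEGV, which is the kind of size-from-file flow the taint tracking query looks for.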
