ModuleShifting - Stealthier Variation Of Module Stomping And Module Overloading Injection Techniques That Reduces Memory IoCs

ModuleShifting is stealthier variation of Module Stomping and Module overloading injection technique. It is actually implemented in Python ctypes so that it can be executed fully in memory via a Python interpreter and Pyramid, thus avoiding the usage of compiled loaders.

The technique can be used with PE or shellcode payloads, however, the stealthier variation is to be used with shellcode payloads that need to be functionally independent from the final payload that the shellcode is loading.

ModuleShifting, when used with shellcode payload, is performing the following operations:

Legitimate hosting dll is loaded via LoadLibrary
Change the memory permissions of a specified section to RW
Overwrite shellcode over the target section
add optional padding to better blend into false positive behaviour (more information here)
Change permissions to RX
Execute shellcode via function pointer - additional execution methods: function callback or CreateThread API
Write original dll content over the executed shellcode - this step avoids leaving a malicious memory artifact on the image memory space of the hosting dll. The shellcode needs to be functionally independent from further stages otherwise execution will break.

When using a PE payload, ModuleShifting will perform the following operation:

Legitimate hosting dll is loaded via LoadLibrary
Change the memory permissions of a specified section to RW
copy the PE over the specified target point section-by-section
add optional padding to better blend into false positive behaviour
perform base relocation
resolve imports
finalize section by setting permissions to their native values (avoids the creation of RWX memory region)
TLS callbacks execution
Executing PE entrypoint

ModuleShifting can be used to inject a payload without dynamically allocating memory (i.e. VirtualAlloc) and compared to Module Stomping and Module Overloading is stealthier because it decreases the amount of IoCs generated by the injection technique itself.

There are 3 main differences between Module Shifting and some public implementations of Module stomping (one from Bobby Cooke and WithSecure)

Padding: when writing shellcode or PE, you can use padding to better blend into common False Positive behaviour (such as third-party applications or .net dlls writing x amount of bytes over their .text section).
Shellcode execution using function pointer. This helps in avoid a new thread creation or calling unusual function callbacks.
restoring of original dll content over the executed shellcode. This is a key difference.

The differences between Module Shifting and Module Overloading are the following:

The PE can be written starting from a specified section instead of starting from the PE of the hosting dll. Once the target section is chosen carefully, this can reduce the amount of IoCs generated (i.e. PE header of the hosting dll is not overwritten or less bytes overwritten on .text section etc.)
Padding that can be added to the PE payload itself to better blend into false positives.

Using a functionally independent shellcode payload such as an AceLdr Beacon Stageless shellcode payload, ModuleShifting is able to locally inject without dynamically allocating memory and at the moment generating zero IoC on a Moneta and PE-Sieve scan. I am aware that the AceLdr sleeping payloads can be caught with other great tools such as Hunt-Sleeping-Beacon, but the focus here is on the injection technique itself, not on the payload. In our case what is enabling more stealthiness in the injection is the shellcode functional independence, so that the written malicious bytes can be restored to its original content, effectively erasing the traces of the injection.

All information and content is provided for educational purposes only. Follow instructions at your own risk. Neither the author nor his employer are responsible for any direct or consequential damage or loss arising from any person or organization.

This work has been made possible because of the knowledge and tools shared by incredible people like Aleksandra Doniec @hasherezade, Forest Orr and Kyle Avery. I heavily used Moneta, PeSieve, PE-Bear and AceLdr throughout all my learning process and they have been key for my understanding of this topic.

ModuleShifting can be used with Pyramid and a Python interpreter to execute the local process injection fully in-memory, avoiding compiled loaders.

Clone the Pyramid repo:

git clone https://github.com/naksyn/Pyramid

Generate a shellcode payload with your preferred C2 and drop it into Pyramid Delivery_files folder. See Caveats section for payload requirements.
modify the parameters of moduleshifting.py script inside Pyramid Modules folder.
Start the Pyramid server: python3 pyramid.py -u testuser -pass testpass -p 443 -enc chacha20 -passenc superpass -generate -server 192.168.1.2 -setcradle moduleshifting.py
execute the generated cradle code on a python interpreter.

Caveats

To successfully execute this technique you should use a shellcode payload that is capable of loading an additional self-sustainable payload in another area of memory. ModuleShifting has been tested with AceLdr payload, which is capable of loading an entire copy of Beacon on the heap, so breaking the functional dependency with the initial shellcode. This technique would work with any shellcode payload that has similar capabilities. So the initial shellcode becomes useless once executed and there's no reason to keep it in memory as an IoC.

A hosting dll with enough space for the shellcode on the targeted section should also be chosen, otherwise the technique will fail.

Module Stomping and Module Shifting need to write shellcode on a legitimate dll memory space. ModuleShifting will eliminate this IoC after the cleanup phase but indicators could be spotted by scanners with realtime inspection capabilities.