Emulation is a technique used in many disciplines and it is especially important in hardware hacking, for example in research involving firmware emulation. It makes it possible to replicate the behavior of a physical device or program virtually. In the context of cybersecurity, it is particularly interesting for auditing and finding vulnerabilities.
A concrete use case is an authentication service that limits the number of attempts. If we emulate the device and restart it whenever the attempts run out, we can test as many credentials as we want. Another practical case is penetration testing, where combined techniques (fuzzing + emulation) make it possible to attack a service without the physical device.
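The attempt-limit scenario can be pictured with a small toy model. This is only a sketch: EmulatedLock is a hypothetical stand-in for an emulated authentication service, and its reset() method plays the role of restarting the emulator from a clean state.

```python
class EmulatedLock:
    """Hypothetical stand-in for an emulated service with an attempt limit."""

    def __init__(self, secret_pin: str, max_attempts: int = 3):
        self._secret = secret_pin
        self.max_attempts = max_attempts
        self.attempts = 0

    def reset(self) -> None:
        # Plays the role of restarting the emulator from a clean state.
        self.attempts = 0

    def try_pin(self, pin: str) -> bool:
        if self.attempts >= self.max_attempts:
            raise RuntimeError("device locked")
        self.attempts += 1
        return pin == self._secret


def brute_force(lock: EmulatedLock, digits: int = 4):
    """Try every PIN in order, resetting the emulated device when it locks.

    Returns (pin, number_of_resets), or (None, number_of_resets) on failure.
    """
    resets = 0
    for candidate in range(10 ** digits):
        if lock.attempts >= lock.max_attempts:
            lock.reset()
            resets += 1
        pin = f"{candidate:0{digits}d}"
        if lock.try_pin(pin):
            return pin, resets
    return None, resets


if __name__ == "__main__":
    print(brute_force(EmulatedLock("0042")))
```

On a physical device the same search would stop after the first three wrong guesses; in the model, every lockout costs only one reset.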
Mainly, there are two types of emulation:
- Partial, or in user space, when a specific service or executable is emulated.
- Total, or system-wide, when the entire system is emulated.
The complexity of the total emulation process is almost always greater than the partial one. However, when performing a partial emulation, it is important to make sure that we have all the necessary inputs for its correct emulation. For example, if we run a service that depends on a third-party piece of software or that requires a specific file or peripheral, it will be our responsibility to provide it during the emulation.
The motivation for firmware emulation of IoT devices during hardware pentesting analysis rests on two principles: speeding up testing and attacks on the device, and removing the need for the physical device. Through firmware emulation, different strategies can be pursued in parallel without affecting the integrity of the physical device and without having to acquire more devices.
Thus, by acquiring a single device, it is possible to extract its firmware, emulate it totally or partially and attack it using different techniques (fuzzing, attack frameworks, etc.). If the necessary resources are available, these attacks can be parallelized, obtaining better results in less time.
Currently, there are different solutions for device emulation. Some of the most popular are QEMU and Unicorn, but there are many others, some of which will be discussed in this article.
1. Firmware emulation, the path
Thanks to firmware emulation and these tools, researchers can investigate in greater detail the operation of the device or executable they want to study without having the source code. The important information the researcher must know includes the architecture of the system to be studied, its endianness and the memory regions used or referenced by the emulated executable.
When emulating a device during a hardware hacking analysis, regardless of whether the emulation is total or partial, a major problem is found: replicating the context. In the real world, there is a close dependency between the hardware and software dimensions of a device. Thus, if we change the architecture of the system, the software will probably not be able to run. Similarly, if we change the software of the system to that of another, the hardware is likely to not function properly.
At the emulation level, we will need to replicate in one way or another the context, architecture and sometimes the peripherals of the device to achieve a successful emulation. Throughout this article we will present a set of tools and best practices to successfully perform the emulation process.
The firmware emulation process is closely dependent on the previous steps of the OWASP-FSTM methodology. OWASP step 2 details how the firmware is obtained; OWASP step 4 details how it is extracted… In addition, it’s necessary, or highly recommended, to have a clear idea of what hardware is being worked on to emulate it according to its characteristics (OWASP step 1 describes recommendations and best practices for obtaining this type of information).
As previously mentioned, there are two types of emulation: partial (or user space) and total (or system) emulation. In both cases it is of vital importance to know the context on which you want to emulate. The same dependence that exists between the HW and SW of a physical device will be found when performing emulation. Therefore, we must be able to replicate the context (architecture, peripherals…) to be successful.
When emulating a process or a complete firmware, all the information gathered must be considered to choose the most fruitful technique, as this task can require many hours of work. It is therefore worth evaluating whether a full system emulation is really needed, or whether the effort can be reduced by emulating only what is required to run a single process out of all the software present in the device.
2. Firmware emulation ways
Once this decision has been made, there are multiple options for emulation in increasing order of effort:
- Emulation in user space for an executable or a service. In this case the emulator is responsible for simulating a generic kernel that allows an executable built for another architecture to be loaded. Tools that allow this include QEMU and Qiling.
- Emulation in user space with simulated files. Executables sometimes require access to devices or files with hardware-dependent behavior. When these devices are accessed through file operations, they can often be replicated with software tools such as CUSE (Character device in User space) or FUSE (Filesystem in User space).
- System emulation without bootloader. Sometimes we need to emulate the execution of a kernel so that the processes or services work properly. In this case some tools can load the kernel in memory and run it directly without the need for a bootloader. This option may require us to develop software to emulate some physical devices that the emulated kernel needs to boot. This can be a relatively simple or a very complicated task depending on the device documentation available.
- Full system emulation. Some firmware images use undocumented and obfuscated compression and/or encryption methods. In these cases, it can be worthwhile to emulate the complete system from its earliest initialization. This is usually an arduous task, since this option tends to be chosen precisely because documentation is lacking, and it requires considerable effort to achieve results.
3. Emulation tools
The main current tools include QEMU, Unicorn, Renode, Qiling, Firmadyne and ARMX.
3.1 QEMU
QEMU is a virtualizer and machine emulator with several modes of operation. The most common is “system emulation”, where it provides a complete virtual machine model (CPU, memory, and emulated devices) to run a guest system. In this mode the CPU can be fully emulated, or QEMU can work with a hypervisor such as KVM, Xen, HAX, or Hypervisor.Framework, which lets the guest run directly on the host CPU when guest and host share the same architecture.
Additionally, it can “emulate executables” (user space emulation) through the qemu-arch executables, which emulate only the CPU of the architecture the executable was built for and translate its instructions to the host architecture.
In both cases, it can be useful to compile or install the static versions of the QEMU executables, usually distinguished with the -static suffix, since these versions, unlike the dynamic ones, do not depend on the host system libraries. This is especially interesting for building a virtual environment for executable emulation, with tools such as chroot, in which the libraries and executables of the host architecture should not be installed.
3.2 Unicorn
Unicorn is an alternative to QEMU, built on QEMU's CPU emulation code and focused on the emulation of multiple CPU architectures. Unicorn is defined as a CPU emulator and is abstracted from the rest of the system by a simple API with support for multiple languages.
The idea behind Unicorn is to provide an extremely flexible and high-performance emulator in terms of emulation speed. This makes Unicorn one of the options to turn to when you need to emulate a system that requires a relatively heavy computational load because it reduces execution times significantly compared to its alternatives. It is also useful for simulating isolated chunks of code with limited context since its flexibility allows us to emulate the context and limit execution to arbitrary sections of code.
This flexibility also allows us to perform code analysis dynamically at a very low level. Unicorn, through its API, always exposes all the states and contexts of the emulation.
Although this emulator provides considerable advantages (such as its power) over other alternatives, it comes at a cost. Performing an emulation requires advanced knowledge of the platform to be emulated and a large investment of time to write the code that launches the executions (peripherals and other components often have to be emulated at a low level for the system to work properly).
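As a minimal sketch of what working with the Unicorn API looks like, the following snippet emulates two 32-bit x86 instructions and reads the registers back. It assumes the third-party unicorn Python binding is installed; the load address, the initial register values and the run_snippet helper are arbitrary choices for this example, and the function simply returns None when the binding is missing.

```python
try:
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_32
    from unicorn.x86_const import UC_X86_REG_ECX, UC_X86_REG_EDX
except ImportError:  # the binding is an optional third-party package
    Uc = None

# Machine code for "INC ecx; DEC edx" in 32-bit x86.
X86_CODE = b"\x41\x4a"
BASE = 0x1000000  # arbitrary load address chosen for this sketch


def run_snippet(ecx: int = 0x1234, edx: int = 0x7890):
    """Emulate X86_CODE and return the final (ecx, edx) values."""
    if Uc is None:
        return None
    mu = Uc(UC_ARCH_X86, UC_MODE_32)
    mu.mem_map(BASE, 2 * 1024 * 1024)   # map 2 MiB for the code
    mu.mem_write(BASE, X86_CODE)        # load the instructions
    mu.reg_write(UC_X86_REG_ECX, ecx)   # set up the CPU context
    mu.reg_write(UC_X86_REG_EDX, edx)
    mu.emu_start(BASE, BASE + len(X86_CODE))  # run until the end of the code
    return mu.reg_read(UC_X86_REG_ECX), mu.reg_read(UC_X86_REG_EDX)


if __name__ == "__main__":
    print(run_snippet())
```

Even in this tiny case the memory layout and register context are set up by hand, which is where the time investment mentioned above comes from once real firmware and peripherals are involved.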
3.3 Renode
Renode is another QEMU-based tool for emulating firmware and other systems. Its functionalities include interaction between multiple virtual processors, with virtualized memory and with multiple types of devices such as sensors, displays and other system inputs and outputs. It also allows connecting the emulated environment with hardware implemented in FPGA systems.
Unlike QEMU, Renode has been primarily designed for emulation of IoT and embedded devices with real-time operating systems, although it is also capable of emulating more powerful systems, such as those running Linux.
To emulate a system with Renode, it is required to know how the communications with the devices are organized, which, in case of using the usual Memory Mapped I/O (MMIO) communication method, is about finding which devices are assigned to each memory region. Although this type of information can be obtained from the manufacturer’s manuals of the IoT device being analyzed, it can also be found in the Device Tree Blob (DTB) data blocks that are usually present in firmware.
The way to configure Renode for emulating a system is often to start with a minimal set of devices and run the system until failures occur, then investigate the failures and complete the list of devices.
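The iterative approach described above can be illustrated with a toy memory map. This is not Renode configuration: the region names, base addresses and sizes are invented for the example, and the access trace is assumed to come from whatever logging the emulator provides.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Region:
    name: str
    base: int
    size: int

    def contains(self, address: int) -> bool:
        return self.base <= address < self.base + self.size


# Hypothetical MMIO layout; real values come from the datasheet or the DTB.
MEMORY_MAP = [
    Region("flash", 0x0800_0000, 0x0010_0000),
    Region("sram", 0x2000_0000, 0x0002_0000),
    Region("uart0", 0x4000_4000, 0x400),
    Region("gpio", 0x4002_0000, 0x400),
]


def resolve(address: int, memory_map=MEMORY_MAP) -> Optional[Region]:
    """Return the region that owns an address, or None if it is unmapped."""
    for region in memory_map:
        if region.contains(address):
            return region
    return None


def unmapped_accesses(trace: Iterable[int], memory_map=MEMORY_MAP) -> List[int]:
    """List the distinct addresses in a trace that hit no known device."""
    return sorted({a for a in trace if resolve(a, memory_map) is None})


if __name__ == "__main__":
    trace = [0x2000_0010, 0x4001_3800, 0x4000_4000, 0x4001_3800]
    print([hex(a) for a in unmapped_accesses(trace)])
```

Each address reported as unmapped points to a device that still has to be identified and added to the list before running the system again.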
Examples of Renode application in different scenarios can be found in its official repository, while the documentation contains lists of devices and development boards for which immediate support is provided.
3.4 Qiling
The Qiling emulation framework aims to emulate any type of executable. It supports multiple architectures and the emulation of different operating systems and executable formats.
It is a framework written in Python that relies on the Unicorn emulation engine and adds file system emulation, dynamic libraries, executable loading, and many other high-level operating system features on top of it.
To prepare for emulation with Qiling, the extracted firmware file system containing the executable to be emulated is required, so that the original executables and files are available. System calls and program output are printed to stdout, and the general usage resembles that of Unicorn, with extensions to integrate execution with multiple tools.
One such tool is the AFL++ fuzzing framework, which allows fuzzing tests to be performed on executables being emulated with Qiling. This integration can be very useful to speed up the process of emulation and analysis of an executable and is a great tool in this phase of the study.
3.5 Firmadyne
Firmadyne is a set of tools that try to automate and simplify the work of emulating a firmware image. It is written in Python and consists of a set of modules that are used for file system extraction, firmware architecture identification and the creation of an executable virtual machine specific to the analyzed firmware.
Firmadyne also attempts to automate part of the scanning process and contains modules for on-device Web service inspection, SNMP service information collection and vulnerability testing with Metasploit.
However, installation of this suite can be more complicated than in other cases and integration with debuggers such as GDB requires static compilation of the gdbserver server for inclusion in the firmware file system.
In general, emulation with Firmadyne can be much easier than with the other tools, but it is also more limited in terms of execution flow control and is more error prone.
3.6 ARMX Firmware Emulation Framework
ARMX is a collection of scripts, executables, kernels, and file systems that together with QEMU try to emulate ARM IoT devices. The project tries to be as close to an IoT virtual machine as possible.
The project has support for emulating eleven models of ARM devices in full.
If ARMX does not have support for the device we are emulating, it has a section that explains how to create our own devices and templates to make the implementation work easier.
4. Firmware emulation example with QEMU
Some embedded systems need to display data through a web interface. There are multiple ways to provide this capability, and the service chosen depends heavily on the processing power and the type of device used.
On powerful systems or systems with a constant power supply, it is common to use general-purpose servers such as Apache or nginx. However, when the device has limited resources or battery life must be extended, simpler services such as devd are used. This will be the executable that we will use as a case study to illustrate the firmware emulation process with QEMU.
Note: The QEMU tool is present in the repositories of different Linux distributions or in the official GitHub repository (https://github.com/qemu/QEMU). Its default location after installation is usually one of these two:
/usr/local/bin/qemu-arch
/usr/bin/qemu-arch
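Rather than relying on a fixed location, the binary can be looked up on the PATH. The helper below is a small sketch; find_qemu_user is a name invented here, and it only assumes the qemu-arch and qemu-arch-static naming mentioned in this article.

```python
import shutil
from typing import Optional


def find_qemu_user(arch: str) -> Optional[str]:
    """Return the path of a user-mode QEMU binary, preferring the static one."""
    for name in (f"qemu-{arch}-static", f"qemu-{arch}"):
        path = shutil.which(name)
        if path:
            return path
    return None  # QEMU for this architecture is not on the PATH


if __name__ == "__main__":
    print(find_qemu_user("x86_64"))
```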
4.1 Architecture identification
Before emulating an executable with QEMU it is necessary to know the endianness and the proper architecture. For this, we can make use of tools like binwalk or readelf. In our example, the architecture is x86-64 and the endianness: little endian.
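Note: for architectures that support both byte orders, QEMU ships separate user-mode binaries whose names encode the endianness. For example, qemu-mips targets big-endian MIPS and qemu-mipsel little-endian MIPS, while qemu-arm targets little-endian ARM and qemu-armeb big-endian ARM. The unsuffixed name is therefore not always the little-endian (EL) variant.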
$ readelf -h devd
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x458e40
  Start of program headers:          64 (bytes into file)
  Start of section headers:          456 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         24
  Section header string table index: 3
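When readelf is not at hand, the same three facts (word size, endianness and machine) can be read from the first 20 bytes of the file. The following is a minimal sketch that follows the ELF header layout; the MACHINES table is a deliberately small subset of the e_machine values and identify_elf is a helper name chosen for this example.

```python
import struct

# Small subset of e_machine values defined by the ELF specification.
MACHINES = {
    3: "x86", 8: "mips", 20: "ppc", 21: "ppc64",
    40: "arm", 62: "x86_64", 183: "aarch64", 243: "riscv",
}


def identify_elf(header: bytes) -> dict:
    """Extract class, endianness and machine from the start of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[header[4]]               # EI_CLASS
    endianness = {1: "little", 2: "big"}[header[5]]  # EI_DATA
    fmt = "<H" if endianness == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)  # e_machine at offset 18
    return {
        "bits": bits,
        "endianness": endianness,
        "machine": MACHINES.get(machine, f"unknown({machine})"),
    }


if __name__ == "__main__":
    # Header rebuilt from the Magic, Type and Machine fields shown above.
    sample = bytes.fromhex("7f454c46020101000000000000000000")
    sample += struct.pack("<HH", 2, 62)
    print(identify_elf(sample))
```

For a real file, the first 20 bytes can be read with open(path, "rb").read(20) and passed to the function.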
4.2 Emulation and recommendation of best practices
Once the architecture and endianness (EL (Little Endian) or EB (Big Endian)) are known, the appropriate QEMU executable is identified to perform the user space emulation (there are different ones within the QEMU utility depending on the purpose of the emulation). In our case it will be qemu-arch-static, where arch refers to the architecture of the executable (x86_64).
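The choice of executable can be written down as a lookup table. The sketch below covers only a handful of common targets; the architecture keys ("x86_64", "mips" and so on) are labels chosen for this example, while the resulting names follow QEMU's user-mode binary naming.

```python
# (architecture, endianness) -> QEMU user-mode target name.
_QEMU_TARGETS = {
    ("x86", "little"): "i386",
    ("x86_64", "little"): "x86_64",
    ("arm", "little"): "arm",
    ("arm", "big"): "armeb",
    ("aarch64", "little"): "aarch64",
    ("aarch64", "big"): "aarch64_be",
    ("mips", "big"): "mips",
    ("mips", "little"): "mipsel",
    ("ppc", "big"): "ppc",
    ("ppc64", "big"): "ppc64",
    ("ppc64", "little"): "ppc64le",
}


def qemu_user_binary(arch: str, endianness: str, static: bool = True) -> str:
    """Return the name of the user-mode QEMU executable for a target."""
    try:
        target = _QEMU_TARGETS[(arch, endianness)]
    except KeyError:
        raise ValueError(f"no entry for {arch} ({endianness} endian)") from None
    return f"qemu-{target}" + ("-static" if static else "")


if __name__ == "__main__":
    print(qemu_user_binary("x86_64", "little"))
    print(qemu_user_binary("mips", "little", static=False))
```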
A good practice when doing emulation is to copy the QEMU executable to the working directory (the one containing the unpacked system and the extracted executables that you want to emulate). In this case, it is recommended to use QEMU versions ending in “-static”.
$ qemu-arch-static [options] <filename>
If, on the other hand, the file depends on the rest of the system, a virtual environment (e.g., with chroot) containing the required file system must be created. Once it has been created, the executable is run inside it.
$ sudo chroot . ./qemu-arch-static [options] <filename>
Another interesting recommendation is to use the “-strace” option of QEMU during execution. This way, we will be able to visualize the kernel calls that are occurring during the execution in real time.
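The variants discussed in this section (plain execution, chroot and -strace) differ only in how the command line is assembled, which the following sketch makes explicit. The helper name and its parameters are invented for the example; it only builds the argument list and leaves the actual execution (for instance with subprocess.run) to the caller.

```python
import shlex
from typing import List, Optional, Sequence


def build_emulation_command(
    qemu: str,
    target: str,
    args: Sequence[str] = (),
    rootfs: Optional[str] = None,
    strace: bool = False,
) -> List[str]:
    """Assemble the argv list for a user-mode QEMU run."""
    cmd: List[str] = []
    if rootfs is not None:
        # Run inside the extracted file system, as in the chroot example.
        cmd += ["sudo", "chroot", rootfs]
    cmd.append(qemu)
    if strace:
        cmd.append("-strace")  # log the system calls made by the target
    cmd.append(target)
    cmd += list(args)
    return cmd


if __name__ == "__main__":
    cmd = build_emulation_command(
        "./qemu-x86_64-static", "./devd", [".", "-ol"], rootfs=".", strace=True
    )
    print(shlex.join(cmd))
```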
The result obtained when running the devd utility natively is the following:
$ ./devd . -ol
12:13:12: Route / -> reads files from .
12:13:12: Listening on https://devd.io:8000 (127.0.0.1:8000)
12:13:15: GET /
<- 200 OK 1.6 kB
4.3 Emulation results
Using the recommendations presented in the previous section, we proceed to the emulation of the executable. It can be verified that the emulation is successful and that it returns the same result as when running it natively.
$ ./qemu-x86_64-static ./devd . -ol
12:14:18: Route / -> reads files from .
12:14:18: Listening on https://devd.io:8000 (127.0.0.1:8000)
12:14:20: GET /
<- 200 OK 1.6 kB
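Because the two runs differ only in their timestamps, the comparison can be automated by stripping the time prefix from each line. The snippet below is a small sketch that assumes the HH:MM:SS prefix seen in the devd output above; the function names are chosen for this example.

```python
import re
from typing import List

# Matches the "HH:MM:SS: " prefix used in the logs above.
_TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}: ")


def normalize(log: str) -> List[str]:
    """Drop blank lines and timestamps so that two runs can be compared."""
    return [_TIMESTAMP.sub("", line.strip()) for line in log.splitlines() if line.strip()]


def same_behaviour(native_log: str, emulated_log: str) -> bool:
    """True when both logs are identical apart from their timestamps."""
    return normalize(native_log) == normalize(emulated_log)


if __name__ == "__main__":
    native = "12:13:12: Route / -> reads files from .\n12:13:15: GET /\n<- 200 OK 1.6 kB"
    emulated = "12:14:18: Route / -> reads files from .\n12:14:20: GET /\n<- 200 OK 1.6 kB"
    print(same_behaviour(native, emulated))
```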
For illustrative purposes, the executable is also emulated with the strace option to study the calls it makes. Due to the length of the result obtained, only the first calls are presented.
$ qemu-x86_64-static -strace devd . -ol
80379 arch_prctl(4098,13492464,126614525,0,0,0) = 0
80379 sched_getaffinity(0,8192,274886286920,0,274886296000,0) = 40
80379 mmap(NULL,262144,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x0000004000803000
80379 mmap(0x000000c000000000,67108864,PROT_NONE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x000000c000000000
80379 mmap(0x000000c000000000,67108864,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED,-1,0) = 0x000000c000000000
…
…
15:46:30: GET /
…
…
<- 200 OK 2.8 kB
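Since the full trace is long, a short script can summarize it. The sketch below assumes the "pid name(args) = result" line format shown above and simply skips every line that does not match it, such as the program's own output; the function names are chosen for this example.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List

# One line of "-strace" output: "<pid> <syscall>(<args>) = <result>".
_SYSCALL = re.compile(r"^(?P<pid>\d+) (?P<name>\w+)\((?P<args>.*)\) = (?P<ret>.+)$")


def parse_strace(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return one dict per system call line, ignoring everything else."""
    calls = []
    for line in lines:
        match = _SYSCALL.match(line.strip())
        if match:
            calls.append(match.groupdict())
    return calls


def syscall_histogram(lines: Iterable[str]) -> Counter:
    """Count how many times each system call appears in the trace."""
    return Counter(call["name"] for call in parse_strace(lines))


if __name__ == "__main__":
    sample = [
        "80379 arch_prctl(4098,13492464,126614525,0,0,0) = 0",
        "80379 mmap(NULL,262144,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS,-1,0) = 0x0000004000803000",
        "15:46:30: GET /",
    ]
    print(syscall_histogram(sample))
```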
After the firmware emulation, we proceed to check that the application works correctly (and thus verify that the emulation process has been successful). In the following screenshot you can see the emulated web page.
5. Conclusions
This article has presented the importance of emulation in the context of cybersecurity during hardware pentesting analysis. Emulation makes it possible to parallelize the search for vulnerabilities without affecting or altering the physical devices. This workflow saves time and resources.
However, as explained above, this emulation process is not simple and requires a high level of knowledge on the part of the researcher. In addition, it requires a series of data that are necessary for a correct emulation, such as the system architecture, the existing peripherals, or the endianness.
It’s also important to note that the vulnerabilities found during emulation may not be applicable to the real (physical) device. Sometimes a PoC will compromise an emulated system but not the real one, because it depends on some parameter that we cannot control or that has not been emulated correctly. Therefore, these findings are always a good indication, but they are not conclusive and will require a detailed check by the researcher.
Finally, it should be noted that there are numerous tools such as QEMU and Unicorn that help us to perform these emulation processes. However, new and more mature alternatives are emerging every day that will probably help to speed up and facilitate this type of research work.