Reverse Engineering Windows Printer Drivers (Part 2)
2022-8-31 00:0:0 Author: blog.includesecurity.com(查看原文) 阅读量:5 收藏

In our blog last post (Part 1), we discussed how you can find and extract drivers from executables and other packages and the general methodology for confirming that drivers are loaded and ready. We also highlighted the Windows driver architecture. In this post, we’ll focus on an introduction to the driver architecture, basics of reverse engineering drivers, and inspect a bug found in the drivers we’re analyzing as part of this series.

We will start with the Dot4.sys driver, as it is the largest of the three and probably contains the most functionality. First, we will identify the type of driver, go over how to figure out areas where user input is received, and then we will talk about the general architecture of the driver.

Identifying the Type of Driver

Immediately after loading the driver onto the system, we can check that the driver is a WDM or a WDF driver by checking its imports and by filtering for functions with the name Wdf. If any functions with the Wdf prefix are found, then it’s assumed that it is a WDF driver, otherwise we assume it’s a WDM driver.

Tips and Tricks

There are several publicly available data structures that you will commonly encounter when reverse engineering drivers. We’re using Ghidra for reverse engineering, but unfortunately it does not come with these data structures by default.

To set up Windows kernel data structures, we used the .gdt (Ghidra Data Type) found here. To install and use these types once a project has been loaded, go to Window -> Ghidra Datatype Manager. Once the datatype manager tab is visible, open the gdts (Open File Archive) as shown in the image below:

To verify that the types have been successfully loaded, use the search functionality to search for a Windows driver type (e.g. _DRIVER_OBJECT), as shown in the figure below:

Note: Ghidra has trouble when trying to decompile structures with unions and structures inside of these unions. For more references, check out this GitHub issue.

Finding the DriverEntry

Just like every program starts with main, drivers have an exported entrypoint known as either DriverEntry or DriverInitialize. We will use DriverEntry in this example.

DriverEntry can be found in the Ghidra symbol tree as DriverEntry. It’s an exported function, because the Windows kernel must call DriverEntry when it finishes loading the driver in memory. In Ghidra, often the entrypoint of a binary is just called entry.

The signature for the driver entry is shown below:

Analyzing DriverEntry

NTSTATUS DriverInitialize(
  _DRIVER_OBJECT *DriverObject,
  PUNICODE_STRING RegistryPath
)
{...}

The types of parameters for DriverEntry are shown in the function prototype above. In order to match this function signature in Ghidra, we can change the types of input parameters by right clicking on the current types of the parameters in DriverEntry and using the function Retype variable.

Major Functions

The driver accepts a _DRIVER_OBJECT* in its entry. This pointer contains an array to a table of function pointers to be populated by the driver called MajorFunction. The indexes to this array of functions correspond to specific events.

When reverse engineering Windows drivers, this array is commonly populated in DriverEntry . In Dot4.sys, this array is populated in DriverEntry, as shown in the figure below:

Each index in the MajorFunction table represents an IRP_MAJOR_CODE. These correspond to an action/event. For example:

MajorFunction[0] = IoCreate, 

The IRP code equivalent to 0 is IRP_MJ_CREATE. This means that whenever a user-mode function opens a handle to the device using WinAPI (CreateFile()), the function the driver assigned to this event is called. MajorFunction table callbacks all share the same function prototype.

It is important to know that these MajorFunction dispatches can happen in parallel with each other. This is important to note, because race conditions can be the source of bugs when there is no appropriate locking. 

While there are no locks, some internal I/O manager reference counters stop something like IRP_MJ_CLOSE happening at the same time as an IRP_MJ_DEVICE_CONTROL.

The most important IRP Major Function code is IRP_MJ_DEVICE_CONTROL. This is the portal through which user mode processes ask kernel mode drivers to perform actions using the WinAPI (DeviceIoControl), optionally sending user input and optionally receiving output.

Here is documentation to read more on the topic:

An important concept to understand is that each feature/command/action that your driver implements is represented by an ioctl code. This IOCTL is not an arbitrary number but is defined by a C macro, which is supplied with information that determines the transfer method, access control and other properties.

The output is an encoded ioctl code that has all these pieces of information inside of a single four byte number. The various methods of data transferred in ioctls are crucial for different needs, whether it be a light request passing in tens of bytes or super high speed hardware device capable of doing gigabytes of processing a second. The listing below shows a general dissection of an ioctl code.


Types of Data Transfer from User Mode to Kernel Mode

METHOD_BUFFERED is the simplest method, copying memory from user mode into an allocated kernel buffer, this type results in the least bugs, but it’s the most costly because when interacting with high speed hardware devices, constantly copying memory between user mode and kernel mode is slow.

METHOD_IN/OUT_DIRECT is more targeted for performance while still attempting to be safe. This method pre-initializes an MDL (Memory Descriptor List), which is a kernel mode reflection of a piece of memory in user mode. Using this method, copying and context switching is avoided but this can also be prone to certain types of bugs.

METHOD_NEITHER is the most flexible. It passes the user mode address of the buffer to the kernel mode driver. The driver developer determines how they want to transfer this memory and is the most difficult to correctly implement without introducing bugs.

A very handy tool capable of decoding these encoded ioctls is the osr online ioctl decoders.

Analyzing MajorFunctions

Now it’s time to explore where user input comes from in drivers. The function type of the MajorFunctions are:

NTSTATUS DriverDispatch(
  _DEVICE_OBJECT *DeviceObject,
  _IRP *Irp
)

All of these functions receive a _DEVICE_OBJECT that represents the device and an IRP (I/O Request Packet). The structure of the _IRP can be found here. It contains many members but for accessing information about the user input such as the input/output buffer sizes, the most important member is the IO_STACK_LOCATION* CurrentStackLocation. The input and out lengths of the buffers are always stored at this location.

inBufLength = CurrentStackLocation->Parameters.DeviceIoControl.InputBufferLength;

outBufLength = CurrentStackLocation->Parameters.DeviceIoControl.OutputBufferLength;

Warning:

Due to the bug mentioned earlier when working with Ghidra, Ghidra will not properly be able to find the IoStackLocation struct. Instead it is shown as offset 0x40 into the Tail parameter. Here is the original code:

PIO_STACK_LOCATION irpSp = IoGetCurrentIrpStackLocation( Irp );
ULONG IoControlCode =  irpSp->Parameters.DeviceIoControl.IoControlCode

And here is what it looks like in Ghidra:

IoCode = *(uint *)(*(longlong *)&(Irp->Tail).field_0x40 + 0x18);

Here we can see that field_0x40 is equivalent to getting the PIO_STACK_LOCATION and offset 0x18 into that pointer results in the IoCode. It is important to remember this because you will see it often.

Finding the Input

The actual member to access to reach the user buffer is different depending on the ioctl data method transfer type:

METHOD_BUFFERED:

The user data is stored in Irp->AssociatedIrp.SystemBuffer, there are no driver specific bugs, just common overflows or maybe miscalculations with userdata.

METHOD_IN/OUT_DIRECT:

The user data can be stored both in Irp->AssociatedIrp.SystemBuffer or the Irp->MdlAddress.

When using MdlAddress, the driver must call MmGetSystemAddressForMdlSafe to obtain the buffer address. MmGetSystemAddressForMdlSafe is a macro, so it’s not visible as an API in Ghidra. Instead, it can be identified with the following common pattern:

There is another type of bug that can occur when using data from MDLs. It’s known as a double-fetch, and this will be discussed later.

METHOD_NEITHER

This is the most complicated wait to retrieve and output data from/to user mode, as it is the most flexible but requires calling numerous APIs. Here is an example provided by Microsoft.

The virtual address of the User Buffer is located at CurrentStackLocation->Parameters.DeviceIoControl.Type3InputBuffer;

This type of IOCTL is not in Dot4.sys, therefore this will not be covered.

Our First Finding

Immediately opening the driver, there are several functions called in DriverEntry. Decompiling one of the functions results in the following output:

  int iVar1;
  int iVar2;
  int iVar3;
  RTL_QUERY_REGISTRY_TABLE local_b8 [2];
  
  memset(local_b8,0,0xa8);
  iVar2 = DAT_000334bc;
  iVar1 = DAT_000334b8;
  local_b8[0].Name = L"gTrace";
  local_b8[0].EntryContext = &DAT_000334b8;
  local_b8[0].DefaultData = &DAT_000334b8;
  local_b8[1].Name = L"gBreak";
  local_b8[0].DefaultType = 4;
  local_b8[0].DefaultLength = 4;
  local_b8[1].DefaultType = 4;
  local_b8[1].DefaultLength = 4;
  local_b8[1].EntryContext = &DAT_000334bc;
  local_b8[1].DefaultData = &DAT_000334bc;
  local_b8[0].Flags = 0x20;
  local_b8[1].Flags = 0x20;
  iVar3 = RtlQueryRegistryValues(0x80000001,L"Dot4",local_b8,0,0);
  if (iVar3 < 0) {
    DbgPrint("DOT4: call to RtlQueryRegistryValues failed - status = %x\n",iVar3);
  }
  else {
    if (iVar1 != DAT_000334b8) {
      DbgPrint("DOT4: gTrace changed from %x to %x\n",iVar1);
    }
    if (iVar2 != DAT_000334bc) {
      DbgPrint("DOT4: gBreak changed from %x to %x\n",iVar2);
    }
  }
  return;
}

Immediately visible in the function above are the calls to DbgPrint(), which print debug messages. It provides evidence that it changes global variables, indicated by the &DAT_ prefix in Ghidra. It uses RtlQueryRegistryValues() to query two registry keys named gTrace and gBreak, and two global variables at 0x334b8 and 0x334bc. We will refer to these global variables as gTrace and gBreak respectively.

It is important to note that the first argument of RtlQueryRegistryValues() is RTL_REGISTRY_SERVICES | RTL_REGISTRY_OPTIONAL (0x80000001)

Only admins can modify these keys, because only admins have access to the service registries, meaning that the bug we found in this fragment could not be used for local privilege escalation. However, we chose to continue exploring it as an exercise in analyzing Windows drivers.

When looking at the uses of gTrace and gBreak outside of this function, the messages indicate they are used to set breakpoints for debugging DriverEntry and to enable debug printing and logging.

gBreak

if ( (gBreak & 1) != 0 )
  {
    DbgPrint("DOT4: Breakpoint Requested - DriverEntry - gBreak=%x - pRegistryPath=<%wZ>\n", gBreak, SerivceName);
    DbgBreakPoint();
  }

gTrace

case 0x3A2006u:
      if ( (gTrace & 8) != 0 )
        result = DbgPrint("DOT4: Dispatch - IOCTL_DOT4_OPEN_CHANNEL - DevObj= %x\n", a1);
      break;

The API RtlQueryRegistryValues() is designed for retrieving multiple values from the Registry. MSDN has several remarks to make about this API:

“Caution: If you use the RTL_QUERY_REGISTRY_DIRECT flag, an untrusted user-mode application may be able to cause a buffer overflow. A buffer overflow can occur if a driver uses this flag to read a registry value to which the wrong type is assigned. In all cases, a driver that uses the RTL_QUERY_REGISTRY_DIRECT flag should additionally use the RTL_QUERY_REGISTRY_TYPECHECK flag to prevent such overflows.”

The flags for both of these fetches are 0x20 (RTL_QUERY_REGISTRY_DIRECT), and the bitflag for the RTL_QUERY_REGISTRY_TYPECHECK is #define RTL_QUERY_REGISTRY_TYPECHECK 0x00000100

What size variables are these registry checks expecting?

local_b8[0].DefaultLength = 4;

They are querying for a registry key with a 4 byte length (a DWORD), so we could possibly overflow these global variables by creating a registry key with the same name but of type REG_SZ.

The area that would be overflowed is DAT_000334b8. This would not be a stack overflow but a global buffer overflow. Global variables are typically stored alongside other global variables rather than on the stack, and as a result exploitation can be more difficult. For example, return addresses aren’t typically present so an attacker must rely on the application’s use of these variables to gain control of execution.

Testing the Bug

First, change the gTrace key at \Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\dot4 to a 40 character long string value of the letter A using REGEDIT (Registry Editor)

The changes are not immediate. We must unload and reload the driver to make sure that the entrypoint is called again. We can do this by unplugging and replugging the USB cable to the printer to which we are connected.

Now to confirm a finding ideally a VM setup would be used, but we can use WinDBG local kernel debugging to confirm our findings as all we need is data introspection.

Setting Up Debugging


In the Microsoft Store, search for WinDBG Preview, and install it.

In a administrator prompt, enable debugging by running the following command

bcdedit /debug on

and restart. Then, start debugging kernel live. There are certain constraints such as the inability to deal with breakpoints. WinDBG is limited to the actions it can retrieve from a program DUMP.

Once loaded, we attach a printer so that we can check for the presence of dot4 in the loaded modules list, using the following command:

lkd> lmDvm dot4
Browse full module list
start             end                 module name

Notice from the output that it’s not there. When this type of trouble arises, the number one thing to is to run is:

.reload

After checking .reload, the problem was fixed as seen in the output below:

lkd> lmDvm dot4
Browse full module list
start             end                 module name
fffff806`b7ed0000 fffff806`b7ef8000   Dot4       (deferred)             
    Image path: \SystemRoot\system32\DRIVERS\Dot4.sys
    Image name: Dot4.sys
    Browse all global symbols  functions  data
    Timestamp:        Mon Aug  6 19:01:00 2012 (501FF84C)
    CheckSum:         0002B6FE
    ImageSize:        00028000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
    Information from resource tables:

Debugging

Now that we’ve set up debugging, let’s calculate where the overflow would occur. Ghidra loads the driver at 

The address of the global variable gTrace overflowed is at 0x0334b8. The Relative Virtual Address from the base would be (0x334b8-0x10000) = 0x234b8.

kd> dq dot4+234b8
fffff806`b7fc34b8  00000000`00520050 ffffbb87`302c5290
fffff806`b7fc34c8  00000000`00000000 00000000`00680068
fffff806`b7fc34d8  ffff908f`ff709b10 00000000`00000000
fffff806`b7fc34e8  00000000`00000000 00000000`00000000
fffff806`b7fc34f8  00000000`00000000 00000000`00000000

Using the DQ command, we can check the qword value of the memory at dot4+234b8. The value of this variable has been set to 0x520050 (remember little endian) and gBreak is 0, but the value in front of QWORD at dot4+234b8 looks like a kernel address (visible from the prefixed fffff).

Checking the byte content (db) shows a very interesting hex dump:

lkd> db ffffbb87`302c5290
ffffbb87`302c5290  41 00 41 00 41 00 41 00-41 00 41 00 41 00 41 00  A.A.A.A.A.A.A.A.
ffffbb87`302c52a0  41 00 41 00 41 00 41 00-41 00 41 00 41 00 41 00  A.A.A.A.A.A.A.A.
ffffbb87`302c52b0  41 00 41 00 41 00 41 00-41 00 41 00 41 00 41 00  A.A.A.A.A.A.A.A.
ffffbb87`302c52c0  41 00 41 00 41 00 41 00-41 00 41 00 41 00 41 00  A.A.A.A.A.A.A.A.
ffffbb87`302c52d0  41 00 41 00 41 00 41 00-41 00 41 00 41 00 41 00  A.A.A.A.A.A.A.A.
ffffbb87`302c52e0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  .......

The above is a UNICODE_STRING, the structure in the kernel used for describing a string. The value 0x00520050 in front of the string describes two things:

  • The total length 0x0052 (82)
  • The string length 0x0050 (80)

Why are these larger than 40 letters? Because a UTF16 character occupies 2 bytes. It’s important not to forget that there’s a null byte at the end of the string (null byte is actually 2 bytes with UTF16). Meaning that the total length is 82 bytes and the actual string content length is 80 bytes

Exploitability

Following the 2 global variables comes 16 bytes of padding for alignment purposes, meaning that the only bytes we can overflow into are random bytes that do nothing, making this unexploitable.

Driver Architecture

Since the DOT4 driver is actually a default Microsoft driver, the IOCTL codes for it are open source for applications that want to implement a protocol over DOT4. In the Windows 10 SDK we were able to find all the publicly exposed IOCTLs:

#define FILE_DEVICE_DOT4         0x3a
#define IOCTL_DOT4_USER_BASE     2049
#define IOCTL_DOT4_LAST          IOCTL_DOT4_USER_BASE + 9
 
#define IOCTL_DOT4_CREATE_SOCKET                 CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  7, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_DOT4_DESTROY_SOCKET                CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  9, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_DOT4_CREATE_SOCKET CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE + 7, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_DOT4_WAIT_FOR_CHANNEL              CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  8, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_DOT4_OPEN_CHANNEL                  CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  0, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_DOT4_CLOSE_CHANNEL                 CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  1, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_DOT4_READ                          CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  2, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_DOT4_WRITE                         CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  3, METHOD_IN_DIRECT, FILE_ANY_ACCESS)
#define IOCTL_DOT4_ADD_ACTIVITY_BROADCAST        CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  4, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_DOT4_REMOVE_ACTIVITY_BROADCAST     CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  5, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_DOT4_WAIT_ACTIVITY_BROADCAST       CTL_CODE(FILE_DEVICE_DOT4, IOCTL_DOT4_USER_BASE +  6, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)

The general control flow of the drivers follows the listing below:

The diagram shows that a thread is created and then it waits for an event. A user mode program calls the driver IOCTL then the I/O manager calls the MajorFunction. The MajorFunction then inputs the IOCTL into the IRP queue and then sets the event. Setting the event tells the WorkerThread waiting for IRPs to process that there are IRPs in the IRP queue. The worker thread clears the event and then starts processing the IRPs. When the worker thread finishes processing the IRP it calls IoCompleteRequest, which signals to the I/O manager that it can now return back to user mode.

It is important to understand that these device drivers try to do everything synchronously by using queues. This design eliminates race conditions, as at no moment are two threads ever working parallel to user requests.

IoCompleteRequest is the API that every driver must call to tell the I/O Manager that processing has been completed and that the user mode program can now inspect their input/output buffers.

The Attack Surface

Out of the IOCTLs we listed before, few take user input and those that do, don’t have very complicated inputs:

  • IOCTL_DOT4_DESTROY_SOCKET takes in 1 byte, the socket/channel index, which selects the DOT4 socket/channel to destroy.
  • IOCTL_DOT4_WAIT_FOR_CHANNEL takes in 1 bytes and uses it as a channel index.
  • IOCTL_DOT4_OPEN_CHANNEL takes in 1 byte and uses it as a channel index.
  • IOCTL_DOT4_READ – takes a quantity and an offset.
  • IOCTL_DOT4_WRITE – takes a quantity and an offset.

IOCTL_DOT4_CREATE_SOCKET is the most interesting ioctl as it takes in 56 bytes in the form of a certain structure, described by the code snippet below:

typedef struct _DOT4_DC_CREATE_DATA
{
    // This or service name sent
    unsigned char bPsid;
 
    CHAR pServiceName[MAX_SERVICE_LENGTH + 1];
 
    // Type (stream or packet) of channels on socket
    unsigned char bType;
 
    // Size of read buffer for channels on socket
    ULONG ulBufferSize;
 
    USHORT usMaxHtoPPacketSize;
 
    USHORT usMaxPtoHPacketSize;
 
    // Host socket id returned
    unsigned char bHsid;
 
} DOT4_DC_CREATE_DATA, *PDOT4_DC_CREATE_DATA;

This structure serves mainly for settings options when creating a new socket and assigning a name to the socket.

Common Bugs (Specific to Windows Drivers)

Double Fetch

The reason drivers use METHOD_NEITHER/METHOD_OUT/IN_DIRECT is because it allows them to read and write data from the user mode program without copying the data over to kernel mode. This is done by creating an MDL or receiving an MDL (Irp->MdlAddress).

This means that changes to the mapped user mode memory reflect to the mapped memory in KernelMode, therefore a value can change after passing certain checks.. Let’s take a piece of example code:

    int* buffer = MmGetSystemAddressForMdlSafe( mdl, NormalPagePriority | MdlMappingNoExecute );
 
    char LocalInformation[0x100] = { 0 };
    
    // Max Size we can handle is 0x100
    if (*buffer <= 0x100) {
        memcpy(LocalInformation, buffer, *buffer);
    } else {
        FailAndErrorOut();
    }

If *buffer was a stable variable and didn’t change, then this code would be valid and bug-free, but after the check happens, if a user mode thread decides to change the value of *buffer in their address space, this change would reflect into the kernel address space and then suddenly *buffer could be much larger than 0x100 after the check, introducing memory corruption.

The way to avoid these type of bugs is to store the information in a local variable that cannot change:

            int* buffer = MmGetSystemAddressForMdlSafe( mdl, NormalPagePriority | MdlMappingNoExecute );
 
            char LocalInformation[0x100] = { 0 };
            
            int Size = *buffer;
            
            // Max Size we can handle is 100
            if (Size <= 0x100) {
                memcpy(LocalInformation, buffer, Size);
            } else {
                FailAndErrorOut();
            }

In dot4.sys, every IOCTL that uses MDLs copies over the content before using it, except for IOCTL_DOT4_READ and IOCTL_DOT4_WRITE, which use the MDLs in copy operations to READ/WRITE once, so there’s no possibility of double fetch.

IoStatus.Information

When using METHOD_BUFFERED, the IO Manager determines how much information to copy over by looking at IoStatus.information, which is set by the driver dispatch routine called. It’s important to know that a driver should fill this member not to the size of the buffer that they want to return, but to the amount of content they actual wrote into the buffer, because the IO Manager does not initialize the buffer returned and if you haven’t actually used all the buffer, you may leak information.

Other Resources About Windows Driver Exploitation

We highly recommend Ilja Van Sprundel’s amazing talk about common driver mistakes:

Possible Exploration and Future Targets

This driver must parse commands that arrive from the printer, and this seems like an interesting candidate for fuzzing and could possibly open up printer->driver RCEs/LPEs. A quick look at the strings with search for reply brings up: 

It seems that there are nine Reply packet handlers, which are not enough for fuzzing, but maybe manually auditing. This, however, is out of the scope of this guide, as it is out of reach of a user mode program.

In this analysis, we audited one of the three drivers. The other two drivers are lower-level drivers in the device stack responsible for implementing interaction with the USB port to read and write messages. A potential area for more bugs would be cancellable IRPs, though we suspect that it uses cancel-safe IRP queues, but there could be cancellable IRP problems when passing IRPs down the device stack onto lower level drivers, etc.

In this analysis we focused on specific printer drivers that were publicly distributed, but the methodology used in this blog post is applicable to most drivers and is hopefully of use to those who are planning to audit a driver. Using these strategies and methodologies, you can find driver-specific bugs and possible vulnerabilities that could open up security concerns for your infrastructure and user devices.


文章来源: https://blog.includesecurity.com/2022/08/reverse-engineering-windows-printer-drivers-part-2/
如有侵权请联系:admin#unsafe.sh