A bug was found somewhat by accident in the Adreno/KGSL GPU driver for Android devices. The post covers a lot of background, but the important part is that userspace can map shared memory from the CPU into the GPU and use it to pass buffers such as command buffers. Later, the GPU handle can be sent with the IOCTL_KGSL_GPU_COMMAND ioctl to process these commands. The researchers found that commands written and submitted immediately would sometimes fail or behave unexpectedly. This led them to believe the GPU was seeing a different view of the command buffer than what the CPU had written.
The Bug
Indeed, it ultimately turned out to be a cache coherency issue. The CPU's writes to the command buffer updated the CPU cache, but the backing physical memory isn’t updated until the cache is flushed. While all the CPU cores are cache coherent and see the same data, this isn’t necessarily true for ‘external’ devices such as the GPU. The bug is also extremely subtle from a code perspective: the IOCTL_KGSL_MAP_USER_MEM ioctl calls get_user_pages() on the command buffer, which in turn calls flush-related functions to flush the cache. But on AArch64, the relevant flush function is empty, effectively a NOP.
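To make the coherency problem concrete, here is a minimal toy model in Python. It is not KGSL or kernel code: the `WriteBackCache` class and `device_read` function are invented for illustration, and the model assumes a simple write-back cache in which a non-coherent device reads backing memory directly.

```python
class WriteBackCache:
    """Toy CPU cache: writes stay in the cache until explicitly flushed."""

    def __init__(self, memory: bytearray):
        self.memory = memory   # backing "physical" memory
        self.dirty = {}        # offset -> byte not yet written back

    def cpu_write(self, offset: int, data: bytes) -> None:
        # The CPU only updates its cached view.
        for i, byte in enumerate(data):
            self.dirty[offset + i] = byte

    def cpu_read(self, offset: int, length: int) -> bytes:
        # CPU reads hit the cache first, then fall back to memory.
        return bytes(
            self.dirty.get(offset + i, self.memory[offset + i])
            for i in range(length)
        )

    def flush(self) -> None:
        # Write every dirty byte back to memory.
        for offset, byte in self.dirty.items():
            self.memory[offset] = byte
        self.dirty.clear()


def device_read(memory: bytearray, offset: int, length: int) -> bytes:
    """A non-coherent device (the GPU in this story) bypasses the CPU cache."""
    return bytes(memory[offset:offset + length])


def demo():
    memory = bytearray(b"OLD!OLD!")
    cache = WriteBackCache(memory)
    cache.cpu_write(0, b"NEW!")
    cpu_view = cache.cpu_read(0, 4)      # CPU sees its own write
    stale = device_read(memory, 0, 4)    # device still sees old memory
    cache.flush()
    fresh = device_read(memory, 0, 4)    # after a real flush, both agree
    return cpu_view, stale, fresh
```

In this model, a `flush()` that silently does nothing leaves the device permanently in the `stale` state, which is the situation the empty flush function creates.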
Exploitation
The way this was exploited was quite neat. By spanning the command buffer across two pages, they could make the first page, which contains the command opcode, cache coherent while leaving the second page, which holds the data to write, incoherent. The GPU then executes a valid write command whose payload is whatever stale data sits in physical memory for the second page, so the write command can potentially be abused to leak data. Leaking user data was fairly trivial; leaking kernel data took more work.
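The two-page trick can be sketched with another toy model. Everything here is hypothetical: `OP_WRITE`, the two-byte header layout, and the helper names are made up for illustration and do not reflect real Adreno command packets. The sketch only assumes that the header on page 0 reaches memory while the payload on page 1 stays in the CPU cache.

```python
PAGE = 4096
OP_WRITE = 0x01  # hypothetical opcode for this toy model, not a real GPU packet


def build_physical_memory(secret: bytes) -> bytearray:
    """Two physical pages; page 1 still holds stale data from a previous owner."""
    mem = bytearray(2 * PAGE)
    mem[PAGE:PAGE + len(secret)] = secret
    return mem


def cpu_write_command(mem: bytearray, cpu_cache: dict, payload: bytes) -> int:
    """Place a write command so the header ends page 0 and the payload starts page 1."""
    header = bytes([OP_WRITE, len(payload)])
    start = PAGE - len(header)
    mem[start:PAGE] = header      # page 0 is coherent: the header reaches memory
    cpu_cache[PAGE] = payload     # page 1 is not: the payload stays in the cache
    return start


def gpu_execute(mem: bytearray, cmd_offset: int) -> bytes:
    """The GPU reads the command from memory and 'writes' the payload it finds."""
    opcode, length = mem[cmd_offset], mem[cmd_offset + 1]
    if opcode != OP_WRITE:
        raise ValueError("unknown opcode")
    payload_start = cmd_offset + 2
    # In the real attack this payload would land in a buffer the attacker can read.
    return bytes(mem[payload_start:payload_start + length])


def leak_demo() -> bytes:
    secret = b"stale-secret-data"
    mem = build_physical_memory(secret)
    cpu_cache = {}
    offset = cpu_write_command(mem, cpu_cache, b"A" * len(secret))
    return gpu_execute(mem, offset)
```

The CPU believes it asked the GPU to write a run of `A` bytes, but the GPU copies out the stale contents of the second page instead.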
For one thing, the kernel has its own page allocator, and the pages you could leak would need to be mapped as userspace pages with GFP_HIGHUSER and the MIGRATE_UNMOVABLE flag set, as well as not be mapped via remap_pfn_range(). Ultimately they used Asynchronous I/O (AIO) for this, as the io_setup() syscall maps such pages for the AIO ring buffers. The second problem was that the GPU driver would try to set up the memory region as a dmabuf. When it sees that the region is not anonymous and isn’t mapped to a DMA file, it fails. However, it only performs this check on the first VMA in the region, so by creating two separate but adjacent userspace mappings, you can map both of them into the GPU and have the second one slip through.
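The first-VMA-only check can be illustrated with a simplified model. This is not the driver's actual logic: the `Vma` type, its fields, and both check functions are assumptions made up for this sketch, which only contrasts validating the first mapping in a range with validating every mapping in it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vma:
    start: int
    end: int              # exclusive
    anonymous: bool
    dma_buf: bool = False


def acceptable(vma: Vma) -> bool:
    # Toy rule: anonymous memory or a dmabuf-backed mapping is fine.
    return vma.anonymous or vma.dma_buf


def covering(vmas, addr: int, length: int):
    """Return the VMAs overlapping [addr, addr + length), in address order."""
    end = addr + length
    ordered = sorted(vmas, key=lambda v: v.start)
    return [v for v in ordered if v.start < end and v.end > addr]


def check_first_only(vmas, addr: int, length: int) -> bool:
    """Flawed check: only the first VMA in the range is inspected."""
    hit = covering(vmas, addr, length)
    return bool(hit) and acceptable(hit[0])


def check_all(vmas, addr: int, length: int) -> bool:
    """Stricter check: every VMA in the range must be acceptable."""
    hit = covering(vmas, addr, length)
    return bool(hit) and all(acceptable(v) for v in hit)


# An ordinary anonymous mapping followed directly by an AIO-ring-style mapping.
anon = Vma(0x1000, 0x2000, anonymous=True)
aio_ring = Vma(0x2000, 0x3000, anonymous=False)
```

With these two adjacent mappings, `check_first_only([anon, aio_ring], 0x1000, 0x2000)` accepts the whole range, while `check_all` rejects it because of the second mapping.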
World’s worst fuzzer, leading to a traditional stack overflow in the kernel. There is really not much to say about the vulnerability: copy_from_user with no bounds check into a fixed-size buffer on the stack. The fuzzing technique was a little fun though: they just iterated over everything with write permissions and wrote random garbage until they got a crash.
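The "write garbage to everything writable" approach can be sketched in a few lines. This is a guess at the general shape, not the researchers' actual tool: the function names, the 64-byte default, and the directory walk are all assumptions. Only point it at a scratch directory or a disposable test device, since it overwrites whatever it can open.

```python
import os


def writable_files(root: str):
    """Yield every non-directory entry under root that the process may write to."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.access(path, os.W_OK):
                yield path


def fuzz_once(path: str, size: int = 64) -> bool:
    """Write `size` random bytes to path; return False if the write is refused."""
    data = os.urandom(size)
    try:
        # Unbuffered, so the bytes go straight to the file or device node.
        with open(path, "wb", buffering=0) as handle:
            handle.write(data)
    except OSError:
        return False
    return True


def fuzz_tree(root: str, size: int = 64) -> int:
    """Fuzz every writable file under root once; return how many writes succeeded."""
    return sum(1 for path in writable_files(root) if fuzz_once(path, size))
```

On a real target the loop would run until the kernel crashes, so there is no success condition beyond the device falling over.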