Readers of this blog will know that attackers are constantly finding new ways to hide their malware and avoid detection; after all, that’s what good malware does best! We have recently observed attackers leveraging both excessive amounts of unicode as well as peculiar includes and file extensions within their WordPress backdoors to conceal their malware and make it more difficult to find and detect.
In this post we’ll review what this malware does, what it looks like, and how to protect your website from this infection.
Let’s start from the beginning, shall we? In order to understand why this malware is using unicode, we must of course first understand what unicode is.
Naturally, computers and the web are not used only by English speakers with English-language keyboards; they are used by people all across the globe who speak many different languages. These languages use different scripts, letters, and symbols to write their words, and all of these need to be typed into computers in order to communicate across the web.
In the simplest terms, unicode is a system that gives a unique code to each character or symbol from all the world’s languages, allowing them to be easily and consistently displayed on the internet. This enables people from different countries and cultures to communicate and share information without language barriers.
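To make the "unique code" idea concrete, here is a minimal Python sketch. The sample characters and the helper name `code_point_label` are our own choices for illustration; the only thing it relies on is Python's built-in `ord`, which returns a character's unicode code point.

```python
# Each character maps to exactly one unicode code point, whatever the script.
# The sample characters and the helper name are illustrative choices.

def code_point_label(ch: str) -> str:
    """Return the conventional U+XXXX label for a single character."""
    return f"U+{ord(ch):04X}"

samples = ["A", "\u00e9", "\u65e5", "\u261f"]  # Latin, accented Latin, CJK, symbol

for ch in samples:
    # Show the character, its code point, and its UTF-8 byte encoding.
    print(ch, code_point_label(ch), ch.encode("utf-8"))
```

The same lookup works for any symbol an attacker pastes into a file, which is handy when trying to work out what a strange-looking character actually is.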
Unicode also includes symbols like those in the unforgettable WingDings font, initially released by Microsoft in 1990. So, in addition to international typography, unicode covers fun and whimsical symbols and smiley faces.
As an example, here’s what the phrase “Hello world!” looks like using the beloved WingDings font:
☟︎♏︎●︎●︎□︎ ⬥︎□︎❒︎●︎♎︎✏︎
Now that we’ve established what unicode is, let’s explore how it is being incorporated into malicious backdoors within compromised WordPress environments.
Recently we started seeing some very strange looking backdoors lodged inside the file structure of compromised WordPress environments:
Every single variation of these backdoors was different. Although the content, formatting, and choice of cute and funky unicode characters were unique to each sample, these are actually just a modification of malware that we have been tracking for quite a few months (the only difference being the injection of these unicode characters).
Here’s what the original (very messy) backdoor looks like:
In fact, at first glance, it seemed that each one of these backdoors was unique, with its own assortment of cute and whimsical unicode symbols, which can make detecting them rather difficult.
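Because the symbols differ from sample to sample, matching on specific characters is not much use. One rough alternative is to measure how much of a file is non-ASCII at all. The sketch below is only a heuristic of our own: the function names and the 30% threshold are assumptions, and legitimate files with a lot of non-English text (translations, for example) will also score highly, so a hit means "look closer" and nothing more.

```python
# Heuristic sketch: flag content where a large share of bytes is non-ASCII.
# The 0.30 threshold is an arbitrary assumption, not a tuned value.

def non_ascii_ratio(data: bytes) -> float:
    """Fraction of bytes that fall outside the 7-bit ASCII range."""
    if not data:
        return 0.0
    return sum(b > 0x7F for b in data) / len(data)

def looks_padded(data: bytes, threshold: float = 0.30) -> bool:
    """True when the non-ASCII share meets or exceeds the threshold."""
    return non_ascii_ratio(data) >= threshold

clean = b"<?php echo 'hello'; ?>"
padded = ("<?php /*" + "\u261f" * 50 + "*/ echo 1;").encode("utf-8")

print(looks_padded(clean), looks_padded(padded))
```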
They were most frequently found injected into index.php files in core WordPress directories:
We also observed them placed deep inside the following bogus directories and plugin files:
However, as you can tell from the screenshots, the unicode isn’t actually doing anything; it’s all commented out (grey). As it turns out, the unicode was just being used to pad the actual, functional backdoor code and fill the files with pointless rubbish to confuse whoever was trying to decipher what the code is actually doing.
So, if the unicode content is just filler to confuse and distract, what happens when we scrape it all away?
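As a rough idea of how that scraping can be done, here is a small Python sketch. It assumes the filler sits in `/* ... */` block comments and only removes comments that contain at least one non-ASCII character; the function name is ours. It is deliberately naive: it does not understand PHP strings, so it is meant for reading a sample, not for producing code you would run.

```python
import re

# Non-greedy match of a single /* ... */ block comment, across newlines.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def strip_non_ascii_comments(php_source: str) -> str:
    """Drop block comments that contain any non-ASCII character."""
    def replace(match):
        text = match.group(0)
        # Keep ordinary ASCII-only comments; remove the unicode filler.
        return "" if any(ord(c) > 127 for c in text) else text
    return BLOCK_COMMENT.sub(replace, php_source)

sample = "<?php /*\u261f\u264f\u25cf*/ $a = 1; /* keep */ /*\u2b25\u25a1*/ echo $a;"
print(strip_non_ascii_comments(sample))
```

Leaving ASCII-only comments alone is a cautious choice: it reduces the chance of mangling code where `/*` appears for some other reason.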
Once the unicode is removed and the actual, functional PHP code is formatted, we begin to see a pattern, and something which is clearly more like a traditional PHP remote execution backdoor:
All of the variations of this backdoor look just like this, albeit with variable names changed and other minor differences; the meat and potatoes are identical.
The main functionality of the backdoors is contained within an obfuscated function which differs slightly across each backdoor but is essentially the same:
Once we print out the contents of the array we can see a lot of the usual suspects:
Array
(
    [0] => create_function
    [1] => str_rot13
    [2] => json_decode
    [3] => pack
    [4] => base64_decode
    [5] => file_get_contents
    [6] => H*
    [7] => }
    [8] => /*
    [9] => ARRAY
    [10] => of
)
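For analysts who prefer to unpick these layers outside of PHP, the decoding helpers in that array have direct Python counterparts: `pack` with the `H*` format corresponds to `bytes.fromhex`, `str_rot13` to the `rot_13` codec, `base64_decode` to `base64.b64decode`, and `json_decode` to `json.loads`. The sketch below shows them chained together. Please note that the order of the layers here is purely an assumption for illustration, since it varies with how a given sample nests its calls, and the `wrap` and `peel` names and the placeholder value are ours.

```python
import base64
import codecs
import json

def pack_hex(hex_string: str) -> bytes:
    """Python counterpart of PHP's pack('H*', $hex)."""
    return bytes.fromhex(hex_string)

def str_rot13(text: str) -> str:
    """Python counterpart of PHP's str_rot13()."""
    return codecs.decode(text, "rot_13")

def peel(blob: str):
    # ASSUMED layer order, for illustration only: hex -> rot13 -> base64 -> JSON.
    rotated = pack_hex(blob).decode("ascii")
    b64_text = str_rot13(rotated)
    return json.loads(base64.b64decode(b64_text))

def wrap(value) -> str:
    # Inverse of peel(), used only to build a harmless test blob.
    b64_text = base64.b64encode(json.dumps(value).encode("utf-8")).decode("ascii")
    return codecs.encode(b64_text, "rot_13").encode("ascii").hex()

blob = wrap({"note": "harmless placeholder"})
print(blob)
print(peel(blob))
```

When working on a real sample, the layers should be peeled in the reverse of the order the PHP applies them, one step at a time, inspecting the output after each step.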
In summary, after all of the unicode content is removed and the functions are analysed, this backdoor can either execute code passed in a request cookie, or extract an encrypted remote URL from the “ARRAY” GET parameter and fetch executable code from that remote server.
Examples of such remote URLs include:
We can see an example of one of the payloads fetched from those remote URLs here:
When we decode that big chunk of base64 at the bottom of the above screen capture we are greeted with our favourite .htaccess nuisance malware which was found to be one of the most common types of malware identified in our 2022 Threat Report:
These files are closely related to the Japanese SEO spam hack, which was also one of the most common infections that we have observed over the last few years, so it seems that these unicode backdoor variants are a new tool being used by some of the most common spam threat actors on the web today.
In fact, lodged within the payload extracted from one of the malicious domains we found this very interesting snippet (translated from the original Chinese text):
Here we can see that these threat actors are actively working to remove their opponents’ malware from compromised websites, indicating a sort of “turf war” between competing groups. This is pretty interesting, but we’ll save further analysis for another time!
Although the majority of these backdoors were found lodged within the file system in index.php files, other variations of this same backdoor had an additional step of obfuscation; namely, the use of non-standard file extensions.
In some instances we located index.php files like the following example:
We can see that the index.php file, rather than having the backdoor lodged inside itself, is simply calling another file lodged within the same directory. In the above example when we remove the base64 obfuscation we get this file name: GZpETXKDJzSfHN.3gp
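Removing that layer is a one-liner in most languages. Here is a small Python sketch of the round trip. Note that the encoded string is regenerated from the file name above and is not copied from the sample itself; the helper names are ours.

```python
import base64

def encode_name(name: str) -> str:
    """Base64-encode a file name, as an attacker might before embedding it."""
    return base64.b64encode(name.encode("ascii")).decode("ascii")

def decode_name(encoded: str) -> str:
    """Recover the file name from its base64 form."""
    return base64.b64decode(encoded).decode("ascii")

file_name = "GZpETXKDJzSfHN.3gp"
encoded = encode_name(file_name)  # regenerated here, not taken from the sample

print(encoded)
print(decode_name(encoded))
```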
So while the attackers are accessing the index.php in that directory, the actual functionality of the backdoor is lodged within that other file.
The .3gp file extension is, of course, not often used in websites. It is actually a video file format used for mobile devices. We have also seen the attackers using extensions like .mkv and .mp4 and other media/video file types.
However, the content contained within them is actually just another variation of the unicode backdoors mentioned previously.
Why are they doing this? To avoid detection, of course. Quite often website security scanners will limit their scans to file extensions commonly used on websites, such as PHP, JS and HTML. Most often video/audio files are not going to contain malware, particularly not within website environments. So while the file extensions that the attackers are using do exist and are valid, the content within them is bogus. The footprint within the PHP files shown above is quite small, and it’s an attempt to evade detection by widely-used website security scanners.
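One simple way to close that gap is to check whether files with media extensions contain a PHP opening tag. The Python sketch below is a minimal illustration rather than a full scanner: the function name, the extension list (taken from the examples above) and the 64 KiB read limit are our own assumptions, and a payload hidden beyond that limit would be missed.

```python
from pathlib import Path
from typing import List

# Extensions mentioned in this post; extend as needed.
MEDIA_EXTENSIONS = {".3gp", ".mkv", ".mp4"}
PHP_OPEN_TAG = b"<?php"
READ_LIMIT = 64 * 1024  # assumption: only inspect the first 64 KiB

def find_php_in_media(root: str) -> List[Path]:
    """Return media-extension files under root that contain a PHP open tag."""
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in MEDIA_EXTENSIONS:
            with path.open("rb") as handle:
                head = handle.read(READ_LIMIT)
            if PHP_OPEN_TAG in head:
                hits.append(path)
    return sorted(hits)

if __name__ == "__main__":
    # Hypothetical usage: scan the current directory.
    for suspicious in find_php_in_media("."):
        print(suspicious)
```

A genuine video file has no reason to contain `<?php` near its start, so anything this flags deserves a manual look.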
We have seen very similar behaviour of including a payload from elsewhere in an index file in the very popular konami code backdoor.
Malware is always evolving to evade detection by security scanners, so that attackers can maintain access to compromised environments and distribute their malicious payloads to website visitors. It’s a constant battle between security researchers and attackers, and always will be.
If you’re a website owner and want to protect yourself from attacks, you can leverage the following steps to mitigate risk for your environment:
And as always, if you believe your website has been infected and you need a hand, we’re happy to help clean up a malware infection.