I wanna tell you a story from not too long ago, where exploiting a JSON injection vulnerability in Samsung devices could trigger an attack chain that ended up with code execution on the device.
It seems like the plot from some bad hacker movie. But it wasn’t.
And it can serve as a lesson on how to abuse APIs who blindly trust the JSON in their payloads far too much.
Let’s look into it.
You know how over the past few years IoT has been all the rage? Everything is “smart” these days. From smart light bulbs to smart TVs. Smart fridges to smart hubs. Thermostats to cameras. If it had electronics, vendors were trying to make it “smarter” with software.
The thing is, many of these devices live in constrained environments. Typically running some sort of embedded Linux to drive code on system-on-chip (SOC) hardware. Which leaves its HTTP servers (and supporting libraries like JSON parsers) somewhat constrained.
And this was the case with the Samsung Smart Hub. Their mobile app could communicate remotely with the hub and control anything connected to it.
One of the features of the hub is the ability to connect to smart cameras and process its livestreams, using the RTSP protocol. This code runs in the video-core
process, which is running as root.
By sending a malicious POST request to the /credentials
endpoint, it was possible to modify the credentials used by the hub to connect to remote servers and taint the data in a way that led to SQL injection and ultimately remote code execution.
As root.
On the central controller managing everything IoT on your network.
ugh
This endpoint had no sanitization on the parameters throughout the processing of the JSON body. Moreover, the library Samsung relied on (json-c) was compiled with JSON_TOKENER_STRICT=0
, which allows for defining strings with both single and double quotes.
That little fact would allow attackers to inject arbitrary fields to create custom columns in the hub’s internal sqlite database. You can see where this is going. It ended with stacked queries and execution of arbitrary code.
It’s a fascinating attack chain. It was actually possible to insert an overly long ROP chain into the camera
table and then send a DELETE to the /cameras
endpoint which would ultimately cause the video-core process to try to read the data and ultimately crash, leading to a traditional stack-based buffer overflow.
You can read a great writeup here from Cisco TALOS. This became CVE-2018-3879, and when chained with CVE-2018-3880, had a CVSS rating of 9.9.
So, what did we learn here?
JSON Injection → SQL Injection → Buffer Overflow → ROP = PWNED
This is just one real world example. There are lots of others. But let’s explore WHY this works, and how we can leverage this when we are attacking APIs.
JSON injection is a vulnerability that allows an attacker to insert malicious data into JSON streams, potentially altering application behavior or triggering unintended actions.
Server-side JSON injection happens when data from an untrusted source is not sanitized properly by the server and is directly or indirectly utilized by the code. Like how the Samsung Smart Hub functioned.
There are several ways to do this, like through Structured Format Injection (SFI). I won’t go into exhaustive detail as I’ve written about this before in Exploiting an API with Structured Format Injection.
What I want you to think about is WHY this happens. There is no single answer. It’s more nuanced than that.
And it’s only getting worse as APIs get written in more and more different languages.
It’s all because of inconsistencies in JSON parsers, and how they are being used.
I think it’s fair to say that JSON has become the backbone of most API communications. Its simplicity is often overlooked in threat models, because we rely and trust their structure so heavily.
Yet, in modern web applications and APIs, there may be several parsers being used within the request pipeline, each with their own quirks and vulnerabilities. Discrepancies across parsers combined with multi-stage request processing can introduce serious vulnerabilities.
But why is this?
Even in the best parser implementations, small deviations from the specs are unavoidable.
JSON parsers also face challenges, as the official JSON RFC leaves topics like duplicate keys and number representation open-ended. Despite warnings about interoperability, most users of JSON parsers are unaware of these caveats.
Let’s be honest… when was the last time you read an RFC before using a library to handle data (de)serialization?
On top of all this is the fact the official RFC isn’t the only specification. You also have ECMAScript, JSON5, HJSON, and even Binary JSON (BSON).
It can be maddening. Interoperability between parsers expose security risks that many people don’t even realize exist.
Let me show you a few examples.
There are several interoperability security issues that exist between JSON parsers. BishopFox has some excellent research on the topic that I am summarizing here. If any of these issues are of interest to you, I highly recommend you check out their work on the topic.
I want you to consider this example:
fu = {“bar”: 1, “bar”: 2}
Is the value of fu[“bar”]
equal to 1 or 2? Or will it produce an error?
According to the official specification, any of those results is perfectly acceptable. That’s a problem if you are always expecting it to work a specific kind of way.
So if you have a frontend exposed API written in Python Flask, it uses a last-key precedence, and the result is 2.
But what if that payload is forwarded to a separate microservice on the backend for further processing? Let’s say it’s written in Golang. Well, that uses first-key precedence, and the result is 1.
See the problem here?
So you need to make sure that order precedence is properly understood across code that rely on the JSON objects. Otherwise this can become a possible attack vector where you can manipulate the business logic to work in ways not expected by using duplicate keys in a specific order.
This is why during recon it is important to Detect the Programming Language of API components. You want to understand what languages are being used, and try to find out how JSON objects are being parsed.
Key collisions occur when parsers handle special characters or comments in an inconsistent manner.
For example, in Python 2.x JSON parsers act differently in how they process some Unicode.
Consider this block of JSON:
{“bar”:1,”bar\ud888”:2}
The default JSON parser will cleanly handle this as two different keys. However, the popular ujson parser in Python truncates the Unicode, sees the keys as duplicates, and accepts the last-key precedence of Python. The result? The standard JSON parser sees the value of bar as 1, while the uJSON parser sees the value of 2.
Check out the screenshot below for a live example of this in action…
While we’ve been talking about the precedence of keys, we really have been showing the issues during deserialization. The fact is, it can happen during serialization too.
Sometimes the serialization and deserialization is itself inconsistent.
Take Java’s JSON-iterator as an example…
Input:
fu = {“bar”: 1, “bar”: 2}
Output:
fu[“bar”] // 1
fu.toString() // {“bar”: 2}
So in the same parser the values of key retrieval and serialization differ. The underlying data structure retained the value of the duplicate key, but the precedence between the serializer and deserializer was inconsistent.
The JSON RFC doesn’t prevent the serialization of duplicate keys. So you have to rely on understanding the order precedence for keys in both the serialization and deserialization of data across all components.
As an example, the C++ rapidjson parser will treat the same data differently:
Input:
fu = {“bar”: 1, “bar”: 2}
Output:
fu[“bar”] // 2
fu.toString() // {“bar”: 1, “bar”: 2}
As you can see, the same JSON objects can possibly hold different results when parsed. Reserializing these objects offers no protection as the data may differ, allowing attackers to smuggle values past sanitization logic.
This can result in business logic flaws, injection vulnerabilities, or other security risks.
Hopefully you can see how it becomes possible to abuse JSON and inject data that may cause the application to behave in ways the developer didn’t expect. Outside of the Structured Format Injection I previously mentioned, when you can manipulate how data traverses the components within the API infrastructure you can start to control how logic flows.
For example, if you know the JSON objects are directly serialized to the database (think MongoDB, Couchbase, DynamoDB, CosmosDB etc) and deserialized into external components that use different parsers, there becomes an opportunity to taint the data and see how it makes it way in and out of the API.
Consider an API request flow that could look something like this:
POST /user/create HTTP/1.1
...
Content-Type: application/json
{
"user": "dana",
"role": "administrator"
}
HTTP/1.1 401 Not Authorized
...
Content-Type: application/json
{"Error": "Assignment of internal role 'administrator' is forbidden"}
So we can see that the input validation won’t let us set the role to an administrator.
Understanding how the parser handles inputs allows you to potentially bypass input sanitization by exploiting the parser’s behavior to interpret data in ways you can manipulate.
Think about the Python example from earlier. Imagine if you used Unicode characters to truncate a value in the key named “role” in an account creation object. All of a sudden “administrator\ud888
” gets parsed as “administrator
” and privesc inside the API becomes possible.
It ends up looking like this:
POST /user/create HTTP/1.1
...
Content-Type: application/json
{
"user": "dana",
"role": "administrator\ud888"
}
HTTP/1.1 200 OK
...
Content-Type: application/json
{"result": "OK: Created user ‘dana’ with the role of ‘administrator’"}
And that’s JSON injection at its finest, thanks to quirky JSON parsers.
The attack on Samsung’s Smart Hub is just one example of how JSON injection can lead to a complex chain of vulnerabilities, from SQL injection to remote code execution.
As we’ve seen, the root cause often lies in inconsistencies in how JSON parsers handle data, particularly when multiple parsers with different quirks are involved. These vulnerabilities highlight the importance of understanding the nuances of how JSON is parsed and handled across different languages and components in your API infrastructure.
By thoroughly vetting how JSON objects are serialized, deserialized, and processed, you can start to figure out how to craft payloads that can bypass sanitization filters and affect business logic.
As APIs continue to be a cornerstone of modern applications, ensuring the security of how they handle data is more critical than ever. Hopefully I’ve given you a glimpse of that today.
Inject all the things. You never know how it’ll be processed until you try. 😈
Have you joined The API Hacker Inner Circle yet? It’s my FREE weekly newsletter where I share articles like this, along with pro tips, industry insights, and community news that I don’t tend to share publicly.
If you haven’t, subscribe at https://apihacker.blog.