We all know poor input validation is a critical attack vector for exploiting software. But did you know that a data set codenamed the Big List of Naughty Strings (BLNS) takes that to an entirely different level?
Yep, it's a real thing.
Let me show you how to use these naughty strings to break APIs.
So, the Big List of Naughty Strings is an evolving list of strings that are highly likely to cause issues when used as user input data. Max Woolf has maintained it since 2015, and it now includes over 500 different strings that can potentially abuse inputs.
These strings test several conditions, including how systems interpret special characters, emojis, and Unicode. The list also attempts several forms of script injection, SQL injection, and even server-side code injection to see how the system responds.
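To give a flavor of what these look like, here are a few representative strings of my own choosing (illustrative examples, not verbatim BLNS entries). Notice how much a plain text viewer can hide:

```python
# A few representative naughty strings (illustrative, not verbatim BLNS entries)
samples = {
    "sql_injection": "' OR '1'='1' --",
    "script_injection": "<script>alert(0)</script>",
    "zero_width": "admin\u200b",       # ends with an invisible zero-width space
    "rtl_override": "\u202etxt.exe",   # right-to-left override flips display order
    "emoji": "\U0001F4A9",             # a single emoji code point, four UTF-8 bytes
}

for name, s in samples.items():
    # repr() exposes the hidden characters a plain text viewer would not show
    print(f"{name}: chars={len(s)} bytes={len(s.encode('utf-8'))} repr={s!r}")
```

Strings like the zero-width and right-to-left examples are exactly why you want an editor that renders hidden characters.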
It’s typically stored in a .txt file, which, by its very nature, may trip up your text viewer when opened. So you should use an editor that visually renders hidden Unicode and special characters.
You can find blns.txt here.
If you open that with your browser, you might notice that GitHub even warns you about hidden Unicode characters.
But here’s the thing… I don’t want you to use this version of the file.
I want you to use the version found in the FUZZING section currently maintained in Daniel Miessler’s SecLists repo called big-list-of-naughty-strings.txt.
And here’s why.
Max’s blns.txt file is fine but hasn’t been updated in years. More importantly, it includes destructive strings for things like SQL injection that will actually DROP database tables if the injection is successful.
You don’t want to destroy things when doing input validation testing.
So don’t be evil… use the SecLists version instead.
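If you do end up working from Max's original file anyway, one defensive option is to screen out obviously destructive payloads before testing. Here is a minimal sketch; the keyword list is my own illustrative guess, not an exhaustive blocklist, which is why SecLists remains the safer choice:

```python
# Screen a BLNS-style list for obviously destructive SQL payloads.
# NOTE: this keyword list is illustrative only -- it is NOT exhaustive,
# which is exactly why the curated SecLists version is the safer choice.
DESTRUCTIVE_KEYWORDS = ("drop table", "drop database", "truncate", "delete from")

def is_destructive(s: str) -> bool:
    lowered = s.lower()
    return any(k in lowered for k in DESTRUCTIVE_KEYWORDS)

strings = ["<script>alert(0)</script>", "1;DROP TABLE users", "' OR 1=1 --"]
safe = [s for s in strings if not is_destructive(s)]
print(safe)  # the DROP TABLE payload is filtered out
```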
Conducting input validation checks in the context of API security testing is easy… if you use the right tools.
For me, this is the perfect place to use Postman and its Collection Runner, especially when you use a custom collection for your security tests.
If you aren’t doing that yet, I highly recommend you check out my article on The Beginners Guide to Writing API Security Tests in Postman.
When you feed naughty strings into tools like Postman, the tools themselves can choke while processing the malicious strings. So we will want to do a bit of pre-processing to avoid tripping up our testing tools.
The Postman Collection Runner includes functionality to load/import data for each run. This data can be in CSV or JSON file format. Postman includes some good documentation on working with data files, but it fails to mention a few important things.
Things that will break Postman…
… namely, that special characters, commas, and quotes will cause issues with the data load.
And that makes sense. CSVs rely on commas to separate columns, while JSON can’t include unescaped double quotes inside string values. It’s standard stuff.
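This is exactly why Base64 helps: the encoded alphabet contains no commas, double quotes, or newlines, and the round trip is lossless. A quick sanity check:

```python
import base64

# A string that would break both CSV (comma) and JSON (quotes), plus a
# right-to-left override character for good measure
nasty = 'He said, "DROP IT" \u202e and walked away'
encoded = base64.b64encode(nasty.encode("utf-8")).decode("ascii")

# The encoded form is safe for CSV and JSON: no commas, quotes, or newlines
assert all(c not in encoded for c in ',"\n')

# And decoding recovers the exact original string
assert base64.b64decode(encoded).decode("utf-8") == nasty
print(encoded)
```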
That’s OK. We can handle this ourselves with a bit of Python code to process the BLNS file itself, encode the data, and structure it so that Postman can use it.
Max already included a script in his repo to convert the BLNS text file into JSON. He also included a shell script that Base64-encodes the naughty strings so they won’t trip up other tools.
I decided to merge the two ideas into a single Python script that also formats the JSON output in a format that Postman supports. Notice how I create a new property called “encodedNaughtyString” and place the Base64 encoded string in the value. You will need to know that later when we build a script to handle this data.
The code looks something like this:
#!/usr/bin/env python3
from argparse import ArgumentParser, Namespace
import os
import base64
import json

def main(srcFile: str, dstFile: str) -> None:
    if not os.path.isfile(srcFile):
        print('-s argument is invalid. Is it a proper BLNS txt file? Aborting!')
        return

    with open(srcFile, 'r') as f:
        # put all lines in the file into a Python list
        content = f.readlines()

    # readlines() leaves trailing newline characters; strip them out
    content = [x.strip('\n') for x in content]

    # remove empty lines and comments
    content = [x for x in content if x and not x.startswith('#')]

    # Base64 encode all content so the Postman Collection Runner parser doesn't break
    content = [base64.b64encode(x.encode('utf-8')).decode('utf-8') for x in content]

    # re-insert one empty string as a test case, since all empty lines were removed above
    content.insert(0, "")

    encodedContent: list = []
    for c in content:
        encodedContent.append({"encodedNaughtyString": c})

    with open(dstFile, 'w') as f:
        # write JSON to file; note the ensure_ascii parameter
        json.dump(encodedContent, f, indent=2, ensure_ascii=False)

if __name__ == '__main__':
    parser = ArgumentParser()
    parser.add_argument('-s', '--src', help='The source BLNS txt file to convert', type=str, required=True)
    parser.add_argument('-d', '--dst', help='The destination filename of the encoded BLNS JSON', type=str, required=True)
    args: Namespace = parser.parse_args()
    main(args.src, args.dst)
You can also directly download the Python code here.
Usage is as simple as:
./txt_to_postman_b64_json.py -s big-list-of-naughty-strings.txt -d blns.json
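Before loading the generated file into Postman, it's worth a quick sanity check that every entry decodes cleanly. Here is a small self-contained sketch that mimics the script's output format (the sample data here is made up for illustration):

```python
import base64
import json

# Mimic the JSON structure the conversion script writes out (sample data is
# illustrative, not actual BLNS content)
sample = json.dumps([
    {"encodedNaughtyString": ""},
    {"encodedNaughtyString": base64.b64encode("' OR 1=1 --".encode()).decode()},
])

# Every entry should round-trip back to a valid UTF-8 string
entries = json.loads(sample)
decoded = [base64.b64decode(e["encodedNaughtyString"]).decode("utf-8") for e in entries]
print(f"{len(decoded)} strings, first non-empty: {decoded[1]!r}")
```

Point the same loop at your real blns.json and it will raise an exception if any entry is malformed.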
Now that we have a data file that Postman can use, let me demonstrate how you can use it for your own input validation testing.
So if you have read enough of my articles, you know I like to demonstrate my API hacking on OWASP’s Completely Ridiculous API (crAPI). It would seem a great place to demo input validation testing with naughty strings would be to attack the login form.
Let’s go do that.
The first step we want to accomplish is to walk through the login process and document how the API functions. For crAPI, the API endpoint we are testing is /identity/api/auth/login.
Look for several behavioral characteristics, including:
By mapping out this behavior, we can author our tests to account for and filter out expected responses. We only want to FAIL a test when a naughty string makes the endpoint respond in ways NOT INTENDED.
In our case, here is what can be learned by interrogating the login endpoint:
That’s helpful. From this information, we can author our tests to account for these conditions and ignore them. Anything else we see will be suspect and should fail the test so we can investigate further.
As the login form has two separate fields, we should set up individual tests for each one. I recommend you group these in their own folder under your main collection. I called mine “Login Input Validation”.
In that folder, duplicate the Login endpoint from the collection holding the API docs (my docs collection was imported as “OWASP crAPI API”) and place it in your new folder. Rename the new endpoint to “Inject tainted data in email field.” Duplicate it again and rename that to “Inject tainted data in password field.”
When you are done, it should look something like this:
During the test runs with the Collection Runner, we will want to inject the raw naughty strings directly into the payload of each request. To do this, we will use a collection variable called {{naughtyString}} as a placeholder in the request body.
Go to the Variables tab on the Collection and add a variable called “naughtyString”.
If you recall, my Python script encoded the naughty strings into Base64. We will need a way to decode that back into its raw naughty string format. We can do that in a Pre-Request Script.
The code looks like this:
// Need the atob library for Base64 decoding
var b64decode = require('atob');
// Get the base64 encoded naughty string injected by the Collection Runner
var encodedString = pm.variables.get("encodedNaughtyString");
// Set the decoded value as the "naughtyString" variable in the collection
pm.collectionVariables.set("naughtyString", JSON.stringify(b64decode(encodedString)))
A quick explanation of what the code does:

- It requires the atob() function into the sandbox so we can access it for Base64 decoding.
- It gets the Base64-encoded naughty string injected by the Collection Runner and stores it in the encodedString variable. “encodedNaughtyString” is the property name we wrote out in the Python script.
- It decodes the encodedString variable and then stringifies it into a JSON element. We do this so that special characters like double quotes are properly escaped in the JSON schema structure.
- It stores the result in the collection variable {{naughtyString}} that we set in the previous step.

Head to the Body tab of the request. Update the field you want to test by inserting the collection variable {{naughtyString}}.
NOTE: The collection variable being used was already quoted when we did the JSON.stringify(). So do NOT include quotes around the variable as you typically do in the payload itself. If you forget this, tests will fail as the value will not be properly quoted/escaped.
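In other words, the raw body for the email test should look something like this; note the bare {{naughtyString}} where a quoted value would normally go (the password value here is just a placeholder, not a real credential):

```
{
    "email": {{naughtyString}},
    "password": "Placeholder123!"
}
```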
OK, now it’s time to write our tests. We have several conditions to check for. For article brevity, I will just show you the code I added for the test conditions I am aware of for the email field:
pm.test("Test tainted input on email property", function () {
    if (pm.response.code === 500) {
        pm.response.to.have.body("UserDetailsService returned null, which is an interface contract violation");
    }
    else if (pm.response.code === 400) {
        let jsonData = pm.response.json();
        let expectedResponses = ["[must not be blank]", "[size must be between 3 and 60]"];
        pm.expect(jsonData.details).to.contain.oneOf(expectedResponses);
    }
    else {
        console.warn("Unexpected response from login process in email field. Encoded injection str: " +
            pm.variables.get("encodedNaughtyString"));
        // Explicitly fail so unexpected responses surface in the run results
        pm.expect.fail("Unexpected response code: " + pm.response.code);
    }
});
One thing to note is that you don’t see me checking for a successful login. That would be done in the normal test coverage for positive testing. As we are purposely trying to cause unexpected behavior, I am looking for negative conditions outside of the norms we know about.
Almost there. Time to see what happens.
Right-click on your test scenario folder and click “Run folder”.
For now, uncheck “Inject tainted data in password field.” Later, you can go back to that and update the Pre-Request Script, Body, and Test tabs to match the conditions for testing the password field.
On the right pane, under Data, click the button to Select File. Find the generated JSON file you created with my Python script and select it. Postman will load and parse your encoded naughty strings for you.
To verify that it loaded up correctly, you should see that Postman updated the Iterations field to the number of strings it could load. You can also click the Preview button to see what it parsed.
One last thing. Check the “Persist responses for a session” option. This will let you look at the responses of each iteration (especially on failed tests) to debug how the API server responded.
It’s time. Click the orange button that says “Run Security Tests for crAPI”.
Watch what happens.
In our case, we tested over 500 different naughty strings in under 48 seconds against the email field in the login form, and crAPI handled it just fine.
Had it not, the test would have failed, and you could have immediately zoomed into it.
If you want to see what naughty string was sent in a request that caused the failure, you can click on the test and then the Request tab.
As you can see, injecting naughty strings to test input validation is relatively easy with Postman and the BLNS. You could very easily expand this to include your favorite payloads that are not on the naughty list and get a ton more input validation test coverage in no time.
Any time you see a place where you can inject data, you should consider using this approach. Need some more ideas? Maybe read my article on Attacking APIs by tainting data in weird places. This same approach could be used to tamper with headers, abuse query parameters, and taint payload data in any POST or PUT operation.
Have fun with it. Just don’t be evil.
Have you joined The API Hacker Inner Circle yet? It’s my FREE weekly newsletter where I share articles like this, along with pro tips, industry insights, and community news that I don’t tend to share publicly. If you haven’t, subscribe at https://apihacker.blog.
The post Breaking APIs with Naughty Strings appeared first on Dana Epp's Blog.