Whether you’re developing dashboards for metrics or automating repetitive, time-consuming tasks with custom scripts, scripting is an essential skill for any cybersecurity professional.
During a recent penetration testing engagement, I frequently relied on secretsdump.py from the Impacket toolkit to extract credential dumps from multiple domain controllers. This resulted in a large collection of NTLM hashes. To streamline the process, I created a Python script to parse the extracted hash files and reformat them for compatibility with Hashcat.
The script prompts for input and output file paths, making it interactive and flexible. It assumes that each line of the input file follows the secretsdump.py layout domain\username:RID:lmhash:nthash:::, where the NT hash is the fourth colon-separated field (index 3). Only the NT hash is needed for Hashcat. The script keeps values that are exactly 32 characters long, which is the length of an NTLM hash in hexadecimal format.
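To make the field layout concrete, here is a minimal sketch that splits one sample line. The account name and the second field are invented for illustration; the two hash values are the widely known placeholders for an empty LM hash and the NT hash of an empty password.

```python
# Hypothetical sample line (account name and second field are invented).
sample = "CORP\\alice:1104:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::"

parts = sample.strip().split(":")
nthash = parts[3]  # fourth field (index 3) holds the NT hash

print(parts[0])     # account name
print(nthash)       # NT hash
print(len(nthash))  # number of hex characters
```

Note that the trailing colons produce empty strings at the end of the list, which is harmless because only index 3 is read.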
I will break down the script section by section, explaining what each part does, why it’s there, and how it contributes to the overall functionality.
Prompting for User Input:
The script starts by asking for the input and output file paths. This makes the script reusable without hardcoding file names, allowing it to work with different datasets.
input_file = input("Enter the input file path (e.g., DC1-Credential-Dump.txt): ")
output_file = input("Enter the output file path (e.g., ntlmv1_nthashes.txt): ")
What it does: The input() function displays a prompt and waits for you to type a response, which is then stored in the variables input_file and output_file.
Opening Files with Context Managers:
Next, the script opens the input file for reading and the output file for writing using a with statement. This is Python’s way of handling files safely, ensuring they are closed automatically even if an error occurs.
with open(input_file, "r", encoding="utf-8", errors="ignore") as infile, open(output_file, "w") as outfile:
What it does: open(input_file, "r", encoding="utf-8", errors="ignore") opens the input file in read mode ("r"). It specifies UTF-8 encoding to handle text properly and ignores any encoding errors (e.g., invalid characters). open(output_file, "w") opens (or creates) the output file in write mode ("w"), overwriting it if it exists.
The with block manages the file handles (infile and outfile), closing them automatically at the end.
Why it’s useful: Prevents file leaks and makes the code cleaner. The errors="ignore" option is practical for messy credential dumps that might contain non-UTF-8 data.
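The effect of errors="ignore" is easy to check on a throwaway file. The sketch below writes a line containing one byte that is not valid UTF-8 and reads it back; the file contents are invented for the demonstration.

```python
import os
import tempfile

# Build a throwaway file containing one byte (0xff) that is not valid UTF-8.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"user\xff:1001:lm:nt\n")
    path = tmp.name

try:
    # errors="ignore" drops the undecodable byte instead of raising UnicodeDecodeError.
    with open(path, "r", encoding="utf-8", errors="ignore") as infile:
        first_line = infile.readline()
finally:
    os.remove(path)

print(repr(first_line))
```

The trade-off is that bytes are dropped silently, so an account name may come out slightly altered. If that matters, errors="replace" keeps a visible marker character in place of each bad byte.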
Reading and Processing Each Line:
Inside the with block, the script loops over each line in the input file. This is where the core extraction happens.
    for line in infile:
        parts = line.strip().split(":")
What it does: for line in infile iterates through the file line by line, assigning each line to the variable line. line.strip() removes leading and trailing whitespace (like newlines or spaces) from the line. .split(":") splits the cleaned line into a list of parts using : as the delimiter. For example, "user:rid:lm:nthash" becomes ["user", "rid", "lm", "nthash"].
Why it’s useful: Credential dumps use colon-separated formats. Stripping first removes the trailing newline and any stray spaces, so they do not end up attached to the first or last field.
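A quick way to see why the strip() call matters is to split the same line with and without it. The line below is a made-up example with placeholder field values rather than real hashes.

```python
raw = "user:1001:lmhash:nthash\n"  # hypothetical line as read from a file

without_strip = raw.split(":")
with_strip = raw.strip().split(":")

print(repr(without_strip[-1]))  # newline stays attached to the last field
print(repr(with_strip[-1]))     # clean field
```

When the hash is the last field on a line, the attached newline would add one character and cause a later length check to reject an otherwise valid value.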
Checking Line Structure and Extracting the Hash:
The script then verifies if the line has at least four parts (to access the NT hash) and extracts it if valid.
        if len(parts) >= 4:
            nthash = parts[3]
What it does: len(parts) >= 4 checks if the split list has at least four elements (indices 0–3). nthash = parts[3] assigns the fourth part (index 3) to nthash, assuming it’s the NTLM hash.
Why it’s useful: Skips malformed lines that might not contain a hash, preventing errors or invalid data in the output.
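The guard can be seen in action on a few invented lines: a status message, a blank line, and one well-formed entry. This is a small sketch of the same check, collecting results in a list instead of writing to a file.

```python
# Invented sample input: a status message, a blank line, and one well-formed entry.
lines = [
    "[*] status message without any hash fields\n",
    "\n",
    "alice:1104:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::\n",
]

extracted = []
for line in lines:
    parts = line.strip().split(":")
    if len(parts) >= 4:  # too-short lines never reach parts[3], so no IndexError
        extracted.append(parts[3])

print(extracted)
```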
Validating and Writing the Hash:
Finally, it checks the hash length and writes it to the output file if it matches the expected format.
            if len(nthash) == 32:  # NT hash length
                outfile.write(nthash + "\n")
What it does: len(nthash) == 32 verifies the hash is exactly 32 characters long (NTLM hashes are 128-bit MD4 digests in hex, so 32 hex chars). outfile.write(nthash + "\n") writes the hash followed by a newline to the output file.
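Why it’s useful: Ensures only values of the right length are saved, filtering out noise such as empty, truncated, or unrelated fields. The length check alone would not tell an NT hash from an LM hash, since both are 32 hex characters; it is the field position (index 3) that selects the NT hash. The newline puts one hash per line, which makes the output file easy to read or feed into tools like Hashcat.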
Comment note: The # NT hash length is an inline comment explaining the check, which is good for readability.
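The length check accepts any 32-character string, including ones that are not hexadecimal. If a stricter test is wanted, one option is a regular expression, sketched below. The helper name looks_like_nt_hash is hypothetical and not part of the original script.

```python
import re

# Exactly 32 hexadecimal characters, upper or lower case.
NT_HASH_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def looks_like_nt_hash(value):
    """Return True when value is 32 hex characters (hypothetical helper)."""
    return NT_HASH_PATTERN.fullmatch(value) is not None


print(looks_like_nt_hash("31d6cfe0d16ae931b73c59d7e0c089c0"))  # well-formed value
print(looks_like_nt_hash("z" * 32))                            # right length, not hex
```

This still cannot distinguish an NT hash from an LM hash, because the two share the same shape.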
Conclusion
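This script is a simple, efficient ETL (Extract, Transform, Load) tool for NTLM hashes: it extracts lines from the credential dump, transforms them by splitting on colons and validating the NT hash, and loads the results into a file ready for Hashcat.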
To improve it, consider adding try-except blocks for file errors, adding logging, or replacing the prompts with command-line arguments.
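As a rough sketch of those improvements, the version below swaps the prompts for argparse arguments and catches file errors. The function names, argument names, and the stricter hex check are my own choices for illustration, and the demo at the bottom runs against throwaway files with invented contents.

```python
import argparse
import os
import re
import sys
import tempfile

NT_HASH_RE = re.compile(r"[0-9a-fA-F]{32}")  # 32 hex characters


def extract_nt_hashes(input_path, output_path):
    """Write the fourth colon-separated field of each line when it looks like an NT hash."""
    count = 0
    with open(input_path, "r", encoding="utf-8", errors="ignore") as infile, \
         open(output_path, "w", encoding="utf-8") as outfile:
        for line in infile:
            parts = line.strip().split(":")
            if len(parts) >= 4 and NT_HASH_RE.fullmatch(parts[3]):
                outfile.write(parts[3] + "\n")
                count += 1
    return count


def main(argv=None):
    parser = argparse.ArgumentParser(description="Extract NT hashes from a credential dump.")
    parser.add_argument("input_file", help="path to the credential dump")
    parser.add_argument("output_file", help="path for the extracted hashes")
    args = parser.parse_args(argv)
    try:
        count = extract_nt_hashes(args.input_file, args.output_file)
    except OSError as exc:  # covers missing files, permission problems, and similar
        print(f"File error: {exc}", file=sys.stderr)
        return 1
    print(f"Wrote {count} hashes to {args.output_file}")
    return 0


# Demo on throwaway files; a real script would end with sys.exit(main()) instead.
with tempfile.TemporaryDirectory() as tmpdir:
    dump = os.path.join(tmpdir, "dump.txt")
    out = os.path.join(tmpdir, "hashes.txt")
    with open(dump, "w", encoding="utf-8") as fh:
        fh.write("alice:1104:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::\n")
        fh.write("not a credential line\n")
    exit_code = main([dump, out])
    with open(out, encoding="utf-8") as fh:
        result = fh.read().splitlines()
    missing_code = main([os.path.join(tmpdir, "missing.txt"), out])
```

Returning an exit code from main() rather than calling sys.exit() inside it keeps the function easy to test, and lets a wrapper script decide how to react to a failure.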
Links: