Lucent Grid Learning · Malware Analysis

Volume 1 of 2

Malware Analysis
Foundations & Behavioural

Everything a SOC analyst, incident responder, or detection engineer needs to triage samples, extract IOCs, perform static and dynamic analysis, hunt in memory with Volatility, write YARA rules, and map findings to ATT&CK — without requiring assembly or reverse engineering skills.

15 chapters

~4 hrs reading

No assembly required

ATT&CK mapped

Two-Volume Series

Malware analysis is one of the broadest technical disciplines in security — spanning file format internals, behavioural monitoring, memory forensics, assembly language, reverse engineering, anti-analysis techniques, and family-specific deep dives. Covering the subject with the depth it deserves in a single volume would require sacrificing either breadth or rigour. This series is therefore split into two books, each with a distinct audience and purpose.

This is Book 1. It covers the practitioner skills — triage, static analysis, dynamic behavioural analysis, memory forensics, YARA, detection engineering, and IOC extraction. Book 2 covers the specialist skills — x86/x64 assembly, Ghidra, IDA Pro, unpacking, deobfuscation, anti-analysis, and full reverse engineering of ransomware, RATs, and rootkits.

You are here — Book 1

Foundations & Behavioural Analysis

Triage · Static analysis · Dynamic analysis · Memory forensics · YARA · Detection engineering · IOC extraction · IR integration

Book 2

Reverse Engineering & Advanced Techniques

x86/x64 assembly · Ghidra · IDA Pro · Unpacking · Deobfuscation · Anti-analysis · Ransomware RE · Rootkits


Part I · Chapters 1–4

Foundations

What malware is, how to build a safe analysis environment, how to perform first triage, and the PE file format that contains almost all Windows malware

Chapter 01 · ~14 min · Foundations

What Malware Is

Taxonomy, the infection chain, the malware ecosystem, how samples reach analysts, and the mental model that drives analysis decisions

Every analysis decision — what to look for, which tool to use next, how to prioritise findings — flows from understanding what kind of malware you are looking at. A dropper behaves differently from an implant. A ransomware payload behaves differently from a stealer. The taxonomy is not academic classification for its own sake; it is the map that guides the analysis.

Definition

Malware (malicious software) is any software intentionally designed to disrupt, damage, or gain unauthorised access to a computer system. The key word is intentional — a buggy program that crashes your system is not malware. A program designed to encrypt your files and demand payment is.

The Malware Taxonomy

Type	Primary Behaviour	Key Indicators
Virus	Infects and modifies existing legitimate files to propagate	Modified file timestamps, appended code sections, checksum mismatches
Worm	Self-replicates across networks without user interaction	High outbound connection volume, scanning behaviour, dropped copies in network shares
Trojan	Masquerades as legitimate software while delivering malicious functionality	Legitimate-looking icon/metadata, suspicious imports, unexpected network connections
Dropper	Delivers and installs another malware payload; may not persist itself	Short-lived execution, file writes to temp directories, spawns child processes
Loader	Downloads and executes a next-stage payload from a remote source	Network requests to C2, memory allocation + execute pattern, no persistent file write
Stager	Minimal first-stage designed only to download a full implant	Very small size, few imports, single network request, shellcode execution pattern
Implant / RAT	Persistent remote access — keylogging, screenshot, file access, shell	Persistence mechanisms, C2 beaconing, broad capability imports
Ransomware	Encrypts victim files and demands payment for decryption key	Mass file operations, CryptEncrypt imports, vssadmin delete shadows, ransom note writes
Stealer	Exfiltrates credentials, cookies, browser data, crypto wallets	Browser database access, DPAPI calls, clipboard monitoring, data exfiltration network traffic
Rootkit	Hides presence of malware from OS and analysis tools	Hook patterns in kernel/userland, DKOM artefacts, cross-view discrepancies
Bootkit	Persists in the boot process — survives OS reinstall	MBR/VBR modifications, UEFI firmware writes, pre-OS network activity
Fileless	Executes entirely in memory — no persistent file on disk	PowerShell/WMI/registry execution, injected memory regions, no file artefacts
Wiper	Destroys data permanently — no recovery mechanism	Mass file deletion/overwrite, MBR destruction, no ransom demand

The Infection Chain

Real-world malware rarely operates as a single binary. Modern attacks use a staged delivery chain where each stage is purpose-built, small, and limited in what it does. Understanding the chain tells you where in it the sample you're analysing sits — and what to look for next.

Infection Chain · Typical staged delivery, from phishing email to implant

Stage 1: Initial Access
  Phishing email with malicious macro-enabled .docm attachment
  (also: drive-by download, supply chain, RDP brute force, vulnerable public-facing service)

Stage 2: Execution
  User opens .docm → macro executes → PowerShell download cradle
  T1059.001 PowerShell, T1566.001 Spearphishing Attachment

Stage 3: Stager / Loader
  PowerShell downloads stager.exe from attacker C2
  Stager.exe: 8KB, single import (URLDownloadToFile), downloads main payload
  T1105 Ingress Tool Transfer

Stage 4: Dropper
  Main payload (dropper.exe) decrypts embedded implant, writes to disk
  T1027 Obfuscated Files or Information

Stage 5: Persistence
  Implant writes Run key, creates scheduled task
  T1547.001 Registry Run Keys, T1053.005 Scheduled Task

Stage 6: C2 and Objectives
  Implant beacons to C2 every 60s, awaits operator commands
  Operator: credential harvest → lateral movement → ransomware deployment
  T1071.001 Web Protocols (C2), T1486 Data Encrypted for Impact

The Malware Ecosystem

Understanding who writes malware and why shapes how you approach analysis. A sample from a sophisticated APT group will have anti-analysis features, custom encryption, and living-off-the-land techniques that a commodity crimeware tool won't bother with. A ransomware-as-a-service affiliate using a purchased builder produces something structurally different from a custom implant developed over months by a nation-state team.

Crimeware / eCrime — financially motivated. Commodity tooling, RaaS (Ransomware-as-a-Service), MaaS (Malware-as-a-Service). Broad targeting, high volume, moderate sophistication. Examples: LockBit, Emotet, RedLine Stealer.
APT (Advanced Persistent Threat) — nation-state or state-sponsored. Custom tooling, targeted victims, high sophistication, long-term persistence. Examples: Cobalt Strike customised variants, SUNBURST (SolarWinds), WannaCry (attributed to Lazarus Group).
Hacktivism — ideologically motivated. Often simpler tooling, DDoS-heavy, opportunistic targeting. Wipers sometimes used for destruction rather than financial gain.
Commodity vs Custom — commodity malware is purchased or rented; custom malware is purpose-built. Commodity samples have existing detection signatures and threat intel; custom samples require full analysis.

How Samples Reach Analysts

Incident response — extracted from a compromised endpoint during an active engagement. Highest context (you know the victim, the timeline, the infection vector).
VirusTotal / MalwareBazaar — community-submitted samples. Good for finding commodity malware; limited context about original deployment.
Threat intelligence feeds — commercial (Recorded Future, CrowdStrike Intelligence) and open (MISP communities, abuse.ch). Often include context about campaign and attribution.
Sandboxes — any.run, Hybrid Analysis, Joe Sandbox — sometimes expose unpacked or deobfuscated variants that don't appear in other feeds.
Honeypots — passive capture of samples actively scanning or exploiting internet-facing systems. Useful for seeing what is actively weaponised in the wild.

Key Takeaways — Chapter 1

Malware taxonomy is not classification for its own sake — dropper vs loader vs implant tells you what to look for, what stage of the chain you're in, and what the next stage will be
Modern attacks are staged — a single binary is rarely the whole story; identifying where a sample sits in the chain determines what to analyse next
ATT&CK technique IDs appear throughout this module because they are the common language between analysis findings and detection engineering
Commodity vs APT tooling determines how much anti-analysis effort to expect — a RaaS affiliate uses a purchased builder; a nation-state team builds custom implants with months of development time

Chapter 02 · ~13 min · Foundations

Building the Analysis Environment

VM isolation and snapshot workflow, FlareVM and REMnux setup, network isolation options, safe file handling, INetSim, and anti-analysis awareness at setup time

Malware analysis on a production machine is not a calculated risk — it is an error. The consequences range from personal data theft to network-wide compromise. The analysis environment must be isolated, reversible, and configured to surface behaviour rather than hide it. Getting this right before touching a single sample is what makes everything else in this module safe to do.

Why VMs and Why Snapshots

A virtual machine provides three essential properties for malware analysis. First, isolation — the guest OS is separated from the host OS and from the network. Second, reversibility — a snapshot taken before analysis can be restored in seconds, returning the environment to a clean state regardless of what the malware did. Third, observability — VMs can be paused, their memory dumped, and their network traffic captured at the hypervisor layer without the guest OS knowing.

Analysis VM Workflow · The snapshot cycle: clean baseline, analysis, revert

Setup (one time):
  1. Install Windows 10/11 in VM
  2. Install FlareVM tooling (automated)
  3. Configure network: host-only adapter
  4. Take snapshot: "CLEAN_BASELINE"

Per-sample analysis:
  5. Restore from CLEAN_BASELINE
  6. Start monitoring tools (ProcMon, Wireshark)
  7. Transfer sample to VM via shared folder
  8. Execute sample
  9. Observe, document, collect artefacts
  10. Save artefacts to host BEFORE reverting
  11. Revert to CLEAN_BASELINE

Never: run sample on host machine
Never: use a bridged network adapter during analysis
Never: leave a compromised VM running connected to production network

FlareVM — The Windows Analysis Toolkit

FlareVM is an automated PowerShell installer, maintained by Mandiant's FLARE team (the name originally stood for FireEye Labs Advanced Reverse Engineering), that turns a clean Windows installation into a fully equipped malware analysis workstation. It installs and configures tools across every analysis category.

FlareVM · Key tools installed and their purpose

Static Analysis:
  pestudio          PE file analysis: imports, strings, indicators
  CFF Explorer      PE structure viewer and editor
  PE-bear           PE format viewer with hex editor
  Detect-It-Easy    Packer/compiler detection
  FLOSS             Obfuscated string extraction

Dynamic Analysis:
  Process Monitor   File / registry / process event logging
  Process Hacker 2  Live process inspection, memory analysis
  Autoruns          Persistence location enumeration
  Wireshark         Network traffic capture
  FakeNet-NG        Network simulation (fake DNS/HTTP/SMTP)

Memory Analysis:
  Volatility 3      Memory forensics framework
  WinPmem           Memory acquisition

Reverse Engineering (Book 2 territory):
  Ghidra            Static disassembly / decompilation
  x64dbg            Dynamic debugger
  IDA Free          Disassembler

Install: (in clean Windows VM, run as Administrator)
PS> Set-ExecutionPolicy Unrestricted -Force
PS> iex (New-Object Net.WebClient).DownloadString('https://raw.githubusercontent.com/mandiant/flare-vm/main/install.ps1')

Network Isolation Options

Network configuration is the most consequential setup decision. Different analysis scenarios require different network postures:

Mode	Configuration	What Malware Sees	Use When
No network	No network adapter or disabled	No connectivity — network-dependent malware may terminate or behave differently	Samples that should behave without C2 (ransomware file encryption stage)
Host-only	Internal VM network only; no internet routing	Can reach host and other VMs on the same host-only network; no internet	Default for most analysis — captures traffic without exposing real internet
INetSim	REMnux VM on host-only network, running INetSim or FakeNet-NG	All DNS resolves to the REMnux IP; HTTP/HTTPS/SMTP/FTP all served fake responses	When malware must receive network responses to proceed (C2 check-in, download trigger)
Isolated VLAN	Physical network segment with no route to production	Full internet simulation or real internet via isolated egress	Heavily VM-aware samples that require physical hardware or real internet responses

Safe File Handling

Several conventions prevent accidental execution of samples outside the analysis environment:

Password-protected archives: the near-universal convention is the password infected. Most malware samples shared in the community are archived with this password. A file inside a password-protected ZIP cannot be launched by an accidental double-click, because it must first be extracted with the password.
The .mal extension — renaming sample.exe to sample.exe.mal prevents accidental double-click execution on Windows. Rename back inside the analysis VM before executing.
Disable autorun — in the analysis VM, disable Windows AutoPlay/AutoRun for all drive types. A USB drive or mounted ISO with a malicious autorun.inf file should not execute automatically.
Shared folders with caution — the shared folder between host and guest VM is a potential escape path. Keep it read-only for the guest, transfer samples in one direction only (host to guest), and disable it when not actively transferring files.

Anti-Analysis at Setup Time

Many malware families detect that they are running in a virtual machine and alter their behaviour — running benign code, sleeping for extended periods, or terminating. Before taking the clean baseline snapshot, configure the VM to be less fingerprint-able:

Set a realistic VM hostname (not "DESKTOP-FLAREVM" or "SANDBOX") — common sandbox hostnames are blocklisted by malware
Create a realistic user account name and populate recent documents/browser history to simulate a real user environment
Change the screen resolution to a realistic size (1920×1080) — some malware checks for very small sandbox resolutions
Ensure the VM has at least 2 CPU cores and at least 4 GB of RAM, because checks for fewer cores or less memory are a common sandbox detection technique
Install VMware Tools or VirtualBox Guest Additions but be aware that their presence can be detected — some analysts deliberately remove or rename these components

Key Takeaways — Chapter 2

The snapshot cycle — restore clean baseline, analyse, revert — is the fundamental safety mechanism; it must be followed for every sample without exception
FlareVM automates the installation of the complete Windows analysis toolkit; take the clean baseline snapshot after FlareVM completes and before any sample touches the VM
INetSim on a REMnux VM provides fake internet services that satisfy C2 check-ins and trigger continued malware execution — essential for analysing loaders that require a network response to proceed
Anti-analysis at setup time — realistic hostname, username, screen resolution, and resources — reduces the rate at which samples detect the environment and alter their behaviour

Chapter 03 · ~14 min · Foundations T1027 · T1588

First Triage: Identifying What You Have

File type identification, hashing and VirusTotal, sandbox submission and critical reading of reports, MITRE ATT&CK mapping, and the triage decision tree

The first sixty seconds with a new sample determines how you spend the next sixty minutes. Triage is not shallow analysis — it is structured, efficient analysis aimed at answering one question as quickly as possible: what kind of problem is this, and what analysis approach does it require? Commodity malware with existing detection may need only IOC extraction; a novel sample needs full behavioural and static analysis.

File Type Identification

Never trust the file extension. Malware is routinely distributed with misleading extensions — a PDF icon and a .pdf.exe filename, a .doc file that is actually a PE binary, a .jpg that is shellcode. File type is determined by the magic bytes at the start of the file, not the extension.

Magic Bytes — Common File Type Identification

Offset   Bytes                                      ASCII    Type
0000000  4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF  MZ....   ← Windows PE (EXE/DLL)
0000000  50 4B 03 04 14 00 00 00 08 00 BE 6C 4C 52  PK....   ← ZIP archive (also DOCX/XLSX)
0000000  D0 CF 11 E0 A1 B1 1A E1 00 00 00 00 00 00  ......   ← OLE2 (DOC/XLS/PPT with macros)
0000000  25 50 44 46 2D 31 2E 34 0A 25 C4 E5 F2 E5  %PDF-1   ← PDF document
0000000  7F 45 4C 46 02 01 01 00 00 00 00 00 00 00  .ELF..   ← ELF binary (Linux)
0000000  CF FA ED FE 07 00 00 01 03 00 00 80 02 00  ....     ← Mach-O binary (macOS)
0000000  4D 53 43 46 00 00 00 00 6D 03 00 00 00 00  MSCF..   ← Cabinet file (often used in droppers)
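The magic-byte lookup can be sketched in a few lines of Python. This is a minimal illustration that covers only the signatures listed above; the names `MAGIC_SIGNATURES`, `identify_magic`, and `identify_file` are hypothetical, and a real triage tool would recognise many more formats.

```python
# Signatures copied from the magic-byte table; labels are illustrative only.
MAGIC_SIGNATURES = [
    (b"MZ", "Windows PE (EXE/DLL)"),
    (b"PK\x03\x04", "ZIP archive (also DOCX/XLSX)"),
    (bytes.fromhex("D0CF11E0A1B11AE1"), "OLE2 compound file (DOC/XLS/PPT)"),
    (b"%PDF-", "PDF document"),
    (b"\x7fELF", "ELF binary (Linux)"),
    (bytes.fromhex("CFFAEDFE"), "Mach-O binary (macOS, 64-bit)"),
    (b"MSCF", "Cabinet file"),
]


def identify_magic(header: bytes) -> str:
    """Return a label for the first signature that the header starts with."""
    for magic, label in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 16 bytes of a file and classify them, ignoring the extension."""
    with open(path, "rb") as fh:
        return identify_magic(fh.read(16))
```

Calling `identify_file("invoice.pdf.exe")` on a PE binary would report a Windows PE regardless of the filename, which is the point of checking bytes rather than extensions.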

Hashing and VirusTotal Workflow

First Triage: Hash and Lookup · Computing hashes and interpreting VirusTotal results

# Compute hashes (PowerShell in analysis VM)
# Get-FileHash takes one algorithm per call, so loop over both names
PS> 'MD5','SHA256' | ForEach-Object { Get-FileHash sample.exe -Algorithm $_ } | Format-List Algorithm, Hash
Algorithm : MD5
Hash      : A4B3C2D1E0F9A8B7C6D5E4F3A2B1C0D9
Algorithm : SHA256
Hash      : 8E3F2A1B4C5D6E7F8A9B0C1D2E3F4A5B6C7D8E9F0A1B2C3D4E5F6A7B8C9D0E1F

# Compute imphash (Python: identifies samples with the same import structure)
$ python3 -c "import pefile; pe=pefile.PE('sample.exe'); print(pe.get_imphash())"
fcf2b3a7b8d9e4f1c5a2d6e0b3f7a9c1

# VirusTotal lookup results interpretation:
Detections: 52/72 → Known malware family, high confidence
Detections: 8/72  → Recent or obfuscated sample, limited coverage; still analyse
Detections: 0/72  → Novel, custom, or very new sample; full analysis required
                    ↑ A clean VT result does NOT mean the file is safe

# ssdeep fuzzy hash: find structurally similar samples
$ ssdeep -s sample.exe
3072:BsXrYkLTmPq8wXu9JKWB1nE5oTfRz:BsXwkLTPXu9JKWBcE5oTN,sample.exe

# -m matches against a file of previously computed ssdeep hashes, not a directory,
# so hash the known samples first (known_hashes.txt is an illustrative filename)
$ ssdeep -rb known_samples/ > known_hashes.txt
$ ssdeep -bm known_hashes.txt sample.exe
sample.exe matches known_hashes.txt:Emotet_variant_2024.exe (89)
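Where PowerShell is not available (for example on a REMnux VM), the same MD5 and SHA-256 values can be computed with the Python standard library. The sketch below is one way to do it; `hash_file` and its parameters are hypothetical names, and the file is read in chunks so that large samples are not loaded into memory at once.

```python
import hashlib


def hash_file(path, algorithms=("md5", "sha256"), chunk_size=1024 * 1024):
    """Compute several digests of one file in a single pass."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

The SHA-256 value is the one to record as the IOC and to paste into a VirusTotal search; MD5 is kept only because older feeds and tools still index on it.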

Reading Sandbox Reports Critically

Automated sandbox reports are extremely useful for rapid triage — but they require critical reading. The sandbox can only report what it observed during its analysis window (typically 60–120 seconds), which may not represent the malware's full behaviour. Several factors cause sandboxes to produce incomplete reports:

Sleep bombs — the malware calls Sleep(600000) (10 minutes) before doing anything. The sandbox times out. The report shows nothing. The malware is not benign.
VM detection — the malware detects it is running in a sandbox and executes benign code only. The report shows nothing suspicious. The malware is not benign.
User interaction requirements — the malware checks for mouse movement, window focus, or specific user actions before proceeding. The sandbox environment does not simulate these. The report is incomplete.
Network dependency — the malware requires a response from its C2 before proceeding. If the sandbox cannot simulate the response, the execution chain stops early.

The Triage Decision Tree

Triage Decision · What to do with your sample based on initial findings

VirusTotal: 40+ detections AND family name consistent across engines?
  → COMMODITY: Known family. Extract IOCs. Pivot to threat intel for TTPs.
    Check for YARA rules in community repos. Confirm detection coverage. Done.

VirusTotal: Some detections (1-15) OR inconsistent family names?
  → VARIANT: Likely a known family with evasion modifications.
    Run full static + dynamic analysis. Compare to known family.
    Update or write new YARA rule for the variant.

VirusTotal: 0 detections OR very recent first-seen?
  → NOVEL / CUSTOM: Unknown or very new sample.
    Full analysis pipeline: static → dynamic → (Book 2) reverse engineering.
    Submit to sandbox. Report to threat intel team. Write YARA rule.

Packer detected AND high entropy sections?
  → PACKED: Real content hidden. Static analysis limited.
    Dynamic analysis to let packer unpack in memory.
    Memory dump at post-unpack point for secondary static analysis.
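The decision tree can also be written as a small function, which makes its branch order explicit. This is a sketch under two stated assumptions that the tree itself does not settle: the packed check is evaluated first because packing changes the analysis approach regardless of detection count, and samples with 16 to 39 detections fall back to VARIANT as the conservative choice. The function and parameter names are hypothetical.

```python
def triage(detections, family_consistent, recently_first_seen=False,
           packer_detected=False, high_entropy=False):
    """Map initial findings to one of the four triage categories."""
    # Assumption: packing is checked first, since it limits static analysis
    # whatever the detection count is.
    if packer_detected and high_entropy:
        return "PACKED"
    if detections >= 40 and family_consistent:
        return "COMMODITY"
    if 1 <= detections <= 15 or (detections > 0 and not family_consistent):
        return "VARIANT"
    if detections == 0 or recently_first_seen:
        return "NOVEL"
    # Assumption: 16-39 consistent detections are not covered by the tree;
    # treat them as a variant so they still receive full analysis.
    return "VARIANT"
```

For example, `triage(52, True)` yields COMMODITY, while `triage(0, False)` yields NOVEL and sends the sample down the full analysis pipeline.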

Key Takeaways — Chapter 3

File type is determined by magic bytes, not extension — always identify file type before analysis; a .pdf icon that is MZ (0x4D5A) is a PE binary
A VirusTotal score of 0/72 does not mean the file is clean — it means detection signatures don't yet exist; this is when full analysis is most needed
ssdeep fuzzy hashing finds structurally similar samples — a 90% match to a known Emotet variant is a strong triage signal even at a different SHA-256
Sandbox reports require critical reading — sleep bombs, VM detection, and user interaction requirements all produce incomplete reports; absence of sandbox findings is not evidence of safety
The triage decision (commodity / variant / novel / packed) determines the entire subsequent analysis approach — spend time getting this right before committing to a full analysis pipeline

Chapter 04 · ~15 min · Foundations

The PE File Format

DOS header, PE signature, COFF and Optional headers, section table, Import Address Table, resources, and reading PE structure with pestudio and CFF Explorer

Almost every piece of Windows malware is a PE (Portable Executable) file — an EXE, DLL, SYS, or one of several other variants. Every static analysis technique in the following chapters operates on the PE structure. Understanding what each field means and where it lives in the file is what makes pestudio's output intelligible rather than a list of numbers.

The PE Format

The Portable Executable format is the binary container that Windows uses for executable code. It encodes everything the Windows loader needs to map the binary into memory and execute it: where the code is, which DLLs it needs, where to find the entry point, what permissions each section requires, and where the resources live.

PE Structure Overview

PE File Structure — Annotated Header Bytes

0x000000  4D 5A                                            ← DOS Header: MZ signature
0x000002  90 00 03 00 00 00 04 00 00 00 FF FF 00 00 B8 00    (rest of DOS header)
0x00003C  E8 00 00 00                                      ← e_lfanew: offset to PE signature (0xE8)
...
0x0000E8  50 45 00 00                                      ← PE Signature: "PE\0\0"
0x0000EC  64 86                                            ← Machine: 0x8664 = AMD64 (x64)
0x0000EE  06 00                                            ← NumberOfSections: 6
0x0000F0  A4 B3 C2 65                                      ← TimeDateStamp: Unix timestamp of compile
0x000100  0B 02                                            ← Magic: 0x020B = PE32+ (64-bit)
0x000118  00 00 00 40 01 00 00 00                          ← ImageBase: 0x140000000 (default for a 64-bit EXE; 32-bit EXEs default to 0x400000)
0x000144  02 00                                            ← Subsystem: 0x0002 = Windows GUI
                                                             0x0003 = Windows CUI (console)
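The header walk annotated above (MZ signature, `e_lfanew` at 0x3C, PE signature, COFF fields, optional-header magic) can be reproduced with Python's `struct` module. The sketch below reads only those fields; `parse_pe_header` is a hypothetical helper, and for real work a full parser such as pefile is the better choice.

```python
import struct
from datetime import datetime, timezone

MACHINE_NAMES = {0x014C: "x86", 0x8664: "AMD64 (x64)"}
OPTIONAL_MAGIC = {0x010B: "PE32", 0x020B: "PE32+"}


def parse_pe_header(data: bytes) -> dict:
    """Extract a handful of header fields from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew lives at offset 0x3C of the DOS header (little-endian DWORD)
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header starts right after the 4-byte PE signature
    machine, sections, timestamp = struct.unpack_from("<HHI", data, e_lfanew + 4)
    # Optional header follows the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, e_lfanew + 24)
    return {
        "e_lfanew": e_lfanew,
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "number_of_sections": sections,
        "timestamp": timestamp,
        "compiled_utc": datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat(),
        "format": OPTIONAL_MAGIC.get(magic, hex(magic)),
    }
```

Passing the first kilobyte of a sample (`open(path, "rb").read(1024)`) is normally enough for these fields. The compile time it reports is subject to the same caveat as any PE timestamp: it is trivially forged.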

The Section Table — What Each Section Tells You

pestudio: Section Analysis · Reading sections for malware indicators

Section  VSize    RSize    Entropy  Flags  Notes
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
.text    0x12400  0x12600  5.84     RX     Normal code section
.rdata   0x04200  0x04400  4.21     R      Read-only data (imports, strings)
.data    0x00800  0x01000  2.10     RW     Initialised global variables
.rsrc    0x08000  0x08200  7.92     R      Resources. HIGH ENTROPY: may be encrypted payload
UPX0     0x1C000  0x00000  0.00     RWX    UPX packed. VSize >> RSize: decompresses at runtime
UPX1     0x0F000  0x0E800  7.98     RWX    UPX compressed code. High entropy: compressed/packed

Suspicious indicators:
• Section named UPX0/UPX1 → UPX packer detected
• RWX (Read/Write/Execute) on section → self-modifying or injecting code
• VirtualSize >> RawSize → section expands in memory: unpack target
• Entropy > 7.0 on non-.text section → packed or encrypted data
• Unusual section names (.ndata, .themida, CODE, DATA) → possible packer, protector, or installer stub (.themida is Themida, .ndata is NSIS); CODE/DATA on their own usually indicate a Borland/Delphi toolchain
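The indicator list above translates directly into a per-section check. The following sketch assumes the section attributes have already been read (for instance from pestudio's output) and are passed in as plain values; `section_indicators` is a hypothetical name, and the 4x ratio used for "VirtualSize >> RawSize" is an illustrative threshold, not a standard one.

```python
def section_indicators(name, vsize, rsize, entropy, flags):
    """Return the suspicious indicators that apply to one PE section.

    flags is a permission string such as "RX" or "RWX".
    """
    findings = []
    if name.upper().startswith("UPX"):
        findings.append("UPX section name")
    if {"R", "W", "X"} <= set(flags):
        findings.append("RWX permissions")
    # Assumption: "much larger" means empty on disk, or more than 4x the raw size
    if (rsize == 0 and vsize > 0) or (rsize > 0 and vsize > 4 * rsize):
        findings.append("VirtualSize >> RawSize")
    if entropy > 7.0 and name != ".text":
        findings.append("high entropy")
    return findings
```

Run against the rows above, `.text` produces no findings, `.rsrc` is flagged for entropy alone, and `UPX0` collects three indicators at once, which is the kind of stacking that makes a packing verdict confident.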

The Import Address Table — Capability from DLL Imports

The Import Address Table (IAT) lists every external function the binary calls, organised by DLL. This single data structure is often the most informative piece of static analysis — the imports are a capability manifest. Before running a single instruction, you can make an educated assessment of what the malware can do based purely on what it imports.

API Function	DLL	Implied Capability	Signal
CreateRemoteThread	kernel32.dll	Process injection — creates a thread in another process	🔴 High
VirtualAllocEx	kernel32.dll	Allocates memory in a remote process — injection prerequisite	🔴 High
WriteProcessMemory	kernel32.dll	Writes data into another process's memory — injection	🔴 High
CryptEncrypt / CryptGenKey	advapi32.dll	Cryptographic operations — ransomware, data obfuscation	🔴 High
IsDebuggerPresent	kernel32.dll	Anti-debugging detection	🔴 High
GetProcAddress / LoadLibraryA	kernel32.dll	Dynamic API resolution — hides true imports from static analysis	🔴 High
RegSetValueEx / RegCreateKey	advapi32.dll	Registry writes — persistence, configuration storage	🟡 Medium
CreateService / OpenSCManager	advapi32.dll	Service installation — persistence as Windows service	🟡 Medium
WNetAddConnection	mpr.dll	Network share connection — lateral movement, worm propagation	🟡 Medium
URLDownloadToFile	urlmon.dll	Downloads a file from URL — dropper/loader behaviour	🟡 Medium
ShellExecute / CreateProcess	shell32/kernel32	Executes another process — payload launch, privilege escalation	🟡 Medium
GetClipboardData	user32.dll	Clipboard monitoring — stealer behaviour	🟡 Medium
InternetOpen / InternetConnect	wininet.dll	HTTP/internet communication — C2, download	🟢 Context
FindFirstFile / FindNextFile	kernel32.dll	File system enumeration — ransomware file traversal, stealer	🟢 Context

When the Import Table Is Empty or Minimal

A packed or obfuscated binary may show very few imports — sometimes only GetProcAddress and LoadLibraryA. This is intentional: the packer stub uses these two functions to resolve everything else dynamically at runtime, hiding the real capability from static import analysis. When you see this pattern, import table analysis is limited — defer to dynamic analysis where the resolved APIs will be visible in process monitoring output.
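Both ideas in this section, imports as a capability manifest and the minimal-import pattern, can be sketched as a lookup over a list of imported function names. The mapping below is a small subset of the table, the names (`CAPABILITY_MAP`, `assess_imports`) are hypothetical, and "minimal" is defined here as "nothing but resolver APIs", which is a deliberately strict assumption.

```python
# Subset of the import table above; capability labels are illustrative.
CAPABILITY_MAP = {
    "CreateRemoteThread": "process injection",
    "VirtualAllocEx": "process injection",
    "WriteProcessMemory": "process injection",
    "CryptEncrypt": "cryptographic operations",
    "IsDebuggerPresent": "anti-debugging",
    "RegSetValueEx": "registry writes / persistence",
    "CreateService": "service installation",
    "URLDownloadToFile": "download from URL",
    "GetClipboardData": "clipboard monitoring",
}
INJECTION_TRIAD = {"CreateRemoteThread", "VirtualAllocEx", "WriteProcessMemory"}
RESOLVER_APIS = {"GetProcAddress", "LoadLibrary"}
_KNOWN = set(CAPABILITY_MAP) | RESOLVER_APIS


def normalise(name):
    """Strip the ANSI/Unicode suffix (A or W) when the base name is one we track."""
    if name[-1:] in ("A", "W") and name[:-1] in _KNOWN:
        return name[:-1]
    return name


def assess_imports(imports):
    """Summarise implied capabilities for a list of imported function names."""
    names = {normalise(n) for n in imports}
    return {
        "capabilities": sorted({CAPABILITY_MAP[n] for n in names if n in CAPABILITY_MAP}),
        "injection_triad": INJECTION_TRIAD <= names,
        # Assumption: "minimal IAT" means every import is a resolver API
        "minimal_iat": bool(names) and names <= RESOLVER_APIS,
    }
```

A result with `minimal_iat` set and no capabilities is not a clean bill of health; it is the signal to move to dynamic analysis.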

Key Takeaways — Chapter 4

The PE section table reveals packing (UPX section names, high entropy, RWX permissions, VirtualSize significantly larger than RawSize) before a single byte of code is executed
The Import Address Table is a capability manifest — CreateRemoteThread + VirtualAllocEx + WriteProcessMemory in the same binary is a near-certain injection indicator
A minimal IAT containing only GetProcAddress and LoadLibraryA signals dynamic import resolution — the packer is hiding the real capability; defer capability assessment to dynamic analysis
The compile timestamp in the PE header can indicate when malware was built — but it is trivially faked; treat it as a soft indicator, not ground truth
RWX section permissions are unusual and suspicious — legitimate software rarely needs sections that are simultaneously readable, writable, and executable

Part II · Chapters 5–7

Static Analysis

Extracting intelligence from a binary without executing it — strings, imports, entropy, packing detection, obfuscation recognition, and code signing

Chapter 05 · ~15 min · Static Analysis T1027 · T1140

Strings and Import Analysis

strings vs FLOSS, what strings reveal, stack strings and obfuscated strings, import capability mapping, and correlating indicators into a capability hypothesis

Strings extraction is the highest-value-per-effort technique in static malware analysis. A single run of the right tool against an unpacked sample can surface C2 domains, file paths, registry keys, mutex names, PDB paths, encryption keys, and error messages that define what the malware does before a single instruction executes. The challenge is that modern malware increasingly obfuscates its strings — and the standard strings tool misses the obfuscated ones.

strings vs FLOSS

The standard strings utility extracts sequences of printable ASCII or Unicode characters of a minimum length. This finds plaintext strings embedded directly in the binary. It misses strings that are built on the stack character by character (stack strings), strings that are XOR-encoded, strings that are base64-encoded, and strings that are concatenated in small pieces. FLOSS (FLARE Obfuscated String Solver) targets most of these cases by emulating the code that builds or decodes each string and capturing the result.

strings vs FLOSS · What each tool finds, and what FLOSS finds that strings misses

$ strings sample.exe | grep -E "http|\.com|\.exe|HKEY|\\\\Software"
http://update-service.net/check
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run
C:\Users\Public\svchost32.exe
... (only finds plaintext strings)

# FLOSS 2.x syntax: "--no static" suppresses the plain static strings
$ floss --no static -- sample.exe
FLOSS DECODED STRINGS:
  http://update-service.net/check            ← also found by strings
  Mozilla/5.0 (Windows NT 10.0; Win64; x64)
FLOSS STACK STRINGS (built char-by-char at runtime):
  CreateMutexA                               ← API name built on stack to hide from IAT
  Global\\MicrosoftUpdate                    ← mutex name: unique sample identifier
  192.168.4.22                               ← hardcoded C2 IP, never in static strings
FLOSS TIGHT STRINGS (loop-decoded):
  cmd.exe /c whoami > C:\Users\Public\out.txt
  powershell.exe -enc SQBFAFgA...
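To make clear what the plain strings pass does and does not see, here is a minimal Python equivalent. It is a sketch, not a replacement for the real tool: `extract_strings` is a hypothetical name, and it finds only contiguous printable ASCII runs and their UTF-16LE counterparts, which is exactly why stack-built and encoded strings escape it.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes (0x20-0x7E)
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as UTF-16LE: each followed by a zero byte
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_re.finditer(data)]
    return found
```

A string assembled one byte at a time by MOV instructions never appears as a contiguous run, so neither pattern matches it; that gap is what FLOSS's emulation is meant to close.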

What Strings Reveal — The Intelligence Categories

String Type	Examples	Intelligence Value
C2 Infrastructure	`update-cdn.net/api/v2/check`, `185.220.101.45`	Network IOCs — block at firewall, hunt in DNS/proxy logs
File system artefacts	`C:\ProgramData\svchost32.exe`, `%TEMP%\~tmp.bat`	File IOCs — hunt on endpoints, forensic artefact locations
Registry keys	`HKCU\Software\Configs\uuid`, `HKLM\SYSTEM\CurrentControlSet\Services\SvcName`	Persistence mechanism, configuration storage location
Mutex names	`Global\\MicrosoftUpdate_v2`, `LocalXPMutex`	Unique sample/family identifier — hunt on live systems, prevents re-infection
PDB paths	`C:\Users\developer\Documents\project\Release\payload.pdb`	Development artefact — username, directory structure, project name of author
Error messages	`Failed to connect: %s`, `Injection error code: %d`	Reveals internal operation names and functionality
Commands	`vssadmin delete shadows /all /quiet`, `net stop "Windows Defender"`	Direct ATT&CK technique identification
Encoded data	Long base64 strings, hex-encoded payloads	May contain embedded next-stage payload — extract and analyse separately
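Once strings are extracted, sorting them into the categories above is largely pattern matching. The sketch below covers four of the categories with deliberately simple regular expressions; the pattern set and the `extract_iocs` name are illustrative assumptions, and the patterns will miss paths containing spaces and will not validate that a match is genuinely malicious.

```python
import re

# Illustrative patterns only; tune them against real output before relying on them.
IOC_PATTERNS = {
    "url": re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE),
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "registry": re.compile(r"\b(?:HKEY_[A-Z_]+|HKLM|HKCU)\\[^\s\"']+"),
    "windows_path": re.compile(r"(?:[A-Za-z]:\\|%[A-Za-z]+%\\)[^\s\"'<>|*?]+"),
}


def extract_iocs(strings):
    """Group candidate IOCs found in an iterable of extracted strings."""
    result = {kind: set() for kind in IOC_PATTERNS}
    for text in strings:
        for kind, pattern in IOC_PATTERNS.items():
            result[kind].update(m.group() for m in pattern.finditer(text))
    return result
```

The output is a starting list for hunting and blocking, not a finished IOC set: each candidate still needs an analyst's judgement about whether it belongs to the malware or to a legitimate library linked into it.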

Reading Obfuscated Strings — What FLOSS Reconstructs

Stack strings appear as isolated single characters in the raw binary. They are invisible to strings but FLOSS executes the relevant code fragments in an emulator and captures the fully-formed string after the stack-building loop completes. The hex dump below shows what the raw binary looks like for a stack-built API name — each byte loaded individually with a MOV instruction — versus the same string stored plaintext.

Stack String vs Plaintext String — Raw Binary Bytes

Plaintext "CreateRemoteThread" in .rdata (visible to strings):
0x00403020  43 72 65 61 74 65 52 65 6D 6F 74 65 54 68 72 65   CreateRemoteThre   ← contiguous bytes,
0x00403030  61 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ad..............     found by strings

Stack string "cmd.exe" built byte-by-byte in .text (invisible to strings):
0x00401080  C6 45 F8 63   ← MOV [ebp-0x8], 'c'
0x00401084  C6 45 F9 6D   ← MOV [ebp-0x7], 'm'
0x00401088  C6 45 FA 64   ← MOV [ebp-0x6], 'd'
0x0040108C  C6 45 FB 2E   ← MOV [ebp-0x5], '.'
0x00401090  C6 45 FC 65   ← MOV [ebp-0x4], 'e'
0x00401094  C6 45 FD 78   ← MOV [ebp-0x3], 'x'
0x00401098  C6 45 FE 65   ← MOV [ebp-0x2], 'e'

FLOSS reconstructs: "cmd.exe"
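For the specific encoding shown in the dump (`C6 45 disp8 imm8`, a one-byte store relative to EBP), the string can be recovered by pattern matching alone. The sketch below does exactly that and nothing more. It is not how FLOSS works, since FLOSS emulates the code and therefore handles other instruction forms and registers; `recover_stack_strings` is a hypothetical helper for illustration.

```python
import re

# Four or more consecutive "MOV BYTE PTR [ebp-disp8], imm8" instructions
# whose immediate is a printable ASCII character.
_MOV_RUN = re.compile(rb"(?:\xC6\x45[\x80-\xFF][\x20-\x7E]){4,}")


def recover_stack_strings(code: bytes):
    """Rebuild strings from runs of single-byte EBP-relative stores."""
    strings = []
    for run in _MOV_RUN.finditer(code):
        raw = run.group()
        writes = {}
        for i in range(0, len(raw), 4):
            displacement = raw[i + 2] - 0x100  # signed 8-bit, negative here
            writes[displacement] = chr(raw[i + 3])
        # Order by stack offset rather than instruction order
        strings.append("".join(writes[d] for d in sorted(writes)))
    return strings
```

Sorting by displacement rather than by instruction order matters because compilers and obfuscators are free to emit the stores in any sequence.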

Building the Capability Hypothesis

The output of strings analysis and import analysis should be combined into a capability hypothesis before moving to dynamic analysis. This hypothesis directs what to watch for in ProcMon and Wireshark — if your static analysis suggests persistence via Run key and C2 via HTTP, you know exactly which ProcMon filters and Wireshark display filters to configure.

Capability Hypothesis · Combining strings + imports into an analysis prediction

Static findings from sample.exe:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Strings found:
  http://update-service.net/api/check
  HKCU\Software\Microsoft\Windows\CurrentVersion\Run
  C:\ProgramData\MicrosoftUpdate\svchost32.exe
  Global\\MicrosoftUpdateMutex
  Mozilla/5.0 (Windows NT 10.0)

Imports:
  RegSetValueEx, RegCreateKeyEx       ← registry persistence
  InternetOpenUrl, InternetReadFile   ← HTTP comms
  CopyFile, CreateProcess             ← self-copy + execution
  IsDebuggerPresent                   ← anti-debug

Capability Hypothesis:
  1. ANTI-DEBUG check on startup (T1622)
  2. SELF-COPY to C:\ProgramData\MicrosoftUpdate\svchost32.exe (T1036.005)
  3. PERSISTENCE via HKCU Run key pointing to copied binary (T1547.001)
  4. C2 CHECK-IN via HTTP GET to update-service.net/api/check (T1071.001)
  5. Likely DOWNLOADER or simple RAT based on limited import set

Dynamic analysis watchlist:
  ProcMon:  filter to svchost32.exe process
  Registry: watch for HKCU\...\Run key writes
  Files:    watch for writes to C:\ProgramData\MicrosoftUpdate\
  Network:  Wireshark filter: http.host contains "update-service"

Key Takeaways — Chapter 5

FLOSS finds obfuscated strings that strings misses — stack strings, XOR-encoded strings, and tight loop decoded strings; always run both
Mutex names are among the most unique per-sample indicators — they can be used to detect presence of the malware on live systems and to cluster related samples in threat intel
PDB paths are development artefacts that frequently survive into released malware — they can reveal developer usernames, directory structures, and sometimes project names
Combine strings and import findings into a capability hypothesis before running dynamic analysis — this directs tool configuration and makes dynamic analysis significantly more efficient
Long base64 strings in static analysis may be embedded next-stage payloads — decode them and treat the decoded output as a separate sample for analysis

Chapter 06 · ~14 min · Static Analysis T1027 · T1027.002

Entropy, Packing, and Obfuscation Detection

Shannon entropy, what high-entropy sections reveal, packer detection with Detect-It-Easy, manual UPX unpacking, dynamic API resolution and API hashing

Most malware distributed in the wild is packed or obfuscated. Packing serves two purposes: it reduces file size (the original purpose, now irrelevant) and it defeats signature-based detection and static analysis (the current purpose). Understanding what packing looks like — in entropy measurements, in section characteristics, in the import table — tells you when your static analysis is working on a stub rather than the real code, and what to do about it.

Shannon Entropy

Shannon entropy measures the information density (randomness) of data on a scale from 0 to 8 bits per byte. Plaintext English text has entropy around 3.5–4.5. Compiled code has entropy around 5.5–6.5. Compressed or encrypted data has entropy approaching 7.5–8.0 — because compression and encryption both produce output that is nearly indistinguishable from random bytes.

Entropy Analysis: pestudio and Python — interpreting entropy per section

Entropy scale reference:
  0.0 - 3.5 → Very low: likely zeroes or repeated patterns
  3.5 - 6.5 → Normal: English text, compiled code, typical data
  6.5 - 7.2 → Elevated: compressed data, some encryption
  7.2 - 8.0 → High: packed/encrypted/compressed — strongly suggests packing

Python entropy calculation for each section:
$ python3 -c "
import pefile, math, collections
pe = pefile.PE('sample.exe')
for s in pe.sections:
    data = s.get_data()
    cnt = collections.Counter(data)
    entropy = -sum(c/len(data) * math.log2(c/len(data)) for c in cnt.values())
    # Section names are NUL-padded to 8 bytes; strip the padding before printing
    name = s.Name.rstrip(b'\x00').decode(errors='replace')
    print(f'{name:12} entropy={entropy:.2f}')
"
.text        entropy=5.92   ← normal compiled code
.rdata       entropy=4.18   ← normal read-only data
.data        entropy=2.44   ← normal initialised data
UPX0         entropy=7.98   ← packed: essentially random bytes
UPX1         entropy=7.95   ← packed
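To make the 0 to 8 scale concrete, here is a small stdlib-only sketch that computes byte entropy and applies the bands from the reference table above. The function names and the band labels are illustrative assumptions; the thresholds are the ones quoted in the table.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: -sum(p * log2(p)) over byte frequencies."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values()) + 0.0


def classify(entropy: float) -> str:
    """Map an entropy value onto the bands used in the reference table."""
    if entropy < 3.5:
        return "very low"
    if entropy < 6.5:
        return "normal"
    if entropy < 7.2:
        return "elevated"
    return "high"


samples = {
    "all zero bytes": bytes(1024),            # one symbol: no information
    "two symbols": b"ab" * 512,               # one bit per byte
    "every byte value": bytes(range(256)) * 4,  # uniform: maximum entropy
}
for label, blob in samples.items():
    e = shannon_entropy(blob)
    print(f"{label:18} {e:.2f}  {classify(e)}")
```

The three synthetic inputs mark the ends and a midpoint of the scale: a single repeated byte carries no information, two equally likely bytes carry one bit each, and a uniform distribution over all 256 values reaches the maximum of 8.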

Packer Detection with Detect-It-Easy

Detect-It-Easy (DIE): Identifying packers, compilers, and protectors

(diec is the console build of Detect-It-Easy; die launches the GUI)

$ diec sample1.exe
  Packer: UPX(3.96)[NRV,best]          ← UPX 3.96 with NRV compression
  PE: 64-bit

$ diec sample2.exe
  Protector: Themida/WinLicense(-)[-]
  PE: 32-bit, Visual C++ 2019
  ← Commercial protector — requires dynamic analysis / debug to unpack

$ diec sample3.exe
  Packer: Custom(-)[-]                 ← Unknown/custom packer
  Compiler: MASM32
  ← Custom packer — requires dynamic analysis, dump at OEP (Book 2)

$ diec clean.exe
  Compiler: Microsoft Visual C++ 2022 v17.x
  Linker: Microsoft Linker v14.x
  ← No packer detected — static analysis on the real code

Manual UPX Unpacking

UPX is by far the most common packer encountered in commodity malware. It is trivial to unpack — the UPX tool itself can unpack a UPX-packed binary in a single command. Custom and commercial packers require the dynamic unpacking techniques covered in Book 2.

UPX Unpacking: Unpacking a UPX-compressed binary and comparing before/after

$ upx -d sample.exe -o unpacked.exe
                Ultimate Packer for eXecutables
    File size         Ratio      Format      Name
--------------------  ------  -----------  -----------
    81920 <-    42496  51.87%   win64/pe    unpacked.exe
Unpacked 1 file.

Compare import tables before and after:
  Before (packed):  2 imports — GetProcAddress, LoadLibraryA
  After (unpacked): 47 imports visible including:
    CreateRemoteThread, VirtualAllocEx, WriteProcessMemory
    InternetOpenUrlA, InternetReadFile
    RegSetValueExA, CryptEncrypt, IsDebuggerPresent

Note: if upx -d fails with a "not packed by UPX" error on a file that DIE identifies as UPX, the UPX headers or section names have probably been altered to defeat the stock unpacker. Fall back to dynamic unpacking.

Dynamic API Resolution — Hiding Imports

Even without packing, malware can hide its true capabilities by resolving API functions at runtime rather than declaring them in the import table. Two common techniques: using GetProcAddress to look up functions by name at runtime, and API hashing — looking up functions by a computed hash of their name so even the string "CreateRemoteThread" doesn't appear in the binary.

Dynamic API Resolution Pattern: How malware hides capabilities from static import analysis

Legitimate code (static import — visible in IAT):
  #include <windows.h>
  CreateRemoteThread(hProcess, ...);   // visible in imports

Malware pattern 1 — GetProcAddress resolution:
  typedef HANDLE (WINAPI *CreateRemoteThread_t)(HANDLE, LPSECURITY_ATTRIBUTES, SIZE_T,
                                                LPTHREAD_START_ROUTINE, LPVOID, DWORD, LPDWORD);
  HMODULE hK32 = LoadLibraryA("kernel32.dll");
  FARPROC pCRT = GetProcAddress(hK32, "CreateRemoteThread");
  // The name string is still in the binary: strings/FLOSS will find it
  ((CreateRemoteThread_t)pCRT)(hProcess, ...);

Malware pattern 2 — API hashing (ROR13 hash):
  DWORD hash = 0x72E2CAE7;             // stands in for ROR13("CreateRemoteThread")
  FARPROC fn = resolve_by_hash(hash);  // resolve_by_hash walks the export table comparing hashes
  // No string in binary at all — FLOSS won't find "CreateRemoteThread"
  // Requires dynamic analysis or RE to identify (Book 2)

Detection in FLOSS output: look for API names appearing as stack strings
Detection in dynamic analysis: ProcMon still records the resulting file, registry, process, and network activity however the API was resolved; an API monitor or debugger is needed to see the calls themselves
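The hashing side of pattern 2 is short enough to sketch. The version below is the common textbook ROR13 variant (rotate the 32-bit accumulator right by 13, then add each ASCII byte of the name); it is an assumption that a given sample uses exactly this variant, since real loaders differ in seed, case folding, and whether the module name or terminator is hashed. The helper names are illustrative.

```python
def ror32(value: int, bits: int) -> int:
    """Rotate a 32-bit value right by `bits`."""
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def ror13_hash(name: str) -> int:
    """Textbook ROR13: rotate, then add each character of the API name."""
    h = 0
    for ch in name.encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h


def build_lookup(api_names):
    """Precompute hash -> name so constants found in a binary can be reversed."""
    return {ror13_hash(n): n for n in api_names}


# A small candidate list; in practice this would be every export of the
# DLLs the sample is likely to use.
lookup = build_lookup(["LoadLibraryA", "GetProcAddress",
                       "VirtualAllocEx", "WriteProcessMemory",
                       "CreateRemoteThread"])

constant_from_binary = ror13_hash("CreateRemoteThread")  # stand-in for a value seen in code
print(hex(constant_from_binary), "->", lookup.get(constant_from_binary))
```

The defender's trick is the lookup table: hashes cannot be inverted, but the set of exported API names is small enough to hash exhaustively and match against constants seen in the disassembly.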

Key Takeaways — Chapter 6

Entropy above 7.2 on any section strongly suggests packing or encryption — static analysis on such a section analyses the wrapper, not the actual malware code
Detect-It-Easy identifies the specific packer or compiler — this determines the unpacking approach: UPX is trivially unpacked with upx -d; commercial protectors and custom packers require dynamic techniques
After UPX unpacking, the real import table becomes visible — often revealing injection, cryptographic, and network capabilities that were hidden by the two-function stub
Dynamic API resolution via GetProcAddress still leaves the API name string in the binary (findable by FLOSS); API hashing leaves nothing readable — deferred to Book 2 reverse engineering
The presence of GetProcAddress and LoadLibraryA as the only imports is itself an indicator — it means the binary resolves everything else dynamically, and the real import table is hidden

Chapter 07 · ~15 min · Static Analysis

YARA Rule Writing

YARA syntax, writing rules from static indicators, the PE module, entropy and math modules, community rule sets, testing and false positive management

YARA is the universal pattern-matching language for malware analysis. A YARA rule encodes what you know about a malware sample — its strings, byte patterns, structural characteristics — into a portable, shareable, deployable detection artefact. Writing YARA rules is the bridge between analysis and detection: the findings from static analysis become the rules that hunt for the same malware across an entire environment.

YARA Rule Structure

A YARA rule has three sections: meta (descriptive information — name, author, date, description), strings (the patterns to search for — text, hex, or regex), and condition (the logical expression that determines a match — must all strings be present? any of them? a specific combination?).

Basic Rule Syntax

YARA Rule: Basic rule structure — text, hex, and regex patterns

rule MalwareFamily_Loader_v2
{
    meta:
        author      = "Lucent Grid Research"
        date        = "2024-03-15"
        description = "Detects MalwareFamily loader based on C2 URI and mutex"
        hash        = "8e3f2a1b4c5d6e7f8a9b0c1d2e3f4a5b"
        tlp         = "TLP:WHITE"

    strings:
        // Text strings — ASCII by default
        $c2_uri    = "/api/v2/check" ascii nocase
        $mutex     = "Global\\MicrosoftUpdateMutex" wide ascii
        $useragent = "Mozilla/5.0 (Windows NT 10.0; Win64" ascii

        // Hex pattern — VirtualAllocEx + WriteProcessMemory sequence
        $inject_seq = { 48 8B C? 4C 8B [1-4] E8 ?? ?? ?? ?? 48 85 C0 74 }

        // Regex — matches any IPv4 address hardcoded in the binary
        // (YARA regexes have no non-capturing groups, so plain groups are used)
        $ip_pattern = /\b(25[0-5]|2[0-4]\d|[01]?\d\d?)(\.(25[0-5]|2[0-4]\d|[01]?\d\d?)){3}\b/

    condition:
        // File must be a PE AND contain the C2 URI AND the user agent
        // AND at least one of (mutex, injection sequence, hardcoded IPv4).
        // Every declared string must be referenced or YARA refuses to compile;
        // $ip_pattern is the weakest of the three alternatives.
        uint16(0) == 0x5A4D    // MZ header check
        and $c2_uri
        and $useragent
        and ($mutex or $inject_seq or $ip_pattern)
}

The PE Module — Structure-Aware Rules

YARA Rule: PE module — import-based and section-based detection

import "pe"
import "math"    // imports belong at the top of the file, not inside a rule

rule Injector_PE_Indicators
{
    meta:
        description = "Process injector — imports CreateRemoteThread + VirtualAllocEx"

    condition:
        uint16(0) == 0x5A4D
        // Has both injection APIs imported
        and pe.imports("kernel32.dll", "CreateRemoteThread")
        and pe.imports("kernel32.dll", "VirtualAllocEx")
        // Compiled after 2021-12-20 (Unix time 1640000000): adjust the cutoff as needed
        and pe.timestamp > 1640000000
        // Has a section that is both writable and executable (suspicious)
        and for any s in pe.sections : (
            (s.characteristics & pe.SECTION_MEM_EXECUTE) != 0
            and (s.characteristics & pe.SECTION_MEM_WRITE) != 0
        )
}

rule Packed_High_Entropy
{
    meta:
        description = "Detects PE with high entropy non-text section — likely packed"

    condition:
        uint16(0) == 0x5A4D
        // Any section larger than 4 KB with entropy > 7.2 that isn't .text
        and for any s in pe.sections : (
            not (s.name == ".text")
            and s.raw_data_size > 0x1000
            and math.entropy(s.raw_data_offset, s.raw_data_size) > 7.2
        )
}

Testing and False Positive Management

YARA Testing: Testing rules against malware samples and clean files

# Test rule against a single sample
$ yara rules/malwarefamily.yar samples/sample.exe
MalwareFamily_Loader_v2 samples/sample.exe

# Test against entire sample directory and count hits per rule.
# yara takes rule files, not a directory; index.yar here is a file that
# includes each rule file. The first output column is the rule name.
$ yara -r rules/index.yar samples/malware/ 2>/dev/null | awk '{print $1}' | sort | uniq -c | sort -rn
     42 MalwareFamily_Loader_v2
     18 Injector_PE_Indicators
      3 Packed_High_Entropy

# Test against clean Windows system files (false positive check)
$ yara -r rules/malwarefamily.yar C:\Windows\System32\ 2>/dev/null
(no output)    ← no false positives on clean system files

# Bad sign — too many clean matches:
$ yara -r rules/injector.yar C:\Windows\System32\
Injector_PE_Indicators C:\Windows\System32\notepad.exe
Injector_PE_Indicators C:\Windows\System32\mspaint.exe
...
(rule is too broad — needs tightening)
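When the clean-file check is run regularly, it helps to summarise the output rather than read it. The sketch below assumes yara's default output format of one `RuleName path` pair per line; `count_matches` and the `max_clean_hits` threshold are illustrative, not a yara feature.

```python
from collections import Counter


def count_matches(yara_output: str) -> Counter:
    """Count hits per rule from default `yara` output (one 'rule path' per line)."""
    counts = Counter()
    for line in yara_output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            counts[parts[0]] += 1
    return counts


def too_broad(clean_counts: Counter, max_clean_hits: int = 0):
    """Rules that fire on a known-clean corpus more often than allowed."""
    return sorted(r for r, n in clean_counts.items() if n > max_clean_hits)


clean_run = """\
Injector_PE_Indicators C:\\Windows\\System32\\notepad.exe
Injector_PE_Indicators C:\\Windows\\System32\\mspaint.exe
"""
print(too_broad(count_matches(clean_run)))
```

A threshold of zero is the strict default for a clean corpus; loosen it only for rules that are known to be hunting-grade rather than alerting-grade.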

Community Rule Sets

YARA-Forge — automatically compiled community rules from multiple sources, deduplicated and tested. Best starting point for operational deployment.
Neo23x0 (Florian Roth) — extensive, actively maintained rule sets covering hundreds of malware families. High quality and widely used.
Mandiant/FLARE rules — from the same team that wrote FLOSS. Technically rigorous rules focused on advanced threats.
abuse.ch / YARAhub — community-contributed rules, often tied to current threat intelligence.

Key Takeaways — Chapter 7

YARA conditions should combine multiple indicators — a rule that fires on a single common string will have high false positives; combining C2 URI + mutex + structural indicator dramatically reduces them
The PE module enables structure-aware rules — import-based detection that fires on the capability signature rather than string content is harder to evade by recompilation
Always test rules against clean system files before deployment — a rule that matches notepad.exe will generate thousands of false positives in production
The uint16(0) == 0x5A4D condition (MZ header check) should be in almost every Windows malware rule — it restricts the rule to PE files and eliminates a large class of false positives
YARA-Forge and Neo23x0 provide production-quality community rules — extend and adapt them for variants rather than writing from scratch for every known family

Go Deeper

YARA official documentation and source
YARA-Forge — curated community rule sets
Neo23x0 signature-base — extensive malware YARA rules

Part III · Chapters 8–11

Dynamic Analysis

Running the malware and observing its behaviour — process monitoring, network traffic, memory forensics, and process injection detection

Chapter 08 · ~16 min · Dynamic Analysis T1547 · T1055 · T1112

Behavioural Analysis: Process and System Monitoring

ProcMon filter workflow, what to look for in the event stream, persistence registry locations, Process Explorer for injection detection, Autoruns, and ATT&CK correlation

Dynamic analysis begins the moment you execute the sample. From that point, three things are happening simultaneously: the malware is doing what it was designed to do, your monitoring tools are recording every action, and the capability hypothesis you built from static analysis is being confirmed or revised. The skill is not in running the tools — it is in filtering the noise and recognising the signal.

Process Monitor — The Event Stream

Process Monitor (ProcMon) captures every file system, registry, process, and network event on the system in real time. A single minute of execution on a Windows machine produces thousands of events — the vast majority from legitimate system processes. The workflow is filter first, analyse second.

Process Monitor: Filter configuration and event stream interpretation

Filter configuration (before executing sample):
  Filter → Add → Process Name → is → sample.exe → Include
  Filter → Add → Process Name → is → svchost32.exe → Include   ← dropped copy predicted by the hypothesis
  Filter → Add → Process Name → is → cmd.exe → Include
  Filter → Add → Process Name → is → powershell.exe → Include
  (include any process the malware is likely to spawn)

Second pass (narrowing the captured events by path):
  Filter → Add → Path → contains → \AppData\ → Include
  Filter → Add → Path → contains → \Temp\ → Include
  Filter → Add → Path → contains → Run → Include   ← catches Run key writes
  Note: ProcMon ORs filters on the same column and ANDs filters on different columns, so Path filters hide every event from the selected processes whose path does not match. Apply them after reviewing the process-filtered stream, or use Highlight instead of Include.

Event stream — key findings after executing sample.exe:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Time      Process        Operation       Path / Detail
14:23:01  sample.exe     Process Start   sample.exe
14:23:01  sample.exe     RegOpenKey      HKLM\SOFTWARE\VMware, Inc.   ← VM check
14:23:01  sample.exe     RegOpenKey      HKLM\SOFTWARE\VirtualBox     ← VM check
14:23:02  sample.exe     WriteFile       C:\ProgramData\MsUpdate\svchost32.exe
14:23:02  sample.exe     RegSetValue     HKCU\Software\Microsoft\Windows\CurrentVersion\Run
                                         Value: MsUpdate = C:\ProgramData\MsUpdate\svchost32.exe
14:23:03  sample.exe     Process Create  C:\ProgramData\MsUpdate\svchost32.exe
14:23:03  svchost32.exe  TCP Connect     185.220.101.45:443
14:23:04  svchost32.exe  TCP Send        885 bytes
14:23:04  svchost32.exe  TCP Receive     124 bytes

Key Persistence Registry Locations

The following registry locations are the most commonly used by malware for persistence. Every one of these should appear in your ProcMon filters and your Autoruns review:

Registry Path	Scope	ATT&CK	Notes
`HKCU\...\CurrentVersion\Run`	Current user	T1547.001	Most common persistence location — executes on user login
`HKLM\...\CurrentVersion\Run`	All users	T1547.001	Requires admin — persists for all user logins
`HKLM\SYSTEM\CurrentControlSet\Services\`	System	T1543.003	Service installation — persists across all users, starts before login
`HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon`	System	T1547.004	Winlogon helper DLL — executes at logon
`HKLM\SOFTWARE\Microsoft\Windows NT\...\Image File Execution Options\`	System	T1546.012	Debugger hijacking — intercepts legitimate process execution
`HKCU\SOFTWARE\Classes\*\shellex\ContextMenuHandlers\`	Current user	T1546.015	COM hijacking via context menu handler
`HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce`	All users	T1547.001	Executes once on next login then deletes itself — often used by droppers
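The table above can double as a triage filter over exported ProcMon events. The sketch below matches only the distinctive tail of each path, so it does not depend on the parts the table abbreviates; the pattern list and the function name `classify_registry_write` are assumptions made for illustration.

```python
import re

# (pattern, ATT&CK technique) pairs derived from the table above.
# Patterns match path fragments, so both HKCU and HKLM variants are covered.
PERSISTENCE_PATTERNS = [
    (re.compile(r"\\CurrentVersion\\Run(Once)?(\\|$)", re.I), "T1547.001"),
    (re.compile(r"\\CurrentControlSet\\Services\\", re.I), "T1543.003"),
    (re.compile(r"\\Windows NT\\CurrentVersion\\Winlogon(\\|$)", re.I), "T1547.004"),
    (re.compile(r"\\Image File Execution Options\\", re.I), "T1546.012"),
    (re.compile(r"\\shellex\\ContextMenuHandlers\\", re.I), "T1546.015"),
]


def classify_registry_write(path: str):
    """Return the ATT&CK technique suggested by a registry write, or None."""
    for pattern, technique in PERSISTENCE_PATTERNS:
        if pattern.search(path):
            return technique
    return None


events = [
    r"HKLM\SOFTWARE\VMware, Inc.",
    r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\MsUpdate",
    r"HKLM\SYSTEM\CurrentControlSet\Services\EvilSvc\ImagePath",
]
for path in events:
    print(classify_registry_write(path), path)
```

Only writes matter for persistence, so in practice this would be applied to `RegSetValue` and `RegCreateKey` operations rather than to every registry event.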

Process Explorer — Live Injection Detection

Process Explorer: Identifying process injection in the live process list

Process Explorer colour coding:
  Purple background → Packed/encrypted image on disk
  Red background    → Process is terminating
  Light blue        → Process running as current user
  Pink              → Service process

Injection indicators in Process Explorer:
  1. Verify image path — right-click any process → Properties → Image tab
       svchost.exe path: C:\ProgramData\MsUpdate\svchost32.exe
       Expected:         C:\Windows\System32\svchost.exe   ← MISMATCH
  2. Suspicious DLLs — View → Lower Pane → DLLs
       notepad.exe has DLL: C:\Users\Public\inject.dll   ← not a system DLL
  3. Memory strings — right-click process → Properties → Strings tab
       Searching for C2 strings in a legitimate process's memory
       Found "http://evil.c2.com/check" in svchost.exe memory
       → Malware has injected into this legitimate process
  4. Threads with no module — View → Lower Pane → Threads
       Thread start address: 0x1A3B0000 → [No module]   ← injected shellcode
       Legitimate threads have a module name (e.g. ntdll.dll, kernel32.dll)

Autoruns — What Persists Across Reboots

Autoruns from Sysinternals enumerates every autostart location on the system, that is, every place where something is configured to run automatically. After executing malware, run Autoruns and compare against the clean baseline (File → Compare) to find new entries. Entries highlighted in yellow indicate the backing file is missing (the malware may have already deleted its dropper). Entries highlighted in pink indicate the backing file has no publisher information or is not signed.

Key Takeaways — Chapter 8

ProcMon filter configuration is the most important setup step — filtering to the malware's process name and likely child processes before execution eliminates noise and makes the event stream readable
The Run and RunOnce registry keys under both HKCU and HKLM are the most common persistence mechanisms — filter for them in ProcMon and check them first in Autoruns
Process Explorer's image path verification immediately surfaces masquerading: a binary named svchost.exe running from anywhere other than System32 (or SysWOW64) should be treated as malicious until proven otherwise
Threads with no backing module in Process Explorer indicate injected shellcode: legitimate threads almost always have a module name associated with their start address
Autoruns run after malware execution immediately shows what has persisted — yellow entries (missing file) indicate a dropper that cleaned itself up after installing the implant

Chapter 09 · ~15 min · Dynamic Analysis T1071 · T1568 · T1048

Behavioural Analysis: Network Traffic

INetSim and FakeNet-NG setup, C2 communication patterns, protocol identification, DNS analysis, extracting network IOCs, and writing Suricata rules

Network traffic analysis during dynamic analysis captures the malware's communication behaviour — what it connects to, what protocol it uses, what it sends, what it expects to receive. This produces the network IOCs that defenders can deploy immediately: C2 domains and IPs for firewall block lists, HTTP patterns for proxy rules, JA3 hashes for IDS signatures. It also reveals the C2 protocol structure, which becomes the foundation for more targeted detection.

Setting Up Network Simulation

INetSim (Internet Services Simulator) or FakeNet-NG runs on the REMnux VM and simulates internet services — DNS, HTTP, HTTPS, SMTP, FTP, IRC. When the analysis VM's DNS is pointed at REMnux, all DNS queries resolve to the REMnux IP, and all connections to simulated services receive plausible fake responses. This allows loaders that require a C2 response to continue executing past the initial check-in.

INetSim / FakeNet-NG: What the malware sees vs what actually happens

Analysis VM DNS: pointing to REMnux (192.168.56.102)

Malware queries: update-service.net
INetSim DNS response: 192.168.56.102   ← REMnux IP for all names

Malware:
  GET /api/v2/check HTTP/1.1
  Host: update-service.net
  User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64)
  Cookie: session=abc123def456

INetSim HTTP response: 200 OK — fake response body
Malware continues executing (received the "200 OK" it needed)

Without INetSim:
  DNS query for update-service.net → NXDOMAIN (real DNS, host-only network)
  Malware: connection failed → terminates or sleeps → analysis incomplete

Reading the C2 Check-In in Wireshark

Wireshark: Analysing C2 check-in traffic — extracting IOCs

Wireshark display filter: http.request

Request 1 — Initial check-in:
  GET /api/v2/check?id=A4B3C2D1&v=2.1 HTTP/1.1
  Host: update-service.net
  User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0)
  X-Session: 7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d   ← session identifier (IOC)
  Accept: application/json

Network IOCs extracted:
  C2 Domain:   update-service.net
  C2 URI path: /api/v2/check
  User-Agent:  Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0)
  Header:      X-Session (custom header — strong detection indicator)
  GET params:  id= (bot ID), v= (version)

Wireshark display filter: dns

DNS queries made by sample (chronological):
  update-service.net      A record   ← primary C2
  backup-cdn.org          A record   ← secondary C2 (fallback)
  aaabbbccc111.evil.net   A record   ← DGA-generated? (high entropy subdomain)
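The IOC extraction above is mechanical enough to script once a request has been copied out of Wireshark (Follow → HTTP Stream). This is a rough sketch under the assumption that the request is available as plain text; the set of "standard" headers is a small illustrative allowlist, not an exhaustive one, and `extract_http_iocs` is a made-up helper name.

```python
from urllib.parse import parse_qs, urlsplit

# Headers considered unremarkable; anything else is reported as custom.
STANDARD_HEADERS = {"host", "user-agent", "accept", "accept-encoding",
                    "accept-language", "connection", "cookie",
                    "content-type", "content-length"}

REQUEST = """\
GET /api/v2/check?id=A4B3C2D1&v=2.1 HTTP/1.1
Host: update-service.net
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0)
X-Session: 7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d
Accept: application/json
"""


def extract_http_iocs(raw_request: str) -> dict:
    """Pull domain, URI path, parameter names, UA and custom headers."""
    lines = raw_request.strip().splitlines()
    _method, target, _version = lines[0].split()
    url = urlsplit(target)
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # blank line ends the header block
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "domain": headers.get("Host"),
        "uri_path": url.path,
        "params": sorted(parse_qs(url.query)),
        "user_agent": headers.get("User-Agent"),
        "custom_headers": sorted(
            h for h in headers if h.lower() not in STANDARD_HEADERS
        ),
    }


for key, value in extract_http_iocs(REQUEST).items():
    print(f"{key:15} {value}")
```

Recording parameter names rather than values is deliberate: the bot ID changes per victim, while the `id=` and `v=` structure is stable across infections.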

Inspecting the Raw Beacon Packet

When a malware beacon uses a custom binary protocol over raw TCP rather than HTTP, Wireshark shows the raw bytes. Understanding the packet structure from a hex view is the first step toward writing a protocol parser or detection rule.

Custom TCP C2 Beacon — Raw Packet Bytes (Wireshark hex view)

Offset    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F  ASCII             Field
0x000000  DE AD BE EF 00 00 00 01 00 00 00 28 00 00 00 00  ...........(....  ← Magic: 0xDEADBEEF
0x000010  A4 B3 C2 D1 E0 F9 A8 B7 C6 D5 E4 F3 A2 B1 C0 D9  ................  ← Bot ID (16 bytes)
0x000020  01 60 A4 C3 D2 E1 00 00 00 00 00 00 00 00 00 00  .`..............  ← Cmd: 0x01=check-in
0x000030  57 49 4E 31 30 2D 44 45 53 4B 54 4F 50 00 00 00  WIN10-DESKTOP...  ← Hostname (16 bytes)
0x000040  D3 C7 A2 1F                                      ....              ← CRC32 checksum

Protocol reverse engineered: magic | version | bot_id | command | hostname | crc32
Detection: Snort content rule matching magic bytes 0xDEADBEEF at TCP payload offset 0
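A first protocol parser for the beacon in the hex view might look like the sketch below. The field layout is an assumption read off the annotations: big-endian integers, two unlabelled 4-byte fields after the version, and 15 unlabelled bytes after the command byte. The third row is padded with `00` to a full 16 bytes so the offsets line up with the dump, and the CRC32 is extracted but not verified because the source does not say which bytes it covers.

```python
import struct

# Payload bytes transcribed from the hex view (row at 0x20 padded to 16 bytes).
BEACON = bytes.fromhex(
    "DEADBEEF 00000001 00000028 00000000"
    "A4B3C2D1 E0F9A8B7 C6D5E4F3 A2B1C0D9"
    "0160A4C3 D2E10000 00000000 00000000"
    "57494E31 302D4445 534B544F 50000000"
    "D3C7A21F"
)

# Assumed layout: magic, version, two unlabelled u32 fields, 16-byte bot ID,
# 1-byte command + 15 unlabelled bytes, 16-byte hostname, CRC32.
LAYOUT = struct.Struct(">IIII16sB15s16sI")
MAGIC = 0xDEADBEEF


def parse_beacon(payload: bytes) -> dict:
    """Decode one check-in packet; raises ValueError if it is not a beacon."""
    if len(payload) != LAYOUT.size:
        raise ValueError(f"expected {LAYOUT.size} bytes, got {len(payload)}")
    (magic, version, _f08, _f0c, bot_id,
     command, _rest, hostname, crc) = LAYOUT.unpack(payload)
    if magic != MAGIC:
        raise ValueError("magic mismatch")
    return {
        "version": version,
        "bot_id": bot_id.hex(),
        "command": command,
        "hostname": hostname.rstrip(b"\x00").decode("ascii", "replace"),
        "crc32": crc,
    }


print(parse_beacon(BEACON))
```

Even this much is enough for detection work: the magic check is the same test the Snort content rule performs at payload offset 0.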

Identifying C2 Protocols

Protocol Pattern	Indicators in Traffic	Family Examples
HTTP/S C2	Regular GET/POST to same host, consistent interval, custom headers, encoded body	Emotet, Cobalt Strike default, most commodity RATs
DNS C2	High-frequency queries, long subdomain labels, TXT/NULL record responses, high entropy subdomains	SUNBURST, DNSMessenger, dnscat2
Raw TCP/UDP	Non-standard port, binary protocol (not HTTP), fixed-size packets, encrypted with no TLS handshake	Custom implants, older RATs
Legitimate service abuse	Connections to Pastebin, Discord, Telegram, GitHub — benign domains, unusual content/timing	Various commodity malware, living-off-the-land C2
ICMP C2	ICMP echo with non-zero data payload, large payload, encoded content in ICMP body	Rare but present in APT tooling
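The "consistent interval" indicator in the HTTP/S row can be quantified. One simple approach, sketched here as an assumption rather than a standard method, is the coefficient of variation of the gaps between connections to the same host: values near zero mean machine-like regularity. The 0.1 cutoff mentioned in the comment is illustrative only.

```python
from statistics import mean, pstdev


def interval_regularity(timestamps):
    """Coefficient of variation of inter-connection gaps (0 = perfectly regular).

    Returns None when there are too few connections to judge.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 3:
        return None
    average = mean(gaps)
    if average <= 0:
        return None
    return pstdev(gaps) / average


# Connection times in seconds since capture start (synthetic examples).
beacon_like = [0, 60, 120, 180, 240]      # fixed 60-second sleep
human_like = [0, 5, 100, 130, 400]        # bursty browsing

print(interval_regularity(beacon_like))   # near 0: candidate beacon (e.g. below 0.1)
print(interval_regularity(human_like))
```

Malware that adds jitter to its sleep raises this value, so a low score is strong evidence of beaconing but a high score does not rule it out.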

Writing a Suricata Rule from Traffic Analysis

Suricata Rule: C2 detection based on custom HTTP header and URI pattern

# Rule derived from dynamic analysis: detects the X-Session header + /api/v2/check URI.
# X-Session is the custom header IOC. Comments must sit on their own lines, and a
# rule split across lines needs a trailing backslash on each line.
alert http $HOME_NET any -> $EXTERNAL_NET any ( \
    msg:"MALWARE MalwareFamily C2 Check-In"; \
    flow:established,to_server; \
    http.method; content:"GET"; \
    http.uri; content:"/api/v2/check"; startswith; \
    http.header; content:"X-Session:"; \
    http.user_agent; content:"Windows NT 10.0"; \
    threshold:type limit, track by_src, count 1, seconds 60; \
    classtype:trojan-activity; sid:9000001; rev:1; \
)

Key Takeaways — Chapter 9

INetSim/FakeNet-NG is essential for analysing loaders and stagers — without fake internet responses, network-dependent malware terminates early and the full execution chain is never observed
Custom HTTP headers are among the strongest network IOCs — the X-Session header pattern in this chapter is far more specific than a domain or IP that will change
DNS queries give the fullest picture of the C2 infrastructure the sample tried to reach, including fallback domains: capture all DNS queries, not just successful connections
Suricata rules derived from dynamic analysis target the actual communication protocol — they detect the behaviour regardless of which IP or domain the C2 migrates to
Connections to legitimate services (Discord, Telegram, GitHub) as C2 require content-level analysis — the destination domain is not suspicious but the usage pattern is

Chapter 10 · ~16 min · Dynamic Analysis T1055 · T1620 · T1027

Memory Analysis with Volatility

Why memory analysis matters, acquiring memory, Volatility 3 core plugins, malfind for injection detection, carving executables, and correlating with ProcMon

Memory analysis is the technique that finds what disk analysis misses. Fileless malware executes entirely in memory with no persistent file on disk. Injected code lives in the memory of a legitimate process — it appears nowhere in the file system. Encryption keys, decrypted configuration, and plaintext C2 credentials exist in memory during execution and nowhere else. Volatility provides a framework for interrogating a memory image to surface all of these.

Memory Acquisition in the Analysis VM

WinPmem: Acquiring a memory image from the running analysis VM

# Acquire memory with WinPmem (run as Administrator in analysis VM)
PS> .\winpmem_mini_x64_rc2.exe memory.raw
Will write a raw memory image to memory.raw
Imaging memory: 4096 MB
4294967296 bytes written
Done

# Alternatively: dump memory from the hypervisor
# VirtualBox: VBoxManage debugvm <vm-name> dumpvmcore --filename memory.elf
# VMware: suspend or snapshot the VM → .vmem file in VM directory is the raw memory dump

# Transfer memory.raw to analysis host (via shared folder)
# File size = RAM size — typically 4-8 GB

Core Volatility 3 Plugins

Volatility 3: Core plugin workflow — from process list to injection detection

# Process listing: walks the active process list
# (run windows.psscan as well to find unlinked or terminated processes)
$ vol -f memory.raw windows.pslist
PID   PPID  Name           Offset  Threads  Handles  Wow64
4     0     System         0xe...  147      -        False
1024  748   svchost.exe    0xa...  12       384      False
3892  2048  svchost32.exe  0xb...  4        89       False   ← suspicious name
2048  1892  sample.exe     0xc...  2        45       False

# Process tree — reveals unusual parent-child relationships
$ vol -f memory.raw windows.pstree
... explorer.exe (1892)
...... sample.exe (2048)
......... svchost32.exe (3892)   ← svchost spawned by malware, not services.exe

# Command line for each process
$ vol -f memory.raw windows.cmdline
3892  svchost32.exe  "C:\ProgramData\MsUpdate\svchost32.exe"
(legitimate svchost.exe would show: C:\Windows\System32\svchost.exe -k netsvcs)

# Network connections
$ vol -f memory.raw windows.netscan
Proto  LocalAddr       ForeignAddr         State        PID   Owner
TCP    10.0.0.5:49312  185.220.101.45:443  ESTABLISHED  3892  svchost32.exe

malfind — The Injection Detection Plugin

The malfind plugin finds memory regions that are: executable, not backed by a file on disk, and contain what appears to be executable code (PE header or shellcode). This is the primary signature of process injection — the injected code lives in allocated memory with no corresponding file.

Volatility — malfind: Finding injected code in process memory

$ vol -f memory.raw windows.malfind
PID   Process      Start       End         Tag   Protection
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1024  svchost.exe  0x1F3A0000  0x1F43FFFF  VadS  PAGE_EXECUTE_READWRITE
      MZ....   ← PE header found in unbacked memory

Hex dump of the suspicious region:
0x1f3a0000  4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00  MZ..............
0x1f3a0010  b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00  ........@.......
← MZ header at the start confirms this is an injected PE (DLL or EXE)
← PAGE_EXECUTE_READWRITE (RWX) is unusual — legitimate mapped DLLs are RX
← VadS tag = private memory (not mapped from a file on disk)

# Dump the injected region for separate analysis.
# windows.dumpfiles only extracts file-backed objects, so it cannot recover private
# memory; use malfind's own --dump switch. In Volatility 3 the output directory (-o)
# is a global option and goes before the plugin name.
$ vol -f memory.raw -o ./dumped/ windows.malfind --pid 1024 --dump
Dumped: pid.1024.vad.0x1f3a0000-0x1f43ffff.dmp
(the injected PE — analyse as a separate sample)
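The three criteria in the malfind description (executable, not file-backed, contains code) can be expressed as a small predicate. This is a simplified model for reasoning about the output, not Volatility's implementation: the `Region` record and its fields are hypothetical, and "any non-zero bytes" is a crude stand-in for real shellcode detection.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical summary of one VAD entry, as an analyst might tabulate it."""
    pid: int
    start: int
    tag: str            # "VadS" = private allocation, "Vad" = file-mapped
    protection: str     # e.g. "PAGE_EXECUTE_READWRITE"
    file_backed: bool
    head: bytes         # first bytes of the region


def injection_reason(region: Region):
    """Return why a region looks injected, or None if it does not."""
    if "EXECUTE" not in region.protection or region.file_backed:
        return None
    if region.head[:2] == b"MZ":
        return "PE header in unbacked executable memory"
    if any(region.head):
        return "possible shellcode in unbacked executable memory"
    return None  # executable but empty: allocated, nothing written yet


regions = [
    Region(1024, 0x1F3A0000, "VadS", "PAGE_EXECUTE_READWRITE", False,
           bytes.fromhex("4d5a900003000000")),
    Region(1024, 0x7FFE0000, "Vad", "PAGE_EXECUTE_WRITECOPY", True,
           bytes.fromhex("4d5a900003000000")),
]
for r in regions:
    print(hex(r.start), injection_reason(r))
```

The second region shows why file backing matters: a mapped system DLL also starts with `MZ` and is executable, but it has a file behind it and is therefore not reported.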

Finding Encryption Keys and Strings in Memory

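Volatility — strings and yarascan: Extracting C2 config and encryption keys from process memory

# Extract all strings from a process's memory.
# Volatility 3 does not extract strings itself: dump the process's pages with
# windows.memmap, then run the strings utility over the dump
# (add -el to strings for UTF-16 text). grep -i makes the match case-insensitive.
$ vol -f memory.raw -o ./dumped/ windows.memmap --pid 3892 --dump
$ strings -a ./dumped/pid.3892.dmp | grep -iE "http|\.com|key|pass|mozilla"
http://update-service.net/api/v2/check
AES_KEY=4a6f686e536d6974684a6f686e536d69   ← AES key visible in plaintext in memory
Mozilla/5.0 (Windows NT 10.0; Win64; x64)

# Scan process memory with a YARA rule
# (--yara-file takes a rule file; --yara-rules takes an inline rule string)
$ vol -f memory.raw windows.vadyarascan --yara-file rules/malwarefamily.yar
Offset      Rule                     Process      PID
0x1F3A1240  MalwareFamily_Loader_v2  svchost.exe  1024   ← injected PE matched the YARA rule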
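If the `strings` utility is not at hand, the same extraction can be done over a dumped region in a few lines. This sketch assumes the dump has already been written to disk and read into memory as bytes; it finds both ASCII and UTF-16LE runs, since Windows processes hold much of their text as wide strings. The helper names, the minimum length of 6, and the keyword pattern are illustrative choices.

```python
import re

ASCII_RUN = re.compile(rb"[\x20-\x7e]{6,}")
WIDE_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){6,}")   # UTF-16LE printable text


def extract_strings(blob: bytes):
    """Return printable ASCII and UTF-16LE strings of six or more characters."""
    found = [m.group().decode("ascii") for m in ASCII_RUN.finditer(blob)]
    found += [m.group().decode("utf-16-le") for m in WIDE_RUN.finditer(blob)]
    return found


def grep_iocs(strings, pattern=r"http|\.com|key|pass"):
    """Case-insensitive keyword filter, like grep -iE."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [s for s in strings if rx.search(s)]


# Synthetic stand-in for a dumped memory region.
blob = (
    b"\x00\x01"
    + b"http://update-service.net/api/v2/check"
    + b"\x00\xff"
    + "AES_KEY=4a6f".encode("utf-16-le")
    + b"\x00\x00abc"
)
print(grep_iocs(extract_strings(blob)))
```

The case-insensitive match is what lets the `key` pattern catch `AES_KEY=`; a case-sensitive filter would silently miss it.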

Key Takeaways — Chapter 10

Fileless malware and process injection leave no file artefacts — memory analysis is the only technique that finds them reliably
malfind finds injected code by looking for executable, file-unbacked memory regions containing PE headers or shellcode — it is the first plugin to run when injection is suspected
PAGE_EXECUTE_READWRITE (RWX) on a private (VadS) memory region is the canonical injection signature: legitimate DLLs are mapped from a file on disk, and their code pages are not left both writable and executable
Volatility's YARA scanning plugins (yarascan for the whole image, windows.vadyarascan for per-process memory) apply YARA rules to memory, so the same rules written from static analysis can find injected copies without a separate file artefact
Encryption keys, decrypted C2 configuration, and plaintext strings that are obfuscated on disk often appear in plaintext in process memory during execution — always run strings against the suspicious process memory region

Chapter 11 · ~16 min · Dynamic Analysis T1055.001 · T1055.002 · T1055.012

Process Injection and Memory Forensics

Classic DLL injection, reflective DLL injection, process hollowing, thread hijacking, parent process spoofing — what each technique produces in Volatility output

Process injection is the technique of running code within the address space of another, legitimate process. It is used to hide malware behind trusted process names, bypass application whitelisting, inherit the privileges and network access of the host process, and evade process-based detection. Each injection technique leaves a characteristic footprint — understanding that footprint is what lets you identify which technique was used from a memory image alone.

Classic DLL Injection

The most common and well-understood injection technique. The attacker allocates memory in the target process, writes the path to their malicious DLL, and creates a remote thread that calls LoadLibrary with that path. Windows loads the DLL into the target process as if it were legitimate.

Classic DLL Injection — Volatility Signature: What DLL injection looks like in dlllist and malfind output

# DLL injection adds the malicious DLL to the target's DLL list
$ vol -f memory.raw windows.dlllist --pid 1024
Base      Size  Name          Path
0x7ff...  512K  ntdll.dll     C:\Windows\System32\ntdll.dll
0x7fe...  832K  kernel32.dll  C:\Windows\System32\kernel32.dll
0x1a3...  148K  inject.dll    C:\Users\Public\inject.dll   ← not in System32!
0x7fd...  256K  user32.dll    C:\Windows\System32\user32.dll

Detection indicators:
  • DLL path outside System32/SysWOW64/Program Files
  • DLL with random or generic name in public/temp directories
  • DLL not present in the clean baseline's DLL list for this process
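The first detection indicator is easy to automate over dlllist output. The sketch below assumes the DLL paths have already been pulled out of the plugin output into a list, and it treats a short, hypothetical set of directory prefixes as trusted; a real baseline would come from the clean snapshot of the same process.

```python
# Directory prefixes treated as expected locations (illustrative, lower-case).
TRUSTED_PREFIXES = (
    "c:\\windows\\system32\\",
    "c:\\windows\\syswow64\\",
    "c:\\program files\\",
    "c:\\program files (x86)\\",
)


def suspicious_dlls(paths):
    """Return DLL paths loaded from outside the trusted directories."""
    return [p for p in paths if not p.lower().startswith(TRUSTED_PREFIXES)]


loaded = [
    r"C:\Windows\System32\ntdll.dll",
    r"C:\Windows\System32\kernel32.dll",
    r"C:\Users\Public\inject.dll",
    r"C:\Windows\System32\user32.dll",
]
print(suspicious_dlls(loaded))
```

A path check alone produces false positives for software installed in user profiles, so treat hits as leads to compare against the baseline, not as verdicts.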

Reflective DLL Injection

Reflective DLL injection improves on classic injection by eliminating the need to write a file to disk. The malicious DLL contains a custom loader (the reflective loader) that maps the DLL into memory from a byte array, resolves its own imports, and calls its entry point — all without calling LoadLibrary. The DLL never appears on disk and doesn't appear in the standard module list.

Reflective DLL Injection — Volatility Signature: Why it doesn't appear in dlllist but does appear in malfind and vadinfo

# NOT visible in dlllist — not loaded via LoadLibrary
$ vol -f memory.raw windows.dlllist --pid 1024
(no malicious DLL — it wasn't registered with the loader)

# BUT visible in malfind — it's executable private memory with a PE header
$ vol -f memory.raw windows.malfind --pid 1024
0x2A1B0000  PAGE_EXECUTE_READWRITE  VadS  MZ...   ← PE header, no file backing

# vadinfo shows the VAD entry has no file mapping
$ vol -f memory.raw windows.vadinfo --pid 1024 --address 0x2A1B0000
VAD node: 0x2A1B0000 - 0x2A27FFFF
  Tag: VadS   ← private/anonymous — not mapped from a file
  Protection: PAGE_EXECUTE_READWRITE
  FileObject: None   ← no file on disk

Key difference from classic injection:
  Classic:    dlllist shows the DLL / malfind may not fire (file-mapped image section)
  Reflective: dlllist shows nothing / malfind fires (private RWX region with PE header)

Note: malfind keys on regions that are both writable and executable. A loader that
re-protects its memory to read-execute after mapping will not be listed, so check
vadinfo for private executable regions as well.

Process Hollowing

Process hollowing creates a legitimate process in a suspended state, unmaps its original executable image from memory, and maps the malicious image in its place. The process name, PID, and path all appear legitimate — the content is entirely replaced.

Process Hollowing: Volatility Detection
Image base mismatch and VAD entry anomalies

# The PEB (Process Environment Block) stores the image base from the PE header
# After hollowing, the PEB image base points to the malicious image
# but the VAD entry for the original executable is gone
$ vol -f memory.raw windows.cmdline
PID 4012  notepad.exe  "C:\Windows\System32\notepad.exe"   (looks legitimate)

$ vol -f memory.raw windows.malfind --pid 4012
0x400000  PAGE_EXECUTE_WRITECOPY  VadS
4d 5a 90 00   ← PE header at 0x400000 (default image base)
But this is VadS (private) not Vad (file-mapped) → strong hollowing indicator

# Dump the flagged region; -o sets the output directory and goes before the plugin name.
# dumpfiles would only recover the cached on-disk notepad.exe, not the private replacement image.
$ vol -f memory.raw -o hollow_dump/ windows.malfind --pid 4012 --dump
(the dumped region contains the malicious replacement image; analyse it as a new sample)

Detection summary:
• Legitimate process name but malfind fires at image base address
• VAD entry type is VadS (private) instead of Vad (file-mapped) at image base
• Memory content at image base doesn't match the file on disk (imphash mismatch)
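The detection summary above reduces to two comparisons: what kind of VAD sits at the image base, and whether the in-memory image still matches the file on disk. The sketch below assumes both facts have already been collected by hand. The dictionary layout and the function name are hypothetical, and the imphash values in the usage lines are placeholders rather than real hashes.

```python
def hollowing_indicators(image_base, vads, disk_imphash, memory_imphash):
    """Return the hollowing indicators that apply to one process.

    vads is a list of dicts with "start" and "tag" keys (hypothetical layout).
    """
    findings = []
    vad = next((v for v in vads if v["start"] == image_base), None)
    if vad is None:
        findings.append("no VAD entry at image base")
    elif vad["tag"] == "VadS":
        findings.append("image base is private memory (VadS)")
    if disk_imphash != memory_imphash:
        findings.append("imphash mismatch between disk and memory")
    return findings


vads = [{"start": 0x400000, "tag": "VadS"}, {"start": 0x7FF00000, "tag": "Vad"}]
for finding in hollowing_indicators(0x400000, vads, "placeholder-disk", "placeholder-mem"):
    print(finding)
```

An empty result does not prove the process is clean; it only means neither of these two indicators was present.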

Key Takeaways — Chapter 11

Classic DLL injection appears in dlllist — look for DLLs outside System32/Program Files directories; reflective injection does not appear in dlllist but does appear in malfind
Process hollowing is detected by the VadS (private) tag at the expected image base address — a file-mapped legitimate process would show Vad (file-mapped) at that address
When malfind fires, always dump the region and analyse it as a separate sample — the injected PE may be a completely different malware family with its own IOCs
Parent process spoofing (PPID spoofing) makes malware appear to have been spawned by a legitimate parent. Volatility's pstree reports the parent PID recorded in the process structure, which is the spoofed value, so cross-reference with event logs or ETW telemetry that may show the real creator process
The combination of malfind + dlllist + vadinfo gives a complete picture of a process's memory — use all three before concluding whether injection is present and which technique was used

Part IV · Chapters 12–15

Detection, Intelligence, and Response

Turning analysis findings into detection capability — IOC extraction, ATT&CK mapping, sandbox analysis at scale, malware family triage, and IR integration

Chapter 12 · ~14 min · Detection & Intel

IOC Extraction and Threat Intelligence

IOC taxonomy, extraction workflow, IOC quality and decay, MISP and STIX/TAXII sharing, VirusTotal pivoting, and infrastructure analysis

Analysis findings become intelligence when they are structured, contextualised, and shared. An IP address in a Wireshark capture is a data point. That same IP address formatted as a STIX indicator, linked to a malware family, a campaign, and a threat actor, with a confidence score and a decay date, and shared via TAXII to every SIEM in your sector — that is threat intelligence. This chapter covers the transformation.

The IOC Taxonomy

Type	Examples	Strength	Decay Rate
Atomic	IP address, domain, URL, email address, file hash	Easy to deploy; widely supported	Fast — IPs and domains rotate in days to weeks; hashes evaded by recompile
Computed	imphash, ssdeep, TLSH, JA3 hash	Survives recompilation; clusters variants	Medium — survives until attacker changes import structure or TLS config
Behavioural	Registry key paths, mutex names, file path patterns, C2 URI structure	Hardest to evade — requires significant code change	Slow — behavioural indicators can be valid for months to years
Network pattern	HTTP header structure, beacon interval, custom protocol fields, Suricata rules	Protocol-level — survives C2 infrastructure rotation	Slow — valid until attacker changes the communication protocol

IOC Extraction Workflow

IOC Extraction
Complete IOC list from a single sample analysis session

Source: static analysis (strings, imports, PE structure)
  Hash (SHA-256):   8e3f2a1b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0a1b2c3d4e5f6a7b8c9d0e1f
  Hash (imphash):   fcf2b3a7b8d9e4f1c5a2d6e0b3f7a9c1
  Mutex:            Global\MicrosoftUpdateMutex
  PDB path:         C:\Users\dev01\Projects\loader\Release\loader.pdb
  Compile stamp:    2024-02-28 14:33:21 UTC

Source: dynamic analysis (ProcMon, Autoruns)
  Drop path:        C:\ProgramData\MsUpdate\svchost32.exe
  Registry key:     HKCU\Software\Microsoft\Windows\CurrentVersion\Run\MsUpdate
  Spawned process:  C:\ProgramData\MsUpdate\svchost32.exe (child of sample.exe)

Source: network analysis (Wireshark)
  C2 domain:        update-service.net
  C2 URI:           /api/v2/check
  C2 IP:            185.220.101.45
  User-Agent:       Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0)
  Custom header:    X-Session: [32 hex chars]
  Beacon interval:  ~60 seconds ± 5s jitter

Source: memory analysis (Volatility)
  Injected into:    svchost.exe (PID 1024)
  Injected hash:    a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6 (dumped injected PE)
  Memory string:    AES_KEY=4a6f686e536d6974684a6f686e536d69
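To illustrate how entries from a list like this become shareable indicators, the sketch below builds minimal STIX 2.1 indicator objects using only the standard library. The property names follow the STIX 2.1 indicator object. The helper name `stix_indicator` and the lifetimes (30 days for the domain, 365 for the mutex) are illustrative assumptions that follow the decay-rate idea in the taxonomy table, not recommended values.

```python
import uuid
from datetime import datetime, timedelta, timezone

STIX_TS = "%Y-%m-%dT%H:%M:%S.000Z"


def stix_indicator(name, pattern, valid_days, now=None):
    """Build a minimal STIX 2.1 indicator dict with an explicit expiry date."""
    now = now or datetime.now(timezone.utc)
    created = now.strftime(STIX_TS)
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": created,
        "modified": created,
        "name": name,
        "pattern": pattern,
        "pattern_type": "stix",
        "valid_from": created,
        "valid_until": (now + timedelta(days=valid_days)).strftime(STIX_TS),
    }


# Lifetimes are assumptions: short for an atomic IOC, long for a behavioural one.
indicators = [
    stix_indicator("C2 domain", "[domain-name:value = 'update-service.net']", 30),
    # Backslashes are doubled inside STIX pattern string literals.
    stix_indicator("Persistence mutex", r"[mutex:name = 'Global\\MicrosoftUpdateMutex']", 365),
]
for ind in indicators:
    print(ind["name"], ind["valid_until"])
```

In practice a library such as stix2 or a MISP export would normally produce these objects; the hand-built dictionary is only meant to show which fields carry the pattern and the decay date.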

Infrastructure Pivoting

A single C2 domain or IP is a starting point, not an endpoint. Infrastructure analysis — finding related domains, certificates, and IPs operated by the same threat actor — dramatically expands the IOC set and provides early warning of future campaigns.

VirusTotal Graph + Passive DNS
Pivoting from one C2 domain to an entire infrastructure cluster

Starting point: update-service.net (C2 domain)

Step 1: VirusTotal, check IP history for the domain
  update-service.net resolved to:
    185.220.101.45 (current)        ← start here
    185.220.101.46 (3 weeks ago)
    185.220.100.12 (6 weeks ago)

Step 2: Pivot to IP, what else resolves to 185.220.101.45?
  Passive DNS for 185.220.101.45:
    update-service.net
    cdn-delivery.net
    microsoft-updates.org
    windowsupdate-cdn.com
  ← All likely same actor, same infrastructure (unless the IP is shared hosting)

Step 3: Certificate transparency, which certificates match the naming pattern?
  (%25 is the URL-encoded % wildcard)
  $ curl -s "https://crt.sh/?q=%25-service.net&output=json" | jq '.[].name_value'
  "update-service.net"
  "backup-service.net"
  "support-service.net"
  ← Infrastructure pattern: [purpose]-service.net

Step 4: WHOIS, registrar, registration date, registrant email
  Registrar:         NameSilo (common in crimeware infrastructure)
  Created:           2024-02-01 (27 days before sample compile date)
  Registrant email:  privacy-protected (common)
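Certificate transparency results usually need filtering before they are useful as a cluster. The sketch below parses crt.sh-style JSON offline and keeps only names that fit the [purpose]-service.net pattern. It assumes each entry carries a `name_value` field that may hold several newline-separated names; the sample data and the function name are made up for the example.

```python
import json
import re


def names_matching(crtsh_json, pattern):
    """Return sorted unique certificate names that fully match the regex pattern."""
    regex = re.compile(pattern)
    names = set()
    for entry in json.loads(crtsh_json):
        # One entry may list several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if regex.fullmatch(name):
                names.add(name)
    return sorted(names)


# Made-up sample shaped like a crt.sh JSON response.
sample = json.dumps([
    {"name_value": "update-service.net"},
    {"name_value": "backup-service.net\nwww.backup-service.net"},
    {"name_value": "support-service.net"},
    {"name_value": "customer.service.net"},
])
print(names_matching(sample, r"[a-z0-9]+-service\.net"))
```

Anchoring the match with `fullmatch` keeps unrelated subdomains of service.net out of the cluster, which matters because a loose wildcard search tends to return many of them.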

Key Takeaways — Chapter 12

Behavioural IOCs (mutex names, registry key paths, URI structure, custom HTTP headers) age better than atomic IOCs (IPs, domains, hashes) — prioritise them for detection rules
Infrastructure pivoting via passive DNS, certificate transparency, and WHOIS turns one C2 domain into a cluster of related infrastructure — dramatically expanding early warning capability
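imphash clusters samples that share the same import structure even when recompiled. A new binary with the same imphash as a known malware family is a strong lead that it belongs to that family, but confirm with a second indicator: packed and .NET binaries often share an imphash across unrelated families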
Sharing IOCs via MISP/STIX-TAXII multiplies their value — an indicator that protects one organisation protects every organisation in the sharing community simultaneously
Registration date close to compile date, privacy-protected WHOIS, and recently issued certificates are soft indicators of malicious infrastructure — no single one is conclusive, but the combination is telling

Chapter 13 · ~14 min · Detection & Intel ATT&CK

ATT&CK Mapping and Detection Engineering

Mapping analysis to techniques, building a Navigator layer, writing Sigma rules from behavioural findings, detection coverage assessment, and layered detection design

MITRE ATT&CK is the common language between malware analysis and detection engineering. When an analyst finds that a sample modifies the Run registry key, they can tag that finding as T1547.001. When a detection engineer asks which techniques are covered by current rules, they reference the same taxonomy. ATT&CK makes the conversation between analysts, detection engineers, and defenders possible without ambiguity about what behaviour is being described.

Mapping Sample Findings to ATT&CK

ATT&CK Mapping
Translating analysis findings to technique IDs

Analysis finding                         ATT&CK Technique
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phishing .docm with macro                T1566.001  Spearphishing Attachment
Macro executes PowerShell                T1059.001  PowerShell
IsDebuggerPresent import                 T1622      Debugger Evasion
VM detection via registry queries        T1497.001  Virtualization/Sandbox Evasion: System Checks
Self-copy to ProgramData                 T1036.005  Match Legitimate Name/Location
HKCU Run key write                       T1547.001  Registry Run Keys
Spawns copy of self                      T1543      Create or Modify System Process
CreateRemoteThread injection             T1055.001  DLL Injection
HTTP GET to C2 every ~60s                T1071.001  Web Protocols
Custom X-Session header                  T1071.001  Web Protocols (sub-technique)
vssadmin delete shadows (ransomware)     T1490      Inhibit System Recovery
Mass file encryption                     T1486      Data Encrypted for Impact
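A mapping like the one above can be turned into an ATT&CK Navigator layer so the sample's techniques show up on the matrix. The sketch below emits a minimal layer as JSON. The `name`, `domain`, and `techniques` fields (with `techniqueID`, `score`, `comment`) follow the Navigator layer format; depending on the Navigator release a `versions` object may also be expected, and that part is left out here as an unverified detail. The layer name and output handling are assumptions.

```python
import json

# Subset of the mapping table, keyed by technique ID.
FINDINGS = {
    "T1566.001": "Phishing attachment with macro",
    "T1059.001": "Macro executes PowerShell",
    "T1547.001": "HKCU Run key write",
    "T1055.001": "CreateRemoteThread injection",
    "T1071.001": "HTTP GET to C2 every ~60s",
}


def build_layer(name, findings):
    """Return a minimal Navigator layer dict highlighting each mapped technique."""
    return {
        "name": name,
        "domain": "enterprise-attack",
        "techniques": [
            {"techniqueID": tid, "score": 1, "comment": note}
            for tid, note in sorted(findings.items())
        ],
    }


layer = build_layer("sample.exe analysis", FINDINGS)
print(json.dumps(layer, indent=2))
```

Saving the printed JSON to a file and opening it in Navigator would be the usual next step. Keeping one layer per sample makes it straightforward to overlay several samples later.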

Writing Sigma Rules from Behavioural Findings

Sigma rules translate behavioural observations (what ProcMon saw, what event IDs fired, what Autoruns found) into portable SIEM detection rules. A Sigma rule is written once in vendor-neutral YAML and then converted to Splunk SPL, Sentinel KQL, Elastic EQL, or QRadar AQL by a Sigma converter.

Sigma Rule
T1547.001: Malware persistence via Run key (from ProcMon finding)

title: Suspicious Run Key Write to Non-Standard Binary Path
description: Detects writes to HKCU/HKLM Run keys pointing to unusual locations
status: experimental    # new rule; promote after testing in your environment
tags:
    - attack.persistence
    - attack.t1547.001
logsource:              # required by Sigma; tells the converter which log type to query
    product: windows
    category: registry_set
detection:
    selection:
        EventID: 13     # Sysmon Registry Value Set
        TargetObject|contains:
            - '\CurrentVersion\Run\'
            - '\CurrentVersion\RunOnce\'
    filter_legit:
        Details|contains:    # Exclude known-good paths
            - 'C:\Program Files\'
            - 'C:\Program Files (x86)\'
            - 'C:\Windows\'
    condition: selection and not filter_legit
falsepositives:
    - Portable applications that legitimately run from AppData
level: medium
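Before deploying a rule like this, it helps to check its logic against the event that prompted it. The sketch below re-implements the rule's `selection and not filter_legit` condition in plain Python and runs it over hand-written Sysmon-style events. It is a teaching aid, not a Sigma engine: the event dictionaries are made up, and only the `contains` matching (case-insensitive, as in Sigma) is modelled.

```python
RUN_KEYS = ("\\currentversion\\run\\", "\\currentversion\\runonce\\")
LEGIT_PATHS = ("c:\\program files\\", "c:\\program files (x86)\\", "c:\\windows\\")


def matches_rule(event):
    """Evaluate 'selection and not filter_legit' for one registry event dict."""
    target = event.get("TargetObject", "").lower()
    details = event.get("Details", "").lower()
    selection = event.get("EventID") == 13 and any(k in target for k in RUN_KEYS)
    filter_legit = any(p in details for p in LEGIT_PATHS)
    return selection and not filter_legit


malicious = {
    "EventID": 13,
    "TargetObject": "HKU\\S-1-5-21\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\\MsUpdate",
    "Details": "C:\\ProgramData\\MsUpdate\\svchost32.exe",
}
benign = {
    "EventID": 13,
    "TargetObject": "HKLM\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\\Updater",
    "Details": "C:\\Program Files\\Vendor\\updater.exe",
}
print(matches_rule(malicious), matches_rule(benign))
```

Replaying a handful of known-good Run key writes from your own environment through the same function is a quick way to estimate the false positive rate before the rule reaches the SIEM.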

Detection Coverage Assessment

Once a sample's ATT&CK techniques are mapped, you can assess which of those techniques your current detection stack covers. This answers the question: "If this malware ran in our environment today, which stages would we detect, and which would be blind spots?"

Coverage Assessment
Mapping sample techniques to detection rules and identifying gaps

Technique    Coverage           Notes
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
T1566.001    ✓ Email gateway    Attachment scanning, macro detection
T1059.001    ✓ EDR              PowerShell script block logging rule
T1622        ✗ NO COVERAGE      No rule for debugger presence checks
T1497.001    ✗ NO COVERAGE      No rule for VM registry checks
T1036.005    ✓ EDR              Sigma rule: svchost outside System32
T1547.001    ✓ SIEM             Run key write to non-standard path
T1055.001    ✗ NO COVERAGE      No memory-level injection detection
T1071.001    ✓ Proxy            Suspicious User-Agent / X-Session header
T1486        ✗ NO COVERAGE      No detection for mass file extension change

Priority gap remediation:
1. T1055.001: Deploy Sysmon Event 8 (CreateRemoteThread) rule
2. T1486: Deploy rule: high-rate file rename events (Sigma T1486)
3. T1497.001: Low priority (evasion, not execution); address after #1 and #2
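The same assessment can be computed rather than maintained by hand once rules are tagged with technique IDs. The sketch below takes the sample's technique list and a mapping from technique to the rules that cover it, then reports the gaps and a coverage ratio. The rule names in `RULES` are placeholders taken from the table's notes, not identifiers from any real rule repository.

```python
SAMPLE_TECHNIQUES = [
    "T1566.001", "T1059.001", "T1622", "T1497.001", "T1036.005",
    "T1547.001", "T1055.001", "T1071.001", "T1486",
]

# Placeholder rule names per technique; an empty or missing entry means no coverage.
RULES = {
    "T1566.001": ["email-gateway-macro-scan"],
    "T1059.001": ["edr-powershell-scriptblock"],
    "T1036.005": ["sigma-svchost-outside-system32"],
    "T1547.001": ["sigma-run-key-nonstandard-path"],
    "T1071.001": ["proxy-x-session-header"],
}


def coverage_gaps(techniques, rules_by_technique):
    """Return techniques with no detection rule, in the order they were observed."""
    return [t for t in techniques if not rules_by_technique.get(t)]


gaps = coverage_gaps(SAMPLE_TECHNIQUES, RULES)
covered = len(SAMPLE_TECHNIQUES) - len(gaps)
print(f"covered {covered}/{len(SAMPLE_TECHNIQUES)}; gaps: {', '.join(gaps)}")
```

The ordering of remediation is still a human decision; the code only lists what is uncovered.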

Key Takeaways — Chapter 13

ATT&CK mapping is the translation layer between analysis findings and detection engineering — every behavioural observation should be tagged with its technique ID before the analysis report is written
Sigma rules are portable: write once, then convert to Splunk SPL, Sentinel KQL, or Elastic EQL with sigma-cli (the pySigma-based successor to the deprecated sigmac); the same behavioural finding generates detection for every supported SIEM
Detection coverage assessment reveals which phases of an attack chain are visible and which are blind — prioritise filling gaps in execution and persistence techniques before anti-analysis techniques
A single malware sample's ATT&CK profile can drive an entire sprint of detection engineering work — map first, then systematically build rules for each uncovered technique
Layered detection is more resilient than single-indicator detection — a malware family that changes its C2 domain should still fire on its Run key write, its injection behaviour, and its network protocol pattern

Chapter 14 · ~13 min · Detection & Intel

Sandbox Analysis and Automated Tools

Sandbox architectures, reading reports critically, evasion techniques that defeat sandboxes, improving sandbox results, malware clustering with imphash and ssdeep, automated pipelines

At scale, manual analysis of every sample is impossible. A mid-sized organisation's email gateway may block hundreds of suspicious attachments per day. Sandboxes provide automated dynamic analysis — executing samples and reporting behaviour without an analyst manually operating the tools. Understanding how sandboxes work, where they fail, and how to make them more effective is the skill that makes automated analysis operationally useful rather than a false confidence generator.

Sandbox Architectures

Type	Mechanism	Examples	Strengths / Weaknesses
Agent-based	Software agent in guest OS monitors API calls, file/registry events	Cuckoo Sandbox, CAPE	Rich data; agent itself is detectable by sophisticated malware
Hook-based	Hooks Windows API functions at user or kernel level to intercept calls	Any.run, Joe Sandbox	Very detailed API logging; hooks create detectable timing differences
Hypervisor/hardware	Monitoring at the hypervisor level, outside the guest OS	DRAKVUF, VMRay	Near-undetectable; hardware-level observation; expensive to deploy
Emulation	Executes code in a software emulator rather than real hardware	Speakeasy, Qiling	No risk to host; detectable via emulation artefacts; useful for shellcode

Sandbox Evasion — Why Reports Are Incomplete

Sandbox Evasion Techniques
How malware detects sandboxes and what to do about it

Technique 1: Sleep bomb
  Malware:  Sleep(600000); // 10-minute sleep before doing anything
  Sandbox:  times out after 120 seconds → reports nothing suspicious
  Counter:  extend sandbox timeout; patch sleep calls (replace with NOP)

Technique 2: VM artefact detection
  Malware:  check for VMware registry keys, VirtualBox process names
            check MAC address OUI (00:0C:29 = VMware, 08:00:27 = VirtualBox)
  Sandbox:  terminates silently if VM detected
  Counter:  use customised VM images; change MAC address OUI

Technique 3: Resource checks
  Malware:  if (RAM < 4GB || CPUCores < 2) exit();
  Sandbox:  many run with minimal resources
  Counter:  provision sandbox VMs with 4+ cores, 8GB+ RAM

Technique 4: User interaction check
  Malware:  monitor mouse movement; if no movement in 30s, exit
  Sandbox:  no simulated user interaction by default
  Counter:  enable human simulation in sandbox settings

Technique 5: Language/locale check
  Malware:  if (GetSystemDefaultLCID() in [RU, UA, BY, KZ]) exit();
            // CIS exclusion, common in crimeware
  Counter:  run sandbox with locale set to target country

Technique 6: Process list check
  Malware:  enumerate processes; if procmon.exe or wireshark.exe exists, exit
  Counter:  rename analysis tools (procmon.exe → svcutil.exe)
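The counters above can be checked with a short audit of the sandbox VM's own configuration, looking at it the way evasive malware would. The sketch below is a minimal illustration: the values are passed in by hand rather than read from the system, the function name is hypothetical, and the thresholds come from the counters listed above (4+ cores, 8GB+ RAM, a timeout longer than 120 seconds).

```python
VM_OUIS = {"00:0c:29": "VMware", "08:00:27": "VirtualBox"}
ANALYSIS_TOOLS = {"procmon.exe", "wireshark.exe"}


def audit_sandbox(mac, ram_gb, cpu_cores, processes, timeout_s):
    """Return the configuration details that common evasion checks would detect."""
    findings = []
    oui = mac.lower().replace("-", ":")[:8]
    if oui in VM_OUIS:
        findings.append(f"MAC OUI identifies {VM_OUIS[oui]}")
    if ram_gb < 8:
        findings.append("less than 8 GB RAM")
    if cpu_cores < 4:
        findings.append("fewer than 4 CPU cores")
    tools = sorted(ANALYSIS_TOOLS & {p.lower() for p in processes})
    if tools:
        findings.append("analysis tools visible: " + ", ".join(tools))
    if timeout_s <= 120:
        findings.append("timeout too short for sleep bombs")
    return findings


for issue in audit_sandbox("00:0C:29:AA:BB:CC", 2, 1, ["Procmon.exe", "explorer.exe"], 120):
    print(issue)
```

An empty list only covers the checks modelled here. Locale and user-interaction checks need separate handling in the sandbox settings.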

Malware Clustering — Finding Variants at Scale

Sample Clustering
Using imphash, ssdeep, and TLSH to group malware families

Scenario: 500 unknown samples from a SIEM alert over 30 days
Goal: cluster into families for efficient analysis

Step 1: imphash clustering (groups by import structure)
$ python3 -c "
import collections, os, pefile
hashes = collections.defaultdict(list)
for name in os.listdir('samples/'):
    try:
        pe = pefile.PE(os.path.join('samples', name))
        hashes[pe.get_imphash()].append(name)
    except (pefile.PEFormatError, OSError):
        pass  # skip directories and files that are not valid PEs
# Largest clusters first; ignore groups of three or fewer
for h, files in sorted(hashes.items(), key=lambda x: -len(x[1])):
    if len(files) > 3:
        print(f'{h}: {len(files)} samples')
"
fcf2b3a7b8d9e4f1c5a2d6e0b3f7a9c1: 187 samples   ← likely same family/builder
a8b2c4d6e0f1a3b5c7d9e1f3a5b7c9d1: 89 samples
9e3f1a5b7c2d4e6f8a0b2c4d6e8f0a2b: 44 samples
... (147 more imphashes covering the remaining 180 samples: likely custom or small clusters)

Step 2: ssdeep clustering of the 187-sample group
$ ssdeep -s -r samples/cluster1/ > hashes.txt && ssdeep -s -r -m hashes.txt samples/cluster1/
96% match: sample_042.exe and sample_178.exe   ← very similar, likely same campaign
71% match: sample_001.exe and sample_099.exe   ← same family, different config

Result: 500 samples → 3 families requiring deep analysis + 180 remaining samples
Instead of 500 analyses, perform 3 family analyses + triage of the 180 remaining samples
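Pairwise similarity scores like the ssdeep matches above still have to be merged into groups. The sketch below does that with a small union-find: any two samples whose score meets the threshold end up in the same cluster, including transitively. The input tuples are hand-entered examples, and the threshold of 60 is an assumption based on the 60–80% range this chapter associates with same-family variants.

```python
def cluster_pairs(pairs, threshold):
    """Group sample names into clusters from (name_a, name_b, score) tuples."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b, score in pairs:
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


matches = [
    ("sample_042.exe", "sample_178.exe", 96),
    ("sample_001.exe", "sample_099.exe", 71),
    ("sample_042.exe", "sample_001.exe", 35),
]
for group in cluster_pairs(matches, threshold=60):
    print(group)
```

Transitive merging can chain loosely related samples together, so a higher threshold or a manual check of large clusters is advisable.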

Key Takeaways — Chapter 14

A clean sandbox report does not mean a sample is benign — sleep bombs, VM detection, user interaction requirements, and process list checks all produce clean reports from malicious samples
Improving sandbox results requires customising the analysis environment — realistic hostname, MAC address, resource allocation, user simulation, and extended timeouts defeat the most common evasion classes
imphash clustering groups samples by import structure: a cluster of 187 samples with the same imphash likely came from the same builder and needs only one deep analysis
ssdeep fuzzy hashing within an imphash cluster identifies configuration variants: samples at 95%+ similarity are typically the same binary with different embedded config, while 60–80% similarity suggests the same family with code changes
Automated triage pipelines (hash → VT lookup → sandbox → cluster → YARA) let analysts focus manual time on genuinely novel samples rather than re-analysing known commodity malware

Chapter 15 · ~14 min · Detection & Intel

Malware Families, IR Integration, and Career

Applying Book 1 techniques to ransomware, stealers, loaders, and RATs — the analysis-to-containment loop, analyst career path, certifications, and practice resources

The preceding fourteen chapters covered individual techniques in isolation. This final chapter applies them together — showing what each family type looks like when you run the full Book 1 analysis pipeline, and how those findings flow directly into incident response actions. It also closes the loop on the career path: where malware analysis fits in security operations, and how to build the skill through structured practice.

Ransomware Triage — What Book 1 Finds

Malware Family — Ransomware

Ransomware's distinguishing characteristic is mass file system activity — the encryption loop. Even without knowing the cryptographic implementation, Book 1 techniques surface every observable indicator needed for immediate incident response.

Ransomware Analysis
What each Book 1 technique finds on a ransomware sample

Static (strings + imports):
  Strings:  "vssadmin delete shadows /all /quiet"   ← T1490
            "net stop %s"                           ← stops backup services
            "README_DECRYPT.txt"                    ← ransom note filename
  Imports:  CryptGenKey, CryptEncrypt               ← encryption capability
            FindFirstFileW, FindNextFileW           ← file enumeration
            MoveFileExW                             ← file rename (extension change)

Dynamic (ProcMon):
  14:23:01  ransomware.exe  Process Create  cmd.exe /c vssadmin delete shadows /all /quiet
  14:23:02  ransomware.exe  Process Create  cmd.exe /c net stop "Windows Backup"
  14:23:03  ransomware.exe  WriteFile       C:\Users\...\document.docx.LOCKED
  14:23:03  ransomware.exe  DeleteFile      C:\Users\...\document.docx
  14:23:03  ransomware.exe  WriteFile       C:\Users\...\README_DECRYPT.txt
  ... (thousands of file encrypt/rename events in seconds)

Network: beacon to C2 on first execution (exfiltrates encryption key)
  POST /key HTTP/1.1
  Host: ransom-c2.onion.ws
  Body: [base64 AES key]

IR actions from analysis:
  IMMEDIATE:  isolate all endpoints showing MoveFileExW to .LOCKED extension
  IMMEDIATE:  block ransom-c2.onion.ws at firewall/proxy
  HUNT:       Event ID 4688: vssadmin / net stop across all endpoints
  RESTORE:    Shadow copies (may already be deleted); restore from backup
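The "thousands of file events in seconds" pattern in the ProcMon trace is what a rate-based detection keys on. The sketch below counts file write, delete, and rename events per process inside a sliding time window and flags any process that crosses a threshold. The window (10 seconds) and threshold (100 events) are illustrative assumptions and would need tuning against normal activity such as backups or large file copies. The events are synthetic.

```python
from collections import defaultdict

# ProcMon-style operations treated as part of an encryption loop.
FILE_OPS = {"WriteFile", "DeleteFile", "SetRenameInformationFile"}


def rename_bursts(events, window_s=10, threshold=100):
    """Return processes with at least `threshold` file events inside any window.

    events is an iterable of (timestamp_seconds, process_name, operation).
    """
    by_proc = defaultdict(list)
    for ts, proc, op in events:
        if op in FILE_OPS:
            by_proc[proc].append(ts)

    flagged = set()
    for proc, stamps in by_proc.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            while ts - stamps[start] > window_s:
                start += 1  # slide the window forward
            if end - start + 1 >= threshold:
                flagged.add(proc)
                break
    return sorted(flagged)


# Synthetic trace: 150 writes in three seconds versus five writes over five seconds.
trace = [(i * 0.02, "ransomware.exe", "WriteFile") for i in range(150)]
trace += [(float(i), "winword.exe", "WriteFile") for i in range(5)]
print(rename_bursts(trace))
```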

Stealer — What Book 1 Finds

Stealer Analysis
Browser credential theft visible in ProcMon and network capture

Static (strings):
  "Login Data"           ← Chrome SQLite credential database filename
  "CryptUnprotectData"   ← DPAPI decryption API (decrypts Chrome master key)
  "cookies"              ← browser cookie database
  "wallet.dat"           ← Bitcoin wallet file

Dynamic (ProcMon):
  stealer.exe  ReadFile  C:\Users\victim\AppData\Local\Google\Chrome\User Data\Default\Login Data
  stealer.exe  ReadFile  C:\Users\victim\AppData\Roaming\Mozilla\Firefox\Profiles\...\logins.json
  stealer.exe  TCP Send → 185.220.101.45:443 (2.3 MB exfiltration)

IR actions:
  Force password reset for every account whose credentials were saved in browsers on the affected endpoint
  Revoke all active sessions (cookies stolen)
  Hunt for Login Data access events across all endpoints
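The hunt in the last IR action amounts to one question: which processes other than the browsers themselves read the credential stores? The sketch below applies that filter to file-read events. The event layout, the browser allow-list, and the list of file name suffixes are assumptions for illustration; real telemetry would come from EDR or file-access auditing with its own field names.

```python
# File name suffixes that identify browser credential stores and wallets.
CREDENTIAL_FILES = ("\\login data", "\\logins.json", "\\cookies", "\\wallet.dat")
# Processes expected to touch these files (an assumption; extend as needed).
BROWSERS = {"chrome.exe", "firefox.exe", "msedge.exe"}


def credential_access_hits(events):
    """Return read events on credential stores made by non-browser processes."""
    hits = []
    for event in events:
        if event["process"].lower() in BROWSERS:
            continue
        if event["path"].lower().endswith(CREDENTIAL_FILES):
            hits.append(event)
    return hits


reads = [
    {"process": "chrome.exe",
     "path": "C:\\Users\\victim\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Login Data"},
    {"process": "stealer.exe",
     "path": "C:\\Users\\victim\\AppData\\Local\\Google\\Chrome\\User Data\\Default\\Login Data"},
    {"process": "stealer.exe",
     "path": "C:\\Users\\victim\\Documents\\report.docx"},
]
for hit in credential_access_hits(reads):
    print(hit["process"], hit["path"])
```

Backup agents and security tools may read these files legitimately, so the allow-list usually grows during the first few hunts.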

The Analysis-to-Containment Loop

The value of malware analysis in an IR context is measured by the speed of the loop from finding to containment action. Analysis findings drive five categories of IR action:

Network isolation: block C2 domains and IPs at the firewall and proxy immediately; sinkhole DNS for domains the C2 might rotate to
Endpoint hunting: YARA scans across all endpoints for the sample's static indicators; Sigma rules deployed to SIEM for behavioural indicators; Event ID hunting for specific actions the malware performs
Containment: isolate affected endpoints identified by hunting; disable persistence mechanisms found in analysis (Run keys, services, scheduled tasks)
Recovery: remove malware components at the paths identified in analysis; verify removal with YARA scan; restore from backup where data was destroyed
Lessons learned: detection gaps identified in the ATT&CK coverage assessment become the next sprint's detection engineering work

Malware Analyst Career Path

Role	Focus	Key Skills	Certification
Malware Analyst (Tier 1)	Triage, IOC extraction, sandbox analysis, YARA rules	Book 1 content — static, dynamic, Volatility, YARA	BTL1, GCFE
Malware Analyst (Tier 2)	Reverse engineering, unpacking, anti-analysis, family deep dives	Book 1 + Book 2 — assembly, Ghidra, IDA, debuggers	GREM, FOR610
Detection Engineer	Turning analysis into Sigma/YARA/Suricata rules, coverage assessment	ATT&CK, Sigma, SIEM query languages, threat intel integration	GCDA, vendor certs
Threat Intelligence Analyst	Clustering, attribution, campaign tracking, IOC sharing	MISP, STIX, pivoting, geopolitical context, reporting	GCTI, CTI certifications

Practice Resources

MalwareBazaar (abuse.ch): free, searchable database of malware samples. Filter by tag, file type, or family. The primary free source for real samples.
any.run (free tier): interactive sandbox where you control the mouse and keyboard inside the analysis environment. Excellent for samples requiring user interaction.
theZoo (GitHub): curated repository of live malware samples for educational use. Handle with care; these are real, functional malware binaries.
Malware Traffic Analysis (malware-traffic-analysis.net): exercises in the form of PCAP files and questions. Excellent for network analysis and IOC extraction practice.
Flare-On Challenge: Mandiant's annual reverse engineering competition; the challenge binaries and solutions are published after each contest. (FLARE-VM is the separate analysis environment used throughout this book.)
Blue Team Labs Online / CyberDefenders: structured labs that include malware analysis scenarios with guided questions and scoring.

Key Takeaways — Chapter 15

Ransomware's Book 1 signature is mass file rename events + vssadmin shadow copy deletion + a single network POST to the C2 on first execution, all visible without assembly knowledge
Stealer behaviour is visible in ProcMon as file reads to browser profile directories — the Chrome Login Data SQLite file and Firefox logins.json accessed by an unknown process is an immediate containment trigger
The analysis-to-containment loop converts findings directly into IR actions — C2 domains to firewall blocks, persistence paths to registry cleanup, ATT&CK coverage gaps to detection rules
GREM (GIAC Reverse Engineering Malware) is the senior certification for the discipline — FOR610 is the corresponding SANS course; both assume Book 1 skills and teach Book 2 content
MalwareBazaar provides free real samples for practice — always analyse in an isolated VM, never on a production machine or connected to a production network

Go Deeper

MalwareBazaar: free malware sample repository
any.run: interactive sandbox (free tier available)
Malware Traffic Analysis: PCAP exercises
FlareVM: automated Windows analysis environment setup
MITRE ATT&CK: adversary technique knowledge base

Continue Your Learning

Malware Analysis Book 2: Reverse Engineering & Advanced Techniques

Book 1 covered the practitioner skills — triage, behavioural analysis, memory forensics, and detection engineering. Book 2 goes to the code level: reading x86/x64 assembly, reverse engineering with Ghidra and IDA Pro, unpacking custom packers, defeating anti-analysis techniques, and performing full reverse engineering of ransomware, RATs, and rootkits.

12 chapters · ~3 hrs · Prerequisites: Book 1 or equivalent experience
Assembly fundamentals · Ghidra workflow · IDA Pro · x64dbg dynamic debugging
Custom unpacking · Deobfuscation · API hashing · Anti-debugging · Anti-VM
Shellcode analysis · Rootkit internals · Ransomware RE · C2 protocol reversal

Continue to Book 2 →
