Learn how password hashes work, why they can be cracked offline, and how John the Ripper uses wordlists and rules to recover plaintext credentials.
Password Hash Cracking
When applications store passwords, they should never store the plaintext. Instead they store a cryptographic hash — a one-way function that converts the password into a fixed-length string. To verify a login, the application hashes the submitted password and compares it to the stored hash.
Offline cracking occurs when an attacker gains access to a file containing password hashes (e.g. /etc/shadow, a database dump) and tries to recover the original passwords by hashing candidate passwords and comparing the results, without ever touching the live system. John the Ripper (JtR) is one of the most widely used offline password crackers; it is pre-installed on Kali Linux and is required knowledge for the CEH and PenTest+ certification exams.
Why Hashes Break — and Why It Matters for Defence
A hash function takes an input (the password) and produces a fixed-length output (the digest). The same password always produces the same hash (given the same salt, where one is used), and different passwords produce, for all practical purposes, different hashes. You cannot mathematically reverse a hash to recover the original password. So how does cracking work?
The wax seal analogy: Imagine a medieval king who verifies letters by pressing them into wax with his unique signet ring. Anyone can verify the seal came from the king's ring — but they can't reconstruct the ring from the wax impression. Password cracking doesn't reconstruct the ring. Instead, it has a bag of thousands of rings and presses each one into fresh wax, then checks which impression matches the one on the letter. It's a search problem, not a mathematical reversal. Given enough candidate rings (passwords), you'll eventually find the match.
This is exactly how John the Ripper works: it takes candidate passwords from a wordlist or generates them algorithmically, computes the hash of each one using the same algorithm as the target system, and checks whether the result matches any hash in the target file. The search terminates when a match is found — the candidate that produced that hash is the original password.
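The hash-and-compare loop described above can be sketched in a few lines of shell. This is only an illustration of the idea, not how John is implemented: crack_md5 is a hypothetical helper, the target is an unsalted MD5 digest, and the four candidates stand in for a wordlist.

```shell
#!/bin/sh
# Illustrative dictionary attack against one unsalted MD5 hash.
# crack_md5 is a hypothetical helper, not part of John the Ripper.
# Usage: crack_md5 TARGET_HASH CANDIDATE...
crack_md5() {
    target=$1
    shift
    for candidate in "$@"; do
        # Hash the candidate with the same algorithm as the target.
        digest=$(printf '%s' "$candidate" | md5sum | cut -d' ' -f1)
        if [ "$digest" = "$target" ]; then
            printf '%s\n' "$candidate"   # match: this candidate is the password
            return 0
        fi
    done
    return 1   # no candidate matched
}

# 5f4dcc3b5aa765d61d8327deb882cf99 is the MD5 digest of the string "password".
crack_md5 5f4dcc3b5aa765d61d8327deb882cf99 123456 letmein password qwerty
```

The function never inverts the digest; it only recomputes digests for guesses, so it can succeed only if the real password is among the candidates.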
The security implications flow directly from this: cracking speed depends on two things — how many candidates per second the hardware can process, and how large the search space is (how many possible passwords exist). Slow algorithms (bcrypt, Argon2) artificially reduce the first. Long, random passwords exponentially increase the second. Both are necessary; neither alone is sufficient.
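Both factors can be put into rough numbers. The sketch below assumes passwords of a fixed length drawn uniformly from a character set, and the two guess rates (1e10 per second for a fast hash, 1e3 per second for a slow one) are illustrative assumptions rather than measurements; keyspace and exhaust_seconds are hypothetical helper names.

```shell
#!/bin/sh
# keyspace CHARSET_SIZE LENGTH
# Number of possible passwords of exactly LENGTH characters.
keyspace() {
    awk -v c="$1" -v n="$2" 'BEGIN { printf "%.0f\n", c ^ n }'
}

# exhaust_seconds CHARSET_SIZE LENGTH RATE
# Worst-case seconds to try every password at RATE guesses per second.
exhaust_seconds() {
    awk -v c="$1" -v n="$2" -v r="$3" 'BEGIN { printf "%.0f\n", (c ^ n) / r }'
}

keyspace 26 8                        # lowercase only, 8 characters
keyspace 95 8                        # printable ASCII, 8 characters
exhaust_seconds 95 8 10000000000     # assumed fast-hash rate (1e10 guesses/s)
exhaust_seconds 95 8 1000            # assumed slow-hash rate (1e3 guesses/s)
```

Adding one character multiplies the keyspace by the charset size, while switching to a slow algorithm divides the guess rate; the last two lines show the same keyspace at both rates.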
The Linux Shadow File — Why It Matters
On Linux systems, /etc/passwd is readable by all users — it contains account information but not passwords. Password hashes are stored separately in /etc/shadow, which is readable only by root. This separation exists precisely because any user can read /etc/passwd — if it also contained hashes, any local user could crack them offline.
When a penetration tester gains root access to a Linux system, reading /etc/shadow is standard post-exploitation procedure. The unshadow utility combines the two files into the format John expects, merging the account metadata from passwd with the hash data from shadow.
# /etc/shadow entry format:
alice:$6$rounds=5000$salt$hashvalue:19000:0:99999:7:::
#   alice        user
#   $6$          algorithm
#   rounds=5000  rounds
#   salt         salt
#   hashvalue    hash

# Hash algorithm identifiers:
$1$   MD5crypt      (legacy, fast to crack)
$5$   SHA-256crypt  (moderate speed)
$6$   SHA-512crypt  (slower, long-standing default on many Linux distributions)
$y$   yescrypt      (default on several newer distributions)
$2b$  bcrypt        (very slow, ideal)

# The algorithm prefix tells John which hash format to use
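The entry layout above can be taken apart with cut. This is a minimal sketch that assumes the $id$[rounds=N$]salt$hash layout used by MD5crypt, SHA-256crypt, and SHA-512crypt (bcrypt arranges its fields differently); parse_shadow_entry is a hypothetical helper, and the sample entry is the placeholder from the format example, not a real hash.

```shell
#!/bin/sh
# parse_shadow_entry ENTRY
# Print the user, algorithm id, rounds, and salt from one /etc/shadow line.
# Assumes the $id$[rounds=N$]salt$hash layout of the $1$, $5$, and $6$ schemes.
parse_shadow_entry() {
    user=$(printf '%s\n' "$1" | cut -d: -f1)
    field=$(printf '%s\n' "$1" | cut -d: -f2)       # the hash field
    id=$(printf '%s\n' "$field" | cut -d'$' -f2)
    third=$(printf '%s\n' "$field" | cut -d'$' -f3)
    case $third in
        rounds=*)   # optional rounds parameter is present
            rounds=${third#rounds=}
            salt=$(printf '%s\n' "$field" | cut -d'$' -f4)
            ;;
        *)          # no rounds parameter: the third part is the salt
            rounds=default
            salt=$third
            ;;
    esac
    printf 'user=%s id=%s rounds=%s salt=%s\n' "$user" "$id" "$rounds" "$salt"
}

parse_shadow_entry 'alice:$6$rounds=5000$salt$hashvalue:19000:0:99999:7:::'
```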
John the Ripper Modes
John the Ripper (JtR) automatically detects many hash types and supports three main cracking modes: single crack, wordlist (optionally with rules), and incremental, ordered from fastest to most thorough. Knowing when to use each mode is as important as knowing the commands: starting with the wrong mode wastes hours, while the right sequence cracks most weak real-world passwords in minutes.
john hashfile.txt                              # auto-detect format, run all modes
john --format=sha512crypt hashfile.txt         # specify format explicitly
john --wordlist=rockyou.txt hash.txt           # wordlist mode: fastest, try first
john --wordlist=rockyou.txt --rules hash.txt   # wordlist + rule mutations
john --incremental hash.txt                    # brute force: slowest, last resort
john --show hashfile.txt                       # display all previously cracked passwords
john --list=formats                            # list all supported hash formats

# Mode order for practical use:
# 1. Single crack (uses account metadata; very fast, catches trivial passwords)
# 2. Wordlist (rockyou.txt; catches most real-world passwords in minutes)
# 3. Wordlist + rules (catches policy-compliant mutations like Password1!)
# 4. Incremental (exhaustive brute force; only for short/simple passwords)
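The mode order above can be written down as an explicit sequence of invocations. The sketch below is a dry run: print_crack_plan is a hypothetical helper that only prints the commands, so nothing is executed and John does not need to be installed. It uses John's --single option for the single crack step; the file names are placeholders.

```shell
#!/bin/sh
# print_crack_plan HASHFILE WORDLIST
# Dry run: print one john invocation per step, cheapest first.
print_crack_plan() {
    printf 'john --single %s\n' "$1"                   # 1. single crack
    printf 'john --wordlist=%s %s\n' "$2" "$1"         # 2. plain wordlist
    printf 'john --wordlist=%s --rules %s\n' "$2" "$1" # 3. wordlist + rules
    printf 'john --incremental %s\n' "$1"              # 4. brute force
    printf 'john --show %s\n' "$1"                     # review what was cracked
}

print_crack_plan combined.txt rockyou.txt
```

Each later step is only worth its cost for the hashes that the earlier steps left uncracked.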
Cracking in Practice
After gaining root access to a Linux system, extracting and cracking all local account password hashes is standard post-exploitation procedure. The unshadow utility combines /etc/passwd (account metadata) with /etc/shadow (password hashes) into a single file John can process. Cracked passwords are tested for reuse across SSH, databases, and other services throughout the environment.
# Combine the two files into John's expected format:
unshadow /etc/passwd /etc/shadow > combined.txt

# Run wordlist attack first; it catches most passwords quickly:
john --wordlist=/usr/share/wordlists/rockyou.txt combined.txt
password123      (alice)
admin            (root)

# Add rules to catch policy-compliant mutations:
john --wordlist=rockyou.txt --rules=jumbo combined.txt
S3cur3Pass!      (deploy)

# Show all cracked passwords across all sessions:
john --show combined.txt
3 password hashes cracked, 1 left
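To make the merge step concrete, the sketch below approximates what unshadow produces: each passwd line with its placeholder password field replaced by the hash from the matching shadow line. merge_shadow is a hypothetical stand-in written with awk, the sample account and hash are made up, and the real utility may handle cases this version does not.

```shell
#!/bin/sh
# merge_shadow PASSWD_FILE SHADOW_FILE
# Replace field 2 of each passwd line with the hash from the shadow file.
merge_shadow() {
    awk -F: 'BEGIN { OFS = ":" }
        NR == FNR { hash[$1] = $2; next }   # first file read: shadow
        ($1 in hash) { $2 = hash[$1] }      # second file read: passwd
        { print }' "$2" "$1"
}

# Demonstration on throwaway sample files (not the real system files).
tmp=$(mktemp -d)
printf '%s\n' 'alice:x:1000:1000:Alice:/home/alice:/bin/bash' > "$tmp/passwd"
printf '%s\n' 'alice:$6$salt$hashvalue:19000:0:99999:7:::' > "$tmp/shadow"
merge_shadow "$tmp/passwd" "$tmp/shadow"
rm -r "$tmp"
```

Keeping the passwd metadata alongside the hash matters because single crack mode derives its first guesses from the username, full name, and home directory.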
The value of cracking local accounts extends beyond that single machine. Developers and system administrators routinely reuse passwords. A password cracked from one server's shadow file should be tested against every other service in the environment — SSH on other hosts, VPN credentials, web application logins, and database access.
Password policies requiring uppercase, numbers, and symbols don't create random passwords — they create predictable mutations of common words. Rules transform wordlist entries to match these patterns, catching passwords like Password1! that a plain wordlist misses entirely.
The predictable modification analogy: If a workplace requires you to wear a formal hat, most people don't buy a random hat — they take their favourite cap and add a ribbon to make it technically "formal." Password policies work the same way: people take a word they know and apply the minimum changes to pass the check. Rules codify exactly these minimum changes and try them all automatically.
# Jumbo rules try many common transformations:
john --wordlist=rockyou.txt --rules=jumbo hashfile.txt

# For each word "password", tries:
password → Password → PASSWORD → p@ssword
password → password1 → Password1 → Password1!
password → passw0rd → P@ssw0rd → P@ssw0rd1!

# best64 rule set: smaller and faster, covers the most common patterns:
john --wordlist=rockyou.txt --rules=best64 hashfile.txt

# List available rule sets:
john --list=rules
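What a rule set does to each wordlist entry can be imitated with standard text tools. This sketch is not John's rule engine or rule syntax; mutate is a hypothetical helper that applies three assumed transformations (keep the word, capitalise the first letter, substitute a→@ and o→0) and three assumed suffixes.

```shell
#!/bin/sh
# mutate WORD
# Print policy-style variants of WORD, one per line.
mutate() {
    word=$1
    first=$(printf '%s' "$word" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$word" | cut -c2-)
    capitalised=$first$rest
    leet=$(printf '%s' "$word" | sed 's/a/@/g; s/o/0/g')
    for base in "$word" "$capitalised" "$leet"; do
        for suffix in '' 1 '1!'; do
            printf '%s%s\n' "$base" "$suffix"
        done
    done
}

mutate password
```

Three bases times three suffixes yields nine candidates from a single word, which is why rules multiply the size of a wordlist run and are tried after the plain wordlist.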
Attempting to crack a hash in the wrong format produces zero results: John either fails silently or misidentifies the algorithm. Identifying the hash type before cracking saves significant time. The format can usually be narrowed down from the hash's length, character set, and prefix, although some formats share a shape (MD5, NTLM, and LM are all 32 hex characters) and must be told apart by where the hash came from. The hash-identifier tool automates this, and is itself on the CEH practical tool list.
# Visual identification by length and prefix:
5f4dcc3b5aa765d61d8327deb882cf99            ← MD5: 32 hex chars
5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8    ← SHA-1: 40 hex chars
$6$rounds=5000$salt$hashvalue               ← SHA-512crypt: starts $6$
$2b$12$EixZaYVK1fsbw1ZfbX3OXe               ← bcrypt: starts $2b$
aad3b435b51404eeaad3b435b51404ee            ← LM hash: 32 hex, often repeated

# Automated identification:
hash-identifier 5f4dcc3b5aa765d61d8327deb882cf99
Possible Hashs: MD5

# List John's supported formats and find the right one:
john --list=formats | grep -i sha
john --format=sha512crypt hashfile.txt
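The same visual checks can be written as a small classifier. This is a rough sketch, far less thorough than hash-identifier: guess_hash_type is a hypothetical helper that looks only at a handful of crypt prefixes and at the length of plain hex strings, and a length match is a candidate list, not a certainty.

```shell
#!/bin/sh
# guess_hash_type HASH
# Print a best guess at the hash type based on prefix, then hex length.
guess_hash_type() {
    case $1 in
        '$1$'*) echo md5crypt; return ;;
        '$5$'*) echo sha256crypt; return ;;
        '$6$'*) echo sha512crypt; return ;;
        '$2a$'*|'$2b$'*|'$2y$'*) echo bcrypt; return ;;
    esac
    case $1 in
        ''|*[!0-9a-fA-F]*) echo unknown; return ;;   # empty or not pure hex
    esac
    case ${#1} in
        32) echo 'MD5, NTLM or LM' ;;   # same shape, context must decide
        40) echo 'SHA-1' ;;
        64) echo 'SHA-256' ;;
        *)  echo unknown ;;
    esac
}

guess_hash_type 5f4dcc3b5aa765d61d8327deb882cf99
guess_hash_type '$6$rounds=5000$salt$hashvalue'
```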
John can crack password-protected files by first extracting a crackable hash representation using helper tools. This applies to ZIP archives, PDF files, Office documents, SSH keys, and more. The process is always the same: extract-to-hash, then crack the hash offline with no interaction with the original file needed beyond the initial extraction.
# Step 1: Extract the hash from the protected file:
zip2john protected.zip > zip.hash
protected.zip/secret.txt:$pkzip2$1*2*2*0*...

# Step 2: Crack the extracted hash:
john --wordlist=rockyou.txt zip.hash
secretzip        (protected.zip/secret.txt)

# Step 3: Use the recovered password:
unzip -P secretzip protected.zip

# Other helper tools follow the same extract-then-crack pattern:
# pdf2john      → PDF files
# office2john   → Word, Excel, PowerPoint
# ssh2john      → encrypted SSH private keys
# keepass2john  → KeePass .kdbx databases
When wordlist and rule-based attacks fail, incremental mode performs exhaustive brute force — trying every possible combination of characters up to a specified length. This is slow and should only be used for short passwords (6 characters or fewer) or when the character set is known to be limited (e.g., digits only, lowercase only). For anything above 8 characters, incremental mode is computationally impractical on standard hardware.
# Built-in character sets:
john --incremental=Digits hash.txt    # digits only (0-9)
john --incremental=Lower hash.txt     # lowercase letters
john --incremental=Alpha hash.txt     # letters, upper and lower case
john --incremental=Alnum hash.txt     # letters + digits
john --incremental hash.txt           # full charset, very slow

# Practical guide (rough figures; actual times depend heavily on the hash algorithm):
# 4-digit PIN (0000-9999): seconds
# 6-char lowercase: minutes to hours
# 8-char mixed: days to weeks on CPU
# 10+ chars mixed: effectively infeasible on CPU alone
# Use hashcat (GPU mode) for serious brute force above 8 chars
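Exhaustive search over a small, known character set looks like this in miniature. The sketch assumes an unsalted MD5 hash of an all-digit PIN; md5_hex and crack_pin are hypothetical helpers, the target is computed in the script as a stand-in for a recovered hash, and the width is kept at 3 because spawning md5sum per candidate is far slower than a real cracker.

```shell
#!/bin/sh
# md5_hex STRING: hex MD5 digest of STRING (no trailing newline hashed).
md5_hex() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# crack_pin TARGET_MD5 WIDTH
# Try every zero-padded PIN of WIDTH digits until one hashes to TARGET_MD5.
crack_pin() {
    max=$(awk -v w="$2" 'BEGIN { printf "%d", 10 ^ w - 1 }')
    for pin in $(seq -w 0 "$max"); do
        if [ "$(md5_hex "$pin")" = "$1" ]; then
            printf '%s\n' "$pin"
            return 0
        fi
    done
    return 1   # keyspace exhausted without a match
}

target=$(md5_hex 042)     # stand-in for a hash recovered from a dump
crack_pin "$target" 3
```

Unlike a wordlist attack, this is guaranteed to find any PIN of the given width, but each extra digit multiplies the work by ten.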
Where Hash Cracking Appears in Real Assessments
A penetration tester exploits a SQL injection vulnerability on a legacy e-commerce application and dumps the users table. The password column contains 32-character hex strings — MD5 hashes with no salt. Running John against the dump with rockyou.txt and basic rules, 74% of 8,000 hashes crack within 20 minutes. The cracked passwords include the site administrator's credentials.
Beyond that single application, the cracked email:password pairs are tested for reuse. The administrator used the same password for their corporate email account. From a SQL injection in a web form, the assessment team now has access to the email account of the company's head of IT — including internal communications, cloud service login links, and sensitive project documents.
This cascade — SQLi → hash dump → offline crack → password reuse → email compromise — is a documented pattern in real breaches. The hash algorithm choice (unsalted MD5 vs bcrypt) is the single change that would have contained the breach to the original application.
During a red team engagement, initial access is gained via a web application vulnerability. The web server runs as www-data, but a local privilege escalation via a misconfigured sudo rule elevates to root. The tester reads /etc/shadow, finding SHA-512crypt hashes for four accounts: root, deploy, backup, and admin.
John cracks two of them within an hour using rockyou.txt with jumbo rules: deploy with password Deploy2023! and backup with Backup99. The deploy account is used for deployments across twelve servers in the environment. The same password works via SSH on nine of them. From one web application vulnerability, the assessment team has lateral movement to nine production servers — all because of password reuse across a deployment account.
What Hash Cracking Reveals About Password Security Posture
- Hashing algorithm is the most impactful single decision. Unsalted MD5 and SHA-1 hashes from a stolen database can be tested at billions of candidates per second on GPU hardware. bcrypt at cost 12 cuts that to roughly thousands per second, a slowdown of about a million-fold. A search that finishes in minutes against MD5 therefore takes years against bcrypt. For stored passwords, algorithm choice outweighs most other controls.
- Every hash file is a credential intelligence source. Even partially cracked hash sets reveal an organisation's password patterns — the typical length, common prefixes, favourite substitutions. This intelligence directly improves targeted rule creation for harder passwords in the same set.
- Offline cracking bypasses all online controls. Account lockout, rate limiting, MFA, and CAPTCHA all protect the live authentication service. None of them matter once the hash file is obtained. The only controls that matter at that point are the strength of the hash algorithm and the randomness of the passwords.
- Password reuse turns a single hash crack into a domain-wide event. The average person reuses passwords across 5–14 services. One cracked hash from a low-value system routinely unlocks corporate email, VPN, cloud consoles, and domain accounts. Password managers and unique-per-service passwords are the most direct mitigation; MFA on the other services limits what a reused password can unlock.
What You Need to Know
You've covered the theory. Now apply it hands-on in the simulated environment.