Analyzing Modern Phishing Hosting Infrastructure

The alert lands in your queue: a credential phishing link in an otherwise clean email. The URL isn't some garbage-string domain hosted on a compromised WordPress site. It's using HTTPS, served by Cloudflare, and resolves to a perfectly reputable IP address. The page looks pixel-perfect. Ten years ago, you'd just block the IP and the domain and move on. Today, that's just the first, and least effective, step in a much longer fight.

This isn't your parents' phishing. Attackers have professionalized their operations, adopting the same tools legitimate businesses use for resilience and performance. They treat their phishing sites like production web applications, deploying them behind Content Delivery Networks (CDNs), using decentralized file systems for hosting, and exfiltrating data through commercial APIs. For a SOC analyst, this means the old playbook of 'block IP, block domain' is dangerously insufficient.

Understanding this modern phishing hosting infrastructure isn't academic. It's the key to effective remediation, pivoting to find related campaign activity, and actually disrupting the actor's operations instead of just playing whack-a-mole with disposable domains.

The CDN Shell Game: Hiding in Plain Sight

Content Delivery Networks are the internet's bedrock. They cache content geographically close to users, provide DDoS mitigation, and offer free, auto-renewing SSL/TLS certificates. For a threat actor, this is a gold mine. Wrapping a phishing site in a service like Cloudflare or Akamai provides an instant credibility boost and a thick layer of operational security.

Why It Works

The core benefit is origin obfuscation. When you resolve the domain of a Cloudflare-protected site, you get a Cloudflare IP, not the attacker's server IP. This breaks the simplest and most common IOC—the malicious IP address. You can't block it, because you'd be blocking thousands of legitimate sites. Even if you submit an abuse report, the CDN is merely the proxy; the actual malicious files live on an 'origin server' hidden somewhere else. Takedown responsibility is diffused.

Furthermore, the free SSL certificate makes the phishing link look trustworthy to the end user. The little padlock in the browser bar is a powerful psychological cue, even though it only signifies an encrypted connection, not a trustworthy destination. The entire setup costs the attacker nothing and makes their campaign significantly more resilient to knee-jerk blocking.

The Analyst's Pivot

Your job is to peel back the CDN layer. Tools that search historical DNS records or look for technology fingerprints (like Censys or SecurityTrails) can sometimes uncover the true origin IP. Attackers make mistakes. They might forget to proxy a subdomain, or the origin IP might have been exposed before they configured the CDN correctly. Finding that origin IP gives you a high-fidelity IOC that you *can* block and report to the actual hosting provider.
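
As a rough illustration of that pivot, the sketch below filters a list of historical A records down to the addresses that fall outside known CDN ranges. The `CDN_RANGES` sample (a few Cloudflare IPv4 prefixes) and the `(hostname, ip)` record shape are assumptions for illustration. In practice you would load the CDN's currently published ranges and adapt the input to whatever your passive DNS source exports.

```python
import ipaddress

# Illustrative sample of Cloudflare IPv4 prefixes (assumption: in real use,
# load the provider's current published list, since ranges change over time).
CDN_RANGES = [ipaddress.ip_network(n) for n in (
    "104.16.0.0/13",
    "172.64.0.0/13",
    "173.245.48.0/20",
)]


def is_cdn_ip(ip: str) -> bool:
    """True if the address falls inside one of the sample CDN ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CDN_RANGES)


def candidate_origins(records):
    """Return unique (hostname, ip) pairs whose IP is outside the CDN ranges.

    `records` is a hypothetical list of (hostname, ip) tuples, for example
    exported from a passive DNS lookup on the domain and its subdomains.
    """
    seen = set()
    candidates = []
    for hostname, ip in records:
        key = (hostname, ip)
        if key in seen or is_cdn_ip(ip):
            continue
        seen.add(key)
        candidates.append(key)
    return candidates
```

Anything this returns is only a lead, not proof: an old record may point at a previous owner of the domain, so confirm that the candidate IP actually serves the phishing content before blocking or reporting it.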

Decentralized Deception: The IPFS Nightmare

Just when you get good at finding origin servers, a new wrinkle appears: phishing pages hosted on the InterPlanetary File System (IPFS). IPFS is a peer-to-peer protocol for storing and sharing data in a distributed file system. Content isn't addressed by location (like an IP address) but by its cryptographic hash. To put it bluntly: there is no single server to take down.

When an attacker uploads their phishing kit to IPFS, the content is replicated across multiple nodes run by volunteers. The URL a victim receives will typically point to a public IPFS gateway—a server that acts as a bridge between the traditional web and the IPFS network. The URL might look something like `ipfs.io/ipfs/QmXo...` or use a custom domain that CNAMEs to a gateway.

A piece of content is available from any node that has it, for as long as at least one node in the network is pinning it.

Gateways: The Centralized Chokepoint

The decentralized nature is the challenge. You can't file an abuse report with 'the IPFS network'. However, the campaign's reliance on public gateways is its weak point. These gateways are centralized entities, often run by reputable companies, and they have abuse policies. Getting a gateway operator to block the content won't remove it from IPFS itself, but it does break the specific URL used in the phishing email. Your focus shifts from taking down the content to taking down the access path.

Analysts must identify which IPFS gateway is being used and report the full malicious URL to the gateway operator. It's still a cat-and-mouse game—the attacker can simply switch to another gateway for their next campaign—but it neutralizes the immediate threat.
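
To make the gateway report concrete, here is a minimal sketch that splits a phishing URL into the gateway host and the content identifier (CID). It handles the two common gateway layouts, path style (`<gateway>/ipfs/<cid>`) and subdomain style (`<cid>.ipfs.<gateway>`). The function name and the CIDs in the usage note are hypothetical placeholders, and a custom domain that CNAMEs to a gateway would need a DNS lookup that this sketch does not attempt.

```python
from urllib.parse import urlparse


def parse_ipfs_url(url: str):
    """Return (gateway_host, cid) for a gateway URL, or None if it isn't one."""
    if "://" not in url:
        url = "https://" + url  # accept bare forms such as ipfs.io/ipfs/<cid>
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    segments = [s for s in parsed.path.split("/") if s]

    # Path style: https://<gateway>/ipfs/<cid>/optional/file
    if len(segments) >= 2 and segments[0] == "ipfs":
        return host, segments[1]

    # Subdomain style: https://<cid>.ipfs.<gateway>/optional/file
    labels = host.split(".")
    if len(labels) >= 3 and labels[1] == "ipfs":
        return ".".join(labels[2:]), labels[0]

    return None
```

For example, `parse_ipfs_url("ipfs.io/ipfs/QmPlaceholderCid/index.html")` yields the gateway `ipfs.io` and the CID `QmPlaceholderCid`. The gateway tells you who to report to, and the CID is worth keeping as an indicator because it stays the same when the attacker moves to a different gateway.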

Exfiltration via API: Telegram as a C2 Drop Box

The infrastructure story doesn't end with hosting the page. The attacker also needs to receive the stolen credentials. The old method was a PHP script on the phishing server that would write credentials to a local file or email them to the attacker. This is noisy and creates artifacts on the server that can be discovered.

Modern kits don't do that. Instead, they use APIs as a fire-and-forget data exfiltration channel. A favorite is the Telegram Bot API. The phishing page's client-side JavaScript captures the username and password, then makes a simple POST request directly to `api.telegram.org`. The request sends the credentials as a message to a private channel or chat controlled by the attacker.

This is brutally effective. The request is an outbound HTTPS connection from the victim's own browser to a legitimate domain (Telegram). It doesn't touch the phishing server. There's nothing to find in the server logs. By the time your analysis begins, the credentials are long gone, sitting in a chat log outside the reach of any takedown process.

When you find a Telegram `chat_id` and bot token hardcoded in the phishing page's JavaScript, you have confirmation of data exfiltration. The incident response priority immediately shifts from infrastructure takedown to user account compromise: force a password reset, invalidate sessions, and check for downstream malicious activity.
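
A simple way to triage a saved copy of the page is to search its source for those hardcoded values. The sketch below is a heuristic under stated assumptions: the token pattern (a numeric bot ID, a colon, then a long URL-safe string) and the `chat_id` pattern are approximations rather than an official format specification, and kits that obfuscate or encode their JavaScript will slip past it.

```python
import re

# Heuristic patterns (assumptions, not an official format definition).
TOKEN_RE = re.compile(r"(\d{6,12}:[A-Za-z0-9_-]{30,50})")
CHAT_ID_RE = re.compile(r"chat_id[\"']?\s*[:=]\s*[\"']?(-?\d+)")


def find_telegram_exfil(page_source: str) -> dict:
    """Collect likely Telegram Bot API indicators from page HTML/JavaScript."""
    return {
        # Plain substring check for the Bot API host named in the text.
        "uses_bot_api": "api.telegram.org" in page_source,
        # De-duplicated and sorted so results are stable across runs.
        "bot_tokens": sorted(set(TOKEN_RE.findall(page_source))),
        "chat_ids": sorted(set(CHAT_ID_RE.findall(page_source))),
    }
```

Run it against the page source captured in a sandbox, not against a live fetch from your workstation. Any token or chat ID it returns is also a useful correlation key, since the same bot often shows up across many kits from one actor.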

From a Single Email to the Wider Campaign

An analyst's value isn't just closing a single ticket. It's using one artifact to uncover the attacker's entire setup. To do that, you need to pivot from the data you have, and the email headers are still one of the richest sources of truth, even when authentication mechanisms fail.

Reading the `Received` Chain

Email authentication standards like SPF (RFC 7208) and DKIM (RFC 6376) are vital, but they can be misleading, especially when forwarded mail breaks SPF checks. The `Received` headers, however, are a stamped, chronological travelogue of the email's journey from the sender's mail submission agent (MSA) to your user's inbox. Each server prepends its own header, so read them from bottom to top. The bottom-most `Received` header, typically added by the sender's own outbound relay, will often contain the source IP of the script or client that sent the mail. Treat any header below the first hop recorded by your own infrastructure with caution, because the sender controls those lines and can forge them. The connecting IP logged by your own inbound server, though, is a high-confidence indicator even if the `From` address is spoofed.

Example `Received` header:

    Received: from mail.attackerserver.com (mail.attackerserver.com [198.51.100.23]) by mx.corporate-inbound.com (Postfix) with ESMTPS id 4Mq... for <user@example.com>; Tue, 14 May 2024 10:30:01 +0000 (UTC)

In that one line, you have a hostname and an IP (`198.51.100.23`) to investigate. Is it a known-bad IP? Does the hostname's domain have a poor reputation? This is where you pivot to passive DNS, WHOIS records, and threat intelligence feeds to see what else is associated with that infrastructure.
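
The bottom-to-top reading can be automated with Python's standard `email` module. This is a minimal sketch: the function name and output shape are illustrative choices, and it only picks up IPs written in square brackets, which is the common convention but not something every mail server follows.

```python
import email
import ipaddress
import re

# Matches [198.51.100.23] and [IPv6:2001:db8::1] style address literals.
BRACKETED_IP_RE = re.compile(r"\[(?:IPv6:)?([0-9a-fA-F:.]+)\]")


def received_hops(raw_message: str):
    """Return the Received headers oldest-first, with bracketed IPs per hop."""
    msg = email.message_from_string(raw_message)
    headers = msg.get_all("Received") or []
    hops = []
    # Each server prepends its header, so reverse for chronological order.
    for value in reversed(headers):
        ips = []
        for candidate in BRACKETED_IP_RE.findall(value):
            try:
                ips.append(str(ipaddress.ip_address(candidate)))
            except ValueError:
                continue  # bracketed text that is not a valid IP
        # Collapse folded header whitespace into single spaces for display.
        hops.append({"header": " ".join(value.split()), "ips": ips})
    return hops
```

The first entry in the returned list is the earliest hop and the last is the one written by your own inbound server. Feed the extracted IPs into your passive DNS and reputation lookups, keeping in mind which hops your infrastructure actually witnessed.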

Unpacking the URL Payload

The phishing URL itself is another pivot point. Don't just browse to it. Use a command-line tool like `curl -v` or an online sandbox to inspect the HTTP response headers and page content safely. Look for redirect chains. A URL shortener might lead to a compromised-but-legitimate site, which then redirects to the final credential harvester behind a CDN. Each step is a potential pivot point revealing more of the attacker's infrastructure.
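
If you prefer to script the redirect walk, the sketch below records each hop without automatically following it, using only the standard library. The function names, the `HEAD` method, and the browser-like `User-Agent` value are illustrative assumptions. It only sees HTTP 3xx redirects, so JavaScript or meta-refresh redirects still need a sandbox, and it should be run from an isolated analysis host.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # do not follow; urllib then raises HTTPError for the 3xx


def fetch_once(url, timeout=10):
    """Make one request and return (status_code, Location header or None)."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "Mozilla/5.0"}
    )
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def trace_redirects(url, fetch=fetch_once, max_hops=10):
    """Return a list of (url, status) tuples, one per hop in the chain."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # resolve relative Location values
    return chain
```

The `fetch` parameter is injectable so the chain logic can be exercised offline, or swapped for a fetcher that routes through your analysis proxy. Some kits answer `HEAD` differently from `GET`, so treat a surprising result as a prompt to re-check in a sandbox.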

Pay close attention to the domain and TLD. Attackers love using lookalike domains or new TLDs like `.xyz`, `.top`, or `.live` that are cheap and have lax registration policies. The domain's registrar and creation date are crucial data points. A domain registered yesterday hosting a Microsoft login page is almost never legitimate.
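
Those two signals are easy to encode as a first-pass triage check. In this sketch the TLD set contains only the examples named above, the 30-day threshold is an arbitrary assumption, and the creation date is expected to come from whatever WHOIS or RDAP lookup you already use. The last label is treated as the TLD, which is a simplification that ignores multi-part suffixes.

```python
from datetime import date

# Only the TLDs named in the text; extend from your own telemetry.
CHEAP_TLDS = {"xyz", "top", "live"}


def domain_flags(domain: str, created: date, today: date, max_age_days: int = 30):
    """Return a list of simple risk flags for a domain."""
    flags = []
    # Simplification: treat the final label as the TLD.
    tld = domain.lower().rstrip(".").rsplit(".", 1)[-1]
    if tld in CHEAP_TLDS:
        flags.append("cheap_tld")
    # Assumed threshold: anything registered within max_age_days is "new".
    if (today - created).days <= max_age_days:
        flags.append("newly_registered")
    return flags
```

Neither flag is conclusive on its own, since plenty of legitimate sites use these TLDs. They are meant to raise the priority of a ticket, not to close it.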

The takeaway

Chasing down every phishing site is an unwinnable war of attrition. The infrastructure is too cheap, too resilient, and too easy to replace. Attackers can spin up a new CDN-protected, IPFS-hosted site with API-based exfiltration in minutes. Our response times are measured in hours or days. The math doesn't work.

The operational goal, then, must shift. Instead of focusing solely on takedowns, analysts need to focus on extracting and correlating infrastructure patterns. By identifying the origin hosts, the specific Telegram bots, the domain registration habits, and the TTPs of the phishing kit itself, you build a profile of the threat actor. This intelligence allows for proactive defense—blocking patterns, not just instances. Tools like MailSleuth.AI can automate the drudgery of header analysis and infrastructure lookups, giving you the time to focus on that higher-level strategic analysis.
