A Practical EML Header Analysis Workflow

The alert tone is an old, familiar sound. Another user-reported phish, another ticket in your queue with a subject line like `FW: Urgent: Invoice Overdue`. Attached is the raw message, a single `.eml` file. The clock is ticking. This isn't just about finding a bad link; it's about building a defensible case for why this message is malicious, tracking its origin, and determining the blast radius.

This is the moment of truth for an analyst. Anyone can run a URL through a sandbox. A real investigation requires you to deconstruct the delivery mechanics of the email itself. The headers don't lie, but they can definitely mislead. Having a repeatable, structured workflow is the only way to move from confusion to conviction, ticket after ticket.

Forget flimsy checklists. We're going to build a proper triage process: from ensuring you have the right evidence to dissecting authentication results and writing a report that other teams can actually use. This is how you turn a routine phish report into a genuine intelligence artifact.

Step 1: Acquire the Unadulterated Message

Your analysis is only as good as your evidence. If you start with compromised data, your conclusion will be worthless. A user hitting 'Forward' on a suspicious email is the original sin of email analysis. The moment they do, their mail client composes a brand-new message: the original transport headers are thrown away, the body is re-rendered and re-encoded, and the evidence is hopelessly contaminated.

The only acceptable submission is the original message, forwarded as an attachment. This embeds the original `.eml`, with its pristine headers and body, inside a new, harmless wrapper email. You are analyzing the attachment, not the email it arrived in. Your first step is always to confirm you have this. If not, kick the ticket back with instructions. Don't waste your time on mangled headers.

Why the rigidity? Because critical headers like `Authentication-Results` are added by *your* environment's mail gateway when it first processes the email. When a user forwards a message, all of that context is lost and replaced with a new set of headers showing a legitimate internal user sending an email. You're no longer analyzing an external threat; you're analyzing a benign internal message.
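One way to confirm you actually received the original is to look for an embedded `message/rfc822` part inside the wrapper. The sketch below uses Python's standard `email` package; the function name is illustrative, and it assumes the reporting client attached the message as `message/rfc822`. Some clients attach the `.eml` as `application/octet-stream` instead, which this sketch does not handle.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from typing import Optional


def extract_attached_original(wrapper_bytes: bytes) -> Optional[EmailMessage]:
    """Return the first embedded message/rfc822 part, or None if there is none."""
    wrapper = message_from_bytes(wrapper_bytes, policy=policy.default)
    for part in wrapper.walk():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part carries the embedded message as the
            # single item of its payload list.
            return part.get_payload(0)
    # No embedded message: most likely an inline forward, so kick it back.
    return None
```

A `None` result is your cue to return the ticket with instructions rather than analyze the wrapper's own headers.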

Step 2: The Right Tools for Header Deconstruction

An `.eml` file is just structured text, but staring at a 500-line header block in Notepad is a recipe for a headache and missed indicators. You need tools that can parse this wall of text into a structured, traversable format.

Command-Line Parsers for Automation

For analysts comfortable on the command line, tools like the Python `eml-parser` library are invaluable. They can be scripted to extract specific headers, decode base64-encoded content, and pull out URLs and attachment hashes automatically. This is essential for bulk processing. If you get a phishing campaign with 50 submissions, you're not going to triage them by hand. You'll script a parser to pull the `Return-Path`, subject, and sending IP from all 50 files to identify the common pattern.
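As a rough illustration of that bulk pass, here is a minimal sketch built on Python's standard `email` package rather than `eml-parser`, so it makes no claims about that library's API. The function names are hypothetical, and it only recognizes IPv4 addresses written in square brackets in the bottom-most `Received` header, which the sender can forge.

```python
import re
from collections import Counter
from email import message_from_bytes, policy
from typing import Dict, Iterable, List, Optional, Tuple

# Matches an IPv4 address in square brackets, e.g. "[203.0.113.55]".
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def summarize(raw: bytes) -> Dict[str, Optional[str]]:
    """Pull the Return-Path, subject, and claimed origin IP from one message."""
    msg = message_from_bytes(raw, policy=policy.default)
    received = msg.get_all("Received", [])
    origin_ip = None
    if received:
        # The last Received header in the file is the earliest hop.
        match = BRACKETED_IPV4.search(str(received[-1]))
        origin_ip = match.group(1) if match else None
    return {
        "return_path": str(msg.get("Return-Path", "")).strip("<> "),
        "subject": str(msg.get("Subject", "")),
        "origin_ip": origin_ip,
    }


def common_patterns(
    raw_messages: Iterable[bytes],
) -> Dict[str, List[Tuple[Optional[str], int]]]:
    """Count the three most frequent values of each field across a batch."""
    rows = [summarize(raw) for raw in raw_messages]
    return {
        field: Counter(row[field] for row in rows).most_common(3)
        for field in ("return_path", "subject", "origin_ip")
    }
```

In practice you might feed `common_patterns` with `path.read_bytes()` for every `.eml` in a submissions folder; whichever value dominates each field is the pattern to pivot on.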

Visual Header Analyzers

For one-off, deep-dive analysis, a visual tool is often faster. Online analyzers and client-side tools can ingest raw headers and present the delivery path visually, flag authentication failures, and surface key data points like the signing domain for DKIM. The goal is to let the machine handle the tedious parsing so you can focus on interpretation. What does it mean when the `Received` chain shows a hop through an unexpected residential IP address? Or when the DKIM signature is valid but for a completely unrelated domain? These are questions the human analyst needs to answer.

Step 3: Following the Hops and Checking the Receipts

This is the core of your investigation: tracing the message's journey and verifying its claimed identity. Two sets of headers are your guide: `Received` and `Authentication-Results`.

Tracing the `Received` Headers

Read the `Received` headers from the bottom up. Each MTA prepends its own header, so the bottom-most one is the first hop, where the message left the sender's machine or outbound mail server, and each header above it is a later relay. You are reading the story of the email's journey through the internet to your mail gateway. One caveat: only the headers stamped by servers you control are trustworthy. Everything below that boundary is supplied by the sending side and can be fabricated.

In this trace, you're looking for anomalies. Does the originating IP geolocate to a country that doesn't match the sender's purported language or business? Does the chain show a hop through a known-malicious relay or a consumer-grade ISP? Imagine a vendor's calendar invite that appears to pass through two extra, undocumented MTAs in Eastern Europe before hitting your gateway. That's a massive red flag. The path itself tells a story.
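To make the path easier to read, you can flatten the `Received` chain into an ordered list of hops. This is a minimal sketch using the standard library; the regular expressions are simplifications that assume the common `from ... by ...; date` layout and will miss unusual formats, and the function name is illustrative.

```python
import re
from email import message_from_string, policy
from email.utils import parsedate_to_datetime
from typing import Dict, List

FROM_HOST = re.compile(r"\bfrom\s+([^\s;()]+)", re.IGNORECASE)
BY_HOST = re.compile(r"\bby\s+([^\s;()]+)", re.IGNORECASE)
BRACKETED_IP = re.compile(r"\[([0-9A-Fa-f:.]+)\]")


def trace_hops(raw: str) -> List[Dict[str, object]]:
    """Return the Received hops oldest-first with host, IP, and timestamp."""
    msg = message_from_string(raw, policy=policy.default)
    hops = []
    # Each relay prepends its header, so reversing yields the oldest hop first.
    for value in reversed(msg.get_all("Received", [])):
        text = " ".join(str(value).split())  # collapse folded whitespace
        sender = FROM_HOST.search(text)
        receiver = BY_HOST.search(text)
        ip = BRACKETED_IP.search(text)
        # The timestamp conventionally follows the last semicolon.
        stamp = text.rpartition(";")[2].strip() if ";" in text else ""
        try:
            when = parsedate_to_datetime(stamp)
        except (TypeError, ValueError):
            when = None  # missing or unparseable date
        hops.append({
            "from": sender.group(1) if sender else None,
            "by": receiver.group(1) if receiver else None,
            "ip": ip.group(1) if ip else None,
            "time": when,
        })
    return hops
```

Comparing each hop's `by` host with the next hop's `from` host, and checking that the timestamps move forward, is a quick way to spot a chain that does not hang together.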

Decoding the `Authentication-Results` Header

This single header, added by your inbound gateway, is the most important one for judging authenticity. It's the summary of your gateway's validation checks against the three core email authentication standards: SPF (RFC 7208), DKIM (RFC 6376), and DMARC (RFC 7489).

Authentication-Results: mx.yourcompany.com; spf=fail (sender IP is 203.0.113.55) smtp.mailfrom=trustedbank.com; dkim=pass (signature was verified) header.d=cunning-attacker.net; dmarc=fail (dkim alignment failed) action=reject header.from=trustedbank.com

This line tells a complete story. SPF failed because the sending IP isn't authorized for `trustedbank.com`. A DKIM signature passed, but it was for a domain the attacker controls (`cunning-attacker.net`), not the domain shown in the `From:` header. This is a classic misalignment. Because neither the failed SPF nor the misaligned DKIM pass could align with `trustedbank.com`, the DMARC check failed, resulting in a `reject` policy action. You now have concrete, RFC-backed proof of forgery.
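The same reading can be done mechanically. The sketch below splits an `Authentication-Results` value into per-method results and checks whether the DKIM signing domain lines up with the `From:` domain. It is deliberately simplified: it strips parenthesized comments with a regex, keeps only the last result per method, and approximates the organizational domain by taking the last two labels. That last part is an assumption for illustration only; real DMARC evaluation relies on the Public Suffix List.

```python
import re
from typing import Dict

COMMENT = re.compile(r"\([^()]*\)")
PAIR = re.compile(r"([\w.-]+)=([^\s;]+)")


def parse_authentication_results(value: str) -> Dict[str, object]:
    """Split a header value into the authserv-id and per-method results."""
    authserv_id, _, rest = value.partition(";")
    results: Dict[str, Dict[str, str]] = {}
    for clause in COMMENT.sub(" ", rest).split(";"):
        pairs = PAIR.findall(clause)
        if not pairs:
            continue
        # The first pair is method=result; the rest are properties
        # such as smtp.mailfrom or header.d.
        method, result = pairs[0]
        results[method.lower()] = {"result": result.lower(), **dict(pairs[1:])}
    return {"authserv_id": authserv_id.strip(), "results": results}


def naive_org_domain(domain: str) -> str:
    # ASSUMPTION: the last two labels stand in for the organizational domain.
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])


def dkim_aligned(parsed: Dict[str, object]) -> bool:
    """True if DKIM passed and its signing domain matches the From domain."""
    results = parsed["results"]
    dkim = results.get("dkim", {})
    from_domain = results.get("dmarc", {}).get("header.from", "")
    return (
        dkim.get("result") == "pass"
        and bool(from_domain)
        and naive_org_domain(dkim.get("header.d", "")) == naive_org_domain(from_domain)
    )
```

Run against the example header above, `dkim_aligned` comes back false even though `dkim=pass`, which is exactly the misalignment described.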

Step 4: From Indicators to a Defensible Verdict

Individual findings are just data points. A rubric helps you synthesize them into a consistent verdict. It isn't a numerical score but a logic tree: you're not adding up points, you're evaluating a pattern.

Your rubric should weigh different failures. A hard `spf=fail` or `dmarc=fail` is a strong indicator of spoofing. A `dkim=fail` caused by a `body hash did not verify` error may just mean a mailing list or forwarder modified the body in transit (appending a footer, for example), which is not necessarily malicious. Context is everything. Is the failed SPF from a known business partner who uses a problematic third-party marketing service? Or is it from a bank that has a strict `p=reject` DMARC policy, making any failure highly suspect?

Beyond authentication, look at the identity headers. Is there `From:` header display name abuse, where the visible name is 'CEO John Smith' but the actual address is `js1298@freemail.biz`? Is the `Return-Path` (which records the envelope sender) pointed at a completely different domain than the `From:` header? This is where bounces go, and attackers often point it to a domain they control to collect information on valid vs. invalid mailboxes. Legitimate bulk senders also use a separate bounce domain, so treat a mismatch as a signal to weigh, not a verdict.
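Both checks are easy to automate. The sketch below uses `email.utils.parseaddr` from the standard library; the function name, the list of protected names, and the list of your own domains are all hypothetical inputs you would supply. It compares domains exactly, so a legitimate bounce subdomain would also be flagged.

```python
from email.utils import parseaddr
from typing import Iterable, List


def domain_of(address: str) -> str:
    """Return the lowercased part after the last '@' in an address."""
    return address.rpartition("@")[2].lower()


def identity_flags(
    from_header: str,
    return_path: str,
    protected_names: Iterable[str],
    own_domains: Iterable[str],
) -> List[str]:
    """Flag display-name abuse and a Return-Path domain unlike the From domain."""
    display_name, from_addr = parseaddr(from_header)
    _, bounce_addr = parseaddr(return_path)
    from_domain = domain_of(from_addr)
    own = {d.lower() for d in own_domains}
    flags = []
    # A protected name (an executive, say) on an address outside your domains.
    lowered = display_name.lower()
    if from_domain not in own and any(n.lower() in lowered for n in protected_names):
        flags.append("display-name-abuse")
    # Exact comparison: bounce subdomains of the same organization also differ.
    if bounce_addr and domain_of(bounce_addr) != from_domain:
        flags.append("return-path-mismatch")
    return flags
```

For the example above, `identity_flags('"CEO John Smith" <js1298@freemail.biz>', "<bounce@attacker.example>", ["John Smith"], ["yourcompany.com"])` raises both flags.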

Combine these factors. A message with authentication failures, display name deception, and a suspicious link is an easy `Malicious`. A message with a `dmarc=pass` from a trusted sender that just contains aggressive marketing language is `Bulk` or `Spam`. The rubric provides the structure to justify your final determination: `Clean`, `Bulk`, `Spam`, or `Malicious`.
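Written as code, a logic tree of this kind might look like the sketch below. Every field name and every branch here is a hypothetical rubric invented for illustration, not a standard; your own tree should reflect your environment and risk tolerance.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Findings:
    # All fields are illustrative; fill them from your own parsing steps.
    spf: str = "none"
    dmarc: str = "none"
    from_domain_policy: str = "none"  # published DMARC policy of the From domain
    display_name_abuse: bool = False
    suspicious_link: bool = False
    known_sender: bool = False
    marketing_content: bool = False


def verdict(f: Findings) -> Tuple[str, str]:
    """Walk the tree top-down and return (verdict, justification)."""
    spoof_signal = f.spf == "fail" or f.dmarc == "fail"
    if f.display_name_abuse or f.suspicious_link:
        reason = "deceptive identity or link"
        if spoof_signal:
            reason += " with failed authentication"
        return "Malicious", reason
    if f.dmarc == "fail" and f.from_domain_policy == "reject":
        return "Malicious", "DMARC failure against a strict p=reject domain"
    if spoof_signal:
        if f.known_sender:
            return "Bulk", "known sender failing authentication; confirm their sending service"
        return "Spam", "failed authentication from an unknown sender; review manually"
    if f.marketing_content:
        return ("Bulk" if f.known_sender else "Spam"), "authenticated marketing content"
    return "Clean", "no spoofing, deception, or unwanted content found"
```

The value of writing it down is consistency: two analysts with the same findings reach the same verdict, and the justification string drops straight into the ticket.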

Step 5: The Write-up and When to Escalate

Your analysis is useless if it stays in your head. The output of your triage is a concise, actionable report logged in your incident tracking system. It doesn't need to be a novel. It needs to contain the key artifacts that justify your verdict and enable other teams to act.

A good report includes: the `From:` header address, the `Return-Path` address, the originating source IP from the bottom-most (earliest) `Received` header, the DKIM signing domain (`d=`), the final verdict, and a one-sentence justification. For example: 'Verdict: Malicious. Message failed DMARC due to SPF and DKIM misalignment. Purporting to be from a known financial institution but sent from an unrelated third-party domain.' This gives the network team the IP to block, the security team the domain to watch, and the threat intel team the TTP to log.
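If your tracking system accepts structured notes, the fields listed above map naturally onto a small record. This is only a sketch: the class name, the field names, and the choice of JSON are assumptions, and the sample values reuse the illustrative header from Step 3.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TriageReport:
    from_address: str
    return_path: str
    origin_ip: str
    dkim_domain: str
    verdict: str
    justification: str

    def to_json(self) -> str:
        """Serialize the report for pasting into or posting to a ticket."""
        return json.dumps(asdict(self), indent=2)


# Sample values are illustrative, taken from the example header in Step 3.
example_report = TriageReport(
    from_address="billing@trustedbank.com",
    return_path="bounce@cunning-attacker.net",
    origin_ip="203.0.113.55",
    dkim_domain="cunning-attacker.net",
    verdict="Malicious",
    justification="Failed DMARC; DKIM signed by an unrelated domain.",
)
```

Keeping the record flat makes it trivial for other teams to pull out the one field they care about.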

So when does a single phishing ticket become a capital-I Incident? Escalation is required when the triage uncovers evidence of a targeted campaign, not just opportunistic spam. Are multiple executives receiving similarly-worded emails? Is the message a Business Email Compromise (BEC) attempt that uses no links or attachments, relying solely on social engineering? Does the attacker's domain look like a deliberate typo-squat of a key partner? These situations require immediate escalation for a broader hunt across all mailboxes, not just closing one ticket.
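The typo-squat question in particular lends itself to a quick automated check. The sketch below computes plain Levenshtein edit distance between the sending domain and a list of partner domains you maintain; the function names and the threshold of 2 are assumptions. It will not catch homoglyph or internationalized-domain tricks, and very short domains can produce false positives.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def typosquat_of(
    domain: str, partner_domains: Iterable[str], max_distance: int = 2
) -> Optional[str]:
    """Return the partner domain this one closely imitates, if any."""
    candidate = domain.lower()
    for partner in partner_domains:
        distance = edit_distance(candidate, partner.lower())
        # Distance 0 is the genuine domain; small nonzero distance is suspect.
        if 0 < distance <= max_distance:
            return partner
    return None
```

A hit here does not prove intent, but a sending domain one or two keystrokes away from a key partner is a reasonable trigger for escalation.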

The Final Hop: Your Report

This entire workflow, from acquiring the `.eml` to writing the report, is about speed and accuracy. It's about building a system that lets you ignore the noise and focus on the signals. Many user-reported emails are benign. Your job is to prove it or disprove it quickly and move on to the next threat.

The more you practice this structured analysis, the faster you'll become at spotting the subtle tells of a sophisticated attack. You'll recognize the signature of a certain forwarding service that always breaks DKIM, or a specific pattern in the `Received` headers that indicates a particular phishing kit. This institutional knowledge is the real goal.

Your workflow is the foundation. Whether you build it with custom scripts or use purpose-built platforms like MailSleuth.AI to accelerate the parsing and correlation, the underlying logic remains. Don't just close the ticket. Understand the delivery path, justify your verdict, and make the next analyst's job a little bit easier.

The takeaway

A structured header triage workflow isn't just about being thorough; it's about being efficient. It transforms the overwhelming chaos of an email header into a clear narrative of a message's journey and intent. It's the difference between guessing and knowing.

The next time an `.eml` file lands in your queue, don't just scan for a link. Deconstruct it. Trace its path. Question its identity. You're not just an analyst; you are the final, critical mail gateway applying the logic that the automated systems may have missed.
