Here’s a question most organizations can’t answer: if an attacker got past your perimeter right now — phished a credential, exploited a public-facing app, compromised a vendor — how far would they get before anyone noticed? Not how far could they theoretically get. How far would they actually get, against your real defenses, your real analysts, your real detection rules, at 2 PM on a Wednesday when half the SOC is in a meeting?
That’s the question red team operations answer. Not “do you have vulnerabilities” — pentests answer that. Red teaming answers “can a motivated adversary achieve their objectives against your full defensive stack, including the humans?” And the answer, for most organizations, is uncomfortable.
The TLDR
Red teams simulate real-world adversaries to test an organization’s detection and response capabilities end-to-end. Blue teams are the defenders — the SOC analysts, incident responders, and security engineers who detect, investigate, and contain threats. Purple teaming is the collaborative model where red and blue work together, sharing attack techniques and defensive gaps in real time to accelerate improvement. Each serves a different purpose. Red teams measure your real-world resilience. Blue teams provide the operational defense. Purple teams close the feedback loop between offense and defense so that every attack simulation directly improves your detection coverage.
The Reality
Most organizations conflate red teaming with penetration testing. They’re fundamentally different activities with different goals, methodologies, and outcomes.
A pentest asks: “What vulnerabilities exist?” It’s scoped, the SOC is usually informed, and the goal is to find and document security weaknesses. The output is a findings report.
A red team engagement asks: “Can an adversary achieve a specific objective?” The SOC is typically not informed (or only a small cell is aware). The goal is to simulate a realistic attack and test the organization’s ability to detect and respond. The output includes not just what the team accomplished, but whether anyone noticed, how long it took to detect, and how effective the response was.
MITRE ATT&CK changed the game for both sides. Before ATT&CK, red teams described their actions in ad hoc terms and blue teams had no common framework to map detections against. Now, a red team can say “we executed T1003.001 (LSASS Memory) and the blue team did not detect it,” and that maps directly to a specific gap in detection coverage. It turned adversarial testing from storytelling into measurement.
The MITRE ATT&CK Evaluations program takes this further — testing commercial security products against real-world adversary emulation plans and publishing the results. If you’re choosing an EDR product, these evaluations show you exactly which techniques each product detects and which it misses. That’s the kind of data that used to require a red team engagement to produce.
How It Works
Red Team Operations
A red team’s mission is to emulate a real adversary. Not just use the same tools — think the same way, follow the same operational security practices, pursue the same objectives. The best red teams start by selecting a threat actor relevant to the target organization (a healthcare company worries about different adversaries than a defense contractor) and building an emulation plan based on that actor’s documented TTPs.
Adversary Emulation. The red team builds a campaign plan that mirrors a real threat actor’s behavior. MITRE’s Center for Threat-Informed Defense publishes adversary emulation plans for groups like APT29, FIN6, and Sandworm. These plans detail the specific techniques the group uses at each stage of the kill chain — from initial access through lateral movement to data exfiltration. The red team follows the plan, adapting as needed based on what they encounter.
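The phase-by-phase plan described above can be sketched as data. MITRE's Adversary Emulation Library publishes its plans as YAML; the Python dict below is a hedged, illustrative subset in the same spirit, not a reproduction of any published plan. The technique IDs are real ATT&CK IDs, but the phase grouping and actor label are assumptions for illustration.

```python
# Minimal sketch of an adversary emulation plan as data, loosely
# mirroring the phase -> technique structure of MITRE's Adversary
# Emulation Library (published as YAML). Illustrative only.
EMULATION_PLAN = {
    "actor": "APT29 (illustrative subset)",
    "phases": [
        {"name": "initial-access",    "techniques": ["T1566.001"]},  # Spearphishing Attachment
        {"name": "execution",         "techniques": ["T1059.001"]},  # PowerShell
        {"name": "credential-access", "techniques": ["T1003.001"]},  # LSASS Memory
        {"name": "lateral-movement",  "techniques": ["T1021.002"]},  # SMB/Windows Admin Shares
        {"name": "exfiltration",      "techniques": ["T1041"]},      # Exfiltration Over C2 Channel
    ],
}

def ordered_techniques(plan):
    """Flatten the plan into the ordered technique sequence to execute."""
    return [t for phase in plan["phases"] for t in phase["techniques"]]

print(ordered_techniques(EMULATION_PLAN))
```

Structuring the plan this way is what lets the engagement report say, unambiguously, which techniques were run and in what order.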
Assumed Breach. Many red team engagements start from the assumption that the adversary has already gained initial access. This isn’t cheating — it’s realism. Phishing works. Supply chain compromises work. Credential stuffing works. Starting from inside the network lets the engagement focus on the higher-value questions: “Once they’re in, how far do they get?”
Operational Security. Real adversaries try not to get caught. Good red teams do the same. They use encrypted C2 channels, avoid triggering known detection signatures, operate during business hours to blend in, and clean up artifacts. If the red team isn’t practicing OPSEC, they’re just a noisy pentest and the engagement loses its value as a test of detection capabilities.
Objectives-Based Testing. Red teams work toward specific objectives: “Access the crown jewels database,” “Exfiltrate 10 GB of data without detection,” “Achieve domain admin within 48 hours,” “Deploy simulated ransomware across 50 endpoints.” These objectives are defined in advance with organizational leadership and directly measure business risk, not just technical vulnerability.
Blue Team Operations
The blue team is your standing defense. SOC analysts, incident responders, detection engineers, security architects — everyone whose job is to prevent, detect, and respond to attacks. In the context of red team exercises, the blue team operates under normal conditions, responding to whatever they detect without knowing the red team is active.
Detection Engineering. Writing, tuning, and maintaining detection rules across SIEM, EDR, NDR, and other platforms. Detection engineers map their rule coverage to MITRE ATT&CK to identify gaps. A mature blue team can produce a heat map showing which ATT&CK techniques they can reliably detect and which ones are blind spots.
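That heat map is typically an ATT&CK Navigator layer: a JSON file that scores each technique so gaps show up visually. A minimal sketch of generating one from a coverage dict, assuming a hand-maintained coverage inventory; the layer-format version string is an assumption and should be checked against the Navigator release you actually run:

```python
import json

# Turn a detection-coverage inventory into an ATT&CK Navigator layer.
# Field names follow the Navigator layer JSON format; the "layer"
# version string is an assumption, not verified against any release.
coverage = {
    "T1059.001": "detected",   # PowerShell
    "T1003.001": "missed",     # LSASS Memory
    "T1021.002": "detected",   # SMB/Windows Admin Shares
}

layer = {
    "name": "Detection coverage",
    "domain": "enterprise-attack",
    "versions": {"layer": "4.5"},  # assumed layer-format version
    "techniques": [
        {
            "techniqueID": tid,
            "score": 1 if status == "detected" else 0,
            "comment": status,
        }
        for tid, status in coverage.items()
    ],
}

print(json.dumps(layer, indent=2))
```

Loading the resulting JSON into Navigator renders detected techniques and blind spots in contrasting colors, which is the map purple team exercises then validate.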
Alert Triage and Investigation. When detections fire, analysts investigate. The quality of this process determines whether a red team’s activity gets caught or dismissed as a false positive. Blue team effectiveness isn’t just about having the right detections — it’s about having analysts who can recognize real adversary behavior when the alert is ambiguous.
Incident Response. When the blue team confirms malicious activity, incident response kicks in. Containment, eradication, recovery. In a red team exercise, the blue team’s response is evaluated: Did they contain the right systems? Did they identify the full scope of compromise? Did they miss the persistence mechanism the red team planted?
Threat Intelligence Integration. Blue teams consume threat intelligence to stay current on adversary TTPs. When a new campaign targeting your sector is reported by CISA or sector ISACs, the blue team should be updating detection rules and hunting for indicators of compromise.
Purple Team Operations
Purple teaming is where the real acceleration happens. Instead of red and blue operating in isolation — red attacks, blue defends, they compare notes at the end — purple teaming brings them together in real time.
The Collaborative Model. A purple team exercise typically works like this: the red team announces “we’re going to execute T1059.001 (PowerShell)” and walks the blue team through the specific commands they’ll run. The blue team watches their tooling in real time. Did the SIEM alert? Did EDR flag the behavior? Did the detection rule fire correctly? If it didn’t, why not — and can the blue team write a detection on the spot?
ATT&CK Coverage Mapping. Purple teams work through ATT&CK techniques systematically. Execute a technique, verify detection, document the result, move to the next technique. Over the course of an exercise, you build an empirically validated detection coverage map. Not “we think we detect this” — “we tested it and confirmed we detect this, and here’s the evidence.”
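The execute-verify-document loop above can be captured as a simple exercise log. This is a hedged sketch: in a real exercise the `detected` flag comes from checking the SIEM or EDR after the red team runs each technique, and the rule names here are invented for illustration.

```python
from dataclasses import dataclass

# One record per technique tested during the purple team exercise.
@dataclass
class TechniqueResult:
    technique_id: str
    detected: bool
    evidence: str  # alert name that fired, or "no alert fired"

# Results recorded during the exercise (illustrative values).
results = [
    TechniqueResult("T1059.001", True,  "SIEM rule 'Suspicious PowerShell' fired"),
    TechniqueResult("T1003.001", False, "no alert fired"),
]

def coverage_map(results):
    """Empirically validated coverage: technique -> tested outcome."""
    return {r.technique_id: r.detected for r in results}

print(coverage_map(results))
```

The point of keeping evidence alongside each result is exactly the “here’s the evidence” standard the exercise aims for: every entry in the coverage map traces back to an observed alert or a confirmed miss.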
Continuous Purple Teaming. The most mature organizations don’t treat purple teaming as a periodic exercise. They integrate it into their operational rhythm. Detection engineers write a new rule, the red team validates it works against the real technique, the blue team confirms the alert is actionable. This loop runs continuously.
Purple Team Resources. MITRE ATT&CK Navigator provides the visual framework for mapping coverage. Atomic Red Team (by Red Canary) provides small, portable test cases for individual ATT&CK techniques — perfect for purple team exercises. MITRE Caldera is an automated adversary emulation platform that can run ATT&CK-based campaigns.
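For a sense of what an Atomic Red Team test contains, here is a Python dict mirroring the YAML fields found in the repo's atomics/ directory (attack_technique, atomic_tests, executor, and so on). Treat the exact field names as assumptions against the current repo schema, and note the command string is a placeholder, not a real payload.

```python
# Sketch of an Atomic Red Team test file as a Python dict, mirroring
# the YAML layout of the Red Canary repo. Field names are assumptions
# against the current schema; the command is a placeholder.
atomic = {
    "attack_technique": "T1059.001",
    "display_name": "Command and Scripting Interpreter: PowerShell",
    "atomic_tests": [
        {
            "name": "Run an encoded benign test script",
            "supported_platforms": ["windows"],
            "executor": {
                "name": "powershell",
                "command": "powershell -enc <base64-payload>",  # placeholder, not a real payload
            },
        }
    ],
}

def commands_for(atomic, platform="windows"):
    """List the executor commands applicable to a given platform."""
    return [
        t["executor"]["command"]
        for t in atomic["atomic_tests"]
        if platform in t["supported_platforms"]
    ]

print(commands_for(atomic))
```

A detection engineer runs the listed command on a test host, then checks whether the expected alert fires, which is the whole purple team loop in miniature.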
Tabletop Exercises
Not every adversarial exercise requires live testing. Tabletop exercises — structured discussion-based simulations — are the entry point for organizations that aren’t ready for full red team engagements, and they remain valuable even for mature programs.
A tabletop scenario might look like this: “It’s Friday at 4:30 PM. Your EDR vendor notifies you of a zero-day exploit being actively used against their product. You have 15,000 endpoints running the affected version. What do you do?”
The facilitator walks the team through the scenario step by step. Who makes the call to isolate affected systems? Who notifies executive leadership? What’s the communication plan if media picks it up? Where are the gaps in your playbook?
CISA’s Tabletop Exercise Packages (CTEPs) provide ready-to-run scenarios. NIST SP 800-84 — Guide to Test, Training, and Exercise Programs provides the framework for designing and evaluating exercises.
The value isn’t in getting the answers right. It’s in discovering the questions you can’t answer.
Measuring Effectiveness
Red and purple team exercises produce data. That data should drive metrics.
- Detection coverage — Percentage of ATT&CK techniques your tooling reliably detects.
- Mean Time to Detect (MTTD) — How long between the red team’s action and the blue team’s alert.
- Mean Time to Respond (MTTR) — How long between detection and effective containment.
- Kill chain depth — How far the red team progressed before detection. Did the blue team catch initial access, or did they not notice until data exfiltration?
- Detection-to-response ratio — Of the techniques that were detected, how many resulted in effective response actions?
Track these across exercises and over time. The goal isn’t perfection — it’s measurable improvement.
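MTTD and MTTR fall straight out of the exercise timeline. A minimal sketch, assuming timestamps are collected from the red team's activity log (when each technique ran) and the SOC's case-management system (when it was detected and contained); the values below are illustrative:

```python
from datetime import datetime

# Exercise timeline: (technique executed, first alert, containment).
# Illustrative timestamps; real ones come from red team and SOC logs.
events = [
    ("2024-05-01T10:00", "2024-05-01T10:45", "2024-05-01T13:00"),
    ("2024-05-01T14:00", "2024-05-01T16:30", "2024-05-01T18:00"),
]

def minutes_between(a, b):
    """Elapsed minutes between two ISO-style timestamps."""
    fmt = "%Y-%m-%dT%H:%M"
    return (datetime.strptime(b, fmt) - datetime.strptime(a, fmt)).total_seconds() / 60

# MTTD: action -> detection. MTTR: detection -> containment.
mttd = sum(minutes_between(run, detect) for run, detect, _ in events) / len(events)
mttr = sum(minutes_between(detect, contain) for _, detect, contain in events) / len(events)
print(f"MTTD: {mttd:.1f} min, MTTR: {mttr:.1f} min")
```

Computing the same two numbers the same way after every exercise is what makes the trend line meaningful.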
How It Gets Exploited
The adversarial testing model has its own failure modes.
Red team theater. Organizations run red team exercises but the scope is so constrained, the objectives so limited, or the engagement so short that the results are meaningless. A two-day red team engagement against a Fortune 500 company is a checkbox, not a test.
Blue team hero syndrome. If the blue team knows (officially or through the grapevine) that a red team exercise is happening, their behavior changes. They’re more alert, more suspicious, more aggressive in their triage. That’s not their normal operating posture, and the exercise results don’t reflect real-world detection capability. Operational security around exercise timing matters.
Findings without follow-through. A red team reports that they achieved domain admin in 72 hours without detection. Executive leadership nods gravely. Nothing changes. Six months later, the real attackers follow essentially the same path. The value of adversarial testing is zero if the findings don’t drive remediation.
Ignoring the assumed breach model. Organizations that insist on testing only external attack paths are testing a decreasingly relevant scenario. With the prevalence of credential compromise, supply chain attacks, and insider threats, the question isn’t if an attacker gets inside — it’s what happens next. CISA’s published red team assessment advisories reflect this reality.
What You Can Do
- Start with tabletop exercises. They’re low-cost, low-risk, and immediately valuable. Use CISA’s CTEPs as starting scenarios. Run them quarterly.
- Map your detection coverage to ATT&CK. Use the ATT&CK Navigator to visualize which techniques you can detect and which are blind spots. You can’t fix gaps you haven’t mapped.
- Invest in purple teaming before red teaming. If your detection coverage has known gaps, a red team exercise will just confirm them expensively. Purple team exercises collaboratively close those gaps first. Run red team exercises when you’re ready to be tested at full adversary realism.
- Use Atomic Red Team for DIY testing. The Atomic Red Team library provides test cases for individual ATT&CK techniques that your detection engineers can run against your own tooling. Free, open source, and immediately useful.
- Demand objectives-based red team engagements. “Find vulnerabilities” is a pentest objective. “Achieve access to the customer database without triggering a SOC alert” is a red team objective. The latter tests what matters.
- Track metrics over time. MTTD, MTTR, detection coverage, kill chain depth. Run exercises regularly and measure improvement. If the numbers aren’t improving, the program isn’t working.
Related Deep Dives
- Penetration Testing — the structured vulnerability assessment that feeds red team operations
- Threat Hunting — the blue team skill that makes purple team exercises meaningful
Sources & Further Reading
- MITRE ATT&CK Framework — The common language for adversary behavior
- MITRE ATT&CK Evaluations — Vendor product testing against real adversary emulation
- MITRE Caldera — Automated adversary emulation platform
- Atomic Red Team (Red Canary) — ATT&CK technique test cases
- CISA Tabletop Exercise Packages — Ready-to-run exercise scenarios
- NIST SP 800-84 — Guide to Test, Training, and Exercise Programs
- NIST SP 800-53 Rev 5 — CA-8 (Penetration Testing) — Federal control requirements for security testing
- ISC2 Security Assessment Resources — Professional development for offensive and defensive security