Complete Beginner's Guide

Incident Response
Concepts · Methodology · Tools · Roles

Everything you need to understand how organizations detect, contain, investigate, and recover from cyberattacks — the frameworks, tools, team structures, playbooks, and metrics that define the modern IR discipline.

NIST SP 800-61 · SANS PICERL · SOAR Playbooks · Digital Forensics · Threat Intelligence · MTTD / MTTR · Tabletop Exercises · Chain of Custody
194
avg days to identify a breach — IBM 2024
64
avg days to contain once identified
$4.88M
average total cost of a data breach in 2024
54%
cost reduction with a tested IR plan in place
01 — Foundation

What Is Incident Response?

Starting from zero — no prior experience assumed.

Imagine a building catches fire. The fire department doesn't just show up and start spraying water randomly. They have a plan: who's incident commander, which crew handles evacuation, which handles the hose, how they communicate with each other and the building's management, how they investigate the cause afterward. Incident Response is exactly this — but for cyberattacks. A structured, practiced, and documented process that tells everyone exactly what to do, when, and in what order when a security breach occurs. Without it, organizations improvise under pressure — and usually make it worse.
🔥

What Counts as an Incident?

An incident is any event that actually or potentially compromises the confidentiality, integrity, or availability of information or systems. A successful ransomware attack: incident. A phishing email that an employee clicked, leading to stolen credentials: incident. A misconfigured cloud storage bucket that exposed customer data: incident. A single failed login attempt: not an incident, just an event. The IR team's first job is always to determine: event or incident?

⏱️

Why a Process Matters

When a breach happens, every minute of delay is money. The IBM Cost of a Data Breach report finds that organizations with tested IR plans contain breaches 54% faster and spend significantly less on recovery. Without a plan, security teams reinvent the wheel under extreme pressure — missing evidence, taking the wrong containment steps, failing to notify the right people on time, and violating regulatory notification deadlines.

🔄

Reactive vs. Proactive IR

Reactive IR is what most people picture — responding to an active attack. Proactive IR is the growing discipline of not waiting for the alarm to sound. It includes threat hunting (actively searching for signs of compromise that haven't triggered alerts), purple teaming (simulating attacks to test response), and tabletop exercises (practicing IR decisions in a safe scenario). Mature organizations invest in both equally.

Event vs. Alert vs. Incident: These three terms are often confused. An Event is any observable occurrence in a system (a login, a file access, a network connection). An Alert is a notification from a security tool that a specific event matches a rule or threshold. An Incident is an alert that has been validated by a human analyst as a real security event requiring a structured response. Most alerts are false positives — the analyst's first job is triage.
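The three terms can be read as successive filters on the same observation. The sketch below is only an illustration of that idea; the `Observation` fields and the sample descriptions are hypothetical, not taken from any real tool.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    EVENT = "event"        # any observable occurrence
    ALERT = "alert"        # an event that matched a detection rule
    INCIDENT = "incident"  # an alert an analyst validated as real


@dataclass
class Observation:
    description: str
    matched_rule: bool = False       # did a security tool flag it?
    analyst_confirmed: bool = False  # did a human validate it?


def classify(obs: Observation) -> Stage:
    """Return the furthest stage this observation has reached."""
    if obs.matched_rule and obs.analyst_confirmed:
        return Stage.INCIDENT
    if obs.matched_rule:
        return Stage.ALERT
    return Stage.EVENT


login = Observation("user logged in from the office network")
noisy = Observation("10 failed logins in 1 minute", matched_rule=True)
real = Observation("credential stuffing from a known botnet",
                   matched_rule=True, analyst_confirmed=True)

for obs in (login, noisy, real):
    print(f"{classify(obs).value:8} <- {obs.description}")
```

Note that an observation cannot become an incident here without a rule match and a human confirmation, which mirrors the triage step described above.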
194
avg days to identify a breach globally
64
avg additional days to contain it
$4.88M
average breach cost 2024 — IBM report
$1.49M
savings with a high-level IR team vs. none
70%
of breaches involve a human element (phishing, credentials)
54%
faster containment with a tested IR plan

02 — Structure

IR Frameworks:
NIST vs SANS

Two frameworks define how most organizations structure their incident response. Understanding both is essential for any security professional.

NIST
SP 800-61 Rev. 3 (Updated April 2025) — National Institute of Standards & Technology

The US government's gold standard IR framework. Rev. 3, published in April 2025, aligns the guidance with NIST CSF 2.0 and reorganizes it around the CSF functions. The four-phase lifecycle shown here comes from Rev. 2 and remains the model most teams learn first. Its phases are explicitly cyclical: you may loop back to Detection & Analysis mid-containment if new information surfaces. Widely adopted in federal agencies, regulated industries, and enterprise organizations worldwide.

  • 01 Preparation
  • 02 Detection & Analysis
  • 03 Containment, Eradication & Recovery
  • 04 Post-Incident Activity
SANS
PICERL Model — SANS Institute (Incident Handler's Handbook)

The practitioner-focused framework from SANS Institute — favored by hands-on IR teams for its more granular, step-by-step breakdown. The difference from NIST is largely notational: SANS splits NIST's combined "Containment, Eradication & Recovery" into three distinct phases, making the steps explicit. The underlying concepts are identical.

  • 01 Preparation
  • 02 Identification
  • 03 Containment
  • 04 Eradication
  • 05 Recovery
  • 06 Lessons Learned
Which should you use? Neither is "better" — it depends on your context. NIST is required for US federal agencies and is the most broadly recognized. SANS is preferred by many practical IR teams for its granularity. Most organizations adopt one as their primary framework and adapt it to their needs. The key is consistency: pick one, document it, train to it, and test it regularly. A well-executed SANS framework beats a poorly-executed NIST framework every time.
Aspect | NIST SP 800-61 | SANS PICERL
Origin | US Government (NIST) | SANS Institute (private training org)
Number of phases | 4 phases | 6 phases
C/E/R handling | Combined into one phase | Three separate distinct phases
Cyclical nature | Explicitly cyclical; can loop back | Sequential but iterative in practice
Best for | Policy documentation, compliance, federal use | Operational teams, hands-on checklists
2025 update | Rev. 3 aligned with NIST CSF 2.0 | Stable; no major recent update
Mandatory for | US federal agencies, many regulated sectors | Not mandated; widely adopted voluntarily
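Because the two frameworks differ mainly in how they slice the same work, the relationship can be written down as a simple lookup. This is a minimal sketch using the phase names listed above; the function name `sans_to_nist` is an illustrative choice.

```python
# Each NIST phase mapped to the SANS PICERL step(s) it covers.
NIST_TO_SANS = {
    "Preparation": ["Preparation"],
    "Detection & Analysis": ["Identification"],
    "Containment, Eradication & Recovery": ["Containment", "Eradication", "Recovery"],
    "Post-Incident Activity": ["Lessons Learned"],
}


def sans_to_nist(sans_step: str) -> str:
    """Return the NIST phase that contains the given SANS step."""
    for nist_phase, sans_steps in NIST_TO_SANS.items():
        if sans_step in sans_steps:
            return nist_phase
    raise KeyError(f"unknown SANS step: {sans_step}")


for nist_phase, sans_steps in NIST_TO_SANS.items():
    print(f"{nist_phase:38} -> {', '.join(sans_steps)}")
```

A team that documents its process in NIST terms but runs SANS-style checklists could use a mapping like this to keep reports and playbooks consistent.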

03 — The NIST Lifecycle

NIST Phases Deep Dive

The four NIST phases — what actually happens in each one and what the team produces.

1

Preparation — Before the Fire

Everything you do before an incident occurs. Build and train the CSIRT (Computer Security Incident Response Team). Write the Incident Response Plan (IRP) and associated playbooks. Deploy and tune detection tools (SIEM, EDR, NDR). Establish communication trees — who calls who, when, via which channels. Create a jump kit: a pre-staged collection of tools, forensic media, spare hardware, and documentation an analyst can grab and go. Define incident severity classifications so everyone knows what P1 vs. P4 means without debate during the chaos. Conduct tabletop exercises quarterly and full simulations annually. Preparation quality is the single biggest determinant of how well every other phase goes.

2

Detection & Analysis — Is This Real?

An alert fires. The team's first job: is this a real incident or a false positive? If real: what type, what scope, what severity? Detection sources include SIEM alerts, EDR detections, NDR anomalies, user reports ("something weird happened"), threat intelligence matches, and third-party notifications (law enforcement, vendors, security researchers). Analysis involves collecting indicators (IOCs), examining logs and telemetry, determining the attack vector, scoping how many systems are affected, and classifying the incident type and severity. This phase also demands documentation from the first moment — everything observed, every action taken, every decision made, with timestamps. This log becomes the incident record.

3

Containment, Eradication & Recovery — Stop, Clean, Restore

Containment stops the bleeding: isolate affected endpoints from the network, block malicious IPs and domains at the firewall, disable compromised accounts, revoke stolen credentials, sinkhole C2 domains. Two containment strategies exist — short-term (fast, possibly disruptive, like pulling a server off the network) and long-term (sustainable, allowing business to continue during extended incidents). Eradication removes the threat: delete malware, patch the exploited vulnerability, close the initial access vector, remove persistence mechanisms (scheduled tasks, registry run keys, backdoors). Recovery restores operations: rebuild affected systems from known-clean images or backups, restore data, reconnect systems to the network with enhanced monitoring, and verify normal operation before declaring recovery complete.

4

Post-Incident Activity — Learn and Improve

Within two weeks of the incident resolving: hold a structured lessons-learned meeting with everyone who was involved. Answer the key questions: What happened? How was it detected? What was done well? What was slow or missed? What tools were missing? What playbooks were inadequate? Produce an After-Action Report (AAR) documenting the full incident timeline, business impact, root cause, and concrete improvement recommendations. Update playbooks based on gaps found. Feed discovered IOCs and TTPs back into detection rules. Use the findings to justify budget requests. This phase is where organizations get better — treating it as a checkbox is how they get breached the same way twice.


04 — The SANS Breakdown

SANS PICERL Steps:
The Practitioner's View

SANS splits C/E/R into three explicit steps, which many hands-on IR teams find clearer during active response.

P — Preparation

Same as NIST Phase 1

Building the team, plan, tools, and muscle memory before anything bad happens. SANS emphasizes checklists — the Incident Handler's Handbook includes explicit preparation checklists for Windows and Unix environments covering what to document, what tools to have ready, and what policies to establish in advance.

I — Identification

Detect + Validate + Scope

Detecting that something happened and confirming it's a real incident. Key questions: What triggered the alert? Is this a true positive? What systems are involved? How long has the attacker been present (dwell time)? What data may have been accessed? Identify the initial access vector. This maps to the "Detection & Analysis" phase in NIST.

C — Containment

Stop the Spread

Limit the damage from expanding further. Isolate affected systems, block attacker access paths, preserve evidence before taking action. SANS explicitly distinguishes short-term containment (immediate emergency actions that might be disruptive) from long-term containment (sustainable controls that allow business to continue while the full eradication is being planned).

E — Eradication

Remove the Threat

Find and eliminate everything the attacker left behind: malware, backdoors, modified files, compromised credentials, persistence mechanisms, rogue accounts. Patch the vulnerability that was exploited. Validate that the threat is completely removed before moving to recovery — incomplete eradication is one of the most common causes of re-infection.

R — Recovery

Restore Normal Operations

Bring systems back online safely. Restore from known-clean backups. Rebuild compromised systems from scratch where trust cannot be re-established. Monitor restored systems intensively for signs of recurrence for 30–90 days. Validate that business functions are working normally before declaring full recovery. Communicate the "all clear" to stakeholders.

L — Lessons Learned

Get Better

SANS calls for the lessons-learned meeting to be held within two weeks of the incident closing. The output is a formal report. SANS recommends comparing response metrics against previous incidents: did MTTD and MTTR improve? Did previously identified gaps get fixed? Did training investments pay off? This structured reflection is what separates maturing IR programs from stagnant ones.


05 — People

The IR Team:
Roles & Structures

A CSIRT (Computer Security Incident Response Team) is the organizational structure behind incident response. Here are the key roles.

Incident Commander
The Decision-Maker

Owns the incident. Makes the final call on containment strategies, escalation decisions, and external communications. Not necessarily the most technical person — the IC must coordinate across teams, manage communications, and keep the response from becoming chaotic. In large incidents, the IC may not touch a keyboard at all.

Lead Analyst / IR Lead
The Technical Driver

Directs the technical investigation. Assigns analysis tasks to team members, interprets findings, builds the attack timeline, and recommends containment and eradication actions to the IC. Needs deep knowledge of the kill chain, attacker TTPs, forensics, and the organization's environment.

SOC Analysts (L1/L2/L3)
The Investigation Workforce

L1 analysts handle initial triage — validating alerts and escalating real incidents. L2 analysts dig deeper into confirmed incidents. L3 analysts (senior or threat hunters) handle the most complex cases and build detection improvements. During active incidents, multiple analysts are typically assigned in parallel — one on network logs, one on endpoint forensics, one on timeline reconstruction.

Digital Forensics Analyst
The Evidence Expert

Specializes in collecting and analyzing digital evidence in a forensically sound manner — disk images, memory captures, network captures, log preservation. Ensures chain of custody is maintained so evidence is admissible if law enforcement gets involved. May also lead malware analysis and reverse engineering efforts.

Threat Intelligence Analyst
The Context Provider

Enriches the investigation with external threat context — which threat actor group uses these TTPs, what other victims have been seen, are these IOCs linked to a known campaign. Feeds IOCs into detection tools and updates threat intel platforms. Helps analysts understand what the attacker was trying to achieve and what they might do next.

Communications Lead
The Translator

Manages internal and external communications during the incident — briefing executives (in non-technical language), coordinating with legal and PR teams on breach notification requirements, liaising with law enforcement if needed, and managing vendor communications. During a major breach this person is one of the most critical — poor communication during an incident can cause as much damage as the breach itself.

Legal & Compliance
The Regulatory Guardrail

Ensures the organization meets its legal notification obligations — GDPR requires notification within 72 hours of becoming aware of a breach; many US states have their own deadlines. Advises on evidence preservation requirements and attorney-client privilege considerations for sensitive IR communications. Engages law enforcement if criminal activity is involved.
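The 72-hour GDPR window is simple clock arithmetic once the "became aware" timestamp is agreed. The sketch below assumes that timestamp has already been settled by legal counsel; the function names and example dates are hypothetical, and US state deadlines are not modeled.

```python
from datetime import datetime, timedelta, timezone

GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify, counted from when the breach became known."""
    return aware_at + GDPR_NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline (negative once it has passed)."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600


aware = datetime(2025, 3, 3, 9, 30, tzinfo=timezone.utc)
now = datetime(2025, 3, 4, 15, 0, tzinfo=timezone.utc)

print("Deadline:", notification_deadline(aware).isoformat())
print("Hours remaining:", hours_remaining(aware, now))
```

Using timezone-aware timestamps avoids an off-by-hours error when responders and regulators sit in different time zones.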

System / Network Admins
The Environment Experts

Not always formally part of the CSIRT but critical participants during response. They know the environment — where the critical servers live, how the network is segmented, what "normal" looks like in the SIEM. They execute containment actions: disabling accounts in AD, isolating VLANs, deploying firewall rules, and rebuilding systems during recovery.

External IR Retainer
The Surge Capacity

Many organizations maintain a retainer with an external IR firm (CrowdStrike Services, Mandiant, Unit 42, Secureworks) for surge capacity during major incidents. The retainer means pre-negotiated rates, faster engagement, and pre-established data sharing agreements. For smaller organizations, the external firm may be the primary IR capability rather than a supplement.

CSIRT vs SOC vs CERT: A SOC (Security Operations Center) monitors for threats continuously and handles routine alerts. A CSIRT handles declared incidents — the escalation from the SOC. A CERT (Computer Emergency Response Team) is an older term, often used by government entities and universities. In practice, many organizations use these terms interchangeably, and the same people often perform all three functions. The key distinction is that IR is a structured, time-boxed response to a specific declared event.

06 — Technology

The IR Toolkit:
Every Tool Category Explained

Modern incident response is tool-intensive. Here's every major category and what role it plays in the response workflow.

Detection Layer

SIEM

Collects and correlates logs from all sources. The central investigation database. Analysts query the SIEM to trace attacker activity across systems, reconstruct attack timelines, and find the scope of compromise.

Key vendors: Splunk, Microsoft Sentinel, IBM QRadar, Exabeam, Elastic SIEM
Endpoint Layer

EDR / XDR

Deep endpoint telemetry — every process, file, network connection, registry change. During IR, the EDR provides the attack timeline on each endpoint, enables remote isolation, live terminal access, and file quarantine without a physical visit.

Key vendors: Palo Alto Cortex XDR, CrowdStrike Falcon, Microsoft Defender, SentinelOne
Automation Layer

SOAR

Security Orchestration, Automation and Response. Runs automated response playbooks triggered by SIEM alerts or analyst actions — blocking IPs, isolating endpoints, resetting passwords, creating tickets. Reduces response time from hours to seconds for routine actions.

Key vendors: Palo Alto XSOAR, Splunk SOAR, Microsoft Sentinel (built-in), ServiceNow SIR
Forensics Layer

Digital Forensics Tools

Acquire and analyze disk images, memory dumps, and network captures. Used to recover deleted files, extract malware from memory, reconstruct user activity, and build evidence chains for legal proceedings.

Key tools: Volatility (memory), Autopsy / FTK (disk), Wireshark / Zeek (network), KAPE (triage)
Intelligence Layer

Threat Intelligence Platforms

Aggregate, manage, and operationalize threat intelligence. Match IOCs from the incident against known threat actor infrastructure. Provide context: which group, which campaign, what TTPs they use next, and what other organizations have seen.

Key vendors: MISP (open source), Recorded Future, ThreatConnect, OpenCTI, CrowdStrike TI
Network Layer

NDR / Network Forensics

Network Detection and Response. Captures and analyzes network traffic — C2 communications, lateral movement, data exfiltration. During IR, full packet captures (pcaps) are invaluable for reconstructing exactly what data left the network and where it went.

Key vendors: Darktrace, ExtraHop, Vectra AI, Corelight (Zeek-based)
Case Management

Incident Tracking

Documents every action, finding, and decision throughout the incident lifecycle. Provides the official record for post-incident review, legal proceedings, and insurance claims. Should be accessible to all team members and timestamped automatically.

Key tools: TheHive (open source), ServiceNow SIR, Jira, PagerDuty, Atlassian OpsGenie
Malware Analysis

Sandbox & Reverse Engineering

Safely execute suspicious files to observe their behavior. Static analysis examines the file without running it. Dynamic analysis watches what it does when it runs. Reverse engineering disassembles the code to understand its full capability.

Key tools: Any.run, Hybrid Analysis, WildFire (Palo Alto), VirusTotal, Cuckoo Sandbox, IDA Pro, Ghidra
Vulnerability Context

Vulnerability Management

During IR, the VM platform answers: what vulnerabilities exist on the affected systems, which were potentially exploited, and what's the fastest path to patching the initial access vector before the attacker returns.

Key vendors: Rapid7 InsightVM, Tenable, Qualys, Palo Alto XSIAM
The IR Stack Architecture: Think of the tools in a flow — EDR/NDR collect raw telemetry → SIEM normalizes and correlates it → Threat Intelligence enriches alerts with context → SOAR automates the first response actions → Digital Forensics digs deeper into confirmed incidents → Case Management documents everything. Each layer feeds the next. A mature IR capability has all of these integrated, not running as silos.

07 — Automation & Process

IR Playbooks:
Scripting the Response

A playbook is a documented, step-by-step response procedure for a specific type of incident. The difference between a chaotic response and a smooth one is almost entirely whether a good playbook existed and was followed.

A playbook is like a recipe — but for stopping cyberattacks. Every time a specific type of incident happens, everyone follows the same recipe, in the same order, without debate about what to do next. Manual playbooks are checklists followed by humans. Automated playbooks (in SOAR) are executed by machines at speeds no human can match — responding in seconds to what previously took hours.
Example Playbook
Ransomware Detection & Response
T+0
DETECT & ALERT
EDR behavioral engine detects rapid file encryption (ransomware behavioral signature). SIEM correlation rule fires. SOAR creates a P1 incident automatically and pages the on-call IR analyst via PagerDuty. Timestamp the first alert: it stops your MTTD clock and starts your MTTR clock.
T+5m
TRIAGE
Analyst validates the alert. Checks: how many files encrypted? Which endpoint(s)? Is this spreading laterally? Is the endpoint isolated or still networked? Confirms incident classification: Ransomware P1. Activates the CSIRT — notifies IC, IR Lead, legal, and comms per the notification tree.
T+10m
CONTAIN
Isolate affected endpoint(s) from the network via EDR one-click isolation. Disable the compromised user account in Active Directory / Okta. Block known ransomware C2 IPs and domains at the firewall. Preserve memory and disk images of affected systems before any cleanup (evidence preservation). Identify the initial infection vector — was it a phishing email? RDP brute force? Vulnerable VPN?
T+30m
SCOPE
Query SIEM for lateral movement indicators — did the attacker move to other systems before triggering the ransomware? Check EDR telemetry on all endpoints in the same network segment. Search for the ransomware binary hash across the entire fleet. Determine blast radius: how many systems are affected, which data stores were accessible from compromised accounts.
T+1–4h
ERADICATE
Identify and remove all ransomware artifacts (executables, scheduled tasks, registry persistence). Reset all credentials that may have been accessible from compromised systems. Patch the initial access vector. Conduct full malware scan on all systems that had any connection to the affected network segment.
T+4–48h
RECOVER
Restore affected systems from known-clean backups (validate backup integrity before restoring). Rebuild systems that cannot be trusted from clean images. Restore data files from the most recent clean backup. Reconnect systems with enhanced monitoring. Validate normal business operations. Notify affected parties per legal/compliance requirements.
T+14d
LESSONS LEARNED
Hold the post-incident review. Document: initial access vector, dwell time before detection, containment speed, gaps in playbook, detection rule improvements needed. Update the playbook. Feed IOCs into SIEM and EDR detection rules. Brief leadership on lessons learned and improvement investments needed.
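The timeline above can be captured as data so that target times are computed from the first alert rather than remembered under pressure. This is a sketch only: the `Step` structure is an illustrative assumption, and where the playbook gives a range (T+1–4h, T+4–48h) the start of the range is used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Step:
    name: str
    start_after: timedelta  # offset from the first alert (T+0)
    summary: str


RANSOMWARE_PLAYBOOK = [
    Step("DETECT & ALERT", timedelta(0), "EDR and SIEM fire; SOAR opens a P1"),
    Step("TRIAGE", timedelta(minutes=5), "validate the alert, activate the CSIRT"),
    Step("CONTAIN", timedelta(minutes=10), "isolate endpoints, disable accounts"),
    Step("SCOPE", timedelta(minutes=30), "hunt for lateral movement, size blast radius"),
    Step("ERADICATE", timedelta(hours=1), "remove artifacts, reset credentials, patch"),
    Step("RECOVER", timedelta(hours=4), "restore from known-clean backups"),
    Step("LESSONS LEARNED", timedelta(days=14), "post-incident review"),
]


def schedule(first_alert: datetime):
    """Return (target start time, step name) pairs for one incident."""
    return [(first_alert + s.start_after, s.name) for s in RANSOMWARE_PLAYBOOK]


for when, name in schedule(datetime(2025, 1, 1, 12, 0)):
    print(when.isoformat(), name)
```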
Common Playbook Types

One Playbook Per Incident Type

Mature IR programs maintain separate playbooks for: ransomware, phishing, business email compromise (BEC), data exfiltration, insider threat, account takeover, DDoS, supply chain compromise, and cloud misconfiguration. Each has unique detection signals, containment steps, evidence to collect, and notification requirements — a one-size-fits-all approach fails under pressure.

SOAR Automation

Machine-Speed Response

Modern SOAR platforms automate the mechanical parts of playbooks — enriching alerts with threat intelligence, isolating endpoints via EDR API, blocking IPs at the firewall, resetting accounts in Active Directory, creating case records, and paging the right people. What took a human analyst 45 minutes of manual steps takes an automated playbook under 60 seconds — critically important during fast-moving ransomware or worm incidents.
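To make the idea of an automated playbook concrete, here is a toy dispatcher. It is not any vendor's API: every action is a stub that only records what it would do, and the alert fields (`type`, `host`, `user`, `remote_ip`) are hypothetical.

```python
# Every action below is a stub that only records what it would do.
# A real SOAR platform would call the EDR, directory, and firewall APIs here.
audit_log = []


def open_case(alert):
    audit_log.append(f"case opened: {alert['type']} on {alert['host']}")


def isolate_endpoint(alert):
    audit_log.append(f"isolated endpoint {alert['host']}")


def disable_account(alert):
    audit_log.append(f"disabled account {alert['user']}")


def block_ip(alert):
    audit_log.append(f"blocked IP {alert['remote_ip']}")


# One ordered action list per incident type.
PLAYBOOKS = {
    "ransomware": [open_case, isolate_endpoint, disable_account, block_ip],
    "phishing": [open_case, disable_account],
}


def run_playbook(alert):
    """Run every action for the alert type; unknown types only get a case."""
    actions = PLAYBOOKS.get(alert["type"], [open_case])
    for action in actions:
        action(alert)
    return len(actions)


alert = {"type": "ransomware", "host": "ws-17",
         "user": "jdoe", "remote_ip": "203.0.113.10"}
run_playbook(alert)
print("\n".join(audit_log))
```

Keeping the audit log as a side effect of every action reflects the documentation requirement: automated steps still need to appear in the incident record.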

Runbooks vs. Playbooks

Know the Difference

A playbook is a high-level, strategic guide for handling an incident type — the overall process flow, decision points, escalation paths. A runbook is a low-level, technical procedure for a specific task within that playbook — step-by-step instructions for how to isolate a Linux server, or how to extract memory from a Windows machine. Playbooks reference runbooks; runbooks are the how-to details.


08 — Evidence Science

Digital Forensics:
Evidence That Holds Up

Digital forensics is the discipline of collecting, preserving, and analyzing digital evidence in a way that maintains its integrity — so it's usable in court, in regulatory proceedings, or in insurance claims.

Chain of Custody

The Foundational Rule

Every piece of evidence must have a documented, unbroken chain of custody — who collected it, when, how, where it's been stored, and who has accessed it. If the chain of custody is broken, the evidence may be inadmissible in legal proceedings. IR teams must log every evidence handling event from the moment of collection through final disposition.
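A custody record usually pairs a cryptographic hash of the evidence with a log of every handling event. The sketch below shows that pairing with Python's standard `hashlib`; the `EvidenceItem` class, handler names, and the in-memory "image" are illustrative stand-ins, not a forensic tool.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class EvidenceItem:
    label: str
    sha256: str  # hash taken at collection time
    custody_log: list = field(default_factory=list)

    def record(self, handler: str, action: str, when: datetime = None):
        """Append one handling event: who did what, and when (UTC)."""
        when = when or datetime.now(timezone.utc)
        self.custody_log.append((when.isoformat(), handler, action))

    def verify(self, data: bytes) -> bool:
        """True if the data still matches the hash taken at collection."""
        return sha256_of(data) == self.sha256


image = b"pretend this is a disk image"  # stand-in for real image bytes
item = EvidenceItem("laptop-042 disk image", sha256_of(image))
item.record("analyst_a", "collected")
item.record("analyst_b", "received for analysis")

print(item.verify(image))          # unchanged evidence
print(item.verify(image + b"x"))   # any modification changes the hash
```

For real disk images the hash would be computed in chunks while reading the file, since images rarely fit in memory.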

Order of Volatility

Collect the Ephemeral First

Digital evidence exists on a spectrum of volatility — some disappears in seconds, some lasts years. The forensic rule: always collect the most volatile evidence first. The order: (1) CPU registers & cache, (2) RAM / memory, (3) Network state & connections, (4) Running processes, (5) Open files, (6) Disk contents, (7) Logs and configuration files. Memory especially — it's gone the moment the machine reboots.
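The ordering rule translates directly into a sort key. A minimal sketch, assuming evidence sources are labeled with the seven category names used above (the lowercase labels are my own shorthand):

```python
# Most volatile first, following the seven-step order given above.
VOLATILITY_ORDER = [
    "cpu registers & cache",
    "ram",
    "network state",
    "running processes",
    "open files",
    "disk contents",
    "logs & config",
]
RANK = {name: position for position, name in enumerate(VOLATILITY_ORDER)}


def collection_plan(sources):
    """Sort the available evidence sources so the most volatile come first."""
    return sorted(sources, key=lambda source: RANK[source])


available = ["disk contents", "ram", "logs & config", "network state"]
print(collection_plan(available))
```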

Disk Forensics

What's On the Drive

Forensic disk acquisition creates a bit-for-bit copy (image) of a storage device without modifying the original. Analysts examine the image — finding deleted files (which often aren't truly deleted), browsing history, artifact files (prefetch, registry hives, event logs, shellbags), and malware that was installed. Tools: FTK Imager, dd, Autopsy, EnCase, X-Ways.

Memory Forensics

What Was Running

RAM contains evidence that may never touch the disk: encryption keys, plaintext passwords, injected shellcode, process hollowing artifacts, network connections, and running malware that only exists in memory. Memory forensics has become essential as attackers increasingly use fileless malware. The gold standard tool is Volatility, an open-source Python framework for analyzing memory dumps from Windows, Linux, and macOS systems.

Network Forensics

What Moved Across the Wire

Full packet capture (pcap) allows analysts to reconstruct exactly what data was transmitted — what the attacker downloaded, what data was exfiltrated, what C2 commands were issued. Zeek (formerly Bro) generates rich network metadata logs. Wireshark analyzes individual packet captures. During IR, network forensics often proves data exfiltration occurred (or didn't) — critical for breach notification decisions.

Log Forensics

The Paper Trail

Logs are the most commonly used forensic source — Windows Event Logs, Linux auth logs, web server logs, firewall logs, VPN logs, authentication logs. Key Windows events every IR analyst must know: 4624 (successful logon), 4625 (failed logon), 4648 (logon with explicit credentials), 4688 (process creation), 4720 (account created), 7045 (new service installed). The SIEM is the primary log forensics tool, but raw log access is often needed for deep analysis.
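As a small illustration of working with those event IDs, the sketch below labels already-parsed records and counts failed logons per host. The record layout and host names are hypothetical; in practice the records would come from the SIEM or an event log parser.

```python
from collections import Counter

# The six Windows event IDs called out above.
KEY_EVENT_IDS = {
    4624: "successful logon",
    4625: "failed logon",
    4648: "logon with explicit credentials",
    4688: "process creation",
    4720: "account created",
    7045: "new service installed",
}

# Hypothetical, already-parsed records.
records = [
    {"time": "2025-01-10T02:14:07Z", "host": "ws-17", "event_id": 4625},
    {"time": "2025-01-10T02:14:09Z", "host": "ws-17", "event_id": 4625},
    {"time": "2025-01-10T02:14:15Z", "host": "ws-17", "event_id": 4624},
    {"time": "2025-01-10T02:15:40Z", "host": "ws-17", "event_id": 4688},
    {"time": "2025-01-10T02:16:02Z", "host": "dc-01", "event_id": 7045},
    {"time": "2025-01-10T02:16:30Z", "host": "dc-01", "event_id": 5156},  # not one of the six
]


def label_events(records):
    """Keep only the key events and attach a human-readable meaning."""
    return [
        (r["time"], r["host"], r["event_id"], KEY_EVENT_IDS[r["event_id"]])
        for r in records
        if r["event_id"] in KEY_EVENT_IDS
    ]


def failed_logons_by_host(records):
    """Count event 4625 per host, a quick signal for password guessing."""
    return Counter(r["host"] for r in records if r["event_id"] == 4625)


for row in label_events(records):
    print(*row)
print(failed_logons_by_host(records))
```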

KAPE — Kroll Artifact Parser and Extractor: KAPE is the go-to IR triage tool for rapid artifact collection from Windows systems. It's free, extremely fast, and collects the most forensically significant artifacts (event logs, registry hives, browser history, prefetch files, $MFT, LNK files, jump lists) in minutes — without a full disk image. Most IR analysts run KAPE as their first step on a potentially compromised Windows system to get fast, rich forensic data while full imaging continues in the background.

09 — Context & Intelligence

Threat Intelligence
in Incident Response

Threat intelligence transforms a raw incident from "something bad happened" to "a specific threat actor used this technique to achieve this objective — and here's what they do next."

IOCs — Indicators of Compromise

The Artifacts of an Attack

IOCs are specific, observable artifacts that indicate a system was involved in a malicious activity. They include: IP addresses (C2 servers, attacker infrastructure), domain names (malicious domains, C2 domains), file hashes (MD5, SHA-256 of malicious files), URLs (phishing pages, payload delivery), email addresses (attacker accounts), registry keys (persistence locations), and mutex names (malware behavioral markers). IOCs are operationalized by loading them into the SIEM and EDR as detection rules — so if any system in the environment communicates with a known-bad IP, an alert fires.
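Operationalizing IOCs amounts to checking observed values against lists of known-bad ones. The sketch below shows the core lookup only; the indicators are placeholders (a documentation-range IP, a reserved example domain, and the hash of dummy bytes), and the event field names are assumptions.

```python
import hashlib

# Placeholder indicators, not real threat data.
KNOWN_BAD = {
    "ip": {"203.0.113.10"},
    "domain": {"c2.example.com"},
    "sha256": {hashlib.sha256(b"dummy malware sample").hexdigest()},
}


def match_iocs(event: dict) -> list:
    """Return (ioc_type, value) for every field that matches a known-bad value."""
    hits = []
    for ioc_type, bad_values in KNOWN_BAD.items():
        value = event.get(ioc_type)
        # Compare case-insensitively; domains and hex hashes vary in case.
        if value is not None and value.lower() in bad_values:
            hits.append((ioc_type, value))
    return hits


events = [
    {"host": "ws-17", "ip": "198.51.100.7", "domain": "intranet.example.org"},
    {"host": "ws-22", "ip": "203.0.113.10", "domain": "C2.example.com"},
]

for event in events:
    print(event["host"], match_iocs(event))
```

Set membership keeps each lookup constant-time, which matters when the indicator lists grow to millions of entries.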

TTPs — Tactics, Techniques & Procedures

The Attacker's Behavior Patterns

TTPs describe how an attacker operates: the sequence of techniques they use, not just the specific artifacts they leave. MITRE ATT&CK is the universal vocabulary for TTPs: a knowledge base of 14 tactics (Initial Access, Execution, Persistence, Privilege Escalation, etc.) with 200+ specific techniques, many with sub-techniques, distributed across them. Knowing the TTPs of the threat actor in your environment tells you: what they've done, what they're likely to do next, and what detection rules to build. TTPs are much harder for attackers to change than IOCs.

The Pyramid of Pain: Security researcher David Bianco created the Pyramid of Pain — a model showing how much pain each type of indicator causes an attacker when you block it. Hash values (easiest for attacker to change — recompile the binary) are at the bottom. TTPs (hardest to change — requires fundamentally different attack approach) are at the top. The lesson: blocking a single IP is trivially bypassed. Detecting and responding to TTPs forces the attacker to completely retool.
MITRE ATT&CK

The Universal Attack Language

MITRE ATT&CK is the most important knowledge base in modern IR. It catalogs real-world attacker behaviors across 14 tactics and 200+ techniques, with sub-techniques. During an incident, analysts map observed behaviors to ATT&CK techniques — creating a visual "ATT&CK Navigator" heatmap of what the attacker did. This shows what they haven't done yet, enabling proactive hunting for the next phase. Every major security vendor maps their detections to ATT&CK.
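The "what haven't they done yet" question can be sketched as a set difference over tactics. This example uses the 14 Enterprise tactic names and four real technique IDs; the set of observed techniques is invented for illustration, and each technique is tied to a single tactic for simplicity even though ATT&CK lists some techniques under several.

```python
# The 14 ATT&CK Enterprise tactics, in matrix order.
TACTICS = [
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion",
    "Credential Access", "Discovery", "Lateral Movement", "Collection",
    "Command and Control", "Exfiltration", "Impact",
]

# Hypothetical findings from one investigation: technique ID -> (name, tactic).
OBSERVED = {
    "T1566": ("Phishing", "Initial Access"),
    "T1059": ("Command and Scripting Interpreter", "Execution"),
    "T1003": ("OS Credential Dumping", "Credential Access"),
    "T1021": ("Remote Services", "Lateral Movement"),
}


def tactics_not_yet_observed(observed: dict) -> list:
    """Tactics with no mapped technique so far, i.e. where to hunt next."""
    seen = {tactic for _name, tactic in observed.values()}
    return [tactic for tactic in TACTICS if tactic not in seen]


print(tactics_not_yet_observed(OBSERVED))
```

An empty tactic does not prove the attacker skipped it; it may only mean no evidence has been found yet, which is why the gaps are treated as hunting leads.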

Kill Chain

Lockheed Martin's Attack Model

The Cyber Kill Chain (Lockheed Martin, 2011) models attacks as seven sequential stages: Reconnaissance → Weaponization → Delivery → Exploitation → Installation → Command & Control → Actions on Objectives. Understanding which kill chain stage the attacker is in during an active incident tells the IR team: how far they've progressed, what evidence to look for, and what containment actions matter most right now. Disrupting any stage stops the attack.
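Because the seven stages are ordered, "how far has the attacker progressed" is the maximum stage for which evidence exists. A minimal sketch; the evidence labels and their stage assignments are hypothetical examples, not an official mapping.

```python
from enum import IntEnum


class KillChainStage(IntEnum):
    RECONNAISSANCE = 1
    WEAPONIZATION = 2
    DELIVERY = 3
    EXPLOITATION = 4
    INSTALLATION = 5
    COMMAND_AND_CONTROL = 6
    ACTIONS_ON_OBJECTIVES = 7


# Hypothetical evidence labels mapped to the stage they indicate.
EVIDENCE_TO_STAGE = {
    "phishing email received": KillChainStage.DELIVERY,
    "macro executed": KillChainStage.EXPLOITATION,
    "new scheduled task": KillChainStage.INSTALLATION,
    "beaconing to external host": KillChainStage.COMMAND_AND_CONTROL,
    "bulk file encryption": KillChainStage.ACTIONS_ON_OBJECTIVES,
}


def furthest_stage(findings):
    """Return the latest kill chain stage supported by the findings."""
    return max(EVIDENCE_TO_STAGE[finding] for finding in findings)


findings = ["phishing email received", "macro executed", "new scheduled task"]
print(furthest_stage(findings).name)
```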

Threat Intel Sources

Where Intelligence Comes From

Open source (OSINT): VirusTotal, AlienVault OTX, Abuse.ch, MISP. Commercial feeds: Recorded Future, Mandiant, CrowdStrike, Palo Alto Unit 42. Government: CISA Alerts, FBI Flash Reports, ISAC sharing communities (FS-ISAC for finance, H-ISAC for healthcare). Internal: IOCs discovered in your own previous incidents — often the most actionable intelligence of all.


10 — Measurement

IR Metrics:
How You Know You're Improving

You can't improve what you don't measure. These are the KPIs every mature IR program tracks.

MTTD
Mean Time to Detect

How long from the first moment of compromise until the organization knows an incident occurred. The global average is 194 days. Organizations with mature detection capabilities reduce this to hours or days.

MTTR
Mean Time to Respond

How long from detection to full resolution of the threat: containment, eradication, and recovery. The faster this is, the less data is lost, the fewer systems are compromised, and the lower the overall cost. SOAR automation dramatically reduces MTTR.

MTTC
Mean Time to Contain

How long from detection to the point where the attacker can no longer spread or exfiltrate data. Distinct from MTTR — containment stops the bleeding; full response includes eradication and recovery.

Dwell Time
Time Inside Before Detection

How long an attacker was present in the environment before being detected. The longer the dwell time, the more damage done. Shorter dwell time is the primary goal of proactive threat hunting programs.

FPR
False Positive Rate

What percentage of alerts are false positives. A high FPR means analysts waste time on noise. Good SIEM tuning, risk-based alerting (RBA), and ML-based alerting reduce FPR without reducing true positive detection.
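As defined here, FPR is the share of all closed alerts that analysts marked as false positives (this is the alert-level definition used in this guide, not the statistical one based on true negatives). A small sketch, assuming each alert carries a hypothetical disposition label:

```python
from collections import Counter


def false_positive_rate(dispositions):
    """Percentage of alerts whose disposition is 'false_positive'."""
    counts = Counter(dispositions)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no alerts closed yet, so nothing to report
    return 100.0 * counts["false_positive"] / total


# Made-up week of closed alerts: 45 false positives, 5 true positives.
alerts = ["false_positive"] * 45 + ["true_positive"] * 5
print(f"FPR: {false_positive_rate(alerts):.1f}%")  # FPR: 90.0%
```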

Breakout Time
Time to Lateral Movement

How long from initial access until the attacker begins moving to other systems. CrowdStrike's adversary intelligence sets a "1-10-60" benchmark: detect in 1 min, investigate in 10 min, contain in 60 min to beat average breakout times.

The 1-10-60 Rule: CrowdStrike's widely adopted benchmark for IR speed is to detect a threat in 1 minute, understand it in 10 minutes, and contain it in 60 minutes. Meeting this benchmark requires automation (SOAR), integrated tooling (SIEM + EDR + TI), and practiced processes. Organizations that achieve 1-10-60 respond faster than the average attacker breakout time, which limits lateral movement and reduces breach costs.
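A team can check a single incident against the benchmark with a few lines of code. The sketch below assumes each phase is timed separately and supplied as a duration; the phase names and sample measurements are illustrative.

```python
from datetime import timedelta

# Targets from the 1-10-60 rule.
TARGETS = {
    "detect": timedelta(minutes=1),
    "investigate": timedelta(minutes=10),
    "contain": timedelta(minutes=60),
}


def check_1_10_60(measured):
    """Map each phase to True if its measured duration met the target."""
    return {phase: measured[phase] <= target for phase, target in TARGETS.items()}


# Made-up measurements for one incident.
measured = {
    "detect": timedelta(seconds=45),
    "investigate": timedelta(minutes=14),
    "contain": timedelta(minutes=50),
}

result = check_1_10_60(measured)
print(result)  # {'detect': True, 'investigate': False, 'contain': True}
```

Here detection and containment met the benchmark, while investigation overran by four minutes, which points to triage and enrichment as the area to automate next.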

11 — What You'll Face

Common Incident Types:
What They Look Like

Each incident type has its own patterns, indicators, playbook requirements, and IR priorities. Here are the most common categories every IR professional must know.

Ransomware

The Most Disruptive

Malware that encrypts files and demands payment for the decryption key. Modern ransomware operations (RaaS — Ransomware as a Service) involve human operators who spend weeks inside the network before deploying encryption — mapping backups, stealing data for double extortion, and maximizing impact. Key IR priorities: isolate immediately, determine dwell time, check backup integrity, identify initial access vector. Recovery without paying requires clean, tested, offline backups.

Phishing / BEC

The Most Common Entry Point

Phishing delivers malicious links or attachments. Business Email Compromise (BEC) impersonates executives or vendors to trigger fraudulent wire transfers or sensitive data disclosure, with more than $50B in exposed losses reported globally since 2013 according to FBI IC3 figures. IR focuses on identifying who clicked, what credentials were harvested, whether email accounts were accessed, and whether any financial transactions were initiated, and on notifying affected users. Immediate password resets and MFA enforcement are critical containment steps.

Account Takeover

The Credential-Based Attack

An attacker uses stolen, guessed, or phished credentials to access legitimate accounts — bypassing most perimeter defenses since they look like the real user. Signs: logins from new countries or devices, impossible travel, access to unusual resources, large data downloads. IR involves: identifying all sessions active under the compromised account, reviewing all actions taken, revoking all tokens, resetting credentials, and investigating how credentials were obtained.
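"Impossible travel" can be made concrete: if two logins for the same account are farther apart than anyone could travel in the time between them, one of them is suspect. The sketch below uses the haversine great-circle distance; the 900 km/h speed ceiling (roughly airliner cruising speed) and the login record fields are assumptions, and the check ignores VPNs and geo-IP inaccuracy, which cause false positives in practice.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumption: roughly commercial airliner speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(login_a, login_b, max_kmh=MAX_PLAUSIBLE_KMH):
    """True if the implied speed between two logins exceeds max_kmh."""
    distance = haversine_km(
        login_a["lat"], login_a["lon"], login_b["lat"], login_b["lon"]
    )
    hours = abs((login_b["time"] - login_a["time"]).total_seconds()) / 3600
    if hours == 0:
        return distance > 0  # same instant, different places
    return distance / hours > max_kmh


# Made-up logins for one account.
london = {"lat": 51.5074, "lon": -0.1278, "time": datetime(2024, 5, 1, 9, 0)}
tokyo = {"lat": 35.6762, "lon": 139.6503, "time": datetime(2024, 5, 1, 10, 0)}
paris = {"lat": 48.8566, "lon": 2.3522, "time": datetime(2024, 5, 1, 13, 0)}

print(is_impossible_travel(london, tokyo))  # True: ~9,500 km in one hour
print(is_impossible_travel(london, paris))  # False: ~340 km in four hours
```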

Insider Threat

The Most Complex

A current or former employee, contractor, or partner intentionally misuses access — exfiltrating data, sabotaging systems, or facilitating external attackers. IR is complicated by the attacker using legitimate credentials and authorized access paths. UEBA (User and Entity Behavior Analytics) is the primary detection mechanism. IR must balance speed (stopping damage) with sensitivity (avoiding wrongful accusations before evidence is confirmed), and typically involves HR and legal from the first moment.

Data Exfiltration

The Stealthy Theft

Data being copied and sent outside the organization — customer records, IP, financial data, credentials. May occur as part of a ransomware attack (double extortion) or as a standalone espionage operation. Key evidence: unusually large outbound transfers, connections to cloud storage services (Mega, Dropbox), DNS tunneling, staged archive files (.zip, .rar) created in unusual locations. Network forensics (pcap analysis) is essential to quantify what was taken.
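One simple way to surface "unusually large outbound transfers" is to compare a host's outbound volume today against its own recent baseline. The sketch below flags a value more than three standard deviations above the historical mean; the threshold, the daily granularity, and the sample figures are assumptions, and production detections usually use more robust baselines.

```python
from statistics import mean, stdev


def is_unusual_outbound(history_mb, today_mb, threshold_sigmas=3.0):
    """True if today's outbound volume is far above the host's own baseline."""
    if len(history_mb) < 2:
        return False  # not enough history to form a baseline
    mu = mean(history_mb)
    sigma = stdev(history_mb)
    if sigma == 0:
        return today_mb > mu  # perfectly flat baseline: any increase stands out
    return (today_mb - mu) / sigma > threshold_sigmas


# Made-up daily outbound totals (MB) for one workstation over the past week.
history = [120, 135, 110, 140, 125, 130, 115]

print(is_unusual_outbound(history, 138))   # False: within normal variation
print(is_unusual_outbound(history, 4200))  # True: far outside the baseline
```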

Supply Chain Attack

The Hardest to Detect

Attackers compromise a trusted third party (software vendor, MSP, hardware supplier) to reach the real target. SolarWinds (2020) is the defining example: attackers inserted malicious code into a software update that was delivered to roughly 18,000 organizations. IR is especially difficult because the initial access vector appears completely legitimate: trusted software from a trusted vendor. Response requires broad hunting across all systems that received the compromised update, not just suspicious endpoints.
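Scoping a supply chain incident usually starts with a software inventory query: which hosts installed an affected version? A minimal sketch, assuming a hypothetical product name, version list, and inventory export:

```python
# Hypothetical affected releases, as (product, version) pairs.
COMPROMISED_RELEASES = {
    ("ExampleAgent", "2.4.1"),
    ("ExampleAgent", "2.4.2"),
}

# Made-up software inventory export.
inventory = [
    {"host": "fin-srv-01", "product": "ExampleAgent", "version": "2.4.1"},
    {"host": "hr-ws-07", "product": "ExampleAgent", "version": "2.3.9"},
    {"host": "dev-srv-03", "product": "ExampleAgent", "version": "2.4.2"},
    {"host": "dev-srv-03", "product": "OtherTool", "version": "1.0.0"},
]


def hosts_with_compromised_release(records, compromised=COMPROMISED_RELEASES):
    """Return a sorted list of hosts that installed any affected release."""
    return sorted(
        {r["host"] for r in records if (r["product"], r["version"]) in compromised}
    )


print(hosts_with_compromised_release(inventory))  # ['dev-srv-03', 'fin-srv-01']
```

Every host on the resulting list is in scope for hunting, whether or not it has shown any suspicious behavior.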


12 — Practice & Readiness

Testing IR Readiness:
Tabletops & Simulations

An IR plan that has never been tested is just a document. These exercises are how organizations build real muscle memory — and find the gaps before attackers do.

Tabletop Exercise

Scenario Discussion Around a Table

A facilitated discussion where the IR team, leadership, legal, comms, and IT walk through a hypothetical incident scenario step by step — without touching any systems. The facilitator presents a scenario ("your SIEM just fired a ransomware alert on three servers in the finance department — what do you do next?") and the team talks through their response. Goals: test decision-making, validate communication paths, identify playbook gaps, and surface misunderstandings about roles. Low-cost, high-value. Should be run quarterly for high-maturity teams.

Purple Team Exercise

Red + Blue = Better Together

A collaborative exercise where a Red Team (attackers) simulates a realistic attack while the Blue Team (defenders) attempts to detect and respond. Unlike traditional red teaming (where the red team hides), purple teaming is transparent — both teams share TTPs, detection results, and gaps in real time. The output: concrete detection rule improvements and validated playbooks. Purple teaming is the fastest way to improve detection coverage against specific threat scenarios.

Full-Scale Simulation

Live-Fire Practice

A hands-on simulation where the IR team responds to a simulated incident injected into a test environment (or even the live environment with prior approval). Tests not just decision-making but actual tool usage, response speed (MTTD and MTTR are measured), and the mechanics of containment and eradication. Should include surprise elements that weren't in the original scenario — because real incidents never follow the script. Run annually at minimum.

Lessons-Learned Loop

The Point of All of It

Every exercise — tabletop, purple team, or real incident — must produce a concrete action list with owners and deadlines. Common outputs: update playbook step 4 to include isolating the print server, add a detection rule for T1566.001 phishing, acquire a memory forensics tool, establish a pre-agreed retainer with an external IR firm, test backup restoration quarterly. Without this loop, exercises are theater. With it, they're how organizations measurably improve.

The biggest IR failures are usually process failures, not tool failures. The most common post-incident findings: "we didn't know who was authorized to isolate production servers during an active incident," "the IR plan existed but nobody had read it in 18 months," "our SIEM had the right data but nobody had built a detection rule for this technique," and "we didn't have legal on speed dial so the first 4 hours were spent finding a lawyer." Tools are necessary but insufficient. Process, practice, and people decide outcomes.

13 — Career Path

IR Certifications &
Learning Path

Incident Response is one of the most in-demand and well-compensated specializations in cybersecurity. Here's the certification landscape and how to build toward it.

🎓 GCIH — GIAC Certified Incident Handler

  • The most recognized hands-on IR certification — from SANS Institute
  • Covers: incident handling process, attack techniques, live response, log analysis, network forensics, and malware analysis fundamentals
  • Exam: 106 questions, 4 hours, open book (GIAC's standard format) — tests practical knowledge, not memorization
  • Paired training course: SEC504 — "Hacker Tools, Techniques, and Incident Handling"
  • Widely required in SOC analyst and IR analyst job postings globally

🔬 GCFE / GCFA — GIAC Forensics Certifications

  • GCFE (Forensic Examiner): Windows forensics — filesystem artifacts, registry, browser history, email forensics
  • GCFA (Forensic Analyst): Advanced incident response and forensics — memory analysis, timeline analysis, malware triage
  • Paired courses: FOR500 (Windows forensics), FOR508 (Advanced Incident Response)
  • FOR508 is widely considered the gold standard course for IR practitioners — covers the most realistic, complex scenarios
  • Target audience: Digital forensics analysts and senior IR analysts

🏆 CISA / CISSP — Management Level

  • CISA (Certified Information Systems Auditor) — ISACA certification, relevant for IR governance and compliance aspects
  • CISSP — broad security management cert that includes IR as a domain; validates IR program design knowledge
  • Neither is hands-on technical — they're for IR program managers, CISOs, and security architects who design IR programs
  • CISSP requires 5 years of experience; CISA requires 5 years of IS audit, control, or security experience, of which up to 3 can be waived through approved substitutions

💻 Blue Team Labs & Practical Platforms

  • BlueTeamLabs.online — free and paid IR investigation labs with realistic scenarios
  • TryHackMe — SOC Level 1 and 2 paths include hands-on IR and forensics rooms
  • Hack The Box (Sherlocks) — forensics challenges using real malware artifacts and log data
  • DFIR.training — Brett Shavers's resource aggregating free forensics training and datasets
  • CyberDefenders — Blue team CTFs with realistic incident scenarios and network captures
  • SANS Holiday Hack / KringleCon — annual free challenge with IR components

📚 Essential Reading & Resources

  • NIST SP 800-61 Rev.3 (2025) — free download at csrc.nist.gov — the official framework
  • SANS Incident Handler's Handbook — free PDF from sans.org
  • MITRE ATT&CK — free at attack.mitre.org — essential daily reference
  • "The Threat Intelligence Handbook" — free from Recorded Future
  • IR case studies: Mandiant M-Trends Report (annual), Verizon DBIR (annual)
  • Blogs: DFIR.blog, AboutDFIR.com, TheDFIRReport.com (real intrusion case studies)

🚀 Career Roles & Compensation

  • SOC Analyst L1: $55,000–$80,000 — alert triage and initial IR
  • SOC Analyst L2/L3: $80,000–$120,000 — investigation and escalation
  • IR Analyst: $100,000–$145,000 — handles confirmed incidents, runs playbooks
  • Senior IR / Threat Hunter: $130,000–$175,000 — proactive hunting and complex IR
  • Digital Forensics Analyst: $110,000–$160,000 — evidence collection and analysis
  • IR Consultant (external firm): $150,000–$220,000+ — parachutes into breach scenarios
  • IR Manager / CISO: $175,000–$300,000+ — program design and executive responsibility

14 — Reference

Glossary of Key Terms

Every important IR term from this guide, defined in plain English.

Incident Response (IR)
The structured, documented process organizations follow to detect, contain, eradicate, and recover from cyberattacks — minimizing damage and restoring normal operations as quickly as possible.
CSIRT
Computer Security Incident Response Team. The group of people responsible for executing the incident response plan. Also called CIRT (Cyber Incident Response Team) or CERT.
NIST SP 800-61
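The US government's gold-standard IR framework. Revision 2 defined the widely taught four-phase lifecycle: Preparation, Detection & Analysis, Containment/Eradication/Recovery, and Post-Incident Activity. Revision 3 (April 2025) reorganizes its recommendations around the NIST Cybersecurity Framework 2.0 functions.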
SANS PICERL
The SANS Institute's six-phase IR model: Preparation, Identification, Containment, Eradication, Recovery, Lessons Learned. More granular than NIST, favored by hands-on practitioners.
IRP — Incident Response Plan
The documented policy and procedures governing how an organization detects, responds to, and recovers from security incidents. Must include roles, communication trees, escalation criteria, and playbooks.
Playbook
A step-by-step response guide for a specific type of incident (ransomware, phishing, etc.). Can be a human checklist or an automated SOAR workflow. Removes guesswork during high-pressure response situations.
Runbook
A low-level, technical how-to procedure for a specific task within a playbook — e.g., "how to image a Windows disk" or "how to isolate a Linux server." Playbooks reference runbooks for the detailed steps.
SOAR
Security Orchestration, Automation and Response. A platform that executes automated response playbooks triggered by SIEM alerts — blocking IPs, isolating endpoints, resetting credentials at machine speed.
IOC — Indicator of Compromise
A specific observable artifact indicating malicious activity — file hash, IP address, domain, URL, registry key. Loaded into SIEM and EDR as detection rules to alert if seen elsewhere in the environment.
TTP — Tactics, Techniques & Procedures
How an attacker operates — the behavioral patterns and methods they use. Described using the MITRE ATT&CK framework vocabulary. Harder for attackers to change than IOCs, making TTP-based detection more durable.
MITRE ATT&CK
A freely accessible knowledge base of real-world attacker behaviors across 14 tactics and 200+ techniques. The universal language for describing, detecting, and hunting for attacker TTPs in IR investigations.
Kill Chain
Lockheed Martin's 7-stage attack model: Reconnaissance → Weaponization → Delivery → Exploitation → Installation → C2 → Actions on Objectives. Helps IR teams understand where an attacker is in their attack and what to block.
MTTD
Mean Time to Detect. How long from initial compromise until the organization knows an incident occurred. The global average is 194 days. The primary metric targeted by threat hunting and detection tuning programs.
MTTR
Mean Time to Respond. How long from detection to full resolution, including containment, eradication, and recovery. Reduced by automated playbooks, pre-approved containment authorities, and practiced response processes.
Dwell Time
How long an attacker was present in the environment before detection. Longer dwell time = more damage. Proactive threat hunting exists specifically to reduce dwell time below what automated detection achieves.
Chain of Custody
The documented, unbroken record of who collected each piece of evidence, when, how, where it's been stored, and who has accessed it. Required for digital evidence to be admissible in legal proceedings.
Order of Volatility
The forensic principle that evidence should be collected from most volatile (disappears fastest) to least: CPU registers and RAM first, then network state, then running processes, then disk. RAM contents in particular are lost when a machine is powered off or rebooted.
Volatility
The gold-standard open-source Python framework for analyzing memory dumps. Used to find malware running only in RAM, extract encryption keys, identify injected shellcode, and reconstruct process activity.
KAPE
Kroll Artifact Parser and Extractor. A free IR triage tool that rapidly collects the most forensically significant Windows artifacts (event logs, registry, prefetch, $MFT, browser history) in minutes without a full disk image.
Pyramid of Pain
David Bianco's model showing how much operational pain different indicator types cause attackers when blocked. File hashes (bottom, easy to change) up to TTPs (top, hard to change). Guides what defenses have the most durable impact.
BEC — Business Email Compromise
An attack where criminals impersonate executives or vendors via email to trick employees into wire transfers or data disclosure. More than $50B in exposed losses has been reported globally since 2013 (FBI IC3). One of the highest-impact, lowest-technical IR scenarios.
RaaS — Ransomware as a Service
A criminal business model where ransomware developers license their malware to "affiliates" who conduct the actual attacks and share revenue with the developer. Most major ransomware attacks since 2020 use this model.
Tabletop Exercise
A facilitated discussion-based IR practice where teams walk through a hypothetical incident scenario without touching systems. Tests decision-making, communication, and playbook adequacy. Should be run at minimum quarterly.
Purple Team
A collaborative exercise combining Red Team (attackers) and Blue Team (defenders) to transparently test and improve detection and response capabilities against specific attack scenarios.
After-Action Report (AAR)
The formal document produced after every incident or major exercise. Documents the full timeline, root cause, business impact, what worked, what didn't, and concrete improvement actions with owners and deadlines.
GCIH
GIAC Certified Incident Handler. The most recognized hands-on IR certification, from the SANS Institute. Paired with the SEC504 training course.