Given an Incident, Utilize Appropriate Data Sources to Support an Investigation
1. Introduction: Choosing the Right Evidence Fast
In incident response, speed comes from choosing the right data source first. For CompTIA Security+, this matters because exam questions rarely ask for every possible source; they ask for the best source for the scenario. If the clue is a suspicious internal IP, DHCP may matter more than PCAP, because the first question is which device held that address. If the clue is a malicious URL, proxy logs are usually more useful than firewall logs. If the clue is suspicious execution on a host, EDR or process creation telemetry is often the fastest path.
This article is written for Security+ SY0-601 objectives, though the concepts still apply to newer versions such as SY0-701 even if objective wording differs. Incident response phases may appear in NIST-style wording (preparation, detection/analysis, containment, eradication/recovery, and post-incident activity) or as preparation, identification, containment, eradication, recovery, and lessons learned. No matter how they are phrased, the real skill is the same: you need to know what each data source can tell you, what it can’t tell you, and what has to be turned on beforehand for it to be useful.
2. Core Concepts and Investigation Basics
Logs are discrete recorded events. Telemetry is broader sensor data such as event streams, process state, network observations, counters, and enriched activity. Alerts are detections built from logs or telemetry. Artifacts are remnants left behind, such as Prefetch, browser history, scheduled tasks, shell history, or registry autoruns. Evidence is information used to support a conclusion, but evidentiary weight depends on collection quality, integrity protection, and documentation.
Useful metadata falls into categories: temporal (timestamp), identity (username, SID, account ID), asset (hostname, device ID, MAC), object (hash, file path, PID), and network (source/destination IP, port, protocol). These are your pivot fields.
Three rules matter in almost every investigation:
- Normalize time: use UTC where possible; watch clock drift, DST confusion, and inconsistent time zones.
- Collect volatile evidence first: memory, running processes, active connections, logged-in users, open sessions.
- Validate important findings against source data: a SIEM is a search and correlation platform, not the original source of truth by itself.
Preserve evidence carefully. Export logs, record who collected them, hash files where appropriate, restrict access, and avoid unnecessary changes. Attackers may also tamper with visibility by clearing Windows logs, deleting shell history, disabling agents, or shortening cloud retention.
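As a minimal sketch of the "hash files and record who collected them" step, the Python below computes a SHA-256 digest and builds a simple collection record. The function names and record fields are hypothetical, chosen for illustration; a real chain-of-custody form will follow your organization's procedure.

```python
import hashlib
from datetime import datetime, timezone

def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def custody_record(path, collector):
    """Build a simple collection record (field names are hypothetical)."""
    return {
        "file": path,
        "sha256": sha256_of_file(path),
        "collected_by": collector,
        "collected_utc": datetime.now(timezone.utc).isoformat(),
    }
```

Re-hashing the exported file later and comparing digests is a quick way to show that it has not changed since collection.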
3. Host-Based Data Sources
Host data is usually the best place to confirm what executed, who logged in, and how persistence was established.
4. Windows Investigation Essentials
Windows Event Logs are powerful, but visibility depends on configuration. For example, Event ID 4688 process creation requires auditing to be enabled, and command-line visibility may require additional policy settings. Scheduled task and service evidence may appear in multiple channels depending on logging.
High-yield Windows events include:
- 4624 successful logon
- 4625 failed logon
- 4648 logon with explicit credentials
- 4672 special privileges assigned
- 4688 process creation
- 4697 service installation
- 4698 scheduled task creation
- 4720 user account creation
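- 4728/4732 member added to a security-enabled global/local group (most important when the group is privileged, such as Domain Admins or local Administrators)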
- 7045 service creation in the System log
- 4104 PowerShell script block logging when enabled
Common logon types worth recognizing: 2 interactive, 3 network, 5 service, 10 remote interactive/RDP. That matters because a 4624 with Logon Type 3 does not mean someone sat at the keyboard.
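A small lookup makes the logon-type point concrete. This is a sketch only: the event is modeled as a plain dictionary with hypothetical keys (`event_id`, `logon_type`), not the actual Windows event schema, and it covers only the four logon types listed above.

```python
# Logon types named in this article; Windows defines others as well.
LOGON_TYPES = {
    2: "interactive",
    3: "network",
    5: "service",
    10: "remote interactive (RDP)",
}

def describe_logon(event):
    """Describe a 4624 event held in a dict (keys are hypothetical)."""
    if event.get("event_id") != 4624:
        return "not a successful logon event"
    code = event.get("logon_type")
    label = LOGON_TYPES.get(code, "other/unlisted")
    return f"4624 logon type {code}: {label}"

print(describe_logon({"event_id": 4624, "logon_type": 3}))
```

A type 3 result here points to access over the network, such as a file share connection, rather than someone at the console.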
Sysmon is a major enhancement source in real environments. Sysmon Event ID 1 is one of those logs I’ve learned to love because it gives you much better process-creation detail than native Windows logging usually does. And the nice part is that Sysmon doesn’t stop there—it can also help you see network connections, driver loads, registry changes, and file activity if it’s configured right.
Important Windows artifacts beyond logs include Prefetch, Amcache, ShimCache/AppCompatCache, UserAssist, LNK files, Jump Lists, browser history, Run/RunOnce keys, services, scheduled tasks, and WMI event subscriptions. These help prove execution or persistence even when logs are incomplete. Limitations matter: Prefetch may be disabled, some artifacts age out, and not every server keeps the same evidence.
Memory is especially valuable for fileless malware, injected processes, credential theft, and active network sessions. If malware is running only in RAM, disk artifacts may not tell the full story.
5. Linux Investigation Essentials
Linux logging is distro-dependent. Debian-family systems often use /var/log/auth.log; Red Hat-family systems often use /var/log/secure. Many systems rely heavily on systemd-journald, so logs may not exist as traditional flat files unless persistent journaling is configured.
Useful Linux evidence sources include SSH authentication, sudo activity, cron jobs, systemd service changes, shell history, and audit logs. Commands such as journalctl, last, lastb, and ausearch are common investigation tools. Auditd can provide stronger visibility into execve, file access, and privileged actions when configured.
Example indicators include successful SSH login followed by sudo to root, a new cron entry, a suspicious binary in /tmp, or a modified systemd unit for persistence. Shell history can help, but it is easy to alter or disable, so it is supporting evidence rather than proof.
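The "SSH login followed by sudo to root" pattern can be sketched with two regular expressions. The patterns assume the common OpenSSH and sudo message layouts; exact formats vary by distro and configuration, and the sample lines, hostname, and addresses in the usage example are invented for illustration.

```python
import re

# Assumed message layouts; adjust to what your systems actually log.
SSH_OK = re.compile(r"sshd\[\d+\]: Accepted (\S+) for (\S+) from (\S+) port (\d+)")
SUDO_ROOT = re.compile(r"sudo:\s+(\S+) : .*USER=root ; COMMAND=(.+)$")

def ssh_then_root(lines):
    """Return (user, source_ip, command) for sudo-to-root after an SSH login."""
    logged_in = {}   # user -> source IP of the most recent accepted SSH login
    findings = []
    for line in lines:
        line = line.rstrip()
        ssh = SSH_OK.search(line)
        if ssh:
            _method, user, source_ip, _port = ssh.groups()
            logged_in[user] = source_ip
            continue
        sudo = SUDO_ROOT.search(line)
        if sudo and sudo.group(1) in logged_in:
            user = sudo.group(1)
            findings.append((user, logged_in[user], sudo.group(2)))
    return findings

sample = [
    "Jan 10 03:14:07 web01 sshd[2211]: Accepted publickey for deploy from 203.0.113.7 port 50522 ssh2",
    "Jan 10 03:15:10 web01 sudo:   deploy : TTY=pts/0 ; PWD=/home/deploy ; USER=root ; COMMAND=/bin/bash",
]
print(ssh_then_root(sample))
```

A hit is a lead to investigate, not proof of compromise; administrators produce the same sequence during normal work.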
6. Endpoint Telemetry, EDR, and Persistence Triage
Modern AV/EPP can do more than signature matching, but EDR usually provides deeper investigation and response telemetry: process trees, parent-child relationships, command lines, hashes, file writes, registry changes, and network connections. My go-to triage flow is pretty simple: start with the alert, look at the process tree, read the command line, check how common the file or process is, verify the user context, and then decide whether you need to contain it.
This is where you distinguish suspicious-but-benign activity from true abuse. powershell.exe alone is not enough. powershell.exe -nop -w hidden -enc ... launched by Word or a temp-file script host is much more concerning. Common persistence checks include services, scheduled tasks, startup folders, Run keys, WMI subscriptions, browser extensions, and unusual local admin accounts.
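To illustrate why the command line and parent process matter, here is a rough triage sketch. The flag list and parent list are illustrative assumptions, not a complete detection rule. The decoder relies on the fact that PowerShell's encoded command is Base64 over UTF-16LE text.

```python
import base64
import re

# Illustrative indicators only; tune these for your own environment.
SUSPICIOUS_FLAGS = ("-nop", "-w hidden", "-enc")
SUSPICIOUS_PARENTS = {"winword.exe", "excel.exe", "wscript.exe", "cscript.exe", "mshta.exe"}

def score_powershell(parent, command_line):
    """List the reasons a PowerShell launch looks suspicious."""
    reasons = []
    lowered = command_line.lower()
    for flag in SUSPICIOUS_FLAGS:
        if flag in lowered:
            reasons.append(f"flag {flag}")
    if parent.lower() in SUSPICIOUS_PARENTS:
        reasons.append(f"parent {parent}")
    return reasons

def decode_encoded_command(command_line):
    """Decode the Base64 (UTF-16LE) payload after -enc, or return None."""
    match = re.search(r"-enc\w*\s+([A-Za-z0-9+/=]+)", command_line, re.IGNORECASE)
    if not match:
        return None
    return base64.b64decode(match.group(1)).decode("utf-16-le")

cmd = "powershell.exe -nop -w hidden -enc dwBoAG8AYQBtAGkA"
print(score_powershell("WINWORD.EXE", cmd))
print(decode_encoded_command(cmd))
```

An empty reasons list does not clear the process; it only means none of these particular indicators matched.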
7. Network Data Sources: What They Show and What They Miss
Firewall logs show allowed or denied connections, usually with source, destination, port, action, and sometimes NAT details or policy names. They help answer whether communication was attempted or permitted, but they do not give payload visibility.
Flow-oriented telemetry such as NetFlow and IPFIX summarizes who talked to whom, when, for how long, and how many bytes moved. sFlow is different: it samples packets and polls interface counters rather than exporting full flow records. For Security+ purposes, the big thing to remember is that all of these can help you spot traffic patterns, but they don’t give you the same depth of visibility. They’re related, but you can’t treat them as perfect substitutes for each other. Flow data is excellent for spotting beaconing, lateral movement, and exfiltration volume; it does not show content.
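Beaconing shows up in flow data as connections at suspiciously regular intervals. One simple way to measure that regularity, sketched below, is the coefficient of variation of the gaps between connections. The function names and the 0.1 threshold are assumptions for illustration, and real beacons often add jitter that defeats a threshold this strict.

```python
from statistics import mean, pstdev

def beacon_score(timestamps):
    """Coefficient of variation of gaps between connections (lower = more regular)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 3 or mean(gaps) == 0:
        return None  # too few connections to judge
    return pstdev(gaps) / mean(gaps)

def looks_like_beacon(timestamps, max_cv=0.1):
    """True when connection timing is nearly constant (threshold is an assumption)."""
    score = beacon_score(timestamps)
    return score is not None and score <= max_cv

# Epoch-second start times for flows from one host to one destination.
print(looks_like_beacon([0, 60, 120, 180, 240]))
print(looks_like_beacon([0, 5, 300, 310, 2000]))
```

The input is only flow start times, which is exactly what NetFlow/IPFIX can supply without any payload visibility.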
PCAP provides packet-level detail. It is the right choice when the question asks for payload inspection, protocol reconstruction, exploit analysis, or upload/download direction. But encrypted traffic limits visibility. For TLS sessions, PCAP usually shows metadata, handshake details, timing, SNI in some cases, and session behavior unless decryption, TLS inspection, or endpoint capture is available.
DNS logs show queries, not guaranteed successful connections. They are strong for domain resolution history, suspicious NXDOMAIN patterns, tunneling clues, and malware domain lookups. They may be weakened by DoH/DoT. DHCP logs map IP to host for a time window, which is critical because IPs change. Watch for short retention, relay complexity, and MAC randomization in some environments.
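Because a lease is only valid for a time window, an IP-to-host lookup has to include the time of interest. The sketch below assumes lease records have already been parsed into dictionaries with hypothetical keys (`ip`, `mac`, `hostname`, `start`, `end`); real DHCP server log formats differ, and the sample leases are invented.

```python
from datetime import datetime, timezone

def host_for_ip(leases, ip, when):
    """Return (hostname, mac) for the lease that covered `ip` at time `when`."""
    for lease in leases:
        if lease["ip"] == ip and lease["start"] <= when < lease["end"]:
            return lease["hostname"], lease["mac"]
    return None  # no lease on record: check retention or static addressing

def utc(hour):
    """Helper for the invented sample data below."""
    return datetime(2024, 3, 1, hour, 0, tzinfo=timezone.utc)

leases = [
    {"ip": "10.1.2.50", "mac": "aa:bb:cc:00:00:01", "hostname": "LAPTOP-A", "start": utc(8), "end": utc(12)},
    {"ip": "10.1.2.50", "mac": "aa:bb:cc:00:00:02", "hostname": "LAPTOP-B", "start": utc(12), "end": utc(16)},
]
print(host_for_ip(leases, "10.1.2.50", utc(9)))
print(host_for_ip(leases, "10.1.2.50", utc(13)))
```

The same IP resolves to two different devices depending on the timestamp, which is why attributing an alert by IP alone is risky.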
Proxy logs are usually the best source for web access history when traffic is actually proxied and user identity is integrated. Without TLS (SSL) inspection, HTTPS visibility may be limited to domain, category, action, or CONNECT details rather than full URL paths. VPN logs help validate remote access sessions and may include posture/compliance information in integrated deployments. NAC and wireless controller logs can also help tie a device to a network location.
Exam matrix: suspicious IP -> firewall + DHCP + VPN; suspicious domain -> DNS + proxy + endpoint; payload contents -> PCAP; traffic pattern only -> NetFlow/IPFIX.
8. Application, Email, and Identity Sources
Web server, reverse proxy, and WAF logs should be read together. The reverse proxy may terminate TLS and preserve original client IP in headers such as X-Forwarded-For. And here’s the thing: a WAF can be in block, allow, or detect-only mode, so a WAF alert doesn’t automatically mean the attack was stopped. Web logs usually show request paths, methods, status codes, and user agents. Database logs can show authentication events, privilege changes, or odd query behavior, but if you want the full query text, you’ll often need explicit auditing turned on.
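When a reverse proxy sits in front of the web server, the address in the web log may be the proxy's own, and X-Forwarded-For can be partly attacker-supplied. One cautious approach, sketched here, walks the header from right to left and stops at the first hop that is not one of your own proxies. The function name and the idea of a `trusted_proxies` set are assumptions for illustration.

```python
import ipaddress

def client_ip(remote_addr, xff_header, trusted_proxies):
    """Return the nearest hop that is not a trusted proxy, or None."""
    hops = [hop.strip() for hop in xff_header.split(",") if hop.strip()]
    hops.append(remote_addr)  # the address that actually connected to us
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return None  # malformed entry: do not trust anything left of it
        if hop not in trusted_proxies:
            return hop
    return None

proxies = {"10.0.0.5", "10.0.0.8"}
print(client_ip("10.0.0.5", "198.51.100.9, 10.0.0.8", proxies))
```

Entries to the left of the returned address were supplied by the client side and should be treated as unverified.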
Email investigations are high-value for Security+. Start with message trace or secure email gateway logs to confirm whether the message was delivered, blocked, or quarantined. Then review headers and authentication results such as SPF, DKIM, and DMARC. Check the envelope sender versus visible From address, look for URL rewriting or sandbox detonation results, and review mailbox rules for suspicious forwarding. Headers visible to the user can be spoofed or incomplete; gateway and mail server logs are stronger validation sources.
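The header checks can be sketched with Python's standard `email` package. The sample message is invented. Treat the Authentication-Results parsing as a rough regular-expression pass rather than a full parser, and remember that Return-Path is only a stand-in for the envelope sender recorded by the receiving server.

```python
import re
from email import message_from_string
from email.utils import parseaddr

def summarize_headers(raw_message):
    """Pull SPF/DKIM/DMARC verdicts and compare envelope vs visible sender."""
    msg = message_from_string(raw_message)
    auth = msg.get("Authentication-Results", "")
    summary = {}
    for mechanism in ("spf", "dkim", "dmarc"):
        found = re.search(rf"\b{mechanism}=(\w+)", auth)
        summary[mechanism] = found.group(1) if found else None
    envelope = parseaddr(msg.get("Return-Path", ""))[1]
    visible = parseaddr(msg.get("From", ""))[1]
    summary["envelope_domain"] = envelope.rpartition("@")[2].lower()
    summary["from_domain"] = visible.rpartition("@")[2].lower()
    summary["domains_match"] = summary["envelope_domain"] == summary["from_domain"]
    return summary

# Invented sample message for illustration.
raw = "\n".join([
    "Return-Path: <bounce@mailer.example.net>",
    'From: "Payroll" <payroll@example.com>',
    "Authentication-Results: mx.example.org; spf=pass smtp.mailfrom=mailer.example.net;"
    " dkim=fail header.d=example.com; dmarc=fail header.from=example.com",
    "Subject: Updated direct deposit form",
    "",
    "Please review the attached form.",
])
print(summarize_headers(raw))
```

An SPF pass alongside a DMARC fail, as in this sample, is a reminder that SPF checks the envelope domain, not the From address the user sees.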
Authentication and identity sources include IdP logs, Active Directory/domain controller logs, Kerberos and NTLM events, VPN logs, RADIUS/TACACS+ logs, and mailbox audit logs. These help detect password spraying, impossible travel, MFA fatigue, legacy authentication abuse, and post-login activity. Impossible travel is only an indicator; VPN egress, mobile networks, and cloud proxying can create false positives.
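Password spraying has a recognizable shape in authentication logs: one source fails against many different accounts. A minimal sketch, assuming failed-logon events have been normalized into dictionaries with hypothetical `source_ip` and `username` keys, and with a threshold picked only for illustration:

```python
from collections import defaultdict

def spray_candidates(failed_logons, min_users=10):
    """Map each source IP to its count of distinct failed usernames, if high."""
    users_by_source = defaultdict(set)
    for event in failed_logons:
        users_by_source[event["source_ip"]].add(event["username"])
    return {
        source: len(users)
        for source, users in users_by_source.items()
        if len(users) >= min_users
    }

# Invented events: one source tries 12 accounts, another retries one account.
events = [{"source_ip": "198.51.100.20", "username": f"user{n}"} for n in range(12)]
events += [{"source_ip": "10.1.2.50", "username": "alice"} for _ in range(12)]
print(spray_candidates(events))
```

The second source in the sample generates just as many failures but against a single account, which looks more like a forgotten password or a brute-force attempt than a spray.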
9. Cloud, SaaS, and Exfiltration Sources
Cloud logging is easiest to understand in categories: identity logs, control-plane audit logs, data-plane access logs, network/security flow logs, and workload or SaaS logs. Examples include AWS CloudTrail, AWS VPC Flow Logs, Azure Activity Logs, Entra ID sign-in logs, Google Cloud Audit Logs, and Microsoft 365 Unified Audit Log. Shared responsibility matters because some logs must be enabled, exported, or retained by the customer.
For exfiltration, think by channel. Web upload: proxy, DLP, flow, PCAP, endpoint. Email exfil: mail logs, mailbox audit, DLP. Cloud sharing: SaaS admin logs, CASB, object access logs. DNS tunneling: DNS query patterns and resolver logs. USB/removable media: endpoint device control logs, OS events, DLP. RDP clipboard or drive mapping: session logs plus host evidence. A DLP alert may show blocked or attempted movement; it does not automatically prove successful data loss.
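For the DNS tunneling channel, "query patterns" usually means unusually long or random-looking labels. The sketch below flags a query name on label length and Shannon entropy. Both thresholds are assumptions for illustration, and legitimate services such as CDNs can trip heuristics like this.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Bits of entropy per character in `text` (0 for empty or uniform text)."""
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def suspicious_query(qname, max_label_len=40, min_entropy=3.5):
    """Flag names whose longest label is very long, or long and random-looking."""
    labels = qname.rstrip(".").split(".")
    longest = max(labels, key=len)
    if len(longest) > max_label_len:
        return True
    return len(longest) >= 16 and shannon_entropy(longest) >= min_entropy

print(suspicious_query("www.example.com"))
print(suspicious_query("abcdefghijklmnopqrstuvwxyz0123456789abcdefgh.tunnel.example"))
```

Query volume per client and per parent domain is an equally useful signal and is often easier to obtain from resolver logs.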
Backup logs are crucial in ransomware response, but be precise: they confirm job status and restore operations, not true recoverability unless restore testing or verification was performed. Look for immutability, recent clean restore points, and whether backups themselves were targeted.
10. Building a Reliable Timeline and Correlating Sources
A practical timeline method is: collect timestamps, convert to UTC, choose anchor events, pivot on user/host/IP/hash/domain/process, mark confidence, and document gaps. High-confidence anchors include confirmed email delivery, a successful login, a process creation event, a DNS query, or a firewall allow record.
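The "convert to UTC, then sort" step is easy to get wrong by hand, so here is a small sketch of it. Each event carries the UTC offset of the system that logged it; the keys (`source`, `time`, `utc_offset_hours`) and the sample events are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def to_utc(local_text, utc_offset_hours):
    """Parse 'YYYY-MM-DD HH:MM:SS' logged at a known offset and convert to UTC."""
    naive = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc)

def build_timeline(events):
    """Attach a UTC timestamp to every event and sort chronologically."""
    normalized = [
        {**event, "utc": to_utc(event["time"], event["utc_offset_hours"])}
        for event in events
    ]
    return sorted(normalized, key=lambda event: event["utc"])

# Invented events from three sources that log in different time zones.
events = [
    {"source": "edr", "time": "2024-03-01 15:03:30", "utc_offset_hours": 1},
    {"source": "email gateway", "time": "2024-03-01 14:02:10", "utc_offset_hours": 0},
    {"source": "proxy", "time": "2024-03-01 09:03:00", "utc_offset_hours": -5},
]
for event in build_timeline(events):
    print(event["utc"].isoformat(), event["source"])
```

Sorted by the raw local strings, the proxy click would appear hours before the email arrived; after normalization the order is delivery, click, execution.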
Useful cross-source patterns include:
- Phishing: email trace -> proxy click -> DNS lookup -> EDR process tree
- Credential abuse: IdP/VPN success -> host logon -> PowerShell/RDP/SMB activity
- Suspicious IP: firewall source IP -> DHCP lease -> CMDB owner -> VPN correlation
- Web attack: WAF hit -> reverse proxy request -> web log -> DB audit
Simple search logic in a SIEM often starts with a narrow time window and one pivot field. Example: search the hostname and user around the alert time, then add source IP and destination domain. SIEM normalization helps, but it doesn’t make bad data good. Parsing errors, missing fields, and delayed ingestion can still send you down the wrong path if you’re not checking for them.
11. When the Evidence Doesn’t Line Up
If the evidence you expected isn’t there, don’t jump straight to thinking the incident didn’t happen. Check for common blind spots:
- auditing was never enabled
- retention expired
- endpoint agent offline or disabled
- clock drift or time-zone mismatch
- NAT, load balancers, or proxies obscured attribution
- TLS or DoH reduced visibility
- cloud audit export not enabled
- WAF in detect-only mode
- attacker cleared or tampered with logs
Before trusting a conclusion, verify source coverage, time sync, parsing quality, and whether the data source is direct evidence or just enrichment.
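One quick way to check source coverage is to look for silent periods in a log that normally produces steady events. This is a sketch under that assumption; the function name and threshold are hypothetical, and a gap can mean a quiet system just as easily as a disabled agent or cleared log.

```python
def find_gaps(epoch_times, max_gap_seconds):
    """Return (start, end) pairs where consecutive events are too far apart."""
    ordered = sorted(epoch_times)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_gap_seconds
    ]

# Event times in epoch seconds, with one long silence in the middle.
print(find_gaps([0, 30, 60, 4000, 4030], max_gap_seconds=300))
```

If a gap overlaps the incident window, treat the absence of events from that source as unknown rather than as evidence that nothing happened.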
12. Three High-Value Security+ Scenarios
Phishing to malware: Best first source is usually the secure email gateway or message trace to confirm delivery. Then use SPF/DKIM/DMARC results, proxy logs, DNS logs, and EDR. Goal: prove delivery, click, download, and execution.
Suspicious login or credential abuse: Best first source is usually IdP or VPN logs. Then check AD/DC logs, host logons, mailbox access, and EDR for post-login behavior. Goal: prove whether authentication succeeded and what happened next.
Ransomware or lateral movement: Best first source is usually EDR or file integrity monitoring (FIM) for initial impact, then SMB/file share logs, Windows share access events (5140/5145) where enabled, authentication logs, backup logs, and network telemetry. Goal: identify patient zero, spread path, affected shares, and restore options.
13. Best-First-Source Cheat Sheet and Exam Traps
If the question says...
- identify device from IP -> DHCP
- confirm domain resolution -> DNS
- confirm website visited -> Proxy
- confirm process execution -> EDR / 4688 / Sysmon
- confirm remote access login -> VPN / IdP
- inspect payload contents -> PCAP
- see traffic pattern or volume -> NetFlow/IPFIX
- trace cloud admin/API action -> Cloud audit log
Common distractors are predictable. PCAP is tempting when proxy is enough. DNS is tempting when the question really asks about browsing. Threat intelligence is tempting when the question asks for internal proof. A SIEM alert is tempting when the better answer is the original log source. Choose the source closest to the evidence need: confirmation, attribution, scoping, or payload analysis.
14. Rapid Review
Remember these distinctions:
- DNS shows queries, not full browsing.
- DHCP maps IP to host only for a time window.
- Firewall logs show connection metadata, not payload.
- NetFlow/IPFIX show patterns and volume, not content.
- Proxy logs show web access when traffic is actually proxied.
- EDR shows host behavior and process relationships.
- Windows process creation and PowerShell visibility depend on logging configuration.
- Cloud logs must be enabled and retained; shared responsibility matters.
- Backup logs support recovery decisions, but restore testing proves recoverability better than job success alone.
That is the mindset Security+ wants: not “what tool is coolest,” but “what source most directly answers this question first?”