Most organizations don’t discover breaches themselves. According to IBM’s Cost of a Data Breach Report, the average time to identify a breach has hovered around 200 days in recent editions. That is roughly two hundred days of an attacker in your environment, and a common reason it takes that long is that nobody was watching the logs or, worse, the logs didn’t exist. You can’t detect what you can’t see. You can’t investigate what wasn’t recorded. And you can’t prove what happened to regulators, courts, or your board without evidence.
A security logging pipeline is the foundation of detection and response. Not a SIEM license collecting dust, not a log server filling up with unreviewed data — a pipeline that collects the right logs, puts them in one place, keeps them long enough, alerts on what matters, and protects the evidence from tampering. Here’s how to build one.
DO / DON’T
DO:
- Log authentication events — Every login, logout, failure, lockout, and privilege change. Authentication logs are the single most valuable log source for security.
- Centralize logs immediately — Logs that live only on the system that generated them disappear when that system is compromised. Ship logs off the source in real time.
- Define retention policies before you start collecting — Storage costs money. Know how long you need to keep logs for operational, compliance, and forensic purposes.
- Tune alerts: An alert that fires 500 times a day gets ignored the 501st time. Tune until every alert is actionable.
- Protect log integrity — If an attacker can modify or delete logs, your detection capability and your forensic evidence are both gone.
- Log in UTC — Timestamps across systems in different time zones are a correlation nightmare. Use UTC everywhere.
DON’T:
- Don’t log passwords, tokens, or secrets — Logs are accessed by many people and stored in many places. Sensitive credentials in logs create a new vulnerability.
- Don’t log everything and analyze nothing — Collecting logs without reviewing them is just writing an expensive diary that nobody reads.
- Don’t rely on local logs alone: Clearing local logs is one of the first things a sophisticated attacker does after compromise. If your only copy is on the compromised system, it’s gone.
- Don’t ignore log volume planning — A busy web server can generate gigabytes per day. Plan storage, ingestion rates, and costs before you turn on verbose logging everywhere.
- Don’t set and forget — New systems, new applications, and new threats mean your logging strategy needs regular review.
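Two of the rules above, keeping secrets out of logs and logging in UTC, can be enforced in application code. The following is a minimal sketch using Python's standard `logging` module. The key names in `SECRET_PATTERN` and the logger name `app.auth` are assumptions for illustration, and a regex like this only catches simple `key=value` or `key: value` forms, so treat it as a safety net rather than a guarantee.

```python
import logging
import re
import time

# Assumption: these key names are illustrative; extend the list for your own applications.
SECRET_PATTERN = re.compile(
    r"(?i)\b(password|passwd|token|secret|api_key)\b(\s*[=:]\s*)(\S+)"
)


def redact(message: str) -> str:
    """Replace the value of any secret-looking key/value pair with a placeholder."""
    return SECRET_PATTERN.sub(r"\1\2[REDACTED]", message)


class RedactSecretsFilter(logging.Filter):
    """Scrub secrets from a record before the handler writes it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = None  # the message is already fully formatted
        return True


def build_logger(name: str = "app.auth") -> logging.Logger:
    """Return a logger that emits UTC timestamps and redacted messages."""
    formatter = logging.Formatter(
        fmt="%(asctime)sZ %(levelname)s %(name)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    formatter.converter = time.gmtime  # render timestamps in UTC, not local time

    handler = logging.StreamHandler()
    handler.setFormatter(formatter)
    handler.addFilter(RedactSecretsFilter())

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

Calling `build_logger().info("login failed user=%s password=%s", "alice", "hunter2")` would write the line with the password value replaced by `[REDACTED]`. Redacting at the handler is a last line of defense; not passing secrets to the logger in the first place remains the better control.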
What to Log
Not all logs are created equal. Focus on the sources that provide the most security value per byte stored.
Tier 1: Must Have (Day One)
These log sources provide the foundation for security monitoring. If you log nothing else, log these.
Authentication and access:
- Login successes and failures (all systems, all applications)
- Account lockouts
- MFA successes and failures
- Privilege escalation (sudo, runas, UAC elevation)
- Account creation, modification, and deletion
- Password changes and resets
- Session creation and termination
System changes:
- Configuration changes to security-relevant settings
- Software installation and removal
- Scheduled task and cron job creation or modification
- Service start, stop, and configuration changes
- Group policy changes
- Firewall rule changes
Network security:
- Firewall allows and denies (denies are often more valuable than allows)
- VPN connections and disconnections
- DNS queries (especially to external resolvers)
- Proxy/web filter logs
- IDS/IPS alerts
Tier 2: Should Have (Month One)
Application-level:
- Web application access logs (URLs, response codes, source IPs, user agents)
- Database queries — at minimum, DDL (schema changes), DML on sensitive tables, and failed queries
- API access logs (authentication, rate limiting, errors)
- Email flow logs (sender, recipient, subject, attachment presence, spam/phishing verdicts)
- File access on sensitive shares (who opened what, when)
Endpoint:
- Process execution logs (what ran, who ran it, parent process)
- PowerShell and script execution (with script block logging enabled)
- USB device connections
- Endpoint protection alerts (antivirus, EDR)
Tier 3: Nice to Have (Quarter One)
- Full packet capture on critical segments (expensive, high value for forensics)
- Cloud control plane logs (AWS CloudTrail, Azure Activity Log, GCP Audit Logs)
- Container and orchestration logs (Kubernetes audit logs, container runtime events)
- Physical access logs (badge reader events, if integrated)
- DHCP lease logs (map IP addresses to devices over time)
NIST SP 800-92 provides comprehensive guidance on log management, including what to log, how to manage log data, and log analysis. CISA’s Logging Made Easy project provides free tools and guidance for organizations getting started.
Centralized Log Collection
Architecture
Logs need to flow from sources to a central platform in real time. The basic architecture:
[Log Sources] --> [Collection/Shipping] --> [Ingestion/Parsing] --> [Storage/Index] --> [Search/Alert]
Collection agents:
- Syslog — The universal log protocol. Most network devices, Linux systems, and security appliances support syslog natively. Use syslog-ng or rsyslog for collection and forwarding.
- Beats/Filebeat — Elastic’s lightweight log shipper. Reads log files and forwards to Logstash or Elasticsearch.
- Fluentd/Fluent Bit — Open-source log collector and forwarder. Popular in containerized environments.
- Windows Event Forwarding (WEF) — Native Windows mechanism for centralizing Windows Event Logs. No agent required on endpoints.
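For applications you write yourself, shipping events to a syslog collector can be done without an extra agent. The sketch below uses `logging.handlers.SysLogHandler` from Python's standard library. The logger name `auth-events` and the choice of UDP are assumptions for illustration; UDP syslog can silently drop messages, so a TCP or TLS transport is usually preferable where the collector supports it.

```python
import logging
import logging.handlers
import socket


def build_auth_logger(host: str, port: int = 514) -> logging.Logger:
    """Return a logger that forwards each record to a remote syslog collector."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_AUTH,  # security/authorization facility
        socktype=socket.SOCK_DGRAM,  # UDP; pass socket.SOCK_STREAM for TCP
    )
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))

    logger = logging.getLogger("auth-events")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.addHandler(handler)
    logger.propagate = False  # do not also write through the root logger
    return logger
```

With a collector listening on a host of your choosing, `build_auth_logger(collector_host).info("login success user=alice")` sends the event off the source system as soon as it is logged, which is the property the DO list asks for.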
SIEM platforms:
- Open source: Wazuh, Elastic Security (ELK Stack + detection rules), Graylog, Security Onion
- Commercial: Splunk, Microsoft Sentinel, CrowdStrike Falcon LogScale (formerly Humio), Sumo Logic, Datadog Cloud SIEM
The platform matters less than the commitment to using it. An open-source SIEM with well-tuned detection rules and an analyst reviewing alerts daily outperforms an expensive SIEM that’s collecting logs nobody looks at.
Log Parsing and Normalization
Raw logs from different sources use different formats: syslog, JSON, CSV, Windows Event XML, proprietary formats. Normalize them into a common schema so you can correlate events across sources.
The Elastic Common Schema (ECS) and OCSF (Open Cybersecurity Schema Framework) provide standardized field mappings. Normalizing means that “source_ip” from your firewall and “ClientIP” from your web server both map to the same field in your SIEM. Without normalization, cross-source correlation is manual and painful.
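As a rough illustration of what normalization does, the sketch below renames source-specific fields to ECS-style dotted names. The source names (`firewall`, `webserver`) and the raw field names in `FIELD_MAPS` are hypothetical; in practice your SIEM's ingest pipeline or parser configuration would hold the equivalent mappings.

```python
from typing import Any

# Hypothetical per-source field maps. Target names follow ECS naming conventions.
FIELD_MAPS: dict[str, dict[str, str]] = {
    "firewall": {
        "source_ip": "source.ip",
        "dest_ip": "destination.ip",
        "action": "event.action",
    },
    "webserver": {
        "ClientIP": "source.ip",
        "Status": "http.response.status_code",
        "Url": "url.original",
    },
}


def normalize(source: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Rename known fields to the common schema; keep unknown ones under 'unmapped.'."""
    mapping = FIELD_MAPS[source]
    event: dict[str, Any] = {"observer.type": source}
    for key, value in raw.items():
        event[mapping.get(key, f"unmapped.{key}")] = value
    return event
```

After normalization, a firewall deny and a web request from the same address both carry `source.ip`, so a single query on that field finds both. Keeping unmapped fields rather than dropping them preserves evidence you have not yet decided how to model.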
Retention Policy
How Long to Keep Logs
Retention is a balance between forensic value, compliance requirements, and storage costs.
| Log Type | Minimum Retention | Rationale |
|---|---|---|
| Authentication | 1 year | Compliance (PCI-DSS requires 1 year, 3 months immediately available). Forensic investigation of long-dwell-time breaches. |
| Firewall/network | 90 days - 1 year | Network forensics, lateral movement investigation. |
| Application access | 90 days - 1 year | User activity investigation, insider threat. |
| System/configuration changes | 1 year | Change attribution, root cause analysis. |
| DNS queries | 90 days | C2 detection, data exfiltration investigation. |
| Full packet capture | 7-30 days | Storage-intensive; keep only for critical segments. |
Regulatory requirements set the floor:
- PCI-DSS — Audit trail history for at least one year; at least three months immediately available for analysis.
- HIPAA — Six years for documentation; log retention not explicitly specified but implied by audit requirements.
- SOX — Seven years for audit-related records.
- GDPR — Logs containing personal data are subject to data minimization. Don’t retain longer than necessary.
Use tiered storage: hot storage (fast search, recent data) for 30-90 days, warm storage (slower search, older data) for the remainder of your retention period, cold/archive storage (retrieval takes hours) for long-term compliance retention.
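The tiering rule can be expressed as a small function of event age, and the same boundaries give a first-order storage estimate. In this sketch the default boundaries (90 days hot, 365 days warm, 2,555 days archive) are assumptions chosen to match the examples in this section, and the volume estimate ignores compression and indexing overhead.

```python
from datetime import datetime, timedelta


def storage_tier(
    event_time: datetime,
    now: datetime,
    hot_days: int = 90,
    warm_days: int = 365,
    archive_days: int = 2555,
) -> str:
    """Return the tier an event belongs in, or 'expired' once retention has passed."""
    age = now - event_time
    if age < timedelta(0):
        raise ValueError("event_time is in the future")
    if age <= timedelta(days=hot_days):
        return "hot"
    if age <= timedelta(days=warm_days):
        return "warm"
    if age <= timedelta(days=archive_days):
        return "cold"
    return "expired"


def steady_state_gb(
    daily_gb: float,
    hot_days: int = 90,
    warm_days: int = 365,
    archive_days: int = 2555,
) -> dict[str, float]:
    """Estimate raw volume held in each tier once the pipeline has run a full cycle."""
    return {
        "hot": daily_gb * hot_days,
        "warm": daily_gb * (warm_days - hot_days),
        "cold": daily_gb * (archive_days - warm_days),
    }
```

For a source producing 10 GB per day, these defaults work out to 900 GB hot, 2,750 GB warm, and 21,900 GB cold, which shows why most of the cost discussion ends up being about the archive tier.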
Alert Tuning
The Alert Fatigue Problem
An untuned SIEM can generate thousands of alerts per day. Analysts start ignoring them, and the real attack hides in the noise. Alert fatigue is one of the most common reasons organizations with SIEMs still miss breaches.
How to Tune
Start with high-fidelity detections:
- Login from impossible travel (same account, two locations, insufficient travel time)
- New admin account created outside normal provisioning process
- Service account used interactively
- PowerShell or command-line execution by a non-IT account
- Disabled antivirus/EDR on an endpoint
- Bulk file download or access outside normal patterns
- DNS queries to known malicious domains (threat intelligence feeds)
- Failed logins followed by a successful login from a different source IP (possible password guessing, credential stuffing, or account takeover)
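The impossible-travel detection in the list above comes down to a distance-over-time check. The sketch below is one way to express it, assuming each login has already been geolocated to a latitude and longitude. The `Login` type, the 900 km/h speed ceiling, and the 50 km noise floor are illustrative assumptions, and real geo-IP data is imprecise enough (VPNs, mobile carriers) that results need tuning.

```python
import math
from dataclasses import dataclass
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Login:
    user: str
    when: datetime
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(
    a: Login,
    b: Login,
    max_speed_kmh: float = 900.0,    # assumption: roughly airliner cruising speed
    min_distance_km: float = 50.0,   # assumption: ignore geo-IP jitter within a metro area
) -> bool:
    """True if the same user would have had to travel faster than max_speed_kmh."""
    if a.user != b.user:
        return False
    distance = haversine_km(a.lat, a.lon, b.lat, b.lon)
    if distance < min_distance_km:
        return False
    hours = abs((b.when - a.when).total_seconds()) / 3600
    if hours == 0:
        return True
    return distance / hours > max_speed_kmh
```

A login from New York followed one hour later by a login from London would be flagged; the same pair ten hours apart would not.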
Reduce noise:
- Whitelist known-good activity that triggers false positives. Document every whitelist entry with justification and review date.
- Set thresholds: a single failed login is noise; ten failed logins to the same account in five minutes is a likely brute-force attempt.
- Correlate across sources — a failed VPN login alone is low priority; a failed VPN login followed by a successful login from a different country followed by access to the file server at 3 AM is high priority.
- Use MITRE ATT&CK as a detection framework. Map your detection rules to ATT&CK techniques to identify coverage gaps and prioritize new detections.
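The threshold rule above (ten failed logins to one account within five minutes) is a sliding-window count. A minimal sketch follows, assuming failed-login events arrive in timestamp order; the class name `FailedLoginDetector` is hypothetical, and a SIEM would normally express the same logic in its own rule language.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class FailedLoginDetector:
    """Flag an account once it accumulates `threshold` failures inside `window`."""

    def __init__(self, threshold: int = 10, window: timedelta = timedelta(minutes=5)):
        self.threshold = threshold
        self.window = window
        self._failures: dict[str, deque] = defaultdict(deque)

    def observe(self, account: str, when: datetime) -> bool:
        """Record one failed login; return True if the threshold is now met."""
        recent = self._failures[account]
        recent.append(when)
        # Drop failures that have fallen out of the window.
        while recent and when - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold
```

Ten failures spaced twenty seconds apart trip the detector on the tenth event, while one failure per minute never does, because at most six timestamps fit inside a five-minute window.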
Measure alert quality:
- Track the true positive rate (strictly speaking, alert precision): what percentage of alerts result in confirmed incidents?
- Track the alert-to-investigation ratio: how many alerts does an analyst review per actual investigation?
- If the true positive rate is below 20%, your alerts need tuning, not more analysts.
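These measures are simple ratios, and computing them per detection rule makes it clear which rules to tune first. A small sketch follows; the function name and the dictionary keys are illustrative, and the 20% cutoff is the rule of thumb from the list above rather than a standard.

```python
def alert_quality(alerts_fired: int, investigations: int, confirmed_incidents: int) -> dict:
    """Summarize alert quality for one rule (or one SIEM) over a reporting period."""
    if alerts_fired <= 0:
        raise ValueError("alerts_fired must be positive")
    # Share of alerts that were confirmed incidents (the "true positive rate" above).
    precision = confirmed_incidents / alerts_fired
    # Alerts an analyst reviews per investigation actually opened.
    alerts_per_investigation = (
        alerts_fired / investigations if investigations else float("inf")
    )
    return {
        "precision": precision,
        "alerts_per_investigation": alerts_per_investigation,
        "needs_tuning": precision < 0.20,
    }
```

For example, 500 alerts leading to 50 investigations and 25 confirmed incidents gives a precision of 5% and ten alerts reviewed per investigation, which by the rule of thumb calls for tuning.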
Log Integrity
Why Log Integrity Matters
If an attacker can modify or delete logs after compromising a system, they can erase evidence of the breach, hide their persistence mechanisms, and make forensic investigation impossible. Protecting log integrity is protecting your ability to detect, investigate, and prove what happened.
How to Protect Logs
- Ship logs off-source immediately — Logs should leave the source system in real time. If the source is compromised, the logs are already safe in the central platform.
- Write-once storage: Store logs in immutable storage. S3 Object Lock (in compliance mode), Azure immutable blob storage, or WORM (Write Once Read Many) appliances can prevent modification or deletion even by administrators.
- Separate access controls — The team that administers production systems should not have delete access to the log platform. Separation of duties prevents an insider (or a compromised admin account) from covering tracks.
- Log integrity verification — Hash log files and store hashes separately. Periodically verify that stored logs match their recorded hashes. Any mismatch indicates tampering.
- Log the log access: Meta-monitoring answers who accessed the SIEM, what queries they ran, and whether anyone attempted to delete or modify log data.
NIST SP 800-92 covers log integrity requirements in detail.
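The hash-and-verify step from the list above can be sketched with Python's standard `hashlib`. The `*.log` file pattern and the in-memory manifest are assumptions for illustration; in practice the manifest would be written somewhere the source system's administrators cannot modify, such as the immutable storage described above.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large logs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_hashes(log_dir: Path) -> dict[str, str]:
    """Build a manifest mapping each log file name to its SHA-256 digest."""
    return {p.name: sha256_file(p) for p in sorted(log_dir.glob("*.log"))}


def verify_hashes(log_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the names of files that are missing or no longer match the manifest."""
    problems = []
    for name, expected in manifest.items():
        path = log_dir / name
        if not path.is_file() or sha256_file(path) != expected:
            problems.append(name)
    return problems
```

Run `record_hashes` when a log file is closed or rotated, store the result separately, and run `verify_hashes` on a schedule. A non-empty result means a file was altered or removed after it was hashed. This only works for files that are no longer being appended to.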
If It Already Happened
If you’re investigating an incident and discovering that the logs you need don’t exist or have been tampered with:
- Collect what’s available — Even partial logs have forensic value. Memory dumps, disk images, and network flow data can fill some gaps.
- Check secondary log sources — Cloud provider logs, DNS provider logs, ISP flow data, and third-party service logs may capture activity that your internal logs missed.
- Engage external IR support — Forensic firms have tools and techniques for recovering evidence from systems with incomplete logs.
- Build the logging pipeline now — The incident just demonstrated the cost of insufficient logging. Use it as the business case for investment.
- Report through CISA. Even with incomplete evidence, reporting helps CISA track threat patterns and issue advisories that help others.
Logs are evidence, detection, and accountability. Start with authentication logs — every login, every failure, every privilege change. Centralize them today. Set up three high-fidelity alerts. Tune them until they’re actionable. Then expand: add network logs, application logs, endpoint logs, one source at a time. The pipeline you build now determines whether you catch the next breach in hours or discover it in months.