
YARA Rule Writing Cheat Sheet
Single-page reference for malware analysts — syntax, modifiers, performance and the gotchas that get rules wrong in production.
YARA is the de-facto pattern-matching language for malware analysts. The syntax is small, the conditions are flexible, and the rules run anywhere — endpoint EDR, mail gateway, SIEM, retro-hunting buckets. This cheat sheet covers the syntax that matters in production plus the mistakes that quietly turn good intel into noise.
1. Rule anatomy
A YARA rule has three sections: meta (free-form attributes), strings (the patterns you'll match), and condition (the boolean expression that decides hits). Strings are referenced by their $variable name in the condition.
2. String types
| Type | Syntax | Notes |
|---|---|---|
| Text | $a = "GET /api/v1" | Quoted, case-sensitive by default |
| Text (nocase) | $a = "admin" nocase | Case-insensitive matching |
| Text (wide) | $a = "admin" wide | Match UTF-16LE strings (Windows PE common) |
| Hex | $h = { 4D 5A ?? ?? 50 45 } | Bytes, with ?? wildcards and [n-m] ranges |
| Regex | $r = /admin[0-9]{1,3}/ | Use sparingly — perf hit |
3. Condition shortcuts
- any of them — at least one $string hits
- all of them — every $string must hit
- N of them — at least N hits
- $a at 0 — anchored at file offset
- filesize < 10MB — file-size gating to skip big files cheaply
- uint16(0) == 0x5A4D — PE magic gate; runs only on Windows EXEs
4. Performance — the only thing that matters at scale
- Add a filesize gate first; nothing is cheaper than skipping the file
- Gate by magic bytes (uint16(0) == 0x5A4D for PE, 0x7F454C46 for ELF)
- Avoid regex strings unless you've proved a text/hex alternative won't work
- Use 4+ byte hex anchors; 2-byte hex matches everything
- Test on a benign-file corpus (Govt-clean-set, Windows ISO, Linux distro) before deploy
5. The five mistakes that ruin rules in production
- 1. Matching strings the malware borrowed from a public library (curl UA, OpenSSL banner) — false-positives on every benign binary that imports the same library.
- 2. Single 4-byte hex pattern as the only signal — collides with random data in compressed archives.
- 3. Using $string at offset against a packed sample — the unpacked image moves the offset.
- 4. Forgetting `wide` for Windows malware that uses UTF-16LE — half the malware family is missed.
- 5. Author / threat-actor names in meta without verification — survives the binary update; rule becomes wrong after the attacker refactors.
Every production rule should require ≥ 2 distinct strings to hit. Single-string rules survive one malware update, then they're noise. Two-or-three string rules tied by `and` survive the next variant.
6. Template you can paste
rule Suspicious_Loader_Macksofy_2026 { meta: author = "Macksofy DFIR" date = "2026-05" hash1 = "a1b2c3..." reference = "INT-2026-042" strings: $s1 = "HelloKitty.dll" wide nocase $s2 = { 48 8B ?? ?? 48 89 ?? ?? E8 } $s3 = /\/cmd\/[a-f0-9]{16}/ condition: filesize < 5MB and uint16(0) == 0x5A4D and 2 of them }
Macksofy offers full-service engagements that map directly to this resource. Common starting points:
- Malware Analysis & Reverse Engineering →
- Digital Forensics & Incident Response (DFIR) →
- Cyber Threat Intelligence →
