/\                           _  _        _    ____  
|/\|_ __ ___  __ _  _____  __| || |      / \  | __ ) 
   | '__/ _ \/ _` |/ _ \ \/ / __) |     / _ \ |  _ \ 
   | | |  __/ (_| |  __/>  <\__ \ |___ / ___ \| |_) |
   |_|  \___|\___,|\___/_/\_(   /_____/_/   \_\____/ 
             |___/           |_|

Regular Expressions Cheatsheet

A quick reference for pattern matching — Python · JavaScript · Go · PHP · Java · Ruby

Anchors

^         Start of string / line
$         End of string / line
\b        Word boundary
\B        Not a word boundary
\A        Start of string (no multiline)
\Z        End of string, or just before a final newline (no multiline); absolute end in Python
\z        Absolute end of string
\G        Start of current search (pos)
\K        Reset match start (keep left)
(?m)^     Start of each line (multiline)
(?m)$     End of each line (multiline)
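
A minimal Python sketch of how multiline mode changes ^ while \A stays pinned to the start of the string. The sample text and variable names are made up for illustration.

```python
import re

text = "first line\nsecond line\n"

# Without re.MULTILINE, ^ only matches at the very start of the string.
starts_default = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
starts_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \A keeps its meaning even when multiline mode is on.
starts_absolute = re.findall(r"\A\w+", text, flags=re.MULTILINE)

print(starts_default)    # ['first']
print(starts_multiline)  # ['first', 'second']
print(starts_absolute)   # ['first']
```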

Character Classes

.              Any char except newline
\w             Word char [a-zA-Z0-9_]
\W             Non-word character
\d             Digit [0-9]
\D             Non-digit
\s             Whitespace [ \t\n\r\f]
\S             Non-whitespace
[abc]          Any of a, b, or c
[^abc]         Not a, b, or c
[a-z]          Range a to z
[a-zA-Z0-9]    Alphanumeric
\p{L}          Unicode letter (PCRE/Java)
\p{N}          Unicode number

Quantifiers

*        0 or more (greedy)
+        1 or more (greedy)
?        0 or 1 (greedy)
{n}      Exactly n times
{n,}     n or more times
{n,m}    Between n and m times
*?       0 or more (lazy)
+?       1 or more (lazy)
??       0 or 1 (lazy)
*+       0 or more (possessive)
++       1 or more (possessive)
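
The difference between greedy and lazy quantifiers is easiest to see side by side. A small Python sketch, using a made-up HTML snippet:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the end of the string, then backs off to the last ">".
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```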

Groups

(abc)            Capture group
(?:abc)          Non-capturing group
(?<name>abc)     Named capture group
(?P<name>abc)    Named group (Python)
\1               Backreference to group 1
(?P=name)        Named backreference (Python)
(?>abc)          Atomic group (no backtrack)
a|b              Alternation: a or b
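
A short Python sketch combining a named group with a named backreference to find a doubled word. The pattern name and sample sentence are illustrative only.

```python
import re

# (?P<word>...) names the group; (?P=word) requires the same text again.
doubled = re.compile(r"\b(?P<word>\w+)\s+(?P=word)\b", re.IGNORECASE)

m = doubled.search("This is is a test")
repeated = m.group("word")
position = m.span()

print(repeated)  # is
print(position)  # (5, 10)
```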

Lookarounds

(?=abc)     Positive lookahead
(?!abc)     Negative lookahead
(?<=abc)    Positive lookbehind
(?<!abc)    Negative lookbehind
Lookahead example
\d+(?=px) → matches digits before "px"
Lookbehind example
(?<=\$)\d+ → matches digits after "$"
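
The two examples above can be tried in Python as follows. The input strings are invented for the demonstration; the patterns are the ones listed.

```python
import re

css = "width: 120px; margin: 4em; height: 80px"
price = "Total: $45 plus $5 shipping, 3 items"

# Lookahead: digits only when "px" follows; "px" itself is not consumed.
px_values = re.findall(r"\d+(?=px)", css)

# Lookbehind: digits only when "$" comes immediately before.
dollar_amounts = re.findall(r"(?<=\$)\d+", price)

print(px_values)       # ['120', '80']
print(dollar_amounts)  # ['45', '5']
```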

Flags / Modifiers

i         Case insensitive
g         Global: find all matches
m         Multiline: ^ and $ per line
s         Dotall: . matches \n too
x         Verbose: ignore whitespace
u         Unicode mode
(?ims)    Multiple inline flags
(?-i)     Turn off flag inline
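
In Python the flags map to constants such as re.VERBOSE and re.DOTALL (there is no g flag; re.findall and re.finditer cover that case). A hedged sketch with made-up inputs:

```python
import re

# x / re.VERBOSE: whitespace and # comments inside the pattern are ignored.
time_re = re.compile(
    r"""
    (?P<hour>[01]?\d|2[0-3])   # hours 0-23, leading zero optional
    :                          # literal separator
    (?P<minute>[0-5]\d)        # minutes 00-59
    """,
    re.VERBOSE,
)
m = time_re.fullmatch("23:59")
hour, minute = m.group("hour"), m.group("minute")

# i as an inline flag at the start of the pattern.
inline_i = re.search(r"(?i)hello", "Say HELLO") is not None

# s / re.DOTALL: without it, . stops at a newline.
dot_default = re.search(r"a.b", "a\nb") is not None
dot_all = re.search(r"a.b", "a\nb", flags=re.DOTALL) is not None

print(hour, minute)          # 23 59
print(inline_i)              # True
print(dot_default, dot_all)  # False True
```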

Escape Sequences

\n                  Newline (LF, U+000A)
\r                  Carriage return (CR, U+000D)
\t                  Horizontal tab (U+0009)
\v                  Vertical tab (U+000B)
\f                  Form feed (U+000C)
\a                  Bell (U+0007)
\e                  Escape (U+001B)
\0                  Null character (U+0000)
\xFF                Hex byte (e.g. \x41 = A)
\uFFFF              Unicode code point (4 hex)
\u{1F600}           Unicode (JS with /u flag)
\cX                 Control char (e.g. \cM = CR)
\\                  Literal backslash
\. \* \+ \? \^ \$   Literal meta characters
\( \) \[ \] \{ \}   Literal brackets
\| \/               Literal pipe / slash

Replacement / Substitution

$1 or \1    Insert capture group 1
${1}        Group 1 (unambiguous)
$&          Entire match
$`          String before match
$'          String after match
$$          Literal $ in replacement
$<name>     Named group (JS)
\g<name>    Named group (Python)
\g<1>       Group by number (Python)
${name}     Named group (.NET / Java)
\U\1        Uppercase group (Perl/PCRE)
\L\1        Lowercase group (Perl/PCRE)
\E          End \U or \L
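
A Python sketch of the \g<name> and \g<1> forms. Python's re module has no \U or \L case operators, so the last line uses a replacement function instead; that substitution is my own workaround, not something the table above prescribes. All inputs are illustrative.

```python
import re

date = "2024-03-15"
pattern = r"(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})"

# \g<name> inserts a named group.
swapped = re.sub(pattern, r"\g<d>/\g<m>/\g<y>", date)

# \g<1> is needed when a digit follows: r"\10" would mean group 10.
padded = re.sub(r"(\d+)", r"\g<1>0", "7 apples")

# A function as the replacement stands in for \U on the first letter.
capitalized = re.sub(r"\b(\w)", lambda m: m.group(1).upper(), "hello world")

print(swapped)      # 15/03/2024
print(padded)       # 70 apples
print(capitalized)  # Hello World
```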

Common Patterns

[\w.+-]+@[\w-]+\.[a-z]{2,}
    Email
https?://[\w./%-]+(\?[\w=&%-]+)?
    URL
\d{1,3}(\.\d{1,3}){3}
    IPv4 address
([0-9a-f]{2}:){5}[0-9a-f]{2}
    MAC address
^\+?[\d\s()\-]{7,15}$
    Phone number
^#?[0-9a-fA-F]{6}$
    Hex color (#RRGGBB)
^#?([0-9a-fA-F]{3}){1,2}$
    Hex color (3 or 6 digits)
\d{4}-\d{2}-\d{2}
    Date YYYY-MM-DD
\d{2}[\/.-]\d{2}[\/.-]\d{4}
    Date DD/MM/YYYY
\d{2}:\d{2}(:\d{2})?
    Time HH:MM[:SS]
^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}$
    Strong password
^[a-zA-Z0-9_]{3,20}$
    Username
^\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}$
    Credit card number
^[a-z][a-z0-9\-]{1,61}[a-z0-9]$
    Domain name
[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}
    UUID / GUID
<[^>]+>
    HTML tag (basic)
\/\*[\s\S]*?\*\/
    /* Block comment */
^\s+|\s+$
    Leading/trailing whitespace
\b\d+(\.\d+)?\b
    Integer or float
^([01]?\d|2[0-3]):[0-5]\d$
    24-hour time HH:MM
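
Several of these patterns check shape only. The IPv4 pattern, for example, accepts 999.999.999.999, so a range check is still needed. A minimal Python sketch; the function name is_ipv4 is hypothetical, and re.ASCII is added so \d means [0-9] only.

```python
import re

# Shape check only: four dot-separated groups of 1 to 3 digits.
IPV4_SHAPE = re.compile(r"\d{1,3}(\.\d{1,3}){3}", re.ASCII)

def is_ipv4(text: str) -> bool:
    """Return True if text looks like a dotted quad with every octet <= 255."""
    if IPV4_SHAPE.fullmatch(text) is None:
        return False
    return all(int(octet) <= 255 for octet in text.split("."))

print(is_ipv4("192.168.1.42"))     # True
print(is_ipv4("999.999.999.999"))  # False
print(is_ipv4("1.2.3"))            # False
```

fullmatch is used instead of adding ^ and $ so the unanchored pattern cannot match a substring of a longer input.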

JavaScript

/pattern/flags            Regex literal
new RegExp(str, flags)    Dynamic regex
re.test(str)              Returns true/false
str.match(re)             Array of matches or null
str.matchAll(re)          Iterator of all matches
str.search(re)            Index of first match or -1
str.replace(re, '$1')     Replace with string or fn
str.replaceAll(re, s)     Replace all (needs g flag)
str.split(re)             Split by pattern

Python (re module)

re.match(p, s)        Match at start of string
re.search(p, s)       Search anywhere in string
re.findall(p, s)      List of all matches
re.finditer(p, s)     Iterator of match objects
re.sub(p, repl, s)    Replace matches
re.subn(p, repl, s)   Replace + count
re.split(p, s)        Split by pattern
re.compile(p, re.I)   Compile with flags
m.group(1)            Get capture group value
m.groupdict()         Named groups as dict
m.span()              Start and end positions
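
A compact tour of the calls above on one made-up input string. The names log and pair are illustrative.

```python
import re

log = "err=12 warn=3 err=7"
pair = re.compile(r"(?P<level>\w+)=(?P<count>\d+)")

at_start = re.match(r"warn", log)    # None: "warn" is not at position 0
anywhere = re.search(r"warn", log)   # match object at span (7, 11)

all_pairs = pair.findall(log)        # tuples, because the pattern has groups
hits = [(m.group("level"), int(m.group("count")), m.span())
        for m in pair.finditer(log)]
first = pair.search(log).groupdict()
masked, count = re.subn(r"\d+", "N", log)

print(all_pairs)      # [('err', '12'), ('warn', '3'), ('err', '7')]
print(hits[1])        # ('warn', 3, (7, 13))
print(first)          # {'level': 'err', 'count': '12'}
print(masked, count)  # err=N warn=N err=N 3
```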

POSIX Classes

[:alpha:]     Letters [a-zA-Z]
[:digit:]     Digits [0-9]
[:alnum:]     Letters and digits
[:space:]     Whitespace
[:upper:]     Uppercase letters
[:lower:]     Lowercase letters
[:punct:]     Punctuation
[:print:]     Printable characters
[:xdigit:]    Hex digits [0-9a-fA-F]

MQTT & IoT

Topic validation
^(?:[^/+#\x00-\x1f\x7f]+)(?:\/(?:[^/+#\x00-\x1f\x7f]*))*$
    Publish topic: no wildcards, no control chars; from MQTT spec 4.7 + Node-RED isValidPublishTopic()
^(?:[^+#\x00-\x1f\x7f]*|\+)(?:\/(?:[^+#\x00-\x1f\x7f]*|\+))*(?:\/(#))?$
    Subscribe topic filter: + single-level and # multi-level (end only); from MQTT spec + mqtt-regex
(?:^|\/)\+(?:\/|$)
    Single-level wildcard +: finds + tokens inside a topic filter (mqtt-regex process_single)
(?:^|\/)#$
    Multi-level wildcard #: valid only at end of topic filter (mqtt-regex process_multi)
(?:^|\/)(\+|#|[^/+#\x00-\x1f\x7f]*)(?=\/|$)
    Tokenize topic: splits into level segments, wildcard-aware (mqtt-regex tokenize)
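
A hedged Python sketch that applies the publish-topic pattern from this list. The function name is_valid_publish_topic is my own, and fullmatch is used because Python's $ would otherwise also match just before a trailing newline.

```python
import re

# Publish-topic pattern copied from the list above.
PUBLISH_TOPIC = re.compile(
    r"^(?:[^/+#\x00-\x1f\x7f]+)(?:\/(?:[^/+#\x00-\x1f\x7f]*))*$"
)

def is_valid_publish_topic(topic: str) -> bool:
    """True when the whole topic matches: no wildcards, no control chars."""
    return PUBLISH_TOPIC.fullmatch(topic) is not None

print(is_valid_publish_topic("home/living/temperature"))  # True
print(is_valid_publish_topic("home/+/temperature"))       # False
print(is_valid_publish_topic("sensors/#"))                # False
```

As written, the pattern requires a non-empty first level, so a topic with a leading slash is rejected by this check.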
Topic patterns by platform
^(?:myhome|home)\/(?:[a-z0-9]+)\/(?:[a-z0-9]+)\/(?:temperature|humidity|pressure|motion|light|power)$
    Home sensor topic: home/<floor>/<room>/<measurement>; from HiveMQ MQTT Essentials Part 5
^homeassistant\/([a-z_]+)\/([a-zA-Z0-9_-]+)(?:\/([a-zA-Z0-9_-]+))?\/(?:config|state|set|status|availability)$
    Home Assistant MQTT Discovery: homeassistant/<component>/[node_id/]<object_id>/<suffix>
^zigbee2mqtt\/(?:bridge\/(?:state|config|devices|groups|info|logging|request\/.+|response\/.+)|([^/]+)(?:\/(?:set|get|availability))?)$
    Zigbee2MQTT topic: bridge control, or <device_id> with optional /set, /get or /availability suffix (from mqtt.ts)
^\$SYS\/broker\/(?:clients\/(?:connected|disconnected|maximum|total)|messages\/(?:sent|received|stored)|uptime|version|load\/\w+)$
    Mosquitto $SYS: broker stats for clients, messages, uptime, version, load
^\$aws\/things\/[^/]+\/shadow\/(?:get|update|delete)$
    AWS IoT Core device shadow: get, update or delete operations
^devices\/([^\/]+)\/messages\/(?:events|devicebound)
    Azure IoT Hub: device-to-cloud events and cloud-to-device messages
^[A-Z]{2,6}\/[A-Z0-9_]+\/[A-Z0-9_]+$
    Industrial SCADA topic: AREA/DEVICE/PARAM naming used in factory automation
Mosquitto log parsing
^(\d{10}):\s+(.+)$
    Mosquitto log line: splits Unix epoch timestamp and message body (default, no log_timestamp_format)
^(\d{10}|(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})):\s+New (?:bridge )?client connected from ([\d.]+):(\d+) as (\S+) \(p(\d+), c([01]), k(\d+)(?:, u'([^']*)')?\)
    Mosquitto CONNECT: captures ts, IP, port, clientId, protocol, clean-session, keepalive, username
Received PUBLISH from (\S+) \(d(\d), q(\d), r(\d), m(\d+), '([^']+)', \.\.\. \((\d+) bytes\)\)
    Mosquitto PUBLISH debug: clientId, dup, QoS, retain, msgId, topic, bytes (handle_publish.c)
^(\d{10}|(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})):\s+Client (\S+) disconnected\.
    Mosquitto DISCONNECT: captures timestamp and clientId
\bas\s+([a-zA-Z0-9_:\-\.]+)\s+\(
    Extract clientId from Mosquitto CONNECT log: 'as <clientId> (' format (handle_connect.c)
\bq(?:os)?[=:\s]?([0-2])\b
    QoS level from Mosquitto debug: matches (d0, q1, r0) or subscribe log entries
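
A Python sketch that runs the CONNECT pattern from this list against the sample log line given later in this section. The dictionary keys are my own labels; note that group 2 is the inner ISO-timestamp group, so the remaining fields start at group 3.

```python
import re

# CONNECT pattern copied from the list above, split across lines for readability.
CONNECT = re.compile(
    r"^(\d{10}|(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})):\s+"
    r"New (?:bridge )?client connected from ([\d.]+):(\d+) as (\S+) "
    r"\(p(\d+), c([01]), k(\d+)(?:, u'([^']*)')?\)"
)

line = ("1711234567: New client connected from 192.168.1.42:49823 "
        "as esp_sensor_01 (p2, c1, k60, u'homelab')")

m = CONNECT.match(line)
info = {
    "timestamp": m.group(1),
    "ip": m.group(3),
    "port": int(m.group(4)),
    "client_id": m.group(5),
    "protocol": int(m.group(6)),
    "clean_session": m.group(7) == "1",
    "keepalive": int(m.group(8)),
    "username": m.group(9),  # None when the u'...' part is absent
}
print(info["client_id"], info["ip"], info["port"])  # esp_sensor_01 192.168.1.42 49823
```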
Payload & device identifiers
\{[\s\S]*\}
    JSON payload: extract inline JSON from log output or MQTT debug capture
"(?:temperature|temp|t)"\s*:\s*(-?\d+(?:\.\d+)?)|"(?:humidity|hum|h)"\s*:\s*(\d+(?:\.\d+)?)
    Temp / humidity: extract numeric values from JSON sensor payload (HA docs format)
^(ON|OFF|true|false|1|0|OPEN|CLOSED|LOCK|UNLOCK|ONLINE|OFFLINE)$
    Boolean / state payload: switch, lock, cover, availability (Home Assistant MQTT)
"(?:lat(?:itude)?|lon(?:gitude)?|lng|alt(?:itude)?)"\s*:\s*(-?\d+\.\d+)
    GPS payload: extracts lat/lon/altitude from JSON tracker topic (HiveMQ asset tracking example)
"battery"\s*:\s*(\d{1,3})
    Battery level: extract % from JSON, useful for low-charge alerting
(?:[0-9A-Fa-f]{2}[:\-]){5}[0-9A-Fa-f]{2}
    MAC address: device identifier in topic paths and Zigbee2MQTT payloads
[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}
    Device UUID: cloud IoT platform identifier embedded in topic or payload
\besp(?:_|-)?[a-z0-9_-]{4,32}\b
    ESP8266/ESP32 client ID: matches esp_device_01, esp-abc123 in Mosquitto CONNECT logs
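
A sketch of the temperature/humidity and battery patterns applied to the sample payload shown at the end of this section. When the payload is complete, valid JSON, a JSON parser is likely the sturdier choice; the regex route mainly helps with fragments embedded in log lines. Variable names are illustrative.

```python
import json
import re

payload = '{"temperature":21.4,"humidity":58,"battery":87,"status":"online"}'

# Patterns copied from the list above.
TEMP_HUM = re.compile(
    r'"(?:temperature|temp|t)"\s*:\s*(-?\d+(?:\.\d+)?)'
    r'|"(?:humidity|hum|h)"\s*:\s*(\d+(?:\.\d+)?)'
)
BATTERY = re.compile(r'"battery"\s*:\s*(\d{1,3})')

temperature = humidity = None
for m in TEMP_HUM.finditer(payload):
    if m.group(1) is not None:   # left branch: temperature
        temperature = float(m.group(1))
    if m.group(2) is not None:   # right branch: humidity
        humidity = float(m.group(2))

battery = int(BATTERY.search(payload).group(1))

# Same values via a real JSON parser, for comparison.
parsed = json.loads(payload)

print(temperature, humidity, battery)  # 21.4 58.0 87
```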
Timestamps
^(\d{10})(?=:\s)
    Unix epoch (seconds) at start of Mosquitto log line: 10 digits before ': ' (logging.c)
\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?
    ISO 8601 timestamp: when Mosquitto log_timestamp_format = %Y-%m-%dT%H:%M:%S is set
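
A small Python sketch that pulls the epoch prefix from a log line and converts it to a timezone-aware datetime. Interpreting the value as UTC seconds is an assumption on my part; the log line is the sample from this section.

```python
import re
from datetime import datetime, timezone

# Epoch-prefix pattern copied from the list above.
EPOCH_PREFIX = re.compile(r"^(\d{10})(?=:\s)")

line = ("1711234567: New client connected from 192.168.1.42:49823 "
        "as esp_sensor_01 (p2, c1, k60, u'homelab')")

m = EPOCH_PREFIX.match(line)
seconds = int(m.group(1))

# Assumption: the log timestamp is seconds since the Unix epoch, in UTC.
when = datetime.fromtimestamp(seconds, tz=timezone.utc)

print(when.isoformat())  # 2024-03-23T22:56:07+00:00
```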
Topic wildcard examples
home/+/temperature → matches home/living/temperature, home/bedroom/temperature
sensors/# → matches sensors/any/depth/path
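
One way to test a topic against a filter is to translate the filter into a regex. The sketch below is a simplified illustration, not the mqtt-regex implementation: filter_to_regex and topic_matches are hypothetical helpers, and the special handling of topics that start with $ is left out.

```python
import re

def filter_to_regex(topic_filter: str) -> re.Pattern:
    """Translate an MQTT topic filter into a compiled regex (simplified)."""
    levels = topic_filter.split("/")
    multi = levels[-1] == "#"
    if multi:
        levels = levels[:-1]
    if "#" in levels:
        raise ValueError("# is only valid as the last level")
    # + matches exactly one level; everything else is taken literally.
    body = "/".join("[^/]*" if level == "+" else re.escape(level)
                    for level in levels)
    if multi:
        # "a/#" also matches the parent "a"; a bare "#" matches everything.
        body = f"{body}(?:/.*)?" if levels else ".*"
    return re.compile(body, re.DOTALL)

def topic_matches(topic_filter: str, topic: str) -> bool:
    return filter_to_regex(topic_filter).fullmatch(topic) is not None

print(topic_matches("home/+/temperature", "home/living/temperature"))  # True
print(topic_matches("home/+/temperature", "home/a/b/temperature"))     # False
print(topic_matches("sensors/#", "sensors/any/depth/path"))            # True
```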
Mosquitto log example (default format)
1711234567: New client connected from 192.168.1.42:49823 as esp_sensor_01 (p2, c1, k60, u'homelab')
JSON sensor payload (Home Assistant)
{"temperature":21.4,"humidity":58,"battery":87,"status":"online"}