Regex Characters: Mastering Literal Matches and Case Sensitivity
Learn the fundamentals of regex with a focus on literal characters and case sensitivity. Understand how to precisely match text patterns and customize regex behavior with case-insensitivity in your programming projects.
Characters in Regex: Building the Foundation
Understanding Literal Characters
At its core, a regular expression (regex) is a sequence of characters. Most of these are literal characters, which match themselves exactly in the input text. For instance, the regex /cat/ will match the exact sequence "cat" wherever it appears in a string, including inside a longer word such as "concatenate".
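A minimal sketch of a literal match, assuming Python and its standard `re` module (the article itself is language-neutral, and the sample strings are made up for illustration):

```python
import re

# The pattern "cat" consists only of literal characters.
pattern = re.compile(r"cat")

# search() scans the whole string and returns the first match, or None.
match = pattern.search("concatenate")
print(match.group())  # cat
print(match.span())   # (3, 6)

# There is no "cat" sequence in this input, so search() returns None.
missing = pattern.search("dog")
print(missing)        # None
```

In Python the pattern is written as a raw string (`r"cat"`) rather than between slashes; the `/cat/` notation used in this article is the delimiter style of languages such as JavaScript and Perl.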
Case Sensitivity
By default, regex is case-sensitive. This means that /cat/ will not match "Cat" or "CAT". However, many programming languages offer flags or modifiers to make regex case-insensitive.
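The difference can be sketched as follows, assuming Python's `re` module, where the case-insensitive flag is spelled `re.IGNORECASE`. In slash-delimited notation the same idea is usually written with an `i` suffix, as in /cat/i. The sample text is an illustrative assumption.

```python
import re

text = "Cat, CAT, and cat"

# Default behaviour: only the lowercase spelling matches.
sensitive = re.findall(r"cat", text)
print(sensitive)    # ['cat']

# re.IGNORECASE (short form: re.I) makes letters match regardless of case.
insensitive = re.findall(r"cat", text, flags=re.IGNORECASE)
print(insensitive)  # ['Cat', 'CAT', 'cat']
```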
Special Characters
While most characters match themselves directly, some have special meanings in regex. These are called metacharacters. They don't match themselves literally but represent pattern-matching constructs. We'll delve deeper into metacharacters in the next section.
Escape Sequences
To match a literal special character, you need to escape it using a backslash (\). This tells the regex engine to treat the following character as a literal rather than a metacharacter.
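To illustrate the effect of escaping, here is a small sketch assuming Python's `re` module. The inputs are invented for the example, and `re.escape` is a Python-specific helper; other languages may or may not offer an equivalent.

```python
import re

text = "3.14 3x14"

# Unescaped, "." is a metacharacter that matches any character except newline.
unescaped = re.findall(r"3.14", text)
print(unescaped)  # ['3.14', '3x14']

# Escaped, "\." matches only a literal dot.
escaped = re.findall(r"3\.14", text)
print(escaped)    # ['3.14']

# re.escape() adds the backslashes when the text is not known in advance.
safe = re.escape("$10")
print(safe)                                        # \$10
print(re.search(safe, "The cost is $10").group())  # $10
```

Escaping by hand is fine for fixed patterns. A helper like `re.escape` is the safer choice when the literal text comes from user input.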
Common Escape Sequences
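- \n: Newline
- \r: Carriage return
- \t: Tab
- \s: Whitespace (space, tab, newline, and similar)
- \d: Digit (equivalent to [0-9] in ASCII-only matching)
- \w: Word character (letters, digits, underscore)
- \D: Non-digit
- \W: Non-word character
- . (an unescaped dot, not an escape sequence): Matches any character except newline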
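The shorthand classes in this list can be tried out with a short sketch, again assuming Python's `re` module and made-up sample strings. One engine-specific caveat: for Python 3 text patterns, `\d` and `\w` also match non-ASCII digits and letters unless the `re.ASCII` flag is passed.

```python
import re

sample = "Room 42:\tok\n"

digits = re.findall(r"\d", sample)        # digit characters
spaces = re.findall(r"\s", sample)        # whitespace characters
non_word = re.findall(r"\W", sample)      # anything that is not a word character
non_digits = re.findall(r"\D", "a1b2")    # anything that is not a digit
any_char = re.findall(r".", "a\nb")       # "." skips the newline by default

print(digits)      # ['4', '2']
print(spaces)      # [' ', '\t', '\n']
print(non_word)    # [' ', ':', '\t', '\n']
print(non_digits)  # ['a', 'b']
print(any_char)    # ['a', 'b']
```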
Examples
Syntax
/\$100/: Matches the exact string "$100"
/\d+/: Matches one or more digits
/\w+/: Matches one or more word characters
Output
"$100", "12345", "Hello123"
Table of Character Matches
Regex Pattern | Input String | Matches |
---|---|---|
/H/ | "Hello World!" | "H" |
/Hello/ | "Hello World!" | "Hello" |
/World/ | "Hello World!" | "World" |
/he/ | "Hello World!" | No match (case-sensitive) |
/ / | "Hello World!" | Space character |
/\t/ | "Hello\tWorld" | Tab character |
/\$10/ | "The cost is $10" | "$10" |
Key Points
- Literal characters match themselves directly.
- Regex is case-sensitive by default.
- Special characters have specific meanings in regex.
- Use backslash to escape special characters.
Conclusion
By understanding these foundational concepts, you're well-prepared to explore the power of metacharacters and build more complex regex patterns.