Java Regular Expressions: Master Text Searching with Regex

Learn about Java regular expressions (regex), a powerful tool for searching and manipulating text based on specific patterns. Discover how regex allows you to efficiently find, match, and extract data within text using predefined criteria.



Java Regular Expressions

A regular expression (regex) is a sequence of characters that forms a search pattern, allowing you to search for data within text based on specific criteria.

The Java regex API lets you define patterns for searching or manipulating strings. It is widely used to enforce constraints on strings, such as password and email validation.

In Java, regular expressions are handled by the java.util.regex package, which includes:

  • Pattern Class: A compiled representation of a regular expression, used to define the search pattern.
  • Matcher Class: Performs match operations on input text using a Pattern.
  • PatternSyntaxException Class: An unchecked exception that indicates a syntax error in a regex pattern.
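As a minimal sketch of how PatternSyntaxException surfaces in practice, the snippet below checks whether a string compiles as a regex. The class name SyntaxCheckDemo and the helper isValidRegex are hypothetical names chosen for illustration; they are not part of java.util.regex.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class SyntaxCheckDemo {
    // Hypothetical helper: returns true if the string compiles as a regex.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Thrown by Pattern.compile() when the pattern is malformed
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+")); // true
        System.out.println(isValidRegex("[a-z"));   // false (unclosed character class)
        System.out.println(isValidRegex("(abc"));   // false (unclosed group)
    }
}
```

Because the exception is unchecked, the compiler does not force you to catch it; catching it is mainly useful when the pattern comes from user input.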

Matcher class

The Matcher class implements the MatchResult interface and is used to perform match operations on a character sequence.

No. Method Description
1 boolean matches() Tests whether the entire input sequence matches the pattern.
2 boolean find() Attempts to find the next subsequence of the input that matches the pattern.
3 boolean find(int start) Resets the matcher and attempts to find the next matching subsequence, starting at the given index.
4 String group() Returns the subsequence matched by the previous match.
5 int start() Returns the start index of the previous match.
6 int end() Returns the index just past the last character of the previous match.
7 int groupCount() Returns the number of capturing groups in the pattern.
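The methods above are typically used together in a find() loop. The following is a small sketch, not a canonical recipe; the class name MatcherMethodsDemo and the helper findAll are hypothetical names introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsDemo {
    // Hypothetical helper: records every match as "text@start-end".
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> hits = new ArrayList<>();
        while (m.find()) { // advances to the next match on each call
            hits.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two runs of digits: "12" at indexes 1..3 and "345" at 4..7
        System.out.println(findAll("\\d+", "a12b345")); // [12@1-3, 345@4-7]

        // groupCount() counts capturing groups in the pattern, not matches
        Matcher m = Pattern.compile("(\\w+)@(\\w+)").matcher("user@example");
        System.out.println(m.groupCount()); // 2
    }
}
```

Note that end() is exclusive, so input.substring(m.start(), m.end()) yields the same text as m.group().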

Pattern class

The Pattern class is the compiled version of a regular expression. It defines a pattern for the regex engine.

No. Method Description
1 static Pattern compile(String regex) Compiles the given regex and returns a Pattern instance.
2 Matcher matcher(CharSequence input) Creates a matcher that will match the given input against this pattern.
3 static boolean matches(String regex, CharSequence input) Compiles the given regex and tests whether the entire input matches it.
4 String[] split(CharSequence input) Splits the given input around matches of this pattern.
5 String pattern() Returns the regex string from which this pattern was compiled.
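A brief sketch of split(), pattern(), and the static matches() method follows. The class name PatternMethodsDemo, the helper splitOnCommas, and the sample inputs are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PatternMethodsDemo {
    // Compiled once and reused: a comma with optional surrounding whitespace
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Hypothetical helper: splits a comma-separated string into trimmed parts.
    static String[] splitOnCommas(String input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        System.out.println(COMMA.pattern()); // prints \s*,\s*
        System.out.println(Arrays.toString(splitOnCommas("red , green,blue"))); // [red, green, blue]

        // The static form compiles and matches in one step (whole input must match)
        System.out.println(Pattern.matches("\\d+", "2024"));  // true
        System.out.println(Pattern.matches("\\d+", "2024a")); // false
    }
}
```

Keeping the compiled Pattern in a constant avoids recompiling the regex on every call, which the static Pattern.matches() form does each time.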

Example

Syntax

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("Movie", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher("I am watching a Movie!");
        boolean matchFound = matcher.find();
        if (matchFound) {
            System.out.println("Match found");
        } else {
            System.out.println("Match not found");
        }
    }
}

Output

Match found

Explanation

In the example above:

  • A pattern is created using Pattern.compile(), searching for the word "Movie" with the case-insensitive flag.
  • matcher() creates a Matcher that searches the input string for the pattern.
  • find() returns true if the pattern occurs anywhere in the input, otherwise false.

Example of Java Regular Expressions

The same match can be written in three ways in Java. All three use matches(), so the pattern must match the entire input.


import java.util.regex.*;

public class RegexExample1 {
    public static void main(String[] args) {
        // 1st way
        Pattern p = Pattern.compile(".s"); // . represents single character
        Matcher m = p.matcher("as");
        boolean b = m.matches();

        // 2nd way
        boolean b2 = Pattern.compile(".s").matcher("as").matches();

        // 3rd way
        boolean b3 = Pattern.matches(".s", "as");

        System.out.println(b + " " + b2 + " " + b3);
    }
}
    
Output

true true true
    

Flags

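Flags passed as the second argument to Pattern.compile() change how the search is performed. Common ones are Pattern.CASE_INSENSITIVE (ignore letter case), Pattern.LITERAL (treat special characters in the pattern as ordinary characters), and Pattern.UNICODE_CASE (used together with CASE_INSENSITIVE to ignore case outside the ASCII range as well). Multiple flags can be combined with the bitwise OR operator |.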
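To illustrate the effect of flags, here is a short sketch comparing the same pattern with and without them. The class name FlagsDemo and the helper contains are hypothetical; the flag constants themselves are from the Pattern class.

```java
import java.util.regex.Pattern;

public class FlagsDemo {
    // Hypothetical helper: true if the regex, compiled with the given flags,
    // occurs anywhere in the input. Pass 0 for "no flags".
    static boolean contains(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // Case sensitivity
        System.out.println(contains("movie", 0, "A Movie night"));                        // false
        System.out.println(contains("movie", Pattern.CASE_INSENSITIVE, "A Movie night")); // true

        // LITERAL turns off the special meaning of "."
        System.out.println(contains("a.c", 0, "abc"));               // true
        System.out.println(contains("a.c", Pattern.LITERAL, "abc")); // false
        System.out.println(contains("a.c", Pattern.LITERAL, "a.c")); // true

        // Flags are bit masks, so they combine with |
        int both = Pattern.LITERAL | Pattern.CASE_INSENSITIVE;
        System.out.println(contains("a.c", both, "A.C")); // true
    }
}
```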

Regular Expression Patterns

The pattern passed to Pattern.compile() specifies what to search for, using brackets for character ranges and metacharacters for special meanings.

Metacharacters

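Metacharacters are characters or escape sequences that have a special meaning in a pattern, such as |, ., ^, $, \d, \s, \b, and \uxxxx. In Java source code the backslash itself must be escaped inside a string literal, so \d is written as "\\d".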

Metacharacter Description
| Matches any one of the alternatives separated by |, as in: cat|dog|fish
. Matches any single character (by default, any character except a line terminator)
^ Matches at the beginning of the input, as in: ^Hello
$ Matches at the end of the input, as in: World$
\d Matches a digit
\D Matches a non-digit
\s Matches a whitespace character
\b Matches at a word boundary (the beginning or end of a word), as in: \bWORD or WORD\b
\uxxxx Matches the Unicode character specified by the hexadecimal number xxxx
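The anchors, alternation, and word-boundary metacharacters are easiest to see with find(), which looks for a match anywhere in the input. This is a minimal sketch; the class name MetacharDemo, the helper found, and the sample strings are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class MetacharDemo {
    // Hypothetical helper: true if the regex occurs anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("cat|dog|fish", "hot dog")); // true (second alternative)

        System.out.println(found("^Hello", "Hello World"));   // true (at the start)
        System.out.println(found("^Hello", "Say Hello"));     // false (not at the start)
        System.out.println(found("World$", "Hello World"));   // true (at the end)

        System.out.println(found("\\bcat\\b", "a cat sat"));   // true (whole word)
        System.out.println(found("\\bcat\\b", "concatenate")); // false (inside a word)

        System.out.println(found("\\s", "no_spaces"));         // false (no whitespace)
        System.out.println(found("\\u0041", "A"));             // true (hex 0041 is the letter A)
    }
}
```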

Example of Metacharacters


import java.util.regex.*;

class RegexExample5 {
    public static void main(String[] args) {
        System.out.println("metacharacters d...."); // \\d means digit

        System.out.println(Pattern.matches("\\d", "abc")); // false (non-digit)
        System.out.println(Pattern.matches("\\d", "1")); // true (digit and comes once)
        System.out.println(Pattern.matches("\\d", "4443")); // false (digit but comes more than once)
        System.out.println(Pattern.matches("\\d", "323abc")); // false (digit and char)

        System.out.println("metacharacters D...."); // \\D means non-digit

        System.out.println(Pattern.matches("\\D", "abc")); // false (non-digit but comes more than once)
        System.out.println(Pattern.matches("\\D", "1")); // false (digit)
        System.out.println(Pattern.matches("\\D", "4443")); // false (digit)
        System.out.println(Pattern.matches("\\D", "323abc")); // false (digit and char)
        System.out.println(Pattern.matches("\\D", "m")); // true (non-digit and comes once)

        System.out.println("metacharacters D with quantifier....");
        System.out.println(Pattern.matches("\\D*", "mak")); // true (non-digit and may come 0 or more times)
    }
}
    
Output

metacharacters d....
false
true
false
false
metacharacters D....
false
false
false
false
true
metacharacters D with quantifier....
true
    

Regex Character Classes

No. Character Class Description
1 [abc] a, b, or c (simple class)
2 [^abc] Any character except a, b, or c (negation)
3 [a-zA-Z] a through z or A through Z, inclusive (range)
4 [a-d[m-p]] a through d, or m through p: [a-dm-p] (union)
5 [a-z&&[def]] d, e, or f (intersection)
6 [a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction)
7 [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction)
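The union, intersection, and subtraction forms in the table can be checked one character at a time. The sketch below does that with Pattern.matches(); the class name CharClassDemo and the helper matches are hypothetical names for this example.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Negation: anything except a, b, or c
        System.out.println(matches("[^abc]", "z")); // true
        System.out.println(matches("[^abc]", "a")); // false

        // Union: a-d or m-p
        System.out.println(matches("[a-d[m-p]]", "n")); // true
        System.out.println(matches("[a-d[m-p]]", "e")); // false

        // Intersection: only d, e, or f
        System.out.println(matches("[a-z&&[def]]", "e")); // true
        System.out.println(matches("[a-z&&[def]]", "a")); // false

        // Subtraction: a-z without b and c
        System.out.println(matches("[a-z&&[^bc]]", "d")); // true
        System.out.println(matches("[a-z&&[^bc]]", "b")); // false
    }
}
```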

Quantifiers

Quantifiers specify how many instances of a character or group are expected, using symbols like +, *, ?, {x}, {x,y}, and {x,}.

Quantifier Description
n+ Matches one or more occurrences of n
n* Matches zero or more occurrences of n
n? Matches zero or one occurrence of n
n{x} Matches exactly x occurrences of n
n{x,y} Matches at least x but no more than y occurrences of n
n{x,} Matches at least x occurrences of n
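The counted forms {x}, {x,y}, and {x,} can be sketched as follows, using \d as the repeated element. The class name CountedQuantifierDemo, the helper wholeMatch, and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class CountedQuantifierDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // {x}: exactly four digits
        System.out.println(wholeMatch("\\d{4}", "2024")); // true
        System.out.println(wholeMatch("\\d{4}", "202"));  // false

        // {x,y}: between two and four digits
        System.out.println(wholeMatch("\\d{2,4}", "123"));   // true
        System.out.println(wholeMatch("\\d{2,4}", "12345")); // false

        // {x,}: at least two digits
        System.out.println(wholeMatch("\\d{2,}", "12345")); // true
        System.out.println(wholeMatch("\\d{2,}", "1"));     // false
    }
}
```

These results depend on matches() requiring the whole input to match. With find(), \d{4} would also succeed on "12345", because four of the five digits form a match.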

Example of Character Classes and Quantifiers


import java.util.regex.*;

class RegexExample4 {
    public static void main(String[] args) {
        System.out.println("? quantifier ....");
        System.out.println(Pattern.matches("[amn]?", "a")); // true (a or m or n comes one time)
        System.out.println(Pattern.matches("[amn]?", "aaa")); // false (a comes more than one time)
        System.out.println(Pattern.matches("[amn]?", "aammmnn")); // false (a, m and n come more than one time)
        System.out.println(Pattern.matches("[amn]?", "aazzta")); // false (more than one character, and z and t are not in the class)
        System.out.println(Pattern.matches("[amn]?", "am")); // false (two characters; a, m or n may come at most one time)

        System.out.println("+ quantifier ....");
        System.out.println(Pattern.matches("[amn]+", "a")); // true (a or m or n once or more times)
        System.out.println(Pattern.matches("[amn]+", "aaa")); // true (a comes more than one time)
        System.out.println(Pattern.matches("[amn]+", "aammmnn")); // true (a or m or n comes more than once)
        System.out.println(Pattern.matches("[amn]+", "aazzta")); // false (z and t do not match the pattern)

        System.out.println("* quantifier ....");
        System.out.println(Pattern.matches("[amn]*", "ammmna")); // true (a or m or n may come zero or more times)
    }
}
    
Output

? quantifier ....
true
false
false
false
false
+ quantifier ....
true
true
true
false
* quantifier ....
true