Java Character Class: Enhancing Character Handling Beyond Primitive Types
Discover the Java Character class, which provides useful methods for manipulating characters beyond the primitive char data type. Learn how to use this class for operations such as case conversion and character classification.
Java Character Class
Normally, when we work with characters, we use the primitive data type char.
char ch = 'a';
// Unicode for the uppercase Greek omega character
char uniChar = '\u03A9';
// an array of chars
char[] charArray = {'a', 'b', 'c', 'd', 'e'};
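As a rough illustration of how primitive char values behave, the sketch below prints the numeric code of a few characters and builds a String from a char array. The class name CharBasics and the helper codeOf are hypothetical names chosen for this example only.

```java
public class CharBasics {
    // Hypothetical helper: returns the numeric UTF-16 code unit of a char.
    static int codeOf(char c) {
        return c; // widening conversion from char to int
    }

    public static void main(String[] args) {
        char ch = 'a';
        char omega = '\u03A9'; // Greek capital letter omega
        char[] charArray = {'a', 'b', 'c', 'd', 'e'};

        System.out.println(codeOf(ch));            // 97
        System.out.println(codeOf(omega));         // 937
        System.out.println((char) (ch + 1));       // b
        System.out.println(new String(charArray)); // abcde
        System.out.println(charArray.length);      // 5
    }
}
```

Because a char is a 16-bit unsigned value, arithmetic such as ch + 1 produces an int, which is why the cast back to char is needed.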
Use of Character Class in Java
In development, we come across situations where we need to use objects instead of primitive data types. Java provides the wrapper class Character for the primitive data type char.
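The Character class offers a number of useful class (i.e., static) methods for manipulating characters. You can create a Character object with the static factory method Character.valueOf. The Character(char) constructor also works, but it has been deprecated since Java 9:
Character ch = Character.valueOf('a');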
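To give a feel for the static methods mentioned above, here is a minimal sketch that classifies and converts a few characters. The class name CharacterStaticMethods and the helper classify are hypothetical; the Character methods it calls (isDigit, isLetter, isUpperCase, isLowerCase, isWhitespace, toUpperCase, getNumericValue) are part of java.lang.Character.

```java
public class CharacterStaticMethods {
    // Hypothetical helper: classifies a char using static Character methods.
    static String classify(char c) {
        if (Character.isDigit(c)) {
            return "digit";
        }
        if (Character.isLetter(c)) {
            if (Character.isUpperCase(c)) {
                return "uppercase letter";
            }
            if (Character.isLowerCase(c)) {
                return "lowercase letter";
            }
            return "other letter"; // e.g. letters from scripts without case
        }
        if (Character.isWhitespace(c)) {
            return "whitespace";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify('a'));                  // lowercase letter
        System.out.println(classify('Z'));                  // uppercase letter
        System.out.println(classify('7'));                  // digit
        System.out.println(classify(' '));                  // whitespace
        System.out.println(classify('#'));                  // other
        System.out.println(Character.toUpperCase('a'));     // A
        System.out.println(Character.getNumericValue('7')); // 7
    }
}
```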
The Java compiler will also create a Character object for you under some circumstances. For example, if you pass a primitive char into a method that expects an object, the compiler automatically converts the char to a Character for you. This feature is called autoboxing, or unboxing if the conversion goes the other way.
Example of Java Character Class
// Here the primitive char 'a'
// is boxed into the Character object ch
Character ch = 'a';
// Here the primitive 'x' is boxed for the method test,
// and the returned Character is unboxed into the char variable c
char c = test('x');
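The snippet above does not define test, so the following is a self-contained sketch of the same idea. The method test here is a hypothetical stand-in that accepts and returns a Character; its body (upper-casing the argument) is an assumption made only so the example has something to do.

```java
import java.util.ArrayList;
import java.util.List;

public class AutoboxingDemo {
    // Hypothetical method: takes a Character object and returns one.
    static Character test(Character c) {
        char primitive = c;                      // unboxing: Character -> char
        return Character.toUpperCase(primitive); // boxing on return: char -> Character
    }

    public static void main(String[] args) {
        Character ch = 'a';  // 'a' is boxed into a Character
        char c = test('x');  // 'x' is boxed for the call, the result is unboxed

        // Collections hold objects, so chars are boxed when added.
        List<Character> letters = new ArrayList<>();
        letters.add('q');            // boxing
        char first = letters.get(0); // unboxing

        System.out.println(ch);    // a
        System.out.println(c);     // X
        System.out.println(first); // q
    }
}
```

Collections such as List<Character> are a common place where boxing happens implicitly, since generic type parameters cannot be primitive types.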
Escape Sequences
A character preceded by a backslash (\) is an escape sequence and has a special meaning to the compiler.
The newline character (\n) has been used frequently in this tutorial in System.out.println() statements to advance to the next line after the string is printed.
Escape Sequence | Description |
---|---|
\t | Inserts a tab in the text at this point. |
\b | Inserts a backspace in the text at this point. |
\n | Inserts a newline in the text at this point. |
\r | Inserts a carriage return in the text at this point. |
\f | Inserts a form feed in the text at this point. |
\' | Inserts a single quote character in the text at this point. |
\" | Inserts a double quote character in the text at this point. |
\\ | Inserts a backslash character in the text at this point. |
When the compiler encounters an escape sequence in a character or string literal, it replaces the sequence with the single character it represents.
Example: Escape Sequences
If you want to put quotes within quotes, you must use the escape sequence \" on the interior quotes:
public class Test {
    public static void main(String[] args) {
        System.out.println("She said \"Hello!\" to me.");
    }
}
Output
She said "Hello!" to me.
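As a further sketch, the example below combines several escape sequences from the table (\t, \n, \\, \' and \") in one string. The class name EscapeSequenceDemo, the helper sample, and the text it contains are made up for illustration.

```java
public class EscapeSequenceDemo {
    // Hypothetical helper: builds a three-line string using several escape sequences.
    static String sample() {
        return "Name:\tDuke\n"
             + "Path:\tC:\\java\\bin\n"
             + "Quote:\t\'single\' and \"double\"";
    }

    public static void main(String[] args) {
        System.out.println(sample());

        // Each escape sequence denotes exactly one character.
        System.out.println("\t".length()); // 1
        System.out.println("\\".length()); // 1
    }
}
```

Note that \' is optional inside a string literal (a plain ' works there); it is only required inside a char literal such as '\''.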
Character Class Declaration
Following is the declaration for the java.lang.Character class:
public final class Character
extends Object
implements Serializable, Comparable<Character>
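One practical consequence of this declaration is that Character objects have a natural ordering through Comparable<Character>. The sketch below, using the hypothetical class CharacterCompareDemo and helper sorted, shows compareTo, equals, and sorting with Arrays.sort.

```java
import java.util.Arrays;

public class CharacterCompareDemo {
    // Hypothetical helper: returns a sorted copy using the natural ordering.
    static Character[] sorted(Character... chars) {
        Character[] copy = chars.clone();
        Arrays.sort(copy); // relies on Character.compareTo
        return copy;
    }

    public static void main(String[] args) {
        Character a = 'a';
        Character b = 'b';

        System.out.println(a.compareTo(b) < 0); // true
        System.out.println(a.equals('a'));      // true
        System.out.println(Arrays.toString(sorted('d', 'a', 'C'))); // [C, a, d]
    }
}
```

The ordering is by numeric char value, so uppercase 'C' (67) sorts before lowercase 'a' (97).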
Fields
Following are the fields for the java.lang.Character class:
static byte COMBINING_SPACING_MARK − This is the General category "Mc" in the Unicode specification.
static byte CONNECTOR_PUNCTUATION − This is the General category "Pc" in the Unicode specification.
static byte CONTROL − This is the General category "Cc" in the Unicode specification.
static byte CURRENCY_SYMBOL − This is the General category "Sc" in the Unicode specification.
static byte DASH_PUNCTUATION − This is the General category "Pd" in the Unicode specification.
static byte DECIMAL_DIGIT_NUMBER − This is the General category "Nd" in the Unicode specification.
static byte DIRECTIONALITY_ARABIC_NUMBER − This is the Weak bidirectional character type "AN" in the Unicode specification.
static byte DIRECTIONALITY_BOUNDARY_NEUTRAL − This is the Weak bidirectional character type "BN" in the Unicode specification.
static byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR − This is the Weak bidirectional character type "CS" in the Unicode specification.
static byte DIRECTIONALITY_EUROPEAN_NUMBER − This is the Weak bidirectional character type "EN" in the Unicode specification.
static byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR − This is the Weak bidirectional character type "ES" in the Unicode specification.
static byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR − This is the Weak bidirectional character type "ET" in the Unicode specification.
static byte DIRECTIONALITY_LEFT_TO_RIGHT − This is the Strong bidirectional character type "L" in the Unicode specification.
static byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING − This is the Strong bidirectional character type "LRE" in the Unicode specification.
static byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE − This is the Strong bidirectional character type "LRO" in the Unicode specification.
static byte DIRECTIONALITY_NONSPACING_MARK − This is the Weak bidirectional character type "NSM" in the Unicode specification.
static byte DIRECTIONALITY_OTHER_NEUTRALS − This is the Neutral bidirectional character type "ON" in the Unicode specification.
static byte DIRECTIONALITY_PARAGRAPH_SEPARATOR − This is the Neutral bidirectional character type "B" in the Unicode specification.
static byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT − This is the Weak bidirectional character type "PDF" in the Unicode specification.
static byte DIRECTIONALITY_RIGHT_TO_LEFT − This is the Strong bidirectional character type "R" in the Unicode specification.
static byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC − This is the Strong bidirectional character type "AL" in the Unicode specification.
static byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING − This is the Strong bidirectional character type "RLE" in the Unicode specification.
static byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE − This is the Strong bidirectional character type "RLO" in the Unicode specification.
static byte DIRECTIONALITY_SEGMENT_SEPARATOR − This is the Neutral bidirectional character type "S" in the Unicode specification.
static byte DIRECTIONALITY_UNDEFINED − This is the Undefined bidirectional character type.
static byte DIRECTIONALITY_WHITESPACE − This is the Neutral bidirectional character type "WS" in the Unicode specification.
static byte ENCLOSING_MARK − This is the General category "Me" in the Unicode specification.
static byte END_PUNCTUATION − This is the General category "Pe" in the Unicode specification.
static byte FINAL_QUOTE_PUNCTUATION − This is the General category "Pf" in the Unicode specification.
static byte FORMAT − This is the General category "Cf" in the Unicode specification.
static byte INITIAL_QUOTE_PUNCTUATION − This is the General category "Pi" in the Unicode specification.
static byte LETTER_NUMBER − This is the General category "Nl" in the Unicode specification.
static byte LINE_SEPARATOR − This is the General category "Zl" in the Unicode specification.
static byte LOWERCASE_LETTER − This is the General category "Ll" in the Unicode specification.
static byte MATH_SYMBOL − This is the General category "Sm" in the Unicode specification.
static byte MODIFIER_LETTER − This is the General category "Lm" in the Unicode specification.
static byte MODIFIER_SYMBOL − This is the General category "Sk" in the Unicode specification.
static byte NON_SPACING_MARK − This is the General category "Mn" in the Unicode specification.
static byte OTHER_LETTER − This is the General category "Lo" in the Unicode specification.
static byte OTHER_NUMBER − This is the General category "No" in the Unicode specification.
static byte OTHER_PUNCTUATION − This is the General category "Po" in the Unicode specification.
static byte OTHER_SYMBOL − This is the General category "So" in the Unicode specification.
static byte PARAGRAPH_SEPARATOR − This is the General category "Zp" in the Unicode specification.
static byte PRIVATE_USE − This is the General category "Co" in the Unicode specification.
static byte SPACE_SEPARATOR − This is the General category "Zs" in the Unicode specification.
static byte START_PUNCTUATION − This is the General category "Ps" in the Unicode specification.
static byte SURROGATE − This is the General category "Cs" in the Unicode specification.
static byte TITLECASE_LETTER − This is the General category "Lt" in the Unicode specification.
static byte UNASSIGNED − This is the General category "Cn" in the Unicode specification.
static byte UPPERCASE_LETTER − This is the General category "Lu" in the Unicode specification.
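These constants are typically compared against the results of Character.getType and Character.getDirectionality. The following is a minimal sketch of that usage; the class name CharacterTypeDemo and the helpers category and isLeftToRight are hypothetical, and the switch covers only a handful of the categories listed above.

```java
public class CharacterTypeDemo {
    // Hypothetical helper: maps a char to its Unicode General category code
    // for a few common categories, using the constants from the field list.
    static String category(char c) {
        switch (Character.getType(c)) {
            case Character.UPPERCASE_LETTER:     return "Lu";
            case Character.LOWERCASE_LETTER:     return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd";
            case Character.SPACE_SEPARATOR:      return "Zs";
            case Character.CURRENCY_SYMBOL:      return "Sc";
            case Character.MATH_SYMBOL:          return "Sm";
            case Character.CONTROL:              return "Cc";
            default:                             return "other";
        }
    }

    // Hypothetical helper: true if the char has Strong left-to-right directionality.
    static boolean isLeftToRight(char c) {
        return Character.getDirectionality(c) == Character.DIRECTIONALITY_LEFT_TO_RIGHT;
    }

    public static void main(String[] args) {
        System.out.println(category('A'));  // Lu
        System.out.println(category('a'));  // Ll
        System.out.println(category('5'));  // Nd
        System.out.println(category(' '));  // Zs
        System.out.println(category('$'));  // Sc
        System.out.println(category('+'));  // Sm
        System.out.println(category('\n')); // Cc

        System.out.println(isLeftToRight('a'));      // true
        System.out.println(isLeftToRight('\u05D0')); // false (Hebrew alef is right-to-left)
    }
}
```

Character.getType returns an int while the constants are declared as byte, but they can still be used directly as case labels because they are compile-time constants.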