Several characters or character classes inside square brackets […] mean “search for any one of the given characters”.
For instance, [eao] means any one of the 3 characters: 'a', 'e', or 'o'.
That’s called a set. Sets can be used in a regexp along with regular characters.
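As a minimal sketch of a set used alongside regular characters (the string and pattern here are illustrative choices, not taken from the text), the pattern below looks for "t" or "m" followed by "op":

```javascript
// [tm] matches exactly one character: either "t" or "m".
// The g flag finds all matches, the i flag ignores case.
const setMatches = "Mop top".match(/[tm]op/gi);

console.log(setMatches); // ["Mop", "top"]
```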
Please note that although there are multiple characters in the set, they correspond to exactly one character in the match.
So, for instance, a pattern such as V[oi]la gives no matches in the string Voila.
The pattern assumes:
- V,
- then one of the letters [oi],
- then la.
So there would be a match for Vola or Vila.
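The one-character rule can be checked with a short sketch. The pattern V[oi]la and the test strings are assumptions chosen for illustration:

```javascript
// The set [oi] stands for exactly ONE character, "o" or "i".
const oneCharPattern = /V[oi]la/;

// "Voila" has two letters ("oi") between "V" and "la", so nothing matches.
const noMatch = "Voila".match(oneCharPattern); // null

// With a single "o" or "i" in that position the pattern matches.
const volaMatches = oneCharPattern.test("Vola"); // true
const vilaMatches = oneCharPattern.test("Vila"); // true

console.log(noMatch, volaMatches, vilaMatches);
```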
Square brackets may also contain character ranges.
For instance, [a-z] is a character in the range from a to z, and [0-5] is a digit from 0 to 5.
As an example, the pattern x[0-9A-F][0-9A-F] searches for "x" followed by two digits or letters from A to F.
Please note that in a word such as Exception there’s a substring xce. It doesn’t match the pattern, because its letters are lowercase, while in the set [0-9A-F] they are uppercase.
If we want to find it too, then we can add the range a-f: [0-9A-Fa-f]. Alternatively, the i flag would allow lowercase too.
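The effect of ranges and letter case can be sketched as follows. The sample string is a hypothetical one, picked because it contains both an uppercase hex value and the lowercase substring "xce":

```javascript
const hexSample = "Exception 0xAF";

// "x" followed by two characters that are digits or uppercase A-F.
const upperOnly = hexSample.match(/x[0-9A-F][0-9A-F]/g); // ["xAF"]

// Adding the range a-f lets the lowercase "xce" in "Exception" match too.
const bothCases = hexSample.match(/x[0-9A-Fa-f][0-9A-Fa-f]/g); // ["xce", "xAF"]

// The i flag has the same effect on this string.
const withIFlag = hexSample.match(/x[0-9A-F][0-9A-F]/gi); // ["xce", "xAF"]

console.log(upperOnly, bothCases, withIFlag);
```

Note that the i flag also makes the leading "x" case-insensitive, so the two approaches are not identical for every input.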
Character classes are shorthands for certain character sets.
For instance:
- \d – is the same as [0-9],
- \w – is the same as [a-zA-Z0-9_],
- \s – is the same as [\t\n\v\f\r ] plus a few other Unicode space characters.
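A quick way to see these equivalences is to run a class and its spelled-out set over the same input. The sample string below is an assumption made up for this sketch:

```javascript
const classSample = "Order_7 costs 42\tUSD";

// \d and [0-9] pick out the same characters.
const digitsByClass = classSample.match(/\d/g);
const digitsBySet = classSample.match(/[0-9]/g);

// \w and [a-zA-Z0-9_] pick out the same characters.
const wordByClass = classSample.match(/\w/g);
const wordBySet = classSample.match(/[a-zA-Z0-9_]/g);

// \s covers the listed whitespace characters and a few more,
// such as the non-breaking space U+00A0.
const spacesByClass = classSample.match(/\s/g); // [" ", " ", "\t"]
const nbspIsSpace = /\s/.test("\u00A0"); // true
const nbspInSet = /[\t\n\v\f\r ]/.test("\u00A0"); // false

console.log(digitsByClass, wordByClass.join(""), spacesByClass, nbspIsSpace, nbspInSet);
```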
We can use character classes inside […] as well.
For instance, suppose we want to match all word characters or a dash, for words like “twenty-third”. We can’t do it with \w+, because the \w class does not include a dash. But we can use [\w-].
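A small sketch of the difference, assuming the set [\w-] (a word character or a dash) and the word from the text as input:

```javascript
// \w alone stops at the dash, so the word is split in two.
const splitWord = "twenty-third".match(/\w+/g); // ["twenty", "third"]

// [\w-] accepts word characters or a dash, so the word stays whole.
const wholeWord = "twenty-third".match(/[\w-]+/g); // ["twenty-third"]

console.log(splitWord, wholeWord);
```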
We can also combine classes to cover every possible character, like [\s\S]. That matches spaces or non-spaces, that is, any character. That’s wider than a dot ".", because the dot matches any character except a newline.
Besides normal ranges, there are “excluding” ranges that look like [^…].
They are denoted by a caret character ^ at the start and match any character except the given ones.
For instance:
- [^aeyo] – any character except 'a', 'e', 'y' or 'o'.
- [^0-9] – any character except a digit, the same as \D.
- [^\s] – any non-space character, the same as \S.
For instance, the set [^\d\sA-Z], used with the i flag, matches any character except letters, digits and spaces.
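Excluding sets can be tried out with a sketch like the one below. The address string and the exact pattern are illustrative assumptions:

```javascript
const address = "alice15@gmail.com";

// Exclude digits, whitespace and letters. With the i flag,
// the range A-Z also covers lowercase letters.
const leftovers = address.match(/[^\d\sA-Z]/gi); // ["@", "."]

// [^0-9] and \D select the same characters.
const nonDigitsBySet = address.match(/[^0-9]/g).join("");
const nonDigitsByClass = address.match(/\D/g).join(""); // "alice@gmail.com"

console.log(leftovers, nonDigitsBySet, nonDigitsByClass);
```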
Usually when we want to find exactly the dot character, we need to escape it like \.. And if we need a backslash, then we use \\.
In square brackets the vast majority of special characters can be used without escaping:
- A dot '.'.
- A plus '+'.
- Parentheses '( )'.
- A dash '-' in the beginning or the end (where it does not define a range).
- A caret '^' if not in the beginning (where it means exclusion).
- And the opening square bracket '['.
In other words, all special characters are allowed except where they mean something for square brackets.
A dot "." inside square brackets means just a dot. The pattern [.,] would look for one of the characters: either a dot or a comma.
For instance, the regexp [-().^+] looks for one of the characters -().^+.
…But if you decide to escape them “just in case”, there would be no harm: [\-\(\)\.\^\+] matches the same characters.
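The following sketch compares the unescaped and the fully escaped form of the same set. The input strings are assumptions chosen for illustration:

```javascript
const expression = "1 + 2 - 3";

// Inside brackets, - ( ) . ^ + are matched literally:
// the dash comes first, and the caret does not.
const unescaped = expression.match(/[-().^+]/g); // ["+", "-"]

// Escaping every one of them is redundant but gives the same result.
const escaped = expression.match(/[\-\(\)\.\^\+]/g); // ["+", "-"]

// [.,] matches a literal dot or comma, not "any character".
const dotOrComma = "a.b,c".match(/[.,]/g); // [".", ","]

console.log(unescaped, escaped, dotOrComma);
```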