Alternation is the regular expression term for what is really a simple “OR”.
In a regular expression it is denoted with a vertical line character: |.
The corresponding regexp:
A usage example:
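As a minimal sketch, assume we want to find a few language names in a string. The pattern, the sample text and the variable names below are illustrative assumptions, not taken from the text above.

```javascript
// Hypothetical pattern: "html" OR "css" OR "java" optionally followed by "script".
// Flags: g = find all matches, i = ignore case.
const langRegexp = /html|css|java(script)?/gi;
const langText = "First HTML appeared, then CSS, then JavaScript";

// With the g flag, match() returns an array of the full matches only.
const langMatches = langText.match(langRegexp);
console.log(langMatches); // [ 'HTML', 'CSS', 'JavaScript' ]
```

Each alternative is tried at every position in the string, so the order of the matches follows their order in the text, not their order in the pattern.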
We already know a similar thing: square brackets. They allow choosing between multiple characters, for instance gr[ae]y matches gray or grey.
Alternation works not on the character level, but on the expression level. A regexp A|B|C means one of the expressions A, B or C.
gr(a|e)y means exactly the same as gr[ae]y.
gra|ey means “gra” or “ey”.
To limit alternation to a part of the pattern, we usually enclose that part in parentheses, as in gr(a|e)y.
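The difference between square brackets, grouped alternation and ungrouped alternation can be checked with a short sketch. The sample string and variable names are assumptions chosen for illustration.

```javascript
const grayText = "gray grey";

// Square brackets: one character out of the set.
const withBrackets = grayText.match(/gr[ae]y/g);
console.log(withBrackets); // [ 'gray', 'grey' ]

// Parentheses limit the alternation to "a" or "e" only.
const withGroup = grayText.match(/gr(a|e)y/g);
console.log(withGroup); // [ 'gray', 'grey' ]

// Without parentheses the alternatives are the whole "gra" and the whole "ey".
const withoutGroup = grayText.match(/gra|ey/g);
console.log(withoutGroup); // [ 'gra', 'ey' ]
```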
In previous chapters there was a task to build a regexp for finding a time in the form hh:mm, for instance 12:00. But a simple \d\d:\d\d is too vague. It accepts 25:99 as a time.
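A quick check shows how permissive the simple pattern \d\d:\d\d is. The sample string is an assumption made up for this sketch.

```javascript
// Any two digits, a colon, any two digits.
const vagueTime = /\d\d:\d\d/g;

const vagueMatches = "12:00 25:99".match(vagueTime);
console.log(vagueMatches); // [ '12:00', '25:99' ]
```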
How can we make a better one?
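We can apply more careful matching to the hours:
- The first digit must be 0 or 1, followed by any digit.
- Or the first digit must be 2, followed by a digit from 0 to 3.
As a regexp: [01]\d|2[0-3].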
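The hours part can be tried on its own. This sketch assumes the hours pattern [01]\d|2[0-3] (first digit 0 or 1 with any second digit, or 2 followed by 0 to 3); the sample string is made up.

```javascript
// Either 00-19 (first alternative) or 20-23 (second alternative).
const hourRegexp = /[01]\d|2[0-3]/g;

const hourMatches = "00 09 17 23 24 31".match(hourRegexp);
console.log(hourMatches); // [ '00', '09', '17', '23' ]
```

Neither 24 nor 31 is found, because no alternative can start at their first digit and be completed.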
Then we can add a colon and the minutes part.
The minutes must be from 00 to 59. In the regexp language that means a first digit [0-5] followed by any digit: [0-5]\d.
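The minutes part can be checked the same way. The sketch below assumes the pattern [0-5]\d and an invented sample string.

```javascript
// First digit 0-5, then any digit: 00 through 59.
const minuteRegexp = /[0-5]\d/g;

const minuteMatches = "00 07 59 60 99".match(minuteRegexp);
console.log(minuteMatches); // [ '00', '07', '59' ]
```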
Let’s glue them together into the pattern: [01]\d|2[0-3]:[0-5]\d.
We’re almost done, but there’s a problem. The alternation | is now between [01]\d and 2[0-3]:[0-5]\d. That’s wrong, because it will match either the left or the right pattern, so a bare 12 matches even without any minutes.
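The problem is easy to reproduce. This sketch assumes the glued pattern [01]\d|2[0-3]:[0-5]\d, where the colon and minutes belong only to the right-hand alternative; the sample strings are invented.

```javascript
// Left alternative: [01]\d (hours only). Right alternative: 2[0-3]:[0-5]\d.
const brokenTime = /[01]\d|2[0-3]:[0-5]\d/g;

// A bare "12" matches through the left alternative.
const bareMatch = "12".match(brokenTime);
console.log(bareMatch); // [ '12' ]

// For 12:30 only the hours are matched; 23:59 is matched in full.
const brokenMatches = "12:30 23:59".match(brokenTime);
console.log(brokenMatches); // [ '12', '23:59' ]
```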
That’s rather obvious, but it is still a common mistake when starting to work with regular expressions.
We need to add parentheses to apply alternation exactly to the hours: ([01]\d|2[0-3]):[0-5]\d.
The correct variant:
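A minimal sketch of the parenthesized pattern, assuming ([01]\d|2[0-3]):[0-5]\d and an invented sample string:

```javascript
// Parentheses restrict the alternation to the hours;
// the colon and the minutes are required in both cases.
const timeRegexp = /([01]\d|2[0-3]):[0-5]\d/g;

const timeMatches = "00:00 10:10 23:59 25:99 1:2".match(timeRegexp);
console.log(timeMatches); // [ '00:00', '10:10', '23:59' ]
```

Because of the g flag, match() returns only the full matches here, so the contents of the hours group do not appear in the result.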