Capturing groups can be accessed not only in the result or in the replacement string, but also in the pattern itself.
A group can be referenced in the pattern using
\N, where N is the group number.
To make things clear let’s consider a task.
We need to find a quoted string: either a single-quoted
'...' or a double-quoted
"..." – both variants need to match.
How can we find them?
We can put two kinds of quotes in the pattern:
['"](.*?)['"], but it would find strings with mixed quotes, like
'...". That leads to incorrect matches when one kind of quote appears inside the other, as in the string
"She's the one!":
As we can see, the pattern found an opening quote
", then the text is consumed lazily till the other quote
', that closes the match.
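The mismatch described above can be reproduced with a short sketch. The surrounding sentence and the variable names are illustrative assumptions; only the quoted string and the pattern come from the text.

```javascript
// Illustrative sample: an apostrophe sits inside a double-quoted string.
const mixedSample = `He said: "She's the one!".`;

// Either quote may open the match and either may close it.
const mixedPattern = /['"](.*?)['"]/g;

// The lazy .*? stops at the first quote of any kind, here the apostrophe.
const mixedMatches = mixedSample.match(mixedPattern);
console.log(mixedMatches); // [ `"She'` ]
```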
To make sure the closing quote is exactly the same as the opening one, we can wrap the opening quote in a capturing group and backreference it: (['"])(.*?)\1.
Here’s the correct code:
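A minimal sketch of the backreference version, assuming an illustrative sentence built around the string from the task (the variable names are also assumptions):

```javascript
// Illustrative sample: an apostrophe sits inside a double-quoted string.
const quotedSample = `He said: "She's the one!".`;

// Group 1 captures the opening quote; \1 requires the very same character to close.
const sameQuotePattern = /(['"])(.*?)\1/g;

const sameQuoteMatches = quotedSample.match(sameQuotePattern);
console.log(sameQuoteMatches); // [ `"She's the one!"` ]
```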
Now it works! The regular expression engine finds the first quote
(['"]) and remembers the content of
(...), that’s the first capturing group.
Further in the pattern
\1 means “find the same text as in the first group”, exactly the same quote in our case.
- To reference a group inside a replacement string, we use
$1, while in the pattern we use a backslash: \1.
- If we use
?: in the group, then we can’t reference it. Groups that are excluded from capturing
(?:...) are not remembered by the engine.
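The two notes above can be illustrated with a small sketch. The doubled-word sentence and all variable names are hypothetical, chosen only to show \1 in a pattern, $1 in a replacement string, and a non-capturing group side by side.

```javascript
const typo = "Paris in the the spring";

// \1 inside the pattern: the second word must repeat the first captured word.
const doubledWord = /\b(\w+) \1\b/;
const doubledFound = doubledWord.test(typo);
console.log(doubledFound); // true

// $1 inside the replacement string: keep a single copy of the captured word.
const fixed = typo.replace(doubledWord, "$1");
console.log(fixed); // "Paris in the spring"

// (?:a) is not remembered, so (b) becomes group 1 and there is no group 2.
const groups = "ab".match(/(?:a)(b)/);
console.log(groups.length); // 2: the full match plus one captured group
console.log(groups[1]); // "b"
```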
For named groups, we can backreference by
\k<name>.
The same example with the named group:
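A minimal sketch of the named-group version. The group name quote, the illustrative sentence, and the variable names are assumptions made for this example.

```javascript
// Illustrative sample: an apostrophe sits inside a double-quoted string.
const namedSample = `He said: "She's the one!".`;

// (?<quote>...) names the group; \k<quote> refers back to whatever it captured.
const namedPattern = /(?<quote>['"])(.*?)\k<quote>/g;

const namedMatches = namedSample.match(namedPattern);
console.log(namedMatches); // [ `"She's the one!"` ]
```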