
Regular Expressions and the void in a string (zero-length matches)

While attempting to substitute word characters (“[a-zA-Z0-9_]”) in a string, I noticed that using the pattern “\w*” led to 2 matches:

  1. One for the string itself
  2. Another, zero-length, for the void at the end of the string

For instance, see: https://regex101.com/r/ptz8Cm/1
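The same two matches can be reproduced locally. Here is a minimal sketch in Python, assuming the standard `re` module; the sample string `"hello"` is my own placeholder and may differ from the one used in the linked demo.

```python
import re

# Hypothetical sample text; the string in the linked demo may differ.
text = "hello"

# Record (matched text, start, end) for every match of \w* in the text.
matches = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w*", text)]

for matched, start, end in matches:
    print(repr(matched), start, end)
```

The first match covers the whole word; the second is empty and sits at the very end of the string, which is the "void" described above.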

The explanation is at: http://www.regular-expressions.info/zerolength.html

In short, we should use “\w+” instead of “\w*”: “+” requires at least one character, so zero-length matches are avoided.
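To see why this matters for a substitution, here is a small sketch comparing the two quantifiers. It assumes Python 3.7 or later (older versions treat an empty match right after a non-empty one differently in `re.sub`), and both the sample text and the replacement `"X"` are placeholders of mine, not taken from the original problem.

```python
import re

# Hypothetical input and replacement, chosen only for illustration.
text = "hello world"

# \w* also matches the empty string right after each word,
# so every word ends up replaced twice (Python 3.7+ behaviour).
with_star = re.sub(r"\w*", "X", text)

# \w+ needs at least one word character, so each word is replaced once.
with_plus = re.sub(r"\w+", "X", text)

print(with_star)  # XX XX
print(with_plus)  # X X
```

Other regex engines may handle the empty match slightly differently, so it is worth checking the behaviour in the flavor you actually use.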

Note: if you make the quantifier lazy, as in https://regex101.com/r/ptz8Cm/2, then you get 3 matches.