Monday, 19 September 2016

regex - What do 'lazy' and 'greedy' mean in the context of regular expressions?



Could someone explain these two terms in an understandable way?


Answer



A greedy quantifier will consume as much as possible. From http://www.regular-expressions.info/repeat.html we see the example of trying to match HTML tags with <.+>. Suppose you have the following:




<em>Hello World</em>


You may think that <.+> (. means any non-newline character and + means one or more) would match only the <em> and the </em>, when in reality it is greedy and goes from the first < to the last >. This means it will match <em>Hello World</em> instead of what you wanted.



Making it lazy (<.+?>) will prevent this. By adding the ? after the +, we tell it to repeat as few times as possible, so the match stops at the first > it comes across.
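
To see the difference for yourself, here is a minimal sketch using Python's built-in re module. The sample string and its <em> tag are assumptions chosen for illustration; any pair of tags around some text would behave the same way.

```python
import re

# Sample input; the <em> tag name is an assumption made for illustration.
text = "<em>Hello World</em>"

# Greedy: .+ grabs as much as it can, so the match runs to the last >.
greedy = re.findall(r"<.+>", text)

# Lazy: .+? grabs as little as it can, so each match stops at the first >.
lazy = re.findall(r"<.+?>", text)

print(greedy)  # ['<em>Hello World</em>']
print(lazy)    # ['<em>', '</em>']
```

The only change between the two patterns is the ? after the +, which is what switches the quantifier from greedy to lazy.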



I'd encourage you to download RegExr, a great tool that will help you explore Regular Expressions - I use it all the time.

