Last Updated: February 25, 2016
· nimf

Regex operators made non-greedy

Imagine you need to quickly match the text inside <b> tags, and you don't want to load any XML/HTML parsing libraries.

I first came up with:

str = "Lorem <b>ipsum</b> dolor sit <b>amet</b>, consectetur adipiscing elit."
str.scan /<b>.*<\/b>/
 => ["<b>ipsum</b> dolor sit <b>amet</b>"]

which matches everything from the first <b> to the last </b>.
And this is not what I want.

This happens because the star operator is greedy by default: it consumes as many characters as it can while still letting the rest of the pattern match.
To make it non-greedy (or reluctant), simply add a question mark after it:

str.scan /<b>.*?<\/b>/
 => ["<b>ipsum</b>", "<b>amet</b>"]

That's better.
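If you only want the text between the tags, not the tags themselves, one option is to wrap the non-greedy part in a capture group. This is a minimal sketch reusing the sample string from above; the variable name `inner` is just for illustration:

```ruby
str = "Lorem <b>ipsum</b> dolor sit <b>amet</b>, consectetur adipiscing elit."

# When the pattern contains a capture group, scan returns the captured
# text for each match (one single-element array per match);
# flatten unwraps them into a plain array of strings.
inner = str.scan(/<b>(.*?)<\/b>/).flatten
p inner
# => ["ipsum", "amet"]
```

The parentheses around the regex in the `scan` call are optional, but they avoid Ruby's "ambiguous first argument" warning.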
The same question mark suffix also works on the other quantifiers: + becomes +?, {} becomes {}?, and the question mark itself becomes ??.
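To see the difference for each quantifier, here is a small sketch on a made-up string of four a's (the string and variable names are just for illustration). Each greedy form takes as much as it can; each non-greedy form takes as little as it can:

```ruby
str = "aaaa"

# + : one or more
greedy_plus = str[/a+/]        # => "aaaa"
lazy_plus   = str[/a+?/]       # => "a"

# {2,4} : between two and four
greedy_range = str[/a{2,4}/]   # => "aaaa"
lazy_range   = str[/a{2,4}?/]  # => "aa"

# ? : zero or one
greedy_opt = str[/a?/]         # => "a"
lazy_opt   = str[/a??/]        # => "" (zero occurrences is enough)
```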