Last Updated: May 15, 2019
· bsimpson

Formatting Currency via Regular Expression

I came across an interesting requirement today: format a floating-point number as currency. In the US and many other countries, currency greater than or equal to 1000 units ($1,000.00) is typically broken up into groups of three digits (triads) via a comma delimiter. My initial reaction was to use a recursive function, but I glanced at Rails' ActiveSupport gem to see how its number_with_delimiter method performed this same task. I was surprised to learn that this was handled via a clever regular expression:

/(\d)(?=(\d{3})+(?!\d))/

This uses two of what I would consider the more advanced concepts of regular expressions: positive lookahead and negative lookahead.

Positive Lookahead

From the Rdoc page for Ruby's regular expression class:

ensures that the following characters match pat, but doesn't include those characters in the matched text

We can see in our expression that we are looking for a single digit, then we use the ?= syntax of positive lookahead to require that one or more digit triads follow it, without making those triads part of the match. In the full expression, this matches the 1 in the string 1234, and the 3 in the string 123456.
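To see what the positive lookahead does on its own, here is a small sketch that drops the trailing negative lookahead and lists every digit the remaining pattern matches. The constant POSITIVE_ONLY and the helper lookahead_matches are names made up for this illustration; the groups are written as non-capturing (?:...) so that String#scan returns the matched text itself.

```ruby
# Positive lookahead only: a digit followed by one or more digit triads.
# This is NOT the full pattern; the negative lookahead is left out on purpose.
POSITIVE_ONLY = /\d(?=(?:\d{3})+)/

# Hypothetical helper: returns every digit matched by POSITIVE_ONLY.
def lookahead_matches(string)
  string.scan(POSITIVE_ONLY)
end

p lookahead_matches("1234")   # => ["1"]
p lookahead_matches("123456") # => ["1", "2", "3"]
```

On its own the positive lookahead is too permissive for longer numbers: any digit with at least three digits after it qualifies, so something more is needed to pin the triads to the end of the number.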

Negative Lookahead

From the Rdoc page:

ensures that the following characters do not match pat, but doesn't include those characters in the matched text

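This is the inverse of the positive lookahead, and is invoked via ?!. In our expression it sits inside the positive lookahead, directly after the digit triads, and requires that the triads are not followed by another digit. In other words, the groups of three must run all the way to the end of the number with no digits left over. Combined with our positive lookahead rule, we only select a digit when the number of digits remaining after it is an exact multiple of 3. That is a mouthful, but it means every match marks a spot where a delimiter belongs. And because a comma is not a digit, a digit that already has a comma after it is never matched again.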
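As a rough check of how the two lookaheads combine, the sketch below reports the index of each digit the full pattern matches, which is the digit a delimiter would be placed after. DELIMITER_PATTERN and insertion_points are illustrative names, not anything taken from ActiveSupport.

```ruby
# The full pattern from the article.
DELIMITER_PATTERN = /(\d)(?=(\d{3})+(?!\d))/

# Hypothetical helper: returns the index of every matched digit,
# i.e. each position after which a delimiter would be inserted.
def insertion_points(string)
  points = []
  string.scan(DELIMITER_PATTERN) { points << Regexp.last_match.begin(0) }
  points
end

p insertion_points("123456")  # => [2]
p insertion_points("1234567") # => [0, 3]
p insertion_points("1,234")   # => []
```

For 1234567 the matched digits are the 1 and the 4, which gives 1,234,567 once a comma is placed after each.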

Inserting the Comma

Now that we have declared our expression, we can use this in a string replacement call to interpolate our commas at our identified positions using a backreference, followed by our comma (or any other delimiter):

In Ruby:

"10000".gsub(/(\d)(?=(\d{3})+(?!\d))/, "\\1,") # => "10,000"

In JavaScript:

"10000".replace(/(\d)(?=(\d{3})+(?!\d))/g, "$1,") // => "10,000"

The second argument to the string replacement methods is the value to replace our pattern with. Here the backreference (\\1 in Ruby, $1 in JavaScript) puts the matched digit back, and a comma is appended to it.
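Tying this back to the original currency requirement, here is a minimal sketch of a formatting helper built on the pattern. The name format_currency and its unit and delimiter keyword arguments are assumptions made for this example, not part of ActiveSupport. It fixes the value to two decimal places and applies the pattern only to the integer part, because the pattern treats any run of digits the same way and would also insert a comma into a long fractional part.

```ruby
DELIMITER_PATTERN = /(\d)(?=(\d{3})+(?!\d))/

# Applied directly to a decimal string, the fractional digits get grouped too.
p "1234.5678".gsub(DELIMITER_PATTERN, "\\1,") # => "1,234.5,678"

# Hypothetical helper: formats a number as a US-style currency string.
def format_currency(amount, unit: "$", delimiter: ",")
  # Fix the value to two decimal places, then split off the fractional part
  # so the pattern only ever sees the integer digits.
  whole, fraction = format("%.2f", amount).split(".")
  "#{unit}#{whole.gsub(DELIMITER_PATTERN, "\\1#{delimiter}")}.#{fraction}"
end

puts format_currency(1000)        # => $1,000.00
puts format_currency(1234567.891) # => $1,234,567.89
puts format_currency(999.5)       # => $999.50
```

Splitting on the decimal point keeps the regular expression itself unchanged; negative amounts and other locales' separators are left out of this sketch.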

Conclusion

Regular expressions are a powerful tool. What impresses me about this pattern is that it is so terse, yet eliminates the need for more complex recursive function calls, or looping. The pattern here is defining insertion points for our delimiter while honoring all the rules of currency formatting. The things you learn by peeking under the hood!

3 Responses

9712

Thank you for your blog. It helped me a lot

over 1 year ago ·
24458

I think the description for the negative lookahead is a bit misleading. It should be that our expression should only match if the group(s) of 3 digits is not followed by a digit.
i.e. the groups of 3 go up to the end of the string, not leaving any leftover characters

over 1 year ago ·
32143

This won't work if your input string has 4 or more decimal places.
This works instead: (\d)(?=(\d{3})+(?=\,)(?!\d))
If your decimal input has '.' instead of ',', just replace that character in the provided regex.
However, thanks for your solution, since it was necessary for further developing my use case.

over 1 year ago ·