Last Updated: February 25, 2016
· wojtekkruszewsk

Find non-ascii characters in a directory

Sometimes I copy text snippets from emails or PDFs into web app templates or tests. This causes errors when Ruby 1.9 is not configured to read source files as UTF-8:

invalid byte sequence in US-ASCII (ArgumentError)

Sure, there are ways to fix this, but I'd still like to know where I pasted the "special" characters. They are often difficult (if not impossible) to spot visually. Grep to the rescue!

grep -r --color='auto' -P -n '[^\x00-\x7F]' ./app/
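The `-P` flag is specific to GNU grep, so the one-liner is not available everywhere (BSD grep on macOS, for example, does not support it). As a rough alternative, here is a minimal Ruby sketch of the same search. The method name `find_non_ascii` is made up for illustration, and the script assumes Ruby 2.1 or newer because it uses `String#scrub`.

```ruby
require "find"

# Hypothetical helper: returns [path, line_number, text] triples for every
# line under `root` that contains at least one byte outside 7-bit ASCII.
def find_non_ascii(root)
  hits = []
  Find.find(root) do |path|
    next unless File.file?(path)
    # Read in binary mode so invalid byte sequences never raise.
    File.open(path, "rb") do |file|
      file.each_line.with_index(1) do |line, number|
        next if line.ascii_only?
        # Relabel the bytes as UTF-8 and replace anything invalid with "?"
        # so the text can be printed next to the path safely.
        text = line.chomp.force_encoding("UTF-8").scrub("?")
        hits << [path, number, text]
      end
    end
  end
  hits
end

# Usage: ruby find_non_ascii.rb [directory], defaulting to ./app
root = ARGV[0] || "./app"
if File.directory?(root)
  find_non_ascii(root).each do |path, number, text|
    puts "#{path}:#{number}:#{text}"
  end
else
  warn "#{root} is not a directory"
end
```

The output uses the same `path:line:text` layout as `grep -n`, without the color highlighting. Because it checks raw bytes rather than decoded characters, it also flags lines containing byte sequences that are not valid UTF-8.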