Last Updated: March 07, 2016
· moiseevigor

Encoding hell, grep and iconv salvage!

Find non-ASCII characters in a file

grep --color='auto' -P "[\x80-\xFF]" FILENAME

This is useful for pulling the affected records out of a database dump when the encoding of the stored data differs from the encoding of the connection.
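How grep reads a range like `[\x80-\xFF]` can depend on the locale: in a UTF-8 session, `-P` may treat it as a range of code points rather than raw bytes. The sketch below forces byte-wise matching with `LC_ALL=C`, assuming GNU grep built with `-P` support. The helper name `find_non_ascii` and the sample file are made up for illustration.

```shell
# find_non_ascii is a hypothetical helper name, not part of grep.
# LC_ALL=C makes grep compare raw bytes; -a treats the file as text
# even if it contains bytes that are invalid in the current locale;
# -n prefixes each match with its line number.
find_non_ascii() {
    LC_ALL=C grep -a -n -P '[\x80-\xFF]' "$1"
}

# Sample data (an assumption for the demo): the second line holds
# "café" in Latin-1, where é is the single byte 0xE9 (octal 351).
sample=$(mktemp)
printf 'plain ascii\ncaf\351\n' > "$sample"

find_non_ascii "$sample"
```

Only the second line of the sample contains a byte above 0x7F, so only that line is reported.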

For example, in MySQL the default collation has long been "latin1_swedish_ci" (character set latin1), while browsers usually expect UTF-8. iconv can convert the dump:

iconv --verbose -f LATIN1 -t UTF8//TRANSLIT FILENAME_latin1 > FILENAME_utf8
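To check that a conversion like the one above produced clean output, one option is to run the result back through iconv as UTF-8 to UTF-8: it exits with an error if the input is not valid UTF-8. This is a minimal sketch under assumptions; the file names and the one-line sample are invented for the demo.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Sample input (an assumption): "café" in Latin-1, é = byte 0xE9 (octal 351).
printf 'caf\351\n' > "$workdir/sample_latin1.txt"

# Convert Latin-1 to UTF-8.
iconv -f LATIN1 -t UTF-8 "$workdir/sample_latin1.txt" > "$workdir/sample_utf8.txt"

# Validate: converting UTF-8 to UTF-8 fails on any invalid sequence.
if iconv -f UTF-8 -t UTF-8 "$workdir/sample_utf8.txt" > /dev/null 2>&1; then
    echo "sample_utf8.txt is valid UTF-8"
else
    echo "sample_utf8.txt still contains invalid sequences"
fi
```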

If you get the error

iconv: illegal input sequence at position <NUMBER>

you can jump to the offending byte in Vim and fix it by hand. In command-line mode, type

:goto <NUMBER>
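Before opening the file in an editor, it can help to look at the raw bytes around the reported position. The sketch below uses `od` for that. The helper name `show_bytes_at`, the sample file and the offset are assumptions for illustration. Tools differ on whether offsets start at 0 or 1, so the byte of interest may be one position away from where an editor puts the cursor.

```shell
# show_bytes_at is a hypothetical helper: show_bytes_at FILE OFFSET [COUNT]
# -A d  prints offsets in decimal
# -t x1 prints each byte as two hex digits
# -j    skips OFFSET bytes from the start of the file
# -N    limits the dump to COUNT bytes (default 16 here)
show_bytes_at() {
    od -A d -t x1 -j "$2" -N "${3:-16}" "$1"
}

# Sample data (an assumption): the byte at offset 3 is 0xE9 (octal 351).
sample_bytes=$(mktemp)
printf 'caf\351 ole\n' > "$sample_bytes"

show_bytes_at "$sample_bytes" 3 4
```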

Make sure your terminal session is using a UTF-8 locale:

user@host:~$ locale 
LANG=en_US.UTF-8
LANGUAGE=en_US:
LC_CTYPE="en_US.UTF-8"
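The same check can be scripted with `locale charmap`, which prints the character encoding of the current locale. This is a small sketch; the helper name `is_utf8_charmap` is made up, and the list of accepted spellings is an assumption about how systems report UTF-8.

```shell
# is_utf8_charmap is a hypothetical helper: succeeds if the given
# charmap name looks like UTF-8 (spellings vary between systems).
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_charmap "$(locale charmap)"; then
    echo "Terminal locale uses UTF-8"
else
    echo "Terminal locale is not UTF-8; converted text may display incorrectly"
fi
```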

That's it!
