Last Updated: February 25, 2016
· mat

Ignore Encoding Issues when reading files in Ruby 1.9

If you assume a file is encoded in X but are not sure, and you do not mind losing the characters that cannot be converted, try:

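# Read file_name as assumed_encoding and convert it to target_encoding,
# replacing every invalid or unconvertible character with "?".
# Note: in Ruby 1.9, encode does nothing when both encodings are the same,
# so invalid bytes survive in that case.
def slurp(file_name, assumed_encoding, target_encoding)
  file_mode = "rb:%s" % assumed_encoding
  File.read(file_name, :mode => file_mode).encode(
    target_encoding,
    :undef   => :replace,
    :invalid => :replace,
    :replace => "?"
  ).force_encoding(target_encoding)
end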

slurp("presumably_utf8.txt",  # file_name
      "UTF-8",                # assumed_encoding
      "UTF-8")                # target_encoding
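
To see the replacement at work, here is a minimal sketch that writes a byte that is invalid in the assumed encoding to a temporary file and reads it back. The file contents and the `scrub_utf8` helper are assumptions made for illustration and are not part of the original tip. The helper detours through UTF-16BE, a common workaround for Ruby 1.9, where `encode` is a no-op when the assumed and target encodings are the same.

```ruby
require "tempfile"

# The function from the tip, repeated so this sketch runs on its own.
def slurp(file_name, assumed_encoding, target_encoding)
  file_mode = "rb:%s" % assumed_encoding
  File.read(file_name, :mode => file_mode).encode(
    target_encoding,
    :undef   => :replace,
    :invalid => :replace,
    :replace => "?"
  ).force_encoding(target_encoding)
end

# Hypothetical helper (not in the original tip): cleans a string that is
# supposed to be UTF-8 already by converting to UTF-16BE and back, so that
# a real conversion takes place and invalid bytes get replaced.
def scrub_utf8(string)
  string.dup.force_encoding("UTF-8").
    encode("UTF-16BE", :undef => :replace, :invalid => :replace, :replace => "?").
    encode("UTF-8")
end

# Bytes for "caf", then 0xE9 (Latin-1 e-acute, invalid as ASCII and as
# UTF-8 in this position), then a newline.
raw = [0x63, 0x61, 0x66, 0xE9, 0x0A].pack("C*")

file = Tempfile.new("slurp_demo")
file.binmode
file.write(raw)
file.close

p slurp(file.path, "US-ASCII", "UTF-8")            # => "caf?\n"
p scrub_utf8(File.read(file.path, :mode => "rb"))  # => "caf?\n"

file.unlink
```

On Ruby 2.1 and later, `String#scrub` does the same cleanup for same-encoding input without the round trip.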