My Rails application (Rails 4.1, Ruby 2.1.1) lets the user upload a
file. The application then parses this file and, once parsing is done,
deletes it from the upload area.
So far, I have the following:
In my upload form, I have
<%= file_field_tag :upload, {accept: 'text/plain', class: 'file_upload'} %>
In my controller, params[:upload] contains an object of class Tempfile,
which is already opened for reading. I am using #readline to read
through this file.
The problem is that the file is encoded in UTF-8, and as soon as a line
I read contains a character that is not 7-bit ASCII, I get an
exception.
BTW, I also tried the following approach (in my controller):
tempf=params[:upload]
tempf.set_encoding('BOM|UTF-8')
However, this raised the exception
code converter not found (UTF-8 to UTF-8)
which I find somewhat strange, because I can set the encoding in this
way for a file opened with File.open(…).
What is the best way to read an uploaded UTF-8 file?
I was already thinking along the following lines: The Tempfile class also
has a method #path, which returns the path of the uploaded file. I could
create a File object by opening this path, specifying UTF-8 when opening
it, and read from that.
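
A minimal sketch of that idea, for illustration only. The helper name
read_utf8_lines is made up, and the Tempfile created here merely stands in
for the uploaded file. It assumes the upload really is valid UTF-8 and is
small enough to read into memory.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Rails): reopen the file by its path
# and return its lines as UTF-8 strings.
def read_utf8_lines(path)
  # 'r:BOM|UTF-8' reads the file as UTF-8 and skips a leading
  # byte order mark if one is present.
  File.open(path, 'r:BOM|UTF-8') do |file|
    file.each_line.map(&:chomp)
  end
end

# Stand-in for the upload: a Tempfile written in binary mode,
# starting with a BOM and containing non-ASCII characters.
upload = Tempfile.new('upload')
upload.binmode
upload.write("\uFEFFGrüße\nnaïve café\n".b)
upload.flush

lines = read_utf8_lines(upload.path)

# Clean up the stand-in file.
upload.close
upload.unlink
```

In the controller, the call would presumably look like
read_utf8_lines(params[:upload].path), given that the uploaded object
responds to #path as described above.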
However, since this problem must occur quite frequently, I wonder
whether there is a way (maybe in the file_field_tag) to tell Rails that
the Tempfile object should be opened as UTF-8 for reading. Is this
possible, or is there another good way to deal with this problem?