Is this a complex encoding problem?

Hey chaps,

I'm trying to write a text file to the file system, but it must have an
exact byte order, as described here:

http://support.adobe.com/devsup/devsup.nsf/docs/54161.htm

http://support.adobe.com/devsup/devsup.nsf/docs/54162.htm

I've been looking at Ruby's Iconv library and Array#pack, but I'm
really not sure how best to ensure that the byte order is correct, or
that the carriage returns are correct.

What are people's thoughts?

Thanks

Tim

For UNICODE-WIN and UNICODE-MAC, the tagged text file in its entirety
must be in UTF-16LE or UTF-16BE, respectively. (LE = Little Endian, BE =
Big Endian)

So use those encodings with iconv.
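
As a rough illustration of what those two encodings mean at the byte level, here is a minimal sketch. It assumes Ruby 1.9 or later and swaps in String#encode for Iconv (Iconv was removed from the standard library in Ruby 2.0). The `ENCODING_FOR` table, the `:win`/`:mac` symbols and the `tagged_text_bytes` helper are made up for this example.

```ruby
# Hypothetical lookup: tagged-text platform => UTF-16 flavour (per the quote above).
ENCODING_FOR = {
  win: "UTF-16LE", # UNICODE-WIN: little endian
  mac: "UTF-16BE"  # UNICODE-MAC: big endian
}.freeze

# Hypothetical helper: the raw bytes +text+ would occupy on disk.
def tagged_text_bytes(text, platform)
  encoding = ENCODING_FOR.fetch(platform)
  text.encode(encoding).bytes
end

p tagged_text_bytes("A", :win) # => [65, 0]  (low byte first)
p tagged_text_bytes("A", :mac) # => [0, 65]  (high byte first)
```

Naming the endianness explicitly ("UTF-16LE" or "UTF-16BE") leaves no room for the converter to pick a byte order on its own.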

7stud – wrote:

Tim P. wrote:

Or
that the carriage returns are correct.

If the file is going to be used on Windows, you need to write \r\n at
the end of the lines. If the file is going to be used on a Mac, you
need to write \n. If your program is going to be used on different OSes
to generate that file, then you can get the newline for the system from the
global variable $/, or, to avoid using such cryptic variable names in
your code, you can require 'English' and use $INPUT_RECORD_SEPARATOR.
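
A small sketch of making the line terminator explicit rather than reading it from a global. Worth knowing: in Ruby, `$/` defaults to "\n" on every platform, so passing the terminator you want may be the safer route. The `with_line_ending` helper is hypothetical.

```ruby
# Hypothetical helper: rewrite every line break in +text+ as +terminator+.
# Handles "\r\n", a lone "\r" and a lone "\n" in the input.
def with_line_ending(text, terminator)
  text.gsub(/\r\n|\r|\n/, terminator)
end

mixed = "one\r\ntwo\nthree\rfour"
p with_line_ending(mixed, "\r\n") # => "one\r\ntwo\r\nthree\r\nfour"
p with_line_ending(mixed, "\n")   # => "one\ntwo\nthree\nfour"
```

When writing the result out, opening the file in binary mode ("wb") keeps Ruby from translating "\n" on Windows.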

Figured out the no-BOM thing by using

converter = Iconv.new("UTF-16BE", "ISO-8859-15")

So now there is no BOM and the files 'look' identical. However:
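
One way to double-check the BOM claim is to look at the first two bytes directly. This is only a sketch: it assumes Ruby 2.0+ (for String#b) and uses String#encode instead of Iconv; `UTF16_BOMS` and `utf16_bom?` are hypothetical names.

```ruby
# Byte order marks for UTF-16: FE FF (big endian), FF FE (little endian).
UTF16_BOMS = ["\xFE\xFF".b, "\xFF\xFE".b].freeze

# Hypothetical helper: true if the raw bytes begin with a UTF-16 BOM.
def utf16_bom?(data)
  raw = data.b # binary copy, so the comparison is byte-for-byte
  UTF16_BOMS.any? { |bom| raw.start_with?(bom) }
end

p utf16_bom?("sample".encode("UTF-16BE")) # => false (explicit BE, no BOM added)
p utf16_bom?("\xFE\xFF\x00s".b)           # => true  (hand-built bytes with a BOM)
```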

macbookpro:~/Desktop tpfgperrett$ md5 example.txt demo.txt
MD5 (example.txt) = 0da14bc444c760de5a35299868c96f25
MD5 (demo.txt) = 6b1751450c2b58088a34fcae524e26ee

They are still different :( How can I pick apart my two files to see the
differences?
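
One possible way to pick the files apart from Ruby is to compare them byte by byte and print a small hex window around the first mismatch. This is a sketch, not the only approach (tools such as `cmp -l` or `hexdump -C` do a similar job on the command line); `first_difference` and `hex_window` are hypothetical helpers, and the file names are the ones from the md5 output above.

```ruby
# Hypothetical helper: offset of the first byte where the two strings differ,
# or nil if they are byte-for-byte identical.
def first_difference(a, b)
  a_bytes = a.bytes
  b_bytes = b.bytes
  limit = [a_bytes.length, b_bytes.length].max
  (0...limit).find { |i| a_bytes[i] != b_bytes[i] }
end

# Hypothetical helper: hex dump of up to +width+ bytes starting at +offset+.
def hex_window(data, offset, width = 8)
  (data.bytes[offset, width] || []).map { |byte| format("%02x", byte) }.join(" ")
end

# File.binread avoids any newline or encoding translation while reading.
if File.exist?("example.txt") && File.exist?("demo.txt")
  expected = File.binread("example.txt")
  actual   = File.binread("demo.txt")
  offset   = first_difference(expected, actual)
  if offset
    puts "first difference at byte #{offset}"
    puts "example.txt: #{hex_window(expected, offset)}"
    puts "demo.txt:    #{hex_window(actual, offset)}"
  else
    puts "files are identical"
  end
end
```

In UTF-16BE output, a line-ending mismatch would show up as 00 0a in one file against 00 0d in the other.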

Thanks

Tim

I've done it :)

The problem was that it was inserting \n as line returns, whereas
InDesign needed to have \r.
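
Putting that fix together, a minimal sketch might look like the following. It is not the code actually used in this thread: it assumes Ruby 1.9.3+ and uses String#encode in place of Iconv, and `write_tagged_text` plus the temporary output path are hypothetical.

```ruby
require "tmpdir"

# Hypothetical helper: write +text+ as BOM-less UTF-16BE with CR line endings.
def write_tagged_text(path, text)
  with_cr = text.gsub(/\r\n|\n/, "\r") # CRLF or LF -> CR
  # Binary write, so no newline translation happens on the way out.
  File.binwrite(path, with_cr.encode("UTF-16BE"))
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.txt")
  write_tagged_text(path, "line one\nline two\n")
  # The trailing CR ends up as the byte pair 00 0d in big-endian output.
  p File.binread(path).bytes.last(2) # => [0, 13]
end
```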

Thanks for all your help - much appreciated

Cheers

7stud – wrote:

For UNICODE-WIN and UNICODE-MAC, the tagged text file in its entirety
must be in UTF-16LE or UTF-16BE, respectively. (LE = Little Endian, BE =
Big Endian)

So use those encodings with iconv.

Hey 7stud - the files InDesign is exporting are UTF-16BE, so I'll try and
match that. Otherwise, is there a way to write a UTF-16 file without the
BOM? The exported tagged text from InDesign doesn't appear to have a BOM.

I'm only going to be using this on Mac OS X, so I should just be able to
use \n for the line endings, shouldn't I?

Cheers

Tim
