In my Linux console:
ghex2 /home/pt/myday
The first line of content is:
AA3F3201861F0000D61F0000
Would you mind telling me how I can convert it into UTF-8?
On Sun, Jul 11, 2010 at 1:40 PM, Pen T. wrote:
> In my Linux console:
> ghex2 /home/pt/myday
> The first line of content is:
> AA3F3201861F0000D61F0000
> Would you mind telling me how I can convert it into UTF-8?
I can’t tell which Unicode range those specific code points belong to,
but if you’re sure the hex data represents UTF-8, and I understand what
you are asking for correctly, try something like:
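# `line` is assumed to hold the hex string; four hex digits (two bytes)
# per group is an assumption.
line.scan(/(....)/).each do |code_point|
  puts code_point.pack('H*')
end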
Depending on where the data is coming from, you might need to flip the
bytes to account for endianness. I don’t believe that will be an issue
on Linux.
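To make that concrete, here is a rough sketch that packs the whole hex
string into bytes, asks Ruby whether those bytes are valid UTF-8, and
prints the data as 16-bit units in both byte orders. The helper names
(hex_to_bytes, valid_utf8?, as_uint16) are made up for illustration;
only pack, unpack, force_encoding and valid_encoding? are core Ruby, and
it assumes Ruby 1.9 or later:

```ruby
# Turn a string of hex digits into the raw bytes it describes.
def hex_to_bytes(hex)
  [hex].pack('H*')
end

# True if the bytes form well-formed UTF-8.
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Read the bytes as 16-bit units, big-endian ('n') or little-endian ('v').
def as_uint16(bytes, byte_order)
  bytes.unpack(byte_order == :big ? 'n*' : 'v*')
end

bytes = hex_to_bytes('AA3F3201861F0000D61F0000')
puts bytes.bytesize      # 12
puts valid_utf8?(bytes)  # false
puts as_uint16(bytes, :big).map { |u| format('%04X', u) }.join(' ')
# AA3F 3201 861F 0000 D61F 0000
puts as_uint16(bytes, :little).map { |u| format('%04X', u) }.join(' ')
# 3FAA 0132 1F86 0000 1FD6 0000
```

If valid_utf8? comes back false, as it does for this sample, swapping
the byte order will not turn the data into UTF-8 either; it only matters
for 16-bit or 32-bit encodings.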
Hope it helps,
Ammar
pt@pt-laptop:~$ file /home/pt/myday
/home/pt/myday: data
Would you mind telling me your email? I can send the data file to
you. Please help me; thank you in advance.
Pen T. wrote:
> In my Linux console:
> ghex2 /home/pt/myday
> The first line of content is:
> AA3F3201861F0000D61F0000
> Would you mind telling me how I can convert it into UTF-8?
What do you mean by “convert to UTF-8”?
Those hex characters are very unlikely to be UTF-8. See
AA = second, third or fourth byte of multi-byte sequence
3F = single-byte character (?)
32 = single-byte character (2)
01 = single-byte character (ctrl-A)
86 = second, third or fourth byte of multi-byte sequence
1F = single-byte character (ctrl-_)
… etc
Apart from D6 there are no “start of n-byte sequence” bytes, and D6 is
followed by 1F, which is not a continuation byte.
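If you want to repeat that byte-by-byte check yourself, something like
the following would do it. This is only a sketch: utf8_role is a
hypothetical helper name, and the ranges are the standard UTF-8 lead and
continuation byte ranges rather than anything specific to this file:

```ruby
# Describe the role a single byte value can play in a UTF-8 stream.
def utf8_role(byte)
  case byte
  when 0x00..0x7F then 'single-byte character'
  when 0x80..0xBF then 'continuation byte'
  when 0xC2..0xDF then 'start of 2-byte sequence'
  when 0xE0..0xEF then 'start of 3-byte sequence'
  when 0xF0..0xF4 then 'start of 4-byte sequence'
  else 'never valid in UTF-8'
  end
end

# Classify every byte of the sample line from the hex editor.
['AA3F3201861F0000D61F0000'].pack('H*').each_byte do |b|
  puts format('%02X = %s', b, utf8_role(b))
end
```

For the sample line this reports AA and 86 as continuation bytes with no
lead byte in front of them, which is why the data cannot be UTF-8 as it
stands.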
Try typing “file /home/pt/myday”, which will attempt to identify what
sort of file you actually have.
Pen T. wrote:
> pt@pt-laptop:~$ file /home/pt/myday
> /home/pt/myday: data
‘file’ knows many file formats - look at /usr/share/file/magic or
/etc/magic depending on your system. But this isn’t one of them. Neither
is the data UTF-8 text, or at least, the header isn’t.
You could look to see if there are any snippets of readable text in it:
strings /home/pt/myday | less
But if not, then it really is just binary data, and its meaning will
depend on what program wrote it in the first place.
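If you would rather do the same search from Ruby, here is a minimal
sketch of the idea behind strings: pull out runs of printable ASCII of
at least four characters. It is not a reimplementation of the real
tool, and printable_runs is a made-up helper name:

```ruby
# Return every run of at least min_len printable ASCII characters.
def printable_runs(data, min_len = 4)
  data.b.scan(/[\x20-\x7E]{#{min_len},}/)
end

# A small binary sample standing in for the real file contents.
sample = "\x00\x01hello\xFFab\x00world!".b
puts printable_runs(sample)
# hello
# world!

# For the actual file, read it in binary mode first:
#   puts printable_runs(File.binread('/home/pt/myday'))
```

The short run "ab" is skipped because it is below the minimum length,
which helps filter out bytes that only look like text by chance.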