According to the current manual, PDF documents generated by
PDF::Writer can use UTF-16BE, but after a few trials with iconv I
can’t get my UTF-8 strings right. Example:
I’m not familiar with PDF::Writer, but I would be surprised if you
really had all the glyphs for ‘UTF-16BE’ by default. What is the exact
output? Does it produce the PDF file, or does it simply fail with an
exception, or crash?
If a PDF file is produced (of reasonable size), would you mind posting
it?
As you see, the glyph we get wrong in this small test is the euro
symbol. This is important to me because not only is my database in
UTF-8, fed by an unrestricted UTF-8 frontend (website), but the
application also handles money here and there and needs to be able to
output that currency symbol.
Actually, what you see on the screen is the Latin-1 representation of
your UTF-16BE string (see below). ^@ means chr 0, which seems to be
ignored by the PDF viewers, and UTF-16BE has the good taste to map to
Latin-1 for values up to 255. See what less unicode_test.pdf is giving me
(I’m on a latin1 locale):
Moreover, in this particular case, you are using the Helvetica
built-in font, and I’m pretty sure it doesn’t have a glyph for the Euro
symbol. Finally, acroread says that the encoding of the font is ‘ansi’.
That is definitely not what you want. Keep in mind that most fonts
(about everywhere) are defined for a small encoding (ansi/latin1,
or other 8-bit encodings). Unfortunately, I don’t think I can help you
further. If you don’t rely too much on PDF::Writer yet, you could use
pdfLaTeX as an alternative, although the PDF it produces will be
significantly bigger (for small files)…
Welcome to the nightmare world of fonts and encodings…
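To make the byte-level point above concrete, here is a minimal Ruby sketch of what a UTF-8 string looks like once converted to UTF-16BE. It assumes Ruby 1.9 or later and uses String#encode in place of the iconv calls mentioned in the question; utf16be_bytes is a hypothetical helper name, not part of PDF::Writer.

```ruby
# Sketch only: String#encode (Ruby 1.9+) stands in for iconv here.
# utf16be_bytes is a hypothetical helper, not part of PDF::Writer.
def utf16be_bytes(str)
  str.encode("UTF-16BE").bytes
end

# Characters up to U+00FF become a zero byte followed by their Latin-1 code.
p utf16be_bytes("A")       # [0, 65]
p utf16be_bytes("\u00E9")  # [0, 233]  (e with acute accent)

# The euro sign is U+20AC, so neither of its two bytes is a zero byte.
p utf16be_bytes("\u20AC")  # [32, 172]
```

Read as Latin-1, byte 32 is a space and byte 172 is the "not" sign, so a viewer that treats the string as single-byte text has no way to show a euro here.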
According to the current manual, PDF documents generated by
PDF::Writer can use UTF-16BE, but after a few trials with iconv I
can’t get my UTF-8 strings right. Example:
The manual is incorrect; I have recently figured out how to write
UTF-16 strings, but the current PDF::Writer doesn’t do this (and there
are issues that I need to resolve before this will even show up in any
release of PDF::Writer).
It is cross-platform, FAST and has Ruby bindings (it is a little bit
clumsy to use and the Ruby bindings are missing some functions, but
it is the best I could find).
example:
require "hpdf"
pdf = HPDFDoc.new
font = pdf.get_font("Helvetica", "CP1254")
Moreover, in this particular case, you are using the Helvetica
built-in font, and I’m pretty sure it doesn’t have a glyph for the Euro
symbol.
Austin explained the issue. But just so I understand that remark: is
the Helvetica in the PDF different from the Helvetica I use in the
system? The Helvetica here on the Mac certainly has the euro symbol.
Well… It is a long and complex story. A font is (for the PDF
document) just a correspondence (char) -> (nice drawing + metrics). What
we call Helvetica is in reality a fair number of different fonts, which
cover various symbols that have a Helvetica look & feel… Even if a
font is called Helvetica, you can’t be sure that it contains all the
glyphs you’re interested in. And I’m not even talking about more
delicate things like fonts with Chinese or Russian characters… I
didn’t mean to exaggerate when I wrote ‘nightmare’!
But, in this particular case, I was wrong ;-)… I checked the
PDF documentation, which specifies char codes for the Euro symbol. The
real problem was that the font encoding wasn’t the right one. I manually
tweaked the file until I got it to work. To illustrate the problems with
encodings and fonts: I spent a long time trying to get the char \240
displayed as Euro until I realised the encoding wasn’t quite the right
one and \240 meant ‘non-breaking space’! I attached the file just as
an example.
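The \240 mix-up can be checked with Ruby's own transcoding tables. This is a small sketch, assuming Ruby 1.9 or later and assuming that the ‘ansi’ font encoding reported by acroread corresponds to Windows-1252; the variable names are only illustrative.

```ruby
# Sketch only: relies on Ruby's built-in transcoding tables (Ruby 1.9+).
# Octal \240 is decimal 160 (0xA0).
code = 0o240

# Decode that single byte as Latin-1 and as Windows-1252.
latin1  = code.chr.force_encoding("ISO-8859-1").encode("UTF-8")
win1252 = code.chr.force_encoding("Windows-1252").encode("UTF-8")

# Both tables map 0xA0 to U+00A0, the non-breaking space.
p latin1 == "\u00A0"   # true
p win1252 == "\u00A0"  # true

# Going the other way: where does the euro sign live in Windows-1252?
euro_byte  = "\u20AC".encode("Windows-1252").bytes.first
euro_octal = format("%o", euro_byte)
p euro_octal           # "200"
```

So under that assumption the euro would be written as \200 rather than \240, which is consistent with \240 coming out as a space.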
Cheers
Vince