PDF::Writer with UTF-8


#1

I have searched and searched and come up with nothing about the
/current/ state of UTF-8 with PDF-Writer. Has anyone had any luck with
it?

I’ve tried using Iconv, but it raises errors. I know the input is valid
UTF-8, however, because Firefox and Ruby in the terminal render it fine.

I also read about using UTF-16 by putting 0xfe 0xff before the string
("\xfe\xff#{str}"), but that didn’t work either.

Anyone got any ideas?

Dan Finnie


#2

Daniel F. wrote:

> Anyone got any ideas?
>
> Dan Finnie

I recently tried to generate Simplified Chinese flash cards in PDF
and gave up. (Traditional Chinese is covered by one PDF library
port.)
Depending on how variable the contents are, you could consider
creating a template (e.g. OpenOffice Writer -> PDF) and then substituting
some strings in the PDF itself.
Alternatively, you could generate OpenOffice documents, which can then be
saved as PDF.
J-P


#3

On 3/9/07, Daniel F. wrote:

> I have searched and searched and came up with nothing about the
> /current/ state of UTF-8 with PDF-Writer. Has any one had any luck with it?

Then you haven’t searched enough. I’m sorry to be so blunt, but I have
been consistent and clear: PDF::Writer does not and will NEVER support
UTF-8 natively. Why? Because the PDF specification does not support
UTF-8.

So, forget about using UTF-8. Period. I will never provide support for
it. (In Ruby 1.9, this may be alleviated by the use of m17n strings,
but I’m not going out of my way to support UTF-8; when I add Unicode
support, it will be UTF-16 only.)

> I also read about using UTF-16 by doing 0xfe 0xff before the string
> ("\xfe\xff#{str}") but that didn’t work either.

And it won’t. My understanding of the techniques involved was flawed
when I put that in the manual. I will be adding UTF-16 specific
functions to the API in the version following an upcoming one.

-austin


#4

On Sat, 2007-03-10 at 22:26 +0900, Austin Z. wrote:

> it. (In Ruby 1.9, this may be alleviated by the use of m17n strings,
> but I’m not going out of my way to support UTF-8; when I add unicode
> support, it will be UTF-16 only).

I’d suggest supporting UTF-8 and handling conversion internally, rather
than making users scatter Iconv calls everywhere, since UTF-8 is rather
the norm on the internet. Better to do it internally, especially since
the conversion between UTF-8 and UTF-16 is perfect in both directions.
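
For illustration, here is a minimal sketch of that round trip. It assumes the encoding-aware String API of Ruby 1.9 and later (`String#encode`), which is not available on Ruby 1.8, and the helper name `utf8_round_trip` is made up for this example. The round trip is only lossless for well-formed input; malformed UTF-8 raises an error instead of converting.

```ruby
# Sketch only: needs Ruby 1.9+ (String#encode). Not part of PDF::Writer.
# Converts UTF-8 text to UTF-16BE and back, returning both strings.
def utf8_round_trip(text)
  utf16 = text.encode("UTF-16BE")  # raises on malformed UTF-8 input
  [utf16, utf16.encode("UTF-8")]
end

# "Grüße, 世界" written with \u escapes so the source file stays ASCII.
sample = "Gr\u00FC\u00DFe, \u4E16\u754C"
utf16, back = utf8_round_trip(sample)

puts back == sample   # the text survives the round trip unchanged
puts utf16.bytesize   # two bytes per character here (no surrogate pairs)
```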

Aria


#5

On 3/10/07, Aredridel wrote:

>> it. (In Ruby 1.9, this may be alleviated by the use of m17n strings,
>> but I’m not going out of my way to support UTF-8; when I add unicode
>> support, it will be UTF-16 only).
>
> I’d suggest supporting UTF-8 and handling conversion internally, rather
> than making users scatter Iconv calls everywhere, since UTF-8 is rather
> the norm on the internet. Better to do it internally, especially since
> the conversion between UTF-8 and UTF-16 is perfect in both directions.

Not doing it. Not before Ruby 1.9, at a very minimum. Even then, I’m
only going to believe what the user sets their m17n string as. The new
methods to add Unicode text will be UTF-16 only (why? because iconv
isn’t default on all platforms, so I can’t trust it to be there). Once
I have the new methods written, if someone wants to write a wrapper
layer to handle UTF-8 and all of the validation involved, they’re
welcome to.
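
As a rough idea of where such a wrapper could start, here is a sketch of a UTF-8 to UTF-16BE conversion that does not need iconv. It relies only on `String#unpack` and `Array#pack`. The name `utf8_to_utf16be` is hypothetical and not part of PDF::Writer, and how the resulting bytes would be handed to PDF::Writer is deliberately left out, since the UTF-16 methods do not exist yet.

```ruby
# Hypothetical helper, not part of PDF::Writer. No iconv required.
# Takes a UTF-8 string and returns its UTF-16BE bytes (without a BOM).
def utf8_to_utf16be(str)
  # unpack("U*") decodes UTF-8 and raises ArgumentError on malformed bytes.
  codepoints = str.unpack("U*")
  units = []
  codepoints.each do |cp|
    if cp > 0x10FFFF || (0xD800..0xDFFF).include?(cp)
      raise ArgumentError, format("invalid code point U+%04X", cp)
    elsif cp >= 0x10000
      # Outside the BMP: split into a high and a low surrogate.
      cp -= 0x10000
      units << (0xD800 | (cp >> 10)) << (0xDC00 | (cp & 0x3FF))
    else
      units << cp
    end
  end
  units.pack("n*")  # 16-bit big-endian code units
end
```

The validation here is the bare minimum (malformed sequences, surrogates, out-of-range values); a real wrapper would probably want to decide what to do with bad input rather than always raising.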

I’m not supporting UTF-8 at all until Ruby 1.9 has enough uptake to
abandon Ruby 1.8 as a platform. That may disappoint some folks, but
I’m not fighting against the spec, and I’m not breaking support for
PDF::Writer on some platforms because they don’t have iconv.

-austin


#6

Hi,

In 1173546777.18386.1.camel@localhost
“Re: PDF::Writer with UTF-8” on Sun, 11 Mar 2007 02:13:01 +0900,
Aredridel wrote:

>> So, forget about using UTF-8. Period. I will never provide support for
>> it. (In Ruby 1.9, this may be alleviated by the use of m17n strings,
>> but I’m not going out of my way to support UTF-8; when I add unicode
>> support, it will be UTF-16 only).
>
> I’d suggest supporting UTF-8 and handling conversion internally, rather
> than making users scatter Iconv calls everywhere, since UTF-8 is rather
> the norm on the internet. Better to do it internally, especially since
> the conversion between UTF-8 and UTF-16 is perfect in both directions.

You can generate a PDF with UTF-8 text if you use rcairo.

Thanks,


#7

On 3/10/07, Kouhei S. wrote:

> You can generate a PDF with UTF-8 text if you use rcairo.

Works great for non-Windows users. Maybe.

-austin, just loves the ubiquity of Linux-based libraries


#8

Hi,

2007/3/12, Austin Z.:

> On 3/10/07, Kouhei S. wrote:
>
>> You can generate a PDF with UTF-8 text if you use rcairo.
>
> Works great for non-Windows users. Maybe.
>
> -austin, just loves the ubiquity of Linux-based libraries

cairo and rcairo work on Windows, Mac OS X, Linux, FreeBSD,
and so on. Windows users can use the Ruby-GNOME2 Win32 GUI Installer
to install cairo and rcairo:
http://ruby-gnome2.sourceforge.jp/index.html?News_20070212_1

Thanks,


#9

On 3/10/07, Jaypee wrote:

> I have tried recently to generate Simplified Chinese Flash Cards in PDF
> and have given up. (Traditional Chinese is covered by one PDF library
> port)
> Depending on the variability of the contents, you could consider
> creating a template (e.g. OpenOffice Writer -> PDF) then substitute some
> strings in the PDF itself.

This will ONLY work if your strings are always exactly the same length.
If you add as little as one character, the PDF may not open properly
because the offsets differ.
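
To make the constraint concrete, here is a small sketch of a substitution that keeps the byte count unchanged by padding with spaces and refusing anything longer. The name `substitute_fixed_width` is invented for this example. It also assumes the placeholder appears as plain bytes in the file, which will not be the case if the content streams are compressed.

```ruby
# Hypothetical helper: swaps a placeholder for a replacement of the same
# byte length, so no byte offset in the file moves.
def substitute_fixed_width(pdf_bytes, placeholder, replacement)
  data = pdf_bytes.dup.force_encoding(Encoding::BINARY)
  old_text = placeholder.dup.force_encoding(Encoding::BINARY)
  new_text = replacement.dup.force_encoding(Encoding::BINARY)

  if new_text.bytesize > old_text.bytesize
    raise ArgumentError, "replacement is longer than the placeholder"
  end
  raise ArgumentError, "placeholder not found" unless data.include?(old_text)

  # Pad with spaces so the replacement occupies exactly the same bytes.
  padded = new_text.ljust(old_text.bytesize, " ")
  data.gsub(old_text) { padded }
end
```

The template would then need placeholders at least as wide as the longest string you ever expect to insert.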

-austin

