Is there a way of extracting text from a PDF, if the latter
is in some non-European language, such as Arabic or
Under Linux, I have been able to use Ruby together with
pdftotext for English and other Latin-1 encoded texts,
with occasional problems for special characters,
but it doesn’t seem to work for Unicode …
Is there a Ruby way to do this?
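
For reference, here is a minimal sketch of the kind of Ruby + pdftotext setup described above, with one thing that may be worth trying: asking pdftotext for UTF-8 output explicitly through its -enc option. The helper names (pdftotext_command, extract_text) are made up for illustration, and this assumes pdftotext is on the PATH and that the PDF contains usable text (not scanned images). Whether it gives good results for Arabic depends on the PDF's fonts and encoding tables, so treat it as a starting point, not a guaranteed fix.

```ruby
require "open3"

# Build the argument list for pdftotext (hypothetical helper name).
# "-enc UTF-8" requests UTF-8 output; the trailing "-" sends text to stdout.
def pdftotext_command(pdf_path)
  ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]
end

# Run pdftotext and return its output as a UTF-8 tagged Ruby string
# (hypothetical helper name). Raises if pdftotext exits with an error.
def extract_text(pdf_path)
  out, err, status = Open3.capture3(*pdftotext_command(pdf_path))
  raise "pdftotext failed: #{err}" unless status.success?
  # Tag the bytes as UTF-8 so Ruby string methods handle non-Latin text.
  out.force_encoding(Encoding::UTF_8)
end

# Usage: ruby extract.rb some_file.pdf
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  puts extract_text(ARGV[0])
end
```

Passing the arguments as an array (instead of one shell string) avoids quoting problems with file names that contain spaces or non-ASCII characters.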