As described on the wkhtmltopdf Google group, I have a problem
generating a PDF while using popen and wkhtmltopdf.
wkhtmltopdf takes HTML code as input and outputs a PDF file. Here is what
I’m doing:
command = '"C:\Program Files\wkhtmltopdf\wkhtmltopdf.exe" - - -q'
IO.popen(command, 'r+') do |f|
  # Writing the html previously rendered in a string
  f.write(html_output)
  f.close_write
  # Reading the output and closing
  pdf = f.readlines
  f.close
  # Returning the pdf data
  pdf
end
This code results in a corrupted PDF file. I checked the PDF itself,
which shows some differences from a valid PDF file, such as missing
closing tags (endstream) - but I’m not an expert in that format.
Well, my question is the following: am I doing it wrong, using the wrong
method, or missing something, or is wkhtmltopdf more likely to be the
problem?
I attached the corrupted file.
If you have a look at it, you’ll notice that the PDF EOF marker is there,
which suggests that the generation was not interrupted in any way.
Any idea?
Thanks for your help!
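One thing that may be worth ruling out (it is not confirmed anywhere in this thread): the pipe above is opened in text mode with 'r+', and readlines returns an Array of lines rather than a single String. On Windows, text mode can translate line endings, which would damage binary data such as PDF streams. Below is a minimal sketch of a binary-mode round trip, assuming Ruby 1.9 or later; pipe_binary is a hypothetical helper name, not something from wkhtmltopdf or from the original code.

```ruby
# Hypothetical helper: pipe +input+ through +command+ and return the raw
# bytes the command writes to its standard output.
def pipe_binary(command, input)
  # 'r+b' opens the pipe for reading and writing in binary mode, so no
  # newline translation or encoding conversion is applied to the data.
  IO.popen(command, 'r+b') do |io|
    io.write(input)
    io.close_write # signal EOF so the child process can finish
    io.read        # a single binary String (readlines would return an Array)
  end
end

# Assumed usage with the command from the question, passed as an Array so
# that no shell quoting is involved:
#   exe = 'C:\Program Files\wkhtmltopdf\wkhtmltopdf.exe'
#   pdf = pipe_binary([exe, '-q', '-', '-'], html_output)
```

This sketch writes all of the input before reading anything back, so it assumes the child process consumes its whole input before producing large amounts of output; a program that streams output while still reading input could fill the pipe and block.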
Hi Nicolas,
Whenever I generate pdfs from a rails app using wkhtmltopdf (or
princexml), I usually call wkhtmltopdf using an app_url (ie
wkhtmltopdf hits the web app to get the html/css/imgs/… to be used
to gen the pdf), something like the following:
in some controller …
require 'timeout'
…
TIMEOUT_SECS = 5
…
def gen_pdf
  app_url = … # the url to gen the pdf from.
  fname = … # the name of the resulting pdf.
  ftype = "application/pdf"
  # combat shell injection?
  app_url = app_url.to_s.gsub(/["'\s$;><&\|\(\)\\\[\]]/, '')
  s = nil
  # valid url?
  unless (app_url =~ URI::regexp).nil?
    begin
      Timeout.timeout(TIMEOUT_SECS) do
        # gen pdf from url.
        s = `wkhtmltopdf -q "#{app_url}" -`.chomp
      end
    rescue Exception => e
      … # log, render/redirect err msg, …
    end
  end
  # invalid pdf?
  if not s.to_s =~ /^%PDF/
    … # log, render/redirect err msg, …
  end
  send_data(s, :type=>ftype, :filename=>fname); return
end
…
Jeff
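As a possible variation on the approach above, the shell can be avoided altogether by passing the arguments to the process separately, so that the URL is never interpreted by a shell and does not need to be scrubbed of metacharacters. The following is only a sketch under assumptions: pdf_data?, http_url? and url_to_pdf are hypothetical helper names, the executable is assumed to be on the PATH as wkhtmltopdf, and the only command-line form used is the -q url - one already shown in this thread.

```ruby
require 'open3'
require 'timeout'
require 'uri'

TIMEOUT_SECS = 5

# Hypothetical helper: true if +data+ starts with the PDF magic bytes.
def pdf_data?(data)
  data.to_s.b.start_with?('%PDF')
end

# Hypothetical helper: accept only absolute http(s) URLs with a host.
def http_url?(url)
  uri = URI.parse(url.to_s)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: returns the PDF bytes for +url+, or nil on failure.
def url_to_pdf(url, exe = 'wkhtmltopdf')
  return nil unless http_url?(url)

  pdf = Timeout.timeout(TIMEOUT_SECS) do
    # Each argument is passed separately, so no shell parses the URL.
    # binmode: true keeps the captured output as raw bytes.
    out, _status = Open3.capture2(exe, '-q', url, '-', binmode: true)
    out
  end
  pdf_data?(pdf) ? pdf : nil
rescue Timeout::Error, SystemCallError
  # Timed out, or the executable could not be started.
  nil
end
```

In a controller, the result could then be handed to send_data when it is not nil, as in the code above. Note that Timeout.timeout only raises in the calling thread; it does not by itself terminate a hung child process, so that case may need separate handling.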
Thanks for your answer Jeff. I’ll give it a try in my own app and see
whether it’s working or not. I’ll keep you posted!
Cheers,