Upload PDF / Save as tiff


#1

Hi,

I’m trying to automate the conversion of a PDF document received via a
browser upload to a TIFF image via Ghostscript. I have the PDF data in
a string, and I need the TIFF data returned in a string.

The general command I want to emulate is:
type test.pdf | "c:\program files\gs\gs8.56\bin\gswin32c.exe" -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- > test5.tiff

I’ve had success in writing directly to a file via:
IO.popen('c:\program files\gs\gs8.56\bin\gswin32c.exe -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=test3.tiff -', "wb") do |pipe|
  pipe.write pdf_data
end

However, I really want to bypass writing to disk. If I attempt to get
the data back from the pipe, there are extra bytes returned at the end
of the stream. On my test file, I receive eight extra bytes on XP, and
ten extra bytes on Linux.

**** This doesn’t work ****

# test conversion of the pdf to tiff via standard input, back to
# standard output
# There are eight extra bytes received on the test file.
IO.popen('c:\program files\gs\gs8.56\bin\gswin32c.exe -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- -', "wb+") do |pipe|
  pipe.write pdf_data
  pipe.close_write
  tiff_data = pipe.read
  open('test1.tiff', 'wb') { |f| f.syswrite tiff_data }
end

Is it not possible to receive binary data correctly from the pipe?
I’ve tested on both XP (ruby 1.8.4) and Linux (ruby 1.8.4).

Thanks for any advice you can offer.

Regards,

Rich D.


#2

Well, for what it is worth, I’ve reproduced this under Mac OS X with ruby
1.8.4 (2005-12-24) [i686-darwin8.7.1]. The number of extra bytes is 8.

Maybe this is a question for the Ruby list, not Rails.

Stephan


#3

And then again, look at this:

$ cat /tmp/t.pdf | /opt/local/bin/gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- - | wc
14 263 11458

$ cat /tmp/t.pdf | /opt/local/bin/gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- - > out.tiff
$ ls -l out.tiff
-rw-r--r-- 1 username username 11450 Apr 25 11:07 out.tiff

In other words, running your command in a bash shell shows a similar
problem: the piped output is 8 bytes longer than the redirected file.

11458 is also the length of the file produced by the Ruby program.

Stephan


#4

Thanks!

That’s a bit of lovely news. I’ll see if I can’t figure out why piping
vs. redirecting gives different output with gs.

Regards,
Rich


#5

Hi,

I found out what is going on. Two kind souls on comp.lang.postscript
gave the answer:

“Yes, that’s right. gs’ tiff devices generate TIFF files which contain
the TIFF header at the beginning. As some of the tag values are only
known at the end of the conversion, these values are updated then,
which requires a seekable output file.”
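
To make the "seekable output" point concrete, here is a minimal sketch of my own (it is not Ghostscript's code, and the 4-byte length header is an invented stand-in for the TIFF tag values). It writes a placeholder header, then the data, then tries to go back and patch the header. The helper name write_with_patched_header is hypothetical, and the pipe behaviour shown is what I would expect on a Unix-like system.

```ruby
require 'tempfile'

# Write a 4-byte placeholder, then the payload, then seek back to the start
# and overwrite the placeholder with the real payload length.
# Returns true if the header could be patched, false if the target
# cannot seek (a pipe raises Errno::ESPIPE on seek).
def write_with_patched_header(io, payload)
  io.write([0].pack('N'))                  # placeholder header
  io.write(payload)
  io.flush
  io.seek(0)                               # fails on a pipe
  io.write([payload.bytesize].pack('N'))   # patch in the real length
  io.flush
  true
rescue Errno::ESPIPE
  false
end

# Seekable target: a regular (temporary) file.
file = Tempfile.new('seek_demo')
file.binmode
file_ok = write_with_patched_header(file, 'abcdef')
file.rewind
header = file.read(4).unpack('N').first
file.close!

# Non-seekable target: the write end of a pipe.
reader, writer = IO.pipe
pipe_ok = write_with_patched_header(writer, 'abcdef')
writer.close
reader.close

puts "file: patched=#{file_ok} header=#{header}"
puts "pipe: patched=#{pipe_ok}"
```

With the regular file the header ends up holding the payload length; with the pipe the seek is refused, so a writer in that position has to fall back to some other strategy.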

I suspect the other file formats you examined have the same issue.
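
Given that, one possible workaround is to let gs write to a temporary file and read the bytes back, if touching disk briefly is acceptable. This is only a sketch under assumptions: it uses Open3.capture2e with stdin_data, which needs a much newer Ruby than the 1.8.4 mentioned earlier in this thread; it assumes a gs executable is on the PATH; and the names gs_tiff_args and pdf_to_tiff are made up for illustration. The gs flags are the ones already used above.

```ruby
require 'tempfile'
require 'open3'

# Build the Ghostscript argument list from the flags used in this thread.
# The trailing "-" tells gs to read the PDF from standard input.
def gs_tiff_args(gs_path, output_path, device = 'tiffg4')
  [gs_path, '-q', '-dNOPAUSE', '-dBATCH',
   "-sDEVICE=#{device}", "-sOutputFile=#{output_path}", '-']
end

# Convert PDF bytes to TIFF bytes by way of a seekable temporary file.
# Assumption: gs_path points at a working Ghostscript executable.
def pdf_to_tiff(pdf_data, gs_path = 'gs')
  out = Tempfile.new(['converted', '.tiff'])
  out.close                                # gs opens the path itself
  _log, status = Open3.capture2e(*gs_tiff_args(gs_path, out.path),
                                 stdin_data: pdf_data, binmode: true)
  raise "gs failed with status #{status.exitstatus}" unless status.success?
  File.binread(out.path)                   # TIFF data as a binary string
ensure
  out.close! if out                        # close and delete the temp file
end
```

Usage would be along the lines of tiff_data = pdf_to_tiff(pdf_data). The argument list is kept in its own function so it can be checked without running gs.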

Regards,
Rich D.


#6

Actually I’m finding the problem depends on the device.

Here is the Ruby script:

# We have a test file /tmp/t.pdf which will not be revealed.

def lengths_for(device_name)
  pipe = IO.popen("/opt/local/bin/gs -q -dNOPAUSE -dBATCH -sDEVICE=#{device_name} -sOutputFile=- -", "r+")

  pdf_data = IO.read('/tmp/t.pdf')   # read the file
  pipe.write pdf_data                # convert it with the pipe
  pipe.close_write
  tiff_data = pipe.read              # capture output
  pipe.close                         # reap the gs process

  # Now convert, with the same arguments, but redirect to a file, and then
  # count the number of characters of that file
  compare = `/opt/local/bin/gs -q -dNOPAUSE -dBATCH -sDEVICE=#{device_name} -sOutputFile=- /tmp/t.pdf > /tmp/o ; wc -c /tmp/o`
  compare =~ /(\d+)/

  # return both lengths
  return {:piped => tiff_data.length, :through_file => $&}
end

# Here are all the devices that our gs installation reported to
# support according to the manual - invoke Ghostscript and type
# devicenames ==

%W{ faxg4 png16m pj tiff24nc ljet2p pkm spotcmyk bmpgray pcxcmyk
jpeg bj200 psrgb lj5gray tiffg32d bmp32b pgnmraw pxlmono x11cmyk
pcxmono pnggray pjxl tiffgray ljet3 pkmraw devicen bmpsep1 pbm
jpeggray bjc600 bit cdeskjet tiffg4 deskjet pnm pxlcolor x11gray2
pcxgray pngmono pjxl300 tiff32nc ljet3d pksm xcf bmpsep8 pbmraw
pdfwrite bjc800 bitrgb cdjcolor tifflzw djet500 pnmraw bbox x11gray4
pcx16 png256 uniprint tiffsep ljet4 pksmraw psdcmyk bmp16 pgm pswrite
faxg3 bitcmyk cdjmono tiffpack laserjet ppm cljet5 x11mono pcx256
png16 ijs psmono ljet4d tiffcrle psdrgb bmp256 pgmraw ps2write x11
faxg32d png48 cdj550 tiff12nc ljetplus ppmraw cljet5c bmpmono pcx24b
pngalpha bj10e psgray lj5mono tiffg3 nullpage bmp16m pgnm epswrite
x11alpha }.each do |d|

  r = lengths_for(d)
  diff = r[:piped].to_i - r[:through_file].to_i

  puts "Device #{d.rjust(20)}. Piped: #{r[:piped].to_s.ljust(8)} - Through File: #{r[:through_file].to_s.ljust(8)}; Difference: #{diff.to_s.ljust(8)}"
end

Some devices produced errors such as

AFPL Ghostscript 8.54: Unrecoverable error, exit code 1
AFPL Ghostscript 8.54: Cannot open X display `(null)'.
AFPL Ghostscript 8.54: ijs server not specified

which I didn’t trace down, or collect.

Otherwise the output is this:
Device faxg4. Piped: 11108 - Through File: 11108 ; Difference: 0
Device png16m. Piped: 10735 - Through File: 10735 ; Difference: 0
Device pj. Piped: 142747 - Through File: 142747 ; Difference: 0
Device tiff24nc. Piped: 1454453 - Through File: 1454444 ; Difference: 9
Device ljet2p. Piped: 89202 - Through File: 89202 ; Difference: 0
Device pkm. Piped: 2908288 - Through File: 2908288 ; Difference: 0
Device spotcmyk. Piped: 243202 - Through File: 243147 ; Difference: 55
Device bmpgray. Piped: 485782 - Through File: 485782 ; Difference: 0
Device pcxcmyk. Piped: 43945 - Through File: 43945 ; Difference: 0
Device jpeg. Piped: 45675 - Through File: 45675 ; Difference: 0
Device bj200. Piped: 209895 - Through File: 209895 ; Difference: 0
Device psrgb. Piped: 920058 - Through File: 920058 ; Difference: 0
Device lj5gray. Piped: 1098396 - Through File: 1098396 ; Difference: 0
Device tiffg32d. Piped: 18764 - Through File: 18756 ; Difference: 8
Device bmp32b. Piped: 1938870 - Through File: 1938870 ; Difference: 0
Device pgnmraw. Piped: 61050 - Through File: 61050 ; Difference: 0
Device pxlmono. Piped: 488805 - Through File: 488805 ; Difference: 0
Device x11cmyk. Piped: 50 - Through File: 50 ; Difference: 0
Device pcxmono. Piped: 16756 - Through File: 16756 ; Difference: 0
Device pnggray. Piped: 7448 - Through File: 7448 ; Difference: 0
Device pjxl. Piped: 48236 - Through File: 48236 ; Difference: 0
Device tiffgray. Piped: 485041 - Through File: 485032 ; Difference: 9
Device ljet3. Piped: 36620 - Through File: 36620 ; Difference: 0
Device pkmraw. Piped: 1454179 - Through File: 1454179 ; Difference: 0
Device devicen. Piped: 0 - Through File: 0 ; Difference: 0
Device bmpsep1. Piped: 253688 - Through File: 253688 ; Difference: 0
Device pbm. Piped: 492686 - Through File: 492686 ; Difference: 0
Device jpeggray. Piped: 42771 - Through File: 42771 ; Difference: 0
Device bjc600. Piped: 125045 - Through File: 124990 ; Difference: 55
Device bit. Piped: 60984 - Through File: 60984 ; Difference: 0
Device cdeskjet. Piped: 50 - Through File: 50 ; Difference: 0
Device tiffg4. Piped: 11458 - Through File: 11450 ; Difference: 8
Device deskjet. Piped: 85694 - Through File: 85694 ; Difference: 0
Device pnm. Piped: 492686 - Through File: 492686 ; Difference: 0
Device pxlcolor. Piped: 488801 - Through File: 488801 ; Difference: 0
Device x11gray2. Piped: 50 - Through File: 50 ; Difference: 0
Device pcxgray. Piped: 81005 - Through File: 81005 ; Difference: 0
Device pngmono. Piped: 3462 - Through File: 3462 ; Difference: 0
Device pjxl300. Piped: 109453 - Through File: 109453 ; Difference: 0
Device tiff32nc. Piped: 1939159 - Through File: 1939150 ; Difference: 9
Device ljet3d. Piped: 36630 - Through File: 36630 ; Difference: 0
Device pksm. Piped: 1970748 - Through File: 1970748 ; Difference: 0
Device xcf. Piped: 1454817 - Through File: 1454817 ; Difference: 0
Device bmpsep8. Piped: 1943128 - Through File: 1943128 ; Difference: 0
Device pbmraw. Piped: 61049 - Through File: 61049 ; Difference: 0
Device pdfwrite. Piped: 4932 - Through File: 4932 ; Difference: 0
Device bjc800. Piped: 125044 - Through File: 124989 ; Difference: 55
Device bitrgb. Piped: 242352 - Through File: 242352 ; Difference: 0
Device cdjcolor. Piped: 123578 - Through File: 123578 ; Difference: 0
Device tifflzw. Piped: 22546 - Through File: 22538 ; Difference: 8
Device djet500. Piped: 36624 - Through File: 36624 ; Difference: 0
Device pnmraw. Piped: 61049 - Through File: 61049 ; Difference: 0
Device bbox. Piped: 0 - Through File: 0 ; Difference: 0
Device x11gray4. Piped: 50 - Through File: 50 ; Difference: 0
Device pcx16. Piped: 66640 - Through File: 66640 ; Difference: 0
Device png256. Piped: 8431 - Through File: 8431 ; Difference: 0
Device uniprint. Piped: 825 - Through File: 770 ; Difference: 55
Device tiffsep. Piped: 1939159 - Through File: 1939150 ; Difference: 9
Device ljet4. Piped: 87739 - Through File: 87739 ; Difference: 0
Device pksmraw. Piped: 244200 - Through File: 244200 ; Difference: 0
Device psdcmyk. Piped: 1938908 - Through File: 1938908 ; Difference: 0
Device bmp16. Piped: 244054 - Through File: 244054 ; Difference: 0
Device pgm. Piped: 1912494 - Through File: 1912494 ; Difference: 0
Device pswrite. Piped: 32861 - Through File: 32845 ; Difference: 16
Device faxg3. Piped: 27046 - Through File: 27046 ; Difference: 0
Device bitcmyk. Piped: 242352 - Through File: 242352 ; Difference: 0
Device cdjmono. Piped: 31310 - Through File: 31310 ; Difference: 0
Device tiffpack. Piped: 41311 - Through File: 41302 ; Difference: 9
Device laserjet. Piped: 319918 - Through File: 319918 ; Difference: 0
Device ppm. Piped: 5737350 - Through File: 5737350 ; Difference: 0
Device cljet5. Piped: 474347 - Through File: 474292 ; Difference: 55
Device x11mono. Piped: 50 - Through File: 50 ; Difference: 0
Device pcx256. Piped: 81005 - Through File: 81005 ; Difference: 0
Device png16. Piped: 6048 - Through File: 6048 ; Difference: 0
Device ijs. Piped: 50 - Through File: 50 ; Difference: 0
Device psmono. Piped: 167785 - Through File: 167785 ; Difference: 0
Device ljet4d. Piped: 87749 - Through File: 87749 ; Difference: 0
Device tiffcrle. Piped: 25611 - Through File: 25602 ; Difference: 9
Device psdrgb. Piped: 1454204 - Through File: 1454204 ; Difference: 0
Device bmp256. Piped: 485782 - Through File: 485782 ; Difference: 0
Device pgmraw. Piped: 484773 - Through File: 484773 ; Difference: 0
Device ps2write. Piped: 901 - Through File: 901 ; Difference: 0
Device x11. Piped: 50 - Through File: 50 ; Difference: 0
Device faxg32d. Piped: 17661 - Through File: 17661 ; Difference: 0
Device png48. Piped: 13267 - Through File: 13267 ; Difference: 0
Device cdj550. Piped: 32407 - Through File: 32407 ; Difference: 0
Device tiff12nc. Piped: 727397 - Through File: 727388 ; Difference: 9
Device ljetplus. Piped: 319918 - Through File: 319918 ; Difference: 0
Device ppmraw. Piped: 1454181 - Through File: 1454181 ; Difference: 0
Device cljet5c. Piped: 294242 - Through File: 294242 ; Difference: 0
Device bmpmono. Piped: 63422 - Through File: 63422 ; Difference: 0
Device pcx24b. Piped: 240452 - Through File: 240452 ; Difference: 0
Device pngalpha. Piped: 25953 - Through File: 25953 ; Difference: 0
Device bj10e. Piped: 209769 - Through File: 209769 ; Difference: 0
Device psgray. Piped: 539581 - Through File: 539581 ; Difference: 0
Device lj5mono. Piped: 416939 - Through File: 416939 ; Difference: 0
Device tiffg3. Piped: 28291 - Through File: 28282 ; Difference: 9
Device nullpage. Piped: 0 - Through File: 0 ; Difference: 0
Device bmp16m. Piped: 1454166 - Through File: 1454166 ; Difference: 0
Device pgnm. Piped: 492687 - Through File: 492687 ; Difference: 0
Device epswrite. Piped: 32823 - Through File: 32807 ; Difference: 16
Device x11alpha. Piped: 50 - Through File: 50 ; Difference: 0

Notice the “Difference” values are mostly 0, but sometimes not. The
non-zero values are not all the same (8, 9, 16 and 55 all appear), nor
are they all powers of 2. Your case, tiffg4, shows a difference of 8
(as I found before).
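
To pick the affected devices out of a report like this without reading every line, something along these lines could be used. It is only a sketch: the sample lines are a handful copied from the output above, the regular expression assumes each device sits on a single line, and the name devices_by_difference is made up.

```ruby
# A few sample report lines in the format printed by the script.
REPORT = <<~TEXT
  Device faxg4. Piped: 11108 - Through File: 11108 ; Difference: 0
  Device tiff24nc. Piped: 1454453 - Through File: 1454444 ; Difference: 9
  Device spotcmyk. Piped: 243202 - Through File: 243147 ; Difference: 55
  Device tiffg4. Piped: 11458 - Through File: 11450 ; Difference: 8
  Device pswrite. Piped: 32861 - Through File: 32845 ; Difference: 16
TEXT

# Captures: device name, piped length, through-file length.
LINE = /Device\s+(\S+)\.\s+Piped:\s+(\d+)\s+-\s+Through File:\s+(\d+)/

# Return a hash mapping each non-zero difference to the devices showing it.
# The difference is recomputed from the two lengths rather than trusted
# from the last column.
def devices_by_difference(report)
  groups = Hash.new { |hash, key| hash[key] = [] }
  report.each_line do |line|
    m = LINE.match(line) or next
    diff = m[2].to_i - m[3].to_i
    groups[diff] << m[1] unless diff.zero?
  end
  groups
end

devices_by_difference(REPORT).sort.each do |diff, names|
  puts "#{diff.to_s.rjust(3)} extra bytes: #{names.join(', ')}"
end
```

Fed the full report, this would group the devices by how many extra bytes the piped run produced.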

So the puzzle is different!

See you

Stephan