Forum: Ruby on Rails Upload PDF / Save as tiff

Duzenbury, Rich (Guest)
on 2007-04-25 18:32
(Received via mailing list)
Hi,

I'm trying to automate the conversion of a PDF document received via a
browser upload to a tiff image via ghostscript.  I have the PDF data in
a string, and I need the tiff data returned into a string.

The general command I want to emulate is:
type test.pdf | "c:\program files\gs\gs8.56\bin\gswin32c.exe" -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- - > test5.tiff

I've had success in writing directly to a file via:
IO.popen('c:\program files\gs\gs8.56\bin\gswin32c.exe -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=test3.tiff -', "wb") do |pipe|
   pipe.write pdf_data
end

However, I really want to bypass writing to disk.  If I attempt to get
the data back from the pipe, there are extra bytes returned at the end
of the stream.  On my test file, I receive eight extra bytes on XP, and
ten extra bytes on Linux.

# **** This doesn't work ****
# Test conversion of the PDF to TIFF via standard input, back to standard output.
# There are eight extra bytes received on the test file.
IO.popen('c:\program files\gs\gs8.56\bin\gswin32c.exe -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- -', "wb+") do |pipe|
   pipe.write pdf_data
   pipe.close_write
   tiff_data = pipe.read
   open('test1.tiff', 'wb') { |f| f.syswrite tiff_data }
end

Is it not possible to receive binary data correctly from the pipe?
I've tested on both XP (ruby 1.8.4) and Linux (ruby 1.8.4).
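
One way to check the pipe itself, independent of Ghostscript, is to round-trip every possible byte value through a child process. This is only a sketch under assumptions that are not from the post above: it uses a modern Ruby (array-form IO.popen and RbConfig.ruby), and a throwaway Ruby one-liner stands in for the external program.

```ruby
require "rbconfig"

# All 256 byte values, repeated, as a binary (ASCII-8BIT) string.
payload = (0..255).to_a.pack("C*") * 16

# Hypothetical echo child: copies stdin to stdout in binary mode.
echo = [RbConfig.ruby, "-e", "STDIN.binmode; STDOUT.binmode; STDOUT.write(STDIN.read)"]

round_trip = IO.popen(echo, "r+b") do |pipe|
  pipe.write(payload)
  pipe.close_write   # signal end of input so the child can finish
  pipe.read
end

puts "sent #{payload.bytesize} bytes, received #{round_trip.bytesize} bytes"
puts(round_trip == payload ? "pipe is binary-safe" : "pipe altered the data")
```

If the two strings match, the pipe is passing bytes through unchanged, and any extra bytes must come from the program on the other end.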

Thanks for any advice you can offer.

Regards,

Rich Duzenbury
Stephan Wehner (stephanwehner)
on 2007-04-25 20:00
Well, for what it is worth, I've reproduced this under Mac OS X, ruby
1.8.4 (2005-12-24) [i686-darwin8.7.1]; the number of extra bytes is 8.

Maybe this is a question for the Ruby list, not Rails.

Stephan
Stephan Wehner (stephanwehner)
on 2007-04-25 20:11
And then again, look at this

# cat /tmp/t.pdf | /opt/local/bin/gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- - | wc
      14     263   11458
# cat /tmp/t.pdf | /opt/local/bin/gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=- - > out.tiff
# ls -l out.tiff
-rw-r--r--   1 username  username  11450 Apr 25 11:07 out.tiff


In other words, running your command in a bash shell produces a similar
problem - 8 bytes difference.

11458 is the length of the file produced by the ruby program as well.

Stephan
Duzenbury, Rich (Guest)
on 2007-04-26 19:30
(Received via mailing list)
Thanks!

That's a bit of lovely news. I'll see if I can figure out why piping
behaves differently from redirecting with gs.

Regards,
Rich
Stephan Wehner (stephanwehner)
on 2007-04-26 20:26
Actually I'm finding the problem depends on the device.

Here is the ruby script


# We have a test file /tmp/t.pdf which will not be revealed.

def lengths_for(device_name)
  # Convert via a pipe: PDF in on stdin, device output back on stdout.
  command = "/opt/local/bin/gs -q -dNOPAUSE -dBATCH " \
            "-sDEVICE=#{device_name} -sOutputFile=- -"
  pipe = IO.popen(command, "r+")

  pdf_data = IO.read('/tmp/t.pdf')  # read the file
  pipe.write pdf_data               # convert it with the pipe
  pipe.close_write
  tiff_data = pipe.read             # capture output
  pipe.close

  # Now convert with the same arguments, but redirect stdout to a file,
  # and then count the number of bytes in that file.
  compare = `/opt/local/bin/gs -q -dNOPAUSE -dBATCH -sDEVICE=#{device_name} -sOutputFile=- /tmp/t.pdf > /tmp/o ; wc -c /tmp/o`

  # Return both lengths; wc prints the byte count first.
  # (String#length counts bytes on Ruby 1.8; on 1.9+ use bytesize.)
  return {:piped => tiff_data.length, :through_file => compare[/\d+/]}
end



# Here are all the devices that our gs installation reported to
# support according to the manual - invoke Ghostscript and type
#
#            devicenames ==
#

%W{ faxg4 png16m pj tiff24nc ljet2p pkm spotcmyk bmpgray pcxcmyk
jpeg bj200 psrgb lj5gray tiffg32d bmp32b pgnmraw pxlmono x11cmyk
pcxmono pnggray pjxl tiffgray ljet3 pkmraw devicen bmpsep1 pbm
jpeggray bjc600 bit cdeskjet tiffg4 deskjet pnm pxlcolor x11gray2
pcxgray pngmono pjxl300 tiff32nc ljet3d pksm xcf bmpsep8 pbmraw
pdfwrite bjc800 bitrgb cdjcolor tifflzw djet500 pnmraw bbox x11gray4
pcx16 png256 uniprint tiffsep ljet4 pksmraw psdcmyk bmp16 pgm pswrite
faxg3 bitcmyk cdjmono tiffpack laserjet ppm cljet5 x11mono pcx256
png16 ijs psmono ljet4d tiffcrle psdrgb bmp256 pgmraw ps2write x11
faxg32d png48 cdj550 tiff12nc ljetplus ppmraw cljet5c bmpmono pcx24b
pngalpha bj10e psgray lj5mono tiffg3 nullpage bmp16m pgnm epswrite
x11alpha }.each do |d|


  r = lengths_for(d)
  diff = r[:piped].to_i - r[:through_file].to_i

  puts "Device #{d.rjust(20)}. Piped: #{r[:piped].to_s.ljust(8)} - Through File: #{r[:through_file].to_s.ljust(8)}; Difference: #{diff.to_s.ljust(8)}"

end



Some devices produced errors such as

AFPL Ghostscript 8.54: Unrecoverable error, exit code 1
AFPL Ghostscript 8.54: Cannot open X display `(null)'.
AFPL Ghostscript 8.54: ijs server not specified


which I didn't trace down, or collect.

Otherwise the output is this:
Device      faxg4. Piped: 11108    - Through File: 11108   ; Difference: 0
Device     png16m. Piped: 10735    - Through File: 10735   ; Difference: 0
Device         pj. Piped: 142747   - Through File: 142747  ; Difference: 0
Device   tiff24nc. Piped: 1454453  - Through File: 1454444 ; Difference: 9
Device     ljet2p. Piped: 89202    - Through File: 89202   ; Difference: 0
Device        pkm. Piped: 2908288  - Through File: 2908288 ; Difference: 0
Device   spotcmyk. Piped: 243202   - Through File: 243147  ; Difference: 55
Device    bmpgray. Piped: 485782   - Through File: 485782  ; Difference: 0
Device    pcxcmyk. Piped: 43945    - Through File: 43945   ; Difference: 0
Device       jpeg. Piped: 45675    - Through File: 45675   ; Difference: 0
Device      bj200. Piped: 209895   - Through File: 209895  ; Difference: 0
Device      psrgb. Piped: 920058   - Through File: 920058  ; Difference: 0
Device    lj5gray. Piped: 1098396  - Through File: 1098396 ; Difference: 0
Device   tiffg32d. Piped: 18764    - Through File: 18756   ; Difference: 8
Device     bmp32b. Piped: 1938870  - Through File: 1938870 ; Difference: 0
Device    pgnmraw. Piped: 61050    - Through File: 61050   ; Difference: 0
Device    pxlmono. Piped: 488805   - Through File: 488805  ; Difference: 0
Device    x11cmyk. Piped: 50       - Through File: 50      ; Difference: 0
Device    pcxmono. Piped: 16756    - Through File: 16756   ; Difference: 0
Device    pnggray. Piped: 7448     - Through File: 7448    ; Difference: 0
Device       pjxl. Piped: 48236    - Through File: 48236   ; Difference: 0
Device   tiffgray. Piped: 485041   - Through File: 485032  ; Difference: 9
Device      ljet3. Piped: 36620    - Through File: 36620   ; Difference: 0
Device     pkmraw. Piped: 1454179  - Through File: 1454179 ; Difference: 0
Device    devicen. Piped: 0        - Through File: 0       ; Difference: 0
Device    bmpsep1. Piped: 253688   - Through File: 253688  ; Difference: 0
Device        pbm. Piped: 492686   - Through File: 492686  ; Difference: 0
Device   jpeggray. Piped: 42771    - Through File: 42771   ; Difference: 0
Device     bjc600. Piped: 125045   - Through File: 124990  ; Difference: 55
Device        bit. Piped: 60984    - Through File: 60984   ; Difference: 0
Device   cdeskjet. Piped: 50       - Through File: 50      ; Difference: 0
Device     tiffg4. Piped: 11458    - Through File: 11450   ; Difference: 8
Device    deskjet. Piped: 85694    - Through File: 85694   ; Difference: 0
Device        pnm. Piped: 492686   - Through File: 492686  ; Difference: 0
Device   pxlcolor. Piped: 488801   - Through File: 488801  ; Difference: 0
Device   x11gray2. Piped: 50       - Through File: 50      ; Difference: 0
Device    pcxgray. Piped: 81005    - Through File: 81005   ; Difference: 0
Device    pngmono. Piped: 3462     - Through File: 3462    ; Difference: 0
Device    pjxl300. Piped: 109453   - Through File: 109453  ; Difference: 0
Device   tiff32nc. Piped: 1939159  - Through File: 1939150 ; Difference: 9
Device     ljet3d. Piped: 36630    - Through File: 36630   ; Difference: 0
Device       pksm. Piped: 1970748  - Through File: 1970748 ; Difference: 0
Device        xcf. Piped: 1454817  - Through File: 1454817 ; Difference: 0
Device    bmpsep8. Piped: 1943128  - Through File: 1943128 ; Difference: 0
Device     pbmraw. Piped: 61049    - Through File: 61049   ; Difference: 0
Device   pdfwrite. Piped: 4932     - Through File: 4932    ; Difference: 0
Device     bjc800. Piped: 125044   - Through File: 124989  ; Difference: 55
Device     bitrgb. Piped: 242352   - Through File: 242352  ; Difference: 0
Device   cdjcolor. Piped: 123578   - Through File: 123578  ; Difference: 0
Device    tifflzw. Piped: 22546    - Through File: 22538   ; Difference: 8
Device    djet500. Piped: 36624    - Through File: 36624   ; Difference: 0
Device     pnmraw. Piped: 61049    - Through File: 61049   ; Difference: 0
Device       bbox. Piped: 0        - Through File: 0       ; Difference: 0
Device   x11gray4. Piped: 50       - Through File: 50      ; Difference: 0
Device      pcx16. Piped: 66640    - Through File: 66640   ; Difference: 0
Device     png256. Piped: 8431     - Through File: 8431    ; Difference: 0
Device   uniprint. Piped: 825      - Through File: 770     ; Difference: 55
Device    tiffsep. Piped: 1939159  - Through File: 1939150 ; Difference: 9
Device      ljet4. Piped: 87739    - Through File: 87739   ; Difference: 0
Device    pksmraw. Piped: 244200   - Through File: 244200  ; Difference: 0
Device    psdcmyk. Piped: 1938908  - Through File: 1938908 ; Difference: 0
Device      bmp16. Piped: 244054   - Through File: 244054  ; Difference: 0
Device        pgm. Piped: 1912494  - Through File: 1912494 ; Difference: 0
Device    pswrite. Piped: 32861    - Through File: 32845   ; Difference: 16
Device      faxg3. Piped: 27046    - Through File: 27046   ; Difference: 0
Device    bitcmyk. Piped: 242352   - Through File: 242352  ; Difference: 0
Device    cdjmono. Piped: 31310    - Through File: 31310   ; Difference: 0
Device   tiffpack. Piped: 41311    - Through File: 41302   ; Difference: 9
Device   laserjet. Piped: 319918   - Through File: 319918  ; Difference: 0
Device        ppm. Piped: 5737350  - Through File: 5737350 ; Difference: 0
Device     cljet5. Piped: 474347   - Through File: 474292  ; Difference: 55
Device    x11mono. Piped: 50       - Through File: 50      ; Difference: 0
Device     pcx256. Piped: 81005    - Through File: 81005   ; Difference: 0
Device      png16. Piped: 6048     - Through File: 6048    ; Difference: 0
Device        ijs. Piped: 50       - Through File: 50      ; Difference: 0
Device     psmono. Piped: 167785   - Through File: 167785  ; Difference: 0
Device     ljet4d. Piped: 87749    - Through File: 87749   ; Difference: 0
Device   tiffcrle. Piped: 25611    - Through File: 25602   ; Difference: 9
Device     psdrgb. Piped: 1454204  - Through File: 1454204 ; Difference: 0
Device     bmp256. Piped: 485782   - Through File: 485782  ; Difference: 0
Device     pgmraw. Piped: 484773   - Through File: 484773  ; Difference: 0
Device   ps2write. Piped: 901      - Through File: 901     ; Difference: 0
Device        x11. Piped: 50       - Through File: 50      ; Difference: 0
Device    faxg32d. Piped: 17661    - Through File: 17661   ; Difference: 0
Device      png48. Piped: 13267    - Through File: 13267   ; Difference: 0
Device     cdj550. Piped: 32407    - Through File: 32407   ; Difference: 0
Device   tiff12nc. Piped: 727397   - Through File: 727388  ; Difference: 9
Device   ljetplus. Piped: 319918   - Through File: 319918  ; Difference: 0
Device     ppmraw. Piped: 1454181  - Through File: 1454181 ; Difference: 0
Device    cljet5c. Piped: 294242   - Through File: 294242  ; Difference: 0
Device    bmpmono. Piped: 63422    - Through File: 63422   ; Difference: 0
Device     pcx24b. Piped: 240452   - Through File: 240452  ; Difference: 0
Device   pngalpha. Piped: 25953    - Through File: 25953   ; Difference: 0
Device      bj10e. Piped: 209769   - Through File: 209769  ; Difference: 0
Device     psgray. Piped: 539581   - Through File: 539581  ; Difference: 0
Device    lj5mono. Piped: 416939   - Through File: 416939  ; Difference: 0
Device     tiffg3. Piped: 28291    - Through File: 28282   ; Difference: 9
Device   nullpage. Piped: 0        - Through File: 0       ; Difference: 0
Device     bmp16m. Piped: 1454166  - Through File: 1454166 ; Difference: 0
Device       pgnm. Piped: 492687   - Through File: 492687  ; Difference: 0
Device   epswrite. Piped: 32823    - Through File: 32807   ; Difference: 16
Device   x11alpha. Piped: 50       - Through File: 50      ; Difference: 0

Notice that the "Difference" values are mostly 0, but not always. The
nonzero values are not all the same (8, 9, 16 and 55 appear), nor are
they all powers of 2. Your case is tiffg4, which shows a difference of
8 (as I found before).

So the puzzle has changed: the extra bytes depend on the output device.
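
To see at a glance which devices are affected, the report can be grouped by nonzero difference. The following is a rough sketch, not part of the script above: the method name nonzero_differences and the regular expression are made up for illustration, and the sample lines are copied from the output.

```ruby
# Matches one report entry: device name, piped length, through-file length.
LINE = /Device\s+(\S+)\.\s+Piped:\s+(\d+)\s+-\s+Through File:\s+(\d+)/

# Returns a hash mapping each nonzero difference to the devices showing it.
def nonzero_differences(report)
  groups = Hash.new { |hash, key| hash[key] = [] }
  report.scan(LINE).each do |device, piped, through_file|
    diff = piped.to_i - through_file.to_i
    groups[diff] << device unless diff.zero?
  end
  groups
end

sample = <<~REPORT
  Device      faxg4. Piped: 11108    - Through File: 11108   ; Difference: 0
  Device   tiff24nc. Piped: 1454453  - Through File: 1454444 ; Difference: 9
  Device   spotcmyk. Piped: 243202   - Through File: 243147  ; Difference: 55
  Device     tiffg4. Piped: 11458    - Through File: 11450   ; Difference: 8
  Device    tifflzw. Piped: 22546    - Through File: 22538   ; Difference: 8
REPORT

nonzero_differences(sample).sort.each do |diff, devices|
  puts "#{diff.to_s.rjust(3)} extra bytes: #{devices.join(', ')}"
end
```

The difference is recomputed from the two lengths rather than read from the "Difference" column, so the grouping does not depend on how that column is wrapped.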

See you

Stephan
Duzenbury, Rich (Guest)
on 2007-04-28 00:00
(Received via mailing list)
Hi,

I found out what is going on.  Two kind souls on comp.lang.postscript
gave the answer:

"Yes, that's right. gs' tiff devices generate TIFF files which contain
the TIFF header at the beginning. As some of the tag values are only
known at the end of the conversion, these values are updated then,
which requires a seekable output file."
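
As a toy illustration of that point (this is not Ghostscript's code, just a sketch with a made-up 4-byte length header), a writer that goes back to patch its header succeeds on a regular file but cannot seek on a pipe:

```ruby
require "tempfile"

# Toy writer: emit a placeholder header, write the body, then seek back
# to the start and patch in the real body length.
def write_with_patched_header(io, body)
  io.write([0].pack("N"))                # placeholder length field
  io.write(body)
  io.seek(0)                             # raises Errno::ESPIPE on a pipe
  io.write([body.bytesize].pack("N"))    # patch the header in place
  :patched
rescue Errno::ESPIPE
  :not_seekable
end

# A regular file is seekable, so the header can be updated.
file_result = Tempfile.create("demo") do |f|
  f.binmode
  write_with_patched_header(f, "data")
end

# A pipe is not seekable, so the update step fails.
reader, writer = IO.pipe
pipe_result = write_with_patched_header(writer, "data")
writer.close
reader.close

puts "file: #{file_result}, pipe: #{pipe_result}"
```

What a real program does when the seek fails is up to that program; the sketch only shows why the same writer can behave differently on a file and on a pipe.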

I suspect the other devices that showed a nonzero difference have the
same issue.
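
Given that, a workaround is to let gs write to a real temporary file and read the bytes back afterwards, which still keeps the PDF input on stdin. Below is a minimal sketch, assuming a modern Ruby with Open3 and Tempfile.create; the helper names convert_via_tempfile and pdf_to_tiff are made up, and "gs" being on the PATH is an assumption.

```ruby
require "open3"
require "tempfile"

# Hypothetical helper: runs a converter that reads its input on stdin and
# writes its result to a seekable temporary file, then returns the bytes.
# The block receives the output path and returns the command as an array.
def convert_via_tempfile(input_data)
  Tempfile.create(["gs-out", ".tiff"]) do |out|
    out.close  # only the path is needed; the child opens the file itself
    command = yield(out.path)
    _stdout, stderr, status =
      Open3.capture3(*command, stdin_data: input_data, binmode: true)
    raise "converter failed: #{stderr}" unless status.success?
    File.binread(out.path)
  end
end

# Assumed Ghostscript invocation, reusing the flags from this thread.
def pdf_to_tiff(pdf_data)
  convert_via_tempfile(pdf_data) do |path|
    ["gs", "-q", "-dNOPAUSE", "-dBATCH",
     "-sDEVICE=tiffg4", "-sOutputFile=#{path}", "-"]
  end
end
```

Usage would be something like tiff_data = pdf_to_tiff(pdf_data). This does touch the disk, but the temporary file is removed as soon as the block finishes.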

Regards,
Rich Duzenbury