Forum: Ruby on Rails Converting HTML into an Image

Michael Trier (Guest)
on 2006-05-25 15:57
(Received via mailing list)
I have this challenge of converting a bit of html into an image file
and then shrinking it into a thumbnail.  The shrinking part shouldn't
be too bad, but what advice can you give on converting the html to the
image?  My thinking is that I will need to convert it to postscript or
latex and then convert it to an image from there.  Is there an easier
/ better approach?

Michael
Hampton (Guest)
on 2006-05-25 16:36
(Received via mailing list)
"Converting html to an image" is a fairly complicated task. That's why
browsers exist and are so complicated themselves.

Even a "good" html2ps util will probably only give a rough approximation and
could easily destroy the look of most websites. It just depends on whether you
mean thumbnailing websites or some sort of html document that you have.

That being said, if getting the exact look right is not too important, then
I'd use that method.

-hampton.
Michael Trier (Guest)
on 2006-05-25 16:57
(Received via mailing list)
Thanks.  Accuracy is not important since these will probably end up
being 50 x 50 or 100 x 100.  We're just looking for somewhat of a
visual representation.  I just want to be sure before I dig in that
I'm not missing something a lot simpler.

Michael
Matt Ramos (Guest)
on 2006-05-25 21:10
(Received via mailing list)
I'm not sure how you could go about this, but you might be able to just
print screen it and crop it automatically. That might be simpler, but
I'm not sure.
Michael Trier (Guest)
on 2006-05-25 23:25
(Received via mailing list)
Thanks Matt, but this needs to be a function of the application.  It
snapshots just a portion of the page, turns that into an image, and
indexes the images.

Michael
Paolo Negri (Guest)
on 2006-05-26 11:48
(Received via mailing list)
Hi

I think that this task is a hard one. The only nice solution I have ever
read of is to use a browser library on the server to render the html
and take a snapshot of the rendered page.
Please note that this will increase the complexity of your server setup.

You can look at khtml2png, html2png, and
www.hackdiary.com/archives/000055.html for some inspiration.

Paolo

antonio rodriguez (Guest)
on 2006-05-26 12:06
(Received via mailing list)
Paolo Negri wrote:
> Hi
>
> I think that this task is a hard one. The only nice solution I have ever
> read of is to use a browser library on the server to render the html
> and take a snapshot of the rendered page.
> Please note that this will increase the complexity of your server setup.
>
> You can look at khtml2png, html2png, and
> www.hackdiary.com/archives/000055.html for some inspiration.
>

In Linux,

1. Transform your html file to .pdf (OpenOffice does the task)
2. Open with ImageMagick the .pdf file
3. Save as .jpg, .png, etc.
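For anyone who wants to script those three steps from Ruby, here is a rough sketch. It is not a tested recipe: it assumes a LibreOffice/OpenOffice build that provides the headless `soffice --convert-to` converter (some versions may need an explicit export filter for HTML input) and an ImageMagick install that still ships the `convert` command. The method names are made up for illustration.

```ruby
require "tmpdir"

# Step 1: HTML -> PDF.
# Assumption: `soffice` is on the PATH and supports --headless/--convert-to.
def html_to_pdf_command(html_path, out_dir)
  ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, html_path]
end

# Steps 2 and 3: open the PDF with ImageMagick and save it as an image.
# "[0]" selects the first page only; the output format follows the extension
# of image_path (.png, .jpg, ...).
def pdf_to_image_command(pdf_path, image_path, density: 96)
  ["convert", "-density", density.to_s, "#{pdf_path}[0]", image_path]
end

# Runs both steps; raises if either external command fails.
def html_to_image(html_path, image_path, work_dir: Dir.tmpdir)
  # soffice names the PDF after the input file and writes it into work_dir.
  pdf_path = File.join(work_dir, File.basename(html_path, ".*") + ".pdf")
  system(*html_to_pdf_command(html_path, work_dir)) or raise "HTML to PDF failed"
  system(*pdf_to_image_command(pdf_path, image_path)) or raise "PDF to image failed"
  image_path
end
```

Calling `html_to_image("snippet.html", "snippet.png")` would then shell out twice. Passing the commands as arrays to `system` avoids going through a shell, so odd characters in file names do not need escaping.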

Hope this helps,

Antonio

unknown (Guest)
on 2006-05-26 12:24
(Received via mailing list)
If all the pages you want to get pictures of have the same format
layout/colours, you could take a screenshot of the layout, then grab
any dynamic text from the html (or straight from the db or whatever)
and overlay it in the right position. Depending on how much work you
want to do, you could customise it beyond plain text, too. As long as
your pages have a certain amount of consistency, you could do a
better job than any existing program/plug-in, because existing
programs wouldn't take that consistency into account.
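One possible way to do the overlay described above, sketched with ImageMagick's `-annotate` option. This is only an illustration: the `overlay_command` name, the label hash layout, and the file names are all assumptions, and it presumes the `convert` command is installed.

```ruby
# Builds an ImageMagick command that draws each label onto a pre-made
# screenshot of the page layout at the given pixel offsets.
# labels is an array of hashes such as { x: 10, y: 20, text: "Title" }.
def overlay_command(base_image, output_image, labels, pointsize: 12)
  cmd = ["convert", base_image, "-pointsize", pointsize.to_s]
  labels.each do |label|
    # -annotate takes an offset like "+10+20" followed by the text to draw.
    cmd += ["-annotate", format("+%d+%d", label[:x], label[:y]), label[:text]]
  end
  cmd << output_image
end

# Runs the command; raises if ImageMagick reports a failure.
def overlay_text(base_image, output_image, labels)
  system(*overlay_command(base_image, output_image, labels)) or raise "overlay failed"
  output_image
end
```

The text could come straight from the database, for example `overlay_text("layout.png", "story_42.png", [{ x: 10, y: 20, text: story_title }])`, where `story_title` is whatever dynamic value the page would have shown.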
Hope this helps,
-N
Jonathan del Strother (Guest)
on 2006-05-26 12:49
(Received via mailing list)
If you're on a Mac, you could use WebKit to render into an offscreen
view and just grab the view to an image...
Michael Trier (Guest)
on 2006-05-26 17:08
(Received via mailing list)
Wow, thanks for all of the good ideas.  I really appreciate it.  I'll
be digging into this this weekend and early next week so if I find a
workable solution, I might post something just so folks know how it
all turned out.

Michael
Mike Fletcher (fletch)
on 2006-05-26 17:59
Jonathan del Strother wrote:
> If you're on a Mac, you could use WebKit to render into an offscreen
> view and just grab the view to an image...

There's an app that does just this:

http://www.derailer.org/paparazzi/

Lets you pick the rendered window size, clip the page, all sorts of good
stuff.
tom Shorebird (Guest)
on 2006-05-26 18:17
(Received via mailing list)
I have been getting all the email from your list: over 100 messages
since early this morning.
I don't want to receive any more.
Thanks
Tom Shorebird
Mike Oligny (Guest)
on 2006-05-26 18:21
(Received via mailing list)
On 26-May-06, at 11:14 AM, tom Shorebird wrote:

> I have been getting all the email from your list: over 100
> messages since early this morning.
> I don't want to receive any more.
> Thanks
> Tom Shorebird

Maybe you shouldn't have subscribed then?  The link to unsubscribe is
on the bottom of EVERY one of those 100 messages.

	http://lists.rubyonrails.org/mailman/listinfo/rails
unknown (Guest)
on 2006-05-26 18:30
(Received via mailing list)
Peter Ertl (Guest)
on 2006-05-26 19:05
(Received via mailing list)
$> man how-to-post

1. enable brain
2. type text
3. submit
Pratik Naik (pratik)
on 2006-05-26 21:48
(Received via mailing list)
You might be interested in looking at
http://ruby-gnome2.sourceforge.jp/hiki.cgi?Ruby%2F...

Sample : http://mirko.lilik.it/Ruby-GNOME2/moz-snapshooter.rb

Hope that helps.

Thanks,
Pratik
Michael Trier (Guest)
on 2006-05-27 02:27
(Received via mailing list)
Perfect.  Two excellent references.  Thanks so much everyone.  This
should save me some time.

Michael
Carmen --- (carmen)
on 2006-05-30 04:31
Michael Trier wrote:
> I have this challenge of converting a bit of html into an image file
> and then shrinking it into a thumbnail.  The shrinking part shouldn't
> be too bad, but what advice can you give on converting the html to the
> image?  My thinking is that I will need to convert it to postscript or
> latex and then convert it to an image from there.  Is there an easier
> / better approach?


A single argument to kdesktop will turn any URL into an image. It's part of
KDE.
Michael Trier (Guest)
on 2006-05-30 05:12
(Received via mailing list)
Thanks, unfortunately this needs to be a function of the Rails
application.  But I will look into the source for that.  Thanks for
the tip.

Michael
Tim Haines (Guest)
on 2006-06-08 08:27
Hi Michael,

Did you get this solved?   Please email me at tmhaines at gmail dot com
if you did.

I've found one way - but it's ugly and skips out of the Rails app.

Tim.
Fez (Guest)
on 2006-06-12 22:15
Tim Haines wrote:
> Hi Michael,
>
> Did you get this solved?   Please email me at tmhaines at gmail dot com
> if you did.
>
> I've found one way - but it's ugly and skips out of the Rails app.
>
> Tim.

Hey all, someone just forwarded me this thread.  Thanks everyone for all
the great comments / suggestions thrown out so far.

I am trying to do something very similar, and have a bounty of at least
$250 waiting if someone can come up with a solution (not much, I know,
but hey).

http://jobs.rubynow.com/jobs/show/405

Requirements
- all self-contained on a hosted linux box (a dedicated server at
LayeredTech.com w/o KVM access... so, not sure if KDE-type stuff will
work)
- does not require using any pay-for ($$$) webservice like Alexa or
something to grab the images
- shelling out to use external libraries/scripts/etc is A-OK
- does not have to be real-time.  the command that does this can be
called later via automated cron

Something like this (+ the required system libraries/binaries used
installed) would be swell:

class WebGrabber
  def self.grab(url, output_dir)
    # ... (your magic here) ...
  end
end

So you could just do:

WebGrabber.grab('http://www.google.com/', '/static/images')

And it would dump at least one image to that directory.  A few default
sizes would be slick as well but a little RMagick after the fact should
do the trick, right?
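A minimal sketch of what such a class could look like, shelling out as the requirements allow. Everything beyond the `WebGrabber.grab(url, output_dir)` signature is an assumption made for illustration: it uses a headless Chromium build (binary name `chromium`, which differs between distributions) rather than any tool named in this thread, ImageMagick's `convert` in place of RMagick, MD5-based file names, and 50 and 100 pixel thumbnails as the default sizes.

```ruby
require "digest"
require "fileutils"

class WebGrabber
  BROWSER = "chromium"      # assumption: a Chromium/Chrome binary with headless support
  SIZES   = [50, 100].freeze # assumption: default thumbnail edge lengths in pixels

  # Command that renders the URL offscreen and writes a PNG screenshot.
  def self.screenshot_command(url, png_path, width: 1024, height: 768)
    [BROWSER, "--headless", "--screenshot=#{png_path}",
     "--window-size=#{width},#{height}", url]
  end

  # Command that shrinks the screenshot to fit inside size x size pixels.
  def self.thumbnail_command(png_path, thumb_path, size)
    ["convert", png_path, "-thumbnail", "#{size}x#{size}", thumb_path]
  end

  # Dumps one full-size image plus one thumbnail per entry in SIZES into
  # output_dir and returns the list of written paths.
  def self.grab(url, output_dir)
    FileUtils.mkdir_p(output_dir)
    # Name files after a hash of the URL so any URL maps to a safe file name.
    base = File.join(output_dir, Digest::MD5.hexdigest(url))
    full = "#{base}.png"
    system(*screenshot_command(url, full)) or raise "screenshot failed for #{url}"
    thumbs = SIZES.map do |size|
      thumb = "#{base}_#{size}.png"
      system(*thumbnail_command(full, thumb, size)) or raise "thumbnail failed for #{url}"
      thumb
    end
    [full, *thumbs]
  end
end
```

Since it does not have to be real-time, a cron-driven script could simply loop over the pending URLs and call `WebGrabber.grab` for each one. No display or KVM access should be needed as long as the browser really runs headless.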

If anyone comes up with the solution I'll PayPal ya the $250 and we can
make the project open source for all to enjoy.  yay!

FYI - this is for a new digg-esque site built in 100% RoR (*not*
open-source sadly as daddy still has to pay the bills):

http://ummyeah.com/

- Fez
Michael Trier (Guest)
on 2006-06-14 03:54
(Received via mailing list)
Awesome.  I'm eager to see the results as well.  I have not addressed
this issue as other issues have come in front of it. But I will be
looking into it soon, probably next week.

Michael