Forum: Ruby on Rails Converting HTML into an Image

Michael T. (Guest)
on 2006-05-25 17:57
(Received via mailing list)
I have this challenge of converting a bit of html into an image file
and then shrinking it into a thumbnail.  The shrinking part shouldn't
be too bad, but what advice can you give on converting the html to the
image?  My thinking is that I will need to convert it to postscript or
latex and then convert it to an image from there.  Is there an easier
/ better approach?

Michael
Hampton (Guest)
on 2006-05-25 18:36
(Received via mailing list)
"converting html to an image" is a fairly complicated task. that's why
browsers exist and are so complicated themselves.

Even a "good" html2ps util will probably only give a rough approximation and
could easily destroy the look of most websites. It just depends on whether you
mean thumbnailing websites or some sort of HTML document that you have.

That being said, if getting the exact look right is not too important, then
I'd use that method.

-hampton.
Michael T. (Guest)
on 2006-05-25 18:57
(Received via mailing list)
Thanks.  Accuracy is not important since these will probably end up
being 50 x 50 or 100 x 100.  We're just looking for somewhat of a
visual representation.  I just want to be sure before I dig in that
I'm not missing something a lot simpler.
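
For the shrinking step, the target dimensions can be worked out before any image library is involved. The following is a minimal sketch, assuming the goal is to fit a rendered page inside a 50 x 50 or 100 x 100 box while keeping its aspect ratio; the method name thumbnail_size is hypothetical and not from any library.

```ruby
# Compute thumbnail dimensions that fit inside a max_w x max_h box
# while preserving the aspect ratio. Never upscales (scale is capped at 1.0).
# thumbnail_size is a hypothetical helper name used for illustration only.
def thumbnail_size(width, height, max_w, max_h)
  unless [width, height, max_w, max_h].all?(&:positive?)
    raise ArgumentError, "dimensions must be positive"
  end

  scale = [max_w.to_f / width, max_h.to_f / height, 1.0].min
  # Clamp to at least 1 pixel so extreme aspect ratios stay valid.
  [(width * scale).round, (height * scale).round].map { |d| [d, 1].max }
end

p thumbnail_size(1024, 768, 100, 100) # a landscape page rendered at 1024 x 768
p thumbnail_size(800, 1200, 50, 50)   # a portrait page rendered at 800 x 1200
```

The resulting pair can then be handed to whatever resizing tool ends up being used.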

Michael
Matt R. (Guest)
on 2006-05-25 23:10
(Received via mailing list)
I'm not sure how you could go about this, but you might be able to just
print screen it and crop it automatically. That might be simpler, but
I'm not sure.
Michael T. (Guest)
on 2006-05-26 01:25
(Received via mailing list)
Thanks Matt, but this needs to be a function of the application.  It
snapshots just a portion of the page, turns that into an image, and
indexes the resulting images.

Michael
Paolo N. (Guest)
on 2006-05-26 13:48
(Received via mailing list)
Hi

I think that this task is a hard one. The only nice solution I have ever
read about is to use a browser library on the server to render the HTML
and take a snapshot of the rendered page.
Please note that this will increase the complexity of your server setup.

You can look at khtml2png, html2png, and
www.hackdiary.com/archives/000055.html for some inspiration.

Paolo

antonio rodriguez (Guest)
on 2006-05-26 14:06
(Received via mailing list)
Paolo N. wrote:
> Hi
>
> I think that this task is a hard one. The only nice solution I have ever
> read about is to use a browser library on the server to render the HTML
> and take a snapshot of the rendered page.
> Please note that this will increase the complexity of your server setup.
>
> You can look at khtml2png, html2png, and
> www.hackdiary.com/archives/000055.html for some inspiration.
>

In Linux,

1. Convert your HTML file to .pdf (OpenOffice can do this)
2. Open the .pdf file with ImageMagick
3. Save it as .jpg, .png, etc.
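
The three steps above can be sketched as a small Ruby wrapper that shells out to both tools. This is only a sketch under stated assumptions: it assumes a current LibreOffice/OpenOffice binary named soffice that supports headless conversion, ImageMagick's convert command (ImageMagick 7 installs it as magick), and Ghostscript being present so ImageMagick can read PDFs. The method names are hypothetical.

```ruby
require "shellwords"

# Step 1: HTML -> PDF with the office suite in headless mode.
# Assumes a `soffice` binary on the PATH.
def html_to_pdf_command(html_path, out_dir)
  ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, html_path]
end

# Steps 2-3: read the first PDF page ("[0]") and save it as a thumbnail.
# Assumes ImageMagick's `convert` on the PATH.
def pdf_to_thumbnail_command(pdf_path, image_path, size = "100x100")
  ["convert", "#{pdf_path}[0]", "-thumbnail", size, image_path]
end

# Run both steps and return the path of the image that should result.
def html_to_thumbnail(html_path, out_dir, size = "100x100")
  base = File.basename(html_path, ".*")
  pdf_path = File.join(out_dir, "#{base}.pdf")
  image_path = File.join(out_dir, "#{base}.png")

  commands = [
    html_to_pdf_command(html_path, out_dir),
    pdf_to_thumbnail_command(pdf_path, image_path, size)
  ]
  commands.each do |cmd|
    # Array form of system avoids shell quoting problems with odd file names.
    system(*cmd) or raise "command failed: #{cmd.shelljoin}"
  end
  image_path
end
```

Calling html_to_thumbnail("/tmp/page.html", "/tmp/out", "50x50") would run the two commands in order. As noted earlier in the thread, the result is only a rough approximation of how a browser would render the page.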

Hope this helps,

Antonio

unknown (Guest)
on 2006-05-26 14:24
(Received via mailing list)
If all the pages you want to get pictures of have the same
layout/colours, you could take a screenshot of the layout, then grab
any dynamic text from the html (or straight from the db or whatever)
and overlay it in the right position. Depending on how much work you
want to do, you could customise it beyond plain text, too. As long as
your pages have a certain amount of consistency, you could do a
better job than any existing program/plug-in, because existing
programs wouldn't take that consistency into account.
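
As a rough illustration of the overlay idea, the sketch below builds an ImageMagick command that draws text labels onto a pre-made layout screenshot. It assumes ImageMagick's convert command is available; the method name overlay_command, the label hash keys, and the file names are hypothetical.

```ruby
# Build an ImageMagick command that annotates a layout screenshot with text.
# Each label is a hash with :x and :y pixel offsets from the top-left corner
# and the :text to draw there. overlay_command is a hypothetical helper.
def overlay_command(template_path, output_path, labels, pointsize: 14)
  cmd = ["convert", template_path, "-gravity", "NorthWest",
         "-pointsize", pointsize.to_s]
  labels.each do |label|
    # "%+d" always emits a sign, producing a geometry offset such as +10+20.
    cmd += ["-annotate", format("%+d%+d", label[:x], label[:y]), label[:text]]
  end
  cmd << output_path
end

labels = [{ x: 10, y: 20, text: "Page title from the db" }]
cmd = overlay_command("layout.png", "snapshot.png", labels)
# system(*cmd) would then run ImageMagick with these arguments.
```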
Hope this helps,
-N
Jonathan del Strother (Guest)
on 2006-05-26 14:49
(Received via mailing list)
If you're on a Mac, you could use WebKit to render into an offscreen
view and just grab the view to an image...
Michael T. (Guest)
on 2006-05-26 19:08
(Received via mailing list)
Wow, thanks for all of the good ideas.  I really appreciate it.  I'll
be digging into this this weekend and early next week so if I find a
workable solution, I might post something just so folks know how it
all turned out.

Michael
Mike F. (Guest)
on 2006-05-26 19:59
Jonathan del Strother wrote:
> If you're on a Mac, you could use WebKit to render into an offscreen
> view and just grab the view to an image...

There's an app that does just this:

http://www.derailer.org/paparazzi/

Lets you pick the rendered window size, clip the page, all sorts of good
stuff.
tom Shorebird (Guest)
on 2006-05-26 20:17
(Received via mailing list)
I have been getting all the email from your list: over 100 messages
since early this morning.
I don't want to receive any more.
Thanks
Tom Shorebird
Mike O. (Guest)
on 2006-05-26 20:21
(Received via mailing list)
On 26-May-06, at 11:14 AM, tom Shorebird wrote:

> I have been getting all the email from your list: over 100
> messages since early this morning.
> I don't want to receive any more.
> Thanks
> Tom Shorebird

Maybe you shouldn't have subscribed then?  The link to unsubscribe is
on the bottom of EVERY one of those 100 messages.

	http://lists.rubyonrails.org/mailman/listinfo/rails
Peter E. (Guest)
on 2006-05-26 21:05
(Received via mailing list)
$> man how-to-post

1. enable brain
2. type text
3. submit
Pratik N. (Guest)
on 2006-05-26 23:48
(Received via mailing list)
You might be interested in looking at
http://ruby-gnome2.sourceforge.jp/hiki.cgi?Ruby%2F...

Sample : http://mirko.lilik.it/Ruby-GNOME2/moz-snapshooter.rb

Hope that helps.

Thanks,
Pratik
Michael T. (Guest)
on 2006-05-27 04:27
(Received via mailing list)
Perfect.  Two excellent references.  Thanks so much everyone.  This
should save me some time.

Michael
Carmen -. (Guest)
on 2006-05-30 06:31
Michael T. wrote:
> I have this challenge of converting a bit of html into an image file
> and then shrinking it into a thumbnail.  The shrinking part shouldn't
> be too bad, but what advice can you give on converting the html to the
> image?  My thinking is that I will need to convert it to postscript or
> latex and then convert it to an image from there.  Is there an easier
> / better approach?


A single argument to kdesktop will turn any URL into an image. It's part
of KDE.
Michael T. (Guest)
on 2006-05-30 07:12
(Received via mailing list)
Thanks, unfortunately this needs to be a function of the Rails
application.  But I will look into the source for that.  Thanks for
the tip.

Michael
Tim H. (Guest)
on 2006-06-08 10:27
Hi Michael,

Did you get this solved?   Please email me at tmhaines at gmail dot com
if you did.

I've found one way - but it's ugly and skips out of the Rails app.

Tim.
Fez (Guest)
on 2006-06-13 00:15
Tim H. wrote:
> Hi Michael,
>
> Did you get this solved?   Please email me at tmhaines at gmail dot com
> if you did.
>
> I've found one way - but it's ugly and skips out of the Rails app.
>
> Tim.

Hey all, someone just forwarded me this thread.  Thanks everyone for all
the great comments / suggestions thrown out so far.

I am trying to do something very similar, and have a bounty of at least
$250 waiting if someone can come up with a solution. (not much i know
but hey)

http://jobs.rubynow.com/jobs/show/405

Requirements
- all self-contained on a hosted linux box (a dedicated server at
LayeredTech.com w/o KVM access... so, not sure if KDE-type stuff will
work)
- does not require using any pay-for ($$$) webservice like Alexa or
something to grab the images
- shelling out to use external libraries/scripts/etc is A-OK
- does not have to be real-time.  the command that does this can be
called later via automated cron

Something like this (+ the required system libraries/binaries used
installed) would be swell:

class WebGrabber
  # Render the page at url and write at least one image into output_dir.
  def self.grab(url, output_dir)
    raise NotImplementedError, "your magic here"
  end
end

So you could just do:

WebGrabber.grab('http://www.google.com/', '/static/images')

And it would dump at least one image to that directory.  A few default
sizes would be slick as well but a little RMagick after the fact should
do the trick, right?
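
One possible shape for such a class is sketched below. It is not a solution to the rendering problem itself: the actual page rendering is delegated to a renderer object, which is an assumption of this sketch (any callable taking a URL and an output path, for example a wrapper that shells out to a tool like khtml2png). The extra renderer: keyword and the MD5-based file name are likewise illustrative choices, not part of the interface requested above.

```ruby
require "digest"
require "fileutils"
require "uri"

class WebGrabber
  # Render url into an image inside output_dir and return the image path.
  # renderer is a hypothetical callable: renderer.call(url, path) must write
  # an image file at path (for instance by shelling out to an external tool).
  def self.grab(url, output_dir, renderer:)
    uri = URI.parse(url)
    # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
    raise ArgumentError, "not an http(s) URL: #{url}" unless uri.is_a?(URI::HTTP)

    FileUtils.mkdir_p(output_dir)
    # Derive a stable file name from the URL so repeated cron runs overwrite
    # the previous snapshot of the same page.
    path = File.join(output_dir, "#{Digest::MD5.hexdigest(url)}.png")

    renderer.call(url, path)
    raise "renderer produced no file at #{path}" unless File.exist?(path)

    path
  end
end
```

Because the renderer is passed in, the same class can be driven from a cron job with a real external tool and exercised in tests with a stub that just writes a placeholder file. Additional thumbnail sizes could be generated from the returned path afterwards.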

If anyone comes up with the solution I'll PayPal ya the $250 and we can
make the project open source for all to enjoy.  yay!

FYI - this is for a new digg-esque site built in 100% RoR (*not*
open-source sadly as daddy still has to pay the bills):

http://ummyeah.com/

- Fez
Michael T. (Guest)
on 2006-06-14 05:54
(Received via mailing list)
Awesome.  I'm eager to see the results as well.  I have not addressed
this issue as other issues have come in front of it. But I will be
looking into it soon, probably next week.

Michael