I have this challenge of converting a bit of html into an image file and then shrinking it into a thumbnail. The shrinking part shouldn't be too bad, but what advice can you give on converting the html to the image? My thinking is that I will need to convert it to postscript or latex and then convert it to an image from there. Is there an easier / better approach? Michael
on 2006-05-25 15:57
on 2006-05-25 16:36
"Converting HTML to an image" is a fairly complicated task. That's why browsers exist and are so complicated themselves. Even a "good" html2ps utility will probably give only a rough approximation and could easily destroy the look of most websites. It depends on whether you mean thumbnailing websites or some sort of HTML document that you have. That being said, if getting the exact look right is not too important, then I'd use that method. -hampton.
on 2006-05-25 16:57
Thanks. Accuracy is not important since these will probably end up being 50 x 50 or 100 x 100. We're just looking for somewhat of a visual representation. I just want to be sure before I dig in that I'm not missing something a lot simpler. Michael
on 2006-05-25 21:10
I'm not sure how you could go about this, but you might be able to just print screen it and crop it automatically. That might be simpler, but I'm not sure.
on 2006-05-25 23:25
Thanks Matt, but this needs to be a function of the application. It is snapshotting just a portion of the page, turning that into an image, and indexing the results. Michael
on 2006-05-26 11:48
Hi, I think that this task is a hard one. The only nice solution I have ever read about is to use a browser library on the server to render the HTML and take a snapshot of the rendered page. Please note that this will increase the complexity of your server setup. For some inspiration, you can look at khtml2png, html2png, and www.hackdiary.com/archives/000055.html. Paolo
on 2006-05-26 12:06
Paolo Negri wrote:
> Hi
>
> I think that this task is a hard one. The only nice solution I ever
> read of is to use a browser library on the server to render the html
> and take a snapshot of the rendered page.
> Please note that this will grow the complexity of your server setup.
>
> you can look at khtml2png html2png
> www.hackdiary.com/archives/000055.html to have some inspiration.

In Linux:
1. Convert your HTML file to .pdf (OpenOffice can do this).
2. Open the .pdf file with ImageMagick.
3. Save it as .jpg, .png, etc.

Hope this helps, Antonio
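For anyone who wants to script the three steps above, here is a rough Ruby sketch. It only builds the two command lines and does not run them. Assumptions that are not from the thread: the office suite is reachable as `soffice` and accepts the headless `--convert-to` options (true of current LibreOffice, possibly not of older OpenOffice builds), ImageMagick's `convert` is on the PATH with PDF support available (it normally relies on Ghostscript for that), and the helper name `html_to_thumbnail_commands` is made up for illustration.

```ruby
require "shellwords"

# Build the shell commands for the HTML -> PDF -> image route.
# Nothing is executed here; the caller decides when and how to run them.
# Assumes `soffice` (LibreOffice/OpenOffice) and ImageMagick's `convert`.
def html_to_thumbnail_commands(html_path, out_dir, size: "100x100")
  base = File.basename(html_path, ".*")
  pdf  = File.join(out_dir, "#{base}.pdf")
  png  = File.join(out_dir, "#{base}.png")

  [
    # Step 1: the office suite, in headless mode, writes <base>.pdf into out_dir.
    ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, html_path],
    # Steps 2 and 3: ImageMagick reads the first page of the PDF ("[0]")
    # and saves it as an image that fits within the requested size.
    ["convert", "#{pdf}[0]", "-thumbnail", size, png]
  ]
end

# Print the commands in copy-and-paste form.
html_to_thumbnail_commands("snippet.html", "/tmp/thumbs").each do |argv|
  puts Shellwords.join(argv)
end
```

To actually run a command, pass the array to `system(*argv)`, which avoids shell quoting problems with odd file names.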
on 2006-05-26 12:24
If all the pages you want to get pictures of have the same layout and colours, you could take a screenshot of the layout, then grab any dynamic text from the HTML (or straight from the db or whatever) and overlay it in the right position. Depending on how much work you want to do, you could customise it beyond plain text, too. As long as your pages have a certain amount of consistency, you could do a better job than any existing program/plug-in, because existing programs wouldn't take that consistency into account. Hope this helps, -N
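One possible way to do the overlay idea above is to let ImageMagick draw the text onto the saved layout screenshot. The sketch below only builds the `convert` command; the helper name `overlay_command`, the file names, and the label positions are all made up for illustration, and the font size and colour are arbitrary choices.

```ruby
# Build an ImageMagick command that draws text labels onto a fixed
# screenshot of the page layout. Each label is [x, y, text], with x and y
# given in pixels from the top-left corner of the image.
# Assumes ImageMagick's `convert` is installed; nothing is executed here.
def overlay_command(template_png, labels, output_png, pointsize: 12)
  argv = ["convert", template_png, "-pointsize", pointsize.to_s, "-fill", "black"]
  labels.each do |x, y, text|
    # -annotate takes an offset geometry followed by the text to draw.
    argv += ["-annotate", "+#{x}+#{y}", text]
  end
  argv << output_png
end

argv = overlay_command(
  "layout.png",
  [[20, 40, "Page title"], [20, 80, "First line of body text"]],
  "preview.png"
)
# Run it with: system(*argv)
```

If the text comes from users, it is worth cleaning it first, since ImageMagick treats some characters in annotation text specially (for example a leading `@` or `%` escapes).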
on 2006-05-26 12:49
If you're on a Mac, you could use WebKit to render into an offscreen view and just grab the view to an image...
on 2006-05-26 17:08
Wow, thanks for all of the good ideas. I really appreciate it. I'll be digging into this this weekend and early next week so if I find a workable solution, I might post something just so folks know how it all turned out. Michael
on 2006-05-26 17:59
Jonathan del Strother wrote: > If you're on a Mac, you could use WebKit to render into an offscreen > view and just grab the view to an image... There's an app that does just this: http://www.derailer.org/paparazzi/ Lets you pick the rendered window size, clip the page, all sorts of good stuff.
on 2006-05-26 18:17
I have been getting all the email from your list: over 100 messages since early this morning. I don't want to receive any more. Thanks, Tom Shorebird
on 2006-05-26 18:21
On 26-May-06, at 11:14 AM, tom Shorebird wrote: > I have been getting all the email from your list. over 100 > messages since earling this morning. > I don't want to receive anymore > Thanks > Tom Shorebird Maybe you shouldn't have subscribed then? The link to unsubscribe is on the bottom of EVERY one of those 100 messages. http://lists.rubyonrails.org/mailman/listinfo/rails
on 2006-05-26 18:30
on 2006-05-26 19:05
$> man how-to-post 1. enable brain 2. type text 3. submit
on 2006-05-26 21:48
You might be interested in looking at http://ruby-gnome2.sourceforge.jp/hiki.cgi?Ruby%2F... Sample : http://mirko.lilik.it/Ruby-GNOME2/moz-snapshooter.rb Hope that helps. Thanks, Pratik
on 2006-05-27 02:27
Perfect. Two excellent references. Thanks so much everyone. This should save me some time. Michael
on 2006-05-30 04:31
Michael Trier wrote: > I have this challenge of converting a bit of html into an image file > and then shrinking it into a thumbnail. The shrinking part shouldn't > be too bad, but what advice can you give on converting the html to the > image? My thinking is that I will need to convert it to postscript or > latex and then convert it to an image from there. Is there an easier > / better approach? A single argument to kdesktop will turn any URL into an image. It's part of KDE.
on 2006-05-30 05:12
Thanks, unfortunately this needs to be a function of the Rails application. But I will look into the source for that. Thanks for the tip. Michael
on 2006-06-08 08:27
Hi Michael, Did you get this solved? Please email me at tmhaines at gmail dot com if you did. I've found one way - but it's ugly and skips out of the Rails app. Tim.
on 2006-06-12 22:15
Tim Haines wrote:
> Hi Michael,
>
> Did you get this solved? Please email me at tmhaines at gmail dot com
> if you did.
>
> I've found one way - but it's ugly and skips out of the Rails app.
>
> Tim.

Hey all, someone just forwarded me this thread. Thanks everyone for all the great comments / suggestions thrown out so far. I am trying to do something very similar, and have a bounty of at least $250 waiting if someone can come up with a solution (not much, I know, but hey): http://jobs.rubynow.com/jobs/show/405

Requirements:
- All self-contained on a hosted Linux box (a dedicated server at LayeredTech.com w/o KVM access, so I'm not sure if KDE-type stuff will work)
- Does not require using any pay-for ($$$) web service like Alexa or something to grab the images
- Shelling out to use external libraries/scripts/etc. is A-OK
- Does not have to be real-time; the command that does this can be called later via automated cron

Something like this (plus the required system libraries/binaries installed) would be swell:

    class WebGrabber
      def self.grab(url, output_dir)
        ... (your magic here) ...
      end
    end

So you could just do:

    WebGrabber.grab('http://www.google.com/', '/static/images')

And it would dump at least one image to that directory. A few default sizes would be slick as well, but a little RMagick after the fact should do the trick, right?

If anyone comes up with the solution, I'll PayPal ya the $250 and we can make the project open source for all to enjoy. Yay!

FYI - this is for a new digg-esque site built in 100% RoR (*not* open source, sadly, as daddy still has to pay the bills): http://ummyeah.com/

- Fez
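A rough sketch of the interface described above, not a claim on the bounty and not anything posted in this thread. It assumes the rendering itself is done by an external command-line tool. The default here shells out to `wkhtmltoimage URL OUTPUT`, which is only one possible choice and may need a virtual display such as Xvfb on a headless box, depending on the build. The `renderer:` keyword and the host-based file name are made up for illustration; the keyword exists so a different tool (or a fake one for testing) can be swapped in.

```ruby
require "fileutils"
require "uri"

class WebGrabber
  # Default renderer: shells out to wkhtmltoimage. This is an assumption;
  # any tool that takes a URL and an output path would fit here.
  # system returns true on success, false or nil on failure.
  DEFAULT_RENDERER = lambda do |url, path|
    system("wkhtmltoimage", url, path)
  end

  # Renders +url+ into +output_dir+ and returns the image path,
  # or nil if the renderer reported failure.
  def self.grab(url, output_dir, renderer: DEFAULT_RENDERER)
    host = URI.parse(url).host
    raise ArgumentError, "no host in #{url.inspect}" if host.nil?

    FileUtils.mkdir_p(output_dir)
    # Name the image after the host, e.g. www.google.com.png (illustrative).
    path = File.join(output_dir, "#{host}.png")
    renderer.call(url, path) ? path : nil
  end
end

# Usage, once a renderer is installed:
#   WebGrabber.grab('http://www.google.com/', '/static/images')
```

Because nothing here has to be real-time, the call could sit in a script run from cron, with the resizing to a few default sizes done afterwards in RMagick as suggested above.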
on 2006-06-14 03:54
Awesome. I'm eager to see the results as well. I have not addressed this issue as other issues have come in front of it. But I will be looking into it soon, probably next week. Michael