Excerpting a summary from my marked-up blog post without breaking HTML tags?

Hi there,

I’m working on a simple blog project. I’d like to automatically
generate a 100-word excerpt of each article to display on the main
page, and require the user to click on the article to read the whole
thing. The problem is, if the hundredth word is in the middle of a
block quote or a heading or something, the whole page gets messed up.
What’s the easiest way to close any tags that might still be open at
the end of the excerpt?

For what it’s worth, posts are marked up using Textile and
SuperRedCloth.

Thanks for your help,

Andrew

On Wed, Jul 23, 2008 at 08:10:39PM -0700, [email protected] wrote:

> I’m working on a simple blog project. I’d like to automatically
> generate a 100 word excerpt of each article to display on the main
> page, and require the user to click on the article to read the whole
> thing. The problem is, if the hundredth word is in the middle of a
> block quote or a heading or something, the whole page gets messed up.
> What’s the easiest way to close any tags that might still be open at
> the end of the excerpt?
>
> For what it’s worth, posts are marked up using textile and
> superredcloth.

When I had to do something like that, I worked directly with the HTML
output, parsed by Hpricot. I used #traverse_text to count words until I
reached the desired word count (i.e. counting only actual words, not
tags), then truncated the text element I was on if necessary and deleted
all subsequent sibling elements, repeating that at each level up the
#parent chain.
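
Here is a rough Ruby sketch of the same idea: count words only in text,
never in tags, stop at the limit, then close whatever is still open. It
does not use Hpricot. To keep it runnable with no gems, it swaps in a
plain regex tokenizer and a stack of open tag names, and it assumes the
Textile has already been rendered to well-formed HTML. The names
excerpt_html and VOID_TAGS are made up for this example and are not
part of any library.

```ruby
# Hypothetical helper, not part of Hpricot or SuperRedCloth.
# Elements that never take a closing tag, so they are not pushed on the stack.
VOID_TAGS = %w[br hr img input meta link].freeze

def excerpt_html(html, max_words)
  out = +""
  open_tags = []          # names of elements opened but not yet closed
  remaining = max_words   # words we may still emit

  # Split the input into tags and the runs of text between them.
  html.scan(/<[^>]+>|[^<]+/) do |token|
    if token.start_with?("<")
      name = token[/\A<\/?([a-zA-Z][a-zA-Z0-9]*)/, 1]
      if name.nil?
        # Comment, doctype, etc.: copy through, nothing to track.
      elsif token.start_with?("</")
        # Assumes well-formed input, so this closes the innermost element.
        open_tags.pop if open_tags.last == name.downcase
      elsif !VOID_TAGS.include?(name.downcase) && !token.end_with?("/>")
        open_tags.push(name.downcase)
      end
      out << token
    else
      words = token.scan(/\S+/)
      if words.length <= remaining
        out << token
        remaining -= words.length
      else
        # Truncate this text run, keeping any leading whitespace.
        out << token[/\A\s*/] << words.first(remaining).join(" ")
        remaining = 0
      end
      break if remaining.zero?
    end
  end

  # Close everything still open, innermost first.
  open_tags.reverse_each { |name| out << "</#{name}>" }
  out
end

html = "<h2>Title here</h2><blockquote><p>one two <em>three four</em> five</p></blockquote>"
puts excerpt_html(html, 5)
```

With a real parser such as Hpricot you get the same effect by editing
the tree instead of tracking a stack by hand, which copes better with
sloppy markup.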

–Greg