Removing a block of text within a string

ryan · December 7, 2006, 7:13am

When you first visit my site, you see a snippet (the first 75 words) of
the most recent post. If it exceeds 75 words, a link will be appended
to extend the post and read it in full.

Now, my problem is that when I post a code snippet and use “pre” tags to
preserve its formatting, I don’t want that to show up in the snippet…
only in the full version of the post. The reason is that the site
gets all funky if the 75-word mark happens to fall between the opening
“pre” and the closing “pre”, so that they are not properly matched.

I’m using textile for text formatting, so in my view I currently have
something like this:

<%= to_html(post.body) %>

The “to_html” helper simply converts the textile text into html. I
would like to write a helper that would accept the converted html text
as a parameter, and just completely remove the <pre> snippet </pre>
block if I’m not on the full post. So, I’d like something like this:

<%= strip_code_snippets(to_html(post.body)) %>

Can someone help me with how to do this? What would the
“strip_code_snippets” method look like? I think I’d be fine finding the
<pre> tag, but I don’t know how to completely remove the text from <pre>
to </pre>. Any help would be greatly appreciated. Thanks in advance.

ryan · December 7, 2006, 7:47am

my_string = "<pre>dflakdjflakj</pre>"

my_string.gsub(/[^<pre>][</pre>$], '')

You basically create a regular expression to find the <pre> and </pre>
and nuke all the characters when you find the match.

I don’t think my regex is correct. Advanced Ruby developers here can
help.

ryan · December 7, 2006, 2:37pm

Ok, that’s what I thought it would be. I’m not very good at regular
expressions, as I’ve had little experience with them. I’ll try it out,
though. Thanks a lot. Anyone else know if this regex will work? Or
have a better one? Thanks.

ryan · December 7, 2006, 4:53pm

On Dec 7, 2006, at 1:45 AM, Bala P. wrote:

You basically create a regular expression to find the <pre> and </pre>
and nuke all the characters when you find the match.

I don’t think my regex is correct. Advanced Ruby developers here can
help.

my_string = "This is my little block of code <pre>puts 'I need to learn about Regexp'\nputs 'Will you help?'</pre>"
=> "This is my little block of code <pre>puts 'I need to learn about Regexp'\nputs 'Will you help?'</pre>"
regexp = %r{<pre\b[^>]*>.*?</pre>}m
=> /<pre\b[^>]*>.*?<\/pre>/m
my_string.gsub(regexp, '')
=> "This is my little block of code "
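As a rough sketch, the irb session above could be wrapped into the helper named in the original question. Only the name “strip_code_snippets” comes from the thread; the constant name PRE_BLOCK, the sample string, and the method body are assumptions for illustration, and a regexp like this will not cope with nested or malformed tags.

```ruby
# Sketch only: the helper name comes from the question; the body is an assumption.
# <pre      literal text
# \b        word boundary, so "<pre" does not match a longer tag name such as "<prefix"
# [^>]*>    any attributes, up to the closing > of the opening tag
# .*?</pre> the shortest run of characters that reaches a closing tag
PRE_BLOCK = %r{<pre\b[^>]*>.*?</pre>}m

def strip_code_snippets(html)
  # gsub returns a new string with every match replaced by the empty string
  html.gsub(PRE_BLOCK, '')
end

sample = "Intro <pre class=\"ruby\">puts 1\nputs 2</pre> outro"
puts strip_code_snippets(sample)
```

If the 75-word cut is applied after this helper runs, the cut can no longer land between an opening and a closing “pre”.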

The square brackets […] enclose a character set and [^…] encloses
a negated set (not one of those characters). The previous posting
isn’t even well-formed syntactically.

Here’s a little explanation to get you started:

regexp = %r{<pre\b[^>]*>.*?</pre>}m

[^>]   = character set matching anything that's NOT a >
*      = zero or more times
>      = literal >
.*?    = any character (.) repeated zero or more times, but as few as
         possible to let the regexp match (*?)
         Note: this is getting rather advanced, you can look at the
         pickaxe pp.68-77
</pre> = literal character matching
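To see why the “as few as possible” part matters, here is a small comparison on a made-up string with two code blocks. The variable names and sample text are assumptions for illustration only.

```ruby
html = "<pre>one</pre> keep me <pre>two</pre>"

greedy = %r{<pre\b[^>]*>.*</pre>}m   # .* grabs as much as it can
lazy   = %r{<pre\b[^>]*>.*?</pre>}m  # .*? stops at the first closing tag

# The greedy version runs from the first opening tag to the LAST closing tag,
# so the text between the two blocks is removed as well.
greedy_result = html.gsub(greedy, '')

# The lazy version removes each block separately and keeps the text between them.
lazy_result = html.gsub(lazy, '')

puts greedy_result.inspect  # ""
puts lazy_result.inspect    # " keep me "
```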

The %r{ } is an alternate way to write a literal regular expression
which I used in lieu of escaping the / in the /pre. You can see the
equivalent form that irb printed as the value.
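A quick way to convince yourself that the two spellings behave the same is to run both against one string. The variable names and the sample string below are made up for the illustration.

```ruby
html = "text <pre>code</pre> more text"

with_percent_r = %r{<pre\b[^>]*>.*?</pre>}m  # no escaping needed for the / in </pre>
with_slashes   = /<pre\b[^>]*>.*?<\/pre>/m   # the / must be written as \/ here

percent_r_result = html.gsub(with_percent_r, '')
slashes_result   = html.gsub(with_slashes, '')

# Both literals strip the same block from the sample string.
puts percent_r_result == slashes_result
```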

The ‘m’ at the end is a flag to match multi-line input. It turns the
‘.’ from matching “any character except newline” to simply “any
character”.
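The effect of the flag is easy to check on a block that spans two lines. This is only a sketch; the sample text and variable names are assumptions.

```ruby
text = "before <pre>line one\nline two</pre> after"

without_m = %r{<pre\b[^>]*>.*?</pre>}   # . will not cross the newline
with_m    = %r{<pre\b[^>]*>.*?</pre>}m  # . matches the newline too

# Without the flag the match fails at the newline, so nothing is removed.
unchanged = text.gsub(without_m, '')

# With the flag the whole two-line block is matched and removed.
stripped = text.gsub(with_m, '')

puts unchanged == text   # true
puts stripped.inspect    # "before  after"
```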

-Rob

Rob B. http://agileconsultingllc.com

ryan · December 7, 2006, 5:57pm

I haven’t seen this come back from the list so I thought a copy
directly to you might be useful.

-Rob

Begin forwarded message:

ryan · December 7, 2006, 10:57pm

Sorry if this gets dup’d, I haven’t seen it hit the list after 8hrs.

-Rob

Rob B. http://agileconsultingllc.com