Hi,
I have a string containing some ruby code and html tags in-between.
For example,
str = "require 'my_class.rb'<br>require&nbsp;'your_class.rb'<br>:key=&gt;'hello'"
I want these HTML tags and entities ('<br>', '&nbsp;', '&gt;', '&lt;',
etc.) to be replaced by the equivalent Ruby characters ("\n", " ", ">",
"<", etc.).
These html tags can change dynamically according to the inputs.
Is there any way to parse these html tags to equivalent ruby characters?
Thanks in advance…
string = "<br>&gt;&lt;<br>"
string.gsub!("tag", "replacement")
I think you get the idea.
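A small sketch of that idea, assuming a sample string with `<br>`, `&gt;` and `&lt;` in it (the input here is made up for illustration). One thing worth knowing: `gsub!` edits the string in place and returns `nil` when nothing matched, so its return value should not be chained or assigned blindly, whereas `gsub` always returns a string.

```ruby
# Sample input is an assumption, not the original poster's exact string.
string = "a&gt;b<br>c&lt;d"

# gsub returns a new string and leaves the receiver untouched.
decoded = string.gsub('<br>', "\n").gsub('&gt;', '>').gsub('&lt;', '<')

# gsub! edits in place, but returns nil when there was nothing to replace.
copy = string.dup
first_result  = copy.gsub!('<br>', "\n")  # one match: returns the modified string
second_result = copy.gsub!('<br>', "\n")  # no match left: returns nil
```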
On Tue, May 20, 2008 at 2:33 PM, Karthi kn wrote:
–
Appreciated my help?
Recommend me on Working With Rails
http://workingwithrails.com/person/11030-ryan-bigg
Thanks Ryan. But I can't predict which tags I will be getting, because
they are dynamic: any possible tag can come. So if I have to use the
'gsub' method, I will have to write one call for each and every HTML
tag, and that will get big.
So I am looking for an easier way to implement this (something like an
HTML parser).
You never specified what you wanted the other tags replaced with,
either.
On Tue, May 20, 2008 at 3:28 PM, Karthi kn wrote:
On 20 May 2008, at 07:12, Karthi kn wrote:
Sorry. That's my mistake. The final thing I want from the string is
runnable Ruby code. So those tags can be removed from the string
without any replacement.
Now I think the only way to implement this is to use the 'gsub' method
for each and every possible tag.
Well, assuming the only tag with special meaning is <br>, then you can
just convert entities to their respective characters (there are tables
of these), <br> to "\n", and then just replace every other tag with ''.
No need for one regexp per tag for that!
Fred
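Fred's recipe can be sketched roughly as follows. This is a minimal sketch, not a full HTML parser: `ENTITY_TABLE` and `markup_to_source` are names invented for the example, the table is deliberately tiny, and the sample input is assumed. The sketch handles tags before decoding entities, so that a `<` or `>` produced by decoding is never mistaken for a tag delimiter.

```ruby
# Deliberately tiny entity table (an assumption for illustration);
# HTML defines many more named entities than these.
ENTITY_TABLE = {
  '&nbsp;' => ' ',
  '&lt;'   => '<',
  '&gt;'   => '>',
  '&quot;' => '"',
  '&amp;'  => '&'
}.freeze

# Hypothetical helper: turns HTML-ish markup into plain Ruby source text.
def markup_to_source(markup)
  text = markup.gsub(/<br\s*\/?>/i, "\n")     # <br>, <br/>, <BR /> become newlines
  text = text.gsub(/<\/?[A-Za-z][^>]*>/, '')  # every other tag is dropped
  # Decode entities last, in a single pass; unknown entities are left as-is.
  text.gsub(/&[A-Za-z]+;/) { |entity| ENTITY_TABLE.fetch(entity, entity) }
end

puts markup_to_source("<p>require 'my_class.rb'<br>require&nbsp;'your_class.rb'<br>:key=&gt;'hello'</p>")
```

A tag whose attribute value contains a literal `>` will trip up the simple tag pattern, so for arbitrary input a real HTML parser is the safer choice.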
But "&gt;" and "&lt;" need to be replaced with ">" and "<" respectively,
because I will be having some Ruby hash code in the string.
Also, I need to find out all the HTML tags in that string. Is there any
way to find that?
On 20 May 2008, at 09:42, Karthi kn wrote:
But "&gt;" and "&lt;" need to be replaced with ">" and "<" respectively,
because I will be having some Ruby hash code in the string.
I'm not seeing the problem.
Replace entities and then look for
everything between < and >. Change it to a newline if it’s a br, or
just replace it with blank and add it to your list of html tags.
Fred
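For the "list of html tags" part, `String#scan` can collect the tag names. A minimal sketch, where `tag_names` is a hypothetical helper and the sample string is assumed; it scans the raw markup, before any entity decoding, so that a `<` or `>` produced by decoding cannot be mistaken for a tag delimiter.

```ruby
# Hypothetical helper: returns the distinct tag names found in the markup,
# lowercased, in order of first appearance.
def tag_names(markup)
  markup.scan(/<\/?([A-Za-z][A-Za-z0-9]*)[^>]*>/).flatten.map(&:downcase).uniq
end

sample = "<p>require 'a.rb'<br>:key=&gt;'hello'</p><BR />"
p tag_names(sample)  # prints ["p", "br"]
```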
Thanks for your replies. I have got it working the way I wanted. The
following is the code for that.
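markup = markup.gsub('<br>', "\n")                          # line breaks become newlines
markup = markup.gsub(/[\<]([\/])*([A-Za-z0-9])*[\>]/, '')   # strip the remaining simple tags
markup = markup.gsub('&gt;', ">")
markup = markup.gsub('&lt;', "<")
markup = markup.gsub('&nbsp;', " ")
markup = markup.gsub('&amp;', "&")                          # last, so "&amp;lt;" is not decoded twice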
It’s working fine now. But I am not sure whether I have covered all the
tags and characters or not.
On 21 May 2008, at 12:30, Derbee Don wrote:
markup = markup.gsub('&amp;', "&")
It's working fine now. But I am not sure whether I have covered all the
tags and characters or not.
Depends what you are trying to do. There are far more HTML entities
than that (a partial list is here:
http://www.w3schools.com/tags/ref_entities.asp),
and of course there are the Unicode-style ones
(http://theorem.ca/~mvcorks/code/charsets/auto.html).
Fred
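The "unicode style" entities are numeric character references such as `&#62;` or `&#x3E;`. They need no lookup table, since the number is the code point itself. A minimal sketch, where `decode_numeric_entities` is a name invented for the example and the input is assumed to be UTF-8:

```ruby
# Hypothetical helper: decodes decimal (&#62;) and hexadecimal (&#x3E;)
# character references into UTF-8 characters.
def decode_numeric_entities(text)
  text.gsub(/&#(x[0-9A-Fa-f]+|[0-9]+);/i) do
    digits = Regexp.last_match(1)
    codepoint = if digits.start_with?('x', 'X')
                  digits[1..-1].to_i(16)  # hexadecimal form
                else
                  digits.to_i             # decimal form
                end
    [codepoint].pack('U')                 # raises RangeError for values beyond Unicode
  end
end

puts decode_numeric_entities(":key=&#62;'hello' &#x3C;")
```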
I have a very, very strong suspicion that the need is only to
translate character encoding (e.g., &amp; => '&').
It might be worth considering iterating over an array of hashes rather
than repeating the same code with different parameters:
[{:regex=>/<br>/, :decoded=>"\n"},
 {:regex=>/[\<]([\/])*([A-Za-z0-9])*[\>]/, :decoded=>''},
 {:regex=>/&gt;/, :decoded=>'>'}
 # ...
].each do |decoding_hash|
  markup.gsub!(decoding_hash[:regex], decoding_hash[:decoded])
end
The advantage is in keeping the code DRY and making the intentions of
the block a bit clearer.
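With a rule list like this, the order of the rules becomes part of the logic. A minimal sketch of why, using hypothetical names (`DECODING_RULES`, `decode_markup`) and an illustrative, incomplete rule set: tags are stripped before `&gt;` and `&lt;` are decoded, and `&amp;` is decoded last.

```ruby
# Ordered decoding rules as [pattern, replacement] pairs (illustrative,
# not complete). '&amp;' goes last on purpose: decoded earlier,
# "&amp;lt;" would become "&lt;" and then wrongly "<".
DECODING_RULES = [
  [/<br\s*\/?>/i,        "\n"],
  [/<\/?[A-Za-z][^>]*>/, ''],
  ['&nbsp;',             ' '],
  ['&gt;',               '>'],
  ['&lt;',               '<'],
  ['&amp;',              '&']
].freeze

# Applies each rule in turn and returns a new string.
def decode_markup(markup, rules = DECODING_RULES)
  rules.inject(markup) { |text, (pattern, replacement)| text.gsub(pattern, replacement) }
end

# The same input with '&amp;' decoded first shows the double-decoding problem.
wrong_order = [['&amp;', '&'], ['&lt;', '<']]
right = decode_markup('&amp;lt;')               # "&lt;"
wrong = decode_markup('&amp;lt;', wrong_order)  # "<"
```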
On May 21, 7:53 am, Frederick C.