Extract a substring

hi

I have a string:
my_string="blablablabla<coordinates>substring</coordinates>blabla"

I need to extract the text between "<coordinates>" and "</coordinates>".

How can I do that?
Thanks for your help
JF

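my_string="blablablabla<coordinates>substring</coordinates>blabla"
# the parentheses below define the capture group within the overall regex pattern
sub_string = /.*<coordinates>(.*)<\/coordinates>.*/.match(my_string)
# [0] is the whole match; [1] is the text captured by the parentheses
puts sub_string[1]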

A regex is the quickest and most effective option for one-off text
parsing. Another good option is Whytheluckystiff's Hpricot:
http://code.whytheluckystiff.net/hpricot/

Hank

On Sep 21, 2008, at 5:03 PM, blasterpal wrote:

Hank
You probably want the regexp to be:
/<coordinates>(.*)<\/coordinates>/
so there's less backtracking when the .* first tries to gobble
everything.
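
A quick sketch of that point, using the sample string from the original question (the constant names are only illustrative, not from the thread). Both patterns should capture the same text in group 1; only the full match differs:

```ruby
# Sample string from the original question.
SAMPLE = "blablablabla<coordinates>substring</coordinates>blabla"

# Pattern with leading and trailing .* (it matches the whole string).
PADDED = /.*<coordinates>(.*)<\/coordinates>.*/
# Shorter pattern without the surrounding .*
SHORT = /<coordinates>(.*)<\/coordinates>/

puts PADDED.match(SAMPLE)[1]  # substring
puts SHORT.match(SAMPLE)[1]   # substring
puts PADDED.match(SAMPLE)[0]  # the whole input string
puts SHORT.match(SAMPLE)[0]   # <coordinates>substring</coordinates>
```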

You might also need something like:
/<coordinates\b[^>]*>(.*)<\/coordinates>/
if there can be any attributes on the coordinates tag. Of course, if
you really do have XML in my_string, a true parser like Hpricot or
REXML will be more reliable than regular expressions. For example, if
you had to match against:
"blahblah<coordinates>first one</coordinates>yadayadayada<coordinates>oops! another one</coordinates>yakyakyak"
would you want the substring to be:
"first one</coordinates>yadayadayada<coordinates>oops! another one"
(yeah, I didn't think so ;-)
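
To make the pitfall concrete, here is a small sketch run against the two-element example. The non-greedy `.*?` quantifier and `String#scan` are not mentioned in this thread; they are shown as one possible regex-only workaround, assuming the elements are never nested. The constant names are illustrative.

```ruby
TWO_TAGS = "blahblah<coordinates>first one</coordinates>yadayadayada" \
           "<coordinates>oops! another one</coordinates>yakyakyak"

GREEDY = /<coordinates\b[^>]*>(.*)<\/coordinates>/
LAZY   = /<coordinates\b[^>]*>(.*?)<\/coordinates>/

# Greedy .* runs on to the LAST closing tag.
puts GREEDY.match(TWO_TAGS)[1]
# first one</coordinates>yadayadayada<coordinates>oops! another one

# Non-greedy .*? stops at the first closing tag.
puts LAZY.match(TWO_TAGS)[1]   # first one

# scan collects the capture from every match.
p TWO_TAGS.scan(LAZY).flatten  # ["first one", "oops! another one"]
```

A real parser is still the safer choice once attributes, nesting, or entities show up.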

-Rob

Rob B. http://agileconsultingllc.com

On Sun, Sep 21, 2008 at 1:28 PM, jef wrote:

hi

I have a string:
my_string="blablablabla<coordinates>substring</coordinates>blabla"

I need to extract the text between "<coordinates>" and "</coordinates>".

How can I do that?

Hi, I would recommend using Hpricot. You can find the documentation
here:

http://code.whytheluckystiff.net/doc/hpricot

Good luck,

-Conrad

Regexp#match(string) returns a MatchData object, which is more than
just the matched text: it can be indexed like an Array. So:
sub[0] returns the entire matched string
sub[1], sub[2], ... return the text matched by each capture group
(the parts of the pattern between parentheses).

sub[1] is therefore the thing you want to use. No need to use to_s.
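
A minimal sketch of the above, assuming the sample string from the original question. `first_capture` is a hypothetical helper added only for illustration; it also guards against `match` returning nil when nothing matches.

```ruby
COORD_RE = /.*<coordinates>(.*)<\/coordinates>.*/
INPUT = "blablablabla<coordinates>substring</coordinates>blabla"

# Hypothetical helper: returns the first capture, or nil if there is no match.
def first_capture(str)
  m = COORD_RE.match(str)
  m && m[1]
end

sub = COORD_RE.match(INPUT)
puts sub[0]     # entire matched text (here, the whole input)
puts sub[1]     # substring
p sub.captures  # ["substring"]
puts sub.to_s   # same as sub[0], which is why to_s gives the whole string

p first_capture("no tags here")  # nil
```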

ah ok

thank you all for your help

hi

The regexp you gave, /.*<coordinates>(.*)<\/coordinates>.*/, works
fine. I tested it with Rubular.

The problem is that I can't retrieve the substring: I always get the whole string.

Here is what I did:

irb(main):001:0> st="<coordinates>-0.954850,46.436960,0</coordinates>"
=> "<coordinates>-0.954850,46.436960,0</coordinates>"
irb(main):002:0> sub=/.*<coordinates>(.*)<\/coordinates>.*/.match(st)
=> #<MatchData:0x7f2040045fd0>
irb(main):003:0> sub.inspect
=> "#<MatchData:0x7f2040045fd0>"
irb(main):004:0> sub.to_s
=> "<coordinates>-0.954850,46.436960,0</coordinates>"
irb(main):005:0> sub.string
=> "<coordinates>-0.954850,46.436960,0</coordinates>"
irb(main):006:0> st.match(/.*<coordinates>(.*)<\/coordinates>.*/)
=> #<MatchData:0x7f2040019fc0>
irb(main):007:0> st.match(/.*<coordinates>(.*)<\/coordinates>.*/).to_s
=> "<coordinates>-0.954850,46.436960,0</coordinates>"

thank you for your help
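
For this particular case, a short sketch of how the coordinates could be pulled out, assuming `st` held the values wrapped in a coordinates element. Indexing the MatchData with 1 returns only the captured part, and `String#[]` with a regexp plus a capture index does the same in one step. The constant names are illustrative.

```ruby
ST = "<coordinates>-0.954850,46.436960,0</coordinates>"
TAG_RE = /<coordinates>(.*)<\/coordinates>/

puts TAG_RE.match(ST)[1]    # -0.954850,46.436960,0
puts ST[TAG_RE, 1]          # same result via String#[]
puts TAG_RE.match(ST).to_s  # whole match, tags included
```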
