Regex for "not matching" an unneeded prefix substring?

Hi all,

I’m new to Ruby and even newer to regex. I’m trying to write my first
[useful] Ruby program and need a way to cut out an unneeded prefix
substring and retain the substring that comes after it.

Here are the actual details from my code:

result.each do |item|
  price = item.search(".price").text.match(/\d+[.]\d+/)
  condition = item.search(".condition").text.match(/Used - ([^,]+)/)
  rating = item.search(".rating a").text.to_i
  seller = item.search(".seller b").text
  puts "#{price} - #{condition} - #{rating} - #{seller}"
end

The one from condition [in the code above] is the one that is giving me
a challenge. The string that is sent to condition will always be exactly
one of the following and nothing else at all:

“Used - Like New”
“Used - Very Good”
“Used - Good”
“Used - Acceptable”

I’m trying to get them to display as the following in the puts at the
end of my code:

“Like New”
“Very Good”
“Good”
“Acceptable”

The regex that I’ve got there in the condition line works in Rubular,
but not in my code. I’m running 1.8.7 if that matters…

One last thing that I also don’t understand: in Rubular, my regex for
price shows the match in the “Match result:” line, but the regex for
condition shows the whole string as a match in the “Match result:” line
and shows the correctly matching substring only in the “Match captures:”
line.

I’m grateful for this great resource (the list/forum) and would be very
happy to hear from anyone who can help me sort this out!

Thanks in advance,
J

On 02/26/2010 22:20, Jet K. wrote:

I’m trying to get them to display as the following in the puts at the
end of my code:

“Like New”
“Very Good”
“Good”
“Acceptable”

If you just want to get rid of the word “Used”, you could use something
like this:

text = "Used - Like New"
text[7, text.length]
=> "Like New"

Regards

Alexander J. wrote:

If you just want to get rid of the word “Used”, you could use something
like this:

text = "Used - Like New"
text[7, text.length]
=> "Like New"

Regards

It works! :) I had to change 7 to 8 to get rid of an extra space, but
that did it in a far less complex way than using regex! Thanks.
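
A fixed offset breaks as soon as the scraped text carries stray whitespace, which may be why 8 was needed here instead of 7. Below is a minimal sketch that does not depend on the offset. It assumes the raw text has one leading space and a trailing newline; the sample string and the helper name strip_used_prefix are made up for illustration and are not from the original program.

```ruby
# Hypothetical helper: trims surrounding whitespace, then removes the
# "Used - " prefix if it is present at the start of the string.
def strip_used_prefix(raw)
  raw.strip.sub(/\AUsed\s+-\s+/, "")
end

raw = " Used - Like New\n"               # assumed shape of the scraped text

by_offset = raw[8, raw.length].chomp     # only right for exactly one leading space
by_prefix = strip_used_prefix(raw)       # independent of the amount of whitespace

puts by_offset                           # Like New
puts by_prefix                           # Like New
puts strip_used_prefix("Used - Good")    # Good
```

Both sub and strip are available in Ruby 1.8.7, so this should work on the version mentioned in the question.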

2010/2/26 Jet K.:

condition = item.search(".condition").text.match(/Used - ([^,]+)/)
“Used - Very Good”

The regex that I’ve got there in the condition line works in Rubular,
but not in my code. I’m running 1.8.7 if that matters…

I am not sure which regexp you are referring to specifically.
However, you can do this

irb(main):001:0> s = "Used - Like New"
=> "Used - Like New"
irb(main):002:0> s[/\AUsed\s+-\s+(.*)\z/, 1]
=> "Like New"
irb(main):003:0> s[7..-1]
=> "Like New"

String#[] with regular expression is a very powerful tool - especially
when used with grouping as in this case.
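
To make the difference concrete, here is a minimal sketch contrasting String#match with String#[]. It assumes the scraped text is exactly "Used - Like New"; the variable names are illustrative only. match returns a MatchData object, and interpolating that object prints the whole match, which is likely why the original code still showed the "Used - " prefix and why Rubular lists the full string under "Match result:" and the group under "Match captures:".

```ruby
text = "Used - Like New"   # stand-in for item.search(".condition").text

# String#match returns a MatchData object, not a String.
m = text.match(/Used - ([^,]+)/)
whole_match = m.to_s   # the entire matched text (what "#{m}" would print)
first_group = m[1]     # only the first capture group

# String#[] with a regexp and a group index returns the capture directly,
# or nil when the pattern does not match.
condition = text[/\AUsed\s+-\s+(.*)\z/, 1]

puts whole_match   # Used - Like New
puts first_group   # Like New
puts condition     # Like New
```

The price regex has no capture group, so for it the whole match and the wanted text are the same thing.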

One last thing that I also don’t understand: in Rubular, my regex for
price shows the match in the “Match result:” line, but the regex for
condition shows the whole string as a match in the “Match result:” line
and shows the correctly matching substring only in the “Match captures:”
line.

I am having difficulty following you here since I don’t know what
“item” is in your case. It’s probably easier if you provide a simple
test case that demonstrates your point. Using IRB often helps as well.

I’m grateful for this great resource (the list/forum) and would be very
happy to hear from anyone who can help me sort this out!

We’ll try to help but please provide a bit more information.

Kind regards

robert

Jet K. wrote:

It works!

Hmmm, well, actually, it kind of works. I did this:

result.each do |item|
  price = item.search(".price").text.match(/\d+[.]\d+/)
  condition = item.search(".condition").text
  rating = item.search(".rating a").text.to_i
  seller = item.search(".seller b").text
  puts "#{price} - #{condition.chomp[8, condition.length]} - #{rating} - #{seller}"
end

and then realized I actually need to be able to just put #{condition} by
itself in the puts and not use #{condition.chomp[8, condition.length]}

But when I tried, I found that I don’t know how to adjust the code in
the block above. Can someone help again?

Hi Robert,

Thanks a lot. I’ve discovered that there are many ways of achieving this
goal, whether it’s through regex, ranges, or even split (as a friend
offline just advised me of).

I’ve gotten it working for now, but I’ll likely be back eventually when
the next question arises. :)
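
For comparison, the three approaches mentioned above (regex, range, split) can be sketched side by side. This assumes the input is exactly one of the four "Used - ..." strings with no surrounding whitespace; the variable names are illustrative.

```ruby
text = "Used - Very Good"

by_regex = text[/\AUsed - (.*)\z/, 1]   # capture group 1 of the regexp
by_range = text[7..-1]                  # everything from index 7 to the end
by_split = text.split(" - ", 2).last    # split once on the separator, keep the tail

puts by_regex   # Very Good
puts by_range   # Very Good
puts by_split   # Very Good
```

The regex and split versions key off the separator, so they keep working if the prefix changes length. The range version depends on the prefix being exactly seven characters long.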

On 02/26/2010 23:06, Jet K. wrote:

and then realized I actually need to be able to just put #{condition} by
itself in the puts and not use #{condition.chomp[8, condition.length]}

Insert

condition = condition.chomp[8, condition.length]

after

condition = item.search(".condition").text

and you can use #{condition} in the string.
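
Put together, the reassignment looks roughly like this. It is a minimal sketch with stand-in strings instead of the scraped item objects, and it uses offset 7 because these sample strings carry no leading space (the code in this thread used 8).

```ruby
# Stand-in data: the real code gets these strings from
# item.search(".condition").text.
raw_conditions = ["Used - Like New\n", "Used - Good\n"]

cleaned = raw_conditions.map do |raw|
  condition = raw.chomp                        # drop the trailing newline
  condition = condition[7, condition.length]   # reassign: keep what follows "Used - "
  condition
end

# After the reassignment, plain #{condition} is enough in the output string.
cleaned.each { |condition| puts "condition: #{condition}" }
```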

Regards

Jet K. wrote:

The one from condition [in the code above] is the one that is giving me
a challenge. The string that is sent to condition will always be exactly
one of the following and nothing else at all:

“Used - Like New”
“Used - Very Good”
“Used - Good”
“Used - Acceptable”
(…)

This is another option, avoiding regular expressions. It’s kind of old
school, but it’s fast, flexible, and handles garbage.

sanitize_condition = Hash.new("Unknown")
sanitize_condition["Used - Like New"] = "Like New"
sanitize_condition["Used - Very Good"] = "Very Good"
sanitize_condition["Used - Good"] = "Good"
sanitize_condition["Used - Acceptable"] = "Acceptable"
sanitize_condition["Used - Broken"] = "Kaput"

demo_conditions = ["Used - Like New", "", nil, "Used - Broken", "Used - Acceptable", "garble"]
demo_conditions.each { |cond| puts sanitize_condition[cond] }
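
A variation on the same idea, as a sketch rather than a replacement for the code above: the table is written as a hash literal and the fallback is supplied at lookup time with Hash#fetch. The names labels, inputs and results are made up for this example.

```ruby
# Lookup table as a literal; keys are the raw strings, values the display text.
labels = {
  "Used - Like New"   => "Like New",
  "Used - Very Good"  => "Very Good",
  "Used - Good"       => "Good",
  "Used - Acceptable" => "Acceptable"
}

inputs = ["Used - Like New", "", nil, "Used - Acceptable", "garble"]

# fetch returns the second argument when the key is missing, so empty
# strings, nil and unexpected text all fall back to "Unknown".
results = inputs.map { |raw| labels.fetch(raw, "Unknown") }

puts results
```

Here the default lives at the call site, so different callers can choose different fallbacks for the same table.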

hth,

Siep

Hi Siep,

Thanks! My offline friend suggested that I refactor everything into a
hash, because the condition info is just one criterion of many that I am
pulling into my app…

but it is making my head spin because I am so new to Ruby, so I’m going
to take a break and then look at it again and also look at the
documentation for Hash and see what I can come up with.

My friend also suggested that I write pseudocode for all my desired
functionality and that that could help a lot. I have a prioritized list
for now, but it is making my head hurt to try and do so much that I
don’t know how to do! :)

I can’t say enough how helpful the list/forum is, and that I’m very
grateful for everyone using their free time to help me along.

On 02/27/2010 12:23 AM, Jet K. wrote:

Thanks a lot. I’ve discovered that there are many ways of achieving this
goal, whether it’s through regex, ranges, or even split (as a friend
offline just advised me of).

That’s often the case with Ruby - and many of those ways are also
elegant.

I’ve gotten it working for now, but I’ll likely be back eventually when
the next question arises. :)

“I’ll be back.” - oooh… ;)

Cheers

robert
