There are several ways to do it with regular expressions, but in any
case, the patterns you want to extract need to be enclosed in
parentheses (which makes them capturing groups).
One way would then be to use String#match on your input string (see http://ruby-doc.org/core-1.9.3/String.html#method-i-match), which
returns an object of type MatchData. The example in the aforementioned
URL shows how you can extract the matched strings from the MatchData
object.
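As a minimal sketch of that approach: the input line below is made up for illustration (the `/user/j_doe` path and the timestamp are assumptions, not taken from the real file), and the pattern assumes the `href` attribute appears before the `datetime` attribute.

```ruby
# Hypothetical input line; the href and datetime values are invented for this example.
line = '<a href="/user/j_doe">J Doe</a> <time datetime="2017-01-30T10:15:00Z">30 January 2017</time>'

# Each parenthesised part of the pattern is a capturing group.
# String#match returns a MatchData object, or nil if nothing matched.
m = line.match(/href="([^"]*)".*?datetime="([^"]*)"/)

if m
  name = m[1].split('/').last   # last path segment of the href
  date = m[2].split('T').first  # date portion of the timestamp
  p [name, date]                # prints ["j_doe", "2017-01-30"]
end
```

`m[0]` would hold the whole matched text, while `m[1]` and `m[2]` hold the two groups in order.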
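txt=<<EEND
Response by … Service
Limited to Jo
Bloggs on 13 September 2016.
Follow up sent to … Service
Limited by Jane
Doe on 3 February 2017.
EEND
# Join the lines, then collect every href="..." or datetime="..." attribute
# as a [name, value] pair and walk through the pairs two at a time.
txt.gsub(/\r?\n/,"").scan(/\b(href|datetime)="(.*?)"/).
each_slice(2) do |(k,v),(k1,v1)|
  # Print only when an href is immediately followed by a datetime.
  p [v.split('/').last,v1.split('T').first] if k=="href" &&
  k1=="datetime"
end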
It’s just that when I run the code, it returns just one result:
["A_Person", "2017-01-30"]
From the 25 or so entries in my text file, there is one line with:
Request to Test Service
Ltd by J Doe.
Annotated by
A Person on 30 January 2017.
I think the code is getting thrown off by this line. I deleted the
“annotated by” part and made it similar to the other lines. Then the
code returns nothing.
Actually, is there a way to just extract "J_Doe"? Forget about
the date/time for now (I can filter this elsewhere).
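One possible reason for the behaviour described above is that each_slice(2) assumes the attributes strictly alternate href, datetime, so an entry with more than one link shifts the pairing. A rough sketch that scans for the href values alone and ignores the dates is shown here. The markup and the `/body/` and `/user/` path prefixes are hypothetical, since the real file's tags are not shown, so the patterns would need adjusting to match it.

```ruby
# Hypothetical markup; the real file's tags and paths may differ.
html = <<EEND
Request to <a href="/body/test_service">Test Service
Ltd</a> by <a href="/user/J_Doe">J Doe</a>.
Annotated by
<a href="/user/A_Person">A Person</a> on <time datetime="2017-01-30T09:00:00Z">30 January 2017</time>.
EEND

# Scan for href values only, so no pairing with datetime is assumed.
# With a single capturing group, scan returns one-element arrays, hence flatten.
names = html.scan(/\bhref="([^"]*)"/).flatten.map { |path| path.split('/').last }
p names  # prints ["test_service", "J_Doe", "A_Person"]

# Keep only links under the assumed /user/ prefix.
users = html.scan(/\bhref="\/user\/([^"\/]+)"/).flatten
p users  # prints ["J_Doe", "A_Person"]
```

Picking out "J_Doe" specifically would then depend on something that distinguishes that link in the real markup, such as its position within the entry or the text just before it.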