Complex GSUB query

Hello all,

I am struggling with something and have not yet been able to find anything
that may help me.

I have a string like the following:

string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

I want to break this string up into two entries using a _ separator, one
for Joe and the other for Bill. I could do this with a simple

string.gsub(",", "_")

However the problem with doing this is that there are commas elsewhere
in the string. So what I need to say is, if the comma is outside of “”
(quotes) replace it with the _

Could anyone possibly help me with this?

Thanks

Ne Scripter wrote:

string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"

I want to break this string up into two entries using a _ separator, one
for Joe and the other for Bill. I could do this with a simple

string.gsub(",", "_")

However the problem with doing this is that there are commas elsewhere
in the string. So what I need to say is, if the comma is outside of “”
(quotes) replace it with the _

Could anyone possibly help me with this?

Thanks

I'd do:
string.gsub!(", \"", "_ \"")

If the comma is followed by a space and a double quote, replace that
with an underscore, a space and a double quote.
But that's because I'm really lazy.

Hi –

On Wed, 21 Oct 2009, Ne Scripter wrote:

I want to break this string up into two entries using a _ separator, one
for Joe and the other for Bill. I could do this with a simple

string.gsub(",", "_")

However the problem with doing this is that there are commas elsewhere
in the string. So what I need to say is, if the comma is outside of “”
(quotes) replace it with the _

Could anyone possibly help me with this?

It looks like the pattern /, "/ occurs at the end of one record into
the beginning of the next one, and nowhere else. Assuming that’s
correct, it suggests something like:

string.gsub(/,(?=\s+")/, '_')

i.e., for any comma which is followed by some whitespace and a double
quote character, replace the comma with an underscore.
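As a minimal sketch of how that behaves on the sample data from the original post (written here as a single-quoted literal to avoid escaping; the variable name `result` is only for illustration):

```ruby
# Sample data from the original post: two records separated by a comma.
string = '"bloggs, Joe (JBloggs, INFO)" joebloggs, "bloggs, Bill (BBloggs, INFO)" billbloggs'

# (?=\s+") is a zero-width lookahead: the comma must be followed by
# whitespace and a double quote, but those characters are not consumed,
# so only the comma itself is replaced.
result = string.gsub(/,(?=\s+")/, '_')

puts result
# "bloggs, Joe (JBloggs, INFO)" joebloggs_ "bloggs, Bill (BBloggs, INFO)" billbloggs
```

The commas inside the quoted names and inside the parentheses are left alone because none of them is followed by whitespace and then a double quote.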

David


The Compleat Rubyist: Ruby training with D. Black, G. Brown, J. McAnally
Jan 22-23, 2010, Tampa, FL
http://www.thecompleatrubyist.com

David A. Black/Ruby Power and Light, LLC (http://www.rubypal.com)

On Oct 20, 2009, at 11:37 AM, David A. Black wrote:


Or perhaps scan is a better hammer for this nail:

irb> string = "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"
=> "\"bloggs, Joe (JBloggs, INFO)\" joebloggs, \"bloggs, Bill (BBloggs, INFO)\" billbloggs"
irb> re = %r{"\w+, \w+ \(\w+, \w+\)" \w+}
=> /"\w+, \w+ \(\w+, \w+\)" \w+/
irb> string.scan(re)
=> ["\"bloggs, Joe (JBloggs, INFO)\" joebloggs", "\"bloggs, Bill (BBloggs, INFO)\" billbloggs"]

You could paste them back together with a .join('_'), but I suspect
that you want the pieces later anyway.
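If the pieces are what you are after, one possible sketch is to add capture groups, so that scan returns the fields of each record instead of the whole match. The field names below (surname, first_name, account, group, login) are guesses at what the data means and are not from the original post:

```ruby
string = '"bloggs, Joe (JBloggs, INFO)" joebloggs, "bloggs, Bill (BBloggs, INFO)" billbloggs'

# Escaped parentheses match the literal ( and ) in the data;
# unescaped parentheses are capture groups.
re = /"(\w+), (\w+) \((\w+), (\w+)\)" (\w+)/

# With capture groups, scan yields one array of captures per match.
# The hash keys are hypothetical labels chosen for this example.
records = string.scan(re).map do |surname, first_name, account, group, login|
  { surname: surname, first_name: first_name,
    account: account, group: group, login: login }
end

records.each { |r| puts "#{r[:login]}: #{r[:first_name]} #{r[:surname]}" }
```

This only works while every record follows exactly the same layout, since anything the pattern does not match is silently skipped by scan.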

-Rob

Rob B. http://agileconsultingllc.com

Thanks all. I went with the suggestion given by David because, although the
structure is consistent, I can never be sure of the number of elements in
the string.

Many thanks
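For anyone finding this later, here is a rough sketch of the same lookahead idea used with split, which copes with any number of records. It assumes "elements" means records; the third record below is made up purely for illustration:

```ruby
# The first two records are from the original post; the third is invented
# here only to show that the record count does not matter.
string = '"bloggs, Joe (JBloggs, INFO)" joebloggs, ' \
         '"bloggs, Bill (BBloggs, INFO)" billbloggs, ' \
         '"smith, Ann (ASmith, INFO)" annsmith'

# Split on a comma plus optional whitespace, but only where a double
# quote follows. The lookahead keeps that quote in the next record.
records = string.split(/,\s*(?=")/)

# Join with an underscore if a single delimited string is still needed.
joined = records.join('_')

puts records.size
puts joined
```

Unlike the gsub version, this drops the space after the separator, so the joined string reads joebloggs_"bloggs rather than joebloggs_ "bloggs.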
