Regexp conditional

Hi,

How could I easily do the following?

input = '111|"aaaa" bbbbb|c'

I would like to get the following output:

output = '111|"""aaaa"" bbbbb"|c'

The goal is to correct a wrongly prepared pipe-separated file by
enclosing fields in quotation characters.

thanks
chris

On Mar 19, 2008, at 10:00 , ciapecki wrote:

to correct wrongly prepared pipe separated file with enclosing the
equation characters.

My interpretation of what you need is:

input = '111|"aaaa" bbbbb|c'

fields = input.split('|')
fields[1].gsub!('"', '""')
fields[1] = %Q{"#{fields[1]}"}

output = fields.join('|')

Depending on the input, that may not be valid, for example if fields[1]
may contain a pipe. In any case, since the input is malformed, the right
assumptions depend on the actual data.

– fxn
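
A minimal sketch of how the split-and-join idea above might extend to every field instead of only fields[1]. It assumes no field ever contains a literal pipe, and the method name quote_fields is made up for illustration; it is not from this thread.

```ruby
# Hypothetical helper: quote every field that contains a double quote.
# Assumption: fields never contain a literal pipe.
def quote_fields(line, sep = '|')
  # The -1 limit keeps trailing empty fields that split would otherwise drop.
  fields = line.split(sep, -1)
  fields.map! do |field|
    if field.include?('"')
      # Double the embedded quotes, then wrap the whole field in quotes.
      %Q{"#{field.gsub('"', '""')}"}
    else
      field
    end
  end
  fields.join(sep)
end

puts quote_fields('111|"aaaa" bbbbb|c')
```

Fields without a quote pass through unchanged, so the same call should work for a line with 49 fields as well as for three.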

On 19 Mar., 10:09, Xavier N. wrote:

Depending on the input that may not be valid, for example if fields[1]
may contain a pipe. Anyway being a mal-formed input assumptions depend
on the actual data.

– fxn

your solution works, and thanks for that,
I am waiting though for a regexp solution,

thanks anyway
chris

Robert K. wrote:

I am waiting though for a regexp solution,
Even 2 regexps:
Cheers

robert

Robert,

How fixed is the input? If it is always of the same format, then what
about:

input = '111|"aaaa" bbbbb|c'
output = input.gsub(/"/, '""').gsub(/(.*)\|(.*)\|(.*)/, '\1|"\2"|\3')

Mac

On 19.03.2008 11:21, ciapecki wrote:

output = fields.join('|')

Depending on the input that may not be valid, for example if fields[1]
may contain a pipe. Anyway being a mal-formed input assumptions depend
on the actual data.

your solution works, and thanks for that,
I am waiting though for a regexp solution,

Even 2 regexps:

irb(main):001:0> input = '111|"aaaa" bbbbb|c'
=> "111|\"aaaa\" bbbbb|c"
irb(main):002:0> input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
irb(main):003:0> puts input.gsub(/[^|]+/) {|m| m.gsub!(/"/,'""') ? '"'<<m<<'"' : m}
111|"""aaaa"" bbbbb"|c
=> nil

Cheers

robert
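
The block form above leans on String#gsub! returning nil when nothing was replaced, so only fields that contain a quote get wrapped. Below is a non-mutating sketch of the same idea; the method name quote_with_block and the longer sample line are assumptions made for illustration.

```ruby
# Hypothetical helper illustrating the block-form gsub.
def quote_with_block(line)
  # /[^|]+/ matches each run of non-pipe characters, i.e. each non-empty field.
  line.gsub(/[^|]+/) do |field|
    # Only fields that contain a double quote are changed.
    field.include?('"') ? %Q{"#{field.gsub('"', '""')}"} : field
  end
end

input = '111|"aaaa" bbbbb|c|"22222" asdasd|ddd'
puts quote_with_block(input)
```

Using include? instead of the gsub! return value costs one extra scan per field, but it avoids modifying the matched string in place.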

Paul M. wrote:

Robert,

Oops. I meant Chris of course.

Mac

On 19 Mar., 22:28, Robert K. wrote:

fields = input.split('|')

I am waiting though for a regexp solution,
111|"""aaaa"" bbbbb"|c
=> nil

Cheers

    robert

this is just great,
thanks robert for this

chris

On 20 Mar., 01:18, Paul M. wrote:


Posted via http://www.ruby-forum.com/.

Hi Mac,

The format is fixed but contains 49 fields separated by |, so your
one-liner could not fit into one line :)

thanks,
chris

2008/3/20, ciapecki:

your solution works, and thanks for that,
'"'<<m<<'"' : m}
111|"""aaaa"" bbbbb"|c
=> nil

Cheers

    robert

this is just great,
thanks robert for this

You’re welcome. Btw, this is even better (also faster)

irb(main):003:0> input = '111|"aaaa" bbbbb|c'
=> "111|\"aaaa\" bbbbb|c"
irb(main):004:0> input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\&"')
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"
irb(main):005:0> puts input.gsub(/"/,'""').gsub(/[^|]*"[^|]*/,'"\&"')
111|"""aaaa"" bbbbb"|c
=> nil

This could even be a bit faster:

irb(main):006:0> input.gsub(/"/,'""').gsub(/[^|"]*"[^|]*/,'"\&"')
=> "111|\"\"\"aaaa\"\" bbbbb\"|c"

Kind regards

robert
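
In the chained version, the first gsub doubles every quote and the second wraps each field that contains a quote; \& in the replacement string stands for the whole match. Here is a quick sketch to check that on a line with several fields. The constant and method names are made up, and the pattern is written with the * quantifiers that the second step appears to need.

```ruby
# Any field (run of non-pipe characters) that contains a double quote.
QUOTED_FIELD = /[^|"]*"[^|]*/

# Hypothetical wrapper around the two chained substitutions.
def quote_chained(line)
  # Step 1: double every embedded quote.
  # Step 2: wrap each field that contains a quote; \& is the whole match.
  line.gsub('"', '""').gsub(QUOTED_FIELD, '"\&"')
end

input = '111|"aaaa" bbbbb|c|"22222" asdasd|ddd|"aaaa"qqqqq|jjjj'
puts quote_chained(input)
```

Because the pattern never matches across a pipe, the number of fields in the line does not matter.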

ciapecki wrote:

The format is fixed but contains 49 fields separated by |, so your
one-liner could not fit into one line :)

A minor change :)

input = '111|"aaaa" bbbbb|c|"22222" asdasd|ddd|"aaaa"qqqqq|jjjj'
output = input.gsub(/"/, '""').gsub(/(.*?)\|(.*?)\|(.*?)/, '\1|"\2"|\3')

=> 111|"""aaaa"" bbbbb"|c|"""22222"" asdasd"|ddd|"""aaaa""qqqqq"|jjjj

This was just to point out that there is no need for multiple replace
options, but you need to know the layout is correct and in a given
pattern. Robert’s is the better and more robust solution (and also
shorter).

Mac
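
To illustrate why the fixed-layout approach depends on the data following a known pattern, here is a small sketch. It assumes the intended regexp uses escaped pipes and non-greedy groups; the constant and method names are hypothetical. As far as I can tell, this pattern wraps the second field of each matched pair whether or not it contains a quote.

```ruby
# Assumed form of the fixed-layout pattern: field, pipe, field, pipe.
TRIPLE = /(.*?)\|(.*?)\|(.*?)/

# Hypothetical wrapper: doubles quotes, then wraps every second field.
def quote_every_second(line)
  line.gsub('"', '""').gsub(TRIPLE, '\1|"\2"|\3')
end

puts quote_every_second('a|b|c')      # second field has no quote at all
puts quote_every_second('x|"q"|y|z')  # leftover fields after the last pair
```

The first call gives a|"b"|c, so a field gets quoted purely because of its position, which is fine only when the layout is known in advance.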