Need help bringing select array lines together

Hi there, I am looking at some old, confusing Ruby code that works but
is really ugly. I’m hoping someone here can help me find a more idiomatic
Ruby way of rewriting it. Rather than post the ugly code, I’ll describe
what it is trying to do.

The code reads in each line of an array, looks for a closing/ending "
and puts multiple lines into one element of another array if they are
part of the same string.

input: data_array = [ '"Only one line."', '"line 1 ', ' line 2."' ]

so data_array.size = 3

puts data_array
"Only one line."
"line 1
 line 2."


I need to bring linked lines together like this:
new_array = [ '"Only one line."', '"line 1 line 2."' ]

=> new_array.size = 2

ideas?

TIA.

The code reads in each line of an array, looks for a closing/ending "
and puts multiple lines into one element of another array if they are
part of the same string.

Do you need to handle nested strings?

Here is a crude solution that doesn’t. I tried it on 1.9.2

# The messy input
mess = [ '"Only one line."', '"line 1 ', ' line 2."' ]

# A single string, from the mess concatenated
single = mess.join('')

# Find all matches of the regex
matches = single.scan(/("[^"]*")/)

# Flatten those results,
# to get lines
lines = matches.flatten

# Pretty print the lines
p lines

I don’t think any regex, being a finite automaton, could handle nested
strings properly.

Have you considered writing a small lexer and parser? You may be
better doing that, can handle weirder strings too.

Cheers,
Johnny

I don’t think any regex, being a finite automaton, could handle nested
strings properly.

Ignore this bit, it’s false.

Hi Johnny, thanks for the reply

On May 26, 1:43pm, Johnny M. wrote:

Do you need to handle nested strings?

I’m not sure. I don’t think so. I think the content is scrubbed
beforehand to replace all embedded " with '.

p lines

I’m not sure about the ‘join’ solution for 2 reasons:

  1. this array may be thousands of lines long and I wonder what the
    performance will be like.
  2. I’m thinking about getting rid of the initial array and reading
    the data straight from the input files. In that case, I would have to
    read in each line anyway.

Having said that, I will give this a try and see how it works with the
test data I have. Thanks!

Have you considered writing a small lexer and parser? You may be
better doing that, can handle weirder strings too.

No I haven’t – I don’t know how to do that. yet.

Cheers! Paul.

Paul wrote in post #1001335:

I’m not sure about the ‘join’ solution for 2 reasons:

Okay. How about:

original_lines = [
  %q{"Only one line."},
  %q{"line 1 },
  %q{ line 2."},
]

combined_lines = []
temp = ''

original_lines.each do |line|
  temp << line

  if temp[-1] == %q{"}
    combined_lines << temp
    temp = ''
  end
end

p combined_lines

--output:--
["\"Only one line.\"", "\"line 1  line 2.\""]
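On Ruby 2.2 or later, the same accumulate-until-the-closing-quote idea can probably be sketched with Enumerable#slice_after. This is only a sketch: the method name combine_quoted_lines is made up for illustration, and it assumes a string ends on a line whose last non-whitespace character is a double quote, with no embedded quotes.

```ruby
# Hypothetical helper (the name is illustrative, not from the thread).
# Cuts the input after every line whose last non-whitespace character
# is a double quote, then joins each group into one string.
def combine_quoted_lines(lines)
  lines
    .slice_after { |line| line.rstrip.end_with?('"') }
    .map { |group| group.map(&:chomp).join }
end

original_lines = [
  %q{"Only one line."},
  %q{"line 1 },
  %q{ line 2."},
]

combined_lines = combine_quoted_lines(original_lines)
p combined_lines
```

Since slice_after works on any Enumerable, lines could also be the enumerator that File.foreach returns when called without a block, so the intermediate array would not be needed.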

Have you considered writing a small lexer and parser? You may be
better doing that, can handle weirder strings too.

No I haven’t – I don’t know how to do that. yet.

Hey I was brainfarting at that point, you really don’t need to do that!
I brainfart a lot. Should really think for a bit before sending emails
off…

Paul wrote in post #1001335:

  1. I’m thinking about getting rid of the initial array and reading
    the data straight from the input files.

Okay. How about:

require 'stringio'

str = <<'ENDOFSTRING'
"Only one line."
"line 1
line 2
line 3"
ENDOFSTRING

file = StringIO.new(str)
$/ = %Q{"\n} # change input line separator

file.each do |line|
  line.chomp!("\n")
  line.gsub!("\n", ' ')
  p line
end

--output:--
"\"Only one line.\""
"\"line 1 line 2 line 3\""

I imagine that will be slower than my previous solution.
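A variation that avoids changing the global $/ is to pass the separator straight to each_line. This is a sketch under the same assumption as the snippet above (every string ends with a double quote followed by a newline); the records array is only there to collect the results for illustration.

```ruby
require 'stringio'

str = <<'ENDOFSTRING'
"Only one line."
"line 1
line 2
line 3"
ENDOFSTRING

file = StringIO.new(str)
records = []

# Pass the separator to each_line instead of assigning to $/,
# so the global input record separator is left untouched.
file.each_line(%Q{"\n}) do |record|
  record.chomp!("\n")               # drop only the trailing newline, keep the quote
  records << record.tr("\n", ' ')   # fold the inner newlines into spaces
end

p records
```

With a real file, File.open(path) would take the place of the StringIO object.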

On May 26, 2011, at 6:33 PM, 7stud – wrote:

Johnny M. wrote in post #1001292:

I don’t think any regex, being a finite automaton, could handle nested
strings properly.

There are regexes for nested parentheses–they use recursive regexes.

You are both right. Johnny is using the formal language definition of
regular expression while 7stud is using the term to refer to particular
programming language constructs. Most modern regexp libraries allow
for patterns that go well beyond the formal language concept of
regular expressions.
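To make the distinction concrete, here is a small sketch of the kind of recursive pattern 7stud mentions, using Ruby's \g<name> subexpression call. The constant name BALANCED and the sample strings are made up for illustration, and match? assumes Ruby 2.4 or later.

```ruby
# A named group that calls itself: an opening paren, then any mix of
# non-paren characters or nested balanced groups, then a closing paren.
# A formal regular expression (finite automaton) cannot express this.
BALANCED = /\A(?<paren>\((?:[^()]|\g<paren>)*\))\z/

balanced_ok  = BALANCED.match?('(a(b)c)')  # parentheses are balanced
balanced_bad = BALANCED.match?('(a(b)c')   # one closing paren is missing

p balanced_ok
p balanced_bad
```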

Still, I’m not sure how you would define nested strings using a single
quoting character:

"stuff"otherstuff"morestuff"

Is that an example of a nested string or just two strings with
otherstuff in between?

Gary W.

"stuff"otherstuff"morestuff"

Is that an example of a nested string or just two strings with
otherstuff in between?

That’s ambiguous. Oh noes!

I am a sort of disorganised person. I was thinking of matching
parentheses, which you can’t do with “regular” regular expressions.

Really I just made a stupid mistake!
I was thinking of nested strings like so ""\"\""".

All you would have to look for is ", since \" ends with ". So that
can be done without variable amounts of memory.
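As a rough illustration of that point, an ordinary (non-recursive) pattern can skip over \" while looking for the closing quote. This sketch assumes backslash is the escape character, as in the example above; the constant name QUOTED and the sample text are invented for illustration.

```ruby
# Matches a double-quoted string in which \" (or any other backslash
# escape) does not terminate the string. No recursion is needed.
QUOTED = /"(?:\\.|[^"\\])*"/

# Single-quoted Ruby literal, so each \" below is a real backslash
# followed by a double quote.
text = 'say "a \"b\" c" then "d"'

found = text.scan(QUOTED)
p found
```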

Hence, I sort of just blabbed without engaging the brain properly,
leading to this discussion. Oh noes!

I’m not really sure why my mind drips out my ears, but hey I have some
cheese and wrote some ruby so it’s all good.

Johnny

On May 26, 6:14pm, 7stud wrote:

end
end

p combined_lines

--output:--
["\"Only one line.\"", "\"line 1  line 2.\""]

That code almost works. I changed the one line to “temp[-1,1]” and
then it worked! =)
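The change most likely matters because of the Ruby version: on 1.8, indexing a string with a single integer returns the character code rather than a one-character string, while the two-argument form returns a substring on both 1.8 and 1.9. A minimal sketch of two checks that do not depend on that difference (the variable names are only illustrative, and end_with? is, as far as I know, available from 1.8.7 onward):

```ruby
temp = %q{"line 1  line 2."}

# Two-argument form: always a one-character String, on 1.8 and later.
last_char = temp[-1, 1]

# Says the same thing more directly.
ends_with_quote = temp.end_with?('"')

p last_char
p ends_with_quote
```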

This looks exactly like what I need. Thanks!

Cheers! Paul.

Johnny M. wrote in post #1001292:

I don’t think any regex, being a finite automaton, could handle nested
strings properly.

There are regexes for nested parentheses–they use recursive regexes.

I think this may be a solution to that problem:

irb(main):027:0> a=" "
=> " "
irb(main):028:0> x=["line 1","line2","line3"]
=> ["line 1", "line2", "line3"]
irb(main):029:0> x.each do |r|
irb(main):030:1* r=a+r
irb(main):031:1* a=r+" "
irb(main):032:1> end
=> ["line 1", "line2", "line3"]
irb(main):033:0> puts a
 line 1 line2 line3
=> nil
irb(main):034:0>
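The same concatenation can be sketched with Array#join, which also avoids the leading and trailing spaces that the loop leaves in a. The variable name joined is only for illustration.

```ruby
x = ["line 1", "line2", "line3"]

# join puts the separator between elements only, not at either end.
joined = x.join(' ')
puts joined
```

Note that this joins every element unconditionally; it does not look for the closing quote that decides where one string ends and the next begins.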

by
bdeveloper01
