Retrieving and copying element from array

luislavena · January 13, 2011, 8:06pm

If I have an array like this:

["category: cat1",
"item1, item2, item3",
"category: cat2",
"item1",
"category: cat3",
"item1, item2, item3, item4"]

How can I have a new array like this:

[["cat1", "item1", "item2", "item3"],
["cat2", "item1"],
["cat3", "item1", "item2", "item3", "item4"]
]

Thanks for the help.

simonh · January 13, 2011, 8:34pm


If your initial array is called ‘list’ :

result = []
list.each_slice(2) {|i, j| result.push(i.sub(/category: /, ''));
b.push(*j.split(', '))}

Iterate over your list in pairs (each_slice), remove 'category: '
from the first element, split the second element on ', ', and
append the results to an array.
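
A minimal sketch of that idea, assuming the array looks exactly like the one in the first post (one category line followed by one comma-separated item line). It uses map on the slices so that each category ends up in its own sub-array, as the question asked; the variable names are only for illustration.

```ruby
# Assumed input: the array exactly as written in the first post.
list = [
  "category: cat1",
  "item1, item2, item3",
  "category: cat2",
  "item1",
  "category: cat3",
  "item1, item2, item3, item4"
]

# Take the elements two at a time: a category line and its item line.
result = list.each_slice(2).map do |category, items|
  # Drop the "category: " prefix, then append the individual items.
  [category.sub(/\Acategory: /, ''), *items.split(', ')]
end

p result
```

If a category can be followed by more than one line, the pairs no longer line up and this approach stops working.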

simonh · January 13, 2011, 9:09pm

arr.grep(/category:.*/).map{|a| arr.at(arr.index(a) +1).split(",")}

will work as long as there is only 1 element after the ‘category’
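
For illustration, here is roughly what that one-liner returns on the array from the first post, followed by a hedged variant that also keeps the category name. The names items_only and with_names are made up for this sketch, and it assumes every category string is unique, since arr.index finds only the first match.

```ruby
# Assumed input: the array from the first post.
arr = [
  "category: cat1", "item1, item2, item3",
  "category: cat2", "item1",
  "category: cat3", "item1, item2, item3, item4"
]

# The one-liner as posted: items only, and split(",") keeps the
# space that follows each comma.
items_only = arr.grep(/category:.*/).map { |a| arr.at(arr.index(a) + 1).split(",") }

# Variant: strip the prefix, keep the name, and split on ", ".
with_names = arr.grep(/\Acategory: /).map do |cat|
  [cat.sub(/\Acategory: /, ''), *arr.at(arr.index(cat) + 1).split(', ')]
end

p items_only
p with_names
```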

simonh · January 13, 2011, 9:19pm

Thanks for reply, Anurag. Can’t get this to work though:

irb(main):001:0> lines = File.readlines('test.txt')
=> ["category: cat1\n", " item1\n", " item2\n", " item3\n", "category: cat2\n", " item1\n", "category: cat3\n", " item1\n", " item2\n", " item3\n", " item4\n", "\n"]
irb(main):002:0> puts lines
category: cat1
item1
item2
item3
category: cat2
item1
category: cat3
item1
item2
item3
item4

=> nil
irb(main):003:0> result = []
=> []
irb(main):004:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}
NameError: undefined local variable or method `b' for main:Object
from (irb):4:in `block in irb_binding'
from (irb):4:in `each'
from (irb):4:in `each_slice'
from (irb):4
from /usr/local/bin/irb:12:in `<main>'
irb(main):005:0> lines.each_slice(2) { |i, j| result.push(i.sub(/category: /, '')); j.push(*j.split(', '))}
NoMethodError: undefined method `push' for " item1\n":String
from (irb):5:in `block in irb_binding'
from (irb):5:in `each'
from (irb):5:in `each_slice'
from (irb):5
from /usr/local/bin/irb:12:in `<main>'

simonh · January 14, 2011, 10:03pm

Josh: Thanks for the reply and link to your github example. The thing
is, this data is coming from a text file. An export from an MS Access
database. I wouldn’t choose to save in that format.

simonh · January 13, 2011, 11:23pm

irb(main):004:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}
NameError: undefined local variable or method `b' for main:Object

My bad; I typed it in wrong. The ‘b’ should be ‘result’: we want to
collect the processed elements in the same array.

simonh · January 14, 2011, 12:01am

On Thu, Jan 13, 2011 at 2:19 PM, Simon H. wrote:

NoMethodError: undefined method `push' for " item1\n":String
from (irb):5:in `block in irb_binding'
from (irb):5:in `each'
from (irb):5:in `each_slice'
from (irb):5
from /usr/local/bin/irb:12:in `<main>'

–
Posted via http://www.ruby-forum.com/.

I recommend you don’t store your data like this; it is fragile and
error-prone. You can see your data already does not look like what you
described in your first post: each item in cat1 is on its own line
(i.e. index 1 in your first post is "item1, item2, item3", but in your
actual data it is " item1\n"). So even after you fix the part where he
wrote b.push instead of result.push, it will still be wrong.
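
To make the mismatch concrete, here is a small sketch that pairs up the lines the way each_slice(2) does. The lines array is copied from the irb session earlier in the thread; pairs and misaligned are names invented for this example.

```ruby
# The lines as returned by File.readlines in the irb session above.
lines = [
  "category: cat1\n", " item1\n", " item2\n", " item3\n",
  "category: cat2\n", " item1\n",
  "category: cat3\n", " item1\n", " item2\n", " item3\n", " item4\n", "\n"
]

pairs = lines.each_slice(2).to_a

# Pairs whose first element is not a category line: each_slice(2)
# would wrongly treat these items as category names.
misaligned = pairs.reject { |first, _| first.start_with?("category: ") }

p pairs[1]
p misaligned.length
```

Half of the six pairs start with an item rather than a category, which is why fixing the variable name alone does not produce the desired result.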

I recommend using a real data format such as YAML, XML, or JSON. It’s
actually much easier to get started with than you might think: you can
just build the data in memory how you want it to look, then tell YAML
to convert it, and store it in a file. Ta-da, a valid YAML
representation of your data.
Here is an example with this data: "Showing how I would handle some
data by using yaml instead of an ad hoc format" (GitHub).

It is slightly different in that I read them into hashes, because I
dislike storing category and items in the same array. If it were me, I
might even go a step further and store them in a struct instead of a
hash.
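
The YAML round trip described above can be sketched as follows. This is not the linked example, only an illustration: the hash layout (category name mapped to a list of items) and the file name in the comment are assumptions.

```ruby
require 'yaml'

# Hypothetical in-memory shape: category name => list of items.
data = {
  "cat1" => ["item1", "item2", "item3"],
  "cat2" => ["item1"],
  "cat3" => ["item1", "item2", "item3", "item4"]
}

# Convert to YAML text. To store it, something like
# File.write("categories.yml", yaml_text) would do (file name is made up).
yaml_text = data.to_yaml

# Reading it back gives the same structure.
restored = YAML.load(yaml_text)

puts yaml_text
p restored == data
```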

simonh · January 15, 2011, 12:55pm

On Fri, Jan 14, 2011 at 3:03 PM, Simon H. wrote:

Josh: Thanks for the reply and link to your github example. The thing
is, this data is coming from a text file. An export from an MS Access
database. I wouldn’t choose to save in that format.


Hi, Simon. Okay, well, if we assume that all data will be nested below
a category, that a category is denoted by "category: ", that there
isn’t leading or trailing whitespace, and that each item under a
category is given on its own line, then this should work with your
data format.

# goal format for the data, as given in the original post
goal = [
  ["cat1", "item1", "item2", "item3"],
  ["cat2", "item1"],
  ["cat3", "item1", "item2", "item3", "item4"]
]

categories = Array.new
File.foreach "test.txt" do |line|
  if line =~ /^category:/
    categories << [ line.sub(/^category: /,'').chomp ]
  else
    categories.last << line.strip
  end
end

goal == categories # => true

puts File.read('test.txt')
# >> category: cat1
# >> item1
# >> item2
# >> item3
# >> category: cat2
# >> item1
# >> category: cat3
# >> item1
# >> item2
# >> item3
# >> item4
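
A variant of the same loop that runs without test.txt, using the lines array from the earlier irb session. That array ends with a blank "\n" line, so this sketch skips blank lines; whether the real export always ends that way is an assumption. It still assumes the first non-blank line is a category, otherwise categories.last would be nil.

```ruby
# Lines as shown by File.readlines earlier in the thread,
# including the trailing blank line.
lines = [
  "category: cat1\n", " item1\n", " item2\n", " item3\n",
  "category: cat2\n", " item1\n",
  "category: cat3\n", " item1\n", " item2\n", " item3\n", " item4\n", "\n"
]

categories = []
lines.each do |line|
  text = line.strip
  next if text.empty?                  # ignore blank lines
  if text.start_with?("category:")
    # A category line starts a new group.
    categories << [text.sub(/\Acategory:\s*/, '')]
  else
    # Any other line is an item of the most recent category.
    categories.last << text
  end
end

p categories
```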

simonh · January 15, 2011, 3:54pm

On Sat, Jan 15, 2011 at 8:10 AM, Simon H. wrote:

That works great, thanks Josh. A couple of questions if you don’t mind.

What is the purpose of the [] in line below? Does it mean collect
whatever matches into an array?

categories << [ line.sub(/^category: /,'').chomp ]

Yes, but not whatever matches. The call to #sub, with an empty string
as the second argument, removes "category: " from the string if it is
at the beginning, and the chomp removes the newline. So if line is
"category: cat1\n", then line.sub(/^category: /,'').chomp will return
"cat1". The [ ] is an array literal, so we then stick that string in a
new one-element Array, which the following items get appended to.
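
Step by step, with a sample line (the variable names here are only for illustration):

```ruby
line = "category: cat1\n"

# Remove the leading "category: " and the trailing newline.
name = line.sub(/^category: /, '').chomp

categories = []
categories << [name]          # start a new group for this category
categories.last << "item1"    # later lines are appended to the newest group

p name
p categories
```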

irb(main):041:1* if item =~ /^cat/
irb(main):042:2> arr2 << [ item ]
irb(main):043:2> else
irb(main):044:2* arr2.last << item
irb(main):045:2> end
irb(main):046:1> end

You are right on here, just getting confused about your data format
again. Your code will work correctly if arr is an array of the lines of
your file, such as you would get with File.readlines.

In other words, in your irb example,
arr is [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

but in mine, it was read in straight from the file, so it would be
["cat1", "1", "2", "3", "cat2", "1", "2", "cat3", "1", "2"]

If you fix that, it will work correctly.
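
A sketch of that fix, assuming the nested array from the irb session is the starting point. With the nested array, each item is itself an array, so the category branch never runs, and arr2.last on an empty array is nil, which is where the NilClass error comes from. Flattening first gives the loop the flat list of strings it expects; flat is a name chosen for this example.

```ruby
# The nested array from the irb session.
arr = [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

# An empty array has no last element, hence the NilClass error.
p [].last

# Flatten to the shape the loop expects: one string per element.
flat = arr.flatten

arr2 = []
flat.each do |item|
  if item =~ /^cat/
    arr2 << [item]       # a category starts a new group
  else
    arr2.last << item    # everything else joins the latest group
  end
end

p arr2
```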

As a side note, you are doing arr.map (see the Enumerable module in
the RDoc documentation), but what you really mean is arr.each (see the
Array class in the RDoc documentation). It isn’t harming anything, but
it is misleading, because map implies you are trying to create a new
array by collecting the result of the block for each element, when
really you are just trying to iterate.
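
A small illustration of the difference (the sample data is made up):

```ruby
items = ["a", "b", "c"]

# map collects the block's return values into a new array.
upcased = items.map { |s| s.upcase }

# each ignores the block's return values and returns the receiver.
returned = items.each { |s| s.upcase }

p upcased
p returned.equal?(items)
```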

simonh · January 15, 2011, 3:09pm

That works great, thanks Josh. A couple of questions if you don’t mind.

What is the purpose of the [] in line below? Does it mean collect
whatever matches into an array?

categories << [ line.sub(/^category: /,'').chomp ]

I’ve tried to achieve the same result using an existing array, rather
than reading from the file and I’m stuck. I’m using JRuby 1.6RC1 and
getting this error about NilClass. Any ideas?

irb(main):038:0> arr2 = []
irb(main):039:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

irb(main):040:0> arr.map do |item|
irb(main):041:1* if item =~ /^cat/
irb(main):042:2> arr2 << [ item ]
irb(main):043:2> else
irb(main):044:2* arr2.last << item
irb(main):045:2> end
irb(main):046:1> end

NoMethodError: undefined method `<<' for nil:NilClass
from (irb):44:in `evaluate'
from org/jruby/RubyArray.java:2460:in `collect'
from (irb):40:in `evaluate'
from org/jruby/RubyKernel.java:1091:in `eval'
from /opt/jruby/lib/ruby/1.8/irb.rb:158:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:271:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:270:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:155:in `eval_input'
from org/jruby/RubyKernel.java:1421:in `loop'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:154:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:71:in `start'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:70:in `start'

irb(main):047:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]
irb(main):048:0> arr2
=> []
irb(main):049:0> arr2.empty?
=> true

irb(main):052:0> arr.each do |item|
irb(main):053:1* if item =~ /^cat/
irb(main):054:2> arr2 << item
irb(main):055:2> else
irb(main):056:2* arr2.last << item
irb(main):057:2> end
irb(main):058:1> end

NoMethodError: undefined method `<<' for nil:NilClass
from (irb):56:in `evaluate'
from org/jruby/RubyArray.java:1671:in `each'
from (irb):52:in `evaluate'
from org/jruby/RubyKernel.java:1091:in `eval'
from /opt/jruby/lib/ruby/1.8/irb.rb:158:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:271:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:270:in `signal_status'
from /opt/jruby/lib/ruby/1.8/irb.rb:155:in `eval_input'
from org/jruby/RubyKernel.java:1421:in `loop'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:154:in `eval_input'
from /opt/jruby/lib/ruby/1.8/irb.rb:71:in `start'
from org/jruby/RubyKernel.java:1194:in `rbCatch'
from /opt/jruby/lib/ruby/1.8/irb.rb:70:in `start'
irb(main):059:0>

simonh · January 15, 2011, 4:36pm

You’ve been very helpful. Thanks again Josh.