Forum: Ruby How to parse a "line"?

Martin S. (Guest)
on 2009-04-27 00:48
Hi there:

I have a line in this format:
"num_of_item item1 item2...item_n", for example

"3 Tokyo Newyork Paris"

I want Ruby to parse this format, extract these keywords, and save them in
an array.
I googled and searched many forums, but couldn't find any matches.

Could anyone give me any ideas?
Thanks a lot!
James G. (Guest)
on 2009-04-27 00:57
(Received via mailing list)
On Apr 26, 2009, at 3:48 PM, Martin S. wrote:

>
> Would anyone give me any ideas??

If you just need to bust up the words via the spaces, this should
probably work for you:

 >> items = "3 Tokyo Newyork Paris".split[1..-1]
=> ["Tokyo", "Newyork", "Paris"]
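
A slightly fuller sketch of the same approach, in case it helps. The method name `parse_items` and the count check are assumptions added for illustration, not part of the suggestion above:

```ruby
# Hypothetical helper: splits a line of the form "N item1 item2 ... itemN".
def parse_items(line)
  fields = line.split            # split on runs of whitespace
  count  = Integer(fields[0])    # raises ArgumentError if the field is not numeric
  items  = fields[1..-1]         # everything after the count
  warn "expected #{count} items, got #{items.size}" unless items.size == count
  items
end

p parse_items("3 Tokyo Newyork Paris")  # => ["Tokyo", "Newyork", "Paris"]
```

The count check only warns, so a line whose number disagrees with its contents still returns whatever items were found.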

James Edward G. II
Robert D. (Guest)
on 2009-04-27 01:12
(Received via mailing list)
On Sun, Apr 26, 2009 at 10:56 PM, James G. 
<removed_email_address@domain.invalid>
wrote:
>> an array.
>> I googled and search many forums, but can't find any matches.
>>
>> Would anyone give me any ideas??
>
> If you just need to bust up the words via the spaces, this should probably
> work for you:
>
>>> items = "3 Tokyo Newyork Paris".split[1..-1]
> => ["Tokyo", "Newyork", "Paris"]

This is another opportunity to show Ruby's versatile style, e.g.
functional

_, *items = "3 .a b c".split
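
A small sketch of what the splat does here, using the sample line from the original question (the name `count` is only for illustration):

```ruby
# The first element goes to `count`; the splat collects the rest into an array.
count, *items = "3 Tokyo Newyork Paris".split
p count  # => "3" (still a String, not an Integer)
p items  # => ["Tokyo", "Newyork", "Paris"]
```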

R.
Todd B. (Guest)
on 2009-04-27 02:56
(Received via mailing list)
On Sun, Apr 26, 2009 at 4:11 PM, Robert D. 
<removed_email_address@domain.invalid>
wrote:
>>> I want ruby parse this format, extract these keywords and save them in
>
> This is another opportunity to show Ruby's versatile style, e.g. functional
>
> _, *items = "3 .a b c".split

I like...

(items = my_item_list).split.shift

Both you and I have examples, though, that do not return what the
statement using [1..-1] does.  Mine, returns 3.  Ah well, "items"
exists, though :)

Todd
Todd B. (Guest)
on 2009-04-27 03:00
(Received via mailing list)
On Sun, Apr 26, 2009 at 5:55 PM, Todd B. <removed_email_address@domain.invalid>
wrote:
>
> (items = my_item_list).split.shift

Oops...

(items = my_item_list.split).shift
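
To spell out why the position of the parentheses matters, a minimal sketch (here `my_item_list` is assumed to be the input string):

```ruby
my_item_list = "3 Tokyo Newyork Paris"  # assumed sample input

# The assignment happens inside the parentheses, so `items` is the array;
# #shift then removes the first element from it and returns that element.
removed = (items = my_item_list.split).shift
p removed  # => "3"
p items    # => ["Tokyo", "Newyork", "Paris"]
```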

Todd
Martin S. (Guest)
on 2009-04-27 04:25
Thank you Todd, but the number of keywords is dynamic,

and the array size is not fixed until the program reads the first keyword
of the input line.
The line may be "3 tokyo newyork paris", or it may be
"6 toyota bmw honda GM Ford"

So I can't hard-code this array in the program.
How can I parse the line?

Thanks!
Todd B. wrote:
> On Sun, Apr 26, 2009 at 5:55 PM, Todd B. <removed_email_address@domain.invalid>
> wrote:
>>
>> (items = my_item_list).split.shift
>
> Oops...
>
> (items = my_item_list.split).shift
>
> Todd
7stud -. (Guest)
on 2009-04-27 05:08
Martin S. wrote:
> Thank you Todd, but the number of the keywords are dynamic.
>
> and the array size is not fixed until program reads the first keywords
> of the input line.
>


Explain why these solutions won't work for you:

strs = ["6 toyota bmw honda GM Ford", "3 tokyo newyork paris"]

strs.each do |str|
  puts "1) split with subscripts:"
  p str.split()[1..-1]

  puts "2) split with parallel assignment:"
  first, *therest = str.split()
  p therest

  puts
end

--output:--
1) split with subscripts:
["toyota", "bmw", "honda", "GM", "Ford"]
2) split with parallel assignment:
["toyota", "bmw", "honda", "GM", "Ford"]

1) split with subscripts:
["tokyo", "newyork", "paris"]
2) split with parallel assignment:
["tokyo", "newyork", "paris"]
James G. (Guest)
on 2009-04-27 06:06
(Received via mailing list)
On Apr 26, 2009, at 7:25 PM, Martin S. wrote:

> Thank you Todd, but the number of the keywords are dynamic.
>
> and the array size is not fixed until program reads the first keywords
> of the input line.
> The line may be "3 tokyo newyork paris", also maybe
> "6 toyota bmw honda GM Ford"
>
> So I can't hard coded this array in the program.
> How can I parse the line?

Did you try our code?  All of our solutions work for all examples
you've posted so far…  :)

James Edward G. II
Todd B. (Guest)
on 2009-04-27 06:11
(Received via mailing list)
On Sun, Apr 26, 2009 at 7:25 PM, Martin S. 
<removed_email_address@domain.invalid>
wrote:
> Thank you Todd, but the number of the keywords are dynamic.
>
> and the array size is not fixed until program reads the first keywords
> of the input line.
> The line may be "3 tokyo newyork paris", also maybe
> "6 toyota bmw honda GM Ford"
>
> So I can't hard coded this array in the program.
> How can I parse the line?

If you look closely, you'll see that James' method grabs everything
from 1 to -1 (the end) of the array, omitting the zeroth element.

Robert's method assigns the garbage -- in this case, the first element
(the count) -- to a dummy variable, and the important stuff to the var
that you care about (items).

My way was only slightly different.  I opted to create the whole
array, and then drop the first without creating an extra object.  I
had to use (items = my_item_list.split) inside parens like that
because #shift returns the object you popped off the front of the
list, and not the actual remaining stuff.

In all examples given, the size doesn't really matter... hah!  Or as
they say, it depends :-)

There are probably several ways to do this; I just like that particular
one.
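
Side by side, a sketch of the three approaches on the sample line from the original question (the variable names are only for illustration):

```ruby
line = "3 Tokyo Newyork Paris"  # sample line from the original question

by_slice = line.split[1..-1]     # slice: keep everything after index 0
_, *by_splat = line.split        # splat: a dummy variable takes the count
(by_shift = line.split).shift    # shift: remove the count in place

p by_slice == by_splat && by_splat == by_shift  # => true
```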

hth,
Todd
Todd B. (Guest)
on 2009-04-27 06:48
(Received via mailing list)
On Sun, Apr 26, 2009 at 9:11 PM, Todd B. <removed_email_address@domain.invalid>
wrote:
>
> because #shift returns the object you popped off the front of the
> list, and not the actual remaining stuff.
>
> In all examples given, the size doesn't really matter... hah!  Or as
> they say, it depends :-)
>
> There are probably several ways to do this; I just like that particular one.

You know after looking at your original post, maybe you are trying to
create a digest, i.e. a Hash or associative array, with your keys as
the first "column".

In that case, you might try...

h = {}
f = my_string_list.each_line do |line|
  key = (value = line.split).shift
  h[key] = value
end

...or something like that.

Todd
Todd B. (Guest)
on 2009-04-27 06:57
(Received via mailing list)
On Sun, Apr 26, 2009 at 9:47 PM, Todd B. <removed_email_address@domain.invalid>
wrote:
>
> In that case, you might try...
>
> h = {}
> f = my_string_list.each_line do |line|
>  key = (value = line.split).shift
>  h[key] = value
> end
>
> ...or something like that.

It's late and I'm golfing with myself.  Apologies to the list for this
almost one-liner...

h = {}
s.each_line{|i|h[(v=i.split).shift]=v}
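
For readers who would rather not decode the golfed version, a more spelled-out sketch of the same idea follows. The sample data in `s` is made up for illustration.

```ruby
# Assumed sample input: one "key item item ..." record per line.
s = "3 Tokyo Newyork Paris\n2 Toyota Honda\n"

h = {}
s.each_line do |line|
  key, *values = line.split  # first field is the key, the rest are the values
  h[key] = values
end

p h["3"]  # => ["Tokyo", "Newyork", "Paris"]
p h["2"]  # => ["Toyota", "Honda"]
```

Note that two lines starting with the same number would overwrite each other, since hash keys are unique.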
Martin S. (Guest)
on 2009-04-27 07:06
Oh my god.....

You and James's code really works!

Thanks a lot!
Todd B. wrote:
> My way was only slightly different.  I opted to create the whole
> array, and then drop the first without creating an extra object.  I
> had to use (items = my_item_list.split) inside parens like that
> because #shift returns the object you popped off the front of the
> list, and not the actual remaining stuff.
>
> In all examples given, the size doesn't really matter... hah!  Or as
> they say, it depends :-)
>
> There are probably several ways to do this; I just like that particular
> one.
>
> hth,
> Todd
Martin S. (Guest)
on 2009-04-27 07:32
it works!

(I used s.each instead of s.each_line.)

thanks!

Todd B. wrote:
> On Sun, Apr 26, 2009 at 9:47 PM, Todd B. <removed_email_address@domain.invalid>
> wrote:
>>
>> In that case, you might try...
>>
>> h = {}
>> f = my_string_list.each_line do |line|
>>  key = (value = line.split).shift
>>  h[key] = value
>> end
>>
>> ...or something like that.
>
> It's late and I'm golfing with myself.  Apologies to the list for this
> almost one-liner...
>
> h = {}
> s.each_line{|i|h[(v=i.split).shift]=v}
Todd B. (Guest)
on 2009-04-27 07:35
(Received via mailing list)
On Sun, Apr 26, 2009 at 10:32 PM, Martin S. 
<removed_email_address@domain.invalid>
wrote:
> it works!
>
> s.each instead of s.each_line

Using 1.9.1.  Sorry I didn't mention that.  I haven't been following
the differences, but I seem to recall that was a change somewhere
along the line.
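
For what it's worth, Ruby 1.8 strings had both #each and #each_line, while 1.9 dropped String#each; arrays have only #each in either version. A sketch that copes with both kinds of input (the method name `each_record` is hypothetical):

```ruby
# Hypothetical helper: yields one line at a time from a String or an Array.
def each_record(input, &block)
  if input.respond_to?(:each_line)
    input.each_line(&block)   # String (1.8 and 1.9+), also IO objects
  else
    input.each(&block)        # Array of lines
  end
end

each_record("3 a b c\n2 x y\n")   { |line| p line.split[1..-1] }
each_record(["3 a b c", "2 x y"]) { |line| p line.split[1..-1] }
```

Both calls print the same two arrays, because #split ignores the trailing newline that #each_line leaves on each line.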

Todd
7stud -. (Guest)
on 2009-04-27 07:59
Todd B. wrote:
> My way was only slightly different.  I opted to create the whole
> array, and then drop the first without creating an extra object.

Note that you made ruby go through the laborious chore of shifting every
element in the array over one spot to the left.
Todd B. (Guest)
on 2009-04-27 08:05
(Received via mailing list)
On Sun, Apr 26, 2009 at 10:59 PM, 7stud -- 
<removed_email_address@domain.invalid>
wrote:
> Todd B. wrote:
>> My way was only slightly different.  I opted to create the whole
>> array, and then drop the first without creating an extra object.
>
> Note that you made ruby go through the laborious chore of shifting every
> element in the array over one spot to the left.

Is that really what happens?  I thought it just reassigned a starting
point, and indexed from there.  At least, that would make more sense
to me if it did.

Todd
Todd B. (Guest)
on 2009-04-27 08:08
(Received via mailing list)
On Sun, Apr 26, 2009 at 11:03 PM, Todd B. <removed_email_address@domain.invalid>
wrote:
> to me if it did.
Let me rephrase.  What is different on the underlying structure with
[1..-1] and #shift?  I'm not sure if shift really moves everything,
but if it does, I wouldn't mind knowing.

Todd
Harry K. (Guest)
on 2009-04-27 08:41
(Received via mailing list)
On Mon, Apr 27, 2009 at 1:07 PM, Todd B. <removed_email_address@domain.invalid>
wrote:
>> point, and indexed from there.  At least, that would make more sense
>> to me if it did.
>
> Let me rephrase.  What is different on the underlying structure with
> [1..-1] and #shift?  I'm not sure if shift really moves everything,
> but if it does, I wouldn't mind knowing.
>
> Todd
>
>

If it shifts everything, it sure does it fast :)

p Time.now
arr = (1..100_000_000).to_a
p Time.now
arr.shift
p arr[0]
p Time.now


###output

Mon Apr 27 13:34:06 +0900 2009
Mon Apr 27 13:36:29 +0900 2009
2
Mon Apr 27 13:36:29 +0900 2009
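
Wall-clock timestamps are a bit coarse for this kind of question. A sketch using the standard Benchmark module instead, with a smaller array so it runs quickly. The array size is arbitrary, and the timings will vary by machine and Ruby version, so none are shown:

```ruby
require 'benchmark'

arr = (1..1_000_000).to_a

# Time a single shift and a single slice of the same array.
shift_time = Benchmark.realtime { arr.shift }
slice_time = Benchmark.realtime { arr[1..-1] }

puts "shift: #{shift_time}s, slice: #{slice_time}s"
p arr[0]  # => 2
```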


Harry
7stud -. (Guest)
on 2009-04-27 08:45
Todd B. wrote:
> On Sun, Apr 26, 2009 at 11:03 PM, Todd B. <removed_email_address@domain.invalid>
> wrote:
>> to me if it did.
> Let me rephrase.  What is different on the underlying structure with
> [1..-1] and #shift?  I'm not sure if shift really moves everything,
> but if it does, I wouldn't mind knowing.
>
> Todd

shift changes the array in place:

1)
arr = [10, 20, 30]

y = arr.shift
puts y
p arr

--output:--
10
[20, 30]


subscripts create a new array:

2)
arr = [10, 20, 30]
p arr
print "id: ", arr.object_id
puts

x = arr[1..-1]
p x
print "id: ", x.object_id
puts
p arr

--output:--
[10, 20, 30]
id: 75950
[20, 30]
id: 75860
[10, 20, 30]
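
Object ids differ from run to run, so a sketch that checks identity directly with #equal? may be easier to reproduce:

```ruby
arr = [10, 20, 30]

sliced = arr[1..-1]
p sliced.equal?(arr)  # => false: the slice is a separate Array object
p arr                 # => [10, 20, 30] (unchanged)

same = arr
arr.shift
p same.equal?(arr)    # => true: still the one object, modified in place
p same                # => [20, 30]
```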

> Is that really what happens?  I thought it just reassigned a starting
> point, and indexed from there.  At least, that would make more sense
> to me if it did.

Would it?  What if you had an array that took up 3GB of memory and you
shifted off every element but the last one.  Would you expect your array
to still occupy 3GB of memory?
Todd B. (Guest)
on 2009-04-27 09:06
(Received via mailing list)
On Sun, Apr 26, 2009 at 11:45 PM, 7stud -- 
<removed_email_address@domain.invalid>
wrote:
> shift changes the array in place:
I looked at an old thread where Bob Hutchinson piped in about
potential GC problems with this, but that was back in '07

> [20, 30]
> x = arr[1..-1]
new object is created by x =

> [10, 20, 30]
>
>> Is that really what happens?  I thought it just reassigned a starting
>> point, and indexed from there.  At least, that would make more sense
>> to me if it did.
>
> Would it?  What if you had an array that took up 3GB of memory and you
> shifted off every element but the last one.  Would you expect your array
> to still occupy 3GB of memory?

I would hope that the GC could keep up, and I'd plan accordingly if
not.  But if your 3GB array was duped/cloned, hmm, is that better?
Please tell me we are talking about two different arguments.

Todd
Martin S. (Guest)
on 2009-04-27 09:31
h = {}

strs.each{|i|h[(v=i.split).shift]=v}

With this statement, the array size is 1...

Todd B. wrote:
> On Sun, Apr 26, 2009 at 11:45 PM, 7stud -- <removed_email_address@domain.invalid>
> wrote:
>> shift changes the array in place:
> I looked at an old thread where Bob Hutchinson piped in about
> potential GC problems with this, but that was back in '07
>
>> [20, 30]
>> x = arr[1..-1]
> new object is created by x =
>
>> [10, 20, 30]
>>
Todd B. (Guest)
on 2009-04-27 09:47
(Received via mailing list)
On Mon, Apr 27, 2009 at 12:31 AM, Martin S. 
<removed_email_address@domain.invalid>
wrote:
>> I looked at an old thread where Bob Hutchinson piped in about
>> potential GC problems with this, but that was back in '07
>>
>>> [20, 30]
>>> x = arr[1..-1]
>> new object is created by x =
>>
>>> [10, 20, 30]
>>>

Maybe we don't know exactly what it is you are trying to do.  strs
was, in my example, supposed to be a large list separated by new
lines, each containing something like...

number city city city ...

I just assumed that you wanted a list with an identifying marker,
which would be most likely a Hash.

You can iterate over hashes, just like arrays, but the nomenclature
can be different.
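
For example, a minimal sketch of looping over such a hash (the contents of `h` are sample data):

```ruby
h = { "3" => ["Tokyo", "Newyork", "Paris"], "2" => ["Toyota", "Honda"] }

# Hash#each yields a key and a value to the block.
h.each do |count, items|
  items.each { |item| puts "#{count}: #{item}" }
end
```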

What do you have as data and what do you want to do with it?

Todd
Martin S. (Guest)
on 2009-04-27 16:49
:) I am really confused by Ruby now.
OK, what I want to do is this; let me describe it in C:

input is one line: number city1 city2....city2

for(i=0;i<number;i++)
{
   function(cityi)
}


These Ruby statements
h = {}

strs.each{|i|h[(v=i.split).shift]=v}

can split the city names, but can I loop over h? How?

Thanks!

> Maybe we don't know exactly what it is you are trying to do.  strs
> was, in my example, supposed to be a large list separated by new
> lines, each containing something like...
>
> number city city city ...
>
> I just assumed that you wanted a list with an identifying marker,
> which would be most likely a Hash.
>
> You can iterate over hashes, just like arrays, but the nomenclature
> can be different.
>
> What do you have as data and what do you want to do with it?
>
> Todd
Mark T. (Guest)
on 2009-04-27 17:10
(Received via mailing list)
On Apr 27, 8:50 am, Martin S. <removed_email_address@domain.invalid> wrote:
> :)I am really confused with ruby now.
> OK, what I want to do is this, let me describe by c
>
> input is one line: number city1 city2....city2
>
> for(i=0;i<number;i++)
> {
>    function(cityi)
>
> }

Is that all you want to do? Once you have the cities,

cities.each do |city|
  function(city)
end

So, putting it all together,

lines.each do |line|
  (cities = line.split).shift
  cities.each do |city|
    function(city)
  end
end

You don't really need the number at all.
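
A runnable sketch of the above, with a stand-in `function` that just prints its argument and made-up sample `lines`. If you do want a C-style loop driven by the count, Integer#times is one way to write it:

```ruby
# Stand-in for whatever you want to do with each city.
def function(city)
  puts "visiting #{city}"
end

lines = ["3 Tokyo Newyork Paris"]  # assumed sample input

lines.each do |line|
  number, *cities = line.split
  # Rough equivalent of: for (i = 0; i < number; i++) function(cities[i]);
  number.to_i.times { |i| function(cities[i]) }
end
```

The count-driven loop trusts the number on the line; iterating with cities.each avoids that dependency.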

-- Mark.
Todd B. (Guest)
on 2009-04-28 00:39
(Received via mailing list)
On Sun, Apr 26, 2009 at 9:11 PM, Todd B. <removed_email_address@domain.invalid>
wrote:
> had to use (items = my_item_list.split) inside parens like that
>
That looks like I'm buoying my idea.  I think any single one of these
methods will work just fine.  I'm somewhat leaning towards [1..-1]
because it's very clear what's happening.

I'll stick with my #shift for now because it makes logical sense to
me.  I'm not overly concerned in my work with speed or memory use.
James knows what he's talking about so really listen to him.

Todd