Forum: Ruby
How to parse a "line"?

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- and Ruby-related community platforms.
Martin Sharon (martinh)
on 2009-04-26 22:48
Hi there:

I have a line in this format:
"num_of_item item1 item2 ... item_n", for example

"3 Tokyo Newyork Paris"

I want Ruby to parse this format, extract the keywords, and save them in
an array.
I googled and searched many forums, but couldn't find anything that matches.

Could anyone give me some ideas?
Thanks a lot!
James Gray (bbazzarrakk)
on 2009-04-26 22:57
(Received via mailing list)
On Apr 26, 2009, at 3:48 PM, Martin Sharon wrote:

>
> Would anyone give me any ideas??

If you just need to bust up the words via the spaces, this should
probably work for you:

 >> items = "3 Tokyo Newyork Paris".split[1..-1]
=> ["Tokyo", "Newyork", "Paris"]

James Edward Gray II
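The slicing approach above can be sketched as a small script. The variable names `line`, `items`, and `same` are illustrative only, and `Array#drop` is shown as an equivalent alternative available in modern Ruby, not something the post itself used.

```ruby
# Split on whitespace, then skip the leading count.
line = "3 Tokyo Newyork Paris"

items = line.split[1..-1]   # slice from index 1 through the last element
same  = line.split.drop(1)  # alternative: drop the first element

p items  # prints ["Tokyo", "Newyork", "Paris"]
p same   # prints ["Tokyo", "Newyork", "Paris"]
```

One difference worth knowing: on an empty line, `"".split[1..-1]` evaluates to nil, while `"".split.drop(1)` evaluates to an empty array.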
Robert Dober (Guest)
on 2009-04-26 23:12
(Received via mailing list)
On Sun, Apr 26, 2009 at 10:56 PM, James Gray <james@grayproductions.net>
wrote:
>> an array.
>> I googled and search many forums, but can't find any matches.
>>
>> Would anyone give me any ideas??
>
> If you just need to bust up the words via the spaces, this should probably
> work for you:
>
>>> items = "3 Tokyo Newyork Paris".split[1..-1]
> => ["Tokyo", "Newyork", "Paris"]

This is another opportunity to show Ruby's versatile style, e.g. a
functional one:

_, *items = "3 .a b c".split

R.
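If the leading count should actually be used rather than thrown away, the same splat assignment can capture it. This is only a sketch: the names `count`, `expected`, and `items` are made up here, and whether a mismatch should be an error is an assumption about the input, not something stated in the thread.

```ruby
# Destructure the count and the items in one assignment.
line = "3 Tokyo Newyork Paris"

count, *items = line.split
expected = Integer(count)   # raises ArgumentError if the first field is not a number

if items.size == expected
  puts "got all #{expected} items: #{items.inspect}"
else
  warn "expected #{expected} items, found #{items.size}"
end
```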
Todd Benson (Guest)
on 2009-04-27 00:56
(Received via mailing list)
On Sun, Apr 26, 2009 at 4:11 PM, Robert Dober <robert.dober@gmail.com>
wrote:
>>> I want ruby parse this format, extract these keywords and save them in
>
> This is another opportunity to show Ruby's versatile style, e.g. functional
>
> _, *items = "3 .a b c".split

I like...

(items = my_item_list).split.shift

Both you and I have examples, though, that do not return what the
statement using [1..-1] does.  Mine returns "3" (the element that was
shifted off).  Ah well, "items" exists, though :)

Todd
Todd Benson (Guest)
on 2009-04-27 01:00
(Received via mailing list)
On Sun, Apr 26, 2009 at 5:55 PM, Todd Benson <caduceass@gmail.com>
wrote:
>
> (items = my_item_list).split.shift

Oops...

(items = my_item_list.split).shift

Todd
Martin Sharon (martinh)
on 2009-04-27 02:25
Thank you Todd, but the number of keywords is dynamic.

The array size is not fixed until the program reads the first keyword
of the input line.
The line may be "3 tokyo newyork paris", or maybe
"6 toyota bmw honda GM Ford"

So I can't hard-code this array in the program.
How can I parse the line?

Thanks!
7stud -- (7stud)
on 2009-04-27 03:08
Martin Sharon wrote:
> Thank you Todd, but the number of the keywords are dynamic.
>
> and the array size is not fixed until program reads the first keywords
> of the input line.
>


Explain why these solutions won't work for you:

strs = ["6 toyota bmw honda GM Ford", "3 tokyo newyork paris"]

strs.each do |str|
  puts "1) split with subscripts:"
  p str.split()[1..-1]

  puts "2) split with parallel assignment:"
  first, *therest = str.split()
  p therest

  puts
end

--output:--
1) split with subscripts:
["toyota", "bmw", "honda", "GM", "Ford"]
2) split with parallel assignment:
["toyota", "bmw", "honda", "GM", "Ford"]

1) split with subscripts:
["tokyo", "newyork", "paris"]
2) split with parallel assignment:
["tokyo", "newyork", "paris"]
James Gray (bbazzarrakk)
on 2009-04-27 04:06
(Received via mailing list)
On Apr 26, 2009, at 7:25 PM, Martin Sharon wrote:

> Thank you Todd, but the number of the keywords are dynamic.
>
> and the array size is not fixed until program reads the first keywords
> of the input line.
> The line may be "3 tokyo newyork paris", also maybe
> "6 toyota bmw honda GM Ford"
>
> So I can't hard coded this array in the program.
> How can I parse the line?

Did you try our code?  All of our solutions work for all examples
you've posted so far…  :)

James Edward Gray II
Todd Benson (Guest)
on 2009-04-27 04:11
(Received via mailing list)
On Sun, Apr 26, 2009 at 7:25 PM, Martin Sharon <huangshuo.9@gmail.com>
wrote:
> Thank you Todd, but the number of the keywords are dynamic.
>
> and the array size is not fixed until program reads the first keywords
> of the input line.
> The line may be "3 tokyo newyork paris", also maybe
> "6 toyota bmw honda GM Ford"
>
> So I can't hard coded this array in the program.
> How can I parse the line?

If you look closely, you'll see that James' method grabs everything
from 1 to -1 (the end) of the array, omitting the zeroth element.

Robert's method assigns the garbage -- in this case, the first element
(the count) -- to a dummy variable, and the important stuff to the var
that you care about (items).

My way was only slightly different.  I opted to create the whole
array, and then drop the first without creating an extra object.  I
had to use (items = my_item_list.split) inside parens like that
because #shift returns the object you popped off the front of the
list, and not the actual remaining stuff.

In all examples given, the size doesn't really matter... hah!  Or as
they say, it depends :-)

There are probably several ways to do this; I just like that particular
one.

hth,
Todd
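The three approaches described above can be put side by side to see what each expression evaluates to. This is a sketch with made-up variable names (`sliced`, `splatted`, `shifted`, `removed`); it only restates what the posts already describe.

```ruby
line = "3 Tokyo Newyork Paris"

# 1) Slice: the expression itself evaluates to the items.
sliced = line.split[1..-1]

# 2) Splat: the first field goes to a throwaway variable, the rest to the array.
_count, *splatted = line.split

# 3) Shift: the expression evaluates to the removed element,
#    and the array left in `shifted` holds the items.
removed = (shifted = line.split).shift

p sliced    # ["Tokyo", "Newyork", "Paris"]
p splatted  # ["Tokyo", "Newyork", "Paris"]
p removed   # "3"
p shifted   # ["Tokyo", "Newyork", "Paris"]
```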
Todd Benson (Guest)
on 2009-04-27 04:48
(Received via mailing list)
On Sun, Apr 26, 2009 at 9:11 PM, Todd Benson <caduceass@gmail.com>
wrote:
>
> because #shift returns the object you popped off the front of the
> list, and not the actual remaining stuff.
>
> In all examples given, the size doesn't really matter... hah!  Or as
> they say, it depends :-)
>
> There are probably several ways to do this; I just like that particular one.

You know after looking at your original post, maybe you are trying to
create a digest, i.e. a Hash or associative array, with your keys as
the first "column".

In that case, you might try...

h = {}
f = my_string_list.each_line do |line|
  key = (value = line.split).shift
  h[key] = value
end

...or something like that.

Todd
Todd Benson (Guest)
on 2009-04-27 04:57
(Received via mailing list)
On Sun, Apr 26, 2009 at 9:47 PM, Todd Benson <caduceass@gmail.com>
wrote:
>
> In that case, you might try...
>
> h = {}
> f = my_string_list.each_line do |line|
>  key = (value = line.split).shift
>  h[key] = value
> end
>
> ...or something like that.

It's late and I'm golfing with myself.  Apologies to the list for this
almost one-liner...

h = {}
s.each_line{|i|h[(v=i.split).shift]=v}
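For readers who find the golfed version hard to follow, here is a longer sketch of the same idea. The input text and the names `text` and `table` are invented for illustration, and the squiggly heredoc assumes Ruby 2.3 or later (well after the 1.9.1 used in this thread).

```ruby
# Hypothetical input: one "key value value ..." record per line.
text = <<~DATA
  3 Tokyo Newyork Paris
  5 toyota bmw honda GM Ford
DATA

table = {}
text.each_line do |line|
  key, *values = line.split
  next if key.nil?          # skip blank lines
  table[key] = values       # first field is the key, the rest is the value
end

p table
```

Note that two lines starting with the same first field would overwrite each other, since a Hash keeps one value per key.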
Martin Sharon (martinh)
on 2009-04-27 05:06
Oh my god.....

Your code and James's really work!

Thanks a lot!
Martin Sharon (martinh)
on 2009-04-27 05:32
It works!

I used s.each instead of s.each_line, though.

thanks!

Todd Benson (Guest)
on 2009-04-27 05:35
(Received via mailing list)
On Sun, Apr 26, 2009 at 10:32 PM, Martin Sharon <huangshuo.9@gmail.com>
wrote:
> it works!
>
> s.each instead of s.each_line

I'm using 1.9.1.  Sorry I didn't mention that.  I haven't been following
the differences, but I seem to recall that was a change somewhere
along the line.

Todd
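For context: `String#each` existed in Ruby 1.8 and was removed in 1.9, where `String#each_line` is the way to walk a string line by line, while `Array#each` is available in both. Which one applies depends on whether the data is one multi-line String or an Array of lines. The sketch below uses made-up names (`text`, `lines`) and assumes a 1.9-or-later interpreter.

```ruby
text  = "3 Tokyo Newyork Paris\n5 toyota bmw honda GM Ford\n"
lines = text.lines.to_a      # the same data as an Array of lines

# A multi-line String is iterated with each_line.
from_string = {}
text.each_line { |l| k, *v = l.split; from_string[k] = v }

# An Array of lines is iterated with each.
from_array = {}
lines.each { |l| k, *v = l.split; from_array[k] = v }

p from_string == from_array   # true
p text.respond_to?(:each)     # false on Ruby 1.9 and later
```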
7stud -- (7stud)
on 2009-04-27 05:59
Todd Benson wrote:
> My way was only slightly different.  I opted to create the whole
> array, and then drop the first without creating an extra object.

Note that you made ruby go through the laborious chore of shifting every
element in the array over one spot to the left.
Todd Benson (Guest)
on 2009-04-27 06:05
(Received via mailing list)
On Sun, Apr 26, 2009 at 10:59 PM, 7stud -- <bbxx789_05ss@yahoo.com>
wrote:
> Todd Benson wrote:
>> My way was only slightly different.  I opted to create the whole
>> array, and then drop the first without creating an extra object.
>
> Note that you made ruby go through the laborious chore of shifting every
> element in the array over one spot to the left.

Is that really what happens?  I thought it just reassigned a starting
point, and indexed from there.  At least, that would make more sense
to me if it did.

Todd
Todd Benson (Guest)
on 2009-04-27 06:08
(Received via mailing list)
On Sun, Apr 26, 2009 at 11:03 PM, Todd Benson <caduceass@gmail.com>
wrote:
> to me if it did.
Let me rephrase.  What is the difference in the underlying structure
between [1..-1] and #shift?  I'm not sure if shift really moves
everything, but if it does, I wouldn't mind knowing.

Todd
Harry Kakueki (Guest)
on 2009-04-27 06:41
(Received via mailing list)
On Mon, Apr 27, 2009 at 1:07 PM, Todd Benson <caduceass@gmail.com>
wrote:
>> point, and indexed from there.  At least, that would make more sense
>> to me if it did.
>
> Let me rephrase.  What is different on the underlying structure with
> [1..-1] and #shift?  I'm not sure if shift really moves everything,
> but if it does, I wouldn't mind knowing.
>
> Todd
>
>

If it shifts everything, it sure does it fast :)

p Time.now
arr = (1..100_000_000).to_a
p Time.now
arr.shift
p arr[0]
p Time.now


###output

Mon Apr 27 13:34:06 +0900 2009
Mon Apr 27 13:36:29 +0900 2009
2
Mon Apr 27 13:36:29 +0900 2009


Harry
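A variation on the timing above, using the standard `benchmark` library and a much smaller array so it finishes quickly. This is only a rough sketch: the size `n` is arbitrary, a single timed call is noisy, and the numbers will differ between Ruby versions and machines, so no particular result is claimed here.

```ruby
require "benchmark"

n   = 1_000_000                 # far smaller than the 100_000_000 used above
arr = (1..n).to_a

shift_time = Benchmark.realtime { arr.shift }     # removes the first element in place
slice_time = Benchmark.realtime { arr[1..-1] }    # builds a new array, arr is unchanged

puts "shift: #{shift_time} s"
puts "slice: #{slice_time} s"
p arr[0]   # 2, because the old first element is gone
```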
7stud -- (7stud)
on 2009-04-27 06:45
Todd Benson wrote:
> On Sun, Apr 26, 2009 at 11:03 PM, Todd Benson <caduceass@gmail.com>
> wrote:
>> to me if it did.
> Let me rephrase.  What is different on the underlying structure with
> [1..-1] and #shift?  I'm not sure if shift really moves everything,
> but if it does, I wouldn't mind knowing.
>
> Todd

shift changes the array in place:

1)
arr = [10, 20, 30]

y = arr.shift
puts y
p arr

--output:--
10
[20, 30]


subscripts create a new array:

2)
arr = [10, 20, 30]
p arr
print "id: ", arr.object_id
puts

x = arr[1..-1]
p x
print "id: ", x.object_id
puts
p arr

--output:--
[10, 20, 30]
id: 75950
[20, 30]
id: 75860
[10, 20, 30]

> Is that really what happens?  I thought it just reassigned a starting
> point, and indexed from there.  At least, that would make more sense
> to me if it did.

Would it?  What if you had an array that took up 3GB of memory and you
shifted off every element but the last one.  Would you expect your array
to still occupy 3GB of memory?
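The in-place versus new-object distinction shown above can also be checked with `equal?`, which compares object identity, instead of reading `object_id` values by eye. A small sketch with illustrative variable names; it says nothing about how the interpreter manages the memory underneath.

```ruby
arr          = [10, 20, 30]
alias_of_arr = arr           # a second reference to the same Array object

first = arr.shift            # mutates the array in place
p first                      # 10
p alias_of_arr               # [20, 30], the alias sees the change
p alias_of_arr.equal?(arr)   # true, still one object

orig = [10, 20, 30]
rest = orig[1..-1]           # builds a new Array
p rest                       # [20, 30]
p orig                       # [10, 20, 30], untouched
p rest.equal?(orig)          # false, two distinct objects
```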
Todd Benson (Guest)
on 2009-04-27 07:06
(Received via mailing list)
On Sun, Apr 26, 2009 at 11:45 PM, 7stud -- <bbxx789_05ss@yahoo.com>
wrote:
> shift changes the array in place:
I looked at an old thread where Bob Hutchinson piped in about
potential GC problems with this, but that was back in '07

> [20, 30]
> x = arr[1..-1]
new object is created by x =

> [10, 20, 30]
>
>> Is that really what happens?  I thought it just reassigned a starting
>> point, and indexed from there.  At least, that would make more sense
>> to me if it did.
>
> Would it?  What if you had an array that took up 3GB of memory and you
> shifted off every element but the last one.  Would you expect your array
> to still occupy 3GB of memory?

I would hope that the GC could keep up, and I'd plan accordingly if
not.  But if your 3GB array was duped/cloned, hmm, is that better?
Please tell me we are talking about two different arguments.

Todd
Martin Sharon (martinh)
on 2009-04-27 07:31
h = {}

strs.each{|i|h[(v=i.split).shift]=v}

The array size is 1 with this statement...

Todd Benson (Guest)
on 2009-04-27 07:47
(Received via mailing list)
On Mon, Apr 27, 2009 at 12:31 AM, Martin Sharon <huangshuo.9@gmail.com>
wrote:
>> I looked at an old thread where Bob Hutchinson piped in about
>> potential GC problems with this, but that was back in '07
>>
>>> [20, 30]
>>> x = arr[1..-1]
>> new object is created by x =
>>
>>> [10, 20, 30]
>>>

Maybe we don't know exactly what it is you are trying to do.  strs
was, in my example, supposed to be a large list separated by new
lines, each containing something like...

number city city city ...

I just assumed that you wanted a list with an identifying marker,
which would be most likely a Hash.

You can iterate over hashes, just like arrays, but the nomenclature
can be different.

What do you have as data and what do you want to do with it?

Todd
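Iterating over a Hash of the shape built earlier in the thread might look like the sketch below. The sample data and the `visited` array are invented for illustration; `visited` simply stands in for whatever work needs to happen per city.

```ruby
# Hypothetical data in the shape built earlier: first field => remaining fields.
h = { "3" => ["Tokyo", "Newyork", "Paris"] }

visited = []
h.each do |count, cities|      # Hash#each yields each key and its value
  cities.each do |city|
    visited << city            # stand-in for the real per-city work
    puts "#{count}: #{city}"
  end
end
```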
Martin Sharon (martinh)
on 2009-04-27 14:49
:) I am really confused by Ruby now.
OK, here is what I want to do; let me describe it in C.

The input is one line: number city1 city2 ... cityN

for (i = 0; i < number; i++)
{
   function(city[i]);
}


These Ruby statements

h = {}

strs.each{|i|h[(v=i.split).shift]=v}

can split the city names, but can I loop over h? How?

Thanks!

Mark Thomas (Guest)
on 2009-04-27 15:10
(Received via mailing list)
On Apr 27, 8:50 am, Martin Sharon <huangshu...@gmail.com> wrote:
> :)I am really confused with ruby now.
> OK, what I want to do is this, let me describe by c
>
> input is one line: number city1 city2....city2
>
> for(i=0;i<number;i++)
> {
>    function(cityi)
>
> }

Is that all you want to do? Once you have the cities,

cities.each do |city|
  function(city)
end

So, putting it all together,

lines.each do |line|
  (cities = line.split).shift
  cities.each do |city|
    function(city)
  end
end

You don't really need the number at all.

-- Mark.
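Put together as something that can be run directly, the loop above might look like this. Everything specific here is an assumption for the sake of the example: `visit` is a hypothetical stand-in for the poster's `function(city)`, and the two input lines are made up.

```ruby
# `visit` is a hypothetical stand-in for the poster's function(city).
def visit(city)
  city.upcase
end

lines = ["3 Tokyo Newyork Paris", "5 toyota bmw honda GM Ford"]

results = lines.map do |line|
  _count, *cities = line.split         # ignore the leading number
  cities.map { |city| visit(city) }    # call the function once per city
end

p results
```

Using `map` instead of `each` is a choice made here so the results can be collected and inspected; `each` works just as well when only the side effects matter.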
Todd Benson (Guest)
on 2009-04-27 22:39
(Received via mailing list)
On Sun, Apr 26, 2009 at 9:11 PM, Todd Benson <caduceass@gmail.com>
wrote:
> had to use (items = my_item_list.split) inside parens like that
>
That looks like I'm buoying my idea.  I think any single one of these
methods will work just fine.  I'm somewhat leaning towards [1..-1]
because it's very clear what's happening.

I'll stick with my #shift for now because it makes logical sense to
me.  I'm not overly concerned in my work with speed or memory use.
James knows what he's talking about so really listen to him.

Todd