Re: iterate chars in a string


#1

I’m surprised that not many people knew about ‘each_byte()’. Maybe it’s
a problem with Ruby docs? Or maybe it is just counter-intuitive - I
would expect each() to iterate over bytes, and provide each_lines() to
iterate over lines instead.

Mike

http://www.rubycentral.com/ref/

Modifying String#to_a to return an array of characters has been brought
up before (ruby-talk:148588 and following). I don’t think Matz likes
the idea, though.

Dan


#2

Mike, I really agree. I was expecting that behavior from “each” too some
time ago, and it took me some time to figure it out.
BTW I think the “nicest” solution is mine combined with Robert’s (pun
intended):

"Ty Mr. Klemme".each_byte { |b| puts b.chr }

my “%” stuff was rather clumsy.

bye for now
Robert


Two things are infinite: the universe and human stupidity; and as far as
the universe is concerned, I have not yet attained absolute certainty.

  • Albert Einstein

#3

The thing I don’t like about this behaviour is that an algorithm which
operates on containers and expects #each can’t work immediately with
strings.

Daniel Tse


#4

jogloran wrote:

The thing I don’t like about this behaviour is that an algorithm which
operates on containers and expects #each can’t work immediately with
strings.

That’s not true. It just depends on what you consider to be the parts
of a string. I’d agree that naturally one would expect characters to be
that - but “lines” is another option. And that’s the one that has been
chosen by Matz. It has some advantages, too, e.g. if you slurp in a file
into a single string and then want to iterate the lines.
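
A minimal sketch of that slurp-then-iterate case, assuming a current
Ruby (1.9 or later), where String#each is gone and the line iterator is
spelled each_line. The text and variable names are invented for
illustration.

```ruby
# Hypothetical file contents, slurped into one string.
text = "first line\nsecond line\nthird line\n"

# each_line yields every line with its terminator still attached.
lines = []
text.each_line { |line| lines << line }

p lines  # => ["first line\n", "second line\n", "third line\n"]
```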

Kind regards

robert

#5

On Tue, Mar 21, 2006 at 05:58:50PM +0900, Robert K. wrote:

jogloran wrote:

The thing I don’t like about this behaviour is that an algorithm which
operates on containers and expects #each can’t work immediately with
strings.

That’s not true. It just depends on what you consider to be the parts
of a string. I’d agree that naturally one would expect characters to be
that - but “lines” is another option. And that’s the one that has been
chosen by Matz. It has some advantages, too, e.g. if you slurp in a file
into a single string and then want to iterate the lines.

Besides,

RUBY_VERSION # => "1.8.4"
require 'enumerator'

def foo(x)
  x.each { |y| p y.chr }
end

str = "foo bar baz"
foo(str.enum_for(:each_byte))
# >> "f"
# >> "o"
# >> "o"
# >> " "
# >> "b"
# >> "a"
# >> "r"
# >> " "
# >> "b"
# >> "a"
# >> "z"


#6

Would it not be nice to simply extend the behavior of String#each, e.g.
like this:

"Really nothing intelligent to tell".each <some_intelligent_choice> do |c|
  puts c
end
==>
R
e
etc. etc.

<some_intelligent_choice> might be 1 (imagine what n could do!)
[ we could even formulate endless loops like this:
"".each 0 do
  puts "Eternity is a hack of a long time"
end
that is really great ;) ]
Or shall everybody do it themselves?

class String
def my_iterator …

I just do not think so.

Cheers
Robert


#7

Robert D. wrote:

Please don’t top post.

etc. etc.

<some_intelligent_choice> might be 1 (imagine what n could do!)

We have that already: it’s Enumerator.

irb(main):003:0> require 'enumerator'
=> true
irb(main):004:0> "foo\nbar".to_enum(:each_byte).each {|c| puts c.chr}
f
o
o

b
a
r
=> "foo\nbar"
irb(main):005:0> "foo\nbar".to_enum(:scan, /./m).each {|c| puts c}
f
o
o

b
a
r
=> "foo\nbar"

Cheers

robert

#8

Sorry Robert, but I do not think I made myself too clear.

We know about each_byte, and I already combined your #chr and
String#each_byte, so we had already established

"Ty Mr. Klemme".each_byte { |b| puts b.chr }

which is pretty elegant, don’t you agree?
So Enumerators are not needed.

After that we switched to discussing the behaviour of String#each, and I
really feel that String#each should give us that kind of behaviour.
As I see that a lot of people think that the current behaviour of
String#each is a good one, and changing it would not be an option
anyway, I thought that the only solution would be to enhance the
behaviour of String#each.
The idea for that behaviour comes from Ruby itself (look at IO#gets).

Then

"Ty Mr. Klemme".each_byte { |b| puts b.chr }

would be the same as

"Ty Mr. Klemme".each(1) { |b| puts b }

There is always more than one way to do it ;)

Robert


#9

Ross B. wrote:

"abc\ndef\nghi\njkl\n".enum_for(:each_byte).enum_slice(2).map do |(a,b)|
  (a + b).chr
end

=> ["\303", "m", "\311", "p", "\317", "s", "\325", "v"]

How would map and other Enumerable methods work if ‘each’ needed
arguments?

Maybe a little off the topic, but I still think it would be pretty nifty
if calling an iterator method without a block returned its own
enum_for():

"abcdefg".each_byte.each_slice(2).map do |a, b|
  (a + b).chr
end

To me it feels like partial application. It can’t do anything without a
block, so it simply passes its enumerator. No more messy 'enum_for()'s
littered everywhere. Here’s another example:

"abcdefg".each_byte.with_index.map do |a, i|
  a + i
end

Just my 2c, and thinking out loud
Mike


#10

Mike A. wrote:

Maybe a little off the topic, but I still think it would be pretty nifty
if calling an iterator method without a block returned its own
enum_for():

"abcdefg".each_byte.each_slice(2).map do |a, b|
  (a + b).chr
end

I believe Ruby 1.9 (experimental) has this feature.
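
For what it's worth, a sketch of how this reads on a Ruby where the
block-less call does return an enumerator (true of current releases, and
as far as I know of everything since 1.8.7/1.9). An even-length string
is used so that every slice has two bytes; with an odd length the last
`b` would be nil and `a + b` would raise.

```ruby
# each_byte without a block returns an Enumerator, so Enumerable
# methods can be chained directly, with no enum_for in sight.
sums = "abcdefgh".each_byte.each_slice(2).map { |a, b| a + b }
p sums      # => [195, 199, 203, 207]

# The with_index variant from the quoted post.
shifted = "abc".each_byte.with_index.map { |a, i| a + i }
p shifted   # => [97, 99, 101]
```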

Cheers,
Dave


#11

On Tue, 2006-03-21 at 18:47 +0900, Robert D. wrote:

Duck typing?

require 'enumerator'

def just_iterate(obj)
  obj.each { |e| p e }
end

just_iterate("abc")
# (prints) "abc"

just_iterate("abc".enum_for(:each_byte))
# (prints) 97
#          98
#          99

"Ty Mr. Klemme".each_byte { |b| puts b.chr }

would be the same as

"Ty Mr. Klemme".each(1) { |b| puts b }

There is always more than one way to do it ;)

Umm, there’s already more than one way to do it :) But Enumerator
really is a very useful class:

"abc\ndef\nghi\njkl".each { |e| p e }
# "abc\n"
# "def\n"
# "ghi\n"
# "jkl"

"abc\ndef\nghi\njkl\n".enum_for(:each_byte).each { |e| p e.chr }
# "a"
# "b"
# "c"
# "\n"
# "d"
# "e"
# etc.

"abc\ndef\nghi\njkl\n".enum_slice(2).each { |e| p e }
# ["abc\n", "def\n"]
# ["ghi\n", "jkl\n"]

"abc\ndef\nghi\njkl\n".enum_for(:each_byte).enum_slice(2).each { |e| p e }
# [97, 98]
# [99, 10]
# [100, 101]

And crucially:

"abc\ndef\nghi\njkl\n".enum_for(:each_byte).enum_slice(2).map do |(a,b)|
  (a + b).chr
end

=> ["\303", "m", "\311", "p", "\317", "s", "\325", "v"]

How would map and other Enumerable methods work if ‘each’ needed
arguments?
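
One partial answer, sketched under the assumption of a current Ruby
(where the line iterator is each_line rather than each): enum_for
accepts extra arguments and passes them on to the named method, so the
resulting object has an argument-free each and map works on it. The
sample string is made up.

```ruby
# The "," is bound as the separator argument of each_line.
fields = "a,b,c".enum_for(:each_line, ",")

# Enumerable methods only ever see a plain, argument-free #each.
stripped = fields.map { |s| s.chomp(",") }
p stripped  # => ["a", "b", "c"]
```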


#12

Modifying String#to_a to return an array of characters has been brought
up before (ruby-talk:148588 and following). I don’t think Matz likes
the idea, though.

It should be pointed out that there really is a flaw in the way Ruby
handles this. #each takes a parameter which defaults to the global
variable $/ to determine the actual split to perform. Not only can’t
Enumerable handle the parameter, but worse, relying on a global
variable like that is dangerous! To be safe one would have to set the
global every time it is used, or always give the parameter. Otherwise
another lib might change it on you. The upshot of all this is that we
are much less inclined to even bother with String#each, which is too
bad.
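
A defensive sketch along those lines, assuming a current Ruby where the
method is each_line: passing the separator explicitly means the global
is never consulted. `lines_of` is a hypothetical helper, not part of any
library.

```ruby
# Hypothetical helper: always passes the separator, so the value of
# $/ has no effect on the result.
def lines_of(text, sep = "\n")
  text.each_line(sep).to_a
end

p lines_of("abc\ndef\n")   # => ["abc\n", "def\n"]
p lines_of("a;b;c", ";")   # => ["a;", "b;", "c"]
```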

BTW, see Calibre’s EnumerablePass for a way to allow #each to take
parameters and still have enumerability.

T.


#13

On Tue, Mar 21, 2006 at 04:31:39PM +0900, jogloran wrote:
} The thing I don’t like about this behaviour is that an algorithm which
} operates on containers and expects #each can’t work immediately with
} strings.

A decision had to be made on how to split up a string when using each.
The decision was made to split it up by lines, since that was deemed to
be the most often used case. I think that’s probably a correct
assessment. Now, if you want it to use something other than newlines as
its split, you can explicitly split by whatever you want:

'foobar'.split('').each { |c| puts c }

Basically, stringvar.each is roughly a shortcut for
stringvar.split("\n").each (although each keeps the trailing newlines),
because line splitting is the common case.
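
A quick comparison of the two spellings, assuming a current Ruby
(each_line in place of String#each); the sample strings are invented.
The line iterator keeps the terminators, while split drops them.

```ruby
s = "abc\ndef\n"

by_line  = s.each_line.to_a    # => ["abc\n", "def\n"]
by_split = s.split("\n")       # => ["abc", "def"]

# Splitting on the empty string gives one element per character.
chars = "foobar".split("")     # => ["f", "o", "o", "b", "a", "r"]
```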

} Daniel Tse
–Greg

} On 3/21/06, Robert D. removed_email_address@domain.invalid wrote:
} >
} > Mike I really agree, I was expecting that behavior from “each” too,
} > some time ago, and it took me some time to figure it out
} > BTW I think the “nicest” solutions is mine combined with Robert’s
} > (pun intended)
} >
} > "Ty Mr. Klemme".each_byte { |b| puts b.chr }
} >
} > my “%” stuff was rather clumsy.
} >
} > bye for now
} > Robert
} >
} > On 3/20/06, Berger, Daniel removed_email_address@domain.invalid wrote:
} > >
} > > > -----Original Message-----
} > > > From: Mike A. [mailto:removed_email_address@domain.invalid]
} > > > Sent: Monday, March 20, 2006 12:39 PM
} > > > To: ruby-talk ML
} > > > Subject: Re: iterate chars in a string
} > > >
} > > >
} > > > "I am puzzled".each_byte { |b| puts b.chr }
} > > >
} > > > I’m surprised that not many people knew about ‘each_byte()’.
} > > > Maybe it’s a
} > > > problem with Ruby docs? Or maybe it is just
} > > > counter-intuitive - I would expect
} > > > each() iterate over bytes, and provide each_lines() to
} > > > iterate over lines instead.
} > > >
} > > > Mike
} > > >
} > > > http://www.rubycentral.com/ref/
} > >
} > > Modifying String#to_a to return an array of characters has been
} > > brought up before (ruby-talk:148588 and following). I don’t think
} > > Matz likes the idea, though.
} > >
} > > Dan


#14

Hi –

On Tue, 21 Mar 2006, Trans wrote:

global every time it is used, or always give the parameter. Otherwise
another lib might change it on you. The upshot of all this is that we
are much less inclined to even bother with String#each, which is too
bad.

Another lib might change anything on you :) We all have to trust
each other not to do that.

David


David A. Black (removed_email_address@domain.invalid)
Ruby Power and Light, LLC (http://www.rubypowerandlight.com)

“Ruby for Rails” chapters now available
from Manning Early Access Program! http://www.manning.com/books/black


#15

Another lib might change anything on you :) We all have to trust
each other not to do that.

Of course, but the behavior of String#each is not a good context to
invite global modification. The behavior needs to be more reliable
than that.

T.


#16

Hi –

On Tue, 21 Mar 2006, Trans wrote:

Another lib might change anything on you :) We all have to trust
each other not to do that.

Of course, but the behavior of String#each is not a good context to
invite global modification. The behavior needs to be more reliable
than that.

Well… I’d rather rule out global modification as a response to
features one doesn’t like, even if it means living with a few of
those.

David




#17

Well… I’d rather rule out global modification as a response to
features one doesn’t like, even if it means living with a few of
those.

What other kind of features would one be inclined to modify?

T.


#18

Sorry for the doubletons, I am working on it :(

I am not aware of the impact this kind of discussion might have on the
evolution of Ruby, but I fail to understand why extending the behavior
of String#each in a completely backward-compatible way might be a
problem.

In my ideal Ruby World
aString.each would do what it ever did
aString.each(anotherString) would do what it ever did

but
aString.each(aFixnum) will get us slices “doing what one might expect it
to do”.
aString.each(0) might implement an endless loop (naah too dangerous)
aString.each(nil) might become meaningful in some other ways
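
To make the idea concrete, a rough sketch of the integer case written as
a separate method rather than as a change to String#each. The name
`each_chunk` is hypothetical, and a current Ruby with each_char is
assumed; a size of 0 is rejected here rather than looping forever.

```ruby
class String
  # Hypothetical: yield successive substrings of n characters.
  def each_chunk(n)
    raise ArgumentError, "chunk size must be positive" unless n > 0
    # Without a block, hand back an Enumerator so map etc. still work.
    return enum_for(:each_chunk, n) unless block_given?
    each_char.each_slice(n) { |chars| yield chars.join }
    self
  end
end

p "Ty Mr. Klemme".each_chunk(1).first(3)  # => ["T", "y", " "]
p "abcdefg".each_chunk(3).to_a            # => ["abc", "def", "g"]
```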

Matz did such great and beautiful work on getting close to that
ambitious goal, and I still have the feeling that String#each does not
fit into that picture.

The beauty of the thing is that Ruby is probably one of the few
languages where that kind of discussion can occur, others being too ugly
anyway to plead for beauty (very biased opinion, I admit).
