Iterate chars in a string


#1

Hi there!
I’m a Ruby newbie, and I’m looking for a way to iterate over every char
in a string, but I cannot find an easy way. My problem is to look at every
char in a string and match it against some known letter.
I use the String#each_byte iterator for now, but it still seems a poor
solution :-/
Thanks,

shinya.


#2

On Dec 20, 2005, at 4:52 AM, shinya wrote:

Hi there!
I’m a Ruby newbie, and I’m looking for a way to iterate over every
char in a string, but I cannot find an easy way. My problem is to
look at every char in a string and match it against some known letter.
I use the String#each_byte iterator for now, but it still seems a poor
solution :-/
Thanks,

shinya.

The usual idiom is:

str.split(//).each do |character|
  # do stuff with character
end

e.g.:

logan:/Users/logan% irb
irb(main):001:0> str = "Hello, world!"
=> "Hello, world!"
irb(main):002:0> str.split(//).each do |character|
irb(main):003:1* puts character
irb(main):004:1> end
H
e
l
l
o
,
 
w
o
r
l
d
!
=> ["H", "e", "l", "l", "o", ",", " ", "w", "o", "r", "l", "d", "!"]

If the extraneous typing bothers you, you can always add it to String:

class String
  def each_char(&block)
    split(//).each(&block)
    self
  end
end
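
A note for readers on a newer interpreter: String#each_char and String#chars are built in from Ruby 1.8.7 onward, so the patch above is only needed on older versions (and on newer ones it would replace the built-in method). A minimal sketch, assuming Ruby 1.9 or later, that checks the built-ins against the split(//) idiom:

```ruby
str = "Hello, world!"

# The idiom from this post: splitting on an empty pattern
# yields an array of one-character strings.
via_split = str.split(//)

# Built-in equivalents (assumption: Ruby 1.9 or later).
via_chars = str.chars.to_a   # to_a also covers versions where chars returns an enumerator
via_each_char = []
str.each_char { |character| via_each_char << character }

puts via_split == via_chars       # prints true
puts via_split == via_each_char   # prints true
```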


#3

Logan C. wrote:

The usual idiom is:

str.split(//).each do |character|
  # do stuff with character
end

Thank you very much! I did it :)
bye!

shinya.


#4

"hello world".each_byte { |i| puts "%c" % i }


#5

On 12/20/05, Logan C. removed_email_address@domain.invalid wrote:

The usual idiom is:

str.split(//).each do |character|
  # do stuff with character
end

String#scan with a block is lighter weight, and less wordy:

str.scan(/./) do |character|
  # stuff
end


#6

Logan C. wrote:

"r"
"l"
"d"

Where’d my newlines go? :(

Heh, good point. Thanks for mentioning this. I think this might be quite
a common pitfall.

Time to grep my code and see if this could possibly cause trouble
anywhere…


#7

On Dec 20, 2005, at 6:21 AM, Mark H. wrote:

str.scan(/./) do |character|
  # stuff
end

str = "Hello\nWorld"

str.scan(/./) do |character|
  p character
end

irb(main):015:0> str.scan(/./) do |character|
irb(main):016:1* p character
irb(main):017:1> end
"H"
"e"
"l"
"l"
"o"
"W"
"o"
"r"
"l"
"d"

Where’d my newlines go? :(

irb(main):021:0> str.split(//).each do |character|
irb(main):022:1* p character
irb(main):023:1> end
"H"
"e"
"l"
"l"
"o"
"\n"
"W"
"o"
"r"
"l"
"d"

If you want to use scan, you should use scan(/./m):

irb(main):024:0> str.scan(/./m) do |character|
irb(main):025:1* p character
irb(main):026:1> end
"H"
"e"
"l"
"l"
"o"
"\n"
"W"
"o"
"r"
"l"
"d"
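
The difference shown in the sessions above comes down to "." not matching "\n" unless the regexp has the /m flag. A small sketch that makes the gap visible by collecting the matches instead of printing them (the variable names are made up for illustration):

```ruby
str = "Hello\nWorld"

without_m = str.scan(/./)    # "." skips "\n" by default
with_m    = str.scan(/./m)   # with /m, "." matches "\n" as well

p without_m.length     # 10
p with_m.length        # 11
p with_m - without_m   # ["\n"]
```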


#8

I’ve always used this:

0.upto(string.length-1) do |n|
  p string[n,1]
end

Anything wrong with this?
greetings, Dirk.

2005/12/20, Florian Groß removed_email_address@domain.invalid:


#9

“Wrong” is a strong word, but I’d say this isn’t ideal Ruby for two
reasons:

  1. Generally in Ruby, internal iterators (e.g. each, each_byte, etc.)
    are preferred over external iterators like what you have here (and
    over for and while loops).
  2. Your code isn’t as efficient, though it wasn’t as bad as I thought:

| QuickBench Session Started |

300 Iterations
                                   user     system      total        real
  1. s.each_byte{|b| b.chr …   2.656000   0.000000   2.656000 (  2.669000)
  2. 0.upto(s.length-1) { |…   2.859000   0.000000   2.859000 (  2.887000)
  3. s.split(//).each { |ch…   3.297000   0.000000   3.297000 (  3.290000)
  4. s.scan(/./m) { |charac…   7.594000   0.156000   7.750000 (  7.776000)

| Fastest was <1. s.each_byte{|b| b.chr …> |

The string s above was 4000 x characters.

I was quite surprised the scan was so much slower, especially compared
to split, which creates an array every time.

Ryan
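
For anyone who wants to repeat a comparison like this, here is a rough sketch using the standard library's Benchmark module rather than QuickBench. The block bodies in the table above are truncated, so the bodies below are guesses, and the test string assumes that "4000 x characters" means the letter "x" repeated 4000 times. Timings will differ by Ruby version and machine.

```ruby
require 'benchmark'

s = "x" * 4000      # assumption: one reading of "4000 x characters"
iterations = 300    # same iteration count as the run above

Benchmark.bm(12) do |bm|
  # Each block body is a guess at the truncated original.
  bm.report("each_byte")  { iterations.times { s.each_byte { |b| b.chr } } }
  bm.report("upto")       { iterations.times { 0.upto(s.length - 1) { |n| s[n, 1] } } }
  bm.report("split(//)")  { iterations.times { s.split(//).each { |ch| ch } } }
  bm.report("scan(/./m)") { iterations.times { s.scan(/./m) { |ch| ch } } }
end
```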


#10

shinya wrote:

Hi there!
I’m a Ruby newbie, and I’m looking for a way to iterate over every char
in a string, but I cannot find an easy way. My problem is to look at every
char in a string and match it against some known letter.

If you just want to know if a string contains a character:

puts "Yep!" if str.include?("X")

I use the String#each_byte iterator for now, but it still seems a poor
solution :-/

Others have suggested various iterators, but why not use the standard
jcode lib?

require 'jcode'
str.each_char { |c| puts c }
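
The jcode library shipped with Ruby 1.8 and is gone from 1.9 onward, where String#each_char is built in. A minimal sketch of the original task (checking each character against some known letters), assuming Ruby 1.9 or later; the sample string and the known_letters list are made up for illustration:

```ruby
str = "Hello, world!"
known_letters = %w[l o]   # hypothetical letters to look for

# Walk the string one character at a time and tally the known letters.
tally = Hash.new(0)
str.each_char do |c|
  tally[c] += 1 if known_letters.include?(c)
end

p tally["l"]          # 3
p tally["o"]          # 2
p str.include?("X")   # false
p str.count("lo")     # 5, String#count does the same tally in one call
```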

#11

On Dec 20, 2005, at 3:35 PM, Dirk M. wrote:

I’ve always used this:

0.upto(string.length-1) do |n|
  p string[n,1]
end

Anything wrong with this?

You mean besides the fact that it’s not very Rubyish?
Seriously, it seems fine, though I prefer each_byte() or scan(/./m).

James Edward G. II


#12

On 12/20/05, Bob S. removed_email_address@domain.invalid wrote:

Others have suggested various iterators, but why not use the standard jcode lib?

require 'jcode'
str.each_char { |c| puts c }

In case anyone is curious, the above just uses scan(/./m), and is even
slower in my benchmark because of the extra method calls. But in
reality all are quite fast enough, and the performance differences
aren’t all that significant.

Ryan


#13

a = "123"

0.upto(a.length - 1) { |i| puts a[i..i] }


#14

| QuickBench Session Started |

300 Iterations
                                   user     system      total        real
  1. s.each_byte{|b| b.chr …   2.656000   0.000000   2.656000 (  2.669000)

each_char should beat the current leader as b.chr is not needed.

Strange it is not a part of Ruby as the char is the most natural part of
a string.

Christer


#15

a.split( // ).each { |c| puts c }

j.


Jeff W.


#16

On Dec 20, 2005, at 6:32 PM, Christer N. wrote:

Strange it is not a part of Ruby as the char is the most natural part of
a string.

I don’t think that is true. The semantics of each_byte are quite clear,
but exactly what is a character? In some encodings one byte is the same
as one character, but in other encodings it might be two bytes, and in
others it might be a variable number of bytes.

Iterating by ‘character’ only has meaning with respect to a character
set encoding, and a Ruby string generally doesn’t have that sort of
information.

I think it has been said before, but a Ruby string is more like an
array of bytes than a sequence of code points in an (implicit)
character set.

Gary W.
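
The description above fits Ruby 1.8, which was current when this thread was written. From Ruby 1.9 on, every String carries an encoding, so bytes and characters can be told apart. A small sketch of the distinction, assuming Ruby 1.9 or later and a UTF-8 string (the sample word is just an illustration):

```ruby
str = "na\u00EFve"   # "naïve"; the \u escape avoids depending on the source file's encoding

bytes = []
str.each_byte { |b| bytes << b }   # yields Integers, one per byte

chars = []
str.each_char { |c| chars << c }   # yields one-character Strings

p str.encoding.name   # "UTF-8"
p bytes.length        # 6, because "ï" takes two bytes in UTF-8
p chars.length        # 5
```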


#17

Robert K. wrote:

Many roads to Rome…

robert

or even

a.scan(/./) { |c| p c }

Kev


#18

Kev J. wrote:

a.length.times {|i| puts a[i].chr}

Many roads to Rome…

robert

or even

a.scan(/./) { |c| p c }

We had that already: your version ignores newlines. :)

robert

#19

Lyndon S. wrote:

a = "123"

0.upto(a.length - 1) { |i| puts a[i..i] }

Alternatively:

a.length.times {|i| puts a[i].chr}

Many roads to Rome…

robert
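
A side note on the indexing forms used in this thread. In Ruby 1.8, a[i] returns the byte value as an integer, which is why .chr is needed; in Ruby 1.9 and later it already returns a one-character string. A short sketch, assuming Ruby 1.9 or later, that also shows what happens one position past the end:

```ruby
a = "123"

a.length.times do |i|
  # Assumption: Ruby 1.9+, where all three forms give a one-character String.
  puts a[i] == a[i, 1] && a[i] == a[i..i]   # prints true three times
end

# One past the last index, the range form gives an empty string,
# so looping with 0.upto(a.length) would print one extra blank line.
p a[3..3]   # ""
p a[3]      # nil
```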

#20

Kev J. wrote:

a = "ook\nook\tEeek!\n"
a.scan(/.|\s/) { |c| p c }

a.scan(/./m) {|c| p c } # m is for multi-line

Cheers,
Dave
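
As far as I can tell the two patterns in this post give the same result, since \s covers the newline that a plain "." skips. A quick sketch to check that on the sample string (variable names are made up):

```ruby
a = "ook\nook\tEeek!\n"

alternation = a.scan(/.|\s/)   # "." for everything except "\n", \s picks up the newline
multiline   = a.scan(/./m)     # under /m, "." matches "\n" too

p alternation == multiline       # true
p multiline.length == a.length   # true
```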