For a string "ZBBBCZZ", I want to produce a list ["Z", "BBB", "C", "ZZ"].
That is, break the string into pieces based on change of character.
Though this works:
s = "ZBBBCZZ"
x = s.scan(/((.)\2*)/).map {|i| i[0]}
I’m new to Ruby and am interested to learn if there is a better way to
do it.
BTW, in Python, it can be done with a regex (similar to above) or via
their itertools library:
import itertools
s = "ZBBBCCZZ"
x = [''.join(g) for k, g in itertools.groupby(s)]
Does anyone know if Ruby has a similar library to Python’s itertools?
Thanks,
/-\
From: Andrew S. [mailto:[email protected]]
s = "ZBBBCZZ"
x = s.scan(/((.)\2*)/).map {|i| i[0]}
when it comes to string patterns like this, nothing beats regex
import itertools
s = "ZBBBCCZZ"
x = [''.join(g) for k, g in itertools.groupby(s)]
Does anyone know if Ruby has a similar library to Python’s itertools?
hmm, you seem to like this better than your previous regex+map solution, why?
(i ask because i prefer your first solution --not that it’s ruby)
in 1.9 or the upcoming ruby, it keeps getting better and better and may
look like this,
s = "ZBBBCZZ"
x = s.split('').group_by{|x| x}.entries
or possibly to
x = s.split('').group_by.entries
but unfortunately i don’t have a 1.9 build here to test (grrr, shouldn’t
have deleted that vm).
kind regards -botp
— Peña, Botp [email protected] wrote:
hmm, you seem to like this better than your previous regex+map solution, why? (i ask
because i prefer your first solution --not that it’s ruby)
Actually, I’m not super happy with either solution.
The annoyance with the regex solution:
s = "ZBBBCZZ"
x = s.scan(/((.)\2*)/).map {|i| i[0]}
is that you capture the backref (when you don’t really want to), only
to discard it in the map, which seems a bit awkward and inefficient.
However, I can’t see any way around this owing to the regex semantics
of returning all fields in parens (one has the same problem in Perl
and Python, BTW). [If there were a way to specify a non-capturing
back-ref, that would do the trick.]
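To make the complaint concrete, here is a small sketch of what scan hands back for this pattern before the map step. The helper name runs_via_scan is only illustrative and not something from the thread.

```ruby
# Hypothetical helper name; wraps the scan + map approach quoted above.
def runs_via_scan(str)
  # With capture groups, scan returns one [whole_run, single_char] pair per match.
  pairs = str.scan(/((.)\2*)/)
  # Keep only the first element of each pair (the whole run).
  pairs.map { |run, _char| run }
end

p "ZBBBCZZ".scan(/((.)\2*)/)  # => [["Z", "Z"], ["BBB", "B"], ["C", "C"], ["ZZ", "Z"]]
p runs_via_scan("ZBBBCZZ")    # => ["Z", "BBB", "C", "ZZ"]
```

The second element of every pair is the backreferenced character, which is exactly the part that gets thrown away.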
in 1.9 or the upcoming ruby, it keeps getting better and better and may look
like this,
s = "ZBBBCZZ"
x = s.split('').group_by{|x| x}.entries
My reading of:
eigenclass.org
indicates that Enumerable#group_by can’t work because it would seem to
lose the ordering and, grouping by key, will have only one group for 'Z'
above, when I want two distinct groups. (I would be delighted to be proved
wrong, however.)
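A quick sketch of the difference, for what it is worth. It assumes a Ruby newer than the ones discussed in this thread (1.9.2 or later), where String#each_char and Enumerable#chunk are available; chunk groups only consecutive elements, which is roughly what Python's itertools.groupby does. The method name runs_via_chunk is made up for the example.

```ruby
s = "ZBBBCZZ"

# group_by collects by key across the whole string, so both Z runs merge.
grouped = s.each_char.group_by { |c| c }.values.map(&:join)

# chunk only groups *consecutive* elements that share a block value.
# (Assumption: Ruby 1.9.2+; runs_via_chunk is a hypothetical name.)
def runs_via_chunk(str)
  str.each_char.chunk { |c| c }.map { |_key, chars| chars.join }
end

p grouped            # => ["ZZZ", "BBB", "C"] on Rubies with insertion-ordered hashes
p runs_via_chunk(s)  # => ["Z", "BBB", "C", "ZZ"]
```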
I also scanned the Facets library but didn’t find anything obvious.
Cheers,
/-\
On Sat, Aug 11, 2007 at 09:52:24AM +0900, Andrew S. wrote:
BTW, in Python, it can be done with a regex (similar to above) or via
their itertools library:
import itertools
s = "ZBBBCCZZ"
x = [''.join(g) for k, g in itertools.groupby(s)]
Does anyone know if Ruby has a similar library to Python’s itertools?
Nothing off the top of my head, but how does this work for you ?
in_str.split('').inject([]) do |m,l|
  if m.last and m.last[0].chr == l
    m[-1] += l
  else
    m << l
  end
  m
end
It's not two lines, but it will return the same array
enjoy
-jeremy
Andrew S. wrote:
s = "ZBBBCZZ"
x = s.scan(/((.)\2*)/).map {|i| i[0]}
Maybe this is faster:
result = []
"ZBBBCZZ".scan(/((.)\2*)/){erg.push [$~[0]]}
p erg # => [["Z"], ["BBB"], ["C"], ["ZZ"]]
Wolfgang Nádasi-Donner
result = []
"ZBBBCZZ".scan(/((.)\2*)/){result.push [$~[0]]}
p result # => [["Z"], ["BBB"], ["C"], ["ZZ"]]
Sorry - typo by translation of variable name
Wolfgang Nádasi-Donner
Hi –
On Sat, 11 Aug 2007, Peña, Botp wrote:
hmm, you seem to like this better than your previous regex+map solution, why? (i ask because i prefer your first solution --not that it’s ruby)
in 1.9 or the upcoming ruby, it keeps getting better and better and may look like this,
s = "ZBBBCZZ"
x = s.split('').group_by{|x| x}.entries
or possibly to
x = s.split('').group_by.entries
I’m going to have to get special glasses that can read invisible
ink…
David
Andrew S. schrieb:
For a string "ZBBBCZZ", I want to produce a list ["Z", "BBB", "C", "ZZ"]
That is, break the string into pieces based on change of character.
Though this works:
s = "ZBBBCZZ"
x = s.scan(/((.)\2*)/).map {|i| i[0]}
you may want to write it as …map{|i,|i}
I’m new to Ruby and am interested to learn if there is a better way to
do it.
BTW, in Python, it can be done with a regex (similar to above) or via
their itertools library:
import itertools
s = "ZBBBCCZZ"
x = [''.join(g) for k, g in itertools.groupby(s)]
Does anyone know if Ruby has a similar library to Python’s itertools?
No idea, here is another variant to play with:
x = /#{s.gsub(/(.)\1*/, '(\1+)')}/.match(s).captures
funny little problem.
cheers
Simon
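For anyone puzzling over how that one-liner works, here is a rough sketch of the intermediate pattern it builds. The second half is a hypothetical hardened variant (runs_via_generated_regex is a made-up name) that escapes each character with Regexp.escape, on the assumption that inputs containing regex metacharacters such as "(" would otherwise upset the generated pattern.

```ruby
s = "ZBBBCZZ"

# Each run of a repeated character becomes a "(c+)" group in a generated pattern.
pattern_source = s.gsub(/(.)\1*/, '(\1+)')
p pattern_source  # => "(Z+)(B+)(C+)(Z+)"

# Hypothetical variant: escape every character so metacharacters in the
# input are matched literally instead of changing the generated regex.
def runs_via_generated_regex(str)
  source = str.gsub(/(.)\1*/) { "(#{Regexp.escape($1)}+)" }
  Regexp.new(source).match(str).captures
end

p runs_via_generated_regex("ZBBBCZZ")  # => ["Z", "BBB", "C", "ZZ"]
p runs_via_generated_regex("a((b")     # => ["a", "((", "b"]
```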
Hi –
On Sat, 11 Aug 2007, Andrew S. wrote:
For a string "ZBBBCZZ", I want to produce a list ["Z", "BBB", "C", "ZZ"]
That is, break the string into pieces based on change of character.
Though this works:
s = "ZBBBCZZ"
x = s.scan(/((.)\2*)/).map {|i| i[0]}
I’m new to Ruby and am interested to learn if there is a better way to
do it.
Probably not better, but just for fun, here’s a way using the strscan
extension. I’d be very interested if anyone can get this to be less
clunky – in particular, the - [""] at the end.
require 'strscan'
s = StringScanner.new("AABCCCDAAAEE")
s.string.split(//).inject([]) {|a,b| a << s.scan_until(/(?!#{b})/) } - [""]
=> ["AA", "B", "CCC", "D", "AAA", "EE"]
David
On Aug 11, 2007, at 2:52 AM, Andrew S. wrote:
For a string "ZBBBCZZ", I want to produce a list ["Z", "BBB", "C", "ZZ"]
That is, break the string into pieces based on change of character.
Though this works:
s = "ZBBBCZZ"
x = s.scan(/((.)\2*)/).map {|i| i[0]}
Yeah, it’s short but I agree with things you dislike about it. My
approach was essentially the same as Jeremy’s:
s.split(//).inject([]) {|g, c| (g.last && g.last[c] ? g.last : g) << c; g}
That’s just playing around though, I think that approach is not better.
In my view a better idiom would be to split on character switches.
That would be concise. But as you know if you put groups you get them
back. I see no way to express the condition for boundaries without
using groups.
– fxn
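For readers on a much newer Ruby than this thread assumes, the "split on character switches" idea can be written without any capture groups. The sketch below relies on Enumerable#slice_when (an assumption about your version: Ruby 2.2 or later); the method name is invented for illustration.

```ruby
# Hypothetical helper; assumes Ruby 2.2+ for Enumerable#slice_when.
def runs_via_slice_when(str)
  # Start a new slice wherever two neighbouring characters differ.
  str.each_char.slice_when { |a, b| a != b }.map(&:join)
end

p runs_via_slice_when("ZBBBCZZ")  # => ["Z", "BBB", "C", "ZZ"]
p runs_via_slice_when("")         # => []
```

The block states the boundary condition directly, so nothing has to be captured and then discarded.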
On Aug 11, 2007, at 8:14 AM, [email protected] wrote:
s = "ZBBBCZZ"
require 'strscan'
s = StringScanner.new("AABCCCDAAAEE")
s.string.split(//).inject([]) {|a,b| a << s.scan_until(/(?!#{b})/) } - [""]
=> ["AA", "B", "CCC", "D", "AAA", "EE"]
My best effort:
require "strscan"
=> true
scanner = StringScanner.new("ZBBBCZZ")
=> #<StringScanner 0/7 @ "ZBBBC...">
char_runs = Array.new
=> []
char_runs << scanner.matched while scanner.scan(/(.)\1*/m)
=> nil
char_runs
=> ["Z", "BBB", "C", "ZZ"]
James Edward G. II
On Aug 10, 7:52 pm, Andrew S. [email protected] wrote:
/-\
s = "ZBBBCZZ"
==>"ZBBBCZZ"
s.scan( /((.)\2*)/ ).transpose.first
==>["Z", "BBB", "C", "ZZ"]
s.gsub( /(.)(?!\1)/, "\\1\n" ).split
==>["Z", "BBB", "C", "ZZ"]
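A short sketch of why the transpose trick works, in case it is not obvious at first glance. The wrapper name runs_via_transpose and the fallback for an empty string are additions for illustration, not part of the post above.

```ruby
s = "ZBBBCZZ"

# scan yields [run, char] pairs; transpose turns the list of pairs
# into one array of runs and one array of single characters.
pairs = s.scan(/((.)\2*)/)
p pairs.transpose  # => [["Z", "BBB", "C", "ZZ"], ["Z", "B", "C", "Z"]]

# Hypothetical wrapper: transpose.first is nil when there are no matches,
# so fall back to an empty array.
def runs_via_transpose(str)
  str.scan(/((.)\2*)/).transpose.first || []
end

p runs_via_transpose(s)   # => ["Z", "BBB", "C", "ZZ"]
p runs_via_transpose("")  # => []
```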
On 8/11/07, [email protected] [email protected] wrote:
s = "ZBBBCZZ"
x = s.split('').group_by{|x| x}.entries
or possibly to
x = s.split('').group_by.entries
I’m going to have to get special glasses that can read invisible
ink…
whoops, sorry =)
that should be
from
x = s.split('').group_by{|x| x}.entries.map{|x| x.join}
to
x = s.split('').group_by.entries.map{|x| x.join}
i assume that group_by without a block would group the elements by
themselves. maybe i should name it group not group_by
kind regards -botp
From: William J. [mailto:[email protected]]
s = "ZBBBCZZ"
==>"ZBBBCZZ"
s.scan( /((.)\2*)/ ).transpose.first
==>["Z", "BBB", "C", "ZZ"]
s.gsub( /(.)(?!\1)/, "\\1\n" ).split
==>["Z", "BBB", "C", "ZZ"]
ruby hacker, James, that is cool! gotta keep this.
kind regards -botp
Peña schrieb:
From: William J. [mailto:[email protected]]
s = "ZBBBCZZ"
==>"ZBBBCZZ"
s.scan( /((.)\2*)/ ).transpose.first
==>["Z", "BBB", "C", "ZZ"]
s.gsub( /(.)(?!\1)/, "\\1\n" ).split
==>["Z", "BBB", "C", "ZZ"]
ruby hacker, James, that is cool! gotta keep this.
kind regards -botp
Yeah, nice!
i think one can simplify from
s.gsub( /(.)(?!\1)/, "\\1\n" ).split
to
s.gsub(/(.)\1*/, '\0 ').split
?
cheers
Simon
On 8/12/07, Simon Kröger [email protected] wrote:
ruby hacker, James, that is cool! gotta keep this.
s.gsub(/(.)\1*/, '\0 ').split
Yes, it appears so. Another variation, which also works correctly on
strings that already contain whitespace, would be:
require 'enumerator'
s.enum_for(:gsub, /(.)\1*/).to_a
Which is sort of back to the original scan method.
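On later Rubies the same idea seems to need neither the require nor enum_for. As far as I know, String#gsub called with only a pattern (no replacement, no block) returns an Enumerator over the matches; the sketch below assumes Ruby 1.9 or newer, and the method name is hypothetical.

```ruby
# Hypothetical helper; assumes Ruby 1.9+ where blockless gsub returns an Enumerator.
def runs_via_gsub_enum(str)
  # Each element yielded by the enumerator is the full text of one match.
  str.gsub(/(.)\1*/).to_a
end

p runs_via_gsub_enum("ZBBBCZZ")    # => ["Z", "BBB", "C", "ZZ"]
p runs_via_gsub_enum("ZBBBC  ZZ")  # => ["Z", "BBB", "C", "  ", "ZZ"]
```

Because nothing is inserted into the string and split back out, embedded spaces survive as their own run.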
Hi –
On Sun, 12 Aug 2007, botp wrote:
themselves. maybe i should name it group not group_by
Actually I think group_by with nothing specified just returns an
enumerator over the array itself, so it probably will never be used (I
hope).
I don’t think group_by will work for this problem, though, because it
groups everything together:
irb(main):014:0> s
=> "AABCDAAE"
irb(main):015:0> s.split(//).group_by {|x| x }.map {|e| e.join }
=> ["AAAAA", "CC", "EE", "DD", "BB"]
Notice how all the A’s got put in one result.
David
On 8/12/07, [email protected] [email protected] wrote:
irb(main):015:0> s.split(//).group_by {|x| x }.map {|e| e.join }
=> ["AAAAA", "CC", "EE", "DD", "BB"]
Notice how all the A’s got put in one result.
arrghh, sorry, yes. it’s really grouping with no regard to sequence.
thank you for the update
kind regards -botp
On Aug 12, 3:06 am, Simon Kröger [email protected] wrote:
ruby hacker, James, that is cool! gotta keep this.
s.gsub(/(.)\1*/, '\0 ').split
?
cheers
Simon
Yes, with the possible exception of "\\1\n". I was anticipating the need to
allow the string to contain any character but a newline.
s = 'ZBBBC ZZ'
==>"ZBBBC ZZ"
s.gsub(/(.)\1*/, "\\0\n").split("\n")
==>["Z", "BBB", "C", " ", "ZZ"]
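One detail that is easy to trip over when copying replacement strings like these: inside double quotes, Ruby reads a single backslash followed by a digit as an octal character escape, so the backreference needs a doubled backslash (or single quotes). A small sketch, with runs_via_separator as a made-up name for the newline-separator approach:

```ruby
# In double quotes, "\1" is the character with code 1, not a backreference.
p "\1" == "\u0001"  # => true
# Single quotes keep the backslash, so '\0' equals the escaped form "\\0".
p '\0' == "\\0"     # => true

# Hypothetical helper: append a newline after each run, then split on it.
# Assumes the input itself contains no newline characters.
def runs_via_separator(str)
  str.gsub(/(.)\1*/, "\\0\n").split("\n")
end

p runs_via_separator("ZBBBC  ZZ")  # => ["Z", "BBB", "C", "  ", "ZZ"]
```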