Forum: Ruby halving a string

Chris Shea (Guest)
on 2007-02-02 05:11
(Received via mailing list)
I have a vacuum fluorescent display in my office, and I've been
messing around with it. Now that I've figured out how to communicate
with the serial connection it's time for some fun. Its shortcoming is
its 2 lines of 20 chars. So I naturally wrote a fortune cookie program
for it.

I want to make adding new fortunes easy (and without having to figure
out myself where to force a line break), so I basically need to split
a string (approximately) in half at a word boundary. This is what I
have:

class String
  def halve
    first_half = ''
    second_half = self
    until first_half.length >= length / 2
      match = / /.match(second_half)
      first_half << match.pre_match << ' '
      second_half = match.post_match
    end
    [first_half.strip, second_half]
  end
end

I have a feeling there's a one-line regexp that can do this. Am I
right? If not, is there a better way?
Trans (Guest)
on 2007-02-02 05:20
(Received via mailing list)
On Feb 1, 11:10 pm, "Chris Shea" <cms...@gmail.com> wrote:
>
>   end
> end
>
> I have a feeling there's a one-line regexp that can do this. Am I
> right? If not, is there a better way?

untested but...

  i = [0..str.size / 2].index(' ')
  first_half, second_half = str[0...i], str[i..-1].strip

however, you might just prefer

  require 'facets/core/string/word_wrap'
  str.word_wrap(20)

T.
Chris Shea (Guest)
on 2007-02-02 05:30
(Received via mailing list)
On Feb 1, 9:20 pm, "Trans" <transf...@gmail.com> wrote:
> > I want to make adding new fortunes easy (and without having to figure
> >       first_half << match.pre_match << ' '
>
>   i = [0..str.size / 2].index(' ')
>   first_half, second_half = str[0...i], str[i..-1].strip
>
> however, you might just prefer
>
>   require 'facets/core/string/word_wrap'
>   str.word_wrap(20)
>
> T.

Tested, and failed. The index returns the position of the very first
space. (Also, it would have to be str[0..(str.size / 2)].index(' ')
but, still, it's no good).

Word wrapping is wrong for the occasion, as I'd like to split even
lines less than 20 chars, and have them of approximately equal length.
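
To make the failure concrete, here is a minimal sketch (the sample string is made up for illustration). On the first-half slice, index reports the first space, so the split lands nowhere near the middle:

```ruby
# Sample string chosen for illustration only.
str = 'a fairly long fortune to show'
half = str[0..(str.size / 2)]    # first half, up to and including the midpoint

i = half.index(' ')              # position of the FIRST space in that slice
first_half, second_half = str[0...i], str[i..-1].strip

p [first_half, second_half]      # => ["a", "fairly long fortune to show"]
```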
Bill Kelly (Guest)
on 2007-02-02 05:58
(Received via mailing list)
From: "Chris Shea" <cmshea@gmail.com>
>
> Tested, and failed. The index returns the position of the very first
> space. (Also, it would have to be str[0..(str.size / 2)].index(' ')
> but, still, it's no good).

rindex()

:)


Regards,

Bill
Chris Shea (Guest)
on 2007-02-02 06:25
(Received via mailing list)
On Feb 1, 9:57 pm, "Bill Kelly" <b...@cts.com> wrote:
> :)
>
> Regards,
>
> Bill

rindex, of course. Here's my current version then:

class String
  def halve
    i = self[0..(length / 2)].rindex(' ')
    if i.nil?
      if include?(' ')
        split(' ', 1)
      else
        [self, '']
      end
    else
      [self[0..i].strip, self[i..-1].strip]
    end
  end
end
Chris Shea (Guest)
on 2007-02-02 06:30
(Received via mailing list)
On Feb 1, 10:24 pm, "Chris Shea" <cms...@gmail.com> wrote:
> > :)
>     if i.nil?
>       if include?(' ')
>         split(' ', 1)
>       else
>         [self, '']
>       end
>     else
>       [self[0..i].strip, self[i..-1].strip]
>     end
>   end
> end

ahem: split(' ', 2)
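
For anyone wondering why the limit matters, a small sketch: the second argument to split caps the number of fields returned, so a limit of 1 never splits at all. The method below restates the rindex version with the corrected limit as a plain method; halve_at_last_space is a name made up for this sketch, and the sample strings are borrowed from elsewhere in the thread.

```ruby
# The limit is the maximum number of fields returned.
p 'one two three'.split(' ', 1)   # => ["one two three"]
p 'one two three'.split(' ', 2)   # => ["one", "two three"]

# halve_at_last_space is a hypothetical name for this sketch.
def halve_at_last_space(str)
  # Last space in the first half (midpoint included), or nil if none.
  i = str[0..(str.length / 2)].rindex(' ')
  if i.nil?
    if str.include?(' ')
      str.split(' ', 2)           # fall back to the first space anywhere
    else
      [str, '']                   # no space at all: nothing to split on
    end
  else
    [str[0..i].strip, str[i..-1].strip]
  end
end

p halve_at_last_space('the frog is green')   # => ["the frog", "is green"]
p halve_at_last_space('lizardman lives')     # => ["lizardman", "lives"]
p halve_at_last_space('whole')               # => ["whole", ""]
```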
Devin Mullins (twifkak)
on 2007-02-02 07:03
(Received via mailing list)
Naive solution:

class String
   def halve(max_len=nil)
     raise if max_len and size > max_len * 2 + 1
     space_indices = (0...size).find_all {|i| self[i] == ' '[0] }
     splitter = space_indices.sort_by {|i| (i - size/2).abs }.first
     if splitter.nil? ||
       (max_len && (splitter > max_len || size - splitter - 1 > max_len))
       [self[0...size/2], self[size/2..-1]]
     else
       [self[0...splitter], self[splitter+1..-1]]
     end
   end
end

if __FILE__ == $0
   require 'test/unit'

   class SplitterTest < Test::Unit::TestCase
     def test_stuff
       assert_equal ['the frog', 'is green'], 'the frog is green'.halve
       assert_equal ['ponchielli', 'wrote songs'],
                    'ponchielli wrote songs'.halve
       assert_raise(RuntimeError) { 'ab d'.halve 1 }
       assert_nothing_raised { 'a b'.halve 1 }
       assert_equal ['lizardman', 'lives'], 'lizardman lives'.halve
       assert_equal ['lizardm', 'an lives'], 'lizardman lives'.halve(8)
       assert_equal ['abcdef','ghijkl'], 'abcdefghijkl'.halve
     end
   end
end
Chris Shea (Guest)
on 2007-02-02 07:20
(Received via mailing list)
On Feb 1, 11:03 pm, Devin Mullins <twif...@comcast.net> wrote:
>      else
>        assert_equal ['the frog', 'is green'], 'the frog is green'.halve
>        assert_equal ['ponchielli', 'wrote songs'],
>                     'ponchielli wrote songs'.halve
>        assert_raise(RuntimeError) { 'ab d'.halve 1 }
>        assert_nothing_raised { 'a b'.halve 1 }
>        assert_equal ['lizardman', 'lives'], 'lizardman lives'.halve
>        assert_equal ['lizardm', 'an lives'], 'lizardman lives'.halve(8)
>        assert_equal ['abcdef','ghijkl'], 'abcdefghijkl'.halve
>      end
>    end
> end

Yes. I see how expressions like space_indices.sort_by {|i| (i - size/
2).abs }.first work, I just need to be able to think of them. It does
what I really want (get the closest space to the middle of the
string), instead of what I almost want (the last space in the first
half of the string). Maybe I just gave up too early, or started
barking up the wrong tree. I said, "I know, I'll use regular
expressions" and then I had two problems. Thanks.
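
The "closest space to the middle" idea can be sketched on its own, without the max_len handling. This is a simplified illustration, not Devin's code: closest_space_halve is a hypothetical name, it assumes Ruby 1.9 or later (where str[i] returns a one-character string), and unlike Devin's version it does not cut mid-word when there is no space.

```ruby
# closest_space_halve is a hypothetical name for this sketch.
def closest_space_halve(str)
  mid = str.size / 2
  # Index of every space in the string.
  spaces = (0...str.size).select { |i| str[i] == ' ' }
  # The space nearest the midpoint; on a tie the earlier one wins.
  cut = spaces.min_by { |i| (i - mid).abs }
  return [str, ''] if cut.nil?    # no space at all: leave the string whole
  [str[0...cut], str[(cut + 1)..-1]]
end

p closest_space_halve('the frog is green')        # => ["the frog", "is green"]
p closest_space_halve('ponchielli wrote songs')   # => ["ponchielli", "wrote songs"]
p closest_space_halve('lizardman lives')          # => ["lizardman", "lives"]
```

The last case is where this differs from the rindex approach: there is no space in the first half of 'lizardman lives', but the nearest space is still found.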
Gavin Kistner (phrogz)
on 2007-02-02 07:30
(Received via mailing list)
Chris Shea wrote:
> I want to make adding new fortunes easy (and without having to figure
> out myself where to force a line break), so I basically need to split
> a string (approximately) in half at a word boundary. This is what I
> have:
[snip]
> I have a feeling there's a one-line regexp that can do this. Am I
> right? If not, is there a better way?

This is what I would do:

strings = DATA.read.split( /\n/ )

strings.each{ |str|
  puts str.gsub( /^(.{#{str.length/2},}?)\s(.+)/ ){ "#{$1}\n#{$2}" }
  puts
}
#=> Hello
#=> World

#=> It's the end of the
#=> world as we know it

#=> If you didn't know any better
#=> you'd think this was magic.

__END__
Hello World
It's the end of the world as we know it
If you didn't know any better you'd think this was magic.
Robert Klemme (Guest)
on 2007-02-02 12:56
(Received via mailing list)
On 02.02.2007 05:26, Chris Shea wrote:
> Word wrapping is wrong for the occasion, as I'd like to split even
> lines less than 20 chars, and have them of approximately equal length.

If you want to first fill the first string and then the second you can
do this:

first, second = s.scan /.{1,20}/

For an even distribution you could do:

irb(main):019:0> s="foo bar ajd as dashd kah sdhakjshd ahdk ahsd asjh"
=> "foo bar ajd as dashd kah sdhakjshd ahdk ahsd asjh"
irb(main):020:0> l=[40,s.length].min / 2
=> 20
irb(main):021:0> first = s[0...l]
=> "foo bar ajd as dashd"
irb(main):022:0> second = s[l...l+l]
=> " kah sdhakjshd ahdk "

If you want to break at white space, you can do it with one regexp (you
did ask for the one regexp solution :-)):

irb(main):030:0> s="aasd laksjd  asdj asjkd asdj jlas d"
=> "aasd laksjd  asdj asjkd asdj jlas d"
irb(main):031:0> %r[(.{1,#{s.length/2}})\s*(.{1,#{s.length/2}})] =~ s or raise "cannot split"
=> 0
irb(main):032:0> first = $1
=> "aasd laksjd  asdj"
irb(main):033:0> second = $2
=> "asjkd asdj jlas d"

Kind regards

  robert
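
One caveat that may be worth checking before relying on the one-regexp form: because the pattern is unanchored and \s* can match nothing, it also "succeeds" on strings that have no space near the middle, silently cutting mid-word and dropping the tail. The sketch below shows this with a sample string from elsewhere in the thread. The anchored variant is an assumption of this sketch, not part of the post above.

```ruby
s = 'lizardman lives'
n = s.length / 2                                   # 7

# Unanchored with \s*: matches, but not at a space and not to the end.
loose = %r[(.{1,#{n}})\s*(.{1,#{n}})].match(s)
p [loose[1], loose[2]]                             # => ["lizardm", "an live"]

# Anchored variant (assumption): require real whitespace between the halves
# and require the match to cover the whole string.
strict = /\A(.{1,#{n}})\s+(.{1,#{n}})\z/.match(s)
p strict                                           # => nil, i.e. "cannot split"

# The string from the irb session above still splits the same way.
t = 'aasd laksjd  asdj asjkd asdj jlas d'
m = /\A(.{1,#{t.length / 2}})\s+(.{1,#{t.length / 2}})\z/.match(t)
p [m[1], m[2]]                                     # => ["aasd laksjd  asdj", "asjkd asdj jlas d"]
```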
Jason Mayer (Guest)
on 2007-02-02 13:14
(Received via mailing list)
> puts str.gsub( /^(.{#{str.length/2},}?)\s(.+)/ ){ "#{$1}\n#{$2}" }
> irb(main):031:0> %r[(.{1,#{s.length/2}})\s*(.{1,#{s.length/2}})] =~ s or
> raise "cannot split"
>
> Kind regards
>
>        robert


You guys are hard core when it comes to regexes. Damn.
Trans (Guest)
on 2007-02-02 14:53
(Received via mailing list)
On Feb 1, 11:30 pm, "Chris Shea" <cms...@gmail.com> wrote:
> > > for it.
> > >     until first_half.length >= length / 2
>
> > T.
>
> Tested, and failed. The index returns the position of the very first
> space. (Also, it would have to be str[0..(str.size / 2)].index(' ')
> but, still, it's no good).

ah, size/2 has to be added to i, then it works.

> Word wrapping is wrong for the occasion, as I'd like to split even
> lines less than 20 chars, and have them of approximately equal length.

not sure i understand, the size can be set, for example:

  word_wrap(str.size/2)

t.
Trans (Guest)
on 2007-02-02 14:57
(Received via mailing list)
On Feb 1, 11:30 pm, "Chris Shea" <cms...@gmail.com> wrote:
>
> > T.
>
> Tested, and failed. The index returns the position of the very first
> space. (Also, it would have to be str[0..(str.size / 2)].index(' ')
> but, still, it's no good).

To clarify...

  class String
    def halve
      i = size / 2
      j = i + self[i..-1].index(' ')
      return self[0...j].strip, self[j..-1].strip
    end
  end

T.
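
A quick sketch of how the forward search behaves, with one addition: index returns nil when there is no space at or after the midpoint, so the version above would raise on such input. The guard below is an assumption of this sketch, and halve_forward is a hypothetical name.

```ruby
# halve_forward is a hypothetical name; the nil guard is an addition.
def halve_forward(str)
  i = str.size / 2
  offset = str[i..-1].index(' ')   # first space at or after the midpoint
  return [str, ''] if offset.nil?  # none found: leave the string whole
  j = i + offset
  [str[0...j].strip, str[j..-1].strip]
end

p halve_forward('the frog is green')       # => ["the frog", "is green"]
p halve_forward('longlongword bit bits')   # => ["longlongword", "bit bits"]
p halve_forward('a bcdefgh')               # => ["a bcdefgh", ""]
```

The last case shows the limitation of searching in one direction only: the string does contain a space, but it sits before the midpoint and is never considered.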
Rob Biedenharn (Guest)
on 2007-02-02 16:57
(Received via mailing list)
On Feb 1, 2007, at 11:10 PM, Chris Shea wrote:
>
>   end
> end
>
> I have a feeling there's a one-line regexp that can do this. Am I
> right? If not, is there a better way?

Good solutions already, but I had to chime in with one less clever
and without regexps.
I also didn't like "halve" as a name so I used "cleave".

Enjoy!


class String
   def cleave
     middle = self.length/2
     early = self.rindex(' ', middle)
     late = self.index(' ', middle)

     if self[middle,1] == ' '
       [ self[0...middle], self[middle+1..-1] ]
     elsif early.nil? && late.nil?
       [ self.dup, '' ]
     elsif early.nil?
       [ self[0...late], self[late+1..-1] ]
     elsif late.nil?
       [ self[0...early], self[early+1..-1] ]
     else
       middle = middle - early < late - middle ? early : late
       [ self[0...middle], self[middle+1..-1] ]
     end
   end
end

if __FILE__ == $0
   require 'test/unit'
   class StringCleaveTest < Test::Unit::TestCase
     def test_nospaces
       assert_equal [ 'whole',
                      '' ], 'whole'.cleave
       assert_equal [ 'Supercalifragilisticexpialidocious',
                      '' ], 'Supercalifragilisticexpialidocious'.cleave
     end
     def test_exact_middle
       assert_equal [ 'fancy',
                      'split' ], 'fancy split'.cleave
       assert_equal [ 'All good Rubyists',
                      'know how to party' ],
                    'All good Rubyists know how to party'.cleave
     end
     def test_closer_to_start
       assert_equal [ 'short',
                      'splitter' ], 'short splitter'.cleave
       assert_equal [ 'Four score and',
                      'seven years ago...' ],
                    'Four score and seven years ago...'.cleave
       assert_equal [ 'abc def',
                      'ghijklm nop' ] , 'abc def ghijklm nop'.cleave
     end
     def test_closer_to_end
       assert_equal [ 'extended',
                      'split' ], 'extended split'.cleave
       assert_equal [ 'abc defghi',
                      'jklm nop' ] , 'abc defghi jklm nop'.cleave
     end
   end
end


Rob Biedenharn    http://agileconsultingllc.com
Rob@AgileConsultingLLC.com
Chris Shea (Guest)
on 2007-02-02 17:01
(Received via mailing list)
On Feb 2, 6:53 am, "Trans" <transf...@gmail.com> wrote:
> > > > with the serial connection it's time for some fun. Its shortcoming is
> > > >     first_half = ''
> > > > I have a feeling there's a one-line regexp that can do this. Am I
> > >   str.word_wrap(20)
> > lines less than 20 chars, and have them of approximately equal length.
>
> not sure i understand, the size can be set, for example:
>
>   word_wrap(str.size/2)
>
> t.

I thought of that, but:

irb> str = "longlongword bit bits"
irb> puts str.word_wrap(str.size/2)
longlongwo
rd bit
bits
=> nil
Gavin Kistner (phrogz)
on 2007-02-02 17:30
(Received via mailing list)
On Feb 1, 11:25 pm, "Phrogz" <g...@refinery.com> wrote:
> strings.each{ |str|
>   puts str.gsub( /^(.{#{str.length/2},}?)\s(.+)/ ){ "#{$1}\n#{$2}" }
>   puts
>  }

Seeing Robert Klemme's regexp, it does make sense to be a hair
greedier on the whitespace match, just in case the source string has
more than one space between words at the boundary:

str.gsub( /^(.{#{str.length/2},}?)\s+(.+)/ ){ "#{$1}\n#{$2}" }
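
A small sketch of the difference, using a made-up string with two spaces at the break. With a single \s the second space ends up at the start of the second line; with \s+ it is swallowed:

```ruby
str = 'abcdefg  hi'    # two spaces at the break; sample made up for illustration
half = str.length / 2  # 5

single = str.gsub(/^(.{#{half},}?)\s(.+)/)  { "#{$1}\n#{$2}" }
greedy = str.gsub(/^(.{#{half},}?)\s+(.+)/) { "#{$1}\n#{$2}" }

p single   # => "abcdefg\n hi"
p greedy   # => "abcdefg\nhi"
```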
Robert Klemme (Guest)
on 2007-02-02 17:45
(Received via mailing list)
On 02.02.2007 17:28, Phrogz wrote:
> str.gsub( /^(.{#{str.length/2},}?)\s+(.+)/ ){ "#{$1}\n#{$2}" }
You're making length/2 the minimum for matching.  I believe that should
rather be the max:

# untested
str.sub( /\A(.{1,#{str.length/2}})\s+(.+)/ ){ "#{$1}\n#{$2}" }

Also, #sub seems sufficient.

Kind regards

  robert
Gavin Kistner (phrogz)
on 2007-02-02 17:50
(Received via mailing list)
On Feb 2, 9:44 am, Robert Klemme <shortcut...@googlemail.com> wrote:
> On 02.02.2007 17:28, Phrogz wrote:
> > str.gsub( /^(.{#{str.length/2},}?)\s+(.+)/ ){ "#{$1}\n#{$2}" }
>
> You're making length/2 the minimum for matching.  I believe that should
> rather be the max:
>
> # untested
> str.sub( /\A(.{1,#{str.length/2}})\s+(.+)/ ){ "#{$1}\n#{$2}" }

I suppose it depends on whether you want the first line to be longer
or shorter than the 2nd line. In my mind, it looks better longer. The
{x,} range does make it the minimum, but the non-greedy quantifier
ensures that it breaks as soon as possible after starting the word
that you're in the middle of.

> Also, #sub seems sufficient.

Good point. I don't think I have ever used #sub, so it's never at the
forefront of my mind.
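
To put the two patterns side by side, here is a short sketch. The method names break_min and break_max are made up for this comparison; the patterns themselves are the ones quoted above, and the sample strings come from earlier in the thread.

```ruby
# Minimum-length, lazy: break at the first whitespace at or after the midpoint.
def break_min(str)
  str.sub(/^(.{#{str.length / 2},}?)\s+(.+)/) { "#{$1}\n#{$2}" }
end

# Maximum-length, greedy: break at the last whitespace that keeps the first
# line within half the string.
def break_max(str)
  str.sub(/\A(.{1,#{str.length / 2}})\s+(.+)/) { "#{$1}\n#{$2}" }
end

s = "If you didn't know any better you'd think this was magic."
p break_min(s)   # => "If you didn't know any better\nyou'd think this was magic."
p break_max(s)   # => "If you didn't know any\nbetter you'd think this was magic."

# When the first word is longer than half the string, the max version
# finds nothing to match and returns the string unchanged.
p break_min('lizardman lives')   # => "lizardman\nlives"
p break_max('lizardman lives')   # => "lizardman lives"
```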
Trans (Guest)
on 2007-02-02 18:53
(Received via mailing list)
On Feb 2, 10:55 am, Rob Biedenharn <R...@AgileConsultingLLC.com>
wrote:
> On Feb 1, 2007, at 11:10 PM, Chris Shea wrote:

>      late = self.index(' ', middle)
[snip]

This is nice and versatile. One good augmentation might be...

    def cleave(middle=nil)
      middle ||= self.length/2

T.
Tiago Pinto (Guest)
on 2007-02-02 19:33
(Received via mailing list)
Hi Chris,

> I have a vacuum fluorescent display in my office, and I've been
> messing around with it. Now that I've figured out how to communicate
> with the serial connection it's time for some fun.

aside from the string hackery, will the communication be done using ruby? ;P
Simon Strandgaard (Guest)
on 2007-02-02 20:42
(Received via mailing list)
On 2/2/07, Chris Shea <cmshea@gmail.com> wrote:
[snip]
> I have a feeling there's a one-line regexp that can do this. Am I
> right? If not, is there a better way?

no big regexp here

def half(s, threshold)
  l = s.size / 2
  0.upto([threshold, l].min) do |i|
    (l -= i; break) if s[l-i, 1] =~ /\s/
    (l += i; break) if s[l+i, 1] =~ /\s/
  end
  s.dup.insert(l, "\n").gsub(/^\s+|\s+$/, '')
end

ary = [
  "abcd efgh ijkl",
  "abcdef",
  "hello world",
  "aasd laksjd  asdj asjkd asdj jlas d",
  "foo bar ajd as dashd kah sdhakjshd ahdk ahsd asjh",
  "It's the end of the world as we know it",
  "If you didn't know any better you'd think this was magic.",
  "lizardman lives",
  "ponchielli wrote songs",
  "NSIntersectionRect YES NO"
]

ary.each{|s| p half(s, 6) }
p '---------------------'
ary.each{|s| p half(s, 2) }



output below:

"abcd efgh\nijkl"
"abc\ndef"
"hello\nworld"
"aasd laksjd  asdj\nasjkd asdj jlas d"
"foo bar ajd as dashd kah\nsdhakjshd ahdk ahsd asjh"
"It's the end of the\nworld as we know it"
"If you didn't know any better\nyou'd think this was magic."
"lizardman\nlives"
"ponchielli\nwrote songs"
"NSIntersectionRect\nYES NO"
"---------------------"
"abcd efgh\nijkl"
"abc\ndef"
"hello\nworld"
"aasd laksjd  asdj\nasjkd asdj jlas d"
"foo bar ajd as dashd kah\nsdhakjshd ahdk ahsd asjh"
"It's the end of the\nworld as we know it"
"If you didn't know any better\nyou'd think this was magic."
"lizardman\nlives"
"ponchielli\nwrote songs"
"NSIntersecti\nonRect YES NO"
Chris Shea (Guest)
on 2007-02-02 20:42
(Received via mailing list)
On Feb 2, 11:32 am, "Tiago Pinto" <thpi...@gmail.com> wrote:
> Hi Chris,
>
> > I have a vacuum fluorescent display in my office, and I've been
> > messing around with it. Now that I've figured out how to communicate
> > with the serial connection it's time for some fun.
>
> aside from the string hackery, will the communication be done using ruby? ;P
>
[snip]

Mine will be. The code for handling the communication is ugly (the
win32api pretty much guarantees that), but I've wrapped it in a module
so I never have to look at it again (hopefully). I'm using the display
as stdout for a couple of little maintenance scripts, too.

Otherwise, there have been some great solutions here. Thanks everyone.
Rob Biedenharn (Guest)
on 2007-02-03 01:50
(Received via mailing list)
On Feb 2, 2007, at 12:52 PM, Trans wrote:
>> class String
>       middle ||= self.length/2
>
> T.

That sounded good to me and so I did that and added support for
negative (from the end) offsets, too.  And like String#[], it returns
nil if the desired cleavage position is outside the string.

   def cleave(middle=nil)
     middle ||= self.length/2
     return nil unless (-self.length ... self.length).include?(middle)

     middle += self.length if middle < 0

     #...
   end

-Rob

P.S. If anyone wants the full thing (with the new tests), email me
directly.

Rob Biedenharn    http://agileconsultingllc.com
Rob@AgileConsultingLLC.com
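
The guard and the negative-offset arithmetic in the snippet above can be tried in isolation. In this sketch normalize_offset is a hypothetical helper that contains only those lines; it is not the full cleave method, whose body is elided above.

```ruby
# normalize_offset is a hypothetical helper wrapping only the offset handling.
def normalize_offset(str, middle = nil)
  middle ||= str.length / 2
  # Reject positions outside the string.
  return nil unless (-str.length...str.length).include?(middle)
  # Negative offsets count back from the end.
  middle += str.length if middle < 0
  middle
end

s = 'fancy split'             # length 11
p normalize_offset(s)         # => 5    (default: the midpoint)
p normalize_offset(s, -5)     # => 6    (counted from the end)
p normalize_offset(s, -11)    # => 0
p normalize_offset(s, 11)     # => nil  (outside the string)
```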