Re: Twisting a Rope (#137)

I had a little trouble debugging my solution, and thus I’m turning it in
late. I’m also being forced to finish up early, so some of the lesser
features sufferred. It implements all three parts (including part 3
subpart d), but I commented out part 3 subpart c, as I was having
trouble with it in and earlier version, and did not have time to test
whether it works now and debug. Additionally, part 3 subpart a is only
kinda implemented, as I didn’t have time to make the slices other than
(start, length) and range return a rope.

I had plans to implement a frequency-count-based substring slice (and
partially did) that could know where it was possible for the substring
to occur and where it was most likely, but, again, I ran out of time to
implement the main part of that.

My code runs pretty well, but it’s extremely ugly due to all the
optimizations I made, such as using a stack-based iterative
normalization algorithm that tracks its path using binary instead of a
recursive traversal.

Lastly, in order to debug normalize at one point, I wrote a method
that prints a minimalist but accurate and readable tree representation
of the Rope. I left that in there.

$Fib = Hash.new{ |h, n| h[n] = h[n - 1] + h[n - 2] }
$Fib[0] = 0
$Fib[1] = 1

CHUNK_SIZE = 16

#Ropes cache frequency counts of characters in the rope, Lazily
evaluated
#Calling dup is faster than constantly creating new Hashes
$blank_freq_count = Hash.new{|h,c|
h[c] = @left.freq_count[c] + @right.freq_count[c]}

class String

alias at slice
alias slice_offsetted_section slice

def shift(n=1)
slice!(0)
end

def freq_count
if @freq_count
@freq_count
else
@freq_count = [0]*128
each_byte {|c| @freq_count[c] += 1}
@freq_count
end
end

def width; 1; end
def depth; 1; end

end

ObjectSpace.each_object(Class) do |klass|
if klass <= IO and klass.private_method_defined? :initialize
klass.class_eval <<-EOC
alias old_initialize initialize
def initialize(*args)
@shifted_chars = 0
old_initialize(*args)
end
EOC
end
end

class IO

def length
[email protected]
@length
else
seek(0, IO::SEEK_END)
@length = pos
end
end

#Pretends it’s immutable
def to_s
if @text
@text
else
$stdout.puts @shifted_chars
$stdout.puts IO::SEEK_SET
seek(@shifted_chars, IO::SEEK_SET)
@text = read
end
end

def shift
seek(@shifted_chars, IO::SEEK_SET)
@shifted_chars += 1
getc
end

def freq_count
to_s.freq_count
end

def slice_substring(str)
if @text
@text.slice(str)
else #Avoid loading file into memory
seek(@shifted_chars, IO::SEEK_SET)
last_char = nil
while true
last_char = getc until last_char == str[0] or eof?
return nil if eof?
return str if read(str.length - 1) ==str[1…-1]
end
end
end

def slice_offsetted_section(start, length)
seek(@shifted_chars + start, IO::SEEK_SET)
read(length)
end

def at(ind)
seek(@shifted_chars + ind, IO::SEEK_SET)
getc
end

def width; 1; end
def depth; 1; end
end

class Rope
attr_accessor :left, :right, :length, :freq_count

def width; @removed_email_address@domain.invalid; end
def depth; [@left.width,@right.width].max+1; end

def initialize(left="", right="")
@left, @right = left, right
@length = @left.length + @right.length
@freq_count = $blank_freq_count.dup
end

def append(strope)
#if @right.length < CHUNK_SIZE and strope.length < CHUNK_SIZE
# @right = Rope.new(@right,strope)
# @length = left.length + right.length
# @freq_count = $blank_freq_count.dup
#else
@left = Rope.new(@left,@right)
@right = strope
@left.freq_count = @freq_count
@length = @left.length + @right.length
#end
self
end
alias << append

def normalize
###Stack based algorithm removes the overhead of method calls,
###and the huge overhead of proc calls
path = 0b0
path_nodes = [self]
cur_node = self

seq = []
while true
  if Rope === cur_node
    path = (path << 1) | 1
    cur_node = cur_node.left
    path_nodes.push(cur_node)
  else
    if path & 1 == 1
      path ^= 1 #flip the last bit
      path_nodes.pop
      cur_node = path_nodes.last.right
      path_nodes.push(cur_node)
    else
      break if path == 0 #Already visited all nodes
      until path&1 == 1
        path >>= 1
        path_nodes.pop
      end
      path ^= 1 #flip the last bit
      path_nodes.pop
      cur_node = path_nodes.last.right
      path_nodes.push(cur_node)
    end
  end
  next if Rope === cur_node
  str = cur_node
  old_n = 0
  n = 0
  n += 1 until $Fib[n] > str.length
  n -= 1
  smallers = seq[0..n].reject{|o| o.nil?|| o.length == 0}
  until [] == smallers
   seq[old_n..n] = [nil]*(n-old_n+1)
   bal_rope = (smallers[1..-1]).inject(smallers[0]) {|r,s|
    Rope.new(s,r)}
  bal_rope = Rope.new(bal_rope, str)
  old_n = n
   n += 1 until $Fib[n] > bal_rope.length
   n -= 1
   str = bal_rope
   seq[n] = nil if n >= seq.length
   smallers = seq[old_n..n].reject{|o| o.nil? || o.length == 0}
  end
  seq[n] = str
end
seq.compact!
seq[1..-1].inject(seq[0]) {|r,s| r = Rope.new(s,r)}

end

def to_s
@left.to_s + @right.to_s
end

def slice(*args)
if 2 == args.length and Fixnum === args[0]
#modulus for negative start
slice_offsetted_section(args[0] % @length, args[1])
elsif 1 == args.length and Fixnum === args[0]
index(args.first % @length)
elsif 1 == args.length and Range === args[0]
rng = args[0]
slice_offsetted_section(rng.begin % @length,
(rng.end - rng.begin + (rng.exclude_end? ? 0 : 1)) % @length)
else
slice_substring(args[0])
end
end

def slice_offsetted_section(start, length)
if start < @left.length
if start + length < @left.length
@left.slice_offsetted_section(start,length)
else
Rope.new(@left.slice_offsetted_section(start, @left.length -
start),
@right.slice_offsetted_section(0,
length - (@left.length - start)))
end
else
@right.slice_offsetted_section(start - @left.length, length)
end
end

#def slice_substring(str)

#end

def index(offset)
if offset < ([email protected])
@left.at(offset)
else
@right.at(offset-left_len)
end
end
alias at index

def shift(n=1)
([0]*n).map do
@length -= 1
@left.length > 0 ? @left.shift : @right.shift
end
end
end

$x = 0
$y = 0

def sumupto(n)
s = 0
1.upto(n){|i| s+= i}
s
end

def print_rope(tree,str=$stdout)
arr = ([nil]*sumupto(tree.depth+1)).map{[nil] *
(sumupto(tree.depth+1))}
$y = 0
$x = arr[0].length / 2
coord_trav(tree) do |node|
if Rope === node
arr[$y][$x] = “/ \”
else
arr[$y][$x] = node.to_s.inspect
end
end
arr.each do |row|
row.each do |cell|
if cell == nil
str.print " "
else
str.print cell
end
end
str.print “\n”
end
nil
end

def coord_trav(tree, &block)
unless Rope === tree
block.call(tree)
return
end
$x -= tree.depth
$y +=tree.depth
coord_trav(tree.left,&block)
$x +=tree.depth
$y -=tree.depth
block.call(tree)
$x +=tree.depth
$y +=tree.depth
coord_trav(tree.right, &block)
$x -=tree.depth
$y -=tree.depth
end

----- Original Message ----
From: Ruby Q. [email protected]
To: ruby-talk ML [email protected]
Sent: Friday, August 31, 2007 8:19:54 AM
Subject: [QUIZ] Twisting a Rope (#137)

The three rules of Ruby Q.:

  1. Please do not post any solutions or spoiler discussion for this quiz
    until
    48 hours have passed from the time on this message.

  2. Support Ruby Q. by submitting ideas as often as you can:

http://www.rubyquiz.com/

  1. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem helps
everyone
on Ruby T. follow the discussion. Please reply to the original quiz
message,
if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

by John M.

This week’s task is to implement the Rope data structure as a Ruby
class. This
topic comes out of the ICFP programming competition
(http://www.icfpcontest.com/) which had competitors manipulating a 7.5
million
character string this year.

What is a Rope:

You may not realize it, but for many changes the content to a String,
Ruby
creates a new copy of the original with the modifications applied. For
small
strings that are created once and read often this is actually a very
efficient
way to do thing, but what happens when the string starts to get long,
and is
undergoing a lot of changes? First, the program will spend more and
more of its
processing cycles just copying bits around. Second, the garbage
collector will
be called more and more often to pick up the little stringy scraps
you’ve left
all over the memory.

Ropes (the name is a pun for a heavy duty string) are tree structures
where a
node represents the concatenation of its left branch with its right, and
leaves
are flat strings. (This is a little sloppy. A rope may contain shared
subtrees,
and is thus really a directed acyclic graph, where the out-edges of each
vertex
are ordered. We will continue to be sloppy.) E.g. To prepend Text A to
Text B,
one creates a Node (call it N1) with A as its left branch and B as its
right.
To further append Text C create a new Node N2 with its left branch
pointing to
N1 and its right to C. Easy, right? To find out more see Boehm,
Atkinson and
Plass “Ropes: an Alternative to Strings” at:

http://rubyurl.com/2FRbO

The task comes in three parts, each increasing in difficulty:

Part one:

Create a Rope class that can do the following:

a. 'append' or 'prepend' a String or another Rope
   (alias the << operator to the append function)
b. Return the fully concatenated text with 'to_s'
c. define 'slice' to call to_s.slice
d. provide a 'length' method

Part two:

Add the following:

a. Add the ability to 'index' a single character given a 0-based 

offset
from the beginning of the string.
b. Add the ability to ‘shift’ a single character from the front of a
Rope.
(Remove and return the character)
c. Add your own ‘slice’ method that returns a String. Implement as
many of
the String method’s forms as possible. To run the example code
this
function will only need to understand the slice(offset,length)
form.
Major Bonus for Regex and Substring forms.
d. “Balance” the tree with a ‘normalize’ method.
(see Boehm, Atkinson and Plass 1319 Rebalancing)

Part three: (bonus)

Add the following:

a. Change the 'slice' method to return a Rope. Ideally this method 

should
do as little string copying as possible. (Doing this will well
dramatically increase the speed of the example code)
b. Allow ‘shift’ to optionally accept an integer number of
characters to
remove and return.
c. modify the ‘<<’ operator so that can efficiently append a few
characters at a time. (see Boehm, Atkinson and Plass 1318 para.
4)
d. Major Bonus Add the ability to append and prepend IO classes in
a
lazy fashion. (see Boehm, Atkinson and Plass 1318 para. 2)

The following code may help you explore how efficient your code is and
show
where Ropes are useful. ruby -r /path/to/your/rope/class this_script.rb Rope
will run the test with your code. Run the script without arguments to
see how
well String does. Also play around with the SIZE and CHUNKS constants
to get a
feel for how they affect performance.

require 'benchmark'

#This code make a String/Rope of  CHUNCKS chunks of text
#each chunck is SIZE bytes long.  Each chunck starts with
#an 8 byte number.  Initially the chuncks are shuffled the
#qsort method sorts them into ascending order.
#
#pass the name of the class to use as a parameter
#ruby -r rope.rb this_file Rope

puts 'preparing data...'
TextClass = Object.const_get(ARGV.shift || :String)

def qsort(text)
  return TextClass.new if text.length == 0
  pivot = text.slice(0,8).to_s.to_i
  less = TextClass.new
  more = TextClass.new
  offset = 8+SIZE
  while (offset < text.length)
    i = text.slice(offset,8).to_s.to_i
    (i < pivot ? less : more) << text.slice(offset,8+SIZE)
    offset = offset + 8+SIZE
  end
  print "*"
  return qsort(less) << text.slice(0,8+SIZE) << qsort(more)
end

SIZE  = 512 * 1024
CHUNCKS = 128
CHARS = %w[R O P E]
data = TextClass.new
bulk_string =
  TextClass.new(Array.new(SIZE) { CHARS[rand(4)] }.join)
puts 'Building Text...'
build = Benchmark.measure do
  (0..CHUNCKS).sort_by { rand }.each do |n|
    data<< sprintf("%08i",n) << bulk_string
  end
  data.normalize  if data.respond_to? :normalize
end
GC.start
sort = Benchmark.measure do
  puts "Sorting Text..."
  qsort(data)
  puts"\nEND"
end

puts "Build: #{build}Sort: #{sort}"

This forum is not affiliated to the Ruby language, Ruby on Rails framework, nor any Ruby applications discussed here.

| Privacy Policy | Terms of Service | Remote Ruby Jobs