The benefits of tail recursion

I thought I’d share a useful little trick with the list. I’ve put
together a little library (27 lines for the actual implementation) that
can convert any function into a partially tail-recursive function,
without any alteration required, using a single command.

I ran a few tests on this (included with the library) to show the time
it took to step through my sample recursive function, which stepped
forward through the fibonacci sequence, returning the Nth number (where
N was chosen in advance). Tests were run separately for each one.

N = 10000
Recursive: 0.101 seconds, 22.2 MB RAM (VSZ for process, measured on
final return)
Tail Recursive: 0.132 seconds, 3.44 MB RAM

N = 15000
Recursive: 0.216 seconds, 35.6 MB RAM
Tail Recursive: 0.204 seconds, 3.71 MB RAM

N = 30000
Recursive: 0.984 seconds, 87.5 MB RAM
Tail Recursive: 0.465 seconds, 4.50 MB RAM

N = 100000
Recursive: 79.8 seconds, 579 MB RAM (warning: time is misleading due to
swapping)
Tail Recursive: 2.70 seconds, 8.49 MB RAM

As you can see, benefits in terms of memory begin immediately. Benefits
in terms of processing time take a while to realise, at about a depth
of 15000 in this example. The technique works by aliasing the method to
a random name, replacing it with a method that handles the tail
recursion, and relying on the other function to call it again.

There are a few restrictions:

  • The last action of the function must be to recurse or return the
    final value
  • The function must not spawn threads internally (this should work in
    threaded applications though).

Other than that, just define your method and call:

tail_recurse :my_method, max_depth

The max_depth factor is used because the library uses a continuation
internally (which is slow) so it is much faster to recurse to a fixed
depth (15 in the tests above) rather than using full tail recursion.
This will default to one, but increasing it will usually improve
performance. For examples of use, see the demonstration included with
the library.

Here’s the library itself:
require ‘thread’

class Module
def tail_recurse(meth, recurse_depth = 1)
random_name = (rand((16 ** 10) / 2) + 16 ** 10).to_s(16).to_sym;

# Create a randomly named alias of our original method
alias_method random_name, meth.to_sym

# Create a new method in the place of the old one
define_method meth.to_sym do |*args|
  return_val = nil
  if Thread.current[random_name]
    # This has been called from the recursive function
    Thread.current[random_name][:count] += 1 # Add one to the

period counter
if Thread.current[random_name][:count] % recurse_depth == 0
# Save our arguments and jump back using the continuation
Thread.current[random_name][:args] = args
Thread.current[random_name][:continuation].call
else
# Recurse another level
return_val = send(random_name, *args)
end
else
# This is a new call. Set up for recursion.
Thread.current[random_name] = {}
Thread.current[random_name][:count] = 0
Thread.current[random_name][:args] = args
callcc {|cont| Thread.current[random_name][:continuation] =
cont}
return_val = send(random_name,
*Thread.current[random_name][:args])
end
# Clear out the recursion information
Thread.current[random_name] = nil
return return_val
end
end
end

This is an example that will compare recursive and tail recursive

performance.

if $0 == FILE
class Fibonacci
def fib_tail(a, b, depth)
if depth == 1
#system('ps u -p ’ + Process.pid.to_s)
return b
else
return fib_tail(b, a+b, depth - 1)
end
end
tail_recurse :fib_tail, 15 # Partially tail recurse (15 levels
deep)

def fib_recursive(a, b, depth)
  if depth == 1
    #system('ps u -p ' + Process.pid.to_s)
    return b
  else
    return fib_recursive(b, a+b, depth - 1)
  end
end

end

fib = Fibonacci.new

depth = ARGV.shift
depth ||= 15000
depth = depth.to_i

puts “Normal recursive: #{depth} deep”
start_time = Time.now
val1 = fib.fib_recursive(1,1,depth)
difference = Time.now - start_time
puts “Took #{difference} seconds”

puts “Tail recursive: #{depth} deep”
start_time = Time.now
val2 = fib.fib_tail(1,1,depth)
difference = Time.now - start_time
puts “Took #{difference} seconds”

if val1 == val2
puts “Values Matched”
else
puts “ERROR: Values different”
puts “Recursive:”
puts val1
puts “Tail Recursive:”
puts val2
end
end

Sorry guys, I have a patch already. After the line

 alias_method random_name, meth.to_sym

I should have added

 private random_name

this ensures that nobody calls it directly from outside (this could
cause a problem).

do you know whether it’s possible to do Ackermann with this?

Very nice. However, I couldn’t run the normal recursed fibonacci test
beyond 1500 iterations on my machine without getting a ‘stack level
too deep’ error. How did you manage 100,000?

I was able to improve the performance of tail recursion quite a bit by
reducing the number of lookups:

class Module
def tail_recurse(meth, recurse_depth = 1)
random_name = (rand((16 ** 10) / 2) + 16 ** 10).to_s(16).to_sym;

Create a randomly named alias of our original method

alias_method random_name, meth.to_sym

Create a new method in the place of the old one

define_method meth.to_sym do |*args|
return_val = nil
if (local_store = Thread.current[random_name])
# This has been called from the recursive function
local_store[:count] += 1 # Add one to the period counter
if local_store[:count] % recurse_depth == 0
# Save our arguments and jump back using the continuation
local_store[:args] = args
local_store[:continuation].call
else
# Recurse another level
return_val = send(random_name, *args)
end
else
# This is a new call. Set up for recursion.
local_store = (Thread.current[random_name] = {})
local_store[:count] = 0
local_store[:args] = args
callcc {|cont| local_store[:continuation] = cont}
return_val = send(random_name, *local_store[:args])
end
# Clear out the recursion information
Thread.current[random_name] = nil
return return_val
end
end
end

This version provides gains from about 50 iterations up. I might have
accidentally broken the thread safety, though…

Paul.

Yes, this is a very nice little trick, especially with the performance
gains
at higher recursion depths.

It’s important to remember this isn’t a general-purpose tail-call
optimization, of course, since it requires a method be only
self-recursive
and you have to specify the method to optimize after-the-fact. It’s
still
very cool, and useful for a lot of algorithms that use self-recursion.
I’ve
recently seen similar tricks done in Python where an exception
containing
the next recursion’s call arguments is thrown back to the top “wrapper”
method, and the recursion continues from that point. It’s a similar
solution, but this is the first I’ve seen that has variable “throwback
depth”, which really helps optimize it.

Very nice!

Someone wanna run a comparison of performance between the two? If
continuations were fast enough, I’d expect it to be faster, but I’ve
heard
they’re not as fast as we’d like in Ruby.

On Tue, 14 Mar 2006 05:18:33 +0900, Charles O Nutter
[email protected]
wrote:

[snip]

Someone wanna run a comparison of performance between the two? If
continuations were fast enough, I’d expect it to be faster, but I’ve heard
they’re not as fast as we’d like in Ruby.

I just ran a simple test using this function:

def foo(n)
if n == 0
return n
else
return foo(n-1)
end
end

(I didn’t use factorial or fibonacci because I didn’t want the timings
to
be swamped by the costs of large Bignums when n is 1,000,000).

Timings for n = [10000, 100000, 1000000] put the throw/catch version at
approximately a factor of 1.6(ish) faster than the continuation based
version. (the throw/catch version is from ruby-talk:145593 as Pit
pointed
out above).

regards,
andrew

Timothy G. schrieb:

I thought I’d share a useful little trick with the list. I’ve put
together a little library (27 lines for the actual implementation) that
can convert any function into a partially tail-recursive function,
without any alteration required, using a single command.

If you’re interested, see ruby-talk:145593 for an implementation with
throw/catch instead of continuations.

Regards,
Pit