Replacing callcc by catch/throw


#1

Hello,

in his book ``The Ruby P.ming Language’’ Mats discourages from
using continuations. Nevertheless I’ve found a real world example,
where at a first look callcc could be easily replaced by catch and
throw, but indeed it’s a little bit problematic:

#!/usr/bin/env ruby

--mode: ruby; coding: utf-8;--

require ‘jcode’
require ‘pp’

$-K=‘U’

module MultiLine
def transform( input )
document( input, 0 ){ |p, value| value.flatten }
end

private
def document( input, p, acc = [] )
  loop {
    callcc { |proceed|
      return yield( p, acc ) unless input[p]
      multiline( input, p ){ |p0, v0|
        p = p0
        acc << v0
        proceed.call
      }
      character( input, p ){ |p1, v1|
        p = p1
        acc << v1
        proceed.call
      }
      return nil
    }
  }
end

def multiline( input, p )
  '(' == input[p] && multiline_tail( input, p+1 ){ |p2, v2|
    return yield( p2, v2 )
  }
end

def multiline_tail( input, p, acc=[] )
  loop {
    callcc { |proceed|
      case input[p]
      when nil
        return nil
      when ')'
        return yield( p+1, acc )
      when "\n", "\r"
        p += 1
        acc << ' '
        proceed.call
      end
      multiline( input, p ){ |p2, v2|
        p = p2
        acc << v2
        proceed.call
      }
      character( input, p ){ |p4, v4|
        p = p4
        acc << v4
        proceed.call
      }
      return nil
    }
  }
end

def character( input, p )
  case input[p]
  when nil
    return nil
  when '\\'
    case input[p+1]
    when nil
      return nil
    when '\\', '(', ')'
      return yield( p+2, input[p+1] )
    end
  end
  return yield( p+1, input[p] )
end

extend( self )

end

if FILE == $0
input = STDIN.read.split(//)
output = MultiLine.transform( input )
print output
end

The problem is, that the methods can be called recursively, and
catching from a throw should continue at the correct recursion level.
No symbol that has to be thrown/cached should be reused. How about
providing each ``catch’’ with an individual symbol?:

# ...
def document( input, p, acc = [] )
  proceed = gensym
  loop { catch proceed do
      return yield( p, acc ) unless input[p]
      multiline( input, p ){ |p0, v0|
        p = p0
        acc << v0
        throw proceed
      }
      character( input, p ){ |p1, v1|
        p = p1
        acc << v1
        throw proceed
      }
      return nil
    end }
end
# ... a.s.o. ...
def multiline_tail( input, p, acc=[] )
  proceed = gensym
  loop { catch proceed do
      case input[p]
      when nil
        return nil
      when ')'
        return yield( p+1, acc )
      when "\n", "\r"
        p += 1
        acc << ' '
        throw proceed
      end
      multiline( input, p ){ |p2, v2|
        p = p2
        acc << v2
        throw proceed
      }
      character( input, p ){ |p4, v4|
        p = p4
        acc << v4
        throw proceed
      }
      return nil
    end }
end
# ... a.s.o. ...

A new method is needed to generate the unique symbols:

def gensym
  [].object_id.to_s.to_sym
end

The catch/throw solution seems uglier to me than the callcc solution.
Do you have other suggestions?

Regards
Thomas


#2

Thomas H. wrote:

The catch/throw solution seems uglier to me than the callcc solution.
Do you have other suggestions?

Well, I don’t really understand what’s going on, but I observe from the
catch/throw version that the generated symbol is in a local variable and
doesn’t appear to be passed downwards. So what’s wrong with using the
same symbol everywhere?

Or indeed, what’s wrong with using loop { … next … } ?


#3

Brian C. removed_email_address@domain.invalid wrote/schrieb
removed_email_address@domain.invalid:

Well, I don’t really understand what’s going on, but I observe from the
catch/throw version that the generated symbol is in a local variable and
doesn’t appear to be passed downwards. So what’s wrong with using the
same symbol everywhere?

I’ve looked for the simpliest example to demonstrate the diffence.

Run the program and let read it from standard input one line
consisting of an opening and a closing parenthesis, nothing else:
()

The output will be an empty line. (Well, not very exiting, but I’ve
looked for the simpliest example.)

BUT when using the same symbol everywhere rather than generated
symbols, then the program never returns: it hangs! Why?

Or indeed, what’s wrong with using loop { … next … } ?

Just replacing throw'' bynext’’ doesn’t help in these situations:
loop { … multiline { … next … } … }
loop { … character { … next … } … }
^^^
executed despite ``next’’!

``next’’ will then quit only the innermost block, that’s not enough.
The code between the two closing curly braces will be executed, but it
shouldn’t.

Regards
Thomas


#4

Brian C. removed_email_address@domain.invalid wrote/schrieb
removed_email_address@domain.invalid:

So I think the solution is easy: use two different symbols, e.g.
catch(:next_document) in document, and catch(:next_multiline_tail) in
multiline_tail.

That’s not enough. Now recursion comes into place: multiline_tail can
call itselve via the method multiline. Then there’s the same nesting
problem, but renaming doesn’t help, because inner and outer
catch/throw pairs would be all renamed at the same time to the same
new symbol.

This works for the trivial document you provided. You might want to try
a more substantial example though :slight_smile:

Of course I want to try:

This input line
(())
should result in an empty output line.

Or even simpler, this input line
(()
should result in that output line:
(

Perl has statement labels for this:

OUTER:
while(1) {

next OUTER
}

Yes, I remember the labeled loops in Perl. It’s a proper lexicographic
label. As a reader of the source code you don’t have to track the
dynamic behaviour to find out where the jump will go to.

but I don’t think Ruby has a direct equivalent.

which seems to be a pitty. At a first look catch/throw pairs seem to
be equivalent, but as soon as two conflicting functions or – even
worse – recursions can happen, it turns out to be too week as a
direct replacement.

P.S. If I understand your code right, I think you can simplify multiline
to:
[…]
does is to relay the call back to the original block, so why not just
yield to the original block directly? You were using the ‘return’
statement too, which has special use within a block, but I think it may
not be needed here.

Thanks for your hints.

Regards
Thomas


#5

Thomas H. wrote:

Run the program and let read it from standard input one line
consisting of an opening and a closing parenthesis, nothing else:
()

The output will be an empty line. (Well, not very exiting, but I’ve
looked for the simpliest example.)

OK, I see the problem now. If you use the same symbol everywhere, this
happens:

def document …
loop do
catch(:foo) do

multiline do

throw(:foo) <-- invoked when block is yielded to

However, multiline calls multiline_tail, which has its own catch/throw
pair. If it also uses catch(:foo) then the throw will be caught inside
there instead.

So I think the solution is easy: use two different symbols, e.g.
catch(:next_document) in document, and catch(:next_multiline_tail) in
multiline_tail.

This works for the trivial document you provided. You might want to try
a more substantial example though :slight_smile:

Or indeed, what’s wrong with using loop { … next … } ?

Just replacing throw'' bynext’’ doesn’t help in these situations:
loop { … multiline { … next … } … }
loop { … character { … next … } … }
^^^
executed despite ``next’’!

``next’’ will then quit only the innermost block, that’s not enough.

You’re right, and realising that ‘next’ was being executed inside an
inner block is what put me on the right track. Perl has statement labels
for this:

OUTER:
while(1) {

next OUTER
}

but I don’t think Ruby has a direct equivalent.

As to the original question: I think the catch/throw version of the code
is far, far clearer than callcc. Continuations are strange beasts: they
are entire execution contexts which can be squirreled away and restarted
at any point in the future. They are essentially threads, albeit
non-preemptive.

A program which uses callcc can be very hard to understand. Your example
isn’t, but only when you realise that |proceed| is (a) a local variable,
(b) reset each time around the loop, and © is never passed to any
other method. This guarantees that previous continuation objects will
not be invoked later and will be garbage-collected.

But you have to analyse the code very carefully to draw this conclusion,
whereas a loop plus throw/catch is pretty obvious (I think).

Regards,

Brian.

P.S. If I understand your code right, I think you can simplify multiline
to:

def multiline( input, p, &blk )
  '(' == input[p] && multiline_tail( input, p+1, &blk )
end

or

def multiline( input, p, &blk )
  multiline_tail( input, p+1, &blk ) if '(' == input[p]
end

All that a block like

{ |p2,v2| yield p2,v2 }

does is to relay the call back to the original block, so why not just
yield to the original block directly? You were using the ‘return’
statement too, which has special use within a block, but I think it may
not be needed here.


#6

Thomas H. wrote:

Hello,

in his book ``The Ruby P.ming Language’’ Mats discourages from
using continuations. Nevertheless I’ve found a real world example,
where at a first look callcc could be easily replaced by catch and
throw, but indeed it’s a little bit problematic:

In an almost totally unrelated vein, I just committed support for
“one-shot downward continuations” to JRuby. Essentially it allows you to
use callcc in JRuby as long as the continuation doesn’t escape the
original block activation:

puts callcc {|cc| cc.call(‘ok!’)} #=> ok!

Or…

def foo(cc); cc.call(‘ok!’); end
puts callcc {|cc| foo(cc)} #=> ok!

But…

x = callcc {|cc| cc}
x.call #=> -e:1:in `call’: continuations can not be called from outside
their scope (LocalJumpError)

Thanks go to Evan P. for pointing out that this limited case is
easy to support, even in JRuby. I don’t know if anyone ever uses this,
but it works now.

Perhaps even as a replacement for catch/throw? (Hey, now I’m on topic!)

  • Charlie

#7

Thomas H. wrote:

That’s not enough. Now recursion comes into place: multiline_tail can
call itselve via the method multiline.

Oh yes - sorry for being dense. So the attraction of callcc is it
generates a new object each time which, when called, brings you back to
exactly this level of nesting. gensym simulates the same for
catch/throw.

It’s a limitation here that catch/throw can only use symbols, which are
not garbage-collectable, so eventually you’ll run out of RAM. You only
need one symbol per level of recursion, so you could keep a stack of
them, but that’s housekeeping.

What I’d suggest as a first step is to factor out the whole ‘loop which
can be restarted’ concept. This makes it very easy to experiment with
alternative implementations. Here’s one from the top of my head:

def restartable_loop
tag = Class.new(RuntimeError)
loop do
begin
yield tag
rescue tag
end
end
end

def restart(tag)
raise tag
end

I’ve attached your code with this modification, and it seems to work.

However if the logic were factored out like this I would object less to
the use of callcc, because I think it makes the intention much clearer.


#8

I’ve been keeping an eye on the mailing list and I haven’t seen
something
come up since I subscribed about zlib.dll, apologies if this is a stupid
question, but when I try and run one of my programs it says zlib.dll
cannot
be found. I’m trying to run it with Ruby 1.9.1… I’m not even entirely
sure
how to install 1.9.1 and Google didn’t help there either, so I just
extracted to an empty directory (C:\ruby19) and ran it from command
prompt
as ruby “pathtoprogram” and Windows tossed a ‘zlib.dll cannot be found’
error. I got IRB working, so I think I got it ‘installed’ right, but
what am
I doing wrong to keep getting this error?

  • Jayce

#9

Tanaka A. wrote:

The argument of catch is optional in Ruby 1.9.

catch {|tag|
throw tag
}

catch generates a unique tag if not given.

… and I see that even with an explicit tag, it doesn’t have to be a
symbol any more.

$ ruby19 -e “catch(Object.new) { }”
$ ruby -e “catch(Object.new) { }”
-e:1:in `catch’: #Object:0xb7e0adb4 is not a symbol (TypeError)
from -e:1


#10

In article removed_email_address@domain.invalid,
Thomas H. removed_email_address@domain.invalid writes:

The catch/throw solution seems uglier to me than the callcc solution.
Do you have other suggestions?

The argument of catch is optional in Ruby 1.9.

catch {|tag|
throw tag
}

catch generates a unique tag if not given.


#11

Robert D. wrote:

On Fri, Jan 30, 2009 at 4:26 AM, Charles Oliver N.
removed_email_address@domain.invalid wrote:

Perhaps even as a replacement for catch/throw? (Hey, now I’m on topic!)
My question is, and apologies if I am being stupid here, what is the
added value of such a cc? I mean what can it do that catch/throw
cannot. I have to admit that cc just makes my mind explode, so I might
miss the obvious.

We pass more specs? :slight_smile:

Seriously though…probably not a great deal, other than that
catch/throw feels a little weird to me itself. I figured if we could at
least support callcc in one way it would be better than not supporting
it at all. Do you think it’s a bad idea to do this, rather than having
callcc fail for all uses?

  • Charlie

#12

On Fri, Jan 30, 2009 at 4:26 AM, Charles Oliver N.
removed_email_address@domain.invalid wrote:

Perhaps even as a replacement for catch/throw? (Hey, now I’m on topic!)
My question is, and apologies if I am being stupid here, what is the
added value of such a cc? I mean what can it do that catch/throw
cannot. I have to admit that cc just makes my mind explode, so I might
miss the obvious.

  • Charlie

Cheers
Robert


#13

Tanaka A. removed_email_address@domain.invalid wrote/schrieb removed_email_address@domain.invalid:

The argument of catch is optional in Ruby 1.9.
[…]
catch generates a unique tag if not given.

Fine! I like that much more than it is in Ruby 1.8. Imagine a Ruby
program uses code from different libraries from different people. In
Ruby 1.8 there’s the danger that the catch/throw pairs do interfer in
a bad way as in my example, and because the tags are not part of the
interface, nobody sees the danger. For that reason the new way of
catch/throw should preferred to the old way whenever possible.

Regards
Thomas


#14

On Sat, Jan 31, 2009 at 9:25 AM, Thomas H.
removed_email_address@domain.invalid wrote:

interface, nobody sees the danger. For that reason the new way of
catch/throw should preferred to the old way whenever possible.
Bad design IMHO
catch/throw can be used perfectly save as follows

class MyClass

def thrown; @thrown ||= Object::new end

def do_throw; throw( @thrown, 42) end

def with_throw; catch( @thrown )do yield end end

end

But it is still a good use case for Charley’s idea !!
Cheers
Robert


#15

Brian C. removed_email_address@domain.invalid wrote/schrieb
removed_email_address@domain.invalid:

A program which uses callcc can be very hard to understand. Your example
isn’t, but only when you realise that |proceed| is (a) a local variable,
(b) reset each time around the loop, and © is never passed to any
other method. This guarantees that previous continuation objects will
not be invoked later and will be garbage-collected.

The lexicographical scope lets one see, if a locally created object is
passed away or not. In the former case you can have side effects, in
the latter not. In the example it’s obvious that the continuation will
not be reused.

But you have to analyse the code very carefully to draw this conclusion,
whereas a loop plus throw/catch is pretty obvious (I think).

I come to a different conclusion with regard to my example.

It’s with catch/throw, that the code had to be analysed very carfully.
May the immanent dangers of catch/throw have caused that you proposed
twice a simplification that turned out not to work? I think so.

On the other side the callcc-solution is simple and save. An easy
check of the variable’s scope makes it obvious that the continuation
is harmless in the example.

Regards
Thomas


#16

Robert D. wrote:

end

But it is still a good use case for Charley’s idea !!

The funny thing is that that’s basically what the “downward” callcc
support looks like inside JRuby; it’s just holding a magic token that’s
invalidated once the continuation leaves that part of the stack.

  • Charlie