Experimentation

I thought I’d implement some missing members on the String class in
order to get my feet wet and start to understand the software. I chose
String.scan on the grounds that it was a fairly common function (between
20 and 30 references in the standard library) with straightforward
semantics, but one which requires dealing with overloads and blocks.

There are basically four variations of this function:
String.scan
String.scan
String.scan ,
String.scan ,

I’ve attempted to implement each of these, and believe that all but the
last are correct. The variations are implemented in two different
flavors for both CLR strings and mutable strings. A patch can be found
at http://hagenlocher.org/software/MutableString.scan.patch.txt

The two issues I ran into are as follows:

  1. The overload mechanism is picking the wrong method at runtime. Here
    are two of the function prototypes:

[RubyMethodAttribute("scan", RubyMethodAttributes.PublicInstance)]
public static List<object> Scan(MutableString/*!*/ self,
    MutableString/*!*/ searchStr)

[RubyMethodAttribute("scan", RubyMethodAttributes.PublicInstance)]
public static object Scan(CodeContext/*!*/ context, MutableString/*!*/ self,
    MutableString searchStr, BlockParam block)

When I run from rbx.exe, I get the following:

"hello world".scan("l")
=> ["l", "l", "l"]
"hello world".scan("l") {|x| print x}
=> ["l", "l", "l"]

In contrast, CRuby gives this:
irb(main):001:0> "hello world".scan("l")
=> ["l", "l", "l"]
irb(main):002:0> "hello world".scan("l") {|x| print x}
lll=> "hello world"

Am I doing something wrong, or is this a bug? (I have obviously updated
Initializer.Generated.cs, or neither scan would have been found :).

  2. My implementation of the last variation (scan with a regular
    expression and a block) is incomplete. This function is defined to
    behave as follows:
    a.scan(/\w+/) {|w| print "<<#{w}>> " }
    a.scan(/(.)(.)/) {|x,y| print y, x }

In other words, the number of parameters being passed to the block is
equal to the number of groups defined in the regular expression – or 1
if there are no groups defined. I haven’t been able to find a way to
pass parameters or define a call site that would support this.
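
To make the target behavior concrete, here is a minimal sketch of what CRuby does with the two calls above. The sample value of `a` and the variable names `words`, `pairs` and `groups` are assumptions for illustration. Note that CRuby actually yields the groups as one array, which a block with several parameters then unpacks.

```ruby
a = "cruel world"   # assumed sample value; the thread never defines `a`

# No groups: the block receives each whole match as a single argument.
words = []
a.scan(/\w+/) { |w| words << w }

# Two groups: a two-parameter block sees one value per group.
pairs = []
a.scan(/(.)(.)/) { |x, y| pairs << [y, x] }

# A one-parameter block shows that the groups arrive together as an array.
groups = []
a.scan(/(.)(.)/) { |m| groups << m }

p words
p pairs
p groups.first
```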

Finally, it’s a bit annoying to rebuild Initializer.Generated.cs.
Whenever you change a method signature, you have to manually delete the
appropriate part of the old file in order to regenerate it, or you’ll
get an error. I’ve made an empty version of that source file and a
batch file that copies it on top of the previous version before
rebuilding ClassInitGenerator. Assuming that the architecture is going
to be here for a while, it would be nice if there were a target in the
Rakefile that performed these steps.
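
A rough sketch of the copy step such a target might perform, written as a plain Ruby helper so it runs without Rake. The helper name `reset_generated_initializer` and the file name `Initializer.Empty.cs` are hypothetical, and the command that runs ClassInitGenerator is not given in this thread, so it is left as a comment.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: overwrite the generated initializer with an empty
# template so the generator starts from a clean file.
def reset_generated_initializer(template_path, generated_path)
  FileUtils.cp(template_path, generated_path)
end

# In a Rakefile this could be wrapped in a task, for example:
#   task :regen_initializer do
#     reset_generated_initializer("Initializer.Empty.cs",
#                                 "Initializer.Generated.cs")
#     # ...then rebuild and run ClassInitGenerator (command not shown here)
#   end

# Small demonstration in a temporary directory with placeholder files.
restored = Dir.mktmpdir do |dir|
  template  = File.join(dir, "Initializer.Empty.cs")
  generated = File.join(dir, "Initializer.Generated.cs")
  File.write(template, "// empty template\n")
  File.write(generated, "// stale generated code\n")
  reset_generated_initializer(template, generated)
  File.read(generated)
end
puts restored
```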

After a few more hours of this, I may have to figure out how to do that
myself. :)

  1. Change the signatures to:

[RubyMethodAttribute("scan", RubyMethodAttributes.PublicInstance)]
public static List<object>/*!*/ Scan(MutableString/*!*/ self,
    [NotNull]MutableString/*!*/ searchStr)

[RubyMethodAttribute("scan", RubyMethodAttributes.PublicInstance)]
public static object Scan(CodeContext/*!*/ context, MutableString/*!*/ self,
    [NotNull]BlockParam/*!*/ block, [NotNull]MutableString/*!*/ searchStr)

a) Block is a special parameter and it must follow the self parameter.
The order of parameters is:

  • context
  • self
  • block
  • mandatory parameters
  • optional parameters
  • params array (rest parameters)
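
The ordering rule above can be sketched as a small check. This is only an illustration in Ruby, not how the binder or ClassInitGenerator works; the symbols and the method name `valid_parameter_order?` are made up for the example. It checks relative order only, not whether context, self or block appear more than once.

```ruby
# Order taken from the list above; the symbols are illustrative labels.
PARAMETER_ORDER = [:context, :self, :block, :mandatory, :optional, :rest].freeze

# Returns true when the parameter kinds appear in the stated order.
# Kinds may be omitted, and mandatory/optional kinds may repeat.
def valid_parameter_order?(kinds)
  ranks = kinds.map { |kind| PARAMETER_ORDER.index(kind) }
  return false if ranks.any?(&:nil?)          # unknown kind
  ranks.each_cons(2).all? { |a, b| a <= b }   # never step backwards
end

original = [:context, :self, :mandatory, :block]   # block at the end
fixed    = [:context, :self, :block, :mandatory]   # block right after self

puts valid_parameter_order?(original)
puts valid_parameter_order?(fixed)
```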
    

b) Specify [NotNull] for parameters that cannot be null in order to
select the overload. You can assume that such parameters don’t have a
null value when called from a DLR language. It’s not a CLR attribute,
though, so it doesn’t prevent non-dynamic languages from passing null.
Since the library methods are not supposed to be called directly (only
the Ruby runtime should invoke them), you don’t need to check for
non-nullity at runtime. The context parameter is always non-null (no
need to check for null at run-time). The self parameter is also non-null
unless the method is a module method or an instance method of NilClass.

c) Annotate types with the /*!*/ annotation if you assume them to be
non-null. Although the annotation doesn’t affect run-time behavior at
all (being a comment, it’s ignored by C#), it is useful for static
analysis and expresses your assumptions.

d) Note also that unless marked by the NotNull attribute, BlockParam is
nullable. Hence the overload might be eligible for invocation even
though no block has been passed. A null reference is used if the block
is not specified in a call site and there is no overload that matches
better. So a single overload with a BlockParam parameter also works.
Which variant to choose depends on the semantics of the method. If the
presence of the block significantly changes the behavior of the method,
then it’s probably better to have two overloads. Code like
[RubyMethod("foo")] public static Foo(… block …) { if (block != null)
{ 1st overload implementation } else { 2nd overload implementation } }
should be avoided if possible; two overloads should be defined instead.
On the other hand, if the implementation hardly depends on whether the
block is present or not (it only affects a small part of the
implementation), then it’s probably better to have a single overload.

  2. Check out the dynamic site in Thread.CreateThread. It should do
    what you need. The magic is in ArgumentKind.List (splat).
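
For readers unfamiliar with the term, this is what a splat looks like at the Ruby level. The sketch is a hypothetical pure-Ruby stand-in for the block-taking regex form, not the IronRuby call-site code; the name `scan_with_block` is invented. Unlike CRuby, which yields the groups as a single array, it passes them as separate arguments, which looks the same to a block with one parameter per group.

```ruby
# Hypothetical stand-in for the regex-plus-block form of scan.
def scan_with_block(str, regex, &block)
  pos = 0
  while (m = regex.match(str, pos))
    if m.captures.empty?
      block.call(m[0])          # no groups: pass the whole match
    else
      block.call(*m.captures)   # splat: one argument per group
    end
    # Continue after the match; step one further after an empty match.
    pos = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
  end
  str                           # like scan, return the receiver
end

swapped = []
scan_with_block("ab cd", /(\w)(\w)/) { |x, y| swapped << y + x }

words = []
result = scan_with_block("ab cd", /\w+/) { |w| words << w }

p swapped
p words
p result
```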

  3. Feel free to patch the Rakefile. Note, however, that we are going
    to change the shape of libraries a little bit (in particular move
    Builtins to IronRuby.Libraries.dll), so it might be necessary to
    adjust the script again afterwards.

Note that MutableString has an instance method IndexOf, so you don’t
need to convert to a CLR string (the method internally makes the
conversion, but that’s only a provisional implementation).
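
As an illustration of scanning by repeated index lookups, here is a pure-Ruby model that uses String#index in place of MutableString.IndexOf. The name `scan_literal` is invented, and rejecting an empty pattern is a simplification of this sketch (the real scan accepts one). Ruby has no overloads, so both forms share one method here.

```ruby
# Hypothetical model of the string-pattern forms of scan.
def scan_literal(str, pattern)
  raise ArgumentError, "pattern must not be empty" if pattern.empty?

  matches = []
  pos = 0
  while (found = str.index(pattern, pos))
    if block_given?
      yield pattern.dup          # block form: hand each match to the block
    else
      matches << pattern.dup     # blockless form: collect the matches
    end
    pos = found + pattern.length # matches do not overlap
  end
  block_given? ? str : matches
end

collected = scan_literal("hello world", "l")

seen = []
returned = scan_literal("hello world", "l") { |x| seen << x }

p collected
p seen
p returned
```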

Tomas

From: [email protected]
[mailto:[email protected]] On Behalf Of Curt
Hagenlocher
Sent: Saturday, October 13, 2007 8:16 PM
To: [email protected]
Subject: [Ironruby-core] Experimentation

Whoops, looks like emails crossed. I think Tomas is right: we’re not
picking the block overload because the block parameter needs to be after
self, not at the end like you might expect. Well, it’s worth trying to
see if that fixes the problem :)

Cheers,
John

From: John M.
Sent: Saturday, October 13, 2007 11:14 PM
To: [email protected]
Subject: RE: [Ironruby-core] Experimentation

Hi Curt,

Sounds like a method binder bug. The workaround is to merge the two
methods into one:

[RubyMethod("scan", RubyMethodAttributes.PublicInstance)]
public static object Scan(CodeContext/*!*/ context, MutableString/*!*/ self,
    MutableString searchStr, BlockParam block) {
    if (block == null) {
        // block wasn't supplied
    } else {
        // …
    }
}

Even though a “BlockParam” argument is present, you can still call the
method w/o a block. (Essentially, all “BlockParams” are [Optional])
In the meantime, I’ll see if I can figure out why the binder is picking
the wrong overload.

- John

Thanks for all of the excellent advice; everything is now working. I
had guessed there might be an issue on the overload with the specific
signature of the function, but I wasn’t able to find any examples of
existing code that contradicted what I had done. Now that I know what
to look for, of course… :). I wonder if it wouldn’t be a good idea
for the ClassInitGenerator to emit a warning when there’s a signature
containing a BlockParam that doesn’t follow the correct pattern.

The /*!*/ annotation seems a bit redundant with [NotNull], doesn’t it?
I think I’m just resentful that it’s a bunch of characters that totally
interrupt the flow of my typing. :P

I considered merging the block form of the scan function with the
blockless form, but it seemed that the result would be considerably
harder to read and understand than keeping them separate.

Is this the sort of change you’d like to see submitted to the project?
If so, I’ll write some tests (I know; should have done it the other way
around) and generate another patch file.

On 10/14/07, Tomas M. [email protected] wrote:

Yes, you’re right it seems redundant but unfortunately it’s not. The
problem is that our (i.e. DLR’s) NotNull attribute is currently not
understood by Spec# (static verifier). It’s only influencing the overload
resolution, nothing else. So Spec# would complain that you’re e.g.
“dotting thru” a variable that might be null.

Ah, I didn’t realize there was a program looking directly at the source
code.

If you can define a macro in your favorite text editor, this is the
right use of that feature.

:)

Yes, we accept contributions to libraries, i.e. to
IronRuby.Libraries.dll. Although MutableString class will stay where it
is, Ruby method implementations for MutableString will soon get factored out
to the Libraries assembly. Hence we could take your contribution then if it
will be correct and efficient.

I see that the operations have now been factored out into
MutableStringOps.cs, presumably in preparation for this migration. If
it’s not premature, there’s an updated patch file (is that the preferred
format?) at http://hagenlocher.org/software/MutableStringOps.scan.patch.txt
which incorporates implementations for scan, upto, swapcase and
swapcase!, as well as tests for all of these changes in test.string.rb.

These days, i18n is at the top of my brain as my company works on
internationalizing its product. It’s charmingly quaint to see that
IronRuby (and presumably Ruby) have explicit tests against the ranges
‘A’-‘Z’ and ‘a’-‘z’ for purposes of checking and changing the case of
individual letters :).
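
For illustration, an ASCII-only case swap with the explicit range tests described above might look like this. It is a sketch, not the code from the patch or from IronRuby, and the name `ascii_swapcase` is invented. Characters outside the two ranges pass through unchanged.

```ruby
# Swap case only for letters in 'A'-'Z' and 'a'-'z'.
def ascii_swapcase(str)
  str.each_char.map do |ch|
    if ch >= "A" && ch <= "Z"
      ch.downcase
    elsif ch >= "a" && ch <= "z"
      ch.upcase
    else
      ch                      # digits, punctuation, non-ASCII letters
    end
  end.join
end

p ascii_swapcase("Hello, World")
p ascii_swapcase("caf\u00E9")   # the accented letter is left as it is
```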

Inline.

From: [email protected]
[mailto:[email protected]] On Behalf Of Curt
Hagenlocher
Sent: Sunday, October 14, 2007 6:52 AM
To: [email protected]
Subject: Re: [Ironruby-core] Experimentation

Thanks for all of the excellent advice; everything is now working. I
had guessed there might be an issue on the overload with the specific
signature of the function, but I wasn’t able to find any examples of
existing code that contradicted what I had done. Now that I know what
to look for, of course… :). I wonder if it wouldn’t be a good idea
for the ClassInitGenerator to emit a warning when there’s a signature
containing a BlockParam that doesn’t follow the correct pattern.

Yes, that’s the plan - CIG will check for other errors as well.

The /*!*/ annotation seems a bit redundant with [NotNull], doesn’t it?
I think I’m just resentful that it’s a bunch of characters that totally
interrupt the flow of my typing. :P

Yes, you’re right it seems redundant but unfortunately it’s not. The
problem is that our (i.e. DLR’s) NotNull attribute is currently not
understood by Spec# (static verifier). It’s only influencing the
overload resolution, nothing else. So Spec# would complain that you’re
e.g. “dotting thru” a variable that might be null.

If you can define a macro in your favorite text editor, this is the
right use of that feature.

I considered merging the block form of the scan function with the
blockless form, but it seemed that the result would be considerably
harder to read and understand than keeping them separate.

Is this the sort of change you’d like to see submitted to the project?
If so, I’ll write some tests (I know; should have done it the other way
around) and generate another patch file.

Yes, we accept contributions to libraries, i.e. to
IronRuby.Libraries.dll. Although MutableString class will stay where it
is, Ruby method implementations for MutableString will soon get factored
out to the Libraries assembly. Hence we could take your contribution
then if it will be correct and efficient.

Tomas

On 10/13/07, Tomas M. <[email protected]> wrote:

On 10/14/07, Curt H. [email protected] wrote:

If it’s not premature, there’s an updated patch file (is that the
preferred format?) at
http://hagenlocher.org/software/MutableStringOps.scan.patch.txt which
incorporates implementations for scan, upto, swapcase and swapcase!, as well
as tests for all of these changes in test.string.rb.

Proving that “you can’t eat just one”, I’ve amended this to include
center, chomp, chomp!, chop and chop!. And they say that watching
television isn’t productive…