Forum: Ruby local variable assertion

unknown (Guest)
on 2007-01-19 17:31
(Received via mailing list)
I've started studying Ruby, and while I like it, one thing that bothers
me is that there is not a way to explicitly declare a variable in order
to say "I want a variable local to this scope: I don't want to reuse
some variable by the same name in a containing scope."  In particular,
the fact that giving a block parameter the same name as an existing
variable can overwrite that variable troubles me.

Now I haven't written enough Ruby code to know whether this is really a
problem in practice or if it is just a theoretical concern.

Nevertheless, I thought that the problem  could be addressed if there
was some way to "declare" your local variables before you use them.  I
put "declare" in quotes because that isn't the right word in Ruby.
What I wanted was a facility to assert that a variable is not yet in
use.

I came up with the code that follows.  The introductory comment
explains.  I imagine that someone has already done this, but I'd be
interested to hear what folks think.

Thanks,

    David Flanagan
------------
module Kernel
  # Assert that the named variables do not exist yet,
  # so that they can be used as local variables in the block without
  # clobbering an existing variable
  #
  # This method expects any number of variable names as arguments.
  # The names may be specified as symbols or strings.
  # The method must be invoked with an associated block, although the
  # block may be empty. It uses the binding of the block with eval to check
  # whether the variable names are in use yet, and raises a NameError if
  # any of them are currently used.
  #
  # If the block associated with local expects no arguments, then this method
  # invokes it. The code within the block can safely use the symbols
  # passed to local.  If the block expects arguments, then local assumes
  # that the block is intended for the caller and just returns it.
  #
  # Here are some typical uses of this method:
  #
  #   local :x, :y  do     # Execute a block in which x and y are local vars
  #     data.each do |x|
  #       y = x*x
  #       puts y
  #     end
  #   end
  #
  # Here's a way to use local where nested blocks are not needed:
  #
  #   data.each &local(:x) {|x| puts x*x }
  #
  # Here's a way to use it as an assertion with an empty block
  #
  #   local(:x, :y) {}  # Assert that x and y aren't in use yet.
  #   data.each do |x|  # Now go use those variables
  #     y = x*x
  #     puts y
  #   end
  #
  #
  def local(*syms, &block)
    syms.each do |sym|
      # First, see if the symbol itself is defined as a variable or method
      # XXX: do I also need to check for methods like x=?
      # XXX Would it be simpler or faster to do eval local_variables instead?
      # (block.binding rather than block: eval in Ruby 1.9+ requires a Binding)
      value = eval("defined? #{sym}", block.binding)
      # If it is not defined, then go on to the next symbol
      next if !value
      # Otherwise, the symbol is in use, so raise an exception
      raise NameError.new("#{sym} is already a #{value}")
    end

    # If none of the symbols are in use, then we can proceed.
    # What we do next depends on the arity of the block, however.
    # If the block expects no arguments, then we just call it
    # If the block was declared with arguments, then it is not intended
    # for this method.  Instead, we return it so our caller can invoke it.
    if block.arity == 0 or block.arity == -1
      block.call
    else
      block
    end
  end
end
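
The second XXX comment above asks whether checking the local variables directly would be simpler. A minimal sketch of that idea, assuming Ruby 2.1 or later, where Binding#local_variable_defined? is available. The helper name assert_locals_unused and the two demo methods are hypothetical and not part of the original post. Unlike the defined? check, this version only notices local variables, not methods of the same name.

```ruby
# Hypothetical helper (not from the original post): raise NameError if any of
# the given names is already a local variable in the scope that created the block.
def assert_locals_unused(*names, &block)
  raise ArgumentError, "a block is required to supply the binding" unless block

  scope = block.binding # binding of the scope in which the block was written
  names.each do |name|
    if scope.local_variable_defined?(name.to_sym)
      raise NameError, "#{name} is already a local variable"
    end
  end
  block.call
end

# x and y are not yet used in this method, so the block runs.
def fresh_scope(data)
  assert_locals_unused(:x, :y) do
    data.map { |x| y = x * x; y }
  end
end

# x is already a local variable here, so the assertion fails fast.
def crowded_scope(data)
  x = 99
  assert_locals_unused(:x) { data.map { |x| x * x } }
rescue NameError
  :x_in_use
end
```

Calling fresh_scope([1, 2, 3]) runs the block, while crowded_scope([1, 2, 3]) takes the rescue branch.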
unknown (Guest)
on 2007-01-19 17:31
(Received via mailing list)
Funny wordwrapping of the code and comments in that post...

You can also see the code at my blog:
http://www.davidflanagan.com/blog/2007_01.html#000120

    David
gga (Guest)
on 2007-09-26 00:29
(Received via mailing list)
removed_email_address@domain.invalid wrote:

> I've started studying Ruby, and while I like it, one thing that bothers
> me is that there is not a way to explicitly declare a variable in order
> to say "I want a variable local to this scope: I don't want to reuse
> some variable by the same name in a containing scope."  In particular,
> the fact that giving a block parameter the same name as an existing
> variable can overwrite that variable troubles me.
>
> Now I haven't written enough Ruby code to know whether this is really a
> problem in practice or if it is just a theoretical concern.

I think it is more theoretical, as that would indicate you are really
writing VERY long functions.
Remember that in Ruby, global variables start with $, instance variables
with @ and class variables with @@, so the chance of a conflict is very small.

That being said, your code can be written more simply, like:

def let(*syms, &block)
  raise "No block provided for undefined?" unless block_given?

  syms.each do |sym|
    # block.binding rather than block: eval in Ruby 1.9+ requires a Binding
    value = eval("defined? #{sym}", block.binding)
    next if !value
    raise NameError.new("#{sym} is already a #{value}")
  end

  yield
end


# block check
let( :x ) {  x = 20 }

begin
  let( :x ) {   p 'never run' }
rescue
end

# Forgot block...
let( :x )
unknown (Guest)
on 2007-09-26 00:38
(Received via mailing list)
On Sat, 20 Jan 2007 removed_email_address@domain.invalid wrote:

>
> It changes it because you fail-fast with a NameError rather than possibly
> introducing a bug that may not be near to the source of the error.
>

but you could fail faster?  by the time you decide the names you will use
you no longer need 'local'?

>> local and which not.
>
> The local vars are the ones that you want to be local in the block.  I
> think this is always easy.  And, when you have to cut-and-paste, you
> copy the entire local block, so that the protection it gives you
> travels with the code.

i think this is misleading. take your example:  if i cut and paste this

   local :x do
     data.map!{|x| x ** 2}
   end

somewhere else i'm safe not only if x hasn't been used in the new scope, but
also if data hasn't, or it has, but it's the correct value.  the thing is you
are not going to paste that code without knowing where 'data' is coming from:
you haven't eliminated the problem or even really reduced it since you
__still__ must ensure your current scope isn't too big to wrap your brain
around: you've got to know where data is coming from and it's going to come
from exactly the same place 'x' is - the current scope, which you must
understand in its entirety in order to use 'data' properly in the new
cut-and-pasted context...

the entire concept that a programming construct can make it safe to cut and
paste code is really quite a stretch...

your local impl is __still__ a mixed scope like any other block in ruby and
therefore suffers exactly the same issues: in the above you could easily
clobber a local version of 'data', especially if you cut and pasted it into a
scope where its origin was unknown.

in summary, i don't think one can possibly solve the issues of mixed scoping
of blocks with a method that takes a mixedly scoped block!  ;-)

in addition, the local impl requires __twice__ as many definitions of local
variables and we all know where that goes: more lines almost never equals
fewer bugs - that's the d.r.y principle that's so big in the ruby community.

in any case i think matz's block-local vars, due for ruby 1.9, address the
largest issues with block scoping already.
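
For reference, a minimal sketch of the block-local syntax as it shipped in Ruby 1.9: names listed after a semicolon in the block parameter list are local to the block, and block parameters themselves always shadow outer variables. The method name block_local_demo and the placeholder values are made up for illustration.

```ruby
# Assumes Ruby 1.9 or later. x is a block parameter and y is declared
# block-local after the semicolon, so neither touches the outer x and y.
def block_local_demo(data)
  x = :outer_x
  y = :outer_y
  squares = []
  data.each do |x; y|
    y = x * x        # assigns the block-local y only
    squares << y     # squares is still shared with the enclosing scope
  end
  [x, y, squares]
end

p block_local_demo([1, 2, 3]) # => [:outer_x, :outer_y, [1, 4, 9]]
```

Note that 'squares' is deliberately left shared with the method, which is the mixed scoping discussed above.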

cheers.

-a
unknown (Guest)
on 2007-09-26 00:39
(Received via mailing list)
On Sat, 20 Jan 2007 removed_email_address@domain.invalid wrote:

>
> I disagree.  Local variables are used most often.  I see a good chance of
> getting bitten by this problem, especially when using generic variable names
> like i or x as block parameters to loop iterators.
>

i've been writing ruby in production 90% of my coding time for nearly 6 years
and have hit this only one or two times.  i'd say it's a valid concern but
nearly always a massive sign of code smell: one simply should never have too
many variables to keep in one's head in scope at any given moment.  if one
does, it's time to refactor.

> Suppose you've got a simple loop to iterate through an array
>
>   data.each { |x| x*x}
>
> Now you refactor some code and end up cutting-and-pasting that loop into a
> method that happens to use x as a parameter.  Suddenly your loop behaves
> differently.  x is no longer local to the block and it overwrites the local
> variable in your method.

you're quite right.  what i fail to see is how a 'local' method changes that
one bit.  consider, say you paste the above snippet into some code that has an
'x' defined 20 lines up, you don't notice and introduce a bug.  in order to
prevent this you are advocating this

   local :x do
     data.each { |x| x*x}
   end

so a re-def of x will raise an error.  at first glance that seems ok.
consider this however: one must __know_in_advance__ which vars to declare
local and which not.  in your example it's the only obvious one but, in fact,
there are two candidates: 'x' and 'data'.  now, in this case we know that we
do, in fact, require the 'data' var __not__ to be local, but to be picked up
from the current scope.  note that it's __precisely__ this ability to do mixed
scoping which makes blocks useful at all - otherwise we'd all just pass
stuff around.

perhaps you see where i'm going?  in order to use local effectively with even
a moderately complex piece of code one needs to look at the code and decide
which vars should be local, which should be block-local (ruby 1.9), and which
should be scoped normally.  all this has to be known __up_front__!

the thing is, if i have to know, as a programmer, up front which vars to
declare local then i don't have a problem any more!  ;-)

so, imho, you are correct in pointing out a source for errors but blocks, like
all coding constructs, must be weighed by comparing advantages vs.
disadvantages and the mixing of scopes certainly resides on both lists.

it would be nice if a way existed to solve the problem you have underscored,
but if that solution requires me to do the same amount of work that i had to
do before to solve it 'manually' then it simply becomes line noise and, as we
all know, any code you write that you don't have to is simply adding bugs.

my 2cts.

kind regards.

-a
Rob S. (Guest)
on 2007-09-26 00:42
(Received via mailing list)
On 1/19/07, removed_email_address@domain.invalid <removed_email_address@domain.invalid> wrote:
> Funny wordwrapping of the code and comments in that post...
>
> You can also see the code at my blog:
> http://www.davidflanagan.com/blog/2007_01.html#000120
>
>     David

Without diving too much into the implementation of this, I would say
if you really find yourself needing it a lot you should refactor to
smaller methods that do less stuff.  I really find composed method and
extract method are some of the most critical refactorings in Ruby.

Oh yeah, and welcome to Ruby =).  I'm sure you'll find a lot to love
coming from the Javascript world.  There are some libraries that let
you write prototype style Ruby similar to idiomatic Javascript.

- Rob
unknown (Guest)
on 2007-09-26 00:44
(Received via mailing list)
Rob,

Yes, breaking long methods up is usually good.  If the smaller methods
that one is refactoring into are not of general utility, however, then
I would argue (perhaps in my JavaScript mindset) that they should not
be methods, but lambdas instead.    But re-factoring into lambdas
doesn't help with the local variable issue since you can never be
confident about the scope of your lambda parameters.

Isn't refactoring, in fact, one of the scenarios where you run into
problems with variable overlap?  If you cut-and-paste a block from one
method into another, and the new method uses a variable that has the
same name as one of the block parameters, you've just set yourself up
for trouble.

I don't like Perl, but I do think that Perl's "my" variables solve
this problem elegantly.

  David
unknown (Guest)
on 2007-09-26 00:45
(Received via mailing list)
On Jan 19, 5:58 am, "gga" <removed_email_address@domain.invalid> wrote:

> > Now I haven't written enough Ruby code to know whether this is really a
> > problem in practice or if it is just a theoretical concern.
>
> I think it is more theoretical, as that would indicate you are really
> writing VERY long functions.
> Remember that in Ruby, global variables are $, instance variables are @
> and class variables are @@, so there's a very rare chance of conflict.

I disagree.  Local variables are used most often.  I see a good chance
of getting bitten by this problem, especially when using generic
variable names like i or x as block parameters to loop iterators.

Suppose you've got a simple loop to iterate through an array

   data.each { |x| x*x}

Now you refactor some code and end up cutting-and-pasting that loop
into a method that happens to use x as a parameter.  Suddenly your loop
behaves differently.  x is no longer local to the block and it
overwrites the local variable in your method.
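
A minimal sketch of this scenario, assuming Ruby 1.9 or later, where the rules changed: a block parameter now always shadows an outer variable of the same name, but a plain assignment inside the block still writes to an existing outer variable. The method name pasted_loop and the variable y are hypothetical, added only for illustration.

```ruby
# Assumes Ruby 1.9 or later. The loop has been pasted into a method that
# already has a parameter x and a local y.
def pasted_loop(x, data)
  y = :keep_me
  data.each { |x| y = x * x } # |x| shadows the parameter; y is the method's y
  [x, y]
end

p pasted_loop(:param, [1, 2, 3]) # => [:param, 9]
```

So on those versions the parameter x survives, and the remaining hazard is the assignment to y.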

>
>    yield
> end

This actually breaks the second-use case for my method.  If the block
expects parameters, then I want my method to return the block so that
it can be passed on to the calling method.  This allows me to use
local() without having to nest blocks.  Consider this invocation:

    data.each &local(:x) {|x| puts x*x }

The block is passed to local, which checks that it is safe to use x as
a variable in the block.  Then local() returns the block, which gets
passed, in turn, to the each() iterator.  The & and the required
parentheses make this syntax a little messy but it allows one block
instead of two.
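
The two behaviours described above can be sketched as follows. This is a condensed restatement of the local method, assuming a Ruby version where eval needs block.binding. The demo methods squares_of and clobbering_case are hypothetical, and the call is written as data.each(&local(:x) { ... }) with explicit parentheses to keep the parse unambiguous.

```ruby
# Condensed version of local: check the names, then either run the block
# (no parameters) or hand it back to the caller (block takes parameters).
def local(*syms, &block)
  syms.each do |sym|
    value = eval("defined? #{sym}", block.binding)
    raise NameError, "#{sym} is already a #{value}" if value
  end
  [0, -1].include?(block.arity) ? block.call : block
end

# x is unused in this method, so local returns the block and each receives it.
def squares_of(data)
  result = []
  data.each(&local(:x) { |x| result << x * x })
  result
end

# x is already a parameter here, so local raises before each is ever called.
def clobbering_case(x, data)
  data.each(&local(:x) { |x| x * x })
rescue NameError
  :name_error
end
```

The single method therefore has two different return contracts, selected only by the arity of the block it is given.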

> # block check
> let( :x ) {  x = 20 }
>
> begin
>   let( :x ) {   p 'never run' }
> rescue
> end

I believe this would actually print 'never run'.  Since x is not a
local variable, its use in the first block remains local to that block.

> # Forgot block...
> let( :x )

This won't work: the block is needed to pass to eval() for checking for
the existence of the local variables.  Otherwise, I'm just checking for
local variables inside the local() method itself.

  David
gga (Guest)
on 2007-09-26 00:46
(Received via mailing list)
removed_email_address@domain.invalid wrote:
>
> This actually breaks the second-use case for my method.  If the block
> expects parameters, then I want my method to return the block so that
> it can be passed on to the calling method.

Ah, I see. Sorry, I did not catch that from your docs.  It is indeed an
ugly construct.
I'm not sure trying to save a block is a smart move.  You end up with a
method that behaves very differently, and returns something different,
based only on a block's arity.  That's just a huge headache waiting to happen.

I mean.... if you are really concerned about the efficiency of this:

local(:x) { data.each { |x| x*x } }

I'd say you are definitely guilty of premature optimization.


>
> I believe this would actually print 'never run'.  Since x is not a
> local variable, its use in the first block remains local to that block.
>

Correct, actually.  I seemed to have forgotten the x = 0 before the
begin block.  Sorry.
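
A minimal sketch of both variants of the example, assuming the let method from earlier in the thread adapted to use block.binding. The wrapper methods without_assignment and with_assignment are hypothetical, and the x = 0 line is the one described as missing.

```ruby
def let(*syms, &block)
  raise ArgumentError, "No block provided for let" unless block

  syms.each do |sym|
    value = eval("defined? #{sym}", block.binding)
    raise NameError, "#{sym} is already a #{value}" if value
  end
  yield
end

# As originally posted: x is only assigned inside the first block, so it stays
# local to that block and the second block does run.
def without_assignment
  let(:x) { x = 20 }
  let(:x) { :ran }
end

# With the missing assignment: x is now a local of the enclosing scope, so
# let raises and the block never runs.
def with_assignment
  x = 0
  begin
    let(:x) { :never_run }
  rescue NameError
    :name_error
  end
end
```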

PS. Welcome to the Ruby community.  Looking forward to seeing what you'll
do with ruby.
unknown (Guest)
on 2007-09-26 00:48
(Received via mailing list)
On Jan 19, 12:59 pm, removed_email_address@domain.invalid wrote:
> one bit.  consider, say you paste the above snippet into some code that has an

It changes it because you fail-fast with a NameError rather than
possibly introducing a bug that may not be near to the source of the
error.

> 'x' defined 20 lines up, you don't notice and introduce a bug.  in order to
> prevent this you are advocating this
>
>    local :x do
>      data.each { |x| x*x}
>    end
>
> so a re-def of x will raise an error.  at first glance that seems ok.
> consider this however: one must __know_in_advance__ which vars to declare
> local and which not.

The local vars are the ones that you want to be local in the block.  I
think this is always easy.  And, when you have to cut-and-paste, you
copy the entire local block, so that the protection it gives you
travels with the code.