Local variable assertion


#1

I’ve started studying Ruby, and while I like it, one thing that bothers
me is that there is not a way to explicitly declare a variable in order
to say “I want a variable local to this scope: I don’t want to reuse
some variable by the same name in a containing scope.” In particular,
the fact that giving a block parameter the same name as an existing
variable can overwrite that variable troubles me.

Now I haven’t written enough Ruby code to know whether this is really a
problem in practice or if it is just a theoretical concern.

Nevertheless, I thought that the problem could be addressed if there
was some way to “declare” your local variables before you use them. I
put “declare” in quotes because that isn’t the right word in Ruby.
What I wanted was a facility to assert that a variable is not yet in
use.

I came up with the code that follows. The introductory comment
explains. I imagine that someone has already done this, but I’d be
interested to hear what folks think.

Thanks,

David Flanagan

module Kernel
  # Assert that the named variables do not exist yet, so that they can be
  # used as local variables in the block without clobbering an existing
  # variable.
  #
  # This method expects any number of variable names as arguments.
  # The names may be specified as symbols or strings.
  #
  # The method must be invoked with an associated block, although the
  # block may be empty. It uses the binding of the block with eval to check
  # whether the variable names are in use yet, and raises a NameError if
  # any of them are currently used.
  #
  # If the block associated with local expects no arguments, then this
  # method invokes it. The code within the block can safely use the symbols
  # passed to local. If the block expects arguments, then local assumes
  # that the block is intended for the caller and just returns it.
  #
  # Here are some typical uses of this method:
  #
  #   local :x, :y do    # Execute a block in which x and y are local vars
  #     data.each do |x|
  #       y = x*x
  #       puts y
  #     end
  #   end
  #
  # Here's a way to use local where nested blocks are not needed:
  #
  #   data.each &local(:x) {|x| puts x*x }
  #
  # Here's a way to use it as an assertion with an empty block:
  #
  #   local(:x, :y) {}   # Assert that x and y aren't in use yet.
  #   data.each do |x|   # Now go use those variables
  #     y = x*x
  #     puts y
  #   end
  #
  def local(*syms, &block)
    syms.each do |sym|
      # First, see if the symbol itself is defined as a variable or method
      # XXX: do I also need to check for methods like x=?
      # XXX: Would it be simpler or faster to do eval local_variables instead?
      value = eval("defined? #{sym}", block.binding)
      # If it is not defined, then go on to the next symbol
      next if !value
      # Otherwise, the symbol is in use, so raise an exception
      raise NameError.new("#{sym} is already a #{value}")
    end

    # If none of the symbols are in use, then we can proceed.
    # What we do next depends on the arity of the block, however.
    # If the block expects no arguments, then we just call it.
    # If the block was declared with arguments, then it is not intended
    # for this method. Instead, we return it so our caller can invoke it.
    if block.arity == 0 or block.arity == -1
      block.call
    else
      block
    end
  end
end
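Regarding the XXX note about using local_variables instead of eval: here is a minimal sketch of that idea, assuming Ruby 2.1 or later, where Binding#local_variable_defined? is available. It is not the code from the post. It checks local variables only (the eval-based version also catches methods), and the demo_* helper names are made up for illustration.

```ruby
# Hypothetical variant of the `local` helper above (assumes Ruby 2.1+).
# It asks the block's binding directly instead of eval-ing "defined? name".
def local(*names, &block)
  raise ArgumentError, "local needs a block to inspect" unless block

  scope = block.binding # the scope in which the block was written
  names.each do |name|
    if scope.local_variable_defined?(name)
      raise NameError, "#{name} is already a local variable"
    end
  end

  # Same dispatch as the original: run blocks that take no arguments,
  # hand blocks that take arguments back to the caller.
  block.arity.zero? ? block.call : block
end

# x is not used in this method, so the block is returned and passed to each.
def demo_clean_scope
  out = []
  [1, 2, 3].each(&local(:x) { |x| out << x * x })
  out
end

# x already exists here, so local raises before anything else runs.
def demo_name_in_use
  x = 99
  local(:x) { }
  :not_reached
rescue NameError => e
  e.message
end
```

Unlike the eval-based check, this variant would not notice a method named x, so it is a narrower assertion than the one in the post.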


#2

Funny wordwrapping of the code and comments in that post…

You can also see the code at my blog:
http://www.davidflanagan.com/blog/2007_01.html#000120

David

#3

removed_email_address@domain.invalid wrote:

I’ve started studying Ruby, and while I like it, one thing that bothers
me is that there is not a way to explicitly declare a variable in order
to say “I want a variable local to this scope: I don’t want to reuse
some variable by the same name in a containing scope.” In particular,
the fact that giving a block parameter the same name as an existing
variable can overwrite that variable troubles me.

Now I haven’t written enough Ruby code to know whether this is really a
problem in practice or if it is just a theoretical concern.

I think it is more theoretical, since running into it would indicate
you are writing VERY long functions.
Remember that in Ruby, global variables start with $, instance variables
with @ and class variables with @@, so the chance of a conflict is very
small.

That being said, your code can be written more simply, like this:

def let(*syms, &block)
  raise "No block provided for let" unless block_given?

  syms.each do |sym|
    value = eval("defined? #{sym}", block.binding)
    next if !value
    raise NameError.new("#{sym} is already a #{value}")
  end

  yield
end

# block check
let( :x ) { x = 20 }

begin
  let( :x ) { p 'never run' }
rescue
end

# Forgot block...
let( :x )


#4

On Sat, 20 Jan 2007 removed_email_address@domain.invalid wrote:

I disagree. Local variables are used most often. I see a good chance of
getting bitten by this problem, especially when using generic variable names
like i or x as block parameters to loop iterators.

i’ve been writing ruby in production 90% of my coding time for nearly 6
years and have hit this only one or two times. i’d say it’s a valid
concern but nearly always a massive sign of code smell: one simply should
never have too many variables in scope to keep in one’s head at any given
moment. if one does, it’s time to refactor.

Suppose you’ve got a simple loop to iterate through an array

data.each { |x| x*x}

Now you refactor some code and end up cutting-and-pasting that loop into a
method that happens to use x as a parameter. Suddenly your loop behaves
differently. x is no longer local to the block and it overwrites the local
variable in your method.

you’re quite right. what i fail to see is how a ‘local’ method changes
that one bit. consider, say you paste the above snippet into some code
that has an ‘x’ defined 20 lines up, you don’t notice and introduce a bug.
in order to prevent this you are advocating this

local :x do
data.each { |x| x*x}
end

so a re-def of x will raise an error. at first glance that seems ok.
consider this however: one must know in advance which vars to declare
local and which not. in your example it’s the only obvious one but, in
fact, there are two candidates: ‘x’ and ‘data’. now, in this case we know
that we do, in fact, require the ‘data’ var not to be local, but to be
picked up from the current scope. note that it’s precisely this ability
to do mixed scoping which makes blocks useful at all - otherwise we’d all
just pass stuff around.

perhaps you see where i’m going? in order to use local effectively with
even a moderately complex piece of code one needs to look at the code and
decide which vars should be local, which should be block-local (ruby 1.9),
and which should be scoped normally. all this has to be known up front!

the thing is, if i have to know, as a programmer, up front which vars to
declare local then i don’t have a problem any more! ;-)

so, imho, you are correct in pointing out a source of errors but blocks,
like all coding constructs, must be weighed by comparing advantages vs.
disadvantages and the mixing of scopes certainly resides on both lists.

it would be nice if a way existed to solve the problem you have
underscored, but if that solution requires me to do the same amount of
work that i had to do before to solve it ‘manually’ then it simply becomes
line noise and, as we all know, any code you write that you don’t have to
is simply adding bugs.

my 2cts.

kind regards.

-a


#5

On Sat, 20 Jan 2007 removed_email_address@domain.invalid wrote:

It changes it because you fail-fast with a NameError rather than possibly
introducing a bug that may not be near to the source of the error.

but you could fail faster? by the time you decide the names you will use
you no longer need ‘local’?

consider this however: one must know in advance which vars to declare
local and which not.

The local vars are the ones that you want to be local in the block. I
think this is always easy. And, when you have to cut-and-paste, you
copy the entire local block, so that the protection it gives you
travels with the code.

i think this is misleading. take your example: if i cut and paste this

local :x do
data.map!{|x| x ** 2}
end

somewhere else i’m safe not only if x hasn’t been used in the new scope,
but also if data hasn’t, or it has, but it’s the correct value. the thing
is you are not going to paste that code without knowing where ‘data’ is
coming from: you haven’t eliminated the problem or even really reduced it
since you still must ensure your current scope isn’t too big to wrap your
brain around: you’ve got to know where data is coming from and it’s going
to come from exactly the same place ‘x’ is - the current scope, which you
must understand in its entirety in order to use ‘data’ properly in the new
cut-and-pasted context…

the entire concept that a programming construct can make it safe to cut
and paste code is really quite a stretch…

your local impl is still a mixed scope like any other block in ruby and
therefore suffers exactly the same issues: in the above you could easily
clobber a local version of ‘data’, especially if you cut and pasted it
into a scope where its origin was unknown.

in summary, i don’t think one can possibly solve the issues of mixed
scoping of blocks with a method that takes a mixedly scoped block! ;-)

in addition, the local impl requires twice as many definitions of local
variables and we all know where that goes: more lines almost never equals
fewer bugs - that’s the d.r.y. principle that’s so big in the ruby
community.

in any case i think matz’s block-local vars, due for ruby 1.9, address
the largest issues with block scoping already.
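For readers on later versions: a small sketch of what block-local variables look like, assuming Ruby 1.9 or later, where names listed after a semicolon in the block parameter list are private to the block. The two method names are made up for this illustration.

```ruby
# Without a block-local declaration, assigning to y inside the block
# writes to the y of the enclosing method.
def last_square_shared(data)
  y = :untouched
  data.each { |x| y = x * x }
  y
end

# Declaring y after the semicolon makes it local to the block,
# so the enclosing y is left alone.
def last_square_block_local(data)
  y = :untouched
  data.each { |x; y| y = x * x }
  y
end

last_square_shared([1, 2, 3])      # => 9
last_square_block_local([1, 2, 3]) # => :untouched
```

Note that the declaration still has to be written by hand for each name, which is the "know up front" cost discussed above.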

cheers.

-a


#6

On 1/19/07, removed_email_address@domain.invalid wrote:

Funny wordwrapping of the code and comments in that post…

You can also see the code at my blog:
http://www.davidflanagan.com/blog/2007_01.html#000120

David

Without diving too much into the implementation of this, I would say
if you really find yourself needing it a lot you should refactor to
smaller methods that do less stuff. I really find Composed Method and
Extract Method to be some of the most critical refactorings in Ruby.

Oh yeah, and welcome to Ruby =). I’m sure you’ll find a lot to love
coming from the JavaScript world. There are some libraries that let
you write prototype-style Ruby similar to idiomatic JavaScript.

  • Rob

#7

On Jan 19, 5:58 am, “gga” removed_email_address@domain.invalid wrote:

Now I haven’t written enough Ruby code to know whether this is really a
problem in practice or if it is just a theoretical concern.

I think it is more theoretical, as that would indicate you are really
writing VERY long functions.
Remember that in Ruby, global variables are $, instance variables are @
and class variables are @@, so there’s a very rare chance of conflict.

I disagree. Local variables are used most often. I see a good chance
of getting bitten by this problem, especially when using generic
variable names like i or x as block parameters to loop iterators.

Suppose you’ve got a simple loop to iterate through an array

data.each { |x| x*x}

Now you refactor some code and end up cutting-and-pasting that loop
into a method that happens to use x as a parameter. Suddenly your loop
behaves differently. x is no longer local to the block and it
overwrites the local variable in your method.
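Here is a small sketch of that scenario, assuming Ruby 1.9 or later; the method name process and its arguments are made up. On those versions a block parameter always shadows an outer variable of the same name, so the method parameter survives. On the Ruby 1.8 discussed in this thread the block parameter would, as far as I understand, assign to the method's x instead. Plain assignments inside the block still reach outer variables on every version.

```ruby
# Hypothetical method that the loop was pasted into; it already uses x.
def process(x, data)
  total = 0
  # On Ruby 1.9+ the block parameter x is a new variable that shadows
  # the method parameter. total is not a parameter of the block, so the
  # assignment updates the variable in the enclosing method.
  data.each { |x| total += x * x }
  [x, total]
end

process(:original, [1, 2, 3]) # => [:original, 14]
```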

yield
end

This actually breaks the second use case for my method. If the block
expects parameters, then I want my method to return the block so that
it can be passed on to the calling method. This allows me to use
local() without having to nest blocks. Consider this invocation:

data.each &local(:x) {|x| puts x*x }

The block is passed to local, which checks that it is safe to use x as
a variable in the block. Then local() returns the block, which gets
passed, in turn, to the each() iterator. The & and the required
parentheses make this syntax a little messy but it allows one block
instead of two.

# block check
let( :x ) { x = 20 }

begin
  let( :x ) { p 'never run' }
rescue
end

I believe this would actually print ‘never run’. Since x is not a
local variable, its use in the first block remains local to that block.

# Forgot block...
let( :x )

This won’t work: the block is needed to pass to eval() for checking for
the existence of the local variables. Otherwise, I’m just checking for
local variables inside the local() method itself.
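A minimal sketch of that point, assuming Ruby 2.1 or later for Binding#local_variable_defined?; all three method names are made up. A method's own binding only knows the method's variables, while the binding of a block passed in from the caller exposes the caller's scope.

```ruby
# Looks only at this method's own scope, which contains just `name`.
def sees_without_block(name)
  binding.local_variable_defined?(name)
end

# Looks at the scope in which the block was written, i.e. the caller's.
def sees_with_block(name, &block)
  block.binding.local_variable_defined?(name)
end

def caller_scope
  total = 42
  [sees_without_block(:total), sees_with_block(:total) { }]
end

caller_scope # => [false, true]
```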

David


#8

removed_email_address@domain.invalid wrote:

This actually breaks the second-use case for my method. If the block
expects parameters, then I want my method to return the block so that
it can be passed on to the calling method.

Ah, I see. Sorry, I did not catch that from your docs. It is indeed an
ugly construct.
I’m not sure trying to save a block is a smart move. You end up with a
method that behaves very differently, and returns something different,
based only on a block’s arity. That’s just a huge headache waiting to
happen.

I mean… if you are really concerned about the efficiency of this:

local(:x) { data.each { |x| x*x } }

I’d say you are definitely guilty of premature optimization.

I believe this would actually print ‘never run’. Since x is not a
local variable, its use in the first block remains local to that block.

Correct, actually. I seem to have forgotten the x = 0 before the
begin block. Sorry.

PS. Welcome to the Ruby community. Looking forward to seeing what you’ll
do with Ruby.


#9

Rob,

Yes, breaking long methods up is usually good. If the smaller methods
that one is refactoring into are not of general utility, however, then
I would argue (perhaps in my JavaScript mindset) that they should not
be methods, but lambdas instead. But re-factoring into lambdas
doesn’t help with the local variable issue since you can never be
confident about the scope of your lambda parameters.

Isn’t refactoring, in fact, one of the scenarios where you run into
problems with variable overlap? If you cut-and-paste a block from one
method into another, and the new method uses a variable that has the
same name as one of the block parameters, you’ve just set yourself up
for trouble.

I don’t like Perl, but I do think that Perl’s “my” variables solve
this problem elegantly.

David


#10

On Jan 19, 12:59 pm, removed_email_address@domain.invalid wrote:

you’re quite right. what i fail to see is how a ‘local’ method changes
that one bit.

It changes it because you fail fast with a NameError rather than
possibly introducing a bug that may not be near the source of the
error.

consider, say you paste the above snippet into some code that has an
‘x’ defined 20 lines up, you don’t notice and introduce a bug. in order to
prevent this you are advocating this

local :x do
data.each { |x| x*x}
end

so a re-def of x will raise an error. at first glance that seems ok.
consider this however: one must know in advance which vars to declare
local and which not.

The local vars are the ones that you want to be local in the block. I
think this is always easy. And, when you have to cut-and-paste, you
copy the entire local block, so that the protection it gives you
travels with the code.