Iterator Fu Failing Me


#1

I have a group of classes, all implementing a parse?() class method.
Calling parse?(token) will return the constructed object, if it can
be parsed by this class, or false otherwise.

I want to run a bunch of tokens through these classes, grabbing the
first fit. For example:

elements = tokens.map do |token|
ClassA.parse?(token) or
ClassB.parse?(token) or
ClassC.parse?(token)
end

That works. Now can anyone give me a version of the middle section
that doesn’t require I call parse?() 50 times? I want something
close to:

elements = tokens.map do |token|
[ClassA, ClassB, ClassC].find { |kind| kind.parse?(token) }
end

Except that I want the return result of parse?(), instead of the
class that took it.

Thanks for any tips you can offer.

James Edward G. II


#2

On Sat, Jan 07, 2006 at 01:36:30PM +0900, James Edward G. II wrote:

ClassC.parse?(token)

Except that I want the return result of parse?(), instead of the
class that took it.

do/end doesn’t bind tightly enough for the assignment to ‘elements’.
Using {} should work.

marcel


#3

On Jan 6, 2006, at 10:39 PM, Marcel Molina Jr. wrote:

should work.
Sure it does:

>> numbers = (1..3).map do |n|
?>   n * 2
>> end
=> [2, 4, 6]

My problem is not syntax. It’s that I can’t find a way to iterate to
the result of parse?().

James Edward G. II


#4

Are “Write a new Enumerable method that mimics find but returns the
block result instead” or “do a less elegant solution involving
assignment” not acceptable solutions?
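
For reference, the first option could look roughly like this. It is only a sketch: the method name find_result is made up (it is not part of Ruby's core library), and ClassA/ClassB are stand-in parsers whose parse? returns a string on success and false otherwise.

```ruby
# Hypothetical helper: like Enumerable#find, but returns the block's
# first truthy result instead of the element that produced it.
module Enumerable
  def find_result
    each do |item|
      result = yield(item)
      return result if result
    end
    nil
  end
end

# Stand-in parsers, for illustration only.
class ClassA
  def self.parse?(token)
    token =~ /a/ ? "A(#{token})" : false
  end
end

class ClassB
  def self.parse?(token)
    token =~ /b/ ? "B(#{token})" : false
  end
end

tokens = %w[a b z]
elements = tokens.map do |token|
  [ClassA, ClassB].find_result { |kind| kind.parse?(token) }
end
p elements  # ["A(a)", "B(b)", nil]
```

Reopening Enumerable is a design choice with the usual monkey-patching caveats; a plain helper method taking the list as an argument would work just as well.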


#5

On 2006.01.07 13:36, James Edward G. II wrote:

ClassC.parse?(token)

end

That works. Now can anyone give me a version of the middle section
that doesn’t require I call parse?() 50 times? I want something
close to:

elements = tokens.map do |token|
[ClassA, ClassB, ClassC].find { |kind| kind.parse?(token) }
end

Not tested

elements = tokens.map {|token|
[A, B, C].each {|kind|
result = kind.parse?(token) and break result
}
}

Except that I want the return result of parse?(), instead of the
class that took it.

Thanks for any tips you can offer.

James Edward G. II

E


#7

On Sat, 7 Jan 2006, James Edward G. II wrote:

ClassC.parse?(token)
took it.
harp:~ > cat a.rb

module Parser
  module ClassMethods
    # a symbol tag is used for catch/throw: newer rubies compare tags by
    # identity, so two separate 'parse!' string literals would not match
    def parse tokens, klasses
      tokens.map{|token| catch(:parse!){ klasses.each{|klass| klass.parse! token}; nil}}
    end
    def parse! token
      ret = parse?(token) and throw :parse!, ret
    end
    def parse? token
      return "<#{ token }> parsed by <#{ self }>" if token =~ self::RE
    end
  end
  module InstanceMethods
  end
  def self::included other
    other.module_eval{ extend ClassMethods and include InstanceMethods }
  end
  extend ClassMethods and include InstanceMethods
end

class ClassA
include Parser
RE = /a/
end

class ClassB
include Parser
RE = /b/
end

class ClassC
include Parser
RE = /c/
end

elements = Parser::parse %w( a b c d ), [ClassA, ClassB, ClassC]

p elements

harp:~ > ruby a.rb
["<a> parsed by <ClassA>", "<b> parsed by <ClassB>", "<c> parsed by <ClassC>", nil]

you can probably come up with something shorter - but using catch/throw with
parsers is very useful to break out of conditions like these.
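
A shorter take on the same catch/throw idea might look like the following. This is a sketch under assumptions: the lambdas in PARSERS and the helper name first_parse are invented for illustration and stand in for the real parser classes.

```ruby
# Hypothetical stand-in parsers: each lambda returns a value or nil.
PARSERS = [
  lambda { |token| "int:#{token}" if token =~ /\A\d+\z/ },
  lambda { |token| "word:#{token}" if token =~ /\A[a-z]+\z/ }
]

# Returns the first truthy parse result, or nil if no parser matches.
def first_parse(token, parsers)
  catch(:parsed) do
    parsers.each do |parser|
      result = parser.call(token)
      throw :parsed, result if result   # jumps straight out of the each loop
    end
    nil                                 # nobody claimed the token
  end
end

elements = %w[42 foo ?!].map { |token| first_parse(token, PARSERS) }
p elements  # ["int:42", "word:foo", nil]
```

The value passed to throw becomes the return value of the catch block, which is what lets the result escape the inner loop.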

cheers.

-a


#8

On 1/6/06, James Edward G. II wrote:

 ClassC.parse?(token)

end

That works. Now can anyone give me a version of the middle section
that doesn’t require I call parse?() 50 times? I want something
close to:

elements = tokens.map do |token|
[ClassA, ClassB, ClassC].find { |kind| kind.parse?(token) }
end

How about:

elements = tokens.map do |token|
[ClassA, ClassB, ClassC].inject(false) {|m, kind| m or
kind.parse?(token)}
end

It isn’t perfectly efficient but it is short.
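
To make the efficiency remark concrete: inject still visits every class after a hit, although the short-circuiting "or" means parse? is not called again once a result exists. A small sketch, assuming stand-in parsers built by a hypothetical make_parser helper that records each call:

```ruby
# Records the name of every parser whose parse? actually ran.
CALLS = []

# Hypothetical factory for stand-in parser classes (illustration only).
def make_parser(name, pattern)
  Class.new do
    define_singleton_method(:parse?) do |token|
      CALLS << name
      token =~ pattern ? "#{name}:#{token}" : false
    end
  end
end

ClassA = make_parser("A", /a/)
ClassB = make_parser("B", /b/)
ClassC = make_parser("C", /c/)

visited = 0
result = [ClassA, ClassB, ClassC].inject(false) do |m, kind|
  visited += 1            # counts block invocations, hit or not
  m or kind.parse?("a")   # parse? is skipped once m is truthy
end

p result   # "A:a"
p visited  # 3
p CALLS    # ["A"]
```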

Brian.


#9

James Edward G. II wrote:

ClassC.parse?(token)

Except that I want the return result of parse?(), instead of the
class that took it.

Thanks for any tips you can offer.

James Edward G. II

elements = tokens.map do |tok|
parsers.inject(false) {|a,par| a=par.parse(tok) and break a}
end

alternative using a method definition

def parse(token)
parsers.each {|par| a=par.parse(token) and return a}
false
end

But the best solution is

elements = tokens.map do |tok|
parsers.detect {|par| par.parse(tok)}
end

:)

Kind regards

robert
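
One thing worth checking against the original question: detect (like find) returns the element the block accepted, here the parser itself, not the value the block returned. The sketch below shows the difference with two made-up parsers (NumberParser and WordParser are assumptions, not classes from the thread) and one possible way to get the parsed value, using Enumerable#lazy, which needs Ruby 2.0 or later.

```ruby
# Hypothetical parsers exposing parse, for illustration only.
class NumberParser
  def self.parse(tok)
    tok =~ /\A\d+\z/ ? tok.to_i : false
  end
end

class WordParser
  def self.parse(tok)
    tok =~ /\A[a-z]+\z/ ? tok.to_sym : false
  end
end

parsers = [NumberParser, WordParser]

# detect hands back the parser that accepted the token.
winners = %w[42 foo].map do |tok|
  parsers.detect { |par| par.parse(tok) }
end
p winners  # [NumberParser, WordParser]

# Mapping lazily first means detect sees parse results instead,
# and later parsers are not tried once one succeeds.
values = %w[42 foo].map do |tok|
  parsers.lazy.map { |par| par.parse(tok) }.detect { |value| value }
end
p values   # [42, :foo]
```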

#10

On Sat, 7 Jan 2006, Eero S. wrote:

ClassB.parse?(token) or

Not tested

elements = tokens.map {|token|
[A, B, C].each {|kind|
result = kind.parse?(token) and break result
}
}

fails when no kind parses. in this case the result will be

[[A,B,C]] (the return of each…)
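
A small sketch of that failure, plus one possible repair. The A and B classes here are stand-ins invented for the example; the repair keeps the result in a local variable instead of relying on the return value of each.

```ruby
# Stand-in parsers (hypothetical): each recognises a single letter.
class A
  def self.parse?(token)
    token == "a" ? "A parsed a" : false
  end
end

class B
  def self.parse?(token)
    token == "b" ? "B parsed b" : false
  end
end

# When nothing matches, each runs to completion and returns its receiver.
broken = %w[a x].map { |token|
  [A, B].each { |kind|
    result = kind.parse?(token) and break result
  }
}
p broken  # ["A parsed a", [A, B]]

# Possible repair: remember the last parse? result outside the loop.
fixed = %w[a x].map { |token|
  found = nil
  [A, B].each { |kind|
    found = kind.parse?(token) and break
  }
  found
}
p fixed   # ["A parsed a", false]
```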

regards.

-a

===============================================================================
| ara [dot] t [dot] howard [at] noaa [dot] gov
| strong and healthy,
| who thinks of sickness until it strikes like lightning?
| preoccupied with the world,
| who thinks of death, until it arrives like thunder?
| – milarepa


#11

On Sat, 2006-01-07 at 05:36, James Edward G. II wrote:

 ClassC.parse?(token)

end

That works. Now can anyone give me a version of the middle section
that doesn’t require I call parse?() 50 times? I want something

I am not certain if this solution fits, but…

There is a refactoring named Replace Conditional with Polymorphism that
might solve your problem. The idea is that when you have a conditional,
you can create a class hierarchy with one subclass for each leg in the
conditional. (See
http://www.refactoring.com/catalog/replaceConditionalWithPolymorphism.html
for a slightly better explanation.)

You already have several different parser classes, and of course you
don’t need to bother with a class hierarchy, so you could do something
like:

parser = ParserFactory.create(parser_type)

elements = tokens.map do |token|
parser.parse?(token)
end

Note that ‘parser_type’ may well be the class of a token in tokens.

This works assuming that all tokens in ‘tokens’ are of the same type.
For example, you have one parser for handling CSV data, another for XML,
a third for SGML, a fourth for non-compliant HTML, etc.

On the other hand, if each parser class handles a subset of the tokens
you get from one type of input, for example you have tokenized an XML
file, and have one parser for elements, another for processing
instructions, a third for text nodes, etc., you will need something
different. A case statement would do the trick:

elements = tokens.map do |token|
case token
when Element
ElementParser.parse?(token)
when Text
TextParser.parse?(token)

else
raise "Unknown token type"
end
end

You can then factor out the case statement into a method named parse?,
move the parse? method to a class of its own, and be back to:

elements = tokens.map do |token|
ParseEverything.parse?(token)
end
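
Put together, that refactoring might look something like the sketch below. The token types (Element, Text) and the bodies of ElementParser and TextParser are placeholders made up for the example; only the dispatch in ParseEverything is the point.

```ruby
# Hypothetical token types and parsers, for illustration only.
Element = Struct.new(:name)
Text    = Struct.new(:content)

class ElementParser
  def self.parse?(token)
    "<#{token.name}>"
  end
end

class TextParser
  def self.parse?(token)
    token.content.strip
  end
end

# The case statement now lives in one place.
class ParseEverything
  def self.parse?(token)
    case token
    when Element then ElementParser.parse?(token)
    when Text    then TextParser.parse?(token)
    else raise ArgumentError, "Unknown token type: #{token.class}"
    end
  end
end

tokens = [Element.new("p"), Text.new(" hello ")]
elements = tokens.map { |token| ParseEverything.parse?(token) }
p elements  # ["<p>", "hello"]
```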

A while ago I wrote a CSV parser in Java for a talk on Test Driven
Design. The Java code got horribly complex, but it struck me that in
Ruby I could do this:

class Parser
def parse(reader, writer)
tokenize(reader) { |type, value|
write(type, value, writer)
}
end
end

By mixing in tokenize and write methods, it would be possible to build
parsers that handle most formats.

I hope this helps.

/Henrik

http://www.henrikmartensson.org/ - Reflections on software development


#12

On Sat, 7 Jan 2006, Robert K. wrote:

elements = tokens.map do |tok|
parsers.inject(false) {|a,par| a=par.parse(tok) and break a}
end

nice.

-a



#13

removed_email_address@domain.invalid wrote:

On Sat, 7 Jan 2006, Robert K. wrote:

elements = tokens.map do |tok|
parsers.inject(false) {|a,par| a=par.parse(tok) and break a}
end

nice.

Yeah, but #detect is better here. And if I say that another solution is
better than an inject solution… :)

robert

#14

On Jan 6, 2006, at 11:36 PM, James Edward G. II wrote:

That works. Now can anyone give me a version of the middle section
that doesn’t require I call parse?() 50 times? I want something
close to:

elements = tokens.map do |token|
[ClassA, ClassB, ClassC].find { |kind| kind.parse?(token) }
end

Except that I want the return result of parse?(), instead of the
class that took it.

Why not:

elements = tokens.map do |token|
result = nil
[ClassA, ClassB, ClassC].find { |kind| result = kind.parse?(token) }
end


Bob H. – blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. – http://www.recursive.ca/
Raconteur – http://www.raconteur.info/
xampl for Ruby – http://rubyforge.org/projects/xampl/


#15

Bob H. wrote:

Except that I want the return result of parse?(), instead of the
class that took it.

Why not:

elements = tokens.map do |token|
result = nil
[ClassA, ClassB, ClassC].find { |kind| result = kind.parse?(token) }
end

Because this won’t return the proper value from the block. You need at
least to add a line with “result” after the #find. :) And then it’s
much less elegant than using #detect. :)

Kind regards

robert
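
A quick sketch of the difference that extra line makes, assuming stand-in ClassA/ClassB parsers invented for the example:

```ruby
# Stand-in parsers (hypothetical): parse? returns a string or false.
class ClassA
  def self.parse?(token)
    token == "a" ? "A(a)" : false
  end
end

class ClassB
  def self.parse?(token)
    token == "b" ? "B(b)" : false
  end
end

tokens = %w[a b z]

# Without the trailing line, the block's value is whatever find returns.
without_result = tokens.map do |token|
  result = nil
  [ClassA, ClassB].find { |kind| result = kind.parse?(token) }
end
p without_result  # [ClassA, ClassB, nil]

# With it, the block's value is the last parse? result that was assigned.
with_result = tokens.map do |token|
  result = nil
  [ClassA, ClassB].find { |kind| result = kind.parse?(token) }
  result
end
p with_result     # ["A(a)", "B(b)", false]
```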

#16

removed_email_address@domain.invalid a écrit :

ClassA.parse?(token) or
end
def parse tokens, klasses
module InstanceMethods
end
def self::included other
other.module_eval{ extend ClassMethods and include InstanceMethods }
end
extend ClassMethods and include InstanceMethods
end

Could you explain this module?

I think I can understand why you defined a module ClassMethods: to
include class methods and not only instance methods.

However, I don’t understand why the InstanceMethods module is there. What do you
expect from this module? Is it just for symmetry?

Thanks !


#17

On Sun, 8 Jan 2006, Pierre Barbier de Reuille wrote:

Could you explain this module?

I think I can understand why you defined a module ClassMethods: to include
class methods and not only instance methods.

However, I don’t understand why the InstanceMethods module is there. What do you
expect from this module? Is it just for symmetry?

yes. just symmetry/clarity. i use this pattern a lot. even if someone
doesn’t understand the mechanism i generally assume the ‘ClassMethods’ and
‘InstanceMethods’ names will give it away.

cheers.

-a



#18

Robert K. a écrit :

end
end

Because this won’t return the proper value from the block. You need at
least to add a line with “result” after the #find. :) And then it’s
much more inelegant than using #detect. :)

Kind regards

robert

Well, looking at the documentation of ruby1.8, detect and find are
aliases … is it specific to ruby1.8 or some newer/older version ?
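
As far as I know, find and detect are aliases in later Ruby versions as well, not just 1.8. A quick check that they behave identically on a sample list (the list itself is arbitrary):

```ruby
list = [nil, nil, "w", 2, 3]

# Both return the first element for which the block is truthy.
via_find   = list.find   { |x| x }
via_detect = list.detect { |x| x }

p via_find    # "w"
p via_detect  # "w"
```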

Pierre


#19

On Jan 7, 2006, at 11:13 AM, Robert K. wrote:

Because this won’t return the proper value from the block. You
need at least to add a line with “result” after the #find. :)

cut’n’past-o… oops

And then it’s much more inelegant than using #detect. :)

You mean #detect or #select?

elements = tokens.select do |token|
result = nil
[ClassA, ClassB, ClassC].find { |kind| results << kind.parse?(token) }
result
end

And this is kind of handy too…

elements = tokens.map do |token|
result = nil
[ClassA, ClassB, ClassC].find { |kind| result = [token, kind.parse?(token)] }
result
end

Kind regards

robert




#20

Bob H. wrote:

[ClassA, ClassB, ClassC].find { |kind| kind.parse?(token) }

(token) }
end

Because this won’t return the proper value from the block. You
need at least to add a line with “result” after the #find. :)

cut’n’past-o… oops

And then it’s much more inelegant than using #detect. :)

You mean #detect or #select?

#detect - because it stops as soon as it has a hit while #select will return
an array. We need just the first hit.

[nil,nil,nil,"w",2,3].find {|x|puts x;x}
nil
nil
nil
w
=> "w"

[nil,nil,nil,"w",2,3].select {|x|puts x;x}
nil
nil
nil
w
2
3
=> ["w", 2, 3]

elements = tokens.select do |token|
result = nil
[ClassA, ClassB, ClassC].find { |kind| results << kind.parse?(token) }
result
end

You don’t define results and you never assign result a value other than
nil…

And this is kind of handy too…

elements = tokens.map do |token|
result = nil
[ClassA, ClassB, ClassC].find { |kind| result = [token, kind.parse?(token)] }
result
end

There’s no point in storing token in result because it’s unchanged and
available as block parameter.

Say what you want, find/detect is the most elegant solution.

robert