Another refactor excise

Hi, ALL:

I am happy because i learned a lot last night from this fourm.
Now, I encounter another challenge. Here is my code:

class Fastqfile
attr_accessor :offset

def initialize file
@file = file
@offset ||= 33
end

def each_records
File.foreach(@file).each_slice(4).map {|elem|
if block_given?
yield elem.map(&:chomp)
else
elem.map(&:chomp)
end
}.to_enum
end

def each_ids
each_records {|id,seq,_id,qv|
if block_given?
yield id
else
id
end
}.to_enum
end

def each_reads
each_records {|id,seq,_id,qv|
if block_given?
yield seq
else
seq
end
}.to_enum
end

def each_quals
each_records {|id,seq,_id,qv|
if block_given?
yield decode(qv)
else
decode(qv)
end
}.to_enum
end

private
def decode(qv)
qv.chars.map {|q| q.ord - 33 }
end

end

There are many similar methods within my code. I hope to make it DRY.
At the beginning. I try to use method_missing to solve my question. But
i feel it’s not a good way.
In my case , method_missing may lead to debug issue.

Could someboy give me some suggestion?

Thanks a lot.

Finally:

my code is as follow:

class Fastqfile
attr_accessor :offset

def initialize file
@file = file
@offset ||= 33
end

def each_records
File.foreach(@file).each_slice(4).map {|elem|
if block_given?
yield elem.map(&:chomp)
else
elem.map(&:chomp)
end
}.to_enum
end

[:each_ids, :each_reads, :each_quals].each do |method|
define_method method do |&block|
each_records {|id,seq,_id,qv|
record = {each_ids: id, each_reads: seq, each_quals: decode(qv)
}
if block
block.call record.fetch method
else
record.fetch method
end
}.to_enum
end
end

private
def decode(qv)
qv.chars.map {|q| q.ord - 33 }
end

end

felix chang wrote in post #1121695:

Hi, ALL:

I am happy because i learned a lot last night from this fourm.
Now, I encounter another challenge. Here is my code:

class Fastqfile
attr_accessor :offset

def initialize file
@file = file
@offset ||= 33
end

def each_records
File.foreach(@file).each_slice(4).map {|elem|
if block_given?
yield elem.map(&:chomp)
else
elem.map(&:chomp)
end
}.to_enum
end

def each_ids
each_records {|id,seq,_id,qv|
if block_given?
yield id
else
id
end
}.to_enum
end

def each_reads
each_records {|id,seq,_id,qv|
if block_given?
yield seq
else
seq
end
}.to_enum
end

def each_quals
each_records {|id,seq,_id,qv|
if block_given?
yield decode(qv)
else
decode(qv)
end
}.to_enum
end

private
def decode(qv)
qv.chars.map {|q| q.ord - 33 }
end

end

There are many similar methods within my code. I hope to make it DRY.
At the beginning. I try to use method_missing to solve my question. But
i feel it’s not a good way.
In my case , method_missing may lead to debug issue.

Could someboy give me some suggestion?

Thanks a lot.

One more suggestion - when you have the same expression in both
branches of an if statement, it’s worth moving it outside.

  if block_given?
    yield elem.map(&:chomp)
  else
    elem.map(&:chomp)
  end

chomped = elem.map(&:chomp)
if block_given?
yield chomped
else
chomped
end

(or even

block_given? ? yield chomped : chomped

if you want to move it to one line)

and likewise, rather than

    if block
      block.call record.fetch method
    else
      record.fetch method
    end

r = record.fetch method
block ? block.call® : r

martin

On Tue, Sep 17, 2013 at 8:05 PM, felix chang [email protected]
wrote:

@offset ||= 33

end
Why are you using #map in the first line in combination with #to_enum
at the end? That does not seem to make sense - you are combining too
many operations here for my taste. Plus: #eachXYZ methods usually
return self at the end (unless there is no block). I am not really
sure what your goal is but if I create a similar scenario then the
block is called, a mapping is done and for the result an Enumerator is
returned:

irb(main):001:0> X = Struct.new :d do
irb(main):002:1* def er
irb(main):003:2> d.each_slice(4).map {|e| yield e.map(&:chomp)}.to_enum
irb(main):004:2> end
irb(main):005:1> end
=> X
irb(main):006:0> x=X.new [“foo”, “bar\n”]
=> #<struct X d=[“foo”, “bar\n”]>
irb(main):007:0> x.er {|x| p x}
[“foo”, “bar”]
=> #<Enumerator: [[“foo”, “bar”]]:each>

irb(main):010:0> x=X.new (1…10).map {|i| “#{i}\n”};nil
=> nil
irb(main):011:0> x.er {|x| p x}
[“1”, “2”, “3”, “4”]
[“5”, “6”, “7”, “8”]
[“9”, “10”]
=> #<Enumerator: [[“1”, “2”, “3”, “4”], [“5”, “6”, “7”, “8”], [“9”,
“10”]]:each>

Usually the idiom would be

def each_records
return to_enum(:each_records) unless block_given?

File.foreach(@file).each_slice(4).each{|elem|
yield elem.map(&:chomp)
}

self
end

Then you can still have an enumerator for the mapping

enum = foo.each_records.map {|elem| whatever logic}

I think you are wanting too much with these methods which hurts
modularity.

Could someboy give me some suggestion?

Reduce what a single method does. Stick with the usual idioms of
Enumerable.

Kind regards

robert

Thanks a lot. You guys did teach me a lot!!

I have refactored my code as below:

class FastqFile

attr_accessor :offset

def initialize file
@file = file
@offset ||= 33
end

def each_records
return to_enum(method) unless block_given?

File.foreach(@file).each_slice(4) do |elem|
  yield elem.map(&:chomp)
end

self

end

[:each_ids, :each_reads, :each_quals].each do |method|
define_method method do |&block|
return to_enum( method ) unless block

  each_records do |id,seq,_id,qv|

    record =
      case method
      when :each_ids
        id
      when :each_reads
        seq
      when :each_quals
        decode(qv)
      end

    block ? block.call(record) : record

  end

  self
end

end

private
def decode(qv)
qv.chars.map {|q| q.ord - 33 }
end
end

On Mon, Sep 23, 2013 at 6:03 PM, felix chang [email protected]
wrote:

@offset ||= 33

end

def each_records
return to_enum(method) unless block_given?

File.foreach(@file).each_slice(4) do |elem|
  yield elem.map(&:chomp)
end

I’d probably pick a different approach:

Record = Struct.new :id, :seq, :qv do
def qv_decoded
qv.chars.map {|q| q.ord - 33 }
end
end

and then

def each_record
return to_enum(:each_record) unless block_given?

File.foreach(@file).each_slice(4) do |rec|
rec.slice! 2,1
yield Record[*rec]
end

self
end

Then you can do

x.each_record.map(&:qv_decoded).each do |qv|
p qv
end

and do not need methods you define below.

Maybe you also want do decode qv immediately when placing in Record.

self

end

[:each_ids, :each_reads, :each_quals].each do |method|
define_method method do |&block|
return to_enum( method ) unless block

  each_records do |id,seq,_id,qv|

Btw, I think “each_record” sounds better.

    record =
      case method
      when :each_ids
        id
      when :each_reads
        seq
      when :each_quals
        decode(qv)
      end

That is pretty inefficient: at the top you do know already which field
you want to access yet you repeat the test for every iteration. If
you really want to avoid copy and paste you could define a block which
does the access before “define_method” call and use it.

[
:each_ids => lambda {|id,seq,_id,qv| id},
:each_reads => lambda {|id,seq,_id,qv| seq},
:each_quals => lambda {|id,seq,_id,qv| decode(qv)},
].each do |method, access|
define_method method do |&block|
return to_enum( method ) unless block

  each_records do |id,seq,_id,qv|
    record = access[id, seq, qv]
    # ...
  end

  self
end

I’d probably just copy and paste instead of going through some hoops to
do this.

    block ? block.call(record) : record

I don’t see how that could possibly work: if there is no block then
“record” is just a local variable which is evaluated. This is
basically a nop since the value will go nowhere.

end
end

Cheers

robert

Dear ALL:

Now, i want to add some test for my FastqFile class.
I want to use minitest. Because it is popular and Ruby 2.0 build-in. But
I do not have any idea to write my test.

How should I fake a file ? use stringio?
If i use StringIO, File.foreach(@file) will not handle it. How do I test
it?

Somebody has good suggestion?

Thanks a log
felix

felix chang wrote in post #1122194:

Yes, I am totally agree with you.

This is my code, and i add openfile method which are going to be an
interface to do unit test. I want to use StringIO to as a fake IO.

class FastqFile

attr_accessor :offset

def initialize file
@file = file
@offset ||= 33
end

def each_record
return to_enum(method) unless block_given?

openfile.each_slice(4) do |elem|
  yield elem.map(&:chomp)
end

self

end

m = {
each_ids: lambda { |id,seq,_id,qv| id },
each_reads: lambda { |id,seq,_id,qv| seq },
each_quals: lambda { |id,seq,_id,qv| decode(qv) }.
}

m.each do |method,access|
define_method method do |&block|
return to_enum( method ) unless block

  each_record do |elem|
    block.call access[*elem]
  end
  self
end

end

private
def openfile
File.foreach(@file) unless @file.instance_of? StringIO
end

def decode(qv)
qv.chars.map {|q| q.ord - 33 }
end
end

END

Robert K. wrote in post #1122180:

On Mon, 23 Sep 2013 22:14:08 +0200
felix chang [email protected] wrote:

Yes, you totally agree with you.

Most engineers do agree with themselves, although others, with Multiple
Personality Disorder, disagree with themselves most all of the time. :wink:

Sorry, I couldn’t resist…

Agreed the only way to fake a file is with a file which acts like a file
though it’s faking it. =)

/dev/null is one of my favorite fake files!

Yes, you totally agree with you.

This is my code, and i add openfile method which are going to be an
interface to do unit test. I want to use StringIO to as a fake IO.

class FastqFile

attr_accessor :offset

def initialize file
@file = file
@offset ||= 33
end

def each_record
return to_enum(method) unless block_given?

openfile.each_slice(4) do |elem|
  yield elem.map(&:chomp)
end

self

end

m = {
each_ids: lambda { |id,seq,_id,qv| id },
each_reads: lambda { |id,seq,_id,qv| seq },
each_quals: lambda { |id,seq,_id,qv| decode(qv) }.
}

m.each do |method,access|
define_method method do |&block|
return to_enum( method ) unless block

  each_record do |elem|
    block.call access[*elem]
  end
  self
end

end

private
def openfile
File.foreach(@file) unless @file.instance_of? StringIO
end

def decode(qv)
qv.chars.map {|q| q.ord - 33 }
end
end

END

Robert K. wrote in post #1122180: