Forum: Ruby implement "paste" using ruby

Oliver (Guest)
on 2009-03-04 06:30
(Received via mailing list)
Hi all,

I have a programming question: in the *NIX world, there is a small
utility named "paste" that combines several files column by
column. For example:

file "x.dat"'s content is:
1
2
3
...

file "y.dat"'s content is:
a
b
c
...

then "paste x.dat y.dat > z.dat" will generate z.dat as:
1 a
2 b
3 c
...

If I want to do this in Ruby, where the number of files is variable and
each file can potentially be huge, what would be the most
cost-efficient way of implementing it?

Thanks in advance.

Oliver
Brian C. (Guest)
on 2009-03-04 11:05
Oliver wrote:
> If I want to do it in Ruby, and number of files is a variable, and
> each file itself can be potentially huge ... what would be the most
> cost efficient way of implementing this?

Assuming the number of files to paste together is reasonable (say under
1000), I'd simply open all the files up front:

  files = ARGV.map { |fname| File.open(fname) }

and then inside a loop use 'gets' to pick one line from each, and output
those values together.
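
A minimal sketch of that loop, for illustration only. The helper name `paste_files` and its `out:` and `sep:` keywords are hypothetical, not part of any library; the tab separator is an assumption that mirrors the default of the paste utility. Only one line per file is held in memory at a time, so file size should not matter much.

```ruby
# Hypothetical helper: open every file up front, then read one line from
# each file per pass until all of them are exhausted.
def paste_files(paths, out: $stdout, sep: "\t")
  files = paths.map { |path| File.open(path) }
  begin
    loop do
      lines = files.map(&:gets)        # gets returns nil once a file hits EOF
      break if lines.all?(&:nil?)      # stop when every file is exhausted
      # nil.to_s is "", so shorter files contribute empty columns
      out.puts lines.map { |line| line.to_s.chomp }.join(sep)
    end
  ensure
    files.each(&:close)
  end
end
```

From a script this could be called as `paste_files(ARGV)`, with the output redirected to the target file.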

HTH,

Brian.
Ryan D. (Guest)
on 2009-03-04 12:14
(Received via mailing list)
On Mar 3, 2009, at 20:28 , Oliver wrote:

> If I want to do it in Ruby, and number of files is a variable, and
> each file itself can be potentially huge ... what would be the most
> cost efficient way of implementing this?

The most cost efficient way is by not reinventing the wheel:

   system "paste", *ARGV
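
For illustration, a minimal sketch of wrapping that call so its outcome can be inspected. `run_paste` and its `command:` keyword are hypothetical names introduced here; the only behavior relied on is that Kernel#system returns true for a zero exit status, false for a non-zero one, and nil when the command cannot be executed at all (for example, because it is not installed).

```ruby
# Hypothetical wrapper around the external utility.
# Returns :ok, :failed, or :unavailable depending on what system reports.
def run_paste(paths, command: "paste")
  case system(command, *paths)
  when true  then :ok           # command ran and exited with status 0
  when false then :failed       # command ran but exited non-zero
  else            :unavailable  # nil: command could not be executed
  end
end
```

A caller could use the `:unavailable` result to decide whether a pure-Ruby fallback is needed.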
Yossef M. (Guest)
on 2009-03-04 18:24
(Received via mailing list)
On Mar 4, 4:12 am, Ryan D. <removed_email_address@domain.invalid> wrote:
> The most cost efficient way is by not reinventing the wheel:
>
>    system "paste", *ARGV

To me, the reasoning behind the question seemed to be what to do if
the system has Ruby installed, but not the 'paste' utility. Or maybe
it's simply an exercise in handling memory usage, speed, and overall
efficiency.

Either way, this answer is not nearly as helpful as you seem to think
it is.

But that's just what I think. Maybe Oliver has different ideas.
Rick D. (Guest)
on 2009-03-04 19:35
(Received via mailing list)
On Tue, Mar 3, 2009 at 11:28 PM, Oliver <removed_email_address@domain.invalid> wrote:

> ...
> 3 c
> ...
>
>
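#!/usr/bin/env ruby -wKU

# Open every input file up front.
files = ARGV.map { |fname| File.open(fname) }

# Read one line from each file per pass. gets returns nil at EOF, so the
# loop ends once every file is exhausted; shorter files yield empty columns.
while (lines = files.map { |file| file.gets }).any?
  puts lines.map { |line| line.to_s.chomp }.join("\t")
end

files.each(&:close)
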
--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale
Ryan D. (Guest)
on 2009-03-04 21:43
(Received via mailing list)
On Mar 4, 2009, at 08:22 , Yossef M. wrote:

> Either way, this answer is not nearly as helpful as you seem to think
> it is.

I disagree. The OP emphasized the _most_ cost-efficient way. In terms
of both CPU time _and_ developer time, my method reigns
supreme. I learned this valuable lesson while at Gemstone. They used
Unix sort for certain types of sorts (especially for large data) because you
can't beat a 20+ year old tool, so why reinvent the wheel?