Forum: Ruby Join all text files in a folder, with a single line of Ruby code

luisbebop (Guest)
on 2008-10-25 15:45
(Received via mailing list)
I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or a more 'Ruby Way' to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

Thanks everyone!

www.twitter.com/luisbebop
Brian C. (Guest)
on 2008-10-25 15:56
luisbebop wrote:
> I wrote a single line of Ruby code that joins all the text files in a
> folder into one big file. I ran some tests, and it works!
> Does anyone know a better way, or a more 'Ruby Way' to do that?
>
> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

system("cat *.txt >bigfile")

:-?
David A. Black (Guest)
on 2008-10-25 16:56
(Received via mailing list)
Hi --

On Sat, 25 Oct 2008, luisbebop wrote:

> I wrote a single line of Ruby code that joins all the text files in a
> folder into one big file. I ran some tests, and it works!
> Does anyone know a better way, or a more 'Ruby Way' to do that?
>
> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

You can use read rather than readlines, and save a loop:

   Dir["*.txt"].each {|f| merged_file.print(File.read(f)) }

or similar.
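
Spelled out in full, that suggestion might look like the sketch below. The method name merge_txt and its two arguments are made up for illustration; the only assumption is that the output file name does not match the glob pattern.

```ruby
# Hypothetical helper: concatenate every file matching +pattern+ into +out_path+.
def merge_txt(pattern, out_path)
  File.open(out_path, "w") do |merged_file|
    # File.read returns the whole file as one String, so no inner
    # per-line loop is needed. Sorting makes the order predictable.
    Dir[pattern].sort.each { |f| merged_file.print(File.read(f)) }
  end
end
```

Calling merge_txt("*.txt", "bigfile") from the folder in question would then do the same job as the one-liner.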


David
William J. (Guest)
on 2008-10-25 17:51
(Received via mailing list)
luisbebop wrote:

> I wrote a single line of Ruby code that joins all the text files in a
> folder into one big file. I ran some tests, and it works!
> Does anyone know a better way, or a more 'Ruby Way' to do that?
>
> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged


"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}
Brian C. (Guest)
on 2008-10-25 18:03
William J. wrote:
> "We still don't need no stinkin' loops!"
>
> File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

Note that 'puts' will add a newline to the end of each file which
doesn't already have one. If you don't want this, use 'print' or 'write'
instead.
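
A small sketch of the difference, using in-memory StringIO objects instead of real files. The two helper names are hypothetical and exist only for this illustration.

```ruby
require "stringio"

# Joins the chunks with puts: a newline is appended to each chunk
# that does not already end in one.
def join_with_puts(chunks)
  io = StringIO.new
  io.puts(chunks)
  io.string
end

# Joins the chunks with print: the bytes are written exactly as given.
def join_with_print(chunks)
  io = StringIO.new
  chunks.each { |chunk| io.print(chunk) }
  io.string
end

chunks = ["ends with newline\n", "no trailing newline"]
join_with_puts(chunks)   # => "ends with newline\nno trailing newline\n"
join_with_print(chunks)  # => "ends with newline\nno trailing newline"
```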
Robert K. (Guest)
on 2008-10-25 18:42
(Received via mailing list)
On 25.10.2008 13:56, Brian C. wrote:
> luisbebop wrote:
>> I wrote a single line of Ruby code that joins all the text files in a
>> folder into one big file. I ran some tests, and it works!
>> Does anyone know a better way, or a more 'Ruby Way' to do that?
>>
>> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }
>
> system("cat *.txt >bigfile")

Why not directly invoke "cat" from the shell prompt? :-)

Kind regards

  robert
luisbebop (Guest)
on 2008-10-25 19:09
(Received via mailing list)
Because I'm learning Ruby, and I want to see some snippets that do
tasks like this in a single line of Ruby code.
Doing it directly from the prompt is no fun!
luisbebop (Guest)
on 2008-10-25 19:11
(Received via mailing list)
Without loops, it's very nice!
Robert K. (Guest)
on 2008-10-25 19:25
(Received via mailing list)
On 25.10.2008 15:47, William J. wrote:
>
> ruby -e"puts ARGF.to_a" *.txt >merged

There's also

ruby -e '$stdout.write(ARGF.read)' *.txt >merged
ruby -e 'File.open("out","w") {|io| io.write(ARGF.read)}' *.txt

> "We still don't need no stinkin' loops!"
>
> File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That's vastly inefficient since it reads all the files into memory
before writing a single byte.  This is not necessary.  You can at least
improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

But a proper solution (i.e. one that deals with arbitrarily large files)
would use a fixed buffer size - but that looks ugly on a single line...
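
For reference, a multi-line version with a fixed buffer might look like the sketch below. The name merge_in_chunks and the 64 KiB default are illustrative choices, not something taken from this thread.

```ruby
# Hypothetical helper: copy every file matching +pattern+ into +out_path+,
# holding at most +chunk_size+ bytes of file data in memory at a time.
def merge_in_chunks(pattern, out_path, chunk_size = 64 * 1024)
  File.open(out_path, "wb") do |out|
    Dir.glob(pattern).sort.each do |name|
      # Skip the output file itself in case it matches the pattern.
      next if File.expand_path(name) == File.expand_path(out_path)

      File.open(name, "rb") do |input|
        # IO#read(n) returns nil at end of file, which ends the loop.
        while (chunk = input.read(chunk_size))
          out.write(chunk)
        end
      end
    end
  end
end
```

Memory use then stays bounded by the buffer size no matter how large the input files are.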

Kind regards

  robert
luisbebop (Guest)
on 2008-10-25 20:21
(Received via mailing list)
I get your point. We need one loop to be more efficient.
Really, I don't need to deal with arbitrarily large files.
As I said, the main goals here are: use Ruby (without shell
commands), and one line of code.
Thanks :)
Joel VanderWerf (Guest)
on 2008-10-25 23:40
(Received via mailing list)
William J. wrote:
> "We don't need no stinkin' loops!"
>
> ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and
steps on ARGV.
Mohit S. (Guest)
on 2008-10-26 11:51
(Received via mailing list)
Brian C. wrote:
> system("cat *.txt >bigfile")
>

and on Windows
system("copy /b *.txt bigfile")
(make sure that the bigfile name doesn't match the pattern, so use
bigfile rather than bigfile.txt)

Cheers,
Mohit.
10/26/2008 | 5:49 PM.
Nobuyoshi N. (Guest)
on 2008-10-26 16:07
(Received via mailing list)
Hi,

At Sun, 26 Oct 2008 04:08:48 +0900,
Joel VanderWerf wrote in [ruby-talk:318574]:
> > ruby -e"puts ARGF.to_a" *.txt >merged
>
> Cheating a bit:
>
> ARGV.replace Dir['*']; print ARGF.read
>
> Not recommended, though, since it reads all the data into memory and
> steps on ARGV.

ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'
Joel VanderWerf (Guest)
on 2008-10-26 20:49
(Received via mailing list)
Nobuyoshi N. wrote:
>> steps on ARGV.
>
> ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'
>

Very nice! But if you are going that far, why not go all the way:

ruby -pe'1' *
luisbebop (Guest)
on 2008-10-27 03:57
(Received via mailing list)
> ruby -pe'1' *

Can you explain ? Sorry, but I didn't understand.

Thanks :)
Joel VanderWerf (Guest)
on 2008-10-27 04:45
(Received via mailing list)
luisbebop wrote:
>> ruby -pe'1' *
>
> Can you explain ? Sorry, but I didn't understand.

If you run this in a shell, the * expands to all files. The -p switch
means "for each line in the files on the command line, store the line
into $_, and print $_". Usually, you want to use -e'some code' to operate
on $_. In this case, the '1' is a no-op, so it just prints the line
without changing it. HTH.
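
As a rough sketch of that behavior in plain Ruby: the method below walks the given files line by line and prints each line unchanged. It is only an approximation of -p (the real switch reads through ARGF and keeps the line in $_), and the name print_each_line is made up for this example.

```ruby
# Hypothetical stand-in for what `ruby -pe '1' file1 file2 ...` does.
def print_each_line(paths, out = $stdout)
  paths.each do |path|
    File.foreach(path) do |line|
      # This is where the -e script would run; '1' does nothing.
      out.print line
    end
  end
end
```

Running print_each_line(Dir["*.txt"].sort) in the folder would print the concatenated contents to standard output.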
Peña, Botp (Guest)
on 2008-10-27 07:14
(Received via mailing list)
From: Joel VanderWerf [mailto:removed_email_address@domain.invalid]
# luisbebop wrote:
# >> ruby -pe'1' *
# >
# > Can you explain ? Sorry, but I didn't understand.
# If you run this in a shell, the * expands to all files.
# The -p switch means "for each line in the files on the
# command line, store the line into $_, and print $_".
# Usually, you want to use -e'some code' to operate
# on $_. In this case, the '1' is a no-op, so it just
# prints the line without changing it. HTH.

which also means,

  ruby -pe '' *
Joel VanderWerf (Guest)
on 2008-10-27 07:45
(Received via mailing list)
Peña wrote:
> # prints the line without changing it. HTH.
>
> which also means,
>
>   ruby -pe '' *

Ah, you're right. I tried

ruby -pe'' *

but that failed. With the extra space it works.
Nobuyoshi N. (Guest)
on 2008-10-27 08:40
(Received via mailing list)
Hi,

At Mon, 27 Oct 2008 14:43:58 +0900,
Joel VanderWerf wrote in [ruby-talk:318648]:
> Ah, you're right. I tried
>
> ruby -pe'' *
>
> but that failed. With the extra space it works.

I often use -ep to get rid of quotes and "unused literal"
warning.
Peña, Botp (Guest)
on 2008-10-27 09:11
(Received via mailing list)
From: Nobuyoshi N. [mailto:removed_email_address@domain.invalid]
# I often use -ep to get rid of quotes and "unused literal"
# warning.

i just tried that nobu, but it gives no output

:~$ ruby -ep *.txt
:~$
Nobuyoshi N. (Guest)
on 2008-10-27 12:00
(Received via mailing list)
Hi,

At Mon, 27 Oct 2008 16:10:31 +0900,
Peña, Botp <removed_email_address@domain.invalid> wrote in [ruby-talk:318654]:
> i just tried that nobu, but it gives no output
>
> :~$ ruby -ep *.txt

You need the -p option.

  ruby -pep *.txt
Peña, Botp (Guest)
on 2008-10-27 12:23
(Received via mailing list)
From: Nobuyoshi N. [mailto:removed_email_address@domain.invalid]
# At Mon, 27 Oct 2008 16:10:31 +0900,
# Peña, Botp <removed_email_address@domain.invalid> wrote in [ruby-talk:318654]:
# > i just tried that nobu, but it gives no output
# >
# > :~$ ruby -ep *.txt
#
# You need the -p option.
#
#   ruby -pep *.txt

you're saying -ep is different from -e -p ?
i'm asking since i do not see it in ruby -h

thanks for the info -botp
Nobuyoshi N. (Guest)
on 2008-10-27 14:17
(Received via mailing list)
Hi,

At Mon, 27 Oct 2008 19:22:57 +0900,
Peña, Botp <removed_email_address@domain.invalid> wrote in [ruby-talk:318662]:
> # > i just tried that nobu, but it gives no output
> # >
> # > :~$ ruby -ep *.txt
> #
> # You need the -p option.
> #
> #   ruby -pep *.txt
>
> you're saying -ep is different from -e -p ?
> i'm asking since i do not see it in ruby -h

Each -e needs a following expression, so the 'p' after it is
Kernel#p. However, -p doesn't take an argument, so the 'e' after it is
-e.
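
The reason a bare p works as a do-nothing script is that Kernel#p, called without arguments, prints nothing and returns nil. A tiny sketch (the method name is hypothetical):

```ruby
# Hypothetical wrapper around the one-character script used with -pep.
def no_op_script
  p # no arguments: nothing is printed, and the result is nil
end
```

So under -p each input line is still printed by the switch itself, while the script contributes no output of its own.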
luisbebop (Guest)
on 2008-10-27 15:25
(Received via mailing list)
Thanks for all replies. A lot of interesting solutions!

www.twitter.com/luisbebop
botp (Guest)
on 2008-10-27 15:27
(Received via mailing list)
On Mon, Oct 27, 2008 at 8:17 PM, Nobuyoshi N. <removed_email_address@domain.invalid> wrote:
> Each -e needs a following expression, so 'p' after it is
> Kernel#p, however, -p doesn't take arguments so 'e' after it is -e.
>

dumb me. all the time i thought the quotes were required for -e and
that a space should separate it from the expression :))
so now even "ruby -pe0 *.txt" should work!

many thanks for the enlightenment, nobu.
kind regards -botp
John C. (Guest)
on 2008-10-28 02:19
(Received via mailing list)
On Sun, 26 Oct 2008, Robert K. wrote:

>> File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}
>
> That's vastly inefficient since it reads all the files into memory before
> writing a single byte.  This is not necessary.  You can at least improve to
>
> File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

If we're into fast and ugly...

We need a Ruby interface to Linux "splice"...

   splice() moves data between two file descriptors without copying
   between kernel address space and user address space. It transfers up
   to len bytes of data from the file descriptor fd_in to the file
   descriptor fd_out, where one of the descriptors must refer to a pipe.

See "man splice" for more.
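
Short of a binding for splice itself, Ruby's built-in IO.copy_stream (available since 1.9) comes close in spirit: it copies from a source to a destination without the caller managing a buffer. Whether it ends up using a zero-copy system call underneath depends on the platform and Ruby version, so treat that part as an assumption. The method name below is made up, and the sketch assumes the output name does not match the pattern.

```ruby
# Hypothetical helper built on IO.copy_stream rather than splice.
def merge_with_copy_stream(pattern, out_path)
  File.open(out_path, "wb") do |out|
    Dir.glob(pattern).sort.each do |name|
      # The source may be given as a file name; the destination is an open IO.
      IO.copy_stream(name, out)
    end
  end
end
```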


John C.                             Phone : (64)(3) 358 6639
Tait Electronics                        Fax   : (64)(3) 359 4632
PO Box 1645 Christchurch                Email : removed_email_address@domain.invalid
New Zealand