Join all text files in a folder, with a single line of Ruby code


#1

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another ‘Ruby Way’, to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

Thanks everyone!

www.twitter.com/luisbebop


#2

luisbebop wrote:

> I wrote a single line of Ruby code that joins all the text files in a
> folder into one big file. I ran some tests, and it works!
> Does anyone know a better way, or another ‘Ruby Way’, to do that?
>
> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

system("cat *.txt >bigfile")

:-?


#3

Hi –

On Sat, 25 Oct 2008, luisbebop wrote:

> I wrote a single line of Ruby code that joins all the text files in a
> folder into one big file. I ran some tests, and it works!
> Does anyone know a better way, or another ‘Ruby Way’, to do that?
>
> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

You can use read rather than readlines, and save a loop:

Dir["*.txt"].each {|f| merged_file.print(File.read(f)) }

or similar.
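
For reference, a self-contained version of this suggestion might look like the sketch below. The helper name `merge_txt` and the temporary directory are assumptions made for illustration, not part of the original snippet; the file list is sorted so the merge order is predictable.

```ruby
require 'tmpdir'

# Hypothetical helper: concatenate every *.txt file in dir into out_path,
# reading each file in one call with File.read instead of line by line.
def merge_txt(dir, out_path)
  File.open(out_path, 'w') do |merged_file|
    # Sort explicitly so the order does not depend on the file system.
    Dir[File.join(dir, '*.txt')].sort.each do |f|
      merged_file.print(File.read(f))
    end
  end
end

# Demo in a throwaway directory so no real files are touched.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.txt'), "first\n")
  File.write(File.join(dir, 'b.txt'), "second\n")
  out = File.join(dir, 'bigfile')
  merge_txt(dir, out)
  puts File.read(out)
end
```

Note that each file is still read completely into memory before it is written, one file at a time.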

David


#4

luisbebop wrote:

> I wrote a single line of Ruby code that joins all the text files in a
> folder into one big file. I ran some tests, and it works!
> Does anyone know a better way, or another ‘Ruby Way’, to do that?
>
> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

“We don’t need no stinkin’ loops!”

ruby -e"puts ARGF.to_a" *.txt >merged

“We still don’t need no stinkin’ loops!”

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}


#5

William J. wrote:

> “We still don’t need no stinkin’ loops!”
>
> File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

Note that ‘puts’ will add a newline to the end of each file which
doesn’t already have one. If you don’t want this, use ‘print’ or ‘write’
instead.
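
A small sketch of the difference, using in-memory `StringIO` objects in place of real files (the sample strings are made up for illustration):

```ruby
require 'stringio'

# Two chunks standing in for file contents: one ends with a newline, one does not.
chunks = ["alpha\n", "beta"]

with_puts  = StringIO.new
with_print = StringIO.new

with_puts.puts(chunks)     # puts flattens the array and adds "\n" where it is missing
with_print.print(*chunks)  # print writes the strings exactly as they are

p with_puts.string   # "alpha\nbeta\n"
p with_print.string  # "alpha\nbeta"
```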


#6

Because I’m learning Ruby, and I want to see some snippets that do
tasks in a single line of Ruby code.
Doing it directly from the shell prompt is no fun!


#7

On 25.10.2008 13:56, Brian C. wrote:

> luisbebop wrote:
>
>> I wrote a single line of Ruby code that joins all the text files in a
>> folder into one big file. I ran some tests, and it works!
>> Does anyone know a better way, or another ‘Ruby Way’, to do that?
>>
>> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }
>
> system("cat *.txt >bigfile")

Why not directly invoke “cat” from the shell prompt? :-)

Kind regards

robert


#8

Without loops, it’s very nice!


#9

I got your point. We need just one loop, to be more efficient.
Really, I don’t need to deal with arbitrarily large files.
As I said, the main goals here are: use Ruby (without shell
commands), and one line of code.
Thanks :-)


#10

On 25.10.2008 15:47, William J. wrote:

> ruby -e"puts ARGF.to_a" *.txt >merged

There’s also

ruby -e '$stdout.write(ARGF.read)' *.txt >merged
ruby -e 'File.open("out","w") {|io| io.write(ARGF.read)}' *.txt

> “We still don’t need no stinkin’ loops!”
>
> File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That’s vastly inefficient since it reads all the files into memory
before writing a single byte. This is not necessary. You can at least
improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

But a proper solution (i.e. one that deals with arbitrarily large files)
would use a fixed buffer size - but that looks ugly on a single line…
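
Spread over several lines, a fixed-buffer version could look roughly like this. It is only a sketch: the helper name `merge_buffered` and the 64 KB default are assumptions for illustration, and the demo uses a tiny buffer just to force several reads.

```ruby
require 'tmpdir'

BUFFER_SIZE = 64 * 1024 # assumed chunk size; any reasonable value works

# Hypothetical helper: append each source file to out_path in fixed-size
# chunks, so memory use stays bounded regardless of file size.
def merge_buffered(paths, out_path, buffer_size = BUFFER_SIZE)
  File.open(out_path, 'wb') do |out|
    paths.each do |path|
      File.open(path, 'rb') do |src|
        # IO#read(n) returns nil at end of file, which ends the loop.
        while (chunk = src.read(buffer_size))
          out.write(chunk)
        end
      end
    end
  end
end

# Demo in a throwaway directory.
Dir.mktmpdir do |dir|
  a = File.join(dir, 'a.txt')
  b = File.join(dir, 'b.txt')
  File.write(a, 'x' * 10)
  File.write(b, 'y' * 5)
  out = File.join(dir, 'mrg')
  merge_buffered([a, b], out, 4) # 4-byte buffer, several reads per file
  puts File.read(out)
end
```

On Ruby 1.9 and later, `IO.copy_stream(src, out)` can replace the inner `while` loop and does the chunked copying internally.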

Kind regards

robert


#11

Brian C. wrote:

> system("cat *.txt >bigfile")

and on Windows
system("copy /b *.txt bigfile")
(make sure that the bigfile name doesn’t match the pattern, so use
bigfile rather than bigfile.txt)
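
In pure Ruby the same pitfall can be avoided by filtering the output file out of the source list. A minimal sketch, assuming a hypothetical helper `merge_excluding_output` and a throwaway directory for the demo:

```ruby
require 'tmpdir'

# Hypothetical helper: merge the *.txt files in dir into out_path, skipping
# out_path itself in case its name matches the pattern.
def merge_excluding_output(dir, out_path)
  target = File.expand_path(out_path)
  # Collect the sources before opening the output, and drop the output itself.
  sources = Dir[File.join(dir, '*.txt')].sort.reject do |f|
    File.expand_path(f) == target
  end
  File.open(out_path, 'w') do |out|
    sources.each { |f| out.write(File.read(f)) }
  end
  sources
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.txt'), "A\n")
  File.write(File.join(dir, 'b.txt'), "B\n")
  out = File.join(dir, 'bigfile.txt') # deliberately matches *.txt
  File.write(out, "stale\n")          # left over from an earlier run
  merge_excluding_output(dir, out)
  puts File.read(out)
end
```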

Cheers,
Mohit.
10/26/2008 | 5:49 PM.


#12

William J. wrote:

> “We don’t need no stinkin’ loops!”
>
> ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and
steps on ARGV.


#13

Nobuyoshi N. wrote:

>> steps on ARGV.
>
> ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'

Very nice! But if you are going that far, why not go all the way:

ruby -pe'1' *


#14

Hi,

At Sun, 26 Oct 2008 04:08:48 +0900,
Joel VanderWerf wrote in [ruby-talk:318574]:

>> ruby -e"puts ARGF.to_a" *.txt >merged
>
> Cheating a bit:
>
> ARGV.replace Dir['*']; print ARGF.read
>
> Not recommended, though, since it reads all the data into memory and
> steps on ARGV.

ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'


#15

> ruby -pe'1' *

Can you explain? Sorry, but I didn’t understand.

Thanks :-)


#16

luisbebop wrote:

>> ruby -pe'1' *
>
> Can you explain? Sorry, but I didn’t understand.

If you run this in a shell, the * expands to all files. The -p switch
means "for each line in the files on the command line, store the line
into $_, and print $_". Usually, you want to use -e'some code' to operate
on $_. In this case, the '1' is a no-op, so it just prints the line
without changing it. HTH.
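
Written out as an ordinary script, the loop that -p wraps around your code is roughly the following. This is a sketch of the behaviour, not the interpreter's actual implementation; the helper name `emulate_p` and the temporary files are assumptions for the demo.

```ruby
require 'tmpdir'

# Hypothetical helper that mimics `ruby -pe'1' file1 file2 ...`.
def emulate_p(paths)
  ARGV.replace(paths)   # stand in for the file names given on the command line
  printed = []
  while gets            # Kernel#gets reads the next line of the ARGV files into $_
    1                   # the -e code runs here; a bare 1 does nothing
    print $_            # -p prints $_ at the end of every iteration
    printed << $_       # also collected so the result can be inspected
  end
  printed
end

printed = Dir.mktmpdir do |dir|
  a = File.join(dir, 'a.txt')
  b = File.join(dir, 'b.txt')
  File.write(a, "one\n")
  File.write(b, "two\n")
  emulate_p([a, b])
end
```

Any code passed with -e that changes `$_` (for example `$_.upcase!`) would change what gets printed.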


#17

From: Joel VanderWerf [mailto:removed_email_address@domain.invalid]

luisbebop wrote:

>> ruby -pe'1' *
>
> Can you explain? Sorry, but I didn’t understand.

If you run this in a shell, the * expands to all files. The -p switch
means "for each line in the files on the command line, store the line
into $_, and print $_". Usually, you want to use -e'some code' to operate
on $_. In this case, the '1' is a no-op, so it just prints the line
without changing it. HTH.

which also means,

ruby -pe '' *


#18

Hi,

At Mon, 27 Oct 2008 14:43:58 +0900,
Joel VanderWerf wrote in [ruby-talk:318648]:

> Ah, you’re right. I tried
>
> ruby -pe'' *
>
> but that failed. With the extra space it works.

I often use -ep to get rid of quotes and “unused literal”
warning.


#19

Peña wrote:

>> prints the line without changing it. HTH.
>
> which also means,
>
> ruby -pe '' *

Ah, you’re right. I tried

ruby -pe'' *

but that failed. With the extra space it works.


#20

From: Nobuyoshi N. [mailto:removed_email_address@domain.invalid]

I often use -ep to get rid of quotes and “unused literal”
warning.

I just tried that, Nobu, but it gives no output:

:~$ ruby -ep *.txt
:~$