What is the best way to edit a file to eliminate a line usin


#1

This sounds like an easy task, but I’m certain that I have yet to find the
most elegant solution.

I have a text file which I want to process using Ruby in order to update
it. I want to remove the single line which matches a regexp for which I
have a definition. I’d prefer not to explicitly use temporary files.
However (and this is important), I also don’t want to risk losing data
through corruption if the Ruby process is killed unexpectedly… and I
definitely don’t want any other process to read a file other than the
one with, or the one without, the line I’m deleting.

Is there something in a library which would make this task easy?


#2

Steve [RubyTalk] wrote:

Is there something in a library which would make this task easy?

ruby -i.bak -ne 'print if $_ !~ /foo/' stuff.txt


#3

William J. wrote:

ruby -i.bak -ne 'print if $_ !~ /foo/' stuff.txt
Coo… that’s a new one to me… very nifty.

Unfortunately, I misled you… I want to transform a file from within a
CGI script… which means I need to use standard output to generate other
feedback to the user. Is a similar facility available within a Ruby
program without executing a new Ruby process?


#4

steve_rubytalk wrote:

William J. wrote:

ruby -i.bak -ne 'print if $_ !~ /foo/' stuff.txt
Coo… that’s a new one to me… very nifty.

Unfortunately, I misled you… I want to transform a file from within a
CGI script… which means I need to use standard output to generate other
feedback to the user. Is a similar facility available within a Ruby
program without executing a new Ruby process?

File.new( "stuff.txt" ) do | f |
  f.each do |line|
    print unless line =~ /foo/
  end
end

Or if you needed to rewrite it to a different file

File.new( "stuff.txt" ) do | in |
  File.new( "newstuff.txt", "w" ) do |out|
    in.each { | line | out.print line unless line =~ /foo/ }
  end
end


#5

Mike F. wrote:

feedback to the user. Is a similar facility available within a ruby

File.new( "stuff.txt" ) do | in |
  File.new( "newstuff.txt", "w" ) do |out|
    in.each { | line | out.print line unless line =~ /foo/ }
  end
end

File.new doesn’t take a block. Use File.open. Also, “in” is a keyword,
so the above code produces a syntax error. With fixes:

# Like grep -v.

File.open( "stuff.txt" ) do |input|
  File.open( "newstuff.txt", "w" ) do |output|
    input.each { |line| output.print line unless line =~ /foo/ }
  end
end


#6

Mike F. wrote:

File.new( "stuff.txt" ) do | in |
  File.new( "newstuff.txt", "w" ) do |out|
    in.each { | line | out.print line unless line =~ /foo/ }
  end
end
That’s remarkably similar to my current rough-n-ready approach - the
one I consider inelegant…(N.B. the example above doesn’t address the
problem of atomically replacing stuff.txt with newstuff.txt.) I was
thinking that something like this would be preferable:

FileModify.open( 'stuff.txt' ) { |mfile| mfile.delete(/foo/) }

Of course, I’ve just invented FileModify off the top of my head, and I
imagine it being ‘transactional’ - i.e. any exception arising in the
block would prevent any change to stuff.txt. I’d prefer not to go
around re-inventing the wheel if FileModify (or something similar)
already exists. I don’t need it to be desperately scalable or quick -
on the other hand, reliability is a key concern and I’d prefer to use
the neatest possible syntax.
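
For illustration, here is a minimal sketch of how such a transactional wrapper might look. FileModify, its open and delete methods, and the ".tmp.<pid>" sibling file name are all hypothetical, invented to match the call above; this is not an existing library. It assumes the file fits in memory and that a rename within one directory is atomic, which holds on POSIX filesystems but is not guaranteed everywhere.

```ruby
# Hypothetical FileModify, named after the idea above; not an existing library.
class FileModify
  attr_reader :lines

  # Yields an in-memory editor; commits to disk only if the block returns
  # normally. An exception in the block propagates before anything is written.
  def self.open(path)
    editor = new(File.readlines(path))
    yield editor
    tmp_path = "#{path}.tmp.#{Process.pid}" # assumed naming scheme
    begin
      # EXCL makes the open fail rather than reuse a stale temp file.
      File.open(tmp_path, File::WRONLY | File::CREAT | File::EXCL) do |tmp|
        tmp.write(editor.lines.join)
        tmp.flush
        tmp.fsync # ask the OS to put the data on disk before the swap
      end
      File.rename(tmp_path, path) # replaces the original in one step
    ensure
      File.unlink(tmp_path) if File.exist?(tmp_path)
    end
  end

  def initialize(lines)
    @lines = lines
  end

  # Drop every line matching the pattern from the in-memory copy.
  def delete(pattern)
    @lines.reject! { |line| line =~ pattern }
  end
end
```

With this in place, the one-liner above works as written, and an exception raised inside the block leaves stuff.txt untouched.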


#7

Steve [RubyTalk] wrote:

problem of atomically replacing stuff.txt with newstuff.txt.) I was
the neatest possible syntax.

I have done some things like:

def write filename, data
  File.open( filename, 'w' ){ |file| file.write data }
end

begin
  data = IO.read( 'stuff.txt' )
  write 'stuff.txt', data.gsub( /^foo(\n|$)/, '' )
rescue
  write 'stuff.txt.orig', data
end

Zach


#8

zdennis wrote:

end

You’d probably want the rescue statement to be a “rescue Exception” so
you catch any/all errors…

Zach


#9

Ok, slightly more elegant…

class File
  def self.modify filename
    if block_given?
      data = IO.read filename
      begin
        file = File.open filename, 'w'
        yield file, data
      rescue Exception => ex
        # Close the handle first, then put the original contents back.
        file.close if file && !file.closed?
        File.open( filename, 'w' ){ |f| f.write data }
      ensure
        file.close if file && !file.closed?
      end
    end
  end
end

File.modify( 'stuff.txt' ) do |writable_file, original_file_contents|
  writable_file.write original_file_contents.gsub( /foo(\n|$)/, '' )
end

Hope this works better…

Zach


#10

zdennis wrote:

Hope this works better…
It will still permanently and irrecoverably lose data if the process
terminates (e.g. a process hard-limit is exceeded, an administrator
kills the process explicitly, or an old-fashioned power cut occurs) just
after starting to write the updated file… so I wouldn’t consider it
sufficiently robust for my purposes.


#11

Mike F. wrote:

File.new( "stuff.txt" ) do | f |
  f.each do |line|
    print unless line =~ /foo/
  end
end

IO.foreach('stuff'){|s| print s unless s =~ /foo/}


#12

zdennis wrote:

You’d probably want the rescue statement to be a “rescue Exception” so
you catch any/all errors…
Both versions look dangerous to me.

  1. If an exception is raised on opening ‘stuff.txt’ to read then an
    attempt will be made to truncate the file (or to overwrite it with
    whatever happened to be in data previously). [This could be avoided by
    reading before begin.]

  2. If a disk becomes full (or nearly full) during the write operation
    then the rescue will likely not be able to write all the unmodified data
    back, hence permanently losing valuable information.

I need a more robust approach than this. :)


#13

Steve [RubyTalk] wrote:

zdennis wrote:

Hope this works better…

It will still permanently and irrecoverably lose data if the process
terminates (e.g. a process hard-limit is exceeded, an administrator
kills the process explicitly, or an old-fashioned power cut occurs) just
after starting to write the updated file… so I wouldn’t consider it
sufficiently robust for my purposes.

You could modify for your needs:

require 'fileutils'

class File
  def self.modify filename
    if block_given?
      data = IO.read filename
      FileUtils.mv filename, "#{filename}.orig"
      begin
        file = File.open filename, 'w'
        yield file, data
      ensure
        file.close unless file.closed?
      end
    end
  end
end

File.modify( 'stuff.txt' ) do |writable_file, original_file_contents|
  writable_file.write original_file_contents.gsub( /foo(\n|$)/, '' )
end

Zach


#14

I don’t think you can get safe “in-place” modifications without a
tempfile. It looks to me like the best compromise would be to

  • read in the original
  • write the modified file to a temp (use Ruby’s ‘tempfile’ which, I
    think, should create a temp with secure permissions)
  • use the most atomic OS facility you can to move the modified file atop
    the original

On many platforms this might map to Ruby’s File.rename or FileUtils.mv,
I’m not sure…
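
The steps above can be sketched roughly as follows. This is one possible arrangement, not a tested library routine: remove_matching_lines and the ".tmp.<pid>" file name are made up for the example, and it opens a sibling file directly rather than using the tempfile library, so that the temporary file is sure to sit on the same filesystem as the original (File.rename cannot move a file across filesystems). The rename is atomic on POSIX systems; other platforms may behave differently.

```ruby
# Hypothetical helper (not from any library): rewrite `path` without the
# lines matching `pattern`, swapping the new file in with a single rename.
def remove_matching_lines(path, pattern)
  tmp_path = "#{path}.tmp.#{Process.pid}"  # assumed naming scheme
  perm = File.stat(path).mode & 0o777      # carry over the original permissions
  # EXCL makes the open fail rather than reuse a stale temp file.
  File.open(tmp_path, File::WRONLY | File::CREAT | File::EXCL, perm) do |tmp|
    # Stream line by line so the whole file never has to sit in memory.
    File.foreach(path) { |line| tmp.write(line) unless line =~ pattern }
    tmp.flush
    tmp.fsync # ask the OS to put the data on disk before the swap
  end
  File.rename(tmp_path, path) # replaces the original in one step
ensure
  # Only reached with the temp file still present if something failed.
  File.unlink(tmp_path) if tmp_path && File.exist?(tmp_path)
end
```

Called as remove_matching_lines('stuff.txt', /foo/), this leaves other processes seeing either the complete old file or the complete new one. If the process is killed before the rename, the original is intact and at most a stray .tmp file remains to be cleaned up.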

HTH,

- alan

#15

Steve [RubyTalk] wrote:

write ‘stuff.txt.orig’, data
reading before begin.]

  2. If a disk becomes full (or nearly full) during the write operation
    then the rescue will likely not be able to write all the unmodified data
    back, hence permanently losing valuable information.

I need a more robust approach than this. :)

Understood. I don’t know the full extent of your issue. It appears you
can run into a lot of possibilities regarding when the power goes out. It
could happen during any system process, not just Ruby’s.

If this helps lead you to an elegant implementation, great!
Otherwise… maybe it will steer you away from a potential disaster! Good
luck!

Zach


#16

On Dec 2, 2005, at 4:55 AM, Steve [RubyTalk] wrote:

Yup… that seems pretty reasonable to me too…though I have to
say I’m surprised that I seem to be defining something to do this
rather than just using a library component. It’s exactly the sort
of thing I’d have previously been sure someone would have contributed.

I think most of us have faith that, in general, the computer will not
lose power in the middle of an operation.

Which explains why I had to re-type a half-hour’s worth of wiki page
editing when my company’s building lost power a few days ago. :)


#17

Alan Chen wrote:

On many platforms this might map to Ruby’s File.rename or FileUtils.mv,
I’m not sure…

Yup… that seems pretty reasonable to me too…though I have to say
I’m surprised that I seem to be defining something to do this rather
than just using a library component. It’s exactly the sort of thing I’d
have previously been sure someone would have contributed.

Steve