Test if file is binary?

From: Rebhan, Gilbert [mailto:[email protected]]

Is there an existing standard for what is considered a binary file,

If you’re on a *nix (non-Windows) box, you should use the OS file
command and just wrap it in Ruby:

irb(main):022:0> def is_bin(f)
irb(main):023:1> %x(file #{f}) !~ /text/
irb(main):024:1> end
=> nil
irb(main):025:0> is_bin "test.rb"
=> false
irb(main):026:0> is_bin "test.txt"
=> false
irb(main):027:0> is_bin "/usr/local/bin/dnscache"
=> true
irb(main):028:0> is_bin "/bin/ps"
=> true
irb(main):029:0> def is_text(f)
irb(main):030:1> %x(file #{f}) =~ /text/
irb(main):031:1> end
=> nil
irb(main):032:0> is_text "test.rb"
=> 27
irb(main):033:0> is_text "test.txt"
=> 16
irb(main):034:0> is_text "/usr/local/bin/dnscache"
=> nil
irb(main):035:0> is_text "/bin/ps"
=> nil
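
Two caveats with the wrapper above: %x passes the path through the shell, and the output of file starts with the file name, so a name that itself contains "text" matches the regex. A minimal sketch that avoids both, assuming Ruby 1.9 or later and a file implementation that supports the -b (brief) option; the method names are hypothetical:

```ruby
# Hypothetical helper names; assumes a `file` utility on the PATH
# that accepts -b (brief output, without the file name).
def file_description(path)
  # The array form of IO.popen bypasses the shell, so spaces or shell
  # metacharacters in the path are not interpreted.
  IO.popen(["file", "-b", "--", path], &:read).to_s.strip
end

# Classify a description string as returned by file_description.
def text_description?(description)
  !!(description =~ /text/i)
end

def binary?(path)
  !text_description?(file_description(path))
end
```

Called as binary?("/bin/ps"), this returns true or false rather than a match offset or nil.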

kind regards -botp

On Aug 21, 2007, at 10:21 AM, Rebhan, Gilbert wrote:

always
considered as textfile

??

What’s the heuristic in Subversion?

– fxn

2007/8/21, Rebhan, Gilbert [email protected]:

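def self.is_binary?(name)
  # Sample the first 1024 bytes; read returns nil for an empty file.
  sample = File.open(name, "rb") { |io| io.read(1024) } || ""
  return false if sample.empty?
  # Count bytes that are NUL or outside 7-bit ASCII.
  non_ascii = 0
  sample.each_byte { |c| non_ascii += 1 if c >= 128 || c == 0 }
  non_ascii.to_f / sample.bytesize > 0.33
end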

Yep. But I’d leave the “is_” out - that’s handled by the “?” already.
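
With that naming advice applied, the same byte-ratio heuristic might look like the sketch below. The module name FileSniffer and the split into a sample-level helper are assumptions made for illustration; the thread does not show the enclosing class.

```ruby
# Hypothetical module name; the enclosing class is not shown in the thread.
module FileSniffer
  # True when more than a third of the sampled bytes are NUL or non-ASCII.
  def self.binary_sample?(sample)
    return false if sample.empty?
    suspicious = sample.each_byte.count { |c| c >= 128 || c == 0 }
    suspicious.to_f / sample.bytesize > 0.33
  end

  # Reads the first 1024 bytes of the named file and applies the heuristic.
  def self.binary?(name)
    binary_sample?(File.open(name, "rb") { |io| io.read(1024) } || "")
  end
end
```

Keeping the check on a byte string separate from the file read makes the threshold easy to try out on literal data.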

Cheers

robert

Don’t forget the possibility that a file is encoded in UTF-16 or
UTF-32. To recognize such text you need an extra detection step
ahead of the rest.
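
One possible form of that extra step is to look for a byte-order mark at the start of the sample. This is only a sketch, assuming Ruby 2.0 or later (for String#b); the names BOMS and bom_encoding are hypothetical, and UTF-16 or UTF-32 files written without a BOM are not detected this way.

```ruby
# Byte-order marks, longest first so that UTF-32LE (FF FE 00 00)
# is not mistaken for UTF-16LE (FF FE).
BOMS = [
  ["\x00\x00\xFE\xFF".b, "UTF-32BE"],
  ["\xFF\xFE\x00\x00".b, "UTF-32LE"],
  ["\xFE\xFF".b, "UTF-16BE"],
  ["\xFF\xFE".b, "UTF-16LE"],
]

# Returns the encoding name suggested by a leading BOM, or nil if none.
def bom_encoding(sample)
  bytes = sample.b
  match = BOMS.find { |bom, _name| bytes.start_with?(bom) }
  match && match[1]
end
```

A caller could treat the file as text whenever bom_encoding returns a name, and fall back to the byte-ratio check otherwise.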

Wolfgang WoNáDo
