Test if file is binary?

From: Rebhan, Gilbert [mailto:[email protected]]

Is there an existing standard for what is considered a binary file,

If you’re on a *nix (non-Windows) box, you can use the OS’s file
command and just wrap it in Ruby:

irb(main):022:0> def is_bin(f)
irb(main):023:1> %x(file #{f}) !~ /text/
irb(main):024:1> end
=> nil
irb(main):025:0> is_bin "test.rb"
=> false
irb(main):026:0> is_bin "test.txt"
=> false
irb(main):027:0> is_bin "/usr/local/bin/dnscache"
=> true
irb(main):028:0> is_bin "/bin/ps"
=> true
irb(main):029:0> def is_text(f)
irb(main):030:1> %x(file #{f}) =~ /text/
irb(main):031:1> end
=> nil
irb(main):032:0> is_text "test.rb"
=> 27
irb(main):033:0> is_text "test.txt"
=> 16
irb(main):034:0> is_text "/usr/local/bin/dnscache"
=> nil
irb(main):035:0> is_text "/bin/ps"
=> nil
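
The session above interpolates the file name into a shell command and matches /text/ against the whole output, which includes the file name itself. A slightly more defensive sketch follows. The helper names text_description? and binary_file? are made up for illustration, and it assumes a file(1) that supports the -b (brief) option, as the common implementation on Linux, the BSDs and macOS does:

```ruby
# Hypothetical helper: decides from the description that file(1) prints.
def text_description?(description)
  description.match?(/\btext\b/)
end

# Hypothetical wrapper around the external file command.
def binary_file?(path)
  # The array form bypasses the shell, so spaces or quotes in path are harmless.
  # -b (brief) omits the file name, so a name like "context.bin" cannot match;
  # "--" ends option parsing in case the path starts with a dash.
  description = IO.popen(["file", "-b", "--", path], &:read)
  !text_description?(description)
end
```

Called as binary_file?("/bin/ps"), this should agree with is_bin above on typical systems. Note that file reports an empty file as "empty", so this sketch would count it as binary.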

kind regards -botp

On Aug 21, 2007, at 10:21 AM, Rebhan, Gilbert wrote:

considered as textfile


What’s the heuristic in Subversion?

– fxn

2007/8/21, Rebhan, Gilbert [email protected]:

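def self.is_binary?(name)
  # Sample the first 1024 bytes; read returns nil for an empty file.
  sample = File.open(name, "rb") { |io| io.read(1024) } || ""
  return false if sample.empty?
  # Count bytes that are NUL or outside 7-bit ASCII.
  non_ascii = 0
  sample.each_byte { |c| non_ascii += 1 if c >= 128 or c == 0 }
  non_ascii.to_f / sample.size > 0.33
end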

Yep. But I’d leave the “is_” out - that’s handled by the “?” already.



Don’t forget the possibility that a file is encoded in UTF-16 or
UTF-32. To recognize such textual data you need an extra detection
step before the rest.
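
One way such a step might look is a byte-order-mark check in front of the ratio heuristic from earlier in the thread. This is only a sketch: the names BOMS, bom_encoding, binary_sample? and binary? are invented here, and it recognizes UTF-16/UTF-32 only when the file actually starts with a BOM.

```ruby
# Byte-order marks, ordered so that UTF-32LE (FF FE 00 00) is tested
# before UTF-16LE (FF FE), which is a prefix of it.
BOMS = [
  ["\x00\x00\xFE\xFF".b, "UTF-32BE"],
  ["\xFF\xFE\x00\x00".b, "UTF-32LE"],
  ["\xEF\xBB\xBF".b,     "UTF-8"],
  ["\xFE\xFF".b,         "UTF-16BE"],
  ["\xFF\xFE".b,         "UTF-16LE"]
].freeze

# Hypothetical helper: returns the encoding name announced by a BOM, or nil.
def bom_encoding(bytes)
  head = bytes.b # binary copy, so the comparison is byte-wise
  pair = BOMS.find { |bom, _name| head.start_with?(bom) }
  pair && pair[1]
end

# Hypothetical helper: the thread's ratio heuristic, skipped when a BOM is found.
def binary_sample?(bytes)
  return false if bytes.empty? || bom_encoding(bytes)
  suspicious = bytes.each_byte.count { |c| c >= 128 || c == 0 }
  suspicious.to_f / bytes.bytesize > 0.33
end

# Reads the first 1024 bytes of the file, as in the earlier method.
def binary?(path)
  binary_sample?(File.open(path, "rb") { |io| io.read(1024) } || "")
end
```

UTF-16 or UTF-32 text without a BOM still looks binary to this check, because roughly every second byte is NUL. A UTF-16LE file whose first character after the BOM is NUL is also indistinguishable from UTF-32LE here; for the text-or-binary question that ambiguity does not matter.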

Wolfgang WoNáDo
