A directory "grep" in RUBY?

Hi,
Can someone point me to a quick “grep” like function in RUBY? I use glob
all the time, but, I need the power of regular expressions when finding
particular files in directories. I see “grep” in the Pickaxe manual,
but, only as it relates to “.enum,” which, sorry to say, I don’t
understand.

I need something like this, even though, I know this doesn’t work:
Dir.glob(/^f[0-9]{7}\.eps/)

Thanks!
Peter

Hi,

Peter B. writes:

> I need something like this, even though, I know this doesn’t work:
> Dir.glob(/^f[0-9]{7}\.eps/)

% ls
etc1 f1234500.eps f1234502.eps f1234504.eps f1234567.eps
etc2 f1234501.eps f1234503.eps f1234505.eps
% ruby -e 'puts Dir.entries(".").grep(/f\d{7}\.eps/)'
f1234567.eps
f1234500.eps
f1234503.eps
f1234502.eps
f1234504.eps
f1234501.eps
f1234505.eps
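
The same idea as a small script, in case the one-liner is hard to read. This is only a sketch: the method name `eps_files` and the throwaway directory are made up for illustration, and the pattern is anchored with \A and \z here so that it has to match the whole file name.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: names in +dir+ that look like f1234567.eps.
# Dir.entries returns an Array of names (including "." and ".."),
# and Array#grep keeps the ones that match the pattern.
def eps_files(dir)
  Dir.entries(dir).grep(/\Af\d{7}\.eps\z/).sort
end

# Try it in a temporary directory so nothing real is touched.
Dir.mktmpdir do |dir|
  %w[etc1 f1234500.eps f1234567.eps f12345.eps f1234567xeps].each do |name|
    FileUtils.touch(File.join(dir, name))
  end
  puts eps_files(dir)
end
```

Only f1234500.eps and f1234567.eps are printed; the other names have the wrong digit count, no extension, or no literal dot before "eps".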

On Fri, 24 Nov 2006, Peter B. wrote:

> Hi,
> Can someone point me to a quick “grep” like function in RUBY? I use glob

Not the answer you want [below], but look at glark

It is a Ruby grep, “interbred” with find…

> all the time, but, I need the power of regular expressions when finding
> particular files in directories. I see “grep” in the Pickaxe manual,
> but, only as it relates to “.enum,” which, sorry to say, I don’t
> understand.

Dir["*"].grep(/^c/) # all entries matching * (glob) that begin with "c"

enum is Enumerable. Running ri Enumerable gives:

------------------------------------------------------ Class: Enumerable
The +Enumerable+ mixin provides collection classes with several
traversal and searching methods, and with the ability to sort. The
class must provide a method +each+, which yields successive members
of the collection. If +Enumerable#max+, +#min+, or +#sort+ is used,
the objects in the collection must also implement a meaningful
+<=>+ operator, as these methods rely on an ordering between
members of the collection.


Instance methods:

 all?, any?, collect, detect, each_cons, each_slice,
 each_with_index, entries, enum_cons, enum_slice, enum_with_index,
 find, find_all, grep, include?, inject, inject, map, max, member?,
 min, partition, reject, select, sort, sort_by, to_a, to_set, zip
And of course Dir[] gives you an Array, which has an each method (and therefore grep).
    HTH
    Hugh
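
To make the Enumerable point concrete, here is a small sketch. The class `FileList` is invented for the example; the only thing it has to supply is `each`, and `grep` then comes from the mixin.

```ruby
# Hypothetical class, only for illustration: anything that defines
# +each+ and includes Enumerable gets grep, select, map, and so on.
class FileList
  include Enumerable

  def initialize(*names)
    @names = names
  end

  # Enumerable builds all of its methods on top of this one.
  def each(&block)
    @names.each(&block)
  end
end

list = FileList.new('cat.txt', 'dog.txt', 'cow.eps')

# grep compares each element with ===, so a Regexp selects the
# strings that match it.
p list.grep(/\Ac/)      # => ["cat.txt", "cow.eps"]
p list.grep(/\.eps\z/)  # => ["cow.eps"]
```

An Array such as the one returned by Dir[] or Dir.entries behaves the same way, which is why grep can be chained straight onto it.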

WATANABE Hirofumi wrote:

> % ruby -e 'puts Dir.entries(".").grep(/f\d{7}\.eps/)'

Thank you! I’ve never seen “Dir.entries” before. Very cool.

I love this forum!

-Peter

> I need something like this, even though, I know this doesn’t work:
> Dir.glob(/^f[0-9]{7}\.eps/)

You might not want to do this (see the other suggestions in the
thread), but something along these lines would make the above snippet
work as expected:

class Dir
  class << self
    alias_method :__original_glob, :glob
  end

  def self.glob(query,*flags, &blk)
    return __original_glob(query,*flags,&blk) unless query.is_a? Regexp

    files = []

    Dir.new('.').each do |f|
      next unless query =~ f
      if blk.nil?
        files << f
      else
        blk.call(f)
      end
    end

    blk.nil? ? files : nil
  end
end

# Testing it out:

if $0 == __FILE__
  Dir.glob(/\A\..*/) do |f|
    puts "Dot-file: #{f}"
  end

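  backups_regex  = Dir.glob(/~\z/)
  # FNM_DOTMATCH lets '*' match dot-files too, as the regex version does
  backups_string = Dir.glob('*~', File::FNM_DOTMATCH)

  p backups_regex
  p backups_string

  # Compare sorted copies: the two listings may come back in different orders
  p backups_regex.sort == backups_string.sort  # => true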
end
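
If reopening Dir feels too invasive, roughly the same behaviour fits in a standalone method. This is a sketch, not part of any library: the name `grep_dir` and its arguments are made up here.

```ruby
# Hypothetical helper: like the patched Dir.glob above, but without
# touching Dir itself. Yields each matching name when a block is
# given, otherwise returns the matches as an Array.
def grep_dir(pattern, dir = '.')
  matches = Dir.entries(dir).grep(pattern)
  return matches unless block_given?

  matches.each { |name| yield name }
  nil
end

# Block form, used like an iterator:
grep_dir(/\A\.\.?\z/) { |name| puts "Special entry: #{name}" }

# Array form:
p grep_dir(/\A\.\.?\z/).sort  # => [".", ".."]
```

Because the plain Dir.glob is left alone, other code that relies on its normal behaviour is not affected.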

Thanks, Hugh. I went to that web site and downloaded the .gz file. I need
to find a .gz unzipper for my Windows environment to get at it, but
I’ll do that. It looks very interesting, and it’s all written in Ruby.
And thanks for the enum explanation. I think I get it now.

Peter;

> Wow! Thanks, Lou. This looks interesting, but, a lot of it seems beyond
> me at this point. All I want is particular files in a directory.

No problem at all. If you just need it to work, you can copy that
into a file, require it when you need it, and do your thing.

If you’re interested in how it works, keep reading. If not, my
feelings won’t be hurt =)

class Dir

  ##
  # We want to replace the old Dir.glob function with one that also takes a
  # Regexp object.  Now, just to come clean from the beginning, this might not
  # be the best of ideas, since running a shell glob and filtering on regexes
  # aren't quite the same thing semantically.
  #
  # That being said, pragmatically it might be useful, so here we go.  First
  # thing that needs to be done is to move the old version of the function out
  # of the way.  We need to do this because we're still going to use it when
  # the user passes in a string value representing a glob.  The funky
  # class << self notation is because glob is a class method on Dir, not an
  # instance method:

  class << self
    alias_method :__original_glob, :glob
  end

  ##
  # Now we're free to redefine Dir.glob.  Since we're in class Dir, self.glob
  # is really the same thing.  I know the method signature looks a little
  # funky, but it needs to match up with the original glob function.
  #
  # If you look at the rdoc for Dir.glob, you'll see that it can take a bunch
  # of flags, which we won't handle here; however, if the original is to keep
  # working, this information will need to be passed on.  It's the same with
  # the blk parameter.  The & takes the specified block and stuffs it into a
  # variable.  This is done rather than just using yield because the block
  # also needs to be sent to the original method.

  def self.glob(query,*flags, &blk)

    ##
    # First off is the easy case.  If the parameter passed in is not a regex,
    # then we don't do anything.  Just pass it off to the original function.

    return __original_glob(query,*flags,&blk) unless query.is_a? Regexp

    ##
    # Now, if there isn't a block to yield to, we're going to be building up
    # an array with all of the matching files, so that's initialized before
    # the iteration starts.

    files = []

    ##
    # Based on the code you posted above, I assumed that you wanted the regex
    # just to match things in the current directory, so we'll use '.' as the
    # one to iterate over.  If you want all the files recursively, take a look
    # at the Find library.

    Dir.new('.').each do |f|

      ##
      # Here's the check against the regex.  If it doesn't match, skip to the
      # next file in the directory.  Otherwise, what happens next depends on
      # whether a block was passed or not...

      next unless query =~ f

      ##
      # If we did not get a block, just stick the matching file into the
      # array.

      if blk.nil?
        files << f

      ##
      # Otherwise, yield it to the block.

      else
        blk.call(f)
      end

    end

    ##
    # Now to make it match up with the original method, nil is returned if the
    # block was called.  Otherwise, return the array of files.

    blk.nil? ? files : nil
  end
end

##
# And that's about it.  This is just showing how to use the new method, but it
# won't be called if you're requiring it from another script.

if $0 == __FILE__

  ##
  # This is the block way of calling it.  It works just like an iterator.

  Dir.glob(/\A\..*/) do |f|
    puts "Dot-file: #{f}"
  end

  ##
  # Here's without the block.  It returns an array.  We also do the same
  # search with a string just to show that the original functionality is
  # preserved.

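  backups_regex  = Dir.glob(/~\z/)
  # FNM_DOTMATCH lets '*' match dot-files too, as the regex version does
  backups_string = Dir.glob('*~', File::FNM_DOTMATCH)

  p backups_regex
  p backups_string

  # Compare sorted copies: the two listings may come back in different orders
  p backups_regex.sort == backups_string.sort  # => true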
end

Hope that helps.
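
For the recursive case mentioned in the comments, a rough sketch with the standard Find library could look like this. The helper name `find_matching` and the temporary directory layout are invented for the example.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: walk +root+ recursively and return the paths
# whose file name (not the whole path) matches +pattern+.
def find_matching(root, pattern)
  matches = []
  Find.find(root) do |path|
    matches << path if File.file?(path) && File.basename(path) =~ pattern
  end
  matches.sort
end

# Build a small tree in a temporary directory and search it.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'sub'))
  FileUtils.touch(File.join(root, 'f1234567.eps'))
  FileUtils.touch(File.join(root, 'sub', 'f7654321.eps'))
  FileUtils.touch(File.join(root, 'sub', 'notes.txt'))

  puts find_matching(root, /\Af\d{7}\.eps\z/)
end
```

The regex is tested against File.basename because Find.find yields full paths, so a pattern anchored at the start would otherwise never match.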

Lou S. wrote:

> Peter;

Yup, that helps. I get it. Thanks, Lou. A deeper understanding is always
a good thing. . . .

On 24.11.2006 15:38, Peter B. wrote:

> I need something like this, even though, I know this doesn’t work:
> Dir.glob(/^f[0-9]{7}\.eps/)

Dir.glob('f*.eps').grep(/^f[0-9]{7}\.eps$/)

There are plenty more options around.

robert
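
One thing to watch with glob plus grep: once the glob pattern includes a directory, the results carry that directory prefix, so a regex anchored at the start no longer matches. A sketch of one way around that follows; the helper name `eps_in` is made up for the example.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: glob narrows the candidates cheaply, then the
# regex is applied to the bare file name rather than the full path.
def eps_in(dir)
  Dir.glob(File.join(dir, 'f*.eps'))
     .select { |path| File.basename(path) =~ /\Af[0-9]{7}\.eps\z/ }
     .sort
end

# Try it in a temporary directory so nothing real is touched.
Dir.mktmpdir do |dir|
  %w[f1234567.eps f12.eps foo.eps].each do |name|
    FileUtils.touch(File.join(dir, name))
  end
  puts eps_in(dir).map { |path| File.basename(path) }
end
```

All three names pass the glob, but only f1234567.eps survives the regex.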

Robert K. wrote:

> Dir.glob('f*.eps').grep(/^f[0-9]{7}\.eps$/)

Grepping on a glob. That makes sense. I knew that grep could be used
somehow. Thanks, Robert.