I’m trying to parse Ruby files to find all the class definitions in the
file. For each line in the file, I thought I could use the following to
pull out the class name:
\bclass\b(\w+)\b
so then $1 would give me the class name.
But it doesn’t work:
irb(main):001:0> line = "class Article < MyBaseClass"
=> "class Article < MyBaseClass"
irb(main):002:0> line =~ /\bclass\b(\w+)\b/
=> nil
I think I narrowed down the problem to my use of \w, but I can’t
understand why.
For extra credit, anybody know how I can make sure I can ignore comments
and quoted strings? I want to make sure I ignore these things:
if option_exists # handle class options
as well as
puts "You are in a class by yourself"
But those are advanced… if I can just get the first one working I’ll
be grateful!
You’re close, you just forgot to allow for some space between class
and the name. A boundary is a zero-width assertion, so it’s not enough
on its own.
Your \w is right. \b doesn’t work the way you think it does, though.
It doesn’t consume anything, i.e.:
    <-- \b is just before the 'c'
c
l
a
s
s
    <-- \b is in between the 's' and the space
    <-- the space itself doesn't match \w
A
r
t
i
c
l
e
.
.
.
So what you really want is
line =~ /\bclass\s+(\w+)/
irb(main):007:0> line =~ /\bclass\s+(\w+)/
=> 0
irb(main):008:0> $1
=> "Article"
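If it helps to see the two patterns side by side, here is a small sketch you can paste into irb. The constant and method names (BROKEN, FIXED, class_name_in) are made up for illustration only.

```ruby
# Original attempt: \b after "class" matches (zero width), but the next
# character is a space, and \w+ cannot match a space, so the pattern fails.
BROKEN = /\bclass\b(\w+)\b/

# Fixed version: \s+ actually consumes the whitespace before the name.
FIXED = /\bclass\s+(\w+)/

# Returns the first capture group, or nil when the pattern does not match.
def class_name_in(line, pattern)
  m = line.match(pattern)
  m && m[1]
end

line = "class Article < MyBaseClass"
class_name_in(line, BROKEN)  # => nil
class_name_in(line, FIXED)   # => "Article"
```

Note that the original pattern can never match anything: \b right after "class" demands a non-word character next, while \w+ demands a word character in the same spot.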
As for the other questions, comments aren’t SO hard:
/#.*$/
unless of course you want to handle strings, in which case you have to
worry about # inside of strings. I’m not even going to begin to try
to create a regex to match quoted strings; that’s all sorts of
difficult, especially with heredocs and such. I would take a look at
rdoc and see if you can’t manipulate it to get a list of classes for
you.
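To make the comment idea and its catch concrete, here is a rough sketch using that /#.*$/ pattern. The helper name strip_comment is just something I picked for the example.

```ruby
# Naive comment stripper: drop everything from the first '#' to the end
# of the line. It knows nothing about string literals.
def strip_comment(line)
  line.sub(/#.*$/, "")
end

# Works for a plain trailing comment:
strip_comment("if option_exists # handle class options")
# => "if option_exists "

# Pitfall: a '#' inside a quoted string is cut off as if it were a comment.
strip_comment('puts "Issue #42 is a class act"')
# => "puts \"Issue "
```

The same thing happens with string interpolation (#{...}), which is why this only gets you part of the way.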
Anchoring the pattern to the start of the line, with optional leading
whitespace, will only match “class …” at the beginning of a line or with
only spaces to its left. It’s certainly not impossible to get false
positives or negatives this way, but in the normal course of a
normally-written Ruby program file it should be close to 100%.
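The pattern this reply describes isn't shown in the text above, so the one below is my guess at something that behaves as described, not necessarily what was originally posted. ANCHORED and anchored_class_name are names invented for this sketch.

```ruby
# Assumed pattern: start of line, optional whitespace, the keyword,
# at least one whitespace character, then the name.
ANCHORED = /^\s*class\s+(\w+)/

# Returns the captured name, or nil when the line is not a class definition.
def anchored_class_name(line)
  m = line.match(ANCHORED)
  m && m[1]
end

anchored_class_name("class Article < MyBaseClass")              # => "Article"
anchored_class_name("  class Comment")                          # => "Comment"
anchored_class_name("if option_exists # handle class options")  # => nil
anchored_class_name('puts "You are in a class by yourself"')    # => nil
```

This sidesteps both of the "extra credit" examples without any real comment or string handling, simply because "class" is not the first thing on those lines.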
Don’t forget, though, that you might get “::” in a class name, like
this:
module M
end
class M::C
end
and just going for \w+ will give you the module name, not the class
name.
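One way to cope with that, as a sketch only: let the capture accept "::"-separated constant names. The QUALIFIED pattern and the qualified_class_name helper are hypothetical, and this does not try to handle forms such as a leading "::" or "class << self".

```ruby
# Hypothetical pattern: a constant name, optionally followed by
# one or more ::Name segments.
QUALIFIED = /^\s*class\s+([A-Z]\w*(?:::[A-Z]\w*)*)/

# Returns the full (possibly namespaced) name, or nil if there is no match.
def qualified_class_name(line)
  m = line.match(QUALIFIED)
  m && m[1]
end

"class M::C"[/^\s*class\s+(\w+)/, 1]                 # => "M" (the module, not the class)
qualified_class_name("class M::C")                   # => "M::C"
qualified_class_name("class Article < MyBaseClass")  # => "Article"

# If only the last segment is wanted:
qualified_class_name("class M::C").split("::").last  # => "C"
```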