Way to many calls to Dir#[]


#1

I cannot seem to debug this issue… my entire library does not call
Dir#[] a single time, however when I used ruby-prof I get:

17.26 0.14 0.14 0.00 0.00 594 Class::Dir#[]

unnecessarily slowing things down. I have profiled all the other
libraries I am requiring, and none of them produce this same number of
calls, so I suppose it can only be something I am calling within, but
this seems very weird… any ideas?


#2

Tj Holowaychuk wrote:

I cannot seem to debug this issue… my entire library does not call
Dir#[] a single time, however when I used ruby-prof I get:

17.26 0.14 0.14 0.00 0.00 594 Class::Dir#[]

unnecessarily slowing things down. I have profiled all the other
libraries I am requiring, and none of them produce this same number of
calls, so I suppose it can only be something I am calling within, but
this seems very weird… any ideas?

There is probably some elegant debug or tracing way to do this, but
personally I’d just stick this at the top of my program:

class <<Dir
alias :old_square :[]
def
STDERR.puts “Calling Dir[#{args.inspect}] from #{caller.inspect}”
old_square(*args)
end
end


#3

Thanks! that is helping, its all from rubygems
‘rubygems/gem_path_searcher.rb’ file, just seems very excessive. My gem
is ‘commander’ and it searches for mine in a good 25 locations or so,
then moves on to looking for Win32API and termios in at least 200
directories :S, neither of which I am using, the only library I am using
is highline, which does not require those either… just seems odd
unless im completely retarded and something is brutally wrong with my
requires

$:.unshift File.dirname(FILE)

require ‘rubygems’
require ‘highline/import’
require ‘commander/version’
require ‘commander/user_interaction’
require ‘commander/fileutils’
require ‘commander/core_ext’
require ‘commander/runner’
require ‘commander/command’
require ‘commander/help_formatters’
require ‘commander/import’


#4

nvm i fucked that up… it seems like any required gem is doing that :S


#5

its in highline!! if you put this in any file it calls Dir[] like nuts

require ‘rubygems’
require ‘highline/import’

BUT for anyone else who might run across this, the following cuts this
down to a little over 100 calls to Dir#[], where as the previous is 600+

require ‘rubygems’
require ‘highline’
require ‘highline/import’


#6

On Dec 9, 2008, at 3:44 PM, Tj Holowaychuk wrote:

its in highline!!

I disagree. :slight_smile:

$ grep -c ‘Dir’ ruby -e 'puts Dir["lib/**/*.rb"]'
lib/highline/color_scheme.rb:0
lib/highline/import.rb:0
lib/highline/menu.rb:0
lib/highline/question.rb:1
lib/highline/system_extensions.rb:0
lib/highline.rb:0

If you look into that single usage of Dir[], you will see that you
must construct a HighLine::Question object and call selection() on it
to trigger that code. This does not happen when HighLine loads.

require ‘highline’
require ‘highline/import’

I assume that’s a RubyGems issue with how it figures out to load
highline/import. I recommend asking your question on their mailing
list.

James Edward G. II


#7

Im not using windows either haha. just took a look at the highline
source looks like it conditionally supports termios so your right there.
I see your point though, if its a long running process I guess 0.35 of a
second is not much, just seemed really sketchy to me being new to Ruby


#8

Tj Holowaychuk wrote:

Thanks! that is helping, its all from rubygems
‘rubygems/gem_path_searcher.rb’ file, just seems very excessive.

Startup with rubygems can be pretty slow. Of course, once your app is
running it should be fine; you can turn on profiling later in the code.

My gem
is ‘commander’ and it searches for mine in a good 25 locations or so

Sounds reasonable - I’d expect /usr/lib/ruby/gems/1.8/gems/*/lib

then moves on to looking for Win32API and termios in at least 200
directories :S

Not sure why so many directories there; you’d have to analyse that list
yourself.

neither of which I am using, the only library I am using
is highline, which does not require those either…

Clearly, something is requiring them. But I’m not a Windows user, so I
can’t help you pin this down further. But since highline is something to
do with terminal interaction, I’d certainly expect it to try to load
termios, and perhaps Win32API on a Windows box.


#9

Thanks for investigating, I thought it was for sure some strange
dependency loop or something I was creating, but even with my tiny file
with those two requires it is still being odd… I posted on rubyforge to
see what the gem guys think, might be a rubygems 1.3.x thing


#10

Tj Holowaychuk wrote:

Im not using windows either haha. just took a look at the highline
source looks like it conditionally supports termios so your right there.

And win32 too:

$ pwd
/usr/lib/ruby/gems/1.8/gems/highline-1.5.0
$ find . -type f | xargs grep -i win32
./lib/highline/system_extensions.rb: require “Win32API”

See if we’re on Windows.

./lib/highline/system_extensions.rb: CHARACTER_MODE = “Win32API”

For Debugging purposes only.

./lib/highline/system_extensions.rb: Win32API.new(“crtdll”,
“_getch”, [ ], “L”).Call
./lib/highline/system_extensions.rb: m_GetStdHandle
= Win32API.new( ‘kernel32’,
./lib/highline/system_extensions.rb: m_GetConsoleScreenBufferInfo
= Win32API.new(
./test/tc_highline.rb:if HighLine::CHARACTER_MODE == “Win32API”
./test/tc_highline.rb: assert(%w[Win32API termios
stty].include?(HighLine::CHARACTER_MODE))

However there is only a single require of Win32API, and if it fails a
LoadError is rescued which should cause it to install termios methods
instead. It shouldn’t attempt to load Win32API again after that.

I see your point though, if its a long running process I guess 0.35 of a
second is not much, just seemed really sketchy to me being new to Ruby

Now I know you’re not on Windows, let me look a bit more. Here’s what I
get (P4 2.8GHz, 1.8.6p111, rubygems 1.2.0), giving the second run times
for each test:

$ time for i in $(seq 1 10); do ruby -rubygems -e0; done

real 0m0.646s
user 0m0.568s
sys 0m0.060s

$ time for i in $(seq 1 10); do ruby -rubygems -e’require “termios”’;
done

real 0m0.707s
user 0m0.612s
sys 0m0.072s

$ time for i in $(seq 1 10); do ruby -rubygems -e’require “highline”’;
done

real 0m2.680s
user 0m2.380s
sys 0m0.244s

So there is a substantial startup overhead in including highline, which
for me is about 0.2s more than just loading ruby + rubygems + termios.

But when I profile this, I find that Dir[] is a very long way down the
list.

$ ruby -rubygems -rprofile -e’require “highline”’
% cumulative self self total
time seconds seconds calls ms/call ms/call name
23.23 1.05 1.05 248 4.23 23.79 Array#each
7.08 1.37 0.32 5522 0.06 0.09 Kernel.===
6.86 1.68 0.31 502 0.62 0.62 Regexp#===
4.20 1.87 0.19 6987 0.03 0.03 Module#===
3.32 2.02 0.15 5501 0.03 0.03 Kernel.==
2.88 2.15 0.13 1664 0.08 0.10 Kernel.dup
2.88 2.28 0.13 104 1.25 31.73
Gem::Specification#initialize
2.43 2.39 0.11 303 0.36 0.83
Gem::Version#normalize
2.43 2.50 0.11 25 4.40 47.60
Kernel.gem_original_require
2.21 2.60 0.10 1 100.00 140.00 Array#sort
1.99 2.69 0.09 104 0.87 0.87 IO#read
1.99 2.78 0.09 209 0.43 0.53
Gem::Specification#full_name
… skip lots …
0.44 4.25 0.02 144 0.14 0.14 Dir#[]

(Whilst it’s possible that Dir#[] is using a lot more real time than
CPU time - and there was a post about fixing profile for this a week
or two back - you can see from above that the total real time is very
close to user+sys time in this case. That’s what I’d expect when the
directories are in the VFS cache, that is, no actual disk I/O is done
when searching through directories)

Sorting by total ms/call, I get:

$ env LC_ALL=C sort -rn -k6 ert | head -20
0.00 4.52 0.00 1 0.00 4520.00 #toplevel
0.00 4.52 0.00 1 0.00 4090.00
Gem::GemPathSearcher#initialize
0.00 4.52 0.00 1 0.00 3880.00
Gem::GemPathSearcher#init_gemspecs
0.00 4.52 0.00 1 0.00 3740.00
Gem::SourceIndex#from_installed_gems
0.00 4.52 0.00 1 0.00 3620.00
Gem::SourceIndex#refresh!
0.00 4.52 0.00 1 0.00 3620.00
Gem::SourceIndex#load_gems_in
0.00 4.52 0.00 1 0.00 3620.00
Gem::SourceIndex#from_gems_in
0.00 4.52 0.00 1 0.00 3620.00 Array#reverse_each
0.00 4.52 0.00 2 0.00 2045.00 Mutex#synchronize
0.00 4.52 0.00 2 0.00 2045.00 Gem.searcher
0.00 4.52 0.00 2 0.00 1870.00 Gem.source_index
0.22 4.33 0.01 24 0.42 225.42 Kernel.require
2.21 2.60 0.10 1 100.00 140.00 Array#sort
0.00 4.52 0.00 1 0.00 120.00
Gem::SourceIndex#installed_spec_directories
0.00 4.52 0.00 1 0.00 120.00 Gem.set_paths
0.00 4.52 0.00 1 0.00 120.00 Gem.path
0.00 4.52 0.00 1 0.00 110.00 Gem.set_home
0.00 4.52 0.00 1 0.00 110.00 Gem.dir
0.00 4.52 0.00 2 0.00 60.00
Gem.ensure_gem_subdirectories
0.00 4.52 0.00 2 0.00 50.00
Gem::GemPathSearcher#find

So there’s a lot of rubygems chomping going on. Optimising that may be
difficult, I’m afraid. I tried reading the rubygems source once, and
gave up :slight_smile:

Notice that whilst there are 24 calls to Kernel.require, there is only 1
call to Gem::GemPathSearcher#initialize accounting for most of the work.
Perhaps highline forces rubygems to do some dependency calculations,
whereas an itsy-bitsy standalone gem like termios doesn’t.

Indeed, when loading only termios, this doesn’t seem to be done:

$ ruby -rubygems -rprofile -e’require “termios”’ 2>&1 | env LC_ALL=C
sort -rn -k6 | head
0.00 0.09 0.00 1 0.00 90.00 #toplevel
55.56 0.05 0.05 1 50.00 80.00 Array#each
0.00 0.09 0.00 1 0.00 80.00
Enumerable.each_with_index
11.11 0.09 0.01 2 5.00 50.00
Kernel.gem_original_require
0.00 0.09 0.00 2 0.00 50.00 Kernel.require
33.33 0.08 0.03 514 0.06 0.06 Hash#[]=
time seconds seconds calls ms/call ms/call name
0.00 0.09 0.00 257 0.00 0.00 Integer#to_int
0.00 0.09 0.00 257 0.00 0.00 Array#pack
0.00 0.09 0.00 64 0.00 0.00 String#eql?

Regards,

Brian.