Forum: Ruby-core Avoiding $LOAD_PATH pollution

Posted by Eric Hodel (Guest)
on 2010-08-27 07:44
(Received via mailing list)
Last year Nobu asked me to propose an API for adding an object to
$LOAD_PATH (instead of a String) that could do the work of finding paths
to load.  The reason was to avoid the pollution of $LOAD_PATH that
RubyGems causes as it activates multiple gems.

I have created a proof-of-concept implementation called fancy_require:

http://github.com/drbrain/fancy_require

specifically:

http://github.com/drbrain/fancy_require/blob/master/lib/fancy_require.rb

The API I propose for the lookup object added to $LOAD_PATH is:

The lookup object pushed onto $LOAD_PATH must respond to #path_for.  The
feature being required (file name) will be passed in by Kernel#require.

The lookup object must return the path to the feature, true, false or 
nil.

If nil is returned Kernel#require will continue to the next item in the
load path.

If true or false is returned, the lookup object handled loading the file
and recording it in $LOADED_FEATURES.  The boolean will be returned from
Kernel#require, as with a plain require.

The true/false behavior would allow the lookup object to load features
from a non-file object like a tarball.

I think Ruby should also add a method that would determine if a feature
is in $LOADED_FEATURES.  Maybe Kernel#loaded?(feature).  This is not
strictly necessary.
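
A minimal sketch of the protocol described above, not the fancy_require code itself. DirLookup and sketch_require are hypothetical names; sketch_require is a simplified stand-in for Kernel#require that only handles .rb files and skips the edge cases a real implementation would need.

```ruby
# Hypothetical lookup object: maps feature names to files in one directory.
class DirLookup
  def initialize(dir)
    @dir = dir
  end

  # Returns a path String, or nil so the next load path entry can try.
  def path_for(feature)
    candidate = File.join(@dir, "#{feature}.rb")
    File.file?(candidate) ? candidate : nil
  end
end

# Simplified stand-in for Kernel#require that understands lookup objects.
def sketch_require(feature, load_path = $LOAD_PATH)
  load_path.each do |entry|
    result =
      if entry.respond_to?(:path_for)
        entry.path_for(feature)
      else
        candidate = File.join(entry.to_s, "#{feature}.rb")
        File.file?(candidate) ? candidate : nil
      end

    case result
    when nil then next                   # keep searching
    when true, false then return result  # the lookup object loaded it itself
    else
      path = File.expand_path(result)
      return false if $LOADED_FEATURES.include?(path)
      load path
      $LOADED_FEATURES << path
      return true
    end
  end
  raise LoadError, "cannot load such file -- #{feature}"
end
```

Passing the load path as an argument keeps the sketch from touching the real $LOAD_PATH; a stock Kernel#require does not call #path_for.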
Posted by Run Paint Run Run (Guest)
on 2010-08-28 12:33
(Received via mailing list)
> The lookup object pushed onto $LOAD_PATH must respond to #path_for.  The
> feature being required (file name) will be passed in by Kernel#require.

Given the #to_path protocol, I suspect #path_for is a little too
similar. Perhaps #require_path_for / #path_for_require ?

> if RUBY_VERSION > '1.9' then
>   $LOADED_FEATURES << path
> else
>   $LOADED_FEATURES << File.basename(path)
> end

How about appending the absolute path under 1.9?

Say we had a loader object that allowed feature names to be given in
Base64, e.g. `require "bm9rb2dpcmk=\n"` would be equivalent to `require
"nokogiri"`. Our loader object would decode the feature name, but
would then want to allow other loader objects, such as RubyGems, to
derive the feature's path. From what I gather, because fancy require
does not recurse with the path it obtains from the loader object, the
loader object would have to handle this recursion itself. In our
example, the loader would call `require` with the decoded feature
name, then return `nil` when re-entered. I'm not sure how I feel about
this.

Anyway, in principle at least, this could be used to commoditise
RubyGems, right? MRI would push a loader object onto the load path
that translated feature names to gem paths, allowing RubyGems to be
just another feature loader. Then, a warning could be issued for
re-defining Kernel.require, so as to encourage the use of this API
instead. How confident are we that this API would be sufficient for
replacing the multitude of require hacks present in the various
RubyGems replacements?
Posted by Eric Hodel (Guest)
on 2010-08-28 15:04
(Received via mailing list)
On Aug 28, 2010, at 19:30, Run Paint Run Run wrote:
> 
>> The lookup object pushed onto $LOAD_PATH must respond to #path_for.  The
>> feature being required (file name) will be passed in by Kernel#require.
> 
> Given the #to_path protocol, I suspect #path_for is a little too
> similar. Perhaps #require_path_for / #path_for_require ?

I don't care about the name.  I care about the protocol (feature name
in; path/nil/true/false out).

>> if RUBY_VERSION > '1.9' then
>>  $LOADED_FEATURES << path
>> else
>>  $LOADED_FEATURES << File.basename(path)
>> end
> 
> How about appending the absolute path under 1.9?

Pay no attention to the man behind the curtain!

If my proposal is implemented in 1.9 lib/fancy_require.rb need not 
exist.  lib/fancy_require.rb only has to be functional enough to be a 
proof of concept.  I only implemented enough of Kernel#require to be 
minimally compatible.  I know there are edge cases I'm ignoring.

Use this code at your peril!

> Say we had a loader object that allowed feature names to be given in
> Base64, e.g. `require "bm9rb2dpcmk=\n"` would be equivalent to `require
> "nokogiri"`. Our loader object would decode the feature name, but
> would then want to allow other loader objects, such as RubyGems, to
> derive the feature's path. From what I gather, because fancy require
> does not recurse with the path it obtains from the loader object, the
> loader object would have to handle this recursion itself. In our
> example, the loader would call `require` with the decoded feature
> name, then return `nil` when re-entered. I'm not sure how I feel about
> this.

See lib/fancy_require/rubygems.rb.  It alters $LOAD_PATH to contain only 
the items after the look-up object then calls require again and returns 
true/false if the restricted $LOAD_PATH require was successful.

I can't think of a solution that would be as simple as a recursive 
require.
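
The approach described for lib/fancy_require/rubygems.rb might look roughly like the following. This is a sketch written from that description, not the actual file; AliasLookup and its alias table are invented for illustration.

```ruby
# Hypothetical lookup object that rewrites a feature name, then defers to the
# entries that follow it in $LOAD_PATH by calling require recursively.
class AliasLookup
  def initialize(aliases)
    @aliases = aliases # e.g. { "short_name" => "real_feature" }
  end

  def path_for(feature)
    target = @aliases[feature]
    return nil unless target # not ours; let the next entry try

    original = $LOAD_PATH.dup
    index = original.index(self)
    # Keep only the entries after this object so it is not re-entered.
    rest = index ? original[(index + 1)..-1] : original

    begin
      $LOAD_PATH.replace(rest)
      require target # true or false, as with a plain require
    ensure
      $LOAD_PATH.replace(original)
    end
  end
end
```

The ensure clause restores $LOAD_PATH even when the nested require raises LoadError. Since a stock Kernel#require never calls #path_for, the object only does anything when something like fancy_require (or a direct call) invokes it.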

Some API to assist in manipulating $LOAD_PATH might be nice, or 
Kernel#require could accept a $LOAD_PATH override to use for the 
recursive call (require 'some_feature', %w[lib ext test]).

I don't think additional API is required as implementing a $LOAD_PATH 
look-up object would be a rare thing and it's ok if rarely-done things 
are a little difficult.

> Anyway, in principle at least, this could be used to commoditise RubyGems, right?

Correct, but it's more expansive than making RubyGems integration 
simple.  With the true/false return from #path_for you could create a 
$LOAD_PATH look-up object that loads features out of an archive.
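
As a rough illustration of the true/false case, here is a sketch of a lookup object that loads from something other than a file. To stay self-contained it uses an in-memory Hash of source strings as a stand-in for an archive; ArchiveLookup and the "archive!/feature.rb" label format are assumptions, not part of the proposal.

```ruby
# Hypothetical lookup object backed by an in-memory "archive": a Hash of
# feature name => Ruby source. A real one might read a tarball instead.
class ArchiveLookup
  def initialize(name, sources)
    @name = name
    @sources = sources
  end

  def path_for(feature)
    source = @sources[feature]
    return nil unless source # not in this archive; let the next entry try

    label = "#{@name}!/#{feature}.rb" # made-up identifier, not a real path
    return false if $LOADED_FEATURES.include?(label)

    # The lookup object does the loading and bookkeeping itself...
    eval(source, TOPLEVEL_BINDING, label)
    $LOADED_FEATURES << label
    true # ...and reports the result the way require would.
  end
end
```

Because the object records the feature in $LOADED_FEATURES itself, a second request for the same feature returns false without evaluating the source again.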

> MRI would push a loader object onto the load path that translated feature names to gem paths, allowing RubyGems to be just another feature loader.

If my proposed API were implemented gem_prelude could be stripped down 
to code similar to Gem::LookUp in lib/fancy_require/rubygems.rb. 
RubyGems loading could be delayed until absolutely necessary.  (It would 
work just like 1.8 but without the need to require 'rubygems' to use a 
gem-installed feature.)

> Then, a warning could be issued for re-defining Kernel.require, so as to encourage the use of this API instead.

This would be nice but is not necessary.  I think documentation would be 
sufficient.

> How confident are we that this API would be sufficient for replacing the multitude of require hacks present in the various RubyGems replacements?

I know of no successful RubyGems replacement that alters Kernel#require 
(rvm, bundler, isolate depend on RubyGems, RubyOpals never got off the 
ground, RPA is long dead).

urirequire could use #path_for, does that count?

http://fhwang.net/2005/11/01/urirequire-I-got-yer-Web-2-0-right-here
Posted by Run Paint Run Run (Guest)
on 2010-08-28 16:54
(Received via mailing list)
>> How confident are we that this API would be sufficient for replacing the
>> multitude of require hacks present in the various RubyGems replacements?

> I know of no successful RubyGems replacement that alters Kernel#require
> (rvm, bundler, isolate depend on RubyGems, RubyOpals never got off the
> ground, RPA is long dead).

Ah, OK. I thought I'd read about various libraries overriding `require`, and
chaos resulting, but I can't find the thread again, so was possibly
hallucinating.

Theoretically, the presence of arbitrary objects in $LOAD_PATH breaks backward
compatibility in that users may expect an Array of Strings. That's not a
significant hurdle, however.

In the proof of concept, the feature name may be an arbitrary object. I assume
this is unintentional, mainly because it couldn't be used with the '-r' switch
on the command-line.

FWIW, I like this idea. A loader could generate a feature by compiling a
third-party language on-the-fly. A GemMissing loader can sit at the head of
load path, and automatically install gems from a local repository as
needed. It's surely worth moving to the Features tracker and posting a pointer
on ruby-talk to encourage more comments.
Posted by Eric Hodel (Guest)
on 2010-08-29 07:06
(Received via mailing list)
On Aug 28, 2010, at 23:45, Run Paint Run Run wrote:
> hallucinating.
Yes, I recall this thread... I don't know of any wide-spread use of the 
idea besides RubyGems.  I think it was a theoretical discussion.

> Theoretically, the presence of arbitrary objects in $LOAD_PATH breaks backward
> compatibility in that users may expect an Array of Strings. That's not a
> significant hurdle, however.

Yes.

How often do people use $LOAD_PATH besides printing it?

Perhaps an API to look up the path for a feature should be added 
alongside Kernel#require.

Something like this but with a non-terrible name:

look_up_path_for 'rake' # => /usr/local/lib/ruby/1.9.1/rake.rb

> In the proof of concept, the feature name may be an arbitrary object. I assume
> this is unintentional, mainly because it couldn't be used with the '-r' switch
> on the command-line.

It is unintentional.

> FWIW, I like this idea. A loader could generate a feature by compiling a
> third-party language on-the-fly. A GemMissing loader can sit at the head of
> load path, and automatically install gems from a local repository as
> needed. It's surely worth moving to the Features tracker and posting a pointer
> on ruby-talk to encourage more comments.

ok.
Posted by Haase, Konstantin (Guest)
on 2010-08-29 11:38
(Received via mailing list)
On Aug 29, 2010, at 07:01 , Eric Hodel wrote:
> How often do people use $LOAD_PATH besides printing it?

Most autoloading/reloading libraries do (Rack::Reloader,
Sinatra::Reloader, ...), Railties' generator.rb does, facets does, and
some others do too. Not many do, but some (it's not the flip flop
operator):

http://www.google.com/codesearch?hl=en&lr=&q=\%24LOAD_PATH\.%28detect|each|map|collect|select%29+lang%3Aruby+-file%3Arubygems&sbtn=Search

Konstantin
Posted by James Tucker (Guest)
on 2010-08-29 12:41
(Received via mailing list)
On 29 Aug 2010, at 02:01, Eric Hodel wrote:

>> chaos resulting, but I can't find the thread again, so was possibly
>> hallucinating.
> 
> Yes, I recall this thread... I don't know of any wide-spread use of the idea besides RubyGems.  I think it was a theoretical discussion.

I think most of the reported breakage is not much different from that of 
poorly defined method_missings, in other words, author error. Other than 
that, there's a speed penalty that cannot be avoided.

> 
>> Theoretically, the presence of arbitrary objects in $LOAD_PATH breaks backward
>> compatibility in that users may expect an Array of Strings. That's not a
>> significant hurdle, however.
> 
> Yes.
> 
> How often do people use $LOAD_PATH besides printing it?

I've seen quite a few folks using it to roll their own plugin systems. 
I've also seen people use it for debugging (runtime 'gem which' like 
search through it). In this case, they would have to check for #path_for 
and add branches to this kind of debugging. $LOAD_PATH.find_all { |path| 
path.respond_to?(:path_for) ? path.path_for(target) : 
File.exists?(File.join(path, target)) }, and then rebuild.

> Perhaps an API to look up the path for a feature should be added alongside Kernel#require.

Yes, I think the above debugging case could be made easier, maybe 
$LOAD_PATH should have a #find_path or #find_all_paths_for such that it 
can reconstruct the above pattern in a single call.
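
A sketch of what such a helper could look like. find_all_paths_for is a hypothetical name taken from the suggestion above; it only considers .rb files and assumes lookup objects follow the proposed #path_for protocol.

```ruby
# Hypothetical helper: returns every path on load_path that provides feature.
def find_all_paths_for(feature, load_path = $LOAD_PATH)
  load_path.each_with_object([]) do |entry, found|
    if entry.respond_to?(:path_for)
      path = entry.path_for(feature)
      # true/false mean the object loaded the feature itself; no path to report.
      found << path if path.is_a?(String)
    else
      candidate = File.join(entry.to_s, "#{feature}.rb")
      found << candidate if File.file?(candidate)
    end
  end
end
```

One caveat with this shape: calling #path_for on an object that answers true or false loads the feature as a side effect, so a debugging helper built this way is not purely a query.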

> Something like this but with a non-terrible name:
> 
> look_up_path_for 'rake' # => /usr/local/lib/ruby/1.9.1/rake.rb
> 
>> In the proof of concept, the feature name may be an arbitrary object. I assume
>> this is unintentional, mainly because it couldn't be used with the '-r' switch
>> on the command-line.

It would be nice to see that working in a non-broken way with things 
like rubygems.
Posted by Stephen Bannasch (Guest)
on 2010-08-29 16:53
(Received via mailing list)
At 2:01 PM +0900 8/29/10, Eric Hodel wrote:
>> chaos resulting, but I can't find the thread again, so was possibly
>How often do people use $LOAD_PATH besides printing it?
Scanning through a local dir with many gems, most of the uses are adding
to $LOAD_PATH using '<<', 'unshift', 'shift', and 'push'

Scanning through these gems:
  aasm, activerecord-import, activerecord-jdbc-adapter, arel,
  aruba, awesome_print, awestruct, bj, bones, buildr, bundler,
  chef, cloud-crowd, compass, composite_primary_keys, couch_foo,
  cucumber, datamapper, desert, docrails, dropbox, earth,
  em-http-request, execwar, extlib, fakeweb, ffi-zlib, gem-this,
  gherkin, gist, git_remote_branch, goio, goio-ffi, haml,
  has_many_polymorphs, jammit, jasmine-gem, jeweler, journeta,
  libxml, mail, merb, mixlib-cli, mongrel, neo4j, nokogiri, ohm,
  passenger, persevere, prawn, prawn-format, rack, rack-fiber_pool,
  rack-reverse-proxy, radiant, railroad, rails, rainbows, rake,
  rake-compiler, rawr, rcov, rdoc, redcar, ri_cal, rjack, rspec,
  rspec-core, rspec-rails, ruby-debug, ruby-debug-ide, showoff,
  sinatra, sinatra_more, specjour, spork, sqlite3-ruby-1.2.4,
  strokedb, test23, threadz, treetop, ttfunk, ultraviolet, warbler,
  webrat, words, yajl-ruby, yard

Here's the interactions with $LOAD_PATH that don't involve adding to it 
using: '<<', 'unshift', 'shift', 'concat' and 'push' or directly 
printing it:

  $LOAD_PATH.clear
  $LOAD_PATH.delete File.expand_path('../unregistered_handler', __FILE__)
  $LOAD_PATH.delete(@other_load_path)
  $LOAD_PATH.delete(@path)
  $LOAD_PATH.each do |base|
  $LOAD_PATH.each do |path|
  $LOAD_PATH.include?(File.expand_path(__DIR__))
  $LOAD_PATH.include?(__DIR__) ||
  $LOAD_PATH.index(File.join(RAILS_ROOT, 'lib')) || 0
  $LOAD_PATH.insert(application_lib_index + 1, path)
  $LOAD_PATH.reject! do |p|
  $LOAD_PATH.reject! { |path| !(original_load_path.include?(path)) }
  $LOAD_PATH.replace @_sandbox[:load_path]
  $LOAD_PATH.should include(libdir)
  $LOAD_PATH.should include(specdir)
  $LOAD_PATH.should_receive(:unshift).with("a/dir")
  $LOAD_PATH.stub!(:unshift)
  $LOAD_PATH.uniq!
  !$LOAD_PATH.include?(component_path)
  $servlet_context.log("LoadError while loading '#{library}', current path:\n " + $LOAD_PATH.join("\n "))
  %w{test lib ext/libxml}.each{ |path| $LOAD_PATH.unshift(path) }
  @options[:load_paths].each {|p| $LOAD_PATH << p}
  @snoop[:load_path]         = $LOAD_PATH
  @snoop[:load_path] = $LOAD_PATH
  @_sandbox[:load_path] = $LOAD_PATH.clone
  abs = $LOAD_PATH.map { |path| ::File.join(path, loaded) }.
  ar_lib_path = $LOAD_PATH.detect {|p| p if File.exist?File.join(p, ar_version)}
  assert $LOAD_PATH.include?(File.join(plugin_fixture_path('default/acts/acts_as_chunky_bacon'), 'lib'))
  assert $LOAD_PATH.include?(File.join(plugin_fixture_path('default/stubby'), 'lib'))
  assert $LOAD_PATH.index(File.join(plugin_fixture_path('default/acts/acts_as_chunky_bacon'), 'lib')) >= assert
    $LOAD_PATH.index(File.join(plugin_fixture_path('default/stubby'), 'lib')) >= stubbed_application_lib_index_in_LOAD_PATHS
  assert load_paths_count.select { |k, v| v > 1 }.empty?, $LOAD_PATH.inspect
  def add_dir_from_project_root_to_load_path(dir, load_path=$LOAD_PATH) # :nodoc:
  def rake_require(file_name, paths=$LOAD_PATH, loaded=$")
  describe $LOAD_PATH do
  dir  = $LOAD_PATH.find { |dir| File.exist? File.join(dir, file) }
  env["test.$LOAD_PATH"]  = $LOAD_PATH
  f << "$LOAD_PATH.unshift 'file:' + dir + '/#{rack_dir}'\n"
  f << "$LOAD_PATH.unshift dir + '/#{rack_dir}'\n"
  files = $LOAD_PATH.map do |p|
  if $LOAD_PATH.first != LIBDIR
  if ( $LOAD_PATH.index( ext_dir ).nil? )
  it "should prepend gemspec require paths to $LOAD_PATH in order" do
  libs.map {|lib| $LOAD_PATH.unshift lib}
  libs.reverse.each{|lib| $LOAD_PATH.unshift(lib)}
  load_paths = spec.load_paths.reject {|path| $LOAD_PATH.include?(path)}
  load_paths.reverse_each { |dir| $LOAD_PATH.unshift(dir) if File.directory?(dir) }
  load_paths_count = $LOAD_PATH.inject({}) { |paths, path|
  on("--dev", "Add this project's bin/ and lib/ to $LOAD_PATH.",
  on("-I PATH", "Add PATH to $LOAD_PATH") do |path|
  opts.on("-I", "--include PATH", String, "Add PATH to $LOAD_PATH") do |path|
  opts.on('-I', '--include PATH', String, 'Add PATH to $LOAD_PATH') do |path|
  original_load_path = $LOAD_PATH
  ORIGINAL_LOAD_PATH = $LOAD_PATH.dup
  ORIGINAL_LOAD_PATH.each { |path| $LOAD_PATH << path }
  parser.on('-I DIRECTORY', 'specify $LOAD_PATH directory (may be used more than once)') do |dir|
  path = $LOAD_PATH.grep(/#{gem_name}[\w.-]*\/lib$/).first
  paths = ['./', *$LOAD_PATH].uniq
  plugin.load_paths.each { |path| $LOAD_PATH.unshift(path) }
  plugin_load_paths.each { |path| assert $LOAD_PATH.include?(path) }
  puts $LOAD_PATH.grep(/activesupport/i)
  puts( $LOAD_PATH.inject([]) do |res, path|
  require spec_classes_path unless $LOAD_PATH.include?(spec_classes_path)
  template_dir = $LOAD_PATH.map do |path|
Posted by Roger Pack (Guest)
on 2010-08-30 19:35
(Received via mailing list)
> Last year Nobu asked me to propose an API for adding an object to
> $LOAD_PATH (instead of a String) that could do the work of finding paths
> to load.  The reason was to avoid the pollution of $LOAD_PATH that
> RubyGems causes as it activates multiple gems.

what's wrong with polluting $LOAD_PATH?  It's what you can use to
determine which versions of gems have been activated...



-r
Posted by Eric Hodel (Guest)
on 2010-11-16 02:51
(Received via mailing list)
I discussed this with Matz at RubyConf.  Matz asked me to bump the 
discussion of this so we could learn of any objections to this proposal 
that may have been discussed on ruby-dev.

Posted by zimbatm ... (zimbatm)
on 2011-01-06 12:58
The polyglot gem also overrides require. It adds a facility to require
files written in other languages by recognizing their extensions, once
they are transformed into Ruby.

http://polyglot.rubyforge.org/
Posted by zimbatm ... (zimbatm)
on 2011-01-08 21:08
Hi Eric,

I've been thinking about this a little bit for the past few days. Here 
is what I think actually.

First of all, I think it's a good framework for experimentation, require 
& friends never seemed to be totally right to me either, so it's nice to 
have an initiative to rework some bits of it.

There is this perceived notion that $LOAD_PATH is polluted. But by 
shifting some part of it into sub-objects, it is only hiding a part of 
the information, making it necessary to resort to a new API to access 
it. Agreed, this could allow for a dynamic $LOAD_PATH, but in the
current situation, all that rubygems does is activate the gems and
push them onto its own internal $LOAD_PATH, right? So we would have:

 ['.../1.9', '.../site_ruby/1.9', '.../vendor_ruby/1.9', [ all the activated rubygems ]]

But I think another hidden goal of this proposal is to not touch
#require, or at least to avoid the need for rubygems to override it
with its own version. Would you be interested if I come up with a
solution in that area?
Posted by zimbatm ... (zimbatm)
on 2011-01-09 00:08
(Received via mailing list)
Just a note for future reference. While playing with require, I found
that 1.9 was particularly slow on startup because of require. I
suspect it has something to do with the path expansions.

$ touch Gemfile
$ rvm use ruby-1.9.2
Using /Users/zimbatm/.rvm/gems/ruby-1.9.2-p136
$ RUBYOPT=-rprofile bundle install 2>&1 >/dev/null | head -n 6
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 43.04     1.02      1.02      197     5.18    31.98  Kernel.gem_original_require
 15.19     1.38      0.36    41288     0.01     0.01  Hash#default
  2.53     1.44      0.06     1546     0.04     0.05  Kernel.===
  1.69     1.48      0.04      171     0.23     2.22  Array#each

$ rvm use ruby-1.8.7
Using /Users/zimbatm/.rvm/gems/ruby-1.8.7-p330
$ RUBYOPT=-rprofile bundle install 2>&1 >/dev/null | head -n 6
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 13.71     0.24      0.24      168     1.43    17.26  Kernel.gem_original_require
  9.14     0.40      0.16      139     1.15     8.56  Array#each
  4.00     0.47      0.07     3229     0.02     0.02  Module#method_added
  3.43     0.53      0.06     2127     0.03     0.03  Module#===

$ rvm use ruby-1.9.2
Using /Users/zimbatm/.rvm/gems/ruby-1.9.2-p136
$ RUBYOPT=-rprofile rails help 2>&1 >/dev/null | head -n 6
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 34.25     0.75      0.75      156     4.81    45.45  Kernel.gem_original_require
  7.31     0.91      0.16      385     0.42     0.62  Array#include?
  6.85     1.06      0.15    25979     0.01     0.01  Hash#default
  5.48     1.18      0.12      298     0.40     1.41  Array#each

$ rvm use ruby-1.8.7
Using /Users/zimbatm/.rvm/gems/ruby-1.8.7-p330
$ RUBYOPT=-rprofile rails help 2>&1 >/dev/null | head -n 6
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 12.50     0.30      0.30      142     2.11    35.21  Kernel.gem_original_require
  9.58     0.53      0.23      271     0.85     8.60  Array#each
  8.75     0.74      0.21     1892     0.11     0.18  Array#include?
  5.83     0.88      0.14     7964     0.02     0.02  String#==
Posted by Luis Lavena (luislavena)
on 2011-01-09 00:40
(Received via mailing list)
On Sat, Jan 8, 2011 at 8:07 PM, Jonas Pfenniger (zimbatm)
<jonas@pfenniger.name> wrote:
> Just a note for future references. While playing with require, I found
> that 1.9 was particularly slow on startups because of require. I
> suspect it has something to do with the path expansions.
>

Ruby 1.9.2 suffers from this, which has been corrected in trunk.

This was caused by excessive stat() calls.
Posted by Eric Hodel (Guest)
on 2011-01-09 03:10
(Received via mailing list)
On Jan 8, 2011, at 12:08, zimbatm ... wrote:
> it.
Yes, this is a problem with my proposal.  A method to find a file on 
$LOAD_PATH like Gem.find_files may be a solution.

> Agreed, this could allow for a dynamic $LOAD_PATH, but in the
> current situation, all that rubygems does, is activate the gems, and
> push them onto its own internal $LOAD_PATH, right? So we would have:
>
> ['.../1.9', '.../site_ruby/1.9', '.../vendor_ruby/1.9', [ all the
> activated rubygems ]]

Yes (but activated rubygems paths come before installed ruby paths)

> But I think another hidden goal of this proposal is to not touch
> #require, or at least avoiding the necessity of rubygems to override it
> with its own version. Would you be interested if I come up with a
> solution in that area ?

Yes!
Posted by Charles Nutter (headius)
on 2011-01-10 08:34
(Received via mailing list)
On Sat, Jan 8, 2011 at 5:39 PM, Luis Lavena <luislavena@gmail.com> 
wrote:
> Ruby 1.9.2 suffers from this, which has been corrected in trunk.
>
> This was caused by excessive stat() calls.

This is also a cause of overhead in JRuby's require, though it's
harder to get around (some of the Java libraries we use do multiple
stats per path visited).

- Charlie
Posted by Charles Nutter (headius)
on 2011-01-10 08:35
(Received via mailing list)
Sorry, missed this email until it was bumped recently...

On Fri, Aug 27, 2010 at 12:43 AM, Eric Hodel <drbrain@segment7.net> 
wrote:
>
> http://github.com/drbrain/fancy_require/blob/master/lib/fancy_require.rb

The base protocol seems fine to me. I think you and I discussed it on
IRC at some point. I like its simplicity.

I do have a concern for the additional overhead it adds to all
requires. case/when matching does a dispatch to ===, so that's a
dyncall for every element at the very least. Probably could be
short-circuited in the C code.

The loaded? implementation is broken like gem_prelude on systems with
case-insensitive filesystems like OS X and Windows. JRuby just added a
change to "require" that the $" search is done case-insensitively when
on a case-insensitive filesystem, but it's an experimental change. We
made it originally to work around problems in gem_prelude that
disappeared when we wiped out gem_prelude...
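
To make the case-sensitivity concern concrete, here is a rough sketch of a loaded?-style check with an optional case-insensitive comparison. feature_loaded? and its case_insensitive keyword are hypothetical; this is neither the fancy_require implementation nor JRuby's change, and the caller is assumed to know whether the filesystem is case-insensitive.

```ruby
# Hypothetical check for whether a feature appears in a list of loaded
# features (by default $LOADED_FEATURES), ignoring common file extensions.
def feature_loaded?(feature, features = $LOADED_FEATURES, case_insensitive: false)
  wanted = feature.sub(/\.rb\z/, "")
  wanted = wanted.downcase if case_insensitive

  features.any? do |loaded|
    name = loaded.sub(/\.(rb|so|bundle|dll|jar)\z/, "")
    name = name.downcase if case_insensitive
    # Match either a bare feature name or the tail of a full path.
    name == wanted || name.end_with?("/#{wanted}")
  end
end
```

With a case-sensitive comparison, "Rake" and "rake" count as different features even though both would resolve to the same file on a case-insensitive filesystem, which is the kind of double loading described above.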

The specified protocol would also greatly simplify logic in JRuby to
load files from URLs and classloader resources. Where currently we
have special-cased logic to search those types of artifacts, with your
change we'd simply register "finders" in the load path that can handle
them.

This probably could also tie into the Ruby Archive RSoC project,
allowing MRI to also load code from archives. I did not follow that
project closely enough to know its status.

- Charlie
Posted by Eric Hodel (Guest)
on 2011-01-10 23:40
(Received via mailing list)
On Jan 9, 2011, at 23:33, Charles Oliver Nutter wrote:
> The base protocol seems fine to me. I think you and I discussed it on
> IRC at some point. I like its simplicity.
>
> I do have a concern for the additional overhead it adds to all
> requires. case/when matching does a dispatch to ===, so that's a
> dyncall for every element at the very least. Probably could be
> short-circuited in the C code.

In C code I would only dispatch if a path element wasn't a T_STRING object.

> The loaded? implementation is broken like gem_prelude on systems with
> case-insensitive filesystems like OS X and Windows. JRuby just added a
> change to "require" that the $" search is done case-insensitively when
> on a case-insensitive filesystem, but it's an experimental change. We
> made it originally to work around problems in gem_prelude that
> disappeared when we wiped out gem_prelude

For a real #loaded? implementation I would prefer using the built-in C 
functionality in rb_require.

> The specified protocol would also greatly simplify logic in JRuby to
> load files from URLs and classloader resources. Where currently we
> have special-cased logic to search those types of artifacts, with your
> change we'd simply register "finders" in the load path that can handle
> them.
>
> This probably could also tie into the Ruby Archive RSoC project,
> allowing MRI to also load code from archives. I did not follow that
> project closely enough to know its status.

Yes, I was thinking about this use-case.
Posted by zimbatm ... (zimbatm)
on 2011-01-13 02:07
(Received via mailing list)
2011/1/9 Eric Hodel <drbrain@segment7.net>:
>> But I think another hidden goal of this proposal is to not touch
>> #require, or at least avoiding the necessity of rubygems to override it
>> with its own version. Would you be interested if I come up with a
>> solution in that area ?
>
> Yes!

Hi Eric, sorry for taking so long. I tried to come up with various
solutions and am afraid there is no clean one that doesn't break
backward compatibility.

== Backward-compatible hack

The simplest way to fix this without breaking backward compatibility,
is to introduce a second method with signature (naming is bad):

  Kernel.require_search(feature) -> path | nil

On #require, if no path is found, it will fall back on this method, if
it exists. This means that rubygems and any other package-managers can
fight over this method, and leave #require alone. If they want to
change $LOAD_PATH for future requires, then it's fine too.
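
The lookup order being proposed can be modelled in a few lines. This is a sketch of the described behaviour, not the patch: resolve_with_fallback is a hypothetical name, search_hook stands in for Kernel.require_search, and only .rb files are considered.

```ruby
# Simplified model of the fallback: search the load path first and consult
# the hook only when nothing was found there.
def resolve_with_fallback(feature, load_path = $LOAD_PATH, search_hook = nil)
  load_path.each do |dir|
    candidate = File.join(dir, "#{feature}.rb")
    return candidate if File.file?(candidate)
  end
  # Nothing on the load path: ask the hook, if one is installed.
  search_hook ? search_hook.call(feature) : nil
end
```

The hook is only reached when the load path search comes up empty, so it can add features but never take precedence over a file that is already findable.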

Patch is under construction.

== Clean solution ?

I'd like to restore the feature / filename separation to make things a
bit simpler in future ruby versions. But this email needs to get out,
so the rest is for another episode.

Cheers,
  zimbatm
Posted by zimbatm ... (zimbatm)
on 2011-01-19 19:08
(Received via mailing list)
The attached patch implements my previous proposition but is not final.

This patch applies on top of 9638cd5bbcfc9023d20227d4f85be3e736d910d0.
This is before the rubygems-1.5.0 merge, I didn't feel like putting
more effort if the idea is not accepted. Please let me know if you
prefer subversion patches, it's just that I'm more fluent with git.

Kernel.require_search is a really horrible name and position but I don't
have a better idea.

----- commit message ----
If Kernel.require_search is defined, it will be used as a secondary lookup
by Kernel#require if the feature was not found.

Rubygems has also been changed to show its usage.

Current issues:
* The feature is difficult to discover by introspection. Any ideas ?
* No tests. Where should I put core tests ? test/ is for stdlib,
  bootstrap/ is lower-level and has no fixtures mechanism.
Posted by Eric Hodel (Guest)
on 2011-01-24 20:08
(Received via mailing list)
On Jan 12, 2011, at 17:00, Jonas Pfenniger (zimbatm) wrote:

> backward-compatibility.
Sorry my response came so late, I got distracted.

> == Backward-compatible hack
>
> The simplest way to fix this without breaking backward compatibility,
> is to introduce a second method with signature (naming is bad):
>
>  Kernel.require_search(feature) -> path | nil
>
> On #require, if no path is found, it will fallback on this method, if
> it exists.

This does not behave in a way that RubyGems can use it.

RubyGems needs to be able to insert a load path before the system load 
paths so that rubygems-provided features may override built-in features.

By your description, with this patch if I gem 'rdoc' then require a file 
I will always get the built-in rdoc (or worse, a mix of the two if there 
are new files added to the gem).
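
A toy model of the ordering problem, with Hashes standing in for directories and for a gem index. All names and paths here are made up for illustration; the point is only which source gets consulted first.

```ruby
# Fallback design: the hook is only asked when the load path finds nothing.
def resolve_fallback(feature, load_path, hook)
  load_path.each { |source| return source[feature] if source.key?(feature) }
  hook[feature]
end

# Lookup-object design: the gem index is just an earlier load path entry.
def resolve_in_order(feature, load_path)
  load_path.each { |source| return source[feature] if source.key?(feature) }
  nil
end

# Made-up paths, for illustration only.
system_lib = { "rdoc" => "/system/lib/rdoc.rb" }
gem_index  = { "rdoc" => "/gems/rdoc/lib/rdoc.rb" }

resolve_fallback("rdoc", [system_lib], gem_index)  # => "/system/lib/rdoc.rb"
resolve_in_order("rdoc", [gem_index, system_lib])  # => "/gems/rdoc/lib/rdoc.rb"
```

With the fallback, the built-in copy shadows the gem whenever both provide the same feature; a gem-only file would still come from the gem, which is how the mix of the two can arise.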

> This means that rubygems and any other package-managers can
> fight over this method, and leave #require alone. If they want to
> change $LOAD_PATH for future requires, then it's fine too.

I am looking for a solution that removes the need to override a method 
instead of one that simply changes which method to override.
Posted by zimbatm ... (zimbatm)
on 2011-01-25 00:29
(Received via mailing list)
2011/1/24 Eric Hodel <drbrain@segment7.net>:
> Sorry my response came so late, I got distracted.

Yup, no problem. I prefer that it takes long and that it's solved once
and for all.

> This does not behave in a way that RubyGems can use it.
>
> RubyGems needs to be able to insert a load path before the system load paths so
> that rubygems-provided features may override built-in features.
>
> By your description, with this patch if I gem 'rdoc' then require a file I will
> always get the built-in rdoc (or worse, a mix of the two if there are new files
> added to the gem).

Hmm, okay, yeah my patch still relies on rubygems unshifting into
$LOAD_PATH. It works, but it only avoids overriding #require (and
shifts the problem with #require_search as you mentioned). This is
because I assumed that usually only one package-manager would live in
the same ruby instance and that a unique method call is faster than
other approaches.

>> This means that rubygems and any other package-managers can
>> fight over this method, and leave #require alone. If they want to
>> change $LOAD_PATH for future requires, then it's fine too.
>
> I am looking for a solution that removes the need to override a method instead
> of one that simply changes which method to override.

Ok, let me come up with another approach, I think I got something
nice, but still need some reflection on it.

Cheers,
  zimbatm