Hello.
I’ve been noticing a problem for a while in a larger application and
today I
produced a minimal example, given below. When a method is not found,
Sinatra
calls NoMethodError#message to display a nice error message, but that is
taking
VERY long.
require 'sinatra'
require 'nokogiri'
get '/' do
@xml = Nokogiri::XML(File.new('baz.xml', 'rb').read())
foo()
'hello'
end
The file ‘baz.xml’ can be just a simple file, like a root element
containing
some thousands of elements. The number of elements is important because,
oddly
enough, the time taken for the message to be displayed is proportional
to the
size of the XML file. In my system, with a 3.7 MB file this program
takes almost
13 seconds to print the message, but in a larger application the message
takes a
few minutes to appear.
You can try to print the line yourself instead, but that also takes very
long:
begin
foo()
rescue ::Exception => boom
puts boom.message
end
I don’t know where the error might be. The code above looks too innocent
to
cause such a weird behavior, and I don’t see how either Nokogiri or
Sinatra
could be involved, although both of them are required to reproduce the
problem.
I’m using Ruby 1.9.2p180, Nokogiri 1.4.4 and Sinatra 1.2.0.
Any help would be greatly appreciated!
On Mar 15, 2011, at 8:25 PM, Adiel M. wrote:
get ‘/’ do
13 seconds to print the message, but in a larger application the message takes a
few minutes to appear.
Let me see if I have this straight:
If baz.xml is empty it is fast
If baz.xml is large it is slow
If this is true it is because it takes time to parse the XML file before
it gets to foo and discovers that there is no method foo.
Ruby does not know that foo does not exist until you try to call it.
On Wed, Mar 16, 2011 at 7:24 AM, Eric H. [email protected]
wrote:
require ‘sinatra’
enough, the time taken for the message to be displayed is proportional to the
Ruby does not know that foo does not exist until you try to call it.
Additional remarks: the file is not properly closed and reading the
whole file in one go is probably also not a good solution. I’d rather
require ‘sinatra’
require ‘nokogiri’
get ‘/’ do
@xml = File.open(‘baz.xml’, ‘rb’) {|io| Nokogiri::XML(io)}
foo()
‘hello’
end
Note, the fact that you read in a large file as one String can
actually be related to your slowness because it might case GC
overhead.
Cheers
robert
On Wed, Mar 16, 2011 at 1:15 PM, Adiel M. [email protected]
wrote:
Ruby does not know that foo does not exist until you try to call it.
minutes.
Not if one does not have the environment installed. Did you try the
suggested change?
Cheers
robert
Hello.
On Wed, Mar 16, 2011 at 03:24:29PM +0900, Eric H. wrote:
Let me see if I have this straight:
If baz.xml is empty it is fast
If baz.xml is large it is slow
If this is true it is because it takes time to parse the XML file before it gets
to foo and discovers that there is no method foo.
Ruby does not know that foo does not exist until you try to call it.
The thing is: if method foo is not called at all, and thus no exception
is
raised, the resulting web page is shown quickly enough. Nokogiri is
pretty fast
at parsing XML files.
It only takes long when the inexistent method foo() is called and the
pogram
tries to print the exception message.
Yes, I know this is very, very weird. Can anybody try to reproduce
this bug?
The code snippet is very simple and it shouldn’t take more than a couple
minutes.
Thanks for the reply.
Hi, I discovered what’s causing the problem.
Ruby’s function name_err_mesg_to_str (file error.c, line 929, SVN
snapshot),
which produces the dreaded message, proceeds as follows:
-
It converts the object that does not possess the method in question
to a
string, by calling #inspect;
-
If it doesn’t like that string (i.e., if it’s nil or is bigger than
65
characters), then it creates a simpler representation and use it
instead.
In other words, if you happen to call a method on an object that doesn’t
define
to_s but that does contain a lot of data, then calling an undefined
method and
trying to print the resulting exception’s message will take a long time
because
all the data in the object will have to be converted to a string – just
to be
discarded.
Maybe in a case like this the message should always return the simpler
representation of the object instead of calling to_s?
Hello.
On Wed, Mar 16, 2011 at 09:08:42PM +0900, Robert K. wrote:
Note, the fact that you read in a large file as one String can
actually be related to your slowness because it might case GC
overhead.
That is indeed a better way to read the file, thanks for pointing it
out.
The same problem happens with the code you suggested, however.
In any case, did you test any of the variants? Was it very slow as in my
case?
Thank you for your reply.
Hello.
On Thu, Mar 17, 2011 at 05:07:47PM +0900, Robert K. wrote:
I’d rather provide a custom #inspect method for objects which may
contain potentially large volumes of data. Nobody can read that
anyway when printed on console or somewhere else.
I agree with you: if I am going to print out a big object to the
console, it’s
better to provide a custom #inspect in order to generate a nice summary.
However, I find it odd that I must write such a method for all big
objects,
regardless of whether I’m ever going to print them out or not – because
otherwise my program may eventually freeze for up to a few minutes doing
the
very complicated task of telling me that some method didn’t exist
On Wed, Mar 16, 2011 at 10:28 PM, Adiel M. [email protected]
wrote:
In other words, if you happen to call a method on an object that doesn’t define
to_s but that does contain a lot of data, then calling an undefined method and
trying to print the resulting exception’s message will take a long time because
all the data in the object will have to be converted to a string – just to be
discarded.
Maybe in a case like this the message should always return the simpler
representation of the object instead of calling to_s?
I’d rather provide a custom #inspect method for objects which may
contain potentially large volumes of data. Nobody can read that
anyway when printed on console or somewhere else.
Kind regards
robert