JRuby 1.3.0 issue with memcache

I’ve been trying to get our Rails application working with JRuby. I
just upgraded to JRuby 1.3.0, and came across an error with memcache.
I would love to debug this, but I’m not even sure where to start. Any
thoughts would be appreciated.

Here’s some output from loading the console with both JRuby 1.2.0 and
1.3.0…

$ java -version
java version “1.6.0_07”
Java™ SE Runtime Environment (build 1.6.0_07-b06-153)
Java HotSpot™ 64-Bit Server VM (build 1.6.0_07-b06-57, mixed mode)

$ jruby --version
jruby 1.2.0 (ruby 1.8.6 patchlevel 287) (2009-06-06 rev 6586)
[x86_64-java]

$ jruby script/console
Loading development environment (Rails 2.3.2)
<jruby_home>/vendor/rails/actionpack/lib/action_controller/middleware_stack.rb:84
warning: given block not used

Rails.cache.write(‘foo’, 1)
=> true

Rails.cache.read(‘foo’)
=> 1

$ jruby --version
jruby 1.3.0 (ruby 1.8.6p287) (2009-06-06 5dc2e22) (Java HotSpot™
64-Bit Server VM 1.6.0_07) [x86_64-java]

$ jruby script/console
Loading development environment (Rails 2.3.2)
<rails_app>/vendor/rails/activesupport/lib/active_support/vendor/memcache-client-1.6.5/memcache.rb:857:in
initialize': execution expired (Timeout::ExitException) from <rails_app>/vendor/rails/activesupport/lib/active_support/vendor/memcache-client-1.6.5/memcache.rb:805:insocket’
from
<rails_app>/vendor/rails/activesupport/lib/active_support/vendor/memcache-client-1.6.5/memcache.rb:788:in
alive?' from <rails_app>/vendor/rails/actionpack/lib/action_controller/session/mem_cache_store.rb:19:ininitialize’
from
<rails_app>/vendor/rails/actionpack/lib/action_controller/session/mem_cache_store.rb:19:in
each' from <rails_app>/vendor/rails/actionpack/lib/action_controller/session/mem_cache_store.rb:19:inany?’
from
<rails_app>/vendor/rails/actionpack/lib/action_controller/session/mem_cache_store.rb:19:in
initialize' from <rails_app>/vendor/rails/actionpack/lib/action_controller/middleware_stack.rb:72:inbuild’
from
<rails_app>/vendor/rails/actionpack/lib/action_controller/middleware_stack.rb:116:in
each' ... 13 levels... from <jruby_home>/lib/ruby/1.8/irb/init.rb:21:insetup’
from <jruby_home>/lib/ruby/1.8/irb.rb:54:in `start’
from <jruby_home>/bin/jirb:19


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Darren Boyd wrote:

    from <rails_app>/vendor/rails/activesupport/lib/active_support/vendor/memcache-client-1.6.5/memcache.rb:788:in

It looks like it’s trying to connect to memcached and timing out. We did
build a new pure-Java “timeout” library, but it seems to be working ok.

You should probably toss this into a bug here:

http://jira.codehaus.org/browse/JRUBY

And ideally if you can come up with a simple app that exhibits the same
problem, we’ll be able to reproduce it more quickly. Definitely seems
like it’s a problem connecting to memcached though.

  • Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

On Sun, Jun 7, 2009 at 11:18 AM, Charles Oliver
Nutter[email protected] wrote:

It looks like it’s trying to connect to memcached and timing out. We did
build a new pure-Java “timeout” library, but it seems to be working ok.

I’ve tracked the issue down to the new timeout library. It is timing
out on the Memcache::Server’s socket connection. The default timeout
is 0.5 seconds. If I increase that time, my problem goes away.

You should probably toss this into a bug here:

http://jira.codehaus.org/browse/JRUBY

And ideally if you can come up with a simple app that exhibits the same
problem, we’ll be able to reproduce it more quickly. Definitely seems like
it’s a problem connecting to memcached though.

Well, I did a bunch more testing, and suddenly the problem
disappeared. In both my Rails application (concerning MemCache), and
the little test application I wrote. You can see the test code for
the issue here…

It was giving inconsistent results every time I ran it (yeah, it was
consistently inconsistent :)). Suddenly, all the problem disappeared.
Including the issue I was having in Rails.

Obviously this sounds like something I was doing, but I can’t for the
life of me think of anything that changed. At any rate, I can’t
reproduce it at this point, so I will not be creating a bug.

If anyone wants to download the gist above and try it, that would be
great. It should pass 100% of the time.

Thanks for your help.

Darren


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Darren Boyd wrote:

I’ve tracked the issue down to the new timeout library. It is timing
out on the Memcache::Server’s socket connection. The default timeout
is 0.5 seconds. If I increase that time, my problem goes away.

Perhaps something changed in how it’s connecting? How long did you have
to make it?

Well, I did a bunch more testing, and suddenly the problem
disappeared. In both my Rails application (concerning MemCache), and
the little test application I wrote. You can see the test code for
the issue here…

125523’s gists · GitHub

It was giving inconsistent results every time I ran it (yeah, it was
consistently inconsistent :)). Suddenly, all the problem disappeared.
Including the issue I was having in Rails.

I was unable to get it to fail at all either. And other tests I tried
seemed to be ok too…

Obviously this sounds like something I was doing, but I can’t for the
life of me think of anything that changed. At any rate, I can’t
reproduce it at this point, so I will not be creating a bug.

If anyone wants to download the gist above and try it, that would be
great. It should pass 100% of the time.

It passed 100% of the time, but I also seemed to see it pass when I
didn’t expect it to. I’m going to test a bit more.

  • Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Darren Boyd wrote:

Well, I did a bunch more testing, and suddenly the problem
disappeared. In both my Rails application (concerning MemCache), and
the little test application I wrote. You can see the test code for
the issue here…

125523’s gists · GitHub

Ahh, I found a peculiarity in your code: you immediately reset timeout
to 0.

A timeout of zero causes Timeout to simply yield and never time out. So
I think this script as written is broken.

Fixing that I do see some peculiarities I’m looking into now:

Simple quick loop test…
Pass: 963
Timeout: 2
Error: 35
Simply return true test…
Pass: 999
Timeout: 0
Error: 1

  • Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Charles Oliver N. wrote:

A timeout of zero causes Timeout to simply yield and never time out. So
I think this script as written is broken.

Fixing that I do see some peculiarities I’m looking into now:

UGH. I found it. When I ported timeout, I introduced a pretty gross bug
by coercing the timeout to a long first, resulting in it being rounded
to zero if it’s under one. So in your timeout logic, the timeout of
0.9999 is actually getting rounded to zero, which shows why it sometimes
times out (and more often when running the loop).

I feel pretty dumb about this because I know I’ve corrected other impls
in the past.

So there’s a couple workarounds:

  • build from master after I commit the fix this evening
  • use timeouts of >= 1 second and don’t expect sub-second timings
  • swap in a new org/jruby/ext/Timeout.class from a dev build into
    jruby.jar

I know we had not planned on doing x.x.1 releases, but if we get more
reports of this we may have to consider it.

  • Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

On Mon, Jun 8, 2009 at 11:27 PM, Charles Oliver
Nutter[email protected] wrote:

  • build from master after I commit the fix this evening
  • use timeouts of >= 1 second and don’t expect sub-second timings
  • swap in a new org/jruby/ext/Timeout.class from a dev build into jruby.jar

I know we had not planned on doing x.x.1 releases, but if we get more
reports of this we may have to consider it.

This is now fixed in 581fde37 and reported in bug
http://jira.codehaus.org/browse/JRUBY-3743 so it’s at least resolved
on master. I will abstain from voting on spinning a 1.3.1 release at
some point, since there are workarounds for this (albeit not great
ones). What do others think?


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

Thanks for looking into this further. Figures I broke my test :(.

It’s easy enough to change the timeout in Rails from 0.5 to 1. That
should be enough to work around the problem. I don’t think this issue
merits a .1 release. It is a configuration option that can be passed
through the environment configuration of the cache store.

The memcache-client gem doesn’t use Timeout::timeout itself (rails
uses it when initializing memcache stores), so it shouldn’t be an
issue there.

I wouldn’t vote for a .1 release, since it seems easy enough to work
around.

Thanks again,
Darren

On Mon, Jun 8, 2009 at 10:14 PM, Charles Oliver
Nutter[email protected] wrote:

This is now fixed in 581fde37 and reported in bug
http://jira.codehaus.org/browse/JRUBY-3743 so it’s at least resolved
on master. I will abstain from voting on spinning a 1.3.1 release at
some point, since there are workarounds for this (albeit not great
ones). What do others think?


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

On Tue, Jun 9, 2009 at 12:34 AM, Darren Boyd[email protected]
wrote:

I wouldn’t vote for a .1 release, since it seems easy enough to work around.

Well, that’s good to hear :slight_smile: If you’re fine with it as is, I’m fine
with it as-is. We’ll see what others say. :slight_smile:

  • Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

As someow who’d have to pass the buck to my users (both of them) I’d
vote to fix the broken window and do a .1 release.

Unless 1.4 is right around the corner.

Bob

Sent from my iPhone

On Jun 9, 2009, at 1:14 AM, Charles Oliver N.
[email protected] wrote:

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email


To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

On Tue, Jun 9, 2009 at 7:05 AM, Bob McWhirter[email protected] wrote:

As someow who’d have to pass the buck to my users (both of them) I’d vote to
fix the broken window and do a .1 release.

Unless 1.4 is right around the corner.

1.4 is definitely not just around the corner, but it’s a non-trivial
amount of effort to spin a .1. I’ll talk with Tom about it today.

  • Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email