Threads and Ruby

barjunk · July 1, 2008, 6:44pm

Joel VanderWerf wrote:

Yes, sqlite does synchronize, but a potential problem is granularity: a
writer gets an exclusive lock on the entire db.

Clarification: “a potential problem for IPC”. We’re using it for IPC
between programs written in C and ruby, and haven’t had major problems
with this yet.

barjunk · July 1, 2008, 10:39pm

Iñaki Baz C. wrote:

Now I’m writing a server in Ruby using green threads (Ruby MRI).
In the future I may try to migrate it to JRuby to use real threads.

Just a question: will my code, which uses green threads for now, work correctly
under JRuby, with the green threads magically becoming OS threads? Or must I
change my code to work with OS threads?

You should not need to do anything and the threads will “just be
native”. We do that by hobbling native threads slightly so they check
for those “unsafe” events like kill, raise, and critical.

One down side to native threads is that launching a thread in JRuby is a
lot more costly than in MRI, but we also have a thread pool (somewhat
experimental, but people are using it) to mitigate that cost:

Normal:
                            user     system      total        real
control loop            0.005000   0.000000   0.005000 (  0.006000)
Thread.new.join loop    2.569000   0.000000   2.569000 (  2.569000)

-J-Djruby.thread.pool.enabled=true:
                            user     system      total        real
control loop            0.009000   0.000000   0.009000 (  0.009000)
Thread.new.join loop    0.655000   0.000000   0.655000 (  0.654000)

This improves more with a longer run.
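For anyone who wants to reproduce numbers of this shape, here is a minimal sketch of such a benchmark. The iteration count and the exact loop bodies are assumptions, since the original script was not posted; only the two report labels come from the output above.

```ruby
require "benchmark"

# Hypothetical iteration count; the original post does not say how many
# iterations were used.
ITERATIONS = 1_000

results = Benchmark.bm(22) do |bm|
  # Baseline: an empty loop, to show the cost of the loop itself.
  bm.report("control loop") { ITERATIONS.times { } }

  # Create a thread that does nothing and wait for it, once per iteration.
  bm.report("Thread.new.join loop") do
    ITERATIONS.times { Thread.new { }.join }
  end
end
```

Running the same file under MRI and under JRuby (with and without the thread pool flag quoted above) should show the relative thread spin-up cost on a given machine, though absolute numbers will differ from those in this post.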

Charlie

barjunk · July 1, 2008, 11:02pm

On Tuesday, July 1, 2008, Charles Oliver N. wrote:

native". We do that by hobbling native threads slightly so they check

-J-Djruby.thread.pool.enabled=true:
                            user     system      total        real
control loop            0.009000   0.000000   0.009000 (  0.009000)
Thread.new.join loop    0.655000   0.000000   0.655000 (  0.654000)

This improves more with a longer run.

Great. Thanks a lot.

barjunk · July 1, 2008, 11:19pm

ara.t.howard wrote:

one of the reasons this is true is that for a heavily threaded ruby
program (green threads) you end up with the entire process sometimes
blocked on io and the threads end up getting into a pattern where all of
them need to write at once - a kind of rhythm - with processes the
ability for the OS to schedule access to resources ends up staggering
the phase of execution so access is generally faster than it ‘ought’ to
be taking only TPS into account.

Of course that’s mostly a factor of Ruby’s rather simplistic thread
scheduling, which has a 10ms timeslice and a fairly basic selection
algorithm. Obviously OS scheduling will be better/more advanced, but
that applies equally well to native threads (like in JRuby).

Charlie

barjunk · July 1, 2008, 8:35pm

On Tuesday, July 1, 2008, Charles Oliver N. wrote:

trying threads under JRuby first.
Now I’m writing a server in Ruby using green threads (Ruby MRI).
In the future I may try to migrate it to JRuby to use real threads.

Just a question: will my code, which uses green threads for now, work correctly
under JRuby, with the green threads magically becoming OS threads? Or must I
change my code to work with OS threads?

Thanks for any explanation.

barjunk · July 2, 2008, 6:31pm

On Jul 1, 8:49 am, “ara.t.howard” wrote:

one of the reasons this is true is that for a heavily threaded ruby
mention it…

a @http://codeforpeople.com/

we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

OK, I’m going to try to recap what I learned so far from reading links
and making some assumptions (bad):

1 - Green threads are threads that run in user space.
2 - Forking allows for running multiple ruby interpreters, each with
their own memory space.
3 - Ruby provides a thread mechanism, but these threads are
serialized.
4 - Native threads are used in JRuby, and in Ruby 1.9 YARV
5 - If you use ruby threads, each thread shares the memory space of
all the others.

Hopefully I hit on all the major points that were made. It looks like
forking is what I’m looking for in my use case:

I want sandboxing for each ‘thread’
I don’t want one ‘thread’ to block another

Thanks for the discussion guys. Seems like there is more to be
learned though.

Mike B.

barjunk · July 2, 2008, 8:15pm

On 02.07.2008 18:29, barjunk wrote:

OK, I’m going to try to recap what I learned so far from reading links
and making some assumptions (bad):

1 - Green threads are threads that run in user space.

Native threads do this as well. The difference between native and green
threads is who does the scheduling: native threads are scheduled by the
OS’s scheduler and can use all cores available in a system. Green
threads are scheduled by a piece of user code (from the OS’s
perspective).

2 - Forking allows for running multiple ruby interpreters, each with
their own memory space.

Correct.

3 - Ruby provides a thread mechanism, but these threads are
serialized.

Yes. Ruby’s threads are green threads. I would not call them
“serialized” though, because that sounds as if one complete thread were
executed after the other. Green threads still appear to execute
concurrently, although no two threads can be active at the same
instant.

4 - Native threads are used in JRuby, and in Ruby 1.9 YARV

JRuby yes. I am not sure about 1.9 - I believe native threads are not
yet completely supported there.

5 - If you use ruby threads, each thread shares the memory space of
all the others.

This is true for both thread models. Actually this is one of the core
differences between threads and processes: since multiple threads can
use the same process space they automatically share memory. Processes
are separate by default and can only share memory explicitly via
operating system specific means.
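The difference can be sketched in a few lines. This is only an illustration under some assumptions: the variable names are made up, and the `fork` half runs only where the platform supports it (it is guarded with `Process.respond_to?(:fork)`, which is false on Windows and on JRuby).

```ruby
counter = 0

# A thread runs inside the same process, so it changes the very same variable.
Thread.new { counter += 1 }.join
after_thread = counter

child_view = nil
if Process.respond_to?(:fork)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    counter += 100        # changes only the child's private copy
    writer.puts counter   # report the child's view through the pipe
    writer.close
    exit!(0)              # leave the child at once, skipping at_exit handlers
  end

  writer.close
  child_view = reader.read.to_i
  reader.close
  Process.wait(pid)
end

puts "after thread: #{after_thread}"
puts "child saw:    #{child_view.inspect}"
puts "parent has:   #{counter}"
```

The pipe is the "explicit" part: the child's change never reaches the parent unless the child sends it over some operating system channel.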

Hopefully I hit on all the major points that were made.

Not exactly but for the most part.

It looks like
forking is what I’m looking for in my use case:

I want sandboxing for each ‘thread’

I don’t want one ‘thread’ to block another

Thanks for the discussion guys. Seems like there is more to be
learned though.

I always recommend Doug Lea’s book. Although it deals specifically with
Java, I find the general mechanisms very well explained, and those apply
to other programming languages as well.

http://www.amazon.com/dp/0201310090

Kind regards

robert

barjunk · July 2, 2008, 10:29pm

On Wednesday, July 2, 2008, Charles Oliver N. wrote:

Robert K. wrote:

On 02.07.2008 18:29, barjunk wrote:

4 - Native threads are used in JRuby, and in Ruby 1.9 YARV

JRuby yes. I am not sure about 1.9 - I believe native threads are not
yet completely supported there.

Ruby 1.9’s threads are native, but they are not allowed to run in
parallel because most of the core structures in Ruby are not thread-safe
(as in kernel-level thread safe).

AFAIK one of the advantages of green threads is that they are a bit
faster.
What is the purpose of native threads in Ruby 1.9 if they cannot run in
parallel?

barjunk · July 3, 2008, 1:08am

Iñaki Baz C. wrote:

AFAIK one of the advantages of green threads is that they are a bit faster.
What is the purpose of native threads in Ruby 1.9 if they cannot run in
parallel?

I think there are several, but one I know of is that it simplifies the
logic required to block on IO without blocking other threads, since you
can just let another thread get scheduled (essentially unlock/lock GIL
around the blocking IO). I’m not sure if they actually made that change
though.
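From the Ruby side, the visible effect is simply that a thread stuck in a blocking read does not stop the others. A small sketch, using a pipe as a stand-in for any slow IO source (the variable names are made up for illustration):

```ruby
reader, writer = IO.pipe

# This thread blocks in IO until a full line arrives on the pipe.
blocked = Thread.new { reader.gets }

# Meanwhile the main thread is free to keep working.
progress = 0
1_000.times { progress += 1 }

# Only now do we supply the data the other thread is waiting for.
writer.puts "done"
line = blocked.value   # join the thread and fetch the block's result

writer.close
reader.close

puts "main thread did #{progress} steps, reader got #{line.inspect}"
```

This says nothing about how a given interpreter achieves it internally; it only shows the behaviour the paragraph above is describing.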

Actually, it’s probably more of a stepping stone toward full native,
parallel threads.

And green threads are no faster than regular threads, but are often
cheaper to create.

Charlie

barjunk · July 2, 2008, 9:04pm

Robert K. wrote:

On 02.07.2008 18:29, barjunk wrote:

4 - Native threads are used in JRuby, and in Ruby 1.9 YARV

JRuby yes. I am not sure about 1.9 - I believe native threads are not
yet completely supported there.

Ruby 1.9’s threads are native, but they are not allowed to run in
parallel because most of the core structures in Ruby are not thread-safe
(as in kernel-level thread safe).

Charlie

barjunk · July 3, 2008, 7:21am

On Tue, Jul 1, 2008 at 1:31 PM, Charles Oliver N. wrote:

Of course that’s mostly a factor of Ruby’s rather simplistic thread
scheduling, which has a 10ms timeslice and a fairly basic selection
algorithm. Obviously OS scheduling will be better/more advanced, but that
applies equally well to native threads (like in JRuby).

Reminds me of the last time I looked at ruby’s thread scheduling,
every wakeup from select() caused a traversal of the entire list of
threads to see if any were blocked on #sleep(). This has horrid
performance characteristics for large numbers of threads doing
periodic actions.

Rewriting the code to be single-threaded, and keeping a priority queue of
waiting actions, solved my app’s performance problem, but also made the
point that the OS works very hard to have good scheduling behaviour
for threads under wide usage patterns, whereas ruby’s scheduler just
gets the job done.
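For the curious, the idea can be sketched roughly like this. It is not the code from my app: the class name `TimerQueue` and its methods are invented for this illustration, and it keeps a sorted array rather than a real heap to stay short.

```ruby
# A single-threaded queue of pending actions, ordered by due time.
# Hypothetical sketch; names are made up for illustration.
class TimerQueue
  Entry = Struct.new(:due_at, :action)

  def initialize
    @entries = []   # always kept sorted by due_at
  end

  # Insert the action at its sorted position (binary search), after any
  # entries that are due at the same time.
  def schedule(due_at, &action)
    index = @entries.bsearch_index { |e| e.due_at > due_at } || @entries.size
    @entries.insert(index, Entry.new(due_at, action))
    self
  end

  # When the next action is due, or nil if nothing is waiting. A main loop
  # can use this as its select() timeout instead of scanning every waiter.
  def next_due_at
    @entries.first&.due_at
  end

  # Run everything that is due at or before +now+; only the front of the
  # queue is ever inspected.
  def run_due(now)
    ran = 0
    while (entry = @entries.first) && entry.due_at <= now
      @entries.shift
      entry.action.call
      ran += 1
    end
    ran
  end
end

log = []
queue = TimerQueue.new
queue.schedule(5.0) { log << :five }
queue.schedule(1.0) { log << :one }
queue.schedule(3.0) { log << :three }

ran = queue.run_due(3.0)   # times are plain numbers here to keep it deterministic
```

The point is that a wakeup only looks at the head of the queue, instead of walking every sleeping thread.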

Cheers,
Sam

barjunk · July 3, 2008, 4:01pm

Robert K. wrote:

some tests with a Java 6 JVM some time ago and thread creation is
extremely fast - at least on my Windows XP box.

It’s fast, but it’s not nothing…JRuby has a thread pool option
(-J-Djruby.thread.pool.enabled=true) that improves thread-spin-up quite
a bit. The remaining cost is just because we have heavier in-memory
structures per-thread than CRuby does, mostly for performance reasons.

Eventually I’d like to trim that down, so that pooled threads results in
almost free thread spin-up.
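To show why pooling helps, here is a pure-Ruby sketch of the general technique. It is not how JRuby’s internal pool is implemented; the class `SimplePool` and its methods are hypothetical, and the only point is that a fixed set of threads is started once and then reused for many small jobs.

```ruby
require "thread"   # Queue lives here on older Rubies; harmless on newer ones

# Hypothetical minimal worker pool, for illustration only.
class SimplePool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker keeps pulling jobs until it sees the shutdown marker.
        while (job = @jobs.pop) != :shutdown
          job.call
        end
      end
    end
  end

  # Queue a block to be run by whichever worker is free next.
  def submit(&job)
    @jobs << job
    self
  end

  # Send one shutdown marker per worker, then wait for all of them.
  def shutdown
    @workers.size.times { @jobs << :shutdown }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = SimplePool.new(4)
100.times { |i| pool.submit { results << i * 2 } }
pool.shutdown

puts "ran #{results.size} jobs on 4 threads"
```

One hundred jobs run here, but only four threads are ever created, which is where the saving comes from.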

Charlie

barjunk · July 13, 2008, 5:15am

On Wednesday 02 July 2008 18:05:01 Charles Oliver N. wrote:

Actually, it’s probably more of a stepping stone toward full native,
parallel threads.

I really hope this is true. I’ve been building a threading toy that I’ve
since learned is an implementation (rediscovery?) of the Actor model. It’s
inspired (loosely) by Erlang, but I can’t bring it up with Erlang people
without being laughed out of the room by the GIL!

While mine is a hobby project, it would still be very cool if it became
multicore-aware.
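For readers who have not met the Actor model, a stripped-down sketch of the general shape follows. This is not my toy project; the class `TinyActor` and its methods are made up for this post. Each actor owns a mailbox and a single thread, and the only way to interact with it is to send it a message.

```ruby
require "thread"   # Queue lives here on older Rubies; harmless on newer ones

# Hypothetical minimal actor, for illustration only.
class TinyActor
  def initialize(&handler)
    @mailbox = Queue.new
    @thread = Thread.new do
      # Messages are handled one at a time, in arrival order.
      while (message = @mailbox.pop) != :stop
        handler.call(message)
      end
    end
  end

  # Asynchronous send: just drop the message in the mailbox.
  def send_message(message)
    @mailbox << message
    self
  end

  # Ask the actor to finish after draining earlier messages, then wait.
  def stop
    @mailbox << :stop
    @thread.join
  end
end

received = []
actor = TinyActor.new { |message| received << message }
actor.send_message(1).send_message(2).send_message(3)
actor.stop

puts received.inspect
```

Under a GIL, many such actors still take turns on one core; with parallel native threads the same code could spread across cores.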

barjunk · July 3, 2008, 11:22am

2008/7/3 Charles Oliver N.:

Actually, it’s probably more of a stepping stone toward full native,
parallel threads.

That’s what I believe, too.

And green threads are no faster than regular threads, but are often cheaper
to create.

But these days overhead of OS thread creation is negligible. I did
some tests with a Java 6 JVM some time ago and thread creation is
extremely fast - at least on my Windows XP box.

Cheers

robert

barjunk · July 13, 2008, 7:51pm

but I can’t bring it up with Erlang people without being
laughed out of the room by the GIL!

I’m not sure that makes a lot of sense … SMP/multithreaded Erlang was
not mainstream for a long time … I’m not 100% sure it is now. Non-SMP
Erlang is basically greenthreads with good threading/scheduling and good
interprocess messaging. Until the Erlang VM(s) started supporting
multiple threads per emulator process, they (for some definition of
“they”) pretty much sold single kernel threading as a feature …