Forum: Ruby
Ruby and threading

Carter Cheng (Guest)
on 2011-10-16 10:51
(Received via mailing list)
Hello,

Is the current ruby 1.9.2 multithreaded at the OS level?

Regards,

Carter.
Josh Cheek (josh-cheek)
on 2011-10-16 11:04
(Received via mailing list)
On Sun, Oct 16, 2011 at 3:50 AM, Carter Cheng <cartercheng@gmail.com>
wrote:

> Hello,
>
> Is the current ruby 1.9.2 multithreaded at the OS level?
>
> Regards,
>
> Carter.
>

For MRI, yes, but it has a global interpreter lock. Here's a blog that
explains all the nuances.
http://www.engineyard.com/blog/2011/ruby-concurren...
Carter Cheng (Guest)
on 2011-10-16 13:40
(Received via mailing list)
Hi Josh,

Thanks for the reply. I was looking over the YARV implementation in 1.9.3rc1
and noticed some support for pthreads and Ruby threads (and some description
of certain models in the comments). Do you (or does anyone else) know if this
code is fully implemented at present? Or is it a global-lock-like situation?

Regards,

Carter.
Steve Klabnik (Guest)
on 2011-10-16 13:58
(Received via mailing list)
Hey Carter-

MRI will not be removing the GIL any time soon. For true concurrency,
you should use JRuby or Rubinius.

-Steve
Alexey Petrushin (axyd80)
on 2011-10-16 19:08
Are there any big projects using JRuby? It seems that JRuby has been available
for a long time, but still hasn't gotten much attention.
Josh Cheek (josh-cheek)
on 2011-10-16 21:01
(Received via mailing list)
On Sun, Oct 16, 2011 at 12:09 PM, Alexey Petrushin <axyd80@gmail.com>
wrote:

> Is there any big projects using jRuby? It seems that jRuby is available
> for a long time, but still has not very much attention.
>
>
Obtiva (now owned by Groupon) gave a JRuby workshop at Red Dirt Ruby Conf.
The presenters, Tyler Jennings and Noel Rappin, said that they were using
JRuby on a large client site that was initially written in Java. Their new
portion was written in Rails, and the two were coexisting via JRuby.
Tony Arcieri (Guest)
on 2011-10-16 21:09
(Received via mailing list)
JRuby is being used at several companies including LinkedIn and Square. It
also powers the HBase console.
Steve Klabnik (Guest)
on 2011-10-16 21:38
(Received via mailing list)
I've done some consulting work, and JRuby was being used heavily.
"Sandor Szücs" <sandor.szuecs@fu-berlin.de> (Guest)
on 2011-10-16 21:54
(Received via mailing list)
On 10/16/11 1:57 PM, Steve Klabnik wrote:
> MRI will not be removing the GIL any time soon. For true
> concurrency, you should use JRuby or Rubinius.

or MacRuby.

--
All the best, Sandor Szücs
Richard Conroy (Guest)
on 2011-10-16 22:36
(Received via mailing list)
On Sun, Oct 16, 2011 at 7:59 PM, Josh Cheek <josh.cheek@gmail.com>
wrote:

> and
> then their new portion was written in Rails, and they were coexisting via
> JRuby.
>

In London there are a few big name companies following this MO.

They wrap up a legacy Java app with Ruby based acceptance tests
(cucumber/capybara is popular), and then extend the app with Ruby based
extensions (no new production code with Java).

JRuby glues up a lot of it - companies have put a lot of investment in a
Java-based operations backend. JRuby lets them keep it. It's just the JSP
grunge that they want to be rid of.
Carter Cheng (Guest)
on 2011-10-17 08:47
(Received via mailing list)
I gather then that the 1.9.2 implementation at present does not "exploit" OS
level threads (whereas JRuby borrows them from the underlying JVM in some
manner)?
Austin Ziegler (austin)
on 2011-10-17 18:30
(Received via mailing list)
On Mon, Oct 17, 2011 at 2:47 AM, Carter Cheng <cartercheng@gmail.com>
wrote:
> I gather then the 1.9.2. implementation at present does not "exploit" OS
> level threads (where as JRuby borrows them from the underlying JVM in some
> manner)?

You gather incorrectly. Ruby 1.9.2 does use underlying OS threads. For
a variety of reasons, however, there is a Global Interpreter Lock as
Josh Cheek pointed out to you (and provided this link:
http://www.engineyard.com/blog/2011/ruby-concurren...).

-a
Alexey Petrushin (axyd80)
on 2011-10-17 22:02
Maybe it's a stupid question, but I can't get it - what's the point of
using OS threads with a GIL?

You'll never get a situation where two threads run simultaneously anyway,
so what's the point of it?
Tony Arcieri (Guest)
on 2011-10-17 22:15
(Received via mailing list)
On Mon, Oct 17, 2011 at 1:02 PM, Alexey Petrushin <axyd80@gmail.com>
wrote:

> Maybe it's a stupid question, but I can't get it - what's the point of
> using OS threads with GIL?
>
> You'll anyway never get situation where two thread runs simultaneously,
> so, what's the point of it?


Threads can still run simultaneously; they just can't make changes to Ruby
objects or the Ruby environment.

A native extension can release the GIL and do blocking I/O or perform a
complex computation (e.g. crypto) while Ruby code is running in another
thread.
Josh Cheek (josh-cheek)
on 2011-10-17 22:16
(Received via mailing list)
On Mon, Oct 17, 2011 at 3:02 PM, Alexey Petrushin <axyd80@gmail.com>
wrote:

> Maybe it's a stupid question, but I can't get it - what's the point of
> using OS threads with GIL?
>
> You'll anyway never get situation where two thread runs simultaneously,
> so, what's the point of it?
>
"MRI 1.9 uses the same technique as MRI 1.8 to improve the situation, namely
the GIL is released if a Thread is waiting on an external event (normally
IO) which improves responsiveness."

-- http://www.engineyard.com/blog/2011/ruby-concurren...
Robert Klemme (robert_k78)
on 2011-10-17 22:17
(Received via mailing list)
On Mon, Oct 17, 2011 at 10:02 PM, Alexey Petrushin <axyd80@gmail.com>
wrote:
> Maybe it's a stupid question, but I can't get it - what's the point of
> using OS threads with GIL?
>
> You'll anyway never get situation where two thread runs simultaneously,
> so, what's the point of it?

No, with a GIL no two threads can concurrently hold the lock.  But a
thread which does not need the lock can execute in parallel (e.g.
while doing a syscall).  For specifics you would have to ask Matz or
read the source.

Kind regards

robert
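
A minimal sketch of the point made in the replies above, assuming that sleep is an acceptable stand-in for a blocking call such as a socket read. The helper names elapsed, wait_serially, and wait_in_threads are made up for illustration. Even with a GIL, the threaded run should take roughly the time of one wait rather than four, because a waiting thread does not hold the lock:

```ruby
# Run the block and return the elapsed wall-clock time in seconds.
def elapsed
  start = Time.now
  yield
  Time.now - start
end

# Wait count times, one after another, in the calling thread.
# sleep stands in for blocking I/O (an assumption for this sketch).
def wait_serially(count, seconds)
  elapsed { count.times { sleep seconds } }
end

# Wait count times, each in its own thread, and join them all.
def wait_in_threads(count, seconds)
  elapsed do
    threads = Array.new(count) { Thread.new { sleep seconds } }
    threads.each(&:join)
  end
end

puts "serial:   %0.2f seconds" % wait_serially(4, 0.2)
puts "threaded: %0.2f seconds" % wait_in_threads(4, 0.2)
```

This only shows overlap of waiting. CPU-bound Ruby code in several threads would still take turns on an implementation with a GIL.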
Josh Cheek (josh-cheek)
on 2011-10-19 09:14
(Received via mailing list)
On Sun, Oct 16, 2011 at 4:03 AM, Josh Cheek <josh.cheek@gmail.com>
wrote:

>
> For MRI, yes, but it has a global interpreter lock. Here's a blog that
> explains all the nuances.
> http://www.engineyard.com/blog/2011/ruby-concurren...
>


A great followup to this post explains why the GIL exists:
http://merbist.com/2011/10/18/data-safety-and-gil-removal/

Here is what I got when I ran the code Matt provides under MRI 1.9.3 (has GIL)
and Rubinius, JRuby, MacRuby (native threads, no GIL):


$ rvm 1.9.3-rc1,rbx-2.0.0pre,jruby-1.6.4,macruby-0.10 do ruby needs_gil.rb

ruby-1.9.3-rc1
0.064 seconds
400000 elements in array (should be 400000)

rbx-2.0.0pre
0.232 seconds
398877 elements in array (should be 400000)

jruby-1.6.4
0.069 seconds
398709 elements in array (should be 400000)

macruby-0.10
0.076 seconds
366231 elements in array (should be 400000)




$ cat needs_gil.rb
puts '', ENV['RUBY_VERSION']

@array, threads = [], []
start = Time.now
4.times do
  threads << Thread.new { (1..100_000).each {|n| @array << n} }
end
threads.each{|t| t.join }
stop = Time.now

puts "%0.3f seconds" % (stop - start), @array.size




Note:
* Other times I ran it under Rubinius, the array got corrupted or something:
  "Tuple::copy_from: index 8092 out of bounds for size 5395
  (Rubinius::ObjectBoundsExceededError)"
* Other times I ran it under JRuby, it detected the corrupt data with
  'ConcurrencyError: Detected invalid array contents due to unsynchronized
  modifications with concurrent users'
* I ran this a whole bunch of times; sometimes MRI was fastest, sometimes
  MacRuby, sometimes JRuby (MRI was fastest most consistently, though)


Thoughts:
* MRI has a GIL, thus keeping the data safe, and still performs equivalently
  with other implementations (for this admittedly limited test), so do
  benchmarks to decide if this will be worthwhile. It's not a fluke that Matz
  wants to keep the GIL.
* I'm glad JRuby notices the corrupt data (though not always). I'm a big fan
  of fail-fast.
* Has JRuby fixed their startup time issue? I ran this a lot of times and
  didn't notice any of the lag I used to.
"Sandor Szücs" <sandor.szuecs@fu-berlin.de> (Guest)
on 2011-10-19 14:57
(Received via mailing list)
On 10/19/11 9:11 AM, Josh Cheek wrote:

> $ cat needs_gil.rb
> puts '', ENV['RUBY_VERSION']
>
> @array, threads = [], []
> start = Time.now
> 4.times do
>   threads << Thread.new { (1..100_000).each {|n| @array << n} }
> end
> threads.each{|t| t.join }
> stop = Time.now
>
> puts "%0.3f seconds" % (stop - start), @array.size

I think it's pretty obvious that implementations without a GIL should
behave exactly as you have shown.
You did not use synchronization to append an item to your shared
object, so sometimes items will get lost.
--
All the best, Sandor Szücs
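
For comparison, a minimal sketch of needs_gil.rb with the missing synchronization added. It assumes a plain Mutex around each append is acceptable for this test, and the method name synchronized_append is made up for illustration. The element count no longer depends on whether the implementation has a GIL:

```ruby
# Append 1..per_thread from several threads, holding a Mutex for each append
# so that only one thread modifies the shared array at a time.
def synchronized_append(thread_count, per_thread)
  array = []
  lock  = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      (1..per_thread).each { |n| lock.synchronize { array << n } }
    end
  end
  threads.each(&:join)
  array
end

result = synchronized_append(4, 100_000)
puts "#{result.size} elements in array (should be 400000)"
```

Locking on every single append is the simplest correct choice, not the fastest one. Collecting per-thread results and merging them afterwards avoids most of the contention.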
Josh Cheek (josh-cheek)
on 2011-10-19 15:27
(Received via mailing list)
On Wed, Oct 19, 2011 at 7:56 AM, Sandor Szücs
<sandor.szuecs@fu-berlin.de> wrote:

> >
> > Thread.new { (1..100_000).each {|n| @array << n} }
> > end
> > threads.each{|t| t.join }
> > stop = Time.now
> >
> > puts "%0.3f seconds" % (stop - start), @array.size
>
> I think it's pretty obvious, that implementations without GIL should
> behave exactly as you have shown.
> You did not use synchronization to append an item to your shared
> object, so sometimes items will get lost.
> - --
>

That's the point, to show why the GIL is useful.
Steve Klabnik (Guest)
on 2011-10-19 16:38
(Received via mailing list)
It's not useful, it's hiding a problem: your code isn't thread-safe.
Josh Cheek (josh-cheek)
on 2011-10-19 17:35
(Received via mailing list)
On Wed, Oct 19, 2011 at 9:38 AM, Steve Klabnik
<steve@steveklabnik.com> wrote:

> It's not useful, it's hiding a problem: your code isn't thread-safe.
>
>
When it's in a lib you don't control, it's useful. A wrapped C lib is
probably the best example, but even if it's in someone else's gem, it's
relevant. Debugging libs takes a lot of time and effort, especially for
something nondeterministic that may or may not reproduce itself.

Rbx / JRuby / MacRuby will need resources to educate their users about how
to write safe code, and may also need to maintain a list of safe gems.
Tony Arcieri (Guest)
on 2011-10-19 19:01
(Received via mailing list)
On Wed, Oct 19, 2011 at 8:32 AM, Josh Cheek <josh.cheek@gmail.com>
wrote:

> When it's in a lib you don't control, it's useful. A wrapped C lib is
> probably the best example, but even if its in someone else's gem, it's
> relevant. Debugging libs takes a lot of time and effort, especially for
> something nondeterministic that may or may not reproduce itself.
>
> Rbx / JRuby / MacRuby will need resources to educate their users about how
> to write safe code, and may also need to maintain a list of safe gems.
>

The GIL doesn't eliminate the importance of writing thread-safe code; it just
masks the symptoms of unsafe code in certain cases.

It's not some magical panacea. Users of Ruby 1.9 who use threads need to be
just as vigilant about the thread safety of their code. Otherwise their
programs are working by accident, such as your "needs_gil.rb". The fact that
that program works is an implementation detail of the GIL on 1.9.2; it's not as
if Matz has said "all operations on arrays should be automatically thread
synchronized", which is still possible on VMs without a GIL (although I'm
surprised the above doesn't trigger a ConcurrentModificationException on
JRuby).

As an example of thread safety problems even with a GIL, try running the
following on 1.9.2:

-- snip --
require 'thread'

numbers = [0]
threads = []
#lock = Mutex.new

100.times do
  threads << Thread.new do
    size = nil

    begin
      #lock.synchronize do
        value = rand(101)
        if value == numbers.last + 1
          sleep 0.01
          numbers << value
        end

        size = numbers.size
      #end
    end while size < 100
  end
end

threads.each(&:join)
p numbers
-- snip --

It's a bit contrived and goofy, but whatever. The output should look like
this:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77,
78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96,
97, 98, 99, 100]

However, if you run the above code with the lock commented out, even on 1.9.2
you'll get something like this instead:

[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

Broken, thread-unsafe code is broken, thread-unsafe code. It doesn't matter
if it magically happens to work by accident in 1.9.2. It's still broken. The
GIL doesn't mean you can avoid teaching users of threads about thread
safety, or that you don't need to maintain a list of safe gems
(railsplugins.org is trying to do this, BTW, although I wish it were built
into RubyGems). You still need to do these things on 1.9.2. The difference
is the GIL will *mask* certain thread safety bugs, but not all of them.

"I don't need to worry about thread safety because the GIL will take care of
it for me" is definitely the wrong attitude.
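
The bug in the snippet above is a check-then-act race: the test and the append are two separate steps, and another thread can get in between them, GIL or no GIL. A smaller sketch of the same pattern with the lock in place follows. The Account class and the run_withdrawals helper are hypothetical names invented for this illustration, and the short sleep only widens the window a race would need, as in the post:

```ruby
# Hypothetical account: withdraw only if the balance covers the amount.
class Account
  attr_reader :balance

  def initialize(balance)
    @balance = balance
    @lock = Mutex.new
  end

  # The check and the update happen under one lock, so no other thread
  # can change the balance between them.
  def withdraw(amount)
    @lock.synchronize do
      return false if @balance < amount
      sleep 0.001 # widen the gap between check and update
      @balance -= amount
      true
    end
  end
end

# Start thread_count threads that each try one withdrawal, then report
# the final balance and how many withdrawals succeeded.
def run_withdrawals(start_balance, amount, thread_count)
  account = Account.new(start_balance)
  threads = Array.new(thread_count) { Thread.new { account.withdraw(amount) } }
  successes = threads.map(&:value).count(true)
  [account.balance, successes]
end

balance, successes = run_withdrawals(100, 10, 20)
puts "balance: #{balance}, successful withdrawals: #{successes}"
```

With the synchronize call removed, several threads can pass the balance check before any of them subtracts, which is the same failure mode as the repeated 1s and 2s in the output above.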
Regis d'Aubarede (raubarede)
on 2011-10-19 20:16
Steve Klabnik wrote in post #1026864:

> MRI will not be removing the GIL any time soon. For true concurrency,
> you should use JRuby or Rubinius.

or IronRuby!

See
http://regisaubarede.posterous.com/tag/multicore
for a test comparing actual processor usage.
Charles Nutter (headius)
on 2011-10-27 04:00
(Received via mailing list)
On Wed, Oct 19, 2011 at 2:11 AM, Josh Cheek <josh.cheek@gmail.com>
wrote:
> A great followup to this post, explains why the GIL exists
> http://merbist.com/2011/10/18/data-safety-and-gil-removal/
>
> When I ran the code Matt provides under MRI 1.9.3 (has GIL) and Rubinius,
> JRuby, MacRuby (native threads, no GIL):

Ok, I can't let this one sit.

To my eyes, the only one broken there is MRI. It's not actually doing
anything in parallel, so you get the synchronous result. Perhaps I
should file a bug against MRI that its threads...aren't?

In all seriousness, though, this is flawed reasoning. Spinning up
threads is asking the runtime to do something in parallel, and MRI is
the only example here not delivering. You are asking for the result
you get under JRuby, Rubinius, and MacRuby, since you don't
synchronize any access to the shared array, and the shared array does
not (according to Matz himself) have thread safety as part of its
contract. The only reason you get the other result under MRI is
because it isn't actually doing what you've asked of it.

Saying that the GIL is useful based on this example is a bit like
saying "JRuby not supporting C extensions is useful because they'll
never crash due to C extensions." You can't compare lack of
parallelism with parallelism when you're trying to demonstrate
parallelism.

> * Other times I ran it under JRuby, it detected the corrupt data with
> 'ConcurrencyError: Detected invalid array contents due to unsynchronized
> modifications with concurrent users'

We do our best to detect this for Array, and at some point we'll try
to do it for Hash (Hash will currently raise errors from Java like
ArrayIndexOutOfBoundsException...still rescuable, but not as nice). It
would be cool if Ruby incorporated some explicitly thread-safe
collections by default, but there are gems that provide such things
right now.

FWIW, it's almost impossible to have threadsafe data structures that
perform as well as non-threadsafe data structures, which is why we've
always opted to keep Array and Hash the way they are. Hopefully people
are starting to learn that the alternatives aren't that bad, like
using external threadsafe libs or simply mutexing around all accesses.

> * I ran this a whole bunch of times, sometimes MRI was fastest, sometimes
> MacRuby, sometimes JRuby (MRI was fastest most consistently, though)

For a run that short, I'm not surprised. JRuby would be faster if it
ran for more than...what...0.07 seconds? I ran a longer version
without threads (so it wouldn't error out) and JRuby was clearly the
fastest. I also wrote a version that uses a JRuby-specific module for
thread-safety, and it only slowed down by about 2x...but it completes
successfully every time:

require 'jruby/synchronized'
puts '', ENV['RUBY_VERSION']

class SafeArray < Array
  include JRuby::Synchronized
end

10.times do
  @array, threads = SafeArray.new, []
  start = Time.now
  4.times do
    threads << Thread.new { (1..100_000).each {|n| @array << n} }
  end
  threads.each{|t| t.join }
  stop = Time.now

  puts "%0.3f seconds" % (stop - start), @array.size
end

> Thoughts:
> * MRI has a GIL, thus keeping the data safe, and still performs equivalently
> with other implementations (for this admittedly limited test), so do
> benchmarks to decide if this will be worthwhile. It's not a fluke that Matz
> wants to keep the GIL.

False safety (you can still easily have threads step on each other) at
the expense of parallelism. I'm not sure that's a win.

Also, I don't think Matz has ever said he really "wants" to keep the
GIL. It's just a massively difficult thing to retrofit MRI for
parallel threading without a very large rework. If they could drop the
GIL without destabilizing MRI itself, I'm sure they'd do it.

> * I'm glad JRuby notices the corrupt data (though not always) I'm a big fan
> of fail-fast

It only fails fast if it actually fails, of course. Some of your runs
manage to succeed without the threads stepping on each other. And by
failure, here, I mean potentially corrupting the array. The array
contents may get out of sync because you don't synchronize writes, but
that's not a failure in a concurrent environment. Or at least, it's
not JRuby's failure...it's yours.

> * Has JRuby fixed their startup time issue? I ran this a lot of times and
> didn't notice any of the lag I used to.

That's good to hear! Every release includes more startup-time tweaks.
Perhaps we're finally "getting there".

- Charlie
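
The JRuby::Synchronized version above is JRuby-specific. A portable sketch of a similar idea follows, assuming that Queue from the standard library is an acceptable example of an explicitly thread-safe collection. The method name collect_with_queue is hypothetical:

```ruby
require 'thread' # needed on Ruby 1.9; Queue is built in on later versions

# Push 1..per_thread from several threads into a Queue, then drain it
# into an ordinary Array once every producer has finished.
def collect_with_queue(thread_count, per_thread)
  queue = Queue.new
  threads = Array.new(thread_count) do
    Thread.new { (1..per_thread).each { |n| queue << n } }
  end
  threads.each(&:join)
  # All producers are done, so draining from one thread needs no extra lock.
  Array.new(queue.size) { queue.pop }
end

result = collect_with_queue(4, 100_000)
puts "#{result.size} elements in array (should be 400000)"
```

Queue only covers push and pop, so it is not a drop-in replacement for Array. It fits the producer/consumer shape of this test rather than arbitrary shared collections.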
Josh Cheek (josh-cheek)
on 2011-10-27 05:41
(Received via mailing list)
On Wed, Oct 26, 2011 at 8:59 PM, Charles Oliver Nutter
<headius@headius.com> wrote:

> On Wed, Oct 19, 2011 at 2:11 AM, Josh Cheek <josh.cheek@gmail.com> wrote:
> > A great followup to this post, explains why the GIL exists
> > http://merbist.com/2011/10/18/data-safety-and-gil-removal/
> >
> > When I ran the code Matt provides under MRI 1.9.3 (has GIL) and Rubinius,
> > JRuby, MacRuby (native threads, no GIL):
>
> Ok, I can't let this one sit.
>
>
Hi, Charlie, hope you aren't taking this as a criticism of JRuby. It's just
intended to point out what the GIL does, since many people don't seem to
understand its purpose.


> class SafeArray < Array
>  stop = Time.now
>
>  puts "%0.3f seconds" % (stop - start), @array.size
> end
>
>
nice!


>
> Also, I don't think Matz has ever said he really "wants" to keep the
> GIL. It's just a massively difficult thing to retrofit MRI for
> parallel threading without a very large rework. If they could drop the
> GIL without destabilizing MRI itself, I'm sure they'd do it.
>
>
I'm sure they would, too. But people seem to think it was a whimsical
decision. I'm just saying it isn't, that it's that way for a reason, and
showing what the reason is. Debating over the phrase "want" seems pedantic.


>
>
I'm not saying JRuby failed (perhaps I'm not communicating well, as you're
not the only person to take it this way). I'm saying the behaviour is
different in a way that could easily bite someone, and showing an
example. As to whether I failed in writing this code, the code came from
http://merbist.com/2011/10/18/data-safety-and-gil-removal/ (as stated
previously) and its purpose is to reveal this issue.


> > * Has JRuby fixed their startup time issue? I ran this a lot of times and
> > didn't notice any of the lag I used to.
>
> That's good to hear! Every release includes more startup-time tweaks.
> Perhaps we're finally "getting there".
>
>
Glad to hear it, that's the primary (and only noteworthy) reason that JRuby
isn't my default implementation.



As an aside, if I had a real need for parallelism, I'd probably select a
language better suited to it (e.g. Clojure). Maybe if the community offered
some good resources about how to safely develop parallel code I'd be more
comfortable using Ruby, but right now, I prefer the safety of the GIL.
Charles Nutter (headius)
on 2011-10-27 08:38
(Received via mailing list)
On Wed, Oct 26, 2011 at 10:40 PM, Josh Cheek <josh.cheek@gmail.com>
wrote:
> As an aside, if I had a real need for parallelism, I'd probably select a
> language better suited to it (e.g. Clojure). Maybe if the community offered
> some good resources about how to safely develop parallel code I'd be more
> comfortable using Ruby, but right now, I prefer the safety of the GIL.

It's not an unreasonable decision to consider Clojure, but the rules
it forces you to follow can also be followed voluntarily in Ruby.

Parallelism is possible (and not terribly difficult) in Ruby, but it's
also somewhat easy to shoot yourself in the foot if you don't know
what you're doing. Clojure, on the other hand, makes it much harder to
shoot yourself in the foot...but it's also not Ruby, and you need to
dig Lisp to enjoy using it.

I'd love to see more techniques from Clojure applied to Ruby, such as
in the Hamster library (a port of Clojure's persistent data structures
to Ruby). I often state my "four rules of safe concurrency":

Rule #1: Don't do it if you can avoid it.
Rule #2: If you must do it, don't share data across
workers/threads/processes.
Rule #3: If you must share data, don't share mutable data.
Rule #4: If you must share mutable data, synchronize access to it.

If you follow these rules, you'll live a happy and fruitfully parallel
life.

- Charlie
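
Rules #2 and #3 above can be sketched as follows. This is only an illustration under stated assumptions: the method name sum_of_squares is made up, the input is frozen so that the shared data is read-only, and each thread returns a private result through Thread#value instead of writing to a shared array:

```ruby
# Sum the squares of numbers using several threads without sharing any
# mutable state between them.
def sum_of_squares(numbers, thread_count)
  shared = numbers.dup.freeze # rule 3: what is shared is immutable
  slice_size = [(shared.size / thread_count.to_f).ceil, 1].max
  threads = shared.each_slice(slice_size).map do |slice|
    # rule 2: each thread works on its own slice and its own accumulator
    Thread.new { slice.inject(0) { |sum, n| sum + n * n } }
  end
  # Thread#value joins the thread and returns the value of its block.
  threads.map(&:value).inject(0, :+)
end

puts sum_of_squares((1..10).to_a, 4)
```

No Mutex appears anywhere, because the only combining step happens in the calling thread after every worker has finished.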
Chuck Remes (cremes)
on 2011-10-27 15:53
(Received via mailing list)
On Oct 26, 2011, at 10:40 PM, Josh Cheek wrote:
>
> As an aside, if I had a real need for parallelism, I'd probably select a
> language better suited to it (e.g. Clojure). Maybe if the community offered
> some good resources about how to safely develop parallel code I'd be more
> comfortable using Ruby, but right now, I prefer the safety of the GIL.

Please reread Tony Arcieri's response from October 19 where he shreds
the idea that you can "prefer the safety of the GIL." There is *no*
safety provided by the GIL. Let me reprint his code example again.

Tony Arcieri wrote:

>
>    end while size < 100
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
>
> Broken, thread-unsafe code is broken, thread-unsafe code. It doesn't matter
> if it magically happens to work by accident in 1.9.2. It's still broken.

It's. Still. Broken.

The GIL provides a *false* sense of security. Don't rely on it.

cr
Robert Klemme (robert_k78)
on 2011-10-27 16:02
(Received via mailing list)
On Thu, Oct 27, 2011 at 3:59 AM, Charles Oliver Nutter
<headius@headius.com> wrote:
> On Wed, Oct 19, 2011 at 2:11 AM, Josh Cheek <josh.cheek@gmail.com> wrote:

>
> FWIW, it's almost impossible to have threadsafe data structures that
> perform as well as non-threadsafe data structures, which is why we've
> always opted to keep Array and Hash the way they are. Hopefully people
> are starting to learn that the alternatives aren't that bad, like
> using external threadsafe libs or simply mutexing around all accesses.

Actually it is not possible to provide thread safe collections out of
the box which handle all use cases.  I have frequently seen people use
Collections.synchronizedMap() in Java and not realize that their code
is not thread safe.  There are two typical scenarios in Java where you
need to manually define the scope of the mutex by externally
synchronizing:

1. When you invoke more than one method on the object which need to
act on the same state.
2. Iterating a collection (Collections.synchronizedMap() does not
prevent ConcurrentModificationException).

(Note: 2 does not apply in Ruby when iterating with #each and
relatives but 1 also is valid in Ruby land)

People need to understand that thread safety is not a property of a
class (or type) which you can easily gain by just synchronizing all
methods (it works in some cases though) but of the code which accesses
shared state.  So from an educational point of view the effect of
providing objects with all methods synchronized may not be ideal
because it may convey a false sense of safety.

Having said that it would probably be a good idea to add something
like this to the standard library which can be used for any object

require 'monitor'

class SynchronizingWrapper < BasicObject
  def initialize(o)
    @lock = ::Monitor.new
    @obj = o
  end

  def synchronize(&b)
    @lock.synchronize(&b)
  end

  def method_missing(*a, &b)
    @lock.synchronize do
      @obj.send(*a, &b)
    end
  end
end

Wrapping a collection with this gets you pretty far because, for
example, Hash already has #fetch and the default proc, so the
usual idiom of if h.has_key?(x); y=h[x];else;h[x]=y=create;end is
not needed often.

Kind regards

robert
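
Scenario 1 above (several calls that must act on the same state) can be sketched with Monitor from the standard library. TallyBoard and tally_in_threads are hypothetical names invented for this example. The sketch relies on Monitor being reentrant, so a method that already holds the lock can call another synchronized method on the same object:

```ruby
require 'monitor'

# Hypothetical counter table guarded by a single Monitor.
class TallyBoard
  def initialize
    @lock = Monitor.new
    @counts = Hash.new(0)
  end

  # Read and write form one critical section, so no increment is lost.
  def increment(key)
    @lock.synchronize { @counts[key] += 1 }
  end

  # Several increments as one unit. Monitor is reentrant, so calling
  # increment while already holding the lock does not deadlock.
  def increment_all(keys)
    @lock.synchronize { keys.each { |key| increment(key) } }
  end

  def [](key)
    @lock.synchronize { @counts[key] }
  end
end

# Let several threads bump two counters together and return both totals.
def tally_in_threads(thread_count, rounds)
  board = TallyBoard.new
  threads = Array.new(thread_count) do
    Thread.new { rounds.times { board.increment_all([:a, :b]) } }
  end
  threads.each(&:join)
  [board[:a], board[:b]]
end

a, b = tally_in_threads(8, 1_000)
puts "a=#{a} b=#{b}"
```

A plain Mutex would raise an error on the nested synchronize call, which is one reason to reach for Monitor when synchronized methods call each other.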
Alexey Petrushin (axyd80)
on 2011-10-28 06:28
> It's not an unreasonable decision to consider Clojure, but the rules
> it forces you to follow can also be followed voluntarily in Ruby.

I don't understand this. Ruby has no support for parallel execution
(except JRuby and similar platforms), so no matter what techniques you
apply, the program will always run on a single processor core.

As far as I understand there's a way to hack around it, like EventMachine,
but it's a half-solution, useful only for a limited set of problems with
little computing load and lots of IO waiting.

I honestly don't know why you should ever use Ruby threads - it gives
you all the burden of concurrent programming complexity and gives exactly
nothing in return (except in very specific cases like EventMachine).

Clojure, on the other hand, allows you to use all available cores and makes
it simple by applying modern approaches to dealing with concurrency.

And you can't apply Clojure techniques right now: first Ruby has to
make parallel execution possible (by utilizing multiple cores), and only
after that can it solve the next problem, making parallel execution
simple.

http://petrush.in
Steve Klabnik (Guest)
on 2011-10-28 19:40
(Received via mailing list)
The idea is that if you wrote thread-safe code, the GIL could be
removed, and then Ruby would _not_ be limited to running on one core.
Tony Arcieri (Guest)
on 2011-10-28 20:41
(Received via mailing list)
On Thu, Oct 27, 2011 at 9:28 PM, Alexey Petrushin <axyd80@gmail.com>
wrote:

> I don't understand this. Ruby has no support for parallel execution
> (except jRuby and similar platforms), so no matter what technics do You
> apply - the program always will be on the single processor core.
>

So use JRuby or Rubinius if you want real multicore concurrency. Problem
solved.

Threads are still useful even if you have a GIL. Rails can use threads to
service more than one request at a time per Ruby interpreter. Without
threads each connection to a web site requires a dedicated Ruby VM to
service it. Seems bad, but people have made it work.


> Clojure otherwise - allows You to use all available cores and makes it
> simple by applying modern approaches of dealing with concurrency.


Clojure has a lot of neat tools, like immutable persistent data structures,
agents, and STM. However, none of these are a panacea for concurrency bugs.
Clojure provides a very thin interop wrapper around the JVM and existing
Java libraries, which you'll find yourself using all the time when you write
Clojure programs. Since the Java libraries use mutable state and allow you
to get below the abstractions Clojure otherwise provides, you can still wind
up with thread safety bugs in your Clojure programs which are identical to
the kind you'd find in Ruby programs.
"Matthias Wächter" <matthias@waechter.wiz.at> (Guest)
on 2011-10-28 20:42
(Received via mailing list)
On 28.10.2011 19:40, Steve Klabnik wrote:
> The idea is that if you wrote thread-safe code, the GIL could be
> removed, and then Ruby would _not_ be limited to running on one core.

The GIL discussion is very similar to the memory ordering property of
processors [http://en.wikipedia.org/wiki/Memory_ordering] and the related
problem of gaining speed in CPU design by making it more Alpha-ish vs.
keeping it x86-ish but with less hassle on the software front. BTW, Ruby
still mostly uses volatiles, which are not inherently thread safe on all
processors, instead of proper memory barriers.

– Matthias