Forum: Ruby sysread changes behavior in the presence of threads?

Announcement (2017-05-07): www.ruby-forum.com is now read-only since I unfortunately do not have the time to support and maintain the forum any more. Please see rubyonrails.org/community and ruby-lang.org/en/community for other Rails- and Ruby-related community platforms.
Francis Cianfrocca (Guest)
on 2006-05-21 01:42
(Received via mailing list)
What am I missing here?

require 'socket'
require 'fcntl'
Thread.new { sleep 100 }
sd = TCPSocket.new("www.cisco.com", 80)
m = sd.fcntl(Fcntl::F_GETFL, 0)
sd.fcntl(Fcntl::F_SETFL, Fcntl::O_NONBLOCK | m)
sd.sysread(4096)

This code blocks in the sysread, in effect ignoring the nonblocking mode set on the file descriptor. But if you comment out the line that spins the thread, the sysread raises Errno::EAGAIN as you'd expect.

Is this a defined behavior or a bug?
Bill Kelly (Guest)
on 2006-05-21 02:37
(Received via mailing list)
From: "Francis Cianfrocca" <garbagecat10@gmail.com>
>
> This code blocks in the sysread, in effect ignoring the nonblocking mode set
> on the file descriptor. But if you comment out the line that spins the
> thread, the sysread raises Errno::EAGAIN as you'd expect.
>
> Is this a defined behavior or a bug?

Hi, I learned about this just the other day.  See thread
starting here:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/...


Regards,

Bill
Francis Cianfrocca (Guest)
on 2006-05-21 02:55
(Received via mailing list)
Ahh, thanks for pointing it out, although it ain't the answer I was hoping for. Interesting that your concern (not having to think about EAGAIN) is opposite from mine: I'm *wanting* to get EAGAIN so I can do something else while waiting for the I/O.
Solved my problem by holding my nose and writing an IO#read_nonblocking method in C.
Bill Kelly (Guest)
on 2006-05-21 03:26
(Received via mailing list)
From: "Francis Cianfrocca" <garbagecat10@gmail.com>
> Ahh, thanks for pointing it out, although it ain't the answer I was hoping
> for. Interesting that your concern (not having to think about EAGAIN) is
> opposite from mine- I'm *wanting* to get EAGAIN so I can do something else
> while waiting for the I/O.

Yeah.  I didn't have a specific need to check for EAGAIN at the
time, but my main point was trying to express my confusion that
there would be this one semi-obscure case (single thread only)
that acted differently from the rest.  I say obscure because I
don't think I'd ever write code in ruby to expect that no other
threads were present in the system.  How would I know that
some other library I've required hasn't spawned some worker
thread for its own internal use?  So having ruby act
inconsistently in the particular case of there being only one
thread alive seems peculiar to me.  So my point was I was
confused by that behavior since it seems like something I could
never be able to reliably expect to depend on.  :)


Regards,

Bill
Francis Cianfrocca (Guest)
on 2006-05-21 03:32
(Received via mailing list)
I'd be inclined to categorize it as a bug. It violates the expectation that I/O is orthogonal to threads. I looked at the code in io.c and I can see the reason for it, but I'm writing a library and I don't have the luxury of making assumptions about whether there are other threads.
Tanaka Akira (Guest)
on 2006-05-21 05:11
(Received via mailing list)
In article <3a94cf510605201832g5a8d23cue1f0e2e69661d5f1@mail.gmail.com>,
  "Francis Cianfrocca" <garbagecat10@gmail.com> writes:

> I'd be inclined to categorize it as a bug. It violates the expectation that
> I/O is orthogonal to threads. I looked at the code in io.c and I can see the
> reason for it, but I'm writing a library and I don't have the luxury of
> making assumptions about whether there are other threads.

A workaround is Thread.exclusive { sd.sysread(4096) }.

However, I think a nonblocking read method is the right way to
fix this issue.  The problem is the method name, though.

There are several problems with making sysread raise EAGAIN in
I/O-multiplex mode.  Since a nonblocking I/O doesn't block,
Ruby could skip the I/O multiplexing for it without risking
blocking the entire process.  So if Ruby disabled I/O multiplexing
for nonblocking I/O, sysread would raise EAGAIN.  But,
unfortunately, Ruby cannot know the nonblocking state of an fd in
some cases.

1. Windows has no F_GETFL equivalent
  There is no way to know the state on Windows.

2. race condition
  Even on an environment which has F_GETFL, the state may be
  changed between F_GETFL and read(2) by another process.
Francis Cianfrocca (Guest)
on 2006-05-21 07:04
(Received via mailing list)
Tanaka-sensei: from your description it appears that the problem is caused by an interaction between the Ruby thread scheduler and the I/O functions, which can't be resolved without fundamentally changing how the scheduler works. That's fair enough.

By this point, I've accumulated a small library of functions in C that perform various operations without blocking, and are unaffected by the presence of threads. Perhaps I'll release them as an extension library so people can use them now, while recognizing that they will eventually be made obsolete when a decision is made about the names to be used in the standard distro. (Another thing I'd like to do is write a unified Mutex/Condition Variable implementation that can actually be used to synchronize Ruby threads with native threads.)
Anatoly Karp (Guest)
on 2006-05-21 07:52
(Received via mailing list)
On 5/21/06, Francis Cianfrocca <garbagecat10@gmail.com> wrote:
> Tanaka-sensei: from your description it appears that the problem is caused
> by an interaction between the Ruby thread-scheduler and the I/O functions,
> which can't be resolved without fundamentally changing how the scheduler
> works. That's fair enough.
>

I am not sure Tanaka's explanation is quite satisfactory. If you change
your original example thusly:

require 'socket'
require 'fcntl'
Thread.new { puts "hi" }
sd = TCPSocket.new("www.cisco.com", 80)
m = sd.fcntl(Fcntl::F_GETFL, 0)
sd.fcntl(Fcntl::F_SETFL, Fcntl::O_NONBLOCK | m)
sd.sysread(4096)

it will produce Errno::EAGAIN as expected.

Thus, one is led to suspect that in the former case the thread somehow
does not get properly cleaned up upon completion.

In any case, I agree with you that not being able to count on non-blocking behavior, just because there could be some stray threads around, is pretty nasty.

-A
Francis Cianfrocca (Guest)
on 2006-05-21 08:14
(Received via mailing list)
In your example, the thread ends almost immediately, especially since puts will probably complete long before a connect to a remote web server. So by the time the sysread executes, there is only one thread. Tanaka's explanation still holds.
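Francis's point is easy to check directly (a hypothetical sketch, not code from the thread): the short-lived thread is already dead by the time the read would run, while a sleeping thread stays alive the whole time.

```ruby
# A thread that finishes immediately is dead by the time we look at it,
# so the interpreter is effectively single-threaded again.
t1 = Thread.new { "hi" }
t1.join
puts t1.alive?   # false

# A sleeping thread stays alive, which (under 1.8's green threads) keeps
# the scheduler wrapping blocking calls in its select-based multiplexing.
t2 = Thread.new { sleep 100 }
puts t2.alive?   # true
t2.kill
```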

I think the answer is to define a new set of functions that can be depended on not to block. (And on Windows, they will probably also need to set the descriptor nonblocking.) From prior communications with Matz, he's not against this, but hasn't settled yet on what these new methods should be named.
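(For the record, Ruby did later gain exactly such methods: IO#read_nonblock and IO#write_nonblock appeared shortly after this thread, around 1.8.5/1.9. A sketch of the eventual API, using a pipe so it runs without network access:)

```ruby
r, w = IO.pipe
Thread.new { sleep 100 }   # a live background thread, as in the original example

begin
  r.read_nonblock(4096)    # pipe is empty, so this cannot succeed
rescue IO::WaitReadable    # raised instead of blocking (Errno::EAGAIN on 1.8),
                           # no matter how many threads are running
  puts "would block -- free to do other work"
end

w.write("hello")
puts r.read_nonblock(4096) # data is ready now, so this returns "hello"
```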
Bill Kelly (Guest)
on 2006-05-21 08:17
(Received via mailing list)
From: "Francis Cianfrocca" <garbagecat10@gmail.com>
>
> (Another thing I'd like to do is write a unified Mutex/Condition
> Variable implementation that can actually be used to synchronize Ruby
> threads with native threads.)

That would be awesome!    :)


Can I donate via paypal or something?  <grin>


Regards,

Bill
Francis Cianfrocca (Guest)
on 2006-05-21 08:20
(Received via mailing list)
I started working on it the other day, will complete when I have time. Hope you're not running Windows ;-). Condition variables don't work perfectly on Windows.
Bill Kelly (Guest)
on 2006-05-21 08:59
(Received via mailing list)
From: "Francis Cianfrocca" <garbagecat10@gmail.com>
>
> I started working on it the other day, will complete when I have time. Hope
> you're not running Windows ;-). Condition variables don't work perfectly on
> Windows.

The relevant applications are multi-platform: OS X; Linux; and yes,
Windows.

Your mention of condition variables not working reliably(?) on
Windows has certainly piqued my interest.  Are you referring to
Ruby's condition variables?  Or some fundamental Windows flaw?
Our C++ app uses boost::condition variables on all platforms.
Definitely interested to know if there's some glitch in Ruby
and/or Windows I should be aware of.


Thanks,

Bill
Francis Cianfrocca (Guest)
on 2006-05-21 13:44
(Received via mailing list)
It's a Windows issue, nothing to do with Ruby. I haven't looked at how Boost implements them, but if I have a chance I will. The problem is with timed condwaits. As you know, Windows uses different synch primitives than Unix. It's one of those tiny breaches of atomicity that you'll occasionally see if you run enough trials on a machine with enough multiprocessors. Once I spent a Saturday morning trying to write a proper condvar for Windows in assembler, but I gave up when it occurred to me that I really should get a life instead. ;-)
Robert Klemme (Guest)
on 2006-05-21 14:15
(Received via mailing list)
2006/5/21, Francis Cianfrocca <garbagecat10@gmail.com>:
> It's a Windows issue, nothing to do with Ruby. I haven't looked at how Boost
> implements them, but if I have a chance I will. The problem is with timed
> condwaits. As you know Windows uses different synch primitives than Unix.
> It's one of those tiny breaches of atomicity that you'll occasionally see if
> you run enough trials on a machine with enough multiprocessors.

But this does not bite you if you use Ruby's condition variables, as they are completely in Ruby land and there are no native threads (well, *one* native thread is there :-)).

This whole thread makes me wonder why you need to use sysread and a non-blocking variant of it at all. Do you have any extensions that use native threads, or what are you trying to accomplish? I'm asking because for me, Ruby's threads and blocking IO (on the Ruby level) have served me well so far.

Kind regards

robert
Francis Cianfrocca (Guest)
on 2006-05-21 14:46
(Received via mailing list)
I'm working on the eventmachine library (see rubyforge). The goal is
to enable complicated applications (including multiplayer games and
network servers) that are far faster and more scalable than is
possible with "ordinary" Ruby coding. (That is, without requiring deep
understanding of the Ruby runtime environment in order to get the
required performance.) Practically speaking, this requires strict
nonblocking i/o and a certain amount of C extensions. We'd like for
any Ruby programmer to be able to write a large, fast application
without acquiring expertise in concurrency and networking issues.

As an example, I'm responsible for an LDAP system with five replicated
servers running simultaneously and sharing load. This system now can
sustain rates of 2000 queries per second per server (directory size is
about one million entries), but I had to resort to a single-threaded
server handwritten in C++ (openldap's performance on the specified
hardware was about one-twentieth of the requirement). The replication
code is almost all in Ruby. I'd like to have the main server code be
largely in Ruby so it will be easier to maintain. That's an example of
what I want to do.

As far as threads are concerned: I'm one of those people who believe
that threads are seriously overused and should be avoided, especially
in high-performance applications. But occasionally if you're mixing
Ruby and native code, you may have threads in each. As long as Ruby's
threads are green, this split will exist, and it would be nice to be
able to synchronize a Ruby thread with a native one.
Sam Roberts (Guest)
on 2006-05-22 08:46
(Received via mailing list)
Quoting garbagecat10@gmail.com, on Sun, May 21, 2006 at 09:45:38PM +0900:
> As far as threads are concerned: I'm one of those people who believe
> that threads are seriously overused and should be avoided, especially
> in high-performance applications. But occasionally if you're mixing

I understand the argument in general, but since ruby's "threads" aren't
actually threads, does it apply here? Ruby with multiple "threads" is
really just a single process with a nice application-level way of
invoking particular code when a particular socket descriptor is ready,
and having that code have some state.

This is pretty much what any (other) single-threaded unix app hanging
off of select would do, except state would be held explicitly in some
kind of data structure. In ruby the state is held in the lexical
state/closure/stack (not sure what to call it) of the ruby thread.

Is the overhead of a ruby thread too high, for some reason? Uses too
much memory, doesn't scale well across thousands of descriptors because
it uses select, something else...? I'm sure you are trying to avoid
ruby's select-based, non-blocking, io-multiplexing scheme (aka
"threads") for a good reason, I just don't see what the reason is yet.

> Ruby and native code, you may have threads in each. As long as Ruby's
> threads are green, this split will exist, and it would be nice to be
> able to synchronize a Ruby thread with a native one.

Interaction between ruby and any other OS threads is a well-known
problem, but xx_nonblock APIs in Socket don't seem like they're going
to help that.

Sam
Francis Cianfrocca (Guest)
on 2006-05-22 14:16
(Received via mailing list)
No, I'm going for something rather different, and it has nothing to do with Ruby. (And therefore the point is a threadjack, so I'll be brief.) Threads are too difficult to use. Even if you have a lot of experience (I've been programming posix-like threads since Solaris 2.4 and Win32 threads even longer), concurrency within a process is darned hard to get right. One really good reason to use threads is to capture system latencies like disk and network i/o, but this can generally be done with events. Another good reason is to reflect the structure of the problem you're trying to solve: if your problem really does involve multiple, independent control flows, then threading the app will probably make it easier to write (but may also make it slower and harder to scale). With teams I manage, when it's necessary to use threads, I impose strict rules on when and how to apply mutexes, and how to design synchronization sets. As long as my rules are followed, you generally won't see a deadlock, and you will rarely see severe mutex contention. But most programmers hate following them. (Among them: NEVER call a function under lock, not even one you wrote, not even an inline or a macro. Only variable reads and writes are allowed.)
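Francis's rule, rendered as a Ruby sketch (a hypothetical illustration, not code from this thread): the critical section is a single read-modify-write on a shared variable, and all real work happens outside the lock.

```ruby
# Mutex is core on modern Rubies; on 1.8 you would `require 'thread'` first.
mutex = Mutex.new
total = 0

threads = 4.times.map do
  Thread.new do
    1_000.times do
      value = 1                # stand-in for work computed outside the lock
      mutex.synchronize do
        total += value         # only a variable read and write under the lock
      end
    end
  end
end
threads.each(&:join)

puts total                     # => 4000
```

The lock is held for one addition and nothing else, which is what keeps contention (and the chance of deadlock) low.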

Your second point: the interaction between Ruby and native threads has nothing to do with nonblocking I/O. Separate problem; it was my mistake if I left the implication that they are linked. I was thinking it might be possible to teach Ruby to work with native mutexes and condvars.
Robert Klemme (Guest)
on 2006-05-22 15:45
(Received via mailing list)
2006/5/22, Francis Cianfrocca <garbagecat10@gmail.com>:
> With teams I manage, when it's necessary to
> use threads, I impose strict rules on when and how to apply mutexes, and how
> to design synchronization sets. As long as my rules are followed, you
> generally won't see a deadlock, and you will rarely see severe mutex
> contention. But most programmers hate following them. (Among them: NEVER
> call a function under lock, not even one you wrote, not even an inline or a
> macro. Only variable reads and writes are allowed.)

I can see why they hate sticking to that rule.  Basically you disallow
decent synchronization of functional parts of the application. The
consequence of this is that you either do not have concurrent programs
that are correct, or you force people to implement their own mutex on
top of your rule.  To give an example of what I mean, your rule prohibits
this typical cache idiom:

// pseudo code
synchronized ( lock ) {
   if ( ! aMap.contains( myKey ) ) {
      // cache miss
      aMap.put( myKey, calculateValue( myKey ) );
   }
}
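The same idiom in Ruby (a hypothetical sketch; `calculate_value` stands in for whatever expensive computation the cache protects): the check and the insert must sit under one lock to be safe, and the computation runs while the lock is held, which is exactly the function-call-under-lock Francis's rule forbids.

```ruby
CACHE = {}
LOCK  = Mutex.new  # core on modern Rubies; `require 'thread'` on 1.8

def calculate_value(key)  # stand-in for an expensive computation
  key.to_s * 2
end

def fetch_cached(key)
  LOCK.synchronize do
    # Check and insert atomically; calculate_value runs under the lock.
    CACHE[key] = calculate_value(key) unless CACHE.key?(key)
    CACHE[key]
  end
end

puts fetch_cached(:ab)  # => "abab"
```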

IMHO your rule makes multi-threaded applications pretty much pointless.

Kind regards

robert
Francis Cianfrocca (Guest)
on 2006-05-22 16:16
(Received via mailing list)
We're threadjacking, so I'll keep it short. The point of taking a very restrictive view of synchronization is to prevent incorrect concurrency. Your example depends on the implementation of aMap and of calculateValue not to do evil things. (One of the evil things they can do is simply to run for several milliseconds, or make a blocking I/O call. This can give you a bad case of mutex contention, which is exceptionally costly in many modern implementations.) This means that the program may change behavior with respect to concurrency across platforms, hardware, and also across time (as the code inside those called functions changes). Ruby adds the further dimension that the code you call under lock may have been metaprogrammed on the fly.

The nightmare scenario is this: the client calls to say that your mission-critical application stops running occasionally. It will be fine for a month, and then it will stop twice in one week. You ask what they did differently, and the answer is always "nothing." You ask what your programmers changed, and the answer is always "nothing." The problem is of course completely non-reproducible. This is not a nice place to be, since you can't just blame the client's environment.

I suppose your answer to all this is: just code more carefully, and only use well-debugged libraries. That of course is a partially correct answer, but achievable in practice only at some specific cost. My larger point is that in the case of threads, this balance point is often very hard to achieve at reasonable cost.

I'll let you have the last word, both because we're offtopic and because threading is a religious issue to many people, so the question tends to generate more heat than light :-). In my defense, I'll only say that my dislike of threads is rooted in many years of experience, and not a mere prejudice.
Joel VanderWerf (Guest)
on 2006-05-22 20:21
(Received via mailing list)
Francis Cianfrocca wrote:
> We're threadjacking, so I'll keep it short. The point of taking a very

I like this thread, but I'll nudge it in the ruby direction...

> ...the code you call under lock may have been metaprogrammed on the fly.
If you're talking about ruby threads, sync mechanisms are expensive with
or without contention, since they are built on top of Thread.critical.

require 'thread'
require 'benchmark'

N = 1_000_000

Benchmark.bmbm(12) do |bm|

  bm.report("no Thread.critical") do
    x = 0
    N.times do
      x += 1
    end
  end

  bm.report("Thread.critical") do
    x = 0
    N.times do
      Thread.critical = true
      x += 1
      Thread.critical = false
    end
  end

  bm.report("Thread.exclusive") do
    x = 0
    N.times do
      Thread.exclusive do
        x += 1
      end
    end
  end

end

__END__

Rehearsal ------------------------------------------------------
no Thread.critical   1.040000   0.010000   1.050000 (  1.063126)
Thread.critical      1.130000   0.000000   1.130000 (  1.153935)
Thread.exclusive     2.660000   0.000000   2.660000 (  2.704054)
--------------------------------------------- total: 4.840000sec

                         user     system      total        real
no Thread.critical   0.360000   0.000000   0.360000 (  0.366529)
Thread.critical      0.910000   0.010000   0.920000 (  0.922671)
Thread.exclusive     2.670000   0.000000   2.670000 (  2.692268)
Francis Cianfrocca (Guest)
on 2006-05-22 20:37
(Received via mailing list)
As a very rough rule of thumb, when I design a thread-hot system, I
generally try to make the contention ratio of every mutex no worse
than 1:100. I try for 1:1000 if I can get it. Regardless of the weight
of the mutex mechanism (Ruby's can hardly be worse than the one in
multi-processor Windows builds), the cost of a missed
mutex-acquisition ends up being about that high. People often complain
to me that keeping lock-sets as small as possible works against the
goal of making threads easier to use. Well, it doesn't because threads
are hard to use, period. I've found that with careful analysis, it's
*generally* possible to keep a lock set minimal, ideally no larger
than one read and one write. Any more than that (including function
calls), and you're often synchronizing more than you really need to.
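The contention ratio Francis describes can be estimated with Mutex#try_lock (a rough, hypothetical sketch; under MRI's global lock the measured contention will be low, but the bookkeeping pattern is the point):

```ruby
mutex = Mutex.new
attempts = 0
misses   = 0
counter  = 0

threads = 4.times.map do
  Thread.new do
    10_000.times do
      attempts += 1          # rough statistics; a benign race is acceptable here
      if mutex.try_lock      # uncontended fast path
        counter += 1
        mutex.unlock
      else
        misses += 1          # contended acquisition: fall back to blocking
        mutex.synchronize { counter += 1 }
      end
    end
  end
end
threads.each(&:join)

puts "contended #{misses} of #{attempts} acquisitions"
```

A 1:100 target means `misses` should stay under about 1% of `attempts`; if it doesn't, the lock set is too large or held too long.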

Java programmers (who suffer simultaneously from horrible
thread-management in their language, and a culture of serious
thread-overuse) love to tell you about their new "wait-free"
programming model. Not a new idea, it just encourages you to make your
lock-sets so minimal that they will fit within a machine operation
that is guaranteed to run in one bus cycle. (Intel chips have
half-a-dozen such operations.) Just proves my point all the more.
Robert Klemme (Guest)
on 2006-05-23 10:53
(Received via mailing list)
2006/5/22, Francis Cianfrocca <garbagecat10@gmail.com>:
> We're threadjacking, so I'll keep it short. The point of taking a very
> restrictive view of synchronization is to prevent incorrect concurrency.
> Your example depends on the implementation of aMap and of calculateValue not
> to do evil things. (One of the evil things they can do is simply to run for
> several milliseconds, or make a blocking I/O call. This can give you a bad
> case of mutex contention, which is exceptionally costly in many modern
> implementations.) This means that the program may change behavior with
> respect to concurrency across platforms, hardware, and also across time (as
> the code inside those called functions changes).

No, it will change with regard to timing but not with regard to
behavior (semantics).

> The nightmare scenario is this: the client calls to say that your
> mission-critical application stops running occasionally. It will be fine for
> a month, and then it will stop twice in one week. You ask what did they do
> differently, and the answer is always "nothing." You ask what your
> programmers changed, and the answer is always "nothing." The problem is of
> course completely non-reproducible. This is not a nice place to be, since
> you can't just blame the client's environment.

I agree that getting MT programs right is harder than getting
single-threaded ones right - but that's not a reason to basically
disallow MT if it fits the business problem well.

> I suppose your answer to all this is: just code more carefully, and only use
> well-debugged libraries. That of course is a partially-correct answer, but
> achievable in practice only at some specific cost. My larger point is that
> in the case of threads, this balance-point is often very hard to achieve at
> reasonable cost.

No, my answer is that your rule prevents proper implementation of
business requirements. The point of the short example I presented was
that you have to make a *sequence of operations* mutually exclusive in
order to implement the business requirement (have a thread-safe cache
that is filled as values are requested...). You have to allow this in
order to implement correct thread-safe programs. If you allow for
variable assignment only, then it's overly complex to implement the
semantics I demonstrated.  (Btw, assignment can be a complex
operation, too - just think of C++ operator overloading.)

> I'll let you have the last word, both because we're offtopic, and because
> threading is a religious issue to many people and so the question tends to
> generate more heat than light :-). In my defense, I'll only say that my
> dislike of threads is rooted in many years of experience, and not a mere
> prejudice.

I'm not religious here; I just state the fact that your rule cripples
MT implementations.

Kind regards

robert