Networking: select() blocks for seconds (> timeout)

Raul P. wrote:

and then this “Time thief”
appearing every hour and taking away chunks of 5 or 10 seconds… only a
sudden flash of that conversation with the JPL engineer (“the GC misses
Mars landing!”) gave me a clue (and kept me sane, barely).

I will recreate the example using 2 macs connected via Ethernet (I must
rewrite the code though, as it stays with the company). I will use Ruby
on one side and C on the other, and hopefully it will happen again; and
then I will disable/enable the GC (not that those statements fix
anything as some seem to think, as they will just shift the problem from
one place to another of the program, but they will indicate if it is the
GC).

I cannot see any mechanism by which the GC would pause for exactly a
multiple of 5 seconds. But a DNS query, if it loses a packet or two,
could certainly pause for exactly 5 or 10 seconds until it retries.
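
For anyone who logged the stall durations, a minimal sketch for testing that hunch: check whether each stall sits close to a whole multiple of the resolver retry interval. The helper name, the 5-second interval and the 0.25-second tolerance are assumptions for illustration (5 seconds is a common default resolver timeout on Unix-like systems, but the machines in question may be configured differently).

```ruby
# Hypothetical helper: is `stall` (seconds) within `tolerance` of a whole,
# non-zero multiple of `interval`? The defaults are assumptions, not measurements.
def near_retry_multiple?(stall, interval = 5.0, tolerance = 0.25)
  return false if stall < interval - tolerance   # too short to be even one retry

  remainder = stall % interval
  [remainder, interval - remainder].min <= tolerance
end

# Made-up sample durations, only to show the call.
[5.02, 10.1, 3.7, 0.004].each do |stall|
  verdict = near_retry_multiple?(stall) ? "fits" : "does not fit"
  puts "#{stall}s #{verdict} a 5-second retry pattern"
end
```

Stalls that cluster tightly around 5 and 10 seconds would point towards a timeout-and-retry mechanism; stalls scattered across arbitrary lengths would not.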

Mark T wrote:

Wow Raul,
You know how to write a thriller at least!

We are no longer running the experiment where this problem occurred,
so I cannot say now whether this fixed it. But thanks for the tip.

Does this mean the story is unfinished?

MarkT

Mark, you made my day with your line.

It will remain an unsolved thriller for a while; not only is the
experiment no longer running (the structure that was moved is
being replaced), but I am also leaving that job, and I will not have
the chance to verify it.

However, I can’t stand the thought of leaving that story unfinished. It
was so damning, because everything worked for 99% of the time, every
packet delivered with millisecond precision by Ruby 1.9.2 keeping
pace perfectly with C/Assembler firmware, and then this “Time thief”
appearing every hour and taking away chunks of 5 or 10 seconds… only a
sudden flash of that conversation with the JPL engineer (“the GC misses
Mars landing!”) gave me a clue (and kept me sane, barely).

I will recreate the example using 2 macs connected via Ethernet (I must
rewrite the code though, as it stays with the company). I will use Ruby
on one side and C on the other, and hopefully it will happen again; and
then I will disable/enable the GC (not that those statements fix
anything as some seem to think, as they will just shift the problem from
one place to another of the program, but they will indicate if it is the
GC).
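
A rough sketch of what such a check could look like: time each IO.select call against its requested timeout and count the GC runs that happened during the call, once with the GC disabled and once with it enabled. This is not the original code; `timed_select` is a hypothetical helper, the pipe merely stands in for the real socket, and Process.clock_gettime needs Ruby 2.1 or later (on 1.9.2 one would fall back to Time.now).

```ruby
# Hypothetical helper: block in IO.select for up to `timeout` seconds and
# report how long the call really took and how many GC runs occurred meanwhile.
def timed_select(ios, timeout)
  gc_before = GC.count
  started   = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ready     = IO.select(ios, nil, nil, timeout)   # nil when the timeout expires
  elapsed   = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { ready: ready, elapsed: elapsed, gc_runs: GC.count - gc_before }
end

timeout = 0.05
reader, writer = IO.pipe   # stand-in for the socket; nothing is ever written

GC.disable                 # diagnostic only, not a fix
without_gc = timed_select([reader], timeout)
GC.enable
with_gc = timed_select([reader], timeout)

[["GC off", without_gc], ["GC on", with_gc]].each do |label, r|
  puts format("%s: blocked %.3fs (overrun %.3fs), GC runs during call: %d",
              label, r[:elapsed], r[:elapsed] - timeout, r[:gc_runs])
end

reader.close
writer.close
```

If the multi-second overruns still show up with the GC disabled and a GC-run count of zero, the GC is an unlikely culprit.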

I will let you know the outcome (in 2-3 weeks). I appreciated your note.

Raul P.

Brian C. wrote:

I cannot see any mechanism by which the GC would pause for exactly a
multiple of 5 seconds. But a DNS query, if it loses a packet or two,
could certainly pause for exactly 5 or 10 seconds until it retries.

So we have either:
a) a GC problem
b) or, to read a packet from a queue, Ruby 1.9.2 makes a reverse DNS
query.

The second is so sad (and so unlike what I have seen from Ruby) that I
almost prefer the first (although it can’t be fixed).

But enough speculation.

Raul P.

So we have either:
a) a GC problem
b) or, to read a packet from a queue, Ruby 1.9.2 makes a reverse DNS
query.

The second is so sad (and so unlike what I have seen from Ruby) that I
almost prefer the first (although it can’t be fixed).

I don’t think it’s the second, since 1.9.2 sets
BasicSocket.do_not_reverse_lookup to true by default.
JRuby doesn’t yet, though I’m sure they’ll get around to it sometime.
-r

Roger P. wrote:

I don’t think it’s the second, since 1.9.2 sets
BasicSocket.do_not_reverse_lookup to true by default.
JRuby doesn’t yet, though I’m sure they’ll get around to it sometime.
-r

Yes; I will use that line anyway (although on the socket instance, not
on the class) so that there are no doubts.

[ btw: for anyone interested, the test will be done by July 12 or so,
as by then I will have my 2nd mac ]

Raul P.

Raul P. wrote:

Yes; I will use that line anyway (although on the socket instance, not
on the class) so that there are no doubts.

It is only settable globally, AFAIK.

require 'socket'
=> true
Socket.do_not_reverse_lookup = true
=> true
TCPSocket.new("www.google.com",80).do_not_reverse_lookup = true
NoMethodError: undefined method `do_not_reverse_lookup=' for
#<TCPSocket:0x7f65b8da0000>
from (irb):4
from :0

It is only settable globally, AFAIK.

Appears to be settable per socket in 1.9.

instance.do_not_reverse_lookup=true
=> true
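
A small self-contained sketch of the per-socket setting on Ruby 1.9 or later, using a loopback UDP socket so that no outside network is needed. The socket, payload and port choice are illustrative only. With the flag set, the address array returned by recvfrom should carry the numeric address in the hostname position instead of a looked-up name.

```ruby
require 'socket'

# Class-level default, applied to sockets created afterwards.
puts "global default: #{BasicSocket.do_not_reverse_lookup}"

sock = UDPSocket.new
sock.do_not_reverse_lookup = true          # per-instance setter (Ruby 1.9+)
per_socket = sock.do_not_reverse_lookup

sock.bind("127.0.0.1", 0)                  # port 0: let the OS pick a free port
port = sock.addr[1]
sock.send("ping", 0, "127.0.0.1", port)    # send a datagram to ourselves

data = sender = nil
if IO.select([sock], nil, nil, 1)          # wait at most 1 second for it
  data, sender = sock.recvfrom(16)
  puts "received #{data.inspect} from #{sender.inspect}"
end
sock.close
```

Setting the flag on the instance leaves the class-level default untouched, which is handy when only one socket in a larger program is under suspicion.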