Rev/actor TCP monkey patching

Short and sweet -

How exactly does one monkey patch Rev’s TCP over Net::HTTP-s?

Thanks,
-------------------------------------------------------|
~ Ari
seydar: it’s like a crazy love triangle of Kernel commands and C code

On Feb 11, 2008 4:26 PM, fedzor [email protected] wrote:

Short and sweet -

How exactly does one monkey patch Rev’s TCP over Net::HTTP-s?

Check out the SVN trunk:

svn checkout http://rev.rubyforge.org/svn/

There’s now a Rev::SSL module you can extend any subclass of
Rev::TCPSocket
with (this includes Revactor::TCP::Socket).

To use it with Rev::HttpClient, just:

require 'rev/ssl'

and subclass HttpClient, defining the following callbacks:

def on_connect
  super
  extend(Rev::SSL)
  ssl_client_start
end

You’ll also need to:

def on_ssl_connect
  request(…) # Make the HTTP request after the SSL handshake completes
end

…to initiate the HTTP request after the SSL handshake is complete.

There’s a corresponding #on_ssl_error(ex) method you can define to
capture any errors which occur during the SSL handshake (or any
attempts to negotiate a new session).
You can also:

def ssl_context
  OpenSSL::SSL::SSLContext.new(…)
end

…to create an OpenSSL context to use for the SSL session. By default
no certificate verification takes place, so if you’d like any, add the
appropriate certificates to the SSL context object.
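
As a rough sketch of what such a context could look like, the following uses only Ruby's standard openssl library. The helper name verifying_ssl_context is made up for illustration, and trusting the system's default CA paths is an assumption; it also assumes Rev::SSL hands the context to OpenSSL unchanged, which the thread does not confirm.

```ruby
require 'openssl'

# Hypothetical helper: builds a client-side context that verifies the
# server certificate. Trusting the system default CA paths is an
# assumption; load your own CA file into the store for a private CA.
def verifying_ssl_context
  store = OpenSSL::X509::Store.new
  store.set_default_paths

  context = OpenSSL::SSL::SSLContext.new
  context.verify_mode = OpenSSL::SSL::VERIFY_PEER
  context.cert_store  = store
  context
end

context = verifying_ssl_context
```

An ssl_context callback like the one above could then return this object instead of a bare OpenSSL::SSL::SSLContext.new.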

If you’re having any trouble, you might take a look at the Revactor SVN
trunk:

svn checkout http://revactor.rubyforge.org/svn/

Revactor’s HttpClient is already configured for SSL support, and wraps
everything up a lot better, e.g.:

response = Revactor::HttpClient.get("https://your.ssl.server.here/")

This also takes care of some of the other annoyances of HTTP, like
following
redirects.

If you don’t want to use Revactor directly, you can at least use it as a
reference for using Rev::HttpClient in conjunction with SSL. Just have
a
look at lib/revactor/http_client.rb

I’d like to wrap up SSL support in Rev’s HttpClient a bit more nicely,
but
for now my real priority was getting it working in Revactor.

On Feb 11, 2008 4:26 PM, fedzor [email protected] wrote:

Short and sweet -

How exactly does one monkey patch Rev’s TCP over Net::HTTP-s?

I probably should’ve mentioned this up front, but you can save yourself
a
lot of grief by just using Revactor:

require 'revactor'
=> true
response = Revactor::HttpClient.get("https://www.google.com")
=> #<Revactor::HttpResponse:0x5d30f4
@client=#<Revactor::HttpClient:0x5d5020>, @status=200, @reason="OK",
@version="HTTP/1.1", @content_length=nil, @chunked_encoding=true,
@header_fields={"Cache-Control"=>"private",
"Set-Cookie"=>"PREF=ID=b7f27e80b36a740b:TM=1202788671:LM=1202788671:S=CIZI75tKEMuS7q3X;
expires=Thu, 11-Feb-2010 03:57:51 GMT; path=/; domain=.google.com",
"Server"=>"gws", "Transfer-Encoding"=>"chunked", "Date"=>"Tue, 12 Feb 2008
03:57:51 GMT", "Connection"=>"Close"}, @content_type="text/html;
charset=ISO-8859-1">
response.body
=> "<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">Google …

That’s it.

Just grab the Rev and Revactor svn repositories, “rake gem” in both, and
install the gems. You get all the benefits of an evented HTTP client
without the inversion-of-control headaches typically associated with
evented
programming.

On Feb 12, 2008 1:38 PM, fedzor [email protected] wrote:

On a side note, would you be able to post some more example with
Actors to achieve concurrency? Because at first I understand your
echo server, and then I tested out actors myself and everything I
knew fell apart…

Here’s a simple example. It will create a new Actor for each specified
URL, which probably isn’t the behavior you want. You’ll instead want
something like a queue. Also, this will break if you have duplicates in
the URL list: the collection loop waits for one hash entry per URL, so
it never finishes. And as another bonus, I haven’t tested it, but I
think it will work :)

However, it should be enough to get you started:

require 'revactor'

# A list of URLs we want to fetch
url_list = ['http://www.google.com', …]

# Capture the parent Actor to send messages to
parent = Actor.current

# Spawn a new Actor for each URL we want to fetch
url_list.each do |u|
  Actor.spawn do
    # Request the URL
    response = Revactor::HttpClient.get(u)

    # Consume and store the body if the response status was 200
    response.body if response.status == 200

    # Close the connection
    response.close

    parent << [:http_response, u, response]
  end
end

# Store all the responses in a hash mapping URLs to the responses
responses = {}

# Consume all the responses
while responses.size < url_list.size
  Actor.receive do |filter|
    # Catch messages which start with :http_response and contain two objects
    filter.when(Case[:http_response, Object, Object]) do |_, url, response|
      responses[url] = response
    end
  end
end

# Inspect the responses
p responses
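
The "something like a queue" idea mentioned earlier can be sketched with plain Ruby threads and Queue from the standard library. To be clear, this swaps Actors for threads and is not Revactor code: FETCH is a hypothetical stand-in for Revactor::HttpClient.get, the example.com URLs are placeholders, and the pool size of 2 is arbitrary.

```ruby
# Hypothetical stand-in for Revactor::HttpClient.get so the sketch runs
# without a network connection; it returns a fake response per URL.
FETCH = ->(url) { { url: url, status: 200 } }

# Placeholder URLs; note the deliberate duplicate.
url_list = ['http://example.com/a', 'http://example.com/b', 'http://example.com/a']

jobs    = Queue.new
results = Queue.new
url_list.each { |u| jobs << u }

# A fixed pool of workers instead of one worker per URL
worker_count = 2
workers = Array.new(worker_count) do
  Thread.new do
    loop do
      url = begin
        jobs.pop(true) # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break          # no jobs left, so this worker is finished
      end
      results << [:http_response, url, FETCH.call(url)]
    end
  end
end
workers.each(&:join)

# Count messages rather than hash entries, so duplicate URLs cannot
# stall the collection loop.
responses = []
url_list.size.times { responses << results.pop }
```

Collecting into an array and counting messages is the design choice that sidesteps the duplicate-URL problem; keying a hash by URL would collapse the two identical entries.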

On Feb 12, 2008, at 5:30 PM, Tony A. wrote:

response.body if response.status == 200

# Close the connection
response.close

parent << [:http_response, u, response]

end
end

Here is where I’ve found a problemo will occur. In my ahem
extensive ahem research with your fine fine actors ;-), concurrency
only exists when run within an actor - Actor.current has failed me
here! Example:

foods = ['chocolate', 'seltzer', 'awesome sauce']
foods.each do |food|
  Actor.spawn(food) do |f|
    sleep 1
    puts f
  end
end
puts "foods rounded up"

chocolate

foods rounded up

Uhoh! Not quite. Lemme try something different -

Well, whatever, I couldn’t reproduce that code, because I updated to
your svn code. The one released as a gem required me to embed it all
within an actor before getting any sort of concurrency. But
basically, I sent some snail mail to an actor and it waited on
quitting until all of the internal actors finished working
concurrently. It was really awesome and I understood it! And then I
reinstalled revactor. oof! Better work something out.

Do you think you could help me understand what I would need to do to
get awesome sauce printed?

Thanks,
-------------------------------------------------------|
~ Ari
if god gives you lemons
YOU FIND A NEW GOD

Thank you so much!

This has really saved me a lot of speed and given me some more speed ;)

On a side note, would you be able to post some more example with
Actors to achieve concurrency? Because at first I understand your
echo server, and then I tested out actors myself and everything I
knew fell apart…

Thanks,
Ari
--------------------------------------------|
If you’re not living on the edge,
then you’re just wasting space.

On Feb 12, 2008 6:10 PM, fedzor [email protected] wrote:

puts f

end
end
puts “foods rounded up”

I should really document this better, but: In Revactor, Actors are
Fiber-based, so anything that blocks for prolonged periods of time will
hang
all Actors in the system. It’s the same sort of thing you’ll encounter
with
an evented framework like EventMachine.

Fortunately, there are “Actor-safe” replacements for most of the
blocking tasks you’ll perform in a networked application, namely: DNS
resolution, opening connections, SSL handshakes, and reading from and
writing to the network.
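
To see why a blocking call stalls everything, here is a minimal sketch of cooperative scheduling using plain Ruby Fibers. It is not Revactor's scheduler, just an illustration under that assumption: each fiber runs until it explicitly yields, so a fiber that called Kernel#sleep instead of yielding would hold up every other fiber for the whole sleep.

```ruby
log = []

# Each fiber does one step of work, then hands control back with
# Fiber.yield. Nothing else runs until it reaches that line.
workers = %w[chocolate seltzer].map do |name|
  Fiber.new do
    2.times do |step|
      log << "#{name} step #{step}"
      Fiber.yield :more
    end
    :done # the fiber's final return value marks it as finished
  end
end

# A naive round-robin scheduler: resume each fiber in turn and drop
# the ones that report :done.
pending = workers.dup
until pending.empty?
  pending.reject! { |fiber| fiber.resume == :done }
end

# log is now:
#   ["chocolate step 0", "seltzer step 0",
#    "chocolate step 1", "seltzer step 1"]
```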

In the case of sleep, there’s the handy:

Actor.sleep 1

Which is just shorthand for Actor.receive { |filter| filter.after 1 }

For anything related to networking, you need to use Actor::TCP or
Actor::HttpClient (which uses the fully asynchronous HTTP client from
Rev)

For long-running blocks of code which aren’t related to networking,
the next release of Revactor will be thread safe. This means you can
spin the long-running task off in a thread, and have it send a message
when it completes.
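
The general shape of that pattern can be sketched with the standard library alone. This is an assumption-laden stand-in, not the Revactor API: a Queue plays the role of the Actor's thread-safe mailbox, and the :task_done tag and the summation are placeholders for real work.

```ruby
mailbox = Queue.new # stand-in for an Actor's thread-safe mailbox

# Spin the long-running task off in a thread; it reports back by
# sending a message when it completes.
worker = Thread.new do
  total = (1..1_000_000).reduce(:+) # placeholder for slow, non-network work
  mailbox << [:task_done, total]
end

# The receiving side waits on the mailbox, not on the work itself.
tag, result = mailbox.pop
worker.join
```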

As a follow up, I realized why it wasn’t working: bug in filters. I’m
too scared to patch it up, but I think I’ll do that anyways. Example
problem:

myactor = Actor.spawn do
  Actor.receive do |filter|
    filter.when(:dog) { puts "I got a dog!" }
  end
end

myactor << :dog

Run it and see… NOTHING. it exits… so… what up G.

-------------------------------------------------------|
~ Ari
seydar: it’s like a crazy love triangle of Kernel commands and C code

On Feb 12, 2008 6:39 PM, fedzor [email protected] wrote:

myactor << :dog

Run it and see… NOTHING. it exits… so… what up G.

The problem here is that you need to do something which yields control
back to the Actor scheduler before the spawned Actor can run. The
simplest thing you can do is:

Actor.sleep 0

This will yield control to the scheduler, which will process any
outstanding messages then return control back to you (since you told it
you wanted to sleep for 0 seconds).

It’s pretty confusing from irb, I’ll admit…

When doing anything with Actors, just remember you’re “queuing up”
operations which will run later… later being whenever you call
Actor.receive. Actor.receive is the only way to defer control to other
Actors (keeping in mind Actor.sleep is just shorthand for Actor.receive)

All of the “blocking” operations in Revactor, things like
Actor::TCP.connect, Actor::HttpClient.get, etc., call Actor.receive
underneath.
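
The "queuing up" behavior can be illustrated with a toy model in plain Ruby. ToyActor and run_pending are hypothetical names invented for this sketch and do not reflect how Revactor is implemented; the point is only that sending a message enqueues it, and nothing is handled until control is handed to the scheduler.

```ruby
# Toy model only: ToyActor is a hypothetical class, not Revactor's Actor.
class ToyActor
  @run_queue = []

  class << self
    attr_reader :run_queue

    # Deliver all queued messages. This plays the role of yielding to
    # the scheduler, as Actor.sleep 0 is described as doing.
    def run_pending
      run_queue.shift.deliver until run_queue.empty?
    end
  end

  def initialize(&handler)
    @handler = handler
    @mailbox = []
  end

  # Sending only enqueues the message and marks the actor as runnable.
  def <<(message)
    @mailbox << message
    self.class.run_queue << self
    self
  end

  def deliver
    @handler.call(@mailbox.shift) until @mailbox.empty?
  end
end

log = []
myactor = ToyActor.new { |msg| log << "I got a #{msg}!" }

myactor << :dog
before = log.dup     # still empty: the message is only queued
ToyActor.run_pending # hand control to the "scheduler"
after = log.dup      # now contains "I got a dog!"
```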

On Feb 12, 2008, at 8:57 PM, Tony A. wrote:

wanted to
sleep for 0 seconds)

It’s pretty confusing from irb, I’ll admit…

When doing anything with Actors, just remember you’re “queuing up”
operations which will run later… later being whenever you call
Actor.receive. Actor.receive is the only way to defer control to
other
Actors (keeping in mind Actor.sleep is just shorthand for
Actor.receive)

BTW, do you hang on IRC? #ruby-lang, seydar.

So I need to do the Actor.sleep to dish out control, but how come I
didn’t have to do that with the version of revactor actually released?

-------------------------------------------------------|
~ Ari
if god gives you lemons
YOU FIND A NEW GOD

On Feb 12, 2008 7:11 PM, fedzor [email protected] wrote:

BTW, do you hang on IRC? #ruby-lang, seydar.

I’m typically on #rubinius and #erlang, tarcieri

So I need to do the Actor.sleep to dish out control, but how come I
didn’t have to do that with the version of revactor actually released?

The semantics of how Actors are scheduled changed slightly in trunk.
Previous versions of Revactor would return control back to the root
Actor /
Fiber in the event that all inter-Actor messages had been dispatched and
there were no pending network events. The trunk version adds a
thread-safe
message queue to allow Actors in one thread to send messages to Actors
in
another thread, and implementing this required a number of changes to
the
semantics of the scheduler.

There was a nasty idle loop bug in previous versions of Revactor, as
well. If you load Revactor 1.2, and call something to the effect of
Actor.receive { |f| f.when(:foo) {} } in the root Actor, the scheduler
will just sit and spin, because all messages have been dispatched and
there are no event sources. Infinite loop.

The new scheduler semantics make use in irb (or in RSpec) a bit more
cumbersome, but they add thread safety, a ~10% performance improvement
(per
tools/messaging_throughput.rb), and eliminate a potential infinite loop.
And that infinite loop isn’t just hypothetical: you’ll encounter it in
any
system which has run out of events to process, and it will immediately
get
in the way when implementing distributed systems which need to wait for
remote events.

Next release or next svn update? I’m running the svn version.

The next release. The svn version has the thread-safety improvements in
place, but they don’t have specs yet and are definitely buggy. I
wouldn’t try using them yet, but you won’t encounter any problems unless
you try to send messages between Actors across threads. The idle loop
bug is also fixed.

On Feb 12, 2008, at 8:48 PM, Tony A. wrote:

To execute long-running blocks of code which aren’t related to
networking,
the next release of Revactor will be thread safe. This means you
can spin
the long running task off in a thread, and have it send a message
when it
completes.

Next release or next svn update? I’m running the svn version.

~ Ari
English is like a pseudo-random number generator - there are a
bajillion rules to it, but nobody cares.

On Feb 13, 2008 1:45 AM, Tony A. [email protected] wrote:

So I need to do the Actor.sleep to dish out control, but how come I
didn’t have to do that with the version of revactor actually released?

Oh, also note: I have tried to restore the semantics of the original
scheduler by making the current Actor reschedule itself and relinquish
control to the event loop whenever sending messages, but my initial
attempts
at doing this introduced a number of scheduling bugs and broke several
of
the specs. I might give it another try…