Can I do this perl code the same in ruby?

kazaam · October 6, 2007, 6:50pm

Well I’m trying to “translate” a perl program to ruby and everything
worked fine until the near end where I’m now. There we have this perl
code:

my $slct = IO::Select->new($server);
while($slct->can_read()) {
my $nbytes = read $server, $response, 2**16;
last if !$nbytes;
$client->send($response);
}

$server is a socket handle, exactly like $client. But now I’m stuck. Is
there any equivalent to Perl’s can_read? Then there is this line:
my $nbytes = read $server, $response, 2**16;
last if !$nbytes;

Does that mean something like: read 2^16 bytes from the server, save the
byte count in nbytes, and append the data to response?

kazaam · October 6, 2007, 9:21pm

On Oct 6, 2007, at 10:50 AM, kazaam wrote:

$server is a socket-handle exactly as $client. But now I’m stuck.
Is there any equivalent to perls can_read ? Than this line here:
my $nbytes = read $server, $response, 2**16;
last if !$nbytes;

Means something like read from server 2^16 bytes save to nbytes and
append to response or?

cfp:~ > ruby -r io/wait -e 'p STDIN.ready?'
nil

cfp:~ > ruby -r io/wait -e 'p STDIN.ready?' < /dev/zero
-e:1:in `ready?': Operation not supported by device (Errno::ENODEV)
from -e:1

cfp:~ > ruby -r io/wait -e 'p STDIN.ready?' < /dev/null
-e:1:in `ready?': Operation not supported by device (Errno::ENODEV)
from -e:1

cfp:~ > ruby -r io/wait -e 'p STDIN.ready?' < /etc/passwd
1932

a @ http://codeforpeople.com/

kazaam · October 7, 2007, 12:25pm

Thanks for pointing me to this, but the main problem at the moment is
that Ruby’s select, compared to Perl’s, can’t handle sockets but only
arrays!?

kazaam · October 7, 2007, 1:17pm

On 10/7/07, kazaam wrote:

Thanks for pointing me to this but the main problem at the moment is, that rubys select compaired to perls can’t handle sockets but just arrays!?


Arrays of what, do you think? Maybe arrays of sockets? Actually,
arrays of any instance of IO.

kazaam · October 7, 2007, 3:47pm

Arrays of what do you think? Maybe array’s of sockets? Actually
array’s of any instance of IO.
Yes, you were right, even if this looks terrible now, because
myselec = IO.select([server]) fills myselec with [[socket]], so with two
nested arrays I have to extract the socket with myselec[0][0] … somehow
Perl looks more intuitive in this case.

That’s the Perl code I want to translate to Ruby:
http://mailman.linuxchix.org/pipermail/techtalk/2003-January/014338.html
and this is my Ruby code:

#!/usr/bin/env ruby

$Verbose=true

require 'socket'
require 'uri'
require 'io/wait'

$bind_port = '2222'
$bind_address='localhost'

# opens a socket on the local machine and binds the proxy to it

proxy = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
proxy.bind(Socket.pack_sockaddr_in($bind_port, $bind_address ))
proxy.listen( 5 )

# waits for Browser-client to connect

while client = proxy.accept
    fork()
    # read what comes from the Browser into request
    request=''
    while client[0].readline
        request += $_
        break if $_ =~ /^\s*$/m
        if $_ =~ /^GET .+/
            host = URI.parse(URI.extract($_)[0]).host
            port = URI.parse(URI.extract($_)[0]).port
        end
    end
    # connect to remote webserver and sends the request and read the response
    server = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
    server.connect(Socket.pack_sockaddr_in(port, host.chomp.sub('/','') ))
    server.write(request)
    response=''
    while server.readline
        response += $_
        break if $_ =~ /^\s*$/m
    end
    # sends the http-header to browser
    client[0].write(response)

    # listens for further responses of the server and sends it to the browser
    myselec = IO.select([server])
    while myselec[0][0].ready?
        response = server.read(2**16)
        break if !response
        client[0].write(response)
    end
end

This works a bit now when calling just one URL in the browser, but the
problem at the moment is the handling of the different child processes. I
can’t close client with client.close because it’s an array and not a
socket handle?!? So I’m struggling with translating this to Ruby:

kazaam · October 8, 2007, 9:13am

2007/10/6, kazaam:

    my $nbytes = read $server, $response, 2**16;
    last if !$nbytes;
Means something like read from server 2^16 bytes save to nbytes and append to response or?

As long as you are only copying between one pair of descriptors, you
do not need #select. You can simply do

while ( buffer = io_in.read(2**16) )
io_out.write(buffer)
end

If you just have few pairs, the code is easier with the above piece
put in a thread per pair.

Only if you are doing heavy copying (say, more than 10 pairs or so)
you should consider using #select. My 0.02EUR…
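
The loop above can be tried without a network. This is a minimal sketch in which StringIO objects stand in for the sockets; the method name `pump` and the sample strings are made up for illustration. It relies on `read(n)` returning nil at end of file, which is what ends the loop.

```ruby
require 'stringio'

# Copy everything from io_in to io_out in chunks of at most chunk_size bytes.
# read(chunk_size) returns nil at end of file, which terminates the loop.
# (`pump` is a hypothetical helper name, not part of any library.)
def pump(io_in, io_out, chunk_size = 2**16)
  copied = 0
  while (buffer = io_in.read(chunk_size))
    io_out.write(buffer)
    copied += buffer.bytesize
  end
  copied
end

# One thread per pair; StringIO stands in for real sockets here.
pairs = [
  [StringIO.new("first response"),  StringIO.new],
  [StringIO.new("second response"), StringIO.new]
]
threads = pairs.map { |src, dst| Thread.new { pump(src, dst) } }
byte_counts = threads.map(&:value)   # waits for every copy to finish
```

On a real socket, `read(2**16)` blocks until 2**16 bytes have arrived or the peer closes the connection, so a relay that should forward data as soon as it arrives may prefer `readpartial`, which raises EOFError at end of file instead of returning nil.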

Kind regards

robert

kazaam · October 7, 2007, 3:50pm

Hehe, I accidentally hit the shortcut for sending while writing the last
message; here is the missing part:

$SIG{CHLD} = sub { while (wait() > 0) {} };

while (my $client = $proxy->accept()) {

my $kidpid = fork();  die "cannot fork" unless defined $kidpid;

if ($kidpid) {
    close $client;  # no longer needed in parent
    next;
}
close $proxy;       # no longer needed in child
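
A rough Ruby counterpart of this Perl fragment might look like the sketch below. It assumes a TCPServer as the listener, whose accept returns a single socket; Socket#accept returns a two-element array of socket and address, which is why client.close fails on it. The method name `serve_forking` and the `max_clients` parameter are hypothetical and exist only so the loop can terminate in a test. Process.detach is used in place of the SIGCHLD handler.

```ruby
require 'socket'

# Accept connections and hand each one to a forked child process.
# The block receives the client socket and runs only in the child.
# (`serve_forking` and `max_clients` are illustrative names.)
def serve_forking(listener, max_clients = nil)
  served = 0
  while (client = listener.accept)
    pid = fork                 # nil in the child, the child's pid in the parent
    if pid
      client.close             # no longer needed in the parent
      Process.detach(pid)      # reap the child in the background
      served += 1
      break if max_clients && served >= max_clients
      next
    end
    listener.close             # no longer needed in the child
    yield client
    client.close
    exit!(0)                   # leave the child without running at_exit handlers
  end
  served
end

# Usage (blocks forever, so it is left commented out):
# serve_forking(TCPServer.new('localhost', 2222)) { |c| c.write("hello\r\n") }
```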

kazaam · October 8, 2007, 3:16pm

On 10/7/07, kazaam wrote:

Arrays of what do you think? Maybe array’s of sockets? Actually
array’s of any instance of IO.
Yes you have been right, also if this looks terrible now because: myselec = IO.select([server]) fills myselec with [[socket]] so with 2 Arrays I have to extract with myselec[0][0]…somehow perl looks more intuitive in this case.

Generally we use select thusly:

to_read, to_write, errored = IO.select([server])

Again, you apparently have only one socket; as Robert points out, you
don’t need to use select. select is generally for multiplexing IO, i.e.
you are working with multiple sockets. So it is going to seem awkward
when you are working with just one.
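
The destructuring form can be seen in a small, self-contained sketch. A pipe stands in for the socket here, and the timeout values are arbitrary choices. IO.select returns three arrays (readable, writable, errored), or nil when the timeout expires first.

```ruby
reader, writer = IO.pipe
writer.write("ping")

# Wait up to 1 second until reader has data; only the read list is given.
readable, writable, errored = IO.select([reader], nil, nil, 1)

# readable now contains the IO objects that can be read without blocking.
data = readable.first.readpartial(2**16)

# With a timeout of 0, select acts as a non-blocking poll:
# nothing is left to read, so it returns nil.
idle = IO.select([reader], nil, nil, 0)
```

The timeout-0 call is probably the closest match to a non-blocking can_read check on a single handle.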

kazaam · October 8, 2007, 5:06pm

Hmm, I wanted to use it because I have many different server sockets.
They are created in a fork. I did it now with your loop and it seems to
work, except that browsing through my web proxy is now pretty
slow and I’m getting really confusing errors:
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23

which seems like very strange behavior to me, because this loop means
“read a line from client[0] as long as possible”: while client[0].readline
Why is an EOF error possible there? Shouldn’t an EOF just end the
loop?

#!/usr/bin/env ruby

$Verbose=true

require 'socket'
require 'uri'

$bind_port = '23322'
$bind_address='localhost'

# opens a socket on the local machine and binds the proxy to it

proxy = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
proxy.bind(Socket.pack_sockaddr_in($bind_port, $bind_address ))
proxy.listen(1)

# waits for Browser-client connections

while client = proxy.accept
    fork()
    # reads the browsers request
    request=''
    while client[0].readline
        request += $_
        break if $_ =~ /^\s*$/m
        if $_ =~ /^GET .+/
            host = URI.parse(URI.extract($_)[0]).host
            port = URI.parse(URI.extract($_)[0]).port
        end
    end
    # connects to the webserver and sends the request
    server = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
    server.connect(Socket.pack_sockaddr_in(port, host.chomp.sub('/','') ))
    server.write(request)
    # reads webservers response
    response=''
    while server.readline
        response += $_
        break if $_ =~ /^\s*$/m
    end
    # sends the http-header to browser
    client[0].write(response)
    # listens for further responses of the server and sends it to the browser
    while ( response = server.read(2**16) )
        client[0].write(response)
    end
end

kazaam · October 8, 2007, 5:50pm

On 08.10.2007 17:03, kazaam wrote:

hmm I wanted to use it because I have many different server sockets. They are created in a fork.

What exactly do you mean by that? If you create them in a sub process
then you just have one pair per process - do you?

I did it now with your loop and it seems to work now except the fact the browsing through my webproxy is now pretty slow and I’m getting really confusing errors:
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23
./httpsocket.rb:23:in `readline': end of file reached (EOFError)
from ./httpsocket.rb:23

That’s why:
http://www.ruby-doc.org/core/classes/IO.html#M002298
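
The link presumably points at the documentation for IO#readline, which raises EOFError at end of file, whereas IO#gets returns nil there. A minimal sketch of the difference, using a StringIO with a made-up request in place of the socket:

```ruby
require 'stringio'

io = StringIO.new("GET / HTTP/1.0\r\n\r\n")

# gets returns nil at end of file, so it works as a loop condition.
lines = []
while (line = io.gets)
  lines << line
end

# readline raises EOFError at end of file instead of returning nil,
# so "while io.readline" can never finish by reaching EOF quietly.
raised = begin
  io.readline
  false
rescue EOFError
  true
end
```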

$bind_port = '23322'
$bind_address='localhost'

# opens a socket on the local machine and binds the proxy to it

proxy = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
proxy.bind(Socket.pack_sockaddr_in($bind_port, $bind_address ))
proxy.listen(1)

Why don’t you use TCPServer? Seems much easier for your use case.
There is a comprehensive example in the Pickaxe.
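
A minimal sketch of the TCPServer route, not the Pickaxe example itself: the port (0, so the operating system picks a free one), the canned reply, and the thread used to play the server role are all assumptions made so the snippet runs on its own. TCPServer#accept returns a single socket rather than an array, and TCPSocket.new connects in one call.

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS choose a free port
port   = server.addr[1]

# Server side, run in a thread only so this example is self-contained.
acceptor = Thread.new do
  client = server.accept                 # one TCPSocket, no client[0] needed
  request = String.new
  while (line = client.gets)
    request << line
    break if line =~ /^\s*$/             # a blank line ends the request header
  end
  client.write("HTTP/1.0 200 OK\r\n\r\n")
  client.close
  request
end

# Client side: TCPSocket.new replaces Socket.new plus connect.
sock = TCPSocket.new('127.0.0.1', port)
sock.write("GET / HTTP/1.0\r\n\r\n")
reply = sock.read                        # reads until the server closes
sock.close

request = acceptor.value
server.close
```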

# waits for Browser-client connections

while client = proxy.accept
fork()

I believe you are not using fork properly here. The easiest is to use
it with a block which gets executed in the child. If you do not do
that, you need to evaluate the return value of fork to determine whether
you are in the parent or child process.
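
Both forms can be sketched in a few lines. The exit status 7 is arbitrary and only there to show that the block really ran in a separate process; `exit!` is used so the child leaves without running any handlers inherited from the parent.

```ruby
# Block form: the block runs only in the child process;
# the parent immediately gets the child's pid back.
pid = fork do
  # ... handle one client here ...
  exit!(7)
end
_, status = Process.wait2(pid)
child_status = status.exitstatus

# Return-value form: fork returns nil in the child
# and the child's pid in the parent.
if (other_pid = fork)
  Process.wait(other_pid)   # parent: wait for the child to finish
else
  exit!(0)                  # child: do the work, then leave
end
```

A bare fork() without either check means that both parent and child carry on with the same code after the call.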

    # reads the browsers request
    request=''
    while client[0].readline
               request += $_
               break if $_ =~  /^\s*$/m
               if $_ =~ /^GET .+/
                       host = URI.parse(URI.extract($_)[0]).host
                       port = URI.parse(URI.extract($_)[0]).port
               end
    end

Why don’t you read the complete request? This way you can’t do POST and
PUT as far as I can see.
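
One way to read the complete request, sketched under the assumption that the body length is announced in a Content-Length header (chunked transfer encoding is not handled). The method name `read_request` and the sample request are hypothetical.

```ruby
require 'stringio'

# Read the request header up to the blank line, then read as many body
# bytes as Content-Length announces. Returns [head, body].
# (`read_request` is an illustrative name, not a library method.)
def read_request(io)
  head = String.new
  content_length = 0
  while (line = io.gets)
    head << line
    if (match = line.match(/\AContent-Length:\s*(\d+)/i))
      content_length = match[1].to_i
    end
    break if line =~ /\A\r?\n\z/         # blank line: end of header
  end
  body = content_length > 0 ? io.read(content_length).to_s : ""
  [head, body]
end

# StringIO stands in for the client socket.
io = StringIO.new("POST /form HTTP/1.0\r\nContent-Length: 5\r\n\r\nhelloEXTRA")
head, body = read_request(io)
```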

    # connects to  the webserver and sends the request  
    server = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
    server.connect(Socket.pack_sockaddr_in(port, host.chomp.sub('/','') ))
    server.write(request)

TCPSocket seems easier here.

    # reads webservers response
    response=''
    while server.readline
               response += $_
               break if $_ =~  /^\s*$/m
     end

You’re making your life much harder than necessary. Why don’t you just
do this

response = ""
server.each do |line|
  response << line
  break if /^\s*$/m =~ line
end

or

response = ""
while line = server.gets
  response << line
  break if /^\s*$/m =~ line
end

    # sends the http-header to browser
    client[0].write(response)
   # listens for further responses of the server and sends it to the browser
    while ( response = server.read(2**16) )
          client[0].write(response)
    end

end

It seems you are trying to write a HTTP proxy. If it is not for the
educational experience then I suggest to look into the RAA or in the
standard lib. I believe a proxy class is part of Webrick.

Kind regards

robert

kazaam · October 8, 2007, 6:32pm

It seems you are trying to write a HTTP proxy. If it is not for the
educational experience then I suggest to look into the RAA or in the
standard lib. I believe a proxy class is part of Webrick.

Thanks for these hints!! I didn’t know about RAA, which seems like a
really great collection. And you were also right about WEBrick, which
makes it pretty easy to write an HTTP proxy:

#!/usr/bin/env ruby
$Verbose=true

require "webrick"
require "webrick/httpproxy"

pch = Proc.new{|req, res|
  p [ req.request_line, res.status_line ]
}

def upstream_proxy
  if prx = ENV["http_proxy"]
    return URI.parse(prx)
  end
  return nil
end

httpd = WEBrick::HTTPProxyServer.new(
  :Port => 10080,
  :ProxyContentHandler => pch,
  :ProxyURI => upstream_proxy
)
Signal.trap(:INT){ httpd.shutdown }
httpd.start

kazaam · October 8, 2007, 7:40pm

On 08.10.2007 18:29, kazaam wrote:

It seems you are trying to write a HTTP proxy. If it is not for the
educational experience then I suggest to look into the RAA or in the
standard lib. I believe a proxy class is part of Webrick.

Thanks for these hints!! I didn’t know about RAA, which seems like a really great collection. And you were also right about WEBrick, which makes it pretty easy to write an HTTP proxy:

#!/usr/bin/env ruby
$Verbose=true

The variable is called $VERBOSE.

end
return nil
end

You can simplify that to

def upstream_proxy
  prx = ENV["http_proxy"] and URI.parse(prx)
end
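
The one-liner works because `and` binds more loosely than `=`: the assignment happens first, and URI.parse is only reached when the variable was set. A small sketch of this; the `env` parameter and the proxy URL are additions for illustration so the behaviour can be shown without touching the real environment.

```ruby
require 'uri'

# Parsed as: (prx = env["http_proxy"]) and URI.parse(prx)
# Returns nil when the variable is unset, a URI object otherwise.
# (The env parameter is hypothetical; the version above reads ENV directly.)
def upstream_proxy(env = ENV)
  prx = env["http_proxy"] and URI.parse(prx)
end

unset  = upstream_proxy({})
parsed = upstream_proxy("http_proxy" => "http://proxy.example:3128")
```

Writing `&&` instead would change the meaning, since `&&` binds tighter than `=` and URI.parse would then receive the still-unassigned (nil) prx.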

httpd = WEBrick::HTTPProxyServer.new(
:Port => 10080,
:ProxyContentHandler => pch,
:ProxyURI => upstream_proxy
)
Signal.trap(:INT){ httpd.shutdown }
httpd.start

Kind regards

robert

kazaam · October 9, 2007, 10:11am

The variable is called $VERBOSE.

thx I will do so! I always used $Verbose and never noticed any problems.
I always got warnings but I’m gonna trust you with it

I recognized it because I happened to be hacking on it
yesterday […]

Have you found a way to change requests before webrick/httpproxy
sends them out? ProxyContentHandler lets you change the response
before it is sent to the browser, but not the requests. I tried to change
the httpproxy.rb lib in several places with req.header["user-agent"] =
"another one than I’m using…", but neither at the end of choose_header
nor here:
header = Hash.new
choose_header(req, header)
set_via(header)

nor in any other place could I change it.

kazaam · October 8, 2007, 8:20pm

Robert K. wrote:

On 08.10.2007 18:29, kazaam wrote:
…
prx = ENV[“http_proxy”] and URI.parse(prx)
end

Heh, the former is actually the example from webrick in the ruby
distribution. (I recognized it because I happened to be hacking on it
yesterday and also thought it was a little verbose.)