Before anything else, thanks for your long and interesting reply.
I can’t find any documentation of the TCP protocol itself.
The documentation is still incomplete; I will wait a few versions
before improving it.
Looking at the code, the call appears to be a four-byte length followed
by a Marshal dump of
[@uri.path[1…-1], name, args]
and the response is a four-byte length followed by a Marshal dump of a
single object. This means that the protocol is similar to DRb, but
different, i.e. not compatible.
That’s true. I am not trying to make the protocol compatible, because I
think it is difficult to add new ideas while keeping the same protocol.
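As a rough illustration of the framing described above, here is a minimal sketch of a length-prefixed Marshal frame. The helper names `write_frame` and `read_frame` are hypothetical, and the big-endian byte order of the length is an assumption; neither is taken from the library's code.

```ruby
require "stringio"

# Hypothetical helper: writes a 4-byte length followed by a Marshal dump.
# Big-endian ("N") is an assumption about the byte order.
def write_frame(io, object)
  payload = Marshal.dump(object)
  io.write([payload.bytesize].pack("N"))
  io.write(payload)
end

# Hypothetical helper: reads one frame back and unmarshals it.
# Marshal.load must only be used on data from a trusted peer.
def read_frame(io)
  header = io.read(4)
  raise EOFError, "connection closed" if header.nil? || header.bytesize < 4
  length = header.unpack1("N")
  payload = io.read(length)
  raise EOFError, "truncated frame" if payload.nil? || payload.bytesize < length
  Marshal.load(payload)
end

# An in-memory buffer stands in for the TCP socket.
io = StringIO.new(String.new)
write_frame(io, ["object_path", :add, [1, 2]])
io.rewind
call = read_frame(io)
```

The same two helpers would serve for both the call and the response, since both are described as a length followed by a single dumped object.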
I can see a problem with this protocol: it’s impossible to distinguish
between a method returning an Exception object and a method raising an
exception.
Also true. I had not realized that. I will try to fix it.
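One possible way to remove the ambiguity is to wrap every response in a small envelope that says whether the value was returned or raised. The sketch below assumes a two-element `[status, value]` array; the `invoke` and `unwrap` helpers and the `Demo` class are hypothetical and only illustrate the idea.

```ruby
# Server side: run the call and tag the outcome.
def invoke(target, name, args)
  [:return, target.public_send(name, *args)]
rescue StandardError => e
  [:raise, e]
end

# Client side: hand back returned values, re-raise raised exceptions.
def unwrap(response)
  status, value = response
  case status
  when :return then value
  when :raise  then raise value
  else raise ArgumentError, "unknown status #{status.inspect}"
  end
end

class Demo
  # Returns an exception object as an ordinary value.
  def make_error
    ArgumentError.new("just a value")
  end

  # Actually raises.
  def fail!
    raise ArgumentError, "really raised"
  end
end

# Marshal round trips stand in for the network.
returned = unwrap(Marshal.load(Marshal.dump(invoke(Demo.new, :make_error, []))))
raised = begin
  unwrap(Marshal.load(Marshal.dump(invoke(Demo.new, :fail!, []))))
  nil
rescue ArgumentError => e
  e
end
```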
There are certainly problems which DRb has which could do with solving.
Chief in my mind these are:
(1) Doesn’t run easily over FastCGI, IO.popen and stdin/stdout, or Unix
domain sockets.
(2) Doesn’t allow bi-directional method calls through a single
connection, e.g. for use through a firewall and/or NAT and port forwarding.
(3) Doesn’t interoperate with non-Ruby applications. I would like to be
able to carry strings and integers as plain strings. This would make it
easy to interoperate with Perl, say.
(4) Not easy to secure.
(5) Large and complex implementation, not easy to understand, especially
with regards to connection caching and what happens if connections are
dropped.
Your library addresses (4), (5) and part of (1). But I think (2) and (3)
need careful thinking to produce a suitable protocol. As for (2), I’m
not sure your protocol supports proxy objects at all, a.k.a.
DRbUndumped, but please correct me if I’m wrong.
I think a replacement protocol for DRb should have the following
characteristics:
- ‘Requests’ and ‘Responses’ explicitly marked as such, so that calls
can be made in both directions down the same socket.
I will do this, you are right.
- Each Request and Response to be tagged with an ID, so multiple
overlapping calls can be multiplexed onto the same socket, meaning
that only a single TCP connection needs to be kept open
between a particular pair of hosts.
Very interesting too, I will implement this too.
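To make the two points above concrete, here is a small sketch of messages explicitly marked as requests or responses and tagged with an ID. The `[kind, id, payload]` message shape and the `Multiplexer` class are assumptions for illustration, not part of any existing protocol.

```ruby
# Hypothetical message shape: [kind, id, payload],
# where kind is :request or :response.
class Multiplexer
  def initialize
    @next_id = 0
    @pending = {} # id => Queue waiting for the matching response
    @lock = Mutex.new
  end

  # Builds an outgoing request and a slot where its response will arrive.
  def request(payload)
    @lock.synchronize do
      id = (@next_id += 1)
      @pending[id] = Queue.new
      [[:request, id, payload], @pending[id]]
    end
  end

  # Routes an incoming message. Responses wake the matching caller;
  # requests are answered by the given block, using the same id.
  def dispatch(message)
    kind, id, payload = message
    case kind
    when :response
      slot = @lock.synchronize { @pending.delete(id) }
      slot&.push(payload)
      nil
    when :request
      [:response, id, yield(payload)]
    end
  end
end

mux = Multiplexer.new
msg_a, slot_a = mux.request("first")
msg_b, slot_b = mux.request("second")

# Responses may come back in any order; the id pairs them up.
mux.dispatch([:response, msg_b[1], "reply to second"])
mux.dispatch([:response, msg_a[1], "reply to first"])
answer_a = slot_a.pop
answer_b = slot_b.pop

# The same socket can also carry a request from the other side.
incoming = mux.dispatch([:request, 7, "ping"]) { |payload| payload.upcase }
```

Because each side keeps its own table of pending IDs, both ends can issue calls over one connection without confusing the other side's requests with their own responses.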
- Ability to pass Undumped objects in such a way that the callback
is either made back down the same socket, or to a global URI (the
latter is what DRb does). The latter also allows the callback to
be made independently, and avoids forming long chains of tunnels. (+)
I don’t really understand this point; my English is not perfect.
Could you explain it? Thanks.
- Basic protocol uses only string encoding, allowing tiny implementation
suitable for embedded devices etc, and cross-language interoperability.
- Optionally enable other native marshalling protocols (e.g. Marshal,
YAML, Perl Storable etc) if understood by both sides. Ideally negotiated.
I have already thought about that, and I would like to release it in the
next version (0.5). I would like to support these serialization methods:
- YAML
- JSON
- XML
- Marshal (Ruby)
And, at a later stage:
There are still some problems I don’t know how to solve, for example how
to serialize a Ruby exception in PHP format.
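A negotiation step could be as simple as each side listing the formats it understands and picking the first one they share. The sketch below assumes a hypothetical `SERIALIZERS` table and `negotiate` helper, and uses only Ruby's standard `Marshal`, `YAML` and `JSON` modules.

```ruby
require "json"
require "yaml"

# Hypothetical table: format name => [encoder, decoder].
SERIALIZERS = {
  "marshal" => [->(obj) { Marshal.dump(obj) },  ->(str) { Marshal.load(str) }],
  "yaml"    => [->(obj) { YAML.dump(obj) },     ->(str) { YAML.safe_load(str) }],
  "json"    => [->(obj) { JSON.generate(obj) }, ->(str) { JSON.parse(str) }]
}.freeze

# Picks the first format in our preference order that the peer also lists.
# Returns nil when there is nothing in common.
def negotiate(ours, theirs)
  ours.find { |name| theirs.include?(name) }
end

# A non-Ruby peer would not offer "marshal", so the next shared format wins.
chosen = negotiate(%w[marshal yaml json], %w[json yaml])
encode, decode = SERIALIZERS.fetch(chosen)
round_trip = decode.call(encode.call({ "method" => "add", "args" => [1, 2] }))
```

With a scheme like this, only plain strings, numbers, arrays and hashes survive every format, which is one reason exceptions are awkward to carry across languages.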
- Optional SASL authentication as well as, or instead of, SSL. This
allows basic username/password authentication (AUTH PLAIN) which is
simpler to configure than client certificates.
That could be a great idea, but I’m not familiar with SASL. I will take
a look at some docs.
- Optional synchronous mode (one request -> one response) for tunneling
over HTTP or FastCGI and for simplistic implementations in embedded devices
I will think about this too. It could be cool, but I don’t know if I
will be able to merge it with the current code.
- Automatic establishment or re-establishment of dropped connections
Already planned
I also think that implementations should consider using an opaque key to
identify each object, rather than its object_id. For one thing,
object_ids can be recycled; for another, it prevents people probing
for random objects which have not been explicitly shared. Keeping a
mapping table of opaque_key => object will prevent DRbUndumped objects
from being garbage collected (although this may or may not be desirable).
object_id is only used with the logger (through the client_id method) to
print something relevant to the user, and it is, for example, overridden
by TCP::Server. So, if I’m not wrong, it’s not necessary.
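For reference, the opaque-key mapping table suggested above could look roughly like this. The `ObjectRegistry` class and its method names are hypothetical; the random keys come from Ruby's standard `SecureRandom` module.

```ruby
require "securerandom"

# Hypothetical registry mapping unguessable keys to shared objects.
class ObjectRegistry
  def initialize
    @objects = {}
    @lock = Mutex.new
  end

  # Shares an object and returns the key to hand to the remote side.
  # Holding the object in the table also keeps it from being collected.
  def share(object)
    key = SecureRandom.hex(16)
    @lock.synchronize { @objects[key] = object }
    key
  end

  # Looks up a shared object; raises KeyError for unknown keys, so
  # objects that were never shared cannot be reached by guessing.
  def fetch(key)
    @lock.synchronize { @objects.fetch(key) }
  end

  # Drops the mapping so the object can be garbage collected again.
  def release(key)
    @lock.synchronize { @objects.delete(key) }
  end
end

registry = ObjectRegistry.new
shared = Object.new
key = registry.share(shared)
found = registry.fetch(key)
```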
Just a few thoughts.
Thanks again for your thoughts.
Regards,
Brian.
(+) It would be useful to pass a different global URI depending on where
you are connecting to. For example, if you are connecting outbound through
the Internet, the global URI you expose may be your firewall’s outside IP or
hostname and a port on the firewall which forwards inbound connections.
To make this more transparent, it could be useful to auto-detect NAT.
This could be done by passing your actual source IP and port inside the
connection message; the far end would compare these to the seen source
IP and port. IPSEC NAT traversal uses a similar mechanism.
This could be great, but not for the moment. I will wait for a stable
and clean API before adding this kind of feature.
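For whenever this is revisited, the comparison itself is small. The sketch below assumes the connecting side sends the source IP and port it believes it is using, and the far end compares them with what it actually sees; the `nat_detected?` helper is hypothetical and the addresses are documentation examples.

```ruby
require "ipaddr"

# Hypothetical check run by the far end: true when the address the peer
# claims differs from the address the connection actually came from.
def nat_detected?(claimed_ip, claimed_port, seen_ip, seen_port)
  IPAddr.new(claimed_ip) != IPAddr.new(seen_ip) || claimed_port != seen_port
end

# Direct connection: claimed and seen values agree.
direct = nat_detected?("203.0.113.5", 40000, "203.0.113.5", 40000)

# Behind NAT: a private source address was rewritten on the way out.
natted = nat_detected?("192.168.1.10", 40000, "203.0.113.5", 61234)
```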
Best regards.