Ruby fork (thread-like) on Windows

Hi All,

I’m trying to upload a number of large files to S3. I want to do this
in parallel using threads, but it looks like the S3 gems for Windows
aren’t thread-safe.

I then started looking into fork. Memory use and process creation time
aren’t limiting factors.


So the question is: how can I create a thread-like experience using
fork (or something similar)? I want it to run only a subset of the
entire program (ideally what’s in the block), and I want to wait for
it to finish.


I tried win32/process#fork, but it runs the entire program twice. And
open("|-", "r") isn’t supported on Windows either.

Any ideas are appreciated…

Adam

Yeah, the best bet is to spawn off multiple processes.
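
A minimal sketch of what that could look like, assuming a Ruby recent enough to have Process.spawn (1.9 or later). WORKER_SCRIPT and upload_in_parallel are made-up names, and the worker body is only a stand-in for the real S3 upload code:

```ruby
require "rbconfig"

# Hypothetical worker body. A real one would do the S3 upload here and
# exit non-zero on failure; this stand-in only checks its argument.
WORKER_SCRIPT = 'exit(ARGV[0].to_s.empty? ? 1 : 0)'

# Start one child Ruby process per file, then wait for every child.
# Returns a hash of path => true/false taken from the child's exit status.
def upload_in_parallel(paths)
  children = paths.map do |path|
    # Passing the command as separate arguments avoids going through a shell.
    [path, Process.spawn(RbConfig.ruby, "-e", WORKER_SCRIPT, path)]
  end

  children.each_with_object({}) do |(path, pid), results|
    _, status = Process.wait2(pid) # blocks until this child has exited
    results[path] = status.success?
  end
end
```

Each child runs only the worker script rather than the whole program, and the parent blocks until all of them have finished, which is roughly the "run the block and wait for it" behaviour asked for.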

On Fri, May 27, 2011 at 1:44 AM, Roger P. wrote:

> Yeah, the best bet is to spawn off multiple processes.

Though, I wonder if this is an improvement over serial uploading.
Unless the upload rate of each connection is capped at a fraction of the
total upload speed (on either the uploading or the receiving end), two
files uploaded in parallel will each get half* the bandwidth, and so
finish no sooner than the same two files uploaded in series.
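
To put numbers on that, here is a back-of-the-envelope calculation. The figures (two 100 MB files, a 10 MB/s uplink) and the method names are made up for illustration, and the model assumes the link is shared evenly with no per-connection cap:

```ruby
# Two equal files uploaded one after the other, each at full link speed.
def serial_seconds(file_mb, link_mb_per_s)
  2 * (file_mb / link_mb_per_s)
end

# Two equal files uploaded at once, each getting half of the link.
def parallel_seconds(file_mb, link_mb_per_s)
  file_mb / (link_mb_per_s / 2.0)
end

puts serial_seconds(100.0, 10.0)   # 20.0
puts parallel_seconds(100.0, 10.0) # 20.0
```

Parallel uploads only come out ahead if a single connection cannot saturate the link on its own.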

Additionally, the program complexity rises: instead of keeping tabs on
successful uploads of one file, it’s now n uploads that have to be
monitored and redone on failure.

But since engineering challenges are fun, I’d do it like this (if I had
to):

  • Create an upload queue server containing all unfinished uploads.
  • Spawn several worker processes that check the queue for the next
    available upload (beware: race condition. A simple block-and-backoff
    strategy would be sufficient to prevent that, I think), and mark their
    assigned uploads.
  • Once the upload is finished, the worker process sends the all
    clear, and the upload gets removed from the queue.
  • If a worker process cannot finish its upload for whatever reason,
    the upload gets marked as unfinished again.
  • If a worker finishes its upload, it marks the queue item for deletion.
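
The bookkeeping in that list could be sketched roughly as follows. This is an in-process model only: UploadQueue and its method names are invented for the example, and a real queue server would need some form of inter-process communication in front of it so that separate worker processes can reach it. A Mutex stands in for whatever locking prevents two workers from claiming the same upload:

```ruby
# Hypothetical in-memory model of the upload queue described above.
class UploadQueue
  def initialize(paths)
    @lock = Mutex.new
    @state = {}
    paths.each { |path| @state[path] = :unfinished }
  end

  # Atomically hand out the next unfinished upload and mark it as assigned.
  # Returns nil when nothing is available.
  def claim
    @lock.synchronize do
      path, _state = @state.find { |_, state| state == :unfinished }
      @state[path] = :assigned if path
      path
    end
  end

  # Worker sends the all clear: the upload is removed from the queue.
  def done(path)
    @lock.synchronize { @state.delete(path) }
  end

  # Worker could not finish: the upload is marked as unfinished again.
  def failed(path)
    @lock.synchronize { @state[path] = :unfinished if @state.key?(path) }
  end

  def empty?
    @lock.synchronize { @state.empty? }
  end
end
```

A worker would then loop on claim, attempt the upload, and call done or failed depending on the outcome, until claim returns nil.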

Rails apps can use delayed_job (http://rubygems.org/gems/delayed_job),
which does something similar, so go ahead and steal what you can.

* Actually a little less due to a touch more overhead, but I doubt
  that that is a significant factor for 99% of cases.


Phillip G.

A method of solution is perfect if we can foresee from the start,
and even prove, that following that method we shall attain our aim.
– Leibnitz