What's the difference between a Thread and a Process?

They seem similar, but by no means identical. My guess is that Ruby
Threads are parallel execution paths within a single OS-level ‘task,’
but a Process is handed to the OS to be run as an independent task at
the OS level?

On May 9, 2006, at 3:28 PM, Dave H. wrote:

They seem similar, but by no means identical. My guess is that Ruby
Threads are parallel execution paths within a single OS-level
‘task,’ but a Process is handed to the OS to be run as an
independent task at the OS level?

Yes, but threads can also be implemented at the OS level. (Ruby’s
threads currently are not).

On May 09, 2006, at 8:41 pm, Logan C. wrote:

Yes, but threads can also be implemented at the OS level. (Ruby’s
threads currently are not).

The JRuby project maps Ruby threads to Java threads, and I think they
in turn get mapped to system threads. So JRuby probably has a better
threading model than normal Ruby! (Still waiting to see what Matz
turns out for the Ruby VM)

Ashley

On 5/9/06, Ashley M. wrote:

threading model than normal Ruby! (Still waiting to see what Matz
turns out for the Ruby VM)

Ashley

Ruby uses what are often called green threads. The OS is unaware of
them and they operate in the context of a single process/thread under
the OS with all task switching done by the Ruby interpreter. If you
want to create an application with 2000 threads this is actually a
pretty good thing. If you were hoping to spread your workload across
a multi-core/multi-CPU box, not so much. Also be aware that any OS blocking
operation in an extension will block all of your threads.

For more information on threads you might find
comp.programming.threads FAQ [last mod 97/5/24] helpful.

pth

On May 9, 2006, at 16:11, Ohad L. wrote:

I think the more notable difference is that threads share variables,
whereas forks don’t:

Aha. Now that’s interesting. What DO forks share? Apparently Signals
can move back and forth, to some degree…

This came up because I used Process to have a command-line script that
gave me the prompt back without me having to use a & at the end of the
command. Then I tried getting the ‘main’ path to hold up until the
sub-process had completed its initialization, but found that it wasn’t
possible to get information out of a Process. A Thread, sure, no
problem asking for Thread.status. But not a Process. I didn’t bother
trying a Thread, though, since I rather suspected that the Thread would
be unable to let my script actually complete and release the prompt.

Note: actually accomplishing the above task isn’t required. I just
fiddled with it to learn something. Which is another way of saying I
don’t currently need any suggestions on how I might accomplish that,
especially if they’re more than two or three lines of code. It’s not
worth the extra code. :)

They also share open file “handles”, at least on *nix systems, at the
point the fork occurs. (I’m not sure about Windows.) Each process may
then close some file handles (including stdin, stdout, and stderr) and
open others, independently of the other processes.
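A minimal sketch of the shared-handle point, assuming a Unix-like system where fork is available. The helper name `child_message_via_pipe` is made up for illustration. The pipe is opened before the fork, so both processes hold both ends, and each one closes the end it does not need:

```ruby
# Sketch: a pipe opened before fork is visible in both processes.
# `child_message_via_pipe` is a hypothetical helper name, not a Ruby API.
def child_message_via_pipe
  reader, writer = IO.pipe          # both ends are open before the fork

  pid = fork do
    reader.close                    # child closes the end it does not use
    writer.puts "hello from child #{Process.pid}"
    writer.close
  end

  writer.close                      # parent closes its own copy of the write end
  message = reader.read             # returns once every write end is closed
  reader.close
  Process.wait(pid)                 # reap the child so it does not linger
  [pid, message]
end

pid, message = child_message_via_pipe
puts message
```

Closing the write end in the parent matters here: the read only sees end-of-file once no process holds that end open any more.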

On 5/9/06, Dave H. wrote:

On May 9, 2006, at 16:11, Ohad L. wrote:

I think the more notable difference is that threads share variables,
whereas forks don’t:

Aha. Now that’s interesting. What DO forks share? Apparently Signals
can move back and forth, to some degree…


Dean W.
http://www.aspectprogramming.com

http://www.contract4j.org

Dave H. wrote:

They seem similar, but by no means identical. My guess is that Ruby
Threads are parallel execution paths within a single OS-level ‘task,’
but a Process is handed to the OS to be run as an independent task at
the OS level?

I think the more notable difference is that threads share variables,
whereas forks don’t:

$global = 0
fork { $global = 1 }
Process.wait # Wait for fork to finish
$global
=> 0
t = Thread.new { $global = 1 }
t.join # Wait for thread to finish
$global
=> 1

Hi Francis, your reply is awesome. I came across this thread while
searching for an answer about Solaris LWPs… a really great explanation
of threads and processes.

In Unix, every process in the system (except the very first one) is
created by forking from a parent process. Forking creates a carbon copy
of the parent process, including open file descriptors, memory stack and
heap, signal handlers and signal masks. It also creates a parent-child
relationship between the two processes involved in the fork, which comes
into play when either process ends. (If the parent ends first, the child
becomes the child of “init,” the process with pid 1. If the child ends
first, it becomes a “zombie” which the parent is required to clean up by
calling wait.) After a fork, the child process very often calls exec,
which replaces the running process with a new executable. There are also
a lot of subtleties relating to process groups and controlling terminals
that you’ll want to read up on if you want to be an expert. Forking a
process is a relatively heavy thing to do, but all modern Unixes make it
reasonably efficient through the magic of virtual memory: the memory
pages shared by parent and child are only copied when they change
(“copy-on-write”). The system call vfork goes further for the common
fork-then-exec case by not copying the parent’s address space at all.
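The fork, exec and wait sequence described above can be sketched in Ruby roughly as follows, assuming a Unix-like system. The helper name `fork_exec_status` is hypothetical. The child replaces itself with a fresh Ruby interpreter that exits with the given code, and the parent collects that status:

```ruby
require "rbconfig"

# Hypothetical helper: fork a child, replace it with a fresh Ruby
# interpreter via exec, and return the exit status the parent collects.
def fork_exec_status(code)
  pid = fork do
    # exec does not return on success: the child's program image is replaced.
    exec(RbConfig.ruby, "-e", "exit #{Integer(code)}")
  end
  _, status = Process.wait2(pid)    # blocks until the child ends, then reaps it
  status.exitstatus
end

puts fork_exec_status(3)
```

Without the wait call, the finished child would stay around as a zombie until the parent itself exits.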

Threads are distinct streams of control flow that are established within
processes. Creating a thread involves little more than the allocation of
a separate stack and (in some implementations) a kernel-crossing to set
up some data structures in the kernel. Threads in a process have no
parent-child relationship with each other, and are scheduled
independently according to some discipline which varies by
implementation. Since all the threads in a process share file
descriptors, memory, and signal handlers, they can easily interfere with
each other by changing data that is visible to other threads in ways
that the other threads don’t expect. This opens up the storied and vexed
topic of synchronization, which is a subject for another post.
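As a small taste of that topic, here is a sketch of the simplest form of synchronization in Ruby, a Mutex around a shared counter. The helper name `count_with_threads` is made up for illustration. Every thread sees the same `counter` variable, and the lock ensures only one of them updates it at a time:

```ruby
# Hypothetical helper: several threads increment one shared counter.
def count_with_threads(thread_count, increments)
  counter = 0                       # shared by every thread below
  lock = Mutex.new

  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        lock.synchronize { counter += 1 }   # one thread at a time in here
      end
    end
  end

  threads.each(&:join)              # wait for all threads to finish
  counter
end

puts count_with_threads(4, 1000)
```

Whether dropping the lock produces a wrong total depends on the Ruby implementation and its scheduling, which is exactly what makes this class of bug hard to reproduce.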

The Unix thread-API standard (POSIX 1003.1c) has been around for about
10 years, and in that time there has been a lot of debate over how best
to implement threads. You can implement them entirely within a process
(which means that library functions need to schedule them), or you can
implement them as kernel-scheduled entities (aka “lightweight
processes”). Native threads in Windows and Linux are LWPs. Solaris has
an elaborate hybrid model (“M:N”) in which userland threads are
scheduled onto LWPs by a combination of library and syscalls, and the
LWPs in turn are scheduled by the kernel onto the processors. After a
lot of controversy and some competing implementations, a greatly
improved 1:1 scheduling discipline (NPTL) was introduced in Linux for
the 2.6 kernels. (Linux threads today are reasonably useful and
scalable, which was far from true in the recent past.) It’s now
more-or-less accepted that elaborate scheduling schemes are more trouble
than they are worth, and for several versions now, even Solaris has
defaulted to a “fair-weighted” discipline in which each thread gets
pre-empted after a particular quantum of time (unless it yields). The
simplest (and reasonably correct) modern view of threads is as
mini-processes that will reliably be scheduled regardless of the
behavior of other threads.

Someone pointed out that Ruby threads are “green,” which means they are
visible only inside the process and not visible to the kernel. They are
scheduled and manipulated through library calls rather than system
calls. The early Java implementations did the same thing, but all Javas
now use the “native threads” provided on their respective platforms.
Java still suffers from remarkably poor thread-scheduling performance,
and it’s mysterious to me why Java programmers insist on overusing
threads. They are of course très chic, and there is an unshakeable
misperception that threads make programs run faster. In the future,
Ruby’s threads will also be native threads. Let’s hope the Ruby
implementation is better than Java’s.

Around the time that POSIX threads were standardized, Windows sprouted a
feature called “fibers.” Microsoft apparently didn’t like the fact that
a Solaris program could gleefully spin tens of thousands of threads
without breaking a sweat, whereas Windows “threads” were (and are)
extremely heavy and bogged down by very slow synchronization primitives.
(Once for fun, I implemented an uncounted fast mutex for multi-processor
Windows machines in x86 assembler with no kernel-crossings. It ran at
least 10 times faster than the native intra-process mutexes, which
Microsoft calls “critical sections”). So they added a userland thread
that they called a “fiber.” It was interesting until you read the fine
print: Microsoft strongly discouraged the use of fibers except for
masochists who wanted to implement POSIX threads on Windows.

Hope all this was somewhat useful to you. Threads in my experience are
one of the most seductive things in the programmer’s toolchest, but they
are subtle and dangerous, and seriously overused. I get criticized
constantly for this, but I believe threads should be avoided. There are
only a few situations where they are genuinely useful. They’re easy
enough to understand but difficult to truly master, and synchronization
is a fine art. Threads are very hard to debug, which I think wipes out
their benefits in most cases.

Patrick H. wrote in post #75228:

Ruby uses what are often called green threads. The OS is unaware of
them and they operate in the context of a single process/thread under
the OS with all task switching done by the Ruby interpreter.

Since someone resurrected this old thread I would like to point out that
the above is no longer true for Ruby MRI as of 1.9. From that version
on, native threads are used, although there is still the GIL (Global
Interpreter Lock), which reduces the concurrency of threads dramatically
and gives performance similar to the green threads. (There may be some
situations where there is some improvement, IIRC with loads of work
done in non-Ruby code such as a native C extension.)

If you
want to create an application with 2000 threads this is actually a
pretty good thing. If you were hoping to spread your workload across
a multi-core/multi-CPU box, not so much. Also be aware that any OS blocking
operation in an extension will block all of your threads.

For IO operations in Ruby’s standard library that is not true though:
they use non-blocking IO to allow the interpreter to do some other work
while a channel is not ready to read or write, even with the earlier
green thread versions.
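A rough sketch of that behaviour, using a hypothetical helper named `work_while_reader_blocks`: one thread waits on a pipe read while the main thread carries on, then unblocks it by writing and closing the pipe.

```ruby
require "thread"   # Queue is built in on current Rubies; harmless there

# Hypothetical helper: one thread waits on a pipe while the main thread works.
def work_while_reader_blocks
  reader, writer = IO.pipe
  events = Queue.new                # thread-safe log of what happened

  blocked = Thread.new do
    data = reader.read              # waits here until the writer is closed
    events << "reader got #{data}"
  end

  # The main thread keeps running while the other thread waits on the pipe.
  events << "main finished work: #{(1..1000).sum}"
  writer.write("ping")
  writer.close                      # end-of-file lets the reader thread continue
  blocked.join
  reader.close
  [events.pop, events.pop]
end

puts work_while_reader_blocks
```

The main thread's entry always comes first, because the reader cannot finish until the pipe has been written and closed.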

Cheers

robert

Processes and threads are both independent sequences of execution. The
typical difference is that threads run in a shared memory space, while
processes run in separate memory spaces.

A process has a self-contained execution environment, which means it has
a complete, private set of basic run-time resources. In particular, each
process has its own memory space. Threads exist within a process and
every process has at least one thread.

Each process provides the resources needed to execute a program. Each
process is started with a single thread, known as the primary thread. A
process can have multiple threads in addition to the primary thread.

On a multiprocessor system, multiple processes can be executed in
parallel. Multiple threads of control can exploit the true parallelism
possible on multiprocessor systems.

Rampat
