Concurrent (using threads) slower than sequential - doubt


#1

Hi Folks.
While starting to study the benefits of using threads in Ruby, I tried
to solve the following problem:

I have 3 text files ( C:\numbers0.txt, C:\numbers1.txt, C:\numbers2.txt );
each file contains a very large list of numbers.
I read each file and compute its subtotal in a separate thread.
Finally, I sum all the subtotals to produce the final result.

Here is the code.

require 'thread'
m_threads = []

print "INITIAL TIME := ", initial_time = Time.now, "\n"
3.times do |i|
  m_threads[i] = Thread.new do
    total_per_thread = 0
    # Single quotes keep the backslash literal; "C:\n..." would embed a newline.
    case i
    when 0 then path = 'C:\numbers0.txt'
    when 1 then path = 'C:\numbers1.txt'
    when 2 then path = 'C:\numbers2.txt'
    end
    File.open( path, "r" ) do |m_file|
      while line = m_file.gets
        total_per_thread = line.to_i + total_per_thread
      end
      # Store the subtotal in a thread-local variable for the main thread.
      Thread.current[:INDEX] = total_per_thread
    end
  end
end

result = 0
m_threads.each{ |t| t.join; result = t[:INDEX] + result; }

print "FINAL TIME := ", final_time = Time.now, "\n"
print "TOTAL TIME := ", total_time = final_time-initial_time, "\n"
print "Total := ", result, "\n"

=======================================
Output (CONCURRENT - Using Threads):

INITIAL TIME := Sun Oct 05 22:07:26 -0500 2008

FINAL TIME := Sun Oct 05 22:07:38 -0500 2008
TOTAL TIME := 11.485
Total := 1150000000

I verified that each thread did its job, and the result is correct too.
I also solved the same problem with a sequential program that uses no
threads at all.
Here is the code:

print "INITIAL Time := ", initial_time = Time.now, "\n"

# Single quotes keep the backslash literal; "C:\n..." would embed a newline.
paths = [ 'C:\numbers0.txt', 'C:\numbers1.txt', 'C:\numbers2.txt' ]
result = 0
for m_path in paths
  # Read-only mode is enough here; the files are never written.
  File.open( m_path, "r" ) do |m_file|
    while line = m_file.gets
      result = line.to_i + result
    end
  end
end

print "FINAL time := ", final_time = Time.now, "\n"
print "TOTAL time := ", total_time = final_time - initial_time, "\n"
print "Total := ", result, "\n"

=======================================
Output: (SEQUENTIAL - NO Threads)

INITIAL TIME := Sun Oct 05 22:34:47 -0500 2008
FINAL TIME := Sun Oct 05 22:34:57 -0500 2008
TOTAL TIME := 10.656
Total := 1150000000

=======================================
As you can see, the thread-based program runs slower.
I thought that using threads would make it faster, but it didn’t… Why
is it slower?

Any help would be much appreciated.


#2

Hi,

In message “Re: Concurrent (using threads) slower than sequential - doubt”
on Mon, 6 Oct 2008 12:40:08 +0900, Carlos O.
removed_email_address@domain.invalid writes:

|As you can see, the thread-based program runs slower.
|I thought that using threads would make it faster, but it didn’t… Why
|is it slower?

Threads require context switching, so they tend to run slower,
especially green threads like the ones Ruby 1.8 has.

          matz.
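
To look at the threading overhead in isolation, here is a minimal sketch that takes file I/O out of the picture and runs the same pure-CPU work three times, once sequentially and once in three threads. The names sum_up_to and N are made up for this illustration and do not come from the original post. On an interpreter that runs only one Ruby thread at a time (green threads in 1.8, and as far as I know the global lock in later MRI versions), the threaded run is not expected to be faster; actual timings depend on the machine and the Ruby version.

```ruby
require 'benchmark'

# Hypothetical CPU-bound task: sum the integers 1..n in a plain loop.
def sum_up_to(n)
  total = 0
  i = 1
  while i <= n
    total += i
    i += 1
  end
  total
end

N = 200_000 # assumed workload size; adjust for your machine

sequential_result = nil
threaded_result = nil

sequential_time = Benchmark.realtime do
  sequential_result = (1..3).inject(0) { |sum, _| sum + sum_up_to(N) }
end

threaded_time = Benchmark.realtime do
  threads = (1..3).map { Thread.new { sum_up_to(N) } }
  # Thread#value waits for the thread and returns the block's result.
  threaded_result = threads.inject(0) { |sum, th| sum + th.value }
end

puts "sequential: #{sequential_time}s, result #{sequential_result}"
puts "threaded:   #{threaded_time}s, result #{threaded_result}"
```

Both runs must print the same result; only the elapsed times are of interest.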

#3

Carlos O. wrote:

As you can see, the thread-based program runs slower.
I thought that using threads would make it faster, but it didn’t… Why
is it slower?

You may want to try with JRuby, which actually uses native threads. On a
multi-core system, it should improve performance.

- Charlie

#4

2008/10/6 Yukihiro M. removed_email_address@domain.invalid

In message “Re: Concurrent (using threads) slower than sequential - doubt”
on Mon, 6 Oct 2008 12:40:08 +0900, Carlos O. removed_email_address@domain.invalid writes:

|As you can see, the thread-based program runs slower.
|I thought that using threads would make it faster, but it didn’t… Why
|is it slower?

Threads require context switching, so they tend to run slower,
especially green threads like the ones Ruby 1.8 has.

There is another issue which may easily have a more serious impact:
since all three files reside in the same directory, they are read from
the same physical device (most likely a local (S)ATA disk). And since
these files are large, chances are that they are spread over the disk
and do not fit into the operating system’s buffer cache. Reading them
concurrently will lead to considerably more head movement and less
efficient disk caching than the sequential approach.

Kind regards

robert


#5

2008/10/6 Carlos O. removed_email_address@domain.invalid:

If I take out the t.join statement, the interpreter throws an error:

PbaThreads.rb:10 : undefined method `+' for nil:NilClass (NoMethodError)

Could you clarify this, please?

Well, this is obvious: you cannot access the result before it’s there.
Since you are setting this as the last statement in the thread you
need to wait (i.e. join) until the thread finishes.
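
The point above can be reproduced with a small sketch. The start_worker helper and the value 42 are invented for this illustration; the sleep stands in for the time spent reading a large file. Reading the thread-local before the join will most likely give nil, because the thread has not reached its last statement yet; after the join the value is guaranteed to be there.

```ruby
# Hypothetical worker: sleeps briefly to simulate work, then stores its
# result in a thread-local variable as its last step.
def start_worker
  Thread.new do
    sleep 0.2
    Thread.current[:INDEX] = 42
  end
end

t = start_worker
before_join = t[:INDEX]   # most likely nil: the thread is still sleeping
t.join                    # block until the thread has finished
after_join = t[:INDEX]    # the value has been set by now

p before_join
p after_join
```

This is why t[:INDEX] + result raises NoMethodError without the join: nil has no + method.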

Btw, you can use Thread#value for this. Here’s a variant:

require 'benchmark'

# 0..2 matches numbers0.txt through numbers2.txt from the original post.
files = (0..2).map {|i| "C:\\numbers#{i}.txt"}

Benchmark.bmbm 10 do |b|
  b.report "threaded" do
    threads = files.map do |file|
      Thread.new file do |f|
        File.open f do |io|
          io.inject(0) {|sum, l| sum + l.to_i}
        end
      end
    end

    # Thread#value joins the thread and returns the block's result.
    puts threads.inject(0) {|sum, th| sum + th.value}
  end

  b.report "sequential" do
    puts files.inject(0) {|s, f|
      File.open f do |io|
        io.inject(s) {|sum, l| sum + l.to_i}
      end
    }
  end
end

Kind regards

robert


#6

Thank you all (Matz, Charles and Robert).

Just one more question…

Since the threads I created are stored in an array of Thread
objects, I tried to access each one by using [ ] notation:

for t in m_threads
  print t[:INDEX], "\n"
end

The interpreter does not throw any error, but results always indicate:
nil
nil
nil

I tried to verify if they are still running:

Thread.list.each{|t| p t}

Results were:
#<Thread:0x29c5fc0 run>
#<Thread:0x29c6100 run>
#<Thread:0x29c6240 run>
#<Thread:0x294c74c run>

So indeed they are running… The question is: why can’t I access the
content of the array?
In fact, in the statement
m_threads.each{ |t| t.join; result = t[:INDEX] + result; }

I can only compute the result variable after executing t.join…
If I take out the t.join statement, the interpreter throws an error:

PbaThreads.rb:10 : undefined method `+' for nil:NilClass (NoMethodError)

Could you clarify this, please?

Best Regards



#7

[2] …if you’re on a multicore machine. Oops. Will be fixed in
the next release.

It’s released…

gegroet,
Erik V.


#8

Erik V. wrote:

[2] …if you’re on a multicore machine. Oops. Will be fixed in
the next release.

It’s released…

gegroet,
Erik V.

Thank you Erik and Robert…

I will try on both environments.

Regards
Carlos


#9

If you are on Linux, you might want to have a look at the gem
“forkandreturn” [1]. ForkAndReturn handles each element in an
enumeration in a separate process [2].

gegroet,
Erik V. - http://www.erikveen.dds.nl/

[1] http://www.erikveen.dds.nl/forkandreturn/doc/index.html

[2] …if you’re on a multicore machine. Oops. Will be fixed in
the next release.


$ cat count1.rb
files = ["numbers0.txt", "numbers1.txt", "numbers2.txt"]
result = 0

files.collect do |file|
  res = 0

  File.open(file) do |file|
    file.each do |line|
      res += line.to_i
    end
  end

  res
end.each do |res|
  result += res
end

p result


$ diff -ur count[12].rb | clean_diff
+require "forkandreturn"
+
 files = ["numbers0.txt", "numbers1.txt", "numbers2.txt"]
 result = 0

-files.collect do |file|
+files.concurrent_collect do |file|
   res = 0

   File.open(file) do |file|

$ time ruby count1.rb
81627450482688

real 0m15.309s
user 0m15.201s
sys 0m0.076s


$ time ruby count2.rb
81627450482688

real 0m8.976s <=== Multicore!
user 0m17.177s <=== Multicore!
sys 0m0.204s


$ uname -a
Linux laptop 2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008
i686 GNU/Linux


$ ruby --version
ruby 1.8.6 (2008-06-20 patchlevel 230) [i686-linux]


$ gem list | grep -ie forkandreturn
forkandreturn (0.2.0)
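
For readers who want to see the general idea without installing the gem, here is a rough sketch of process-based collection using only Kernel#fork, IO.pipe and Marshal from core Ruby. This is not how forkandreturn is implemented (I have not checked its source); fork_collect and sum_file are hypothetical names made up for this example, and it only works on platforms that support fork.

```ruby
# Hypothetical helper: run the block for each item in its own child
# process and return the results in the original order.
# Unix-only, because it relies on Kernel#fork.
def fork_collect(items)
  jobs = items.map do |item|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      # Marshal serialises the block's result so it can cross the pipe.
      writer.write(Marshal.dump(yield(item)))
      writer.close
      exit!(0) # leave the child without running the parent's at_exit hooks
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  jobs.map do |pid, reader|
    data = reader.read # returns once the child has closed its end
    reader.close
    Process.wait(pid)
    Marshal.load(data)
  end
end

# Hypothetical helper: sum the integers in a file, one number per line.
def sum_file(path)
  total = 0
  File.open(path) { |io| io.each { |line| total += line.to_i } }
  total
end

squares = fork_collect([1, 2, 3]) { |n| n * n }
p squares # prints [1, 4, 9]
```

Applied to the files in this thread, the call would look like fork_collect(files) { |f| sum_file(f) }, followed by summing the returned subtotals. Each child is a separate process, so the operating system can schedule them on different cores.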


#10

Carlos, that sounds about correct. I did some similar tests early this
year [1]. Basically, your problem is that Ruby runs on one kernel
thread/LWP irrespective of how many userland threads you create. It’s
expensive to switch between threads (the cost varies depending on which
hardware platform you’re running on), so these two factors combine to
make it slower for you when you use threads.

JRuby was almost as bad until JRuby 1.1.1, after which it started
doing better with threads (this was due to a bug fix by Charles [2]).
It’s now much better at scaling with threads than MRI, but
still quite poor in absolute terms [3]: its scalability on an
embarrassingly parallel program eroded 54% jumping from 1 to 2 threads
and became worse after that. (Caveat: my numbers are old, they’re
from March, and things may have gotten much better since!)

[1] http://blogs.sun.com/prashant/resource/files/jruby-ruby-comparison.xls
[2] Charles’ entry: http://blog.headius.com/2008/04/shared-data-considered-harmful.html
[3] http://blogs.sun.com/prashant/resource/files/jruby-threads.xls

-ps