Multithreaded file access

Hi…

I’ve a class which is run by many threads at the same time… this
class has to append a line to a text file eventually.

I do this with:

File.new('filename','a').puts("this is the string")

Is this already thread safe??? how can I make it so???

Thanks…

Well, I think it’s OK to do that.
Seeing is believing:

first one

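a = Thread.new do
  5.times do
    # File.new(...).puts returns nil, so the handle was never closed;
    # the block form of File.open closes it for us.
    File.open("qq.txt", "a") { |f| f.puts "I am a…" }
  end
end
b = Thread.new do
  5.times do
    File.open("qq.txt", "a") { |f| f.puts "I am b…" }
  end
end
a.join
b.join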

and this one

f = File.new("test.txt", "a")
a = Thread.new do
  5.times do
    f.puts "I am a…"
    sleep 1
  end
end
b = Thread.new do
  5.times do
    f.puts "I am b…"
    sleep 1
  end
end
a.join
b.join
f.close

Both programs run OK. (But I am not sure myself :)

On Dec 23, 2005, at 12:12 PM, Jellen wrote:

Both program are ok. (But I am not sure myself:)

Interesting examples, Jellen, but I don’t think they answer Matias’
question, which was

File.new('filename','a').puts("this is the string")

Is this already thread safe??? how can I make it so???

Correct me if I’m wrong, but your examples only prove that the thread
on the CPU will be able to append to the file. I think Matias wants
to know if the statement ( File.new('filename','a').puts("this is the
string") ) is atomic. Or in other words, do you need to enforce
mutually exclusive access to the file with a mutex? Unfortunately, I
don’t have an answer to that question.

On 12/23/05, J. Ryan S. removed_email_address@domain.invalid wrote:

Correct me if I’m wrong, but your examples only prove that the thread
on the CPU will be able to append to the file. I think Matias wants
to know if the statement ( File.new('filename','a').puts("this is the
string") ) is atomic. Or in other words, do you need to enforce
mutually exclusive access to the file with a mutex? Unfortunately, I
don’t have an answer to that question.

[kig@jugend:~] cat fw_test.rb
def writer(i, fn, ok)
  Thread.new{
    t_str = "#{i}" * 65536   # one very long line per thread
    while ok.first           # keep appending until the main thread clears the flag
      File.open(fn, 'a'){|f|
        f.puts t_str
      }
    end
  }
end

ok = [true]
fn = 'fw_test.dat'
ts = (1..3).map{|i| writer i, fn, ok}

sleep 10

ok[0] = false

# Only ts.size distinct lines exist if no line was ever split or interleaved.
if File.readlines(fn).uniq.size == ts.size
  puts "puts in different threads seems to be atomic"
else
  puts "puts in different threads isn't atomic"
end

File.unlink fn

[kig@jugend:~] ruby fw_test.rb
puts in different threads seems to be atomic

On Dec 23, 2005, at 4:13 PM, Ilmari H. wrote:

[kig@jugend:~] ruby fw_test.rb
puts in different threads seems to be atomic

Crafty test program. Coincidentally, my results differ from yours.

$ ruby fw_test.rb
puts in different threads isn’t atomic
$ ruby -v
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.3.0]

I’m getting 8 unique lines in File.readlines(fn).uniq, as opposed to
the 3 thread objects in ts. Assuming I understand the program, that
means threads are not waiting their turn like they should. FYI, I’ve
compiled my own ruby executable from DarwinPorts instead of using the
‘broken’ one shipped in OS X Tiger.

However, if I change the multiplication factor in line 3 to 655, then

$ ruby fw_test.rb
puts in different threads seems to be atomic

~ ryan ~

PS - I love the 205+ MB of text this app generates! :smiley:

On Dec 23, 2005, at 5:23 PM, Ilmari H. wrote:

Regardless, the important thing is to know that it seems file IO via
puts is not thread safe, which I kind of assumed from the beginning.
This begs the question: which methods, especially concerning file IO,
are thread safe? (if any)

~ ryan ~

J. Ryan S. wrote:

Crafty test program. Coincidentally, my results differ from yours.

Regardless, the important thing is to know that it seems file IO via
puts is not thread safe, which I kind of assumed from the beginning.
This begs the question: which methods, especially concerning file IO,
are thread safe? (if any)

~ ryan ~

Hey!!.. I’m now more confused than before posting :smiley:

On 12/23/05, J. Ryan S. removed_email_address@domain.invalid wrote:

On Dec 23, 2005, at 4:13 PM, Ilmari H. wrote:

[kig@jugend:~] ruby fw_test.rb
puts in different threads seems to be atomic

Crafty test program. Coincidentally, my results differ from yours.

$ ruby fw_test.rb
puts in different threads isn’t atomic
$ ruby -v
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.3.0]

Good! I thought it was odd that they seemed to be atomic…

And now we know to mutex disk writes (or use flock.)

I’m getting 8 unique lines in File.readlines(fn).uniq, as opposed to
the 3 thread objects in ts. Assuming I understand the program, that
means threads are not waiting their turn like they should. FYI, I’ve
compiled my own ruby executable from DarwinPorts instead of using the
‘broken’ one shipped in OS X Tiger.

However, if I change the multiplication factor in line 3 to 655, then

$ ruby fw_test.rb
puts in different threads seems to be atomic

I don’t know the reason (can guess, but too unsure)
Explanation anyone?

~ ryan ~

PS - I love the 205+ MB of text this app generates! :smiley:

Ilmari
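
For the flock alternative mentioned above, a minimal sketch might look like the following. The method name append_locked and the file name flock_test.txt are made up for illustration. Note that flock is an advisory lock aimed mainly at cooperating processes; whether it also excludes threads inside one process depends on the platform, so treat this as an assumption to verify rather than a guarantee.

```ruby
# Sketch only: append_locked and flock_test.txt are illustrative names,
# not something from the thread.
def append_locked(path, line)
  File.open(path, 'a') do |f|
    f.flock(File::LOCK_EX)      # wait for an exclusive advisory lock
    begin
      f.puts(line)
      f.flush                   # hand Ruby's buffered data to the OS before unlocking
    ensure
      f.flock(File::LOCK_UN)    # always release the lock, even on error
    end
  end
end

File.delete('flock_test.txt') if File.exist?('flock_test.txt')

threads = (1..3).map do |i|
  Thread.new { 5.times { append_locked('flock_test.txt', "I am thread #{i}") } }
end
threads.each(&:join)
```

Every writer has to go through the same locking routine; a writer that skips flock is not held back by it.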

Eero S. wrote:

You should guard your critical section :slight_smile: The simplest way
is using Mutex from the core. Consult your ri or make a visit
to http://www.ruby-doc.org.

E

Thanks everyone for replying…

Matias S. wrote:

J. Ryan S. wrote:

Crafty test program. Coincidentally, my results differ from yours.

Regardless, the important thing is to know that it seems file IO via
puts is not thread safe, which I kind of assumed from the beginning.
This begs the question: which methods, especially concerning file IO,
are thread safe? (if any)

~ ryan ~

Hey!!.. I’m now more confused than before posting :smiley:

You should guard your critical section :slight_smile: The simplest way
is using Mutex from the core. Consult your ri or make a visit
to http://www.ruby-doc.org.

E

On Dec 23, 2005, at 6:42 PM, Matias S. wrote:

Hey!!.. I’m now more confused than before posting :smiley:

http://www.rubycentral.com/book/tut_threads.html

Pay special attention to the section on mutual exclusion.

~ ryan ~

On Dec 23, 2005, at 11:38 AM, Matias S. wrote:

I’ve a class which is run by many threads at the same time… this
class has to append a line to a text file eventually.

I do this with:

File.new('filename','a').puts("this is the string")

On POSIX file systems, writes to a file in append mode are
atomic, but that is only true when you make a direct system
call. In Ruby that would be a call to IO#syswrite, which
bypasses all the standard buffering of puts and company.
You can’t mix and match these types of IO calls. It is one
or the other.

Caveats: I think there is a limit on the size of a write
that is guaranteed to be atomic. The Unix system calls
pathconf and fpathconf give you access to the PIPE_BUF limit,
which specifies this limit for pipes. I’m not sure if there
is a similar limit for files. I just couldn’t locate anything
specific in my quick research.

Related Question: Does Ruby have its own buffering methodology
or does it use the C stdio library buffering for file I/O?
I just don’t know enough about the Ruby internals to answer
this question.

Gary W.
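
A rough sketch of the IO#syswrite approach described above, for anyone who wants to try it. The helper name append_raw and the file name syswrite_test.txt are invented for this example, and the atomicity itself is an assumption that depends on the operating system, the file system, and the size of each write, as the caveats above say.

```ruby
# Sketch only: append_raw and syswrite_test.txt are illustrative names.
def append_raw(path, line)
  File.open(path, 'a') do |f|          # 'a' opens the file in append mode
    data = line + "\n"
    written = f.syswrite(data)         # unbuffered write, returns the byte count
    warn "short write: #{written} of #{data.bytesize} bytes" if written < data.bytesize
  end
end

File.delete('syswrite_test.txt') if File.exist?('syswrite_test.txt')

threads = (1..3).map do |i|
  Thread.new { 5.times { append_raw('syswrite_test.txt', "I am thread #{i}") } }
end
threads.each(&:join)
```

The whole line, including the newline, goes out in a single syswrite call; splitting it across two calls would give another thread a chance to write in between.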

J. Ryan S. wrote on 12/23/2005 4:59 PM:

Crafty test program. Coincidentally, my results differ from yours.

i was in a similar situation.

i was under the impression from page 118 of my well thumbed copy of
‘programming ruby’ that any shared resource (eg. a file that will be
accessed by more than one thread) should be under the control of a
mutex…

the program i was working on is long running (and has been in use for
almost a year) and seems to be holding up ok…

Matias S. removed_email_address@domain.invalid writes:

Hi…

I’ve a class which is run by many threads at the same time… this
class has to append a line to a text file eventually.

I do this with:

File.new('filename','a').puts("this is the string")

Don’t do this. It’s not an atomic operation, either with buffered or
unbuffered IO.

Is this already thread safe???

No.

how can I make it so???

If thread A outputs two lines, and thread B outputs one line, would
this be acceptable?

A-1
B-1
A-2

Or should it be:

A-1
A-2
B-1

How granular do you want it? Per line? Per thread output?

Please investigate the mutex library.

file_mutex.synchronize {
  file.puts("…")
  file.puts("…")
}

YS.
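
To make the fragment above concrete, here is a small self-contained sketch. The file name mutex_test.txt and the thread names A and B are placeholders, and it assumes all writers live in one process and share a single Mutex and a single open file handle.

```ruby
require 'thread'   # only needed on old Rubies such as 1.8; harmless on current ones

path = 'mutex_test.txt'            # hypothetical file name
File.delete(path) if File.exist?(path)

file_mutex = Mutex.new             # one mutex shared by every writer
File.open(path, 'a') do |file|
  threads = %w[A B].map do |name|
    Thread.new do
      # Everything inside synchronize is written as one unit, so the
      # pair A-1/A-2 can never be split by a line from B.
      file_mutex.synchronize do
        file.puts("#{name}-1")
        file.puts("#{name}-2")
      end
    end
  end
  threads.each(&:join)
end
```

If per-line granularity is enough, wrap each puts in its own synchronize block instead of the pair.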

removed_email_address@domain.invalid wrote:

On Dec 23, 2005, at 11:38 AM, Matias S. wrote:

I’ve a class which is run by many threads at the same time… this
class has to append a line to a text file eventually.

I do this with:

File.new('filename','a').puts("this is the string")

Apart from the MT issues, this code has another serious problem: you do
not close the file (and you cannot, because the IO is not returned from
puts). You rather want

File.open('filename','a') {|io| io.puts("this is the string")}

Otherwise you risk that the text is not written to the file in the
proper order, or you may even get problems opening the file multiple
times.

There are two possible solutions: synchronize every access to the file,
or use a Queue and a separate writing thread (might be better
performance-wise, because you don’t have to reopen the file all the
time).

Kind regards

robert
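
The Queue variant suggested above could be sketched roughly like this. The file name queue_test.txt and the :done sentinel are choices made for this example only; the point is that a single thread owns the file, so no locking around puts is needed.

```ruby
require 'thread'   # Queue lived here on old Rubies; harmless on current ones

queue = Queue.new

# The only thread that ever touches the file. It opens it once and
# writes lines in the order they arrive on the queue.
writer = Thread.new do
  File.open('queue_test.txt', 'a') do |f|
    while (line = queue.pop) != :done   # :done is our made-up stop signal
      f.puts(line)
    end
  end
end

File.delete('queue_test.txt') if File.exist?('queue_test.txt') && false # keep existing file by default

producers = (1..3).map do |i|
  Thread.new { 5.times { queue << "I am thread #{i}" } }
end

producers.each(&:join)   # wait until every line has been queued
queue << :done           # then tell the writer to finish
writer.join
```

Queue#pop blocks while the queue is empty, so the writer thread sleeps until there is something to write.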