Forum: Ruby-core [ruby-trunk - Bug #9153][Open] IO#flush causes unnecessary fsync on Windows

Alexey Borzenkov (snaury)
on 2013-11-25 15:17
(Received via mailing list)
Issue #9153 has been reported by snaury (Alexey Borzenkov).

----------------------------------------
Bug #9153: IO#flush causes unnecessary fsync on Windows
https://bugs.ruby-lang.org/issues/9153

Author: snaury (Alexey Borzenkov)
Status: Open
Priority: Normal
Assignee:
Category: core
Target version:
ruby -v: ruby 2.0.0p353 (2013-11-22) [i386-mingw32]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


On Windows, calling IO#flush is effectively identical to calling
IO#fsync, i.e. the contents of the file are committed to disk platters
instead of just being flushed. I traced it back to bug #776, where the
original "bug" was worked around by forcing an fsync on every flush.
Unfortunately, this change makes IO#flush unusable, because fsync is
very expensive: on one of my machines a single fsync took up to 150 ms,
and I have heard stories of machines where it takes on the order of
2000 ms.

I originally discovered this problem when a script of mine printed a
couple hundred lines using Kernel#p and, to my astonishment, started
taking several seconds to complete once I redirected its output to a
file.
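
A rough way to see the cost difference on a given machine is to time a
couple hundred small writes, once with IO#flush and once with IO#fsync
after each line. This is only a sketch: the helper name time_writes and
the count of 200 are made up for illustration, and the numbers depend
entirely on the disk and the Ruby build. On a Ruby that still forces
fsync inside flush, the two timings would be expected to come out close
to each other.

```ruby
require "benchmark"
require "tempfile"

# Hypothetical helper: time `count` small writes to a temporary file,
# calling the given IO method (:flush or :fsync) after every line.
def time_writes(sync_method, count = 200)
  Tempfile.create("flush-vs-fsync") do |file|
    Benchmark.realtime do
      count.times do |i|
        file.puts("line #{i}")
        file.public_send(sync_method)
      end
    end
  end
end

flush_seconds = time_writes(:flush)
fsync_seconds = time_writes(:fsync)

printf("flush after each line: %.4f s\n", flush_seconds)
printf("fsync after each line: %.4f s\n", fsync_seconds)
```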

The problem with the original fix (adding fsync during flush) is that
there was no issue to begin with. The file size not being updated is not
even due to Windows per se; it is due to how the NTFS driver is
optimized to not update the file size (in the directory entry) until the
file is closed. Please read this blog post for details about what's
going on:
http://blogs.msdn.com/b/oldnewthing/archive/2011/1...

What I mean is that IO#flush without fsync properly flushes all the data
to the file, and you can read all of it from another process. The only
thing that is not updated (until the file is closed) is the directory
entry metadata, which is by design: it's how it's supposed to work on
Windows with the NTFS filesystem. That the workaround (i.e. fsync) works
is more of an accident: when the OS is forced to write all that data to
disk, it currently tries to create a consistent picture and updates the
directory metadata as well, but nothing says it will keep doing that in
the future. Worst of all, the original bug was about temporary files,
and fsync during IO#flush forces them to be written to disk even if
they are short-lived.
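
The claim that flushed data is readable without fsync can be checked
with a small sketch like the one below. It writes to a temporary file,
calls only IO#flush, and then reads the file back through a second,
independent handle while the writer is still open. The second handle
stands in for "another process"; the helper name read_back_after_flush
is an invention for this example.

```ruby
require "tempfile"

# Hypothetical helper: write `data`, flush (no fsync), and read the file
# back through a separate handle while the writer is still open.
def read_back_after_flush(data)
  Tempfile.create("flush-visibility") do |writer|
    writer.write(data)
    writer.flush                         # Ruby's buffer -> operating system
    File.open(writer.path, "rb", &:read) # independent handle on the same file
  end
end

seen = read_back_after_flush("hello from the writer\n")
puts seen
```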

Please remove fsync from IO#flush on Windows. You shouldn't work around
correct Windows behavior and make it unbearably slow. Instead, people
need to learn how filesystems work on Windows and learn to close files
when they have finished writing to them and really need the directory
metadata to be updated. (Most of the time, however, people shouldn't
care about directory metadata like file size; it's just a cached value
and is not necessarily accurate at all times.)
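
In code, "close the file when you are finished" usually just means using
the block form of File.open, which closes the handle when the block
returns. A minimal sketch, with a made-up helper name and file name:

```ruby
require "tmpdir"

# Hypothetical helper: write a file, let the block form of File.open
# close it, and only then ask the filesystem for its size.
def write_and_measure(dir, payload)
  path = File.join(dir, "report.txt")
  File.open(path, "wb") { |f| f.write(payload) } # closed when the block returns
  File.size(path)                                # looked up by name, after close
end

size = Dir.mktmpdir { |dir| write_and_measure(dir, "x" * 1024) }
puts size
```

Because the size is queried only after the handle is closed, the result
does not depend on when the directory entry happens to be refreshed.
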
hsbt (Hiroshi SHIBATA) (Guest)
on 2013-11-29 10:51
(Received via mailing list)
Issue #9153 has been updated by hsbt (Hiroshi SHIBATA).

Assignee set to usa (Usaku NAKAMURA)


----------------------------------------
Bug #9153: IO#flush causes unnecessary fsync on Windows
https://bugs.ruby-lang.org/issues/9153#change-43247

usa (Usaku NAKAMURA) (Guest)
on 2013-12-01 17:04
(Received via mailing list)
Issue #9153 has been updated by usa (Usaku NAKAMURA).

Status changed from Open to Feedback

Thank you for your long description.
I would like to also know how we educate all the people to take care of
Windows.
----------------------------------------
Bug #9153: IO#flush causes unnecessary fsync on Windows
https://bugs.ruby-lang.org/issues/9153#change-43314

Alexey Borzenkov (snaury)
on 2013-12-03 19:10
(Received via mailing list)
Issue #9153 has been updated by snaury (Alexey Borzenkov).


usa (Usaku NAKAMURA) wrote:
> Thank you for your long description.
> I would like to also know how we educate all the people to take care of Windows.

I'm not sure what you mean. One idea would be to change
test_size_flushes_buffer_before_determining_file_size in
test/test_tempfile.rb to skip the last assert on /mswin|mingw/, with a
comment explaining why. That way, the next time this comes up there
would be a record of why flushing is not supposed to change the file
size in a directory entry. Also, lib/minitest/unit.rb should call
a.close and b.close instead of a.flush and b.flush (you don't need to
keep files open when you only need a filename).
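
To illustrate the kind of platform guard meant here, the sketch below
checks the size through the open handle everywhere, but checks the size
by path only when not on /mswin|mingw/. It is not the body of the real
test in test/test_tempfile.rb, which I have not reproduced; the helper
names windows? and size_checks are hypothetical.

```ruby
require "tempfile"

# True on the platforms named in the suggestion above.
def windows?
  !!(RUBY_PLATFORM =~ /mswin|mingw/)
end

# Hypothetical check: after a flush, the size seen through the open handle
# must match everywhere; the size looked up by path is only checked off
# Windows, where NTFS may defer the directory entry update until close.
def size_checks(data)
  Tempfile.create("size-check") do |file|
    file.write(data)
    file.flush
    checks = { handle_size: file.size == data.bytesize }
    checks[:path_size] = File.size(file.path) == data.bytesize unless windows?
    checks
  end
end

p size_checks("hello")
```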

Also, maybe update tempfile.rb to mention that the file size may not be
in sync as expected on Windows, and that people should close their
tempfiles before giving filenames to other processes, etc. (The old
unexpected behavior seems to have come up mostly with tempfiles, and not
closing tempfiles in a block is strangely common.)
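
A sketch of the "close before handing the filename to another process"
pattern follows. Tempfile#close closes the handle without deleting the
file, so the path stays valid for the child; Tempfile#close! then
removes it. The helper name child_reads_tempfile and the one-line child
script are made up for this example.

```ruby
require "tempfile"
require "rbconfig"

# Hypothetical helper: write data to a tempfile, close it (without
# unlinking), and only then let a child Ruby process read it by name.
# Returns whatever the child printed: the byte count it saw.
def child_reads_tempfile(data)
  tmp = Tempfile.new("handoff")
  begin
    tmp.write(data)
    tmp.close # flushes and closes; the file itself stays on disk
    child = [RbConfig.ruby, "-e", "print File.binread(ARGV[0]).bytesize", tmp.path]
    IO.popen(child, &:read)
  ensure
    tmp.close! # close (if still open) and delete the file
  end
end

puts child_reads_tempfile("a" * 300)
```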
----------------------------------------
Bug #9153: IO#flush causes unnecessary fsync on Windows
https://bugs.ruby-lang.org/issues/9153#change-43396

unknown (Guest)
on 2014-03-02 18:57
(Received via mailing list)
Issue #9153 has been updated by Usaku NAKAMURA.

Status changed from Feedback to Closed
% Done changed from 0 to 100

Applied in changeset r45254.

----------
* io.c (rb_io_flush_raw, rb_io_fsync): [EXPERIMENTAL] remove force
  syncing for Win32 to speed up IO.  this may break some tests, and
  they'll be fixed later.
  [ruby-core:58570] [Bug #9153]

----------------------------------------
Bug #9153: IO#flush causes unnecessary fsync on Windows
https://bugs.ruby-lang.org/issues/9153#change-45578


---Files--------------------------------
no-fsync-on-flush.patch (799 Bytes)
unknown (Guest)
on 2014-03-02 20:45
(Received via mailing list)
Issue #9153 has been updated by Usaku NAKAMURA.

ruby -v changed from ruby 2.0.0p353 (2013-11-22) [i386-mingw32] to -

 Hi,

 In message "Re: [ruby-trunk - Bug #9153] [Closed] IO#flush causes unnecessary fsync on Windows"
     on Mar.03,2014 04:30:36, <usa@garbagecollect.jp> wrote:
 > > Ah, sorry, I can see now that it was already reverted. However it was
 > > reverted together with #ifndef _WIN32. That #ifndef is not needed, i.e.
 > > rb_thread_io_blocking_region(nogvl_fsync, fptr, fptr->fd) should be called
 > > unconditionally.

 Oops, I mistook!
 You are completely right.


 Regards,
 --
 U.Nakamura <usa@garbagecollect.jp>

----------------------------------------
Bug #9153: IO#flush causes unnecessary fsync on Windows
https://bugs.ruby-lang.org/issues/9153#change-45579
