Scattered I/O (on MS Windows)

Hi all,

Park H. and I came up with a custom implementation of
IO.readlines using scattered I/O that I thought would be fun to
share. I think I’m seeing a 2x performance increase, but page caching
is making it difficult to tell. Also, it looks like the main hotspot
in the profile is the call to ‘split’ at the end, so you can remove
that last bit of logic if you want to see the speed without it.

What do folks think? Are you seeing a performance increase? You
probably won’t see any noticeable difference unless the file is
greater than 25mb or so, btw.

Link to ReadFileScatter() function definition:

nio.rb - requires the latest windows-pr gem

require 'windows/file'
require 'windows/handle'
require 'windows/error'
require 'windows/memory'
require 'windows/nio'
require 'windows/synchronize'
require 'windows/system_info'
require 'windows/msvcrt/io'
require 'windows/msvcrt/buffer'
require 'win32/event'

module Win32
class NIO
include Windows::File
include Windows::Handle
include Windows::Error
include Windows::Synchronize
include Windows::MSVCRT::IO
include Windows::MSVCRT::Buffer
include Windows::SystemInfo
include Windows::Memory
include Windows::NIO
extend Windows::File
extend Windows::Handle
extend Windows::Error
extend Windows::Synchronize
extend Windows::MSVCRT::IO
extend Windows::MSVCRT::Buffer
extend Windows::SystemInfo
extend Windows::Memory
extend Windows::NIO

  class Error < StandardError; end
  # Reads the entire file specified by +file+ as individual lines, and
  # returns those lines in an array. Lines are separated by +sep+.
  #--
  # The semantics are the same as the MRI version but the implementation
  # is drastically different. We use a scattered IO read, which is about
  # as fast as the MRI version for small files, but much faster for very
  # large files.
  #
  def self.readlines(file, sep = $/)
     # Open the file for unbuffered, overlapped (asynchronous) reads.
     handle = CreateFile(
        file,
        GENERIC_READ,
        FILE_SHARE_READ,
        nil,
        OPEN_EXISTING,
        FILE_FLAG_OVERLAPPED | FILE_FLAG_NO_BUFFERING,
        nil
     )

     if handle == INVALID_HANDLE_VALUE
        raise Error, get_last_error
     end

     # Get your system's page size, probably 4k
     sysbuf = 0.chr * 40
     GetSystemInfo(sysbuf)
     page_size = sysbuf[4,4].unpack('L')[0]
     num_pages = (File.size(file).to_f / page_size).ceil

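     # ReadFileScatter expects page-aligned, page-sized buffers, which is
     # why the memory comes straight from VirtualAlloc instead of from a
     # Ruby string.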
     base_address = VirtualAlloc(
        nil,
        page_size * num_pages,
        MEM_COMMIT,
        PAGE_READWRITE
     )

     buf_list = []

     for i in 0...num_pages
        buf_list.push(base_address + page_size * i)
     end

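     # Build the FILE_SEGMENT_ELEMENT array: one 64-bit pointer per page
     # buffer, in file order, terminated by a NULL element.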
     seg_array = buf_list.pack('Q*') + 0.chr * 8
     olap = 0.chr * 20
     olap[16,4] = [CreateEvent(nil, 1, 0, nil)].pack('L')

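     # Start the scattered read. With FILE_FLAG_OVERLAPPED the call
     # normally returns before the data has arrived and the read completes
     # asynchronously into the page buffers listed in seg_array.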
     bool = ReadFileScatter(
        handle,
        seg_array,
        page_size * num_pages,
        nil,
        olap
     )

     # For an overlapped read ERROR_IO_PENDING just means the read is still
     # in progress, so only treat other errors as fatal.
     unless bool
        unless GetLastError() == ERROR_IO_PENDING
           raise Error, get_last_error
        end
     end

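     # Block until the event stored in the OVERLAPPED struct signals that
     # the read has finished.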
     WaitForSingleObject(olap[16,4].unpack('L')[0], INFINITE)

     # MRI's File.size cannot be trusted for files larger than 2gb.
     file_size = [0].pack('Q')
     GetFileSizeEx(handle, file_size)
     file_size = file_size.unpack('Q')[0]

     unless CloseHandle(handle)
        raise Error, get_last_error
     end

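     # Copy the file contents out of the VirtualAlloc'ed region into a
     # Ruby string, then release the region.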
     buffer = 0.chr * file_size
     memcpy(buffer, buf_list[0], file_size)
     VirtualFree(base_address, 0, MEM_RELEASE)

     # TODO: Fix line ending issue (?)
     unless sep.nil?
        if sep.empty?
           buffer = buffer.split("\r\n\r\n")
        else
           buffer = buffer.split(sep)
        end
     end

     buffer
  end

end
end
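
For anyone who wants to try the comparison, here's a rough sketch of the
sort of benchmark I have in mind (the file name is just a placeholder, and
the numbers will be heavily skewed by the OS page cache unless the file
hasn't been read recently):

require 'benchmark'
require 'nio'   # the file listed above, somewhere on your load path

file = 'test.txt'   # any text file larger than ~25mb

Benchmark.bm(25) do |bm|
   bm.report('IO.readlines')         { IO.readlines(file) }
   bm.report('Win32::NIO.readlines') { Win32::NIO.readlines(file) }
end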

2007/10/23, Daniel B. [email protected]:

probably won’t see any noticeable difference unless the file is
greater than 25mb or so, btw.

Can’t test it at the moment. But I wonder how a scattered read can
help with #readlines which reads a file sequentially. AFAIK scattered
reading is for reading different portions of the same file.

Kind regards

robert

On Oct 24, 2:34 am, “Robert K.” [email protected]
wrote:

What do folks think? Are you seeing a performance increase? You
probably won’t see any noticeable difference unless the file is
greater than 25mb or so, btw.

Can’t test it at the moment. But I wonder how a scattered read can
help with #readlines which reads a file sequentially. AFAIK scattered
reading is for reading different portions of the same file.

No, it’s just a different technique for reading files. You pre-
allocate the line buffers up front and then it reads the lines
into those buffers asynchronously. But the resulting array still ends
up “in order”. At least, that’s how it seems to work on MS Windows. :)
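
To make that a little more concrete, here's a rough, plain-Ruby
illustration of the idea (no Win32 calls, and the file name is just a
placeholder). The "scatter" only determines which in-memory buffer each
page lands in; the pages themselves are still consecutive pieces of the
file, so joining the buffers in segment order gives the file back:

page_size = 4096
file_data = File.open('test.txt', 'rb') { |fh| fh.read }

# One pre-allocated, page-sized buffer per page of the file, in file order.
num_pages = (file_data.size.to_f / page_size).ceil
buffers   = Array.new(num_pages) { 0.chr * page_size }

# A scattered read fills buffers[0], buffers[1], ... with consecutive pages
# of the file (the I/O may complete asynchronously, but each page has a
# fixed destination). Simulate that here:
num_pages.times do |i|
   page = file_data[i * page_size, page_size]
   buffers[i][0, page.size] = page
end

# Concatenating the buffers in order (and trimming the padding in the last
# page) reproduces the file contents.
buffers.join[0, file_data.size] == file_data   # => true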

Regards,

Dan

2007/10/24, Daniel B. [email protected]:

No, it’s just a different technique for reading files. You pre-
allocate the line buffers up front and then it reads the lines
into those buffers asynchronously. But the resulting array still ends
up “in order”. At least, that’s how it seems to work on MS Windows. :)

Ah, I see! So the “scatter” does not refer to the user request but to
the fact that blocks of a file are likely scattered on the disk and
the OS attempts to do an optimized read. Thanks for clarifying!

Kind regards

robert