Forum: Ruby-core [ruby-trunk - Bug #8767][Open] IO.copy_stream should write in binary mode.

8c73dab2ecd23d5b5627a0235ae2f9b1?d=identicon&s=25 godfat (Lin Jen-Shin) (Guest)
on 2013-08-10 20:03
(Received via mailing list)
Issue #8767 has been reported by godfat (Lin Jen-Shin).

----------------------------------------
Bug #8767: IO.copy_stream should write in binary mode.
https://bugs.ruby-lang.org/issues/8767

Author: godfat (Lin Jen-Shin)
Status: Open
Priority: Normal
Assignee:
Category: core
Target version: current: 2.1.0
ruby -v: ruby 2.1.0dev (2013-08-11 trunk 42495) [x86_64-darwin12.2.1]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


This patch makes `IO.copy_stream' always copy in binary mode,
fixing the following scenario:

    require 'tempfile'
    require 'stringio'
    Encoding.default_internal = 'UTF-8'
    out = Tempfile.new('out')
    out.binmode
    # before this patch it raises:
    # in `write': "\xDE" from ASCII-8BIT to UTF-8
(Encoding::UndefinedConversionError)
    IO.copy_stream(StringIO.new("\xDE\xAD\xBE\xEF"), out)

The other way to fix this would be trying to preserve the file mode
from Tempfile instead of always writing in binary mode. However,
this won't be the case for other objects responding to `to_path'.

I think as we're treating destination as streams, we would always
want writing in binary. So I guess this is ok.

Thank you for reviewing.
This is discovered by using rubyzip, which is using Tempfile for
buffering input/output stream.

Commit on Github:
https://github.com/godfat/ruby/commit/94d9f3dd733f...
18813f71506ebad74179bf8c5a136696?d=identicon&s=25 Eric Wong (Guest)
on 2013-08-10 21:54
(Received via mailing list)
"godfat (Lin Jen-Shin)" <godfat@godfat.org> wrote:
> This patch makes `IO.copy_stream' always copy in binary mode,

I think your patch makes sense.

> fixing the following scenario:
>
>     require 'tempfile'
>     require 'stringio'
>     Encoding.default_internal = 'UTF-8'
>     out = Tempfile.new('out')
>     out.binmode
>     # before this patch it raises:
>     # in `write': "\xDE" from ASCII-8BIT to UTF-8
(Encoding::UndefinedConversionError)
>     IO.copy_stream(StringIO.new("\xDE\xAD\xBE\xEF"), out)

Can make the above a testcase with your patch?

> The other way to fix this would be trying to preserve the file mode
> from Tempfile instead of always writing in binary mode. However,
> this won't be the case for other objects responding to `to_path'.

Please no :)
8c73dab2ecd23d5b5627a0235ae2f9b1?d=identicon&s=25 godfat (Lin Jen-Shin) (Guest)
on 2013-08-11 20:08
(Received via mailing list)
Issue #8767 has been updated by godfat (Lin Jen-Shin).

File 0001-io.c-IO.copy_stream-should-write-in-binary-mode.patch added

normalperson (Eric Wong) wrote:
>  >     out = Tempfile.new('out')
>  >     out.binmode
>  >     # before this patch it raises:
>  >     # in `write': "\xDE" from ASCII-8BIT to UTF-8
(Encoding::UndefinedConversionError)
>  >     IO.copy_stream(StringIO.new("\xDE\xAD\xBE\xEF"), out)
>
>  Can make the above a testcase with your patch?

Yes, thank you for your support. I've added a test for this
and updated the patch in the attachment.

The updated commit on Github is located at:
https://github.com/godfat/ruby/commit/1ad5547335fd...

P.S. To my surprise, it's extremely easy to run the test suites.
----------------------------------------
Bug #8767: IO.copy_stream should write in binary mode.
https://bugs.ruby-lang.org/issues/8767#change-41095

Author: godfat (Lin Jen-Shin)
Status: Open
Priority: Normal
Assignee:
Category: core
Target version: current: 2.1.0
ruby -v: ruby 2.1.0dev (2013-08-11 trunk 42495) [x86_64-darwin12.2.1]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


This patch makes `IO.copy_stream' always copy in binary mode,
fixing the following scenario:

    require 'tempfile'
    require 'stringio'
    Encoding.default_internal = 'UTF-8'
    out = Tempfile.new('out')
    out.binmode
    # before this patch it raises:
    # in `write': "\xDE" from ASCII-8BIT to UTF-8
(Encoding::UndefinedConversionError)
    IO.copy_stream(StringIO.new("\xDE\xAD\xBE\xEF"), out)

The other way to fix this would be trying to preserve the file mode
from Tempfile instead of always writing in binary mode. However,
this won't be the case for other objects responding to `to_path'.

I think as we're treating destination as streams, we would always
want writing in binary. So I guess this is ok.

Thank you for reviewing.
This is discovered by using rubyzip, which is using Tempfile for
buffering input/output stream.

Commit on Github:
https://github.com/godfat/ruby/commit/94d9f3dd733f...
This topic is locked and can not be replied to.