Forum: Ruby Problems running code to read binary in Windows7

Sever S. (severuf)
on 2013-09-13 22:16
Hello to everyone in the forum,

This is my first post here. I'm a newbie in Ruby, so maybe somebody
could help me with this.

I have a script that reads and parses a binary file. The script works
fine if I run it in Ubuntu, but I'm trying to run the code in IRB on
Windows 7 with Ruby version "2.0.0p247 (2013-06-27) [i386-mingw32]" and
I receive the following error.

##########################################################################
C:\Scripts>ruby script.rb binaryfile
script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS
(ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'
##########################################################################

The script looks like this (line 18 contains "while gets"):

##########################################################################
#!/usr/bin/env ruby

File.open(ARGV[0])

while gets
    line = $_.unpack('H*', "rb")[0]

  Some code
end
##########################################################################

Thanks in advance for any help.

Best Regards
Abinoam J. (abinoampraxedes_m)
on 2013-09-13 23:26
(Received via mailing list)
First guess ...

- File.open(ARGV[0])
+ File.open(ARGV[0], 'rb') # opens file in 'binary' mode.

No Windows here to try it.

Abinoam Jr.
Sever S. (severuf)
on 2013-09-14 00:47
Hello Abinoam Jr.,

I've tried adding File.open(ARGV[0], 'rb') too, but I get the same error :(

Maybe somebody could help me.

Thanks in advance.

Best regards
Chris Hulan (Guest)
on 2013-09-14 03:08
(Received via mailing list)
File.open returns a file object, which you are not saving a reference to.
As far as I know, gets will default to standard input. Does the app wait
until you hit a key?

IO has a binread method that looks applicable
http://www.ruby-doc.org/core-1.9.3/IO.html#method-c-binread
Abinoam J. (abinoampraxedes_m)
on 2013-09-14 03:29
(Received via mailing list)
Chris is right (I didn't take care of the rest of the code).

Well... why do you want to handle a "binary" file in an "each_line"
fashion?

But if this is really what you need... (I took your code as the
starting point.)

File.open(ARGV[0], "rb") do |f|
  while f.gets
    line = $_.unpack('H*')[0]
    # Some code here
  end
end


To read the file as a whole... (as Chris suggested).

file = IO.binread(ARGV[0])
hexfile = file.unpack('H*')[0]

Abinoam Jr.
Sever S. (severuf)
on 2013-09-14 05:00
Hello Chris and Abinoam,

Thank you for your answers. I'll try your suggestions.

I'm using "while gets" because the binary file is divided into blocks,
so the code inside "while gets" runs each time the beginning of a block
appears.

The issue is that the binary file is 2 GB in size, and I don't know why
the code works in Linux under Ruby 2.0 and doesn't work in Windows with
Ruby 2.0.

Thanks again for the help so far.
Sever S. (severuf)
on 2013-09-14 07:42
Hello again Chris and Abinoam,

I've tested Chris's option but I still receive the error. Maybe it is
something about the interpretation or encoding that Ruby is not
understanding.

script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS
(ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'

I've tested Abinoam's option too and I receive this error:
script.rb:50: syntax error, unexpected end-of-input, expecting
keyword_end

Line 50 contains "end". It is the end of the "while gets" loop.

Thanks in advance for your help.
tamouse m. (tamouse_m)
on 2013-09-14 10:21
(Received via mailing list)
On Sep 14, 2013, at 12:42 AM, Sever Siller <lists@ruby-forum.com> wrote:

>
> I've tested Abinoam option too and I receive this error:
> script.rb:50: syntax error, unexpected end-of-input, expecting
> keyword_end
>
> Line 50 contains "end". Is the end of the "while gets" loop.
>
> Thanks in advance for your help.


Difficult to know what is happening here.

Wondering if you should specify:

   File.open(file, 'rb', :encoding => Encoding::UTF_8 ) do |f|
     while (f.read(1024,buffer))
        # process the buffer
     end
   end
Robert K. (robert_k78)
on 2013-09-14 12:30
(Received via mailing list)
On Sat, Sep 14, 2013 at 10:20 AM, Tamara Temple
<tamouse.lists@gmail.com> wrote:
>> (ArgumentError)
>
>
> Difficult to know what is happening here.
>
> Wondering if you should specify:
>
>    File.open(file, 'rb', :encoding => Encoding::UTF_8 ) do |f|
>      while (f.read(1024,buffer))
>         # process the buffer
>      end
>    end

You need to declare buffer:

irb(main):028:0> File.open('x','rb'){|io| while io.read(1024, buffer);
puts buffer.bytesize; end}
NameError: undefined local variable or method `buffer' for main:Object
        from (irb):28:in `block in irb_binding'
        from (irb):28:in `open'
        from (irb):28
        from /usr/bin/irb:12:in `<main>'

Also, there seems to be no point in declaring an encoding when reading
binary.

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
    # process the buffer
  end
end

Kind regards

robert
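
As a rough sketch of what "process the buffer" could look like, the loop
below reuses one buffer and turns each chunk into a hex string with
unpack('H*'), the same call used in the original script. The helper name
each_hex_chunk, the chunk size and the sample bytes are made up for
illustration. Fixed-size chunks will not line up with the block
delimiters in the real file, so this only shows the reading pattern.

```ruby
require 'tempfile'

# Hypothetical helper: yields every chunk of the file as a hex string.
def each_hex_chunk(path, chunk_size = 1024)
  File.open(path, 'rb') do |f|
    buffer = String.new                 # one mutable buffer, refilled by each read
    while f.read(chunk_size, buffer)    # read returns nil at end of file
      yield buffer.unpack('H*')[0]
    end
  end
end

# Small throwaway file standing in for the real binary data.
sample = Tempfile.new('chunks')
sample.binmode
sample.write("\x00\x01\xff\x45\x10".b)
sample.close

hex_chunks = []
each_hex_chunk(sample.path, 2) { |hex| hex_chunks << hex }
p hex_chunks
```

Because the buffer is overwritten on every read, anything that has to
survive the iteration (here the hex string) must be copied out of it
before the next pass.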
Sever S. (severuf)
on 2013-09-14 18:44
Hello tamouse/Robert

Thank you for your answers.

Doing what you suggest means eliminating the "while gets" loop. But
since you say I need to process the buffer, what should I use instead
of this line:

line = $_.unpack('H*')[0]

What should go inside the loop "while f.read(1024,buffer)"?

$/ is defined at the beginning of the script as below:

############################################################
BEGIN{  $/="\xff\x45"   }

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
     line = $_.unpack('H*')[0] # What should go instead of this line?
     # process the buffer
  end
end
############################################################

Thanks again for the help.

Best regards
Robert K. (robert_k78)
on 2013-09-14 20:01
(Received via mailing list)
On Sat, Sep 14, 2013 at 6:44 PM, Sever Siller <lists@ruby-forum.com>
wrote:

> Adding what you said it means eliminate the "while gets" loop, but now
> that you say I need to process the buffer, instead of the line:
>
> line = $_.unpack('H*')[0]
>
> What should go inside the loop "while f.read(1024,buffer)"?

Maybe you could start by stating what it is that you want to achieve.

> $/ is defined at the begin of the script as below:

Note that you do not need that: you can use #gets with an argument for
the delimiter.
http://ruby-doc.org/core-1.9.3/IO.html#method-i-gets

Cheers

robert
Sever S. (severuf)
on 2013-09-14 20:23
Hello Robert,

Thank you.

I think it is fixed now by adding
BEGIN{  $/="\xff\x45".force_encoding("BINARY")   }

"http://yehudakatz.com/2010/05/05/ruby-1-9-encoding...

Many thanks for all your help
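
The error message hints at why this helps: the record separator (RS,
i.e. $/) was tagged as UTF-8 while the IO was being read as CP850, and
as far as I can tell Ruby refuses to search for a non-ASCII separator
whose encoding differs from that of the stream. Below is a minimal
sketch of how the plain literal and its binary counterpart differ,
assuming the script uses Ruby 2.0's default UTF-8 source encoding. The
variable names are illustrative only.

```ruby
# A literal with \x escapes takes the source encoding (UTF-8 by default
# since Ruby 2.0), even though these two bytes are not valid UTF-8.
utf8_separator = "\xff\x45"

# String#b (Ruby 2.0+) returns a copy tagged as ASCII-8BIT (alias BINARY).
# It has the same effect as dup.force_encoding("BINARY").
binary_separator = utf8_separator.b

puts utf8_separator.encoding          # encoding tag of the plain literal
puts utf8_separator.valid_encoding?   # whether the bytes are valid in that encoding
puts binary_separator.encoding        # encoding tag after String#b
puts binary_separator.bytes.map { |byte| format('%02x', byte) }.join
```

The bytes are identical in both strings; only the encoding tag changes,
and that tag is what gets compares against the encoding of the IO.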
Robert K. (robert_k78)
on 2013-09-14 21:04
(Received via mailing list)
On Sat, Sep 14, 2013 at 8:23 PM, Sever Siller <lists@ruby-forum.com>
wrote:

> I think it is fixed now adding
> BEGIN{  $/="\xff\x45".force_encoding("BINARY")   }

Btw. there's no point in using a BEGIN block here.

And, still I would prefer to use the value as an argument to #gets and
not change the system wide setting. That is much more robust.

>
"http://yehudakatz.com/2010/05/05/ruby-1-9-encoding...
>
> Many thanks for all the your help

You're welcome.

Cheers

robert
Sever S. (severuf)
on 2013-09-14 21:35
Hello Robert,

How would I do it the way you suggest, considering the current code looks like this:

##########################################################
BEGIN{  $/="\xff\x45".force_encoding("BINARY")}

while gets
    line = $_.unpack('H*')[0]
    next unless line =~ /Regexp/
  #Some code using "line" content
end
##########################################################

Should it be something like:

while line = gets(\xff\x45.force_encoding("BINARY"))

Thanks in advance
Robert K. (robert_k78)
on 2013-09-15 13:37
(Received via mailing list)
On Sat, Sep 14, 2013 at 9:35 PM, Sever Siller <lists@ruby-forum.com>
wrote:
>   #Some code using "line" content
> end
> ##########################################################
>
> Should be something like:
>
> while line = gets(\xff\x45.force_encoding("BINARY"))

Almost.  I'd put the result of this expression in a local variable.
That is more efficient and you can give it a telling name.

Cheers

robert
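
Robert's suggestion could be sketched roughly as follows: keep the
delimiter in a local variable with a descriptive name and pass it to
gets on a file opened in binary mode, leaving $/ untouched. The sample
file contents, the variable names and the /ff45\z/ filter (standing in
for the /Regexp/ placeholder above) are assumptions for illustration
only.

```ruby
require 'tempfile'

# Small throwaway file standing in for the real binary data:
# two delimited blocks followed by a trailing fragment.
sample = Tempfile.new('blocks')
sample.binmode
sample.write("\x01\x02\xff\x45\x03\xff\x45\x04".b)
sample.close

# Binary copy of the delimiter, so its encoding matches the 'rb' stream.
block_separator = "\xff\x45".b

hex_blocks = []
File.open(sample.path, 'rb') do |f|
  # gets(separator) reads up to and including the next delimiter.
  while (block = f.gets(block_separator))
    hex = block.unpack('H*')[0]
    next unless hex =~ /ff45\z/   # hypothetical filter: keep complete blocks only
    hex_blocks << hex
  end
end

p hex_blocks
```

Each string returned by gets includes the trailing delimiter, except
possibly the last one when the file does not end with it, which is why
the trailing fragment is filtered out here.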