Problems running code to read binary in Windows7

Hello to everyone in the forum,

This is my first post here. I’m a newbie in Ruby, so maybe somebody
could help me with this.

I have a script that reads and parses a binary file. The script works
fine when I run it on Ubuntu, but when I try to run the code in IRB on
Windows 7 with Ruby version “2.0.0p247 (2013-06-27) [i386-mingw32]” I
get the following error.

##########################################################################
C:\Scripts>ruby script.rb binaryfile
script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS (ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'
##########################################################################

The script looks like this (line 18 is the “while gets” line):

##########################################################################
#!/usr/bin/env ruby

File.open(ARGV[0])

while gets
  line = $_.unpack('H*', "rb")[0]

  # Some code
end
##########################################################################

Thanks in advance for any help.

Best Regards

First guess...

- File.open(ARGV[0])
+ File.open(ARGV[0], 'rb') # opens the file in 'binary' mode.

I have no Windows machine here to try it.

Abinoam Jr.

Hello Abinoam Jr.,

I’ve tried adding File.open(ARGV[0], 'rb') too, but I get the same error :(

Maybe somebody else can help me.

Thanks in advance.

Best regards

File.open returns a file object, which you are not saving a reference
to. As far as I know, gets will default to standard input. Does the app
wait until you hit a key?

IO has a binread method that looks applicable.

Chris is right (I didn’t pay attention to the rest of the code).

Well... why do you want to handle a “binary” file in an “each_line”
fashion?

But if this is really what you want, here is a version that takes your
code as the starting point:

File.open(ARGV[0], "rb") do |f|
  while f.gets
    line = $_.unpack('H*')[0]
    # Some code here
  end
end

To read the file as a whole (as Chris suggested):

file = IO.binread(ARGV[0])
hexfile = file.unpack('H*')[0]

Abinoam Jr.
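
For readers unfamiliar with the 'H*' directive used above, here is a
minimal sketch of what it produces. The four sample bytes and the
variable names are made up for illustration and are not taken from the
original file.

```ruby
# Sample bytes, chosen only for illustration.
bytes = "\xff\x45\x00\x10".b   # String#b (Ruby 2.0+) returns a binary (ASCII-8BIT) copy

# 'H*' turns every byte into two hex digits, high nibble first.
hex = bytes.unpack('H*')[0]

puts hex              # prints ff450010
puts bytes.bytesize   # prints 4
```

The hex string is always twice as long as the input, which is worth
keeping in mind before applying it to a very large file in one go.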

Hello again Chris and Abinoam,

I’ve tested Chris’s option, but I still get the error. Maybe there is
something about the interpretation or the encoding that Ruby does not
understand.

script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS (ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'

I’ve tested Abinoam’s option too, and I get this error:

script.rb:50: syntax error, unexpected end-of-input, expecting keyword_end

Line 50 contains “end”. It is the end of the “while gets” loop.

Thanks in advance for your help.

On Sep 14, 2013, at 12:42 AM, Sever S. wrote:

> I’ve tested Abinoam’s option too, and I get this error:
> script.rb:50: syntax error, unexpected end-of-input, expecting
> keyword_end
>
> Line 50 contains “end”. It is the end of the “while gets” loop.
>
> Thanks in advance for your help.

Difficult to know what is happening here.

Wondering if you should specify:

File.open(file, 'rb', :encoding => Encoding::UTF_8) do |f|
  while (f.read(1024, buffer))
    # process the buffer
  end
end

Hello Chris and Abinoam,

Thank you for your answers. I’ll try your suggestions.

I use “while gets” because the binary file is divided into blocks, so
the code inside “while gets” runs each time the beginning of a block
appears.

The issue is that the binary file is 2 GB in size, and I don’t know why
the code works on Linux under Ruby 2.0 and doesn’t work on Windows with
Ruby 2.0.

Thanks again for the help so far.

On Sat, Sep 14, 2013 at 10:20 AM, Tamara T. wrote:

>> (ArgumentError)
>
> Difficult to know what is happening here.
>
> Wondering if you should specify:
>
> File.open(file, 'rb', :encoding => Encoding::UTF_8) do |f|
>   while (f.read(1024, buffer))
>     # process the buffer
>   end
> end

You need to declare buffer:

irb(main):028:0> File.open('x','rb'){|io| while io.read(1024, buffer); puts buffer.bytesize; end}
NameError: undefined local variable or method `buffer' for main:Object
        from (irb):28:in `block in irb_binding'
        from (irb):28:in `open'
        from (irb):28
        from /usr/bin/irb:12:in `<main>'

Also, there seems to be no point in declaring an encoding when reading
binary.

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
    # process the buffer
  end
end

Kind regards

robert
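
A self-contained sketch of the loop above, assuming a small temporary
file in place of the real 2 GB input (the Tempfile, its contents, and
the chunk_sizes array are all illustrative). It shows that
read(1024, buffer) refills the same String on each pass and returns nil
at end of file:

```ruby
require 'tempfile'

# Build a throwaway 2500-byte file so the sketch can run on its own.
sample = Tempfile.new('chunk_demo')
sample.binmode
sample.write("\xAB".b * 2500)
sample.close

chunk_sizes = []
File.open(sample.path, 'rb') do |f|
  buffer = ''.b   # reused for every read instead of allocating a new String per chunk

  # IO#read(length, outbuf) fills outbuf and returns nil once the file is exhausted.
  while f.read(1024, buffer)
    chunk_sizes << buffer.bytesize   # "process the buffer" would go here
  end
end

p chunk_sizes   # prints [1024, 1024, 452]
sample.unlink
```

Note that fixed-size chunks ignore block boundaries, so a block marker
can end up split across two reads.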

On Sat, Sep 14, 2013 at 6:44 PM, Sever S. wrote:

> Adding what you said means eliminating the “while gets” loop, but now
> that you say I need to process the buffer, instead of the line:
>
> line = $_.unpack('H*')[0]
>
> What should go inside the loop “while f.read(1024,buffer)”?

Maybe you should start by stating what it is that you want to achieve.

> $/ is defined at the beginning of the script as below:

Note that you do not need that: you can use #gets with an argument for
the delimiter.

Cheers

robert

Hello Robert,

Thank you.

I think it is fixed now by adding

BEGIN{ $/="\xff\x45".force_encoding("BINARY") }

Ruby 1.9 Encodings: A Primer and the Solution for Rails

Many thanks for all your help.
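
A sketch of why the force_encoding call matters, as far as I can tell:
gets compares the encoding of the separator with the encoding the IO
reads in, and raises when they differ and the separator contains
non-ASCII bytes. The temporary file, its contents, the variable names,
and the explicit 'r:CP850' mode below are assumptions used only to
reproduce the mismatch on any platform; on the Windows machine in
question CP850 was presumably the default external encoding.

```ruby
require 'tempfile'

# What a plain "\xff\x45" literal is tagged as in a UTF-8 source file.
sep_utf8   = "\xff\x45".dup.force_encoding('UTF-8')
# Same two bytes, but tagged as binary (ASCII-8BIT).
sep_binary = "\xff\x45".dup.force_encoding('BINARY')

puts sep_utf8.bytes == sep_binary.bytes       # prints true: only the tag differs
puts sep_binary.encoding == Encoding::BINARY  # prints true

sample = Tempfile.new('enc_demo')
sample.binmode
sample.write("AAA\xff\x45BBB".b)
sample.close

# A CP850 IO combined with a UTF-8 separator holding non-ASCII bytes is rejected.
mismatch = begin
  File.open(sample.path, 'r:CP850') { |f| f.gets(sep_utf8) }
  false
rescue ArgumentError
  true
end
puts mismatch   # prints true

# A binary IO combined with a binary separator is accepted.
first_record = File.open(sample.path, 'rb') { |f| f.gets(sep_binary) }
puts first_record.unpack('H*')[0]   # prints 414141ff45
sample.unlink
```

The separator stays attached to the end of each record that gets
returns.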

Hello tamouse/Robert

Thank you for your answers.

Adding what you said means eliminating the “while gets” loop, but now
that you say I need to process the buffer, instead of the line:

line = $_.unpack('H*')[0]

What should go inside the loop “while f.read(1024,buffer)”?

$/ is defined at the beginning of the script as below:

############################################################
BEGIN{ $/="\xff\x45" }

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
    line = $_.unpack('H*')[0] # What should go instead of this line?
    # process the buffer
  end
end
############################################################

Thanks again for the help.

Best regards

On Sat, Sep 14, 2013 at 8:23 PM, Sever S. wrote:

> I think it is fixed now adding
> BEGIN{ $/="\xff\x45".force_encoding("BINARY") }

Btw. there’s no point in using a BEGIN block here.

And I would still prefer to pass the value as an argument to #gets and
not change the system-wide setting. That is much more robust.

> Ruby 1.9 Encodings: A Primer and the Solution for Rails
>
> Many thanks for all the your help

You’re welcome.

Cheers

robert

Hello Robert,

How would I do it the way you suggest, considering that the current code
looks like this?

##########################################################
BEGIN{ $/="\xff\x45".force_encoding("BINARY") }

while gets
  line = $_.unpack('H*')[0]
  next unless line =~ /Regexp/
  # Some code using "line" content
end
##########################################################

Should it be something like this?

while line = gets("\xff\x45".force_encoding("BINARY"))

Thanks in advance

On Sat, Sep 14, 2013 at 9:35 PM, Sever S. wrote:

> #Some code using “line” content
> end
> ##########################################################
>
> Should be something like:
>
> while line = gets(\xff\x45.force_encoding("BINARY"))

Almost. I’d put the result of this expression in a local variable.
That is more efficient and you can give it a telling name.

Cheers

robert
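
Putting Robert’s advice together, a minimal sketch could look like the
following. The temporary file, its sample bytes, the names block_marker
and hex_records, and the regexp filter are placeholders invented for
illustration; only the \xff\x45 separator and the unpack('H*') call
come from the thread.

```ruby
require 'tempfile'

# The separator lives in a local variable with a telling name; $/ is left untouched.
block_marker = "\xff\x45".b   # String#b (Ruby 2.0+) returns a binary (ASCII-8BIT) copy

# Throwaway input: three records, the first two terminated by the marker.
sample = Tempfile.new('blocks_demo')
sample.binmode
sample.write("\x01\x02\xff\x45\x03\x04\xff\x45\x05".b)
sample.close

hex_records = []
File.open(sample.path, 'rb') do |f|
  # gets reads up to and including the next marker, or to end of file.
  while (record = f.gets(block_marker))
    hex = record.unpack('H*')[0]
    next unless hex =~ /\A0[13]/   # placeholder filter standing in for /Regexp/
    hex_records << hex             # "some code using the line content" goes here
  end
end

p hex_records   # prints ["0102ff45", "0304ff45"]
sample.unlink
```

Because gets is called on the file object with an explicit separator,
the script no longer depends on the platform’s default encoding or on a
global setting.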