Forum: Ruby Problems running code to read binary in Windows7

Sever Siller (severuf)
on 2013-09-13 22:16
Hello everyone,

This is my first post here and I'm a newbie in Ruby; maybe somebody could
help me with this.

I have a script that reads and parses a binary file. The script works fine
if I run it on Ubuntu, but when I try to run the code in IRB on Windows 7
with Ruby version "2.0.0p247 (2013-06-27) [i386-mingw32]", I receive
the following error.

##########################################################################
C:\Scripts>ruby script.rb binaryfile
script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS
(ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'
##########################################################################

The script looks like this (line 18 contains "while gets"):

##########################################################################
#!/usr/bin/env ruby

File.open(ARGV[0])

while gets
    line = $_.unpack('H*', "rb")[0]

  Some code
end
##########################################################################

Thanks in advance for any help.

Best Regards
Abinoam Jr. (abinoampraxedes_m)
on 2013-09-13 23:26
(Received via mailing list)
First guess ...

- File.open(ARGV[0])
+ File.open(ARGV[0], 'rb') # opens file in 'binary' mode.

No Windows here to try it.

Abinoam Jr.
Sever Siller (severuf)
on 2013-09-14 00:47
Hello Abinoam Jr.,

I've tried adding File.open(ARGV[0], 'rb') too, but I get the same error :(

Maybe somebody could help me.

Thanks in advance.

Best regards
Chris Hulan (Guest)
on 2013-09-14 03:08
(Received via mailing list)
File.open returns a file object, which you are not saving a reference to.
As far as I know, gets will default to standard input. Does the app wait
until you hit a key?

IO has a binread method that looks applicable
http://www.ruby-doc.org/core-1.9.3/IO.html#method-c-binread
Abinoam Jr. (abinoampraxedes_m)
on 2013-09-14 03:29
(Received via mailing list)
Chris is right (I didn't look carefully at the rest of the code).

Well... why do you want to handle a "binary" file in an "each_line"
fashion?

But if that is really what you need, here is a version that uses your
code as the starting point:

File.open(ARGV[0], "rb") do |f|
  while f.gets
    line = $_.unpack('H*')[0]
    # Some code here
  end
end


To read the file as a whole... (as Chris suggested).

file = IO.binread(ARGV[0])
hexfile = file.unpack('H*')[0]

Abinoam Jr.
Sever Siller (severuf)
on 2013-09-14 05:00
Hello Chris and Abinoam,

Thank you for your answers. I'll try your suggestions.

I use "while gets" because the binary is divided into blocks, so each
time the beginning of a block appears, the code inside "while gets" is
executed.

The issue is that the binary file is 2 GB in size, and I don't know why
the code works on Linux under Ruby 2.0 but doesn't work on Windows with
Ruby 2.0.

Thanks again for help so far
Sever Siller (severuf)
on 2013-09-14 07:42
Hello again Chris and Abinoam,

I've tested Chris's option but I still receive the error. Maybe it is
something about the interpretation or encoding that Ruby does not
understand.

script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS
(ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'

I've tested Abinoam's option too and I receive this error:
script.rb:50: syntax error, unexpected end-of-input, expecting
keyword_end

Line 50 contains "end". It is the end of the "while gets" loop.

Thanks in advance for your help.
tamouse m. (tamouse_m)
on 2013-09-14 10:21
(Received via mailing list)
On Sep 14, 2013, at 12:42 AM, Sever Siller <lists@ruby-forum.com> wrote:

>
> I've tested Abinoam's option too and I receive this error:
> script.rb:50: syntax error, unexpected end-of-input, expecting
> keyword_end
>
> Line 50 contains "end". It is the end of the "while gets" loop.
>
> Thanks in advance for your help.


Difficult to know what is happening here.

Wondering if you should specify:

   File.open(file, 'rb', :encoding => Encoding::UTF_8 ) do |f|
     while (f.read(1024,buffer))
        # process the buffer
     end
   end
Robert Klemme (robert_k78)
on 2013-09-14 12:30
(Received via mailing list)
On Sat, Sep 14, 2013 at 10:20 AM, Tamara Temple
<tamouse.lists@gmail.com> wrote:
>> (ArgumentError)
>
>
> Difficult to know what is happening here.
>
> Wondering if you should specify:
>
>    File.open(file, 'rb', :encoding => Encoding::UTF_8 ) do |f|
>      while (f.read(1024,buffer))
>         # process the buffer
>      end
>    end

You need to declare buffer:

irb(main):028:0> File.open('x','rb'){|io| while io.read(1024, buffer);
puts buffer.bytesize; end}
NameError: undefined local variable or method `buffer' for main:Object
        from (irb):28:in `block in irb_binding'
        from (irb):28:in `open'
        from (irb):28
        from /usr/bin/irb:12:in `<main>'

Also, there seems to be no point in declaring an encoding when reading
binary.

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
     # process the buffer
  end
end

Kind regards

robert
Sever Siller (severuf)
on 2013-09-14 18:44
Hello tamouse/Robert

Thank you for your answer.

Doing what you suggest means eliminating the "while gets" loop, but now
that you say I need to process the buffer instead of the line:

line = $_.unpack('H*')[0]

What should go inside the loop "while f.read(1024,buffer)"?

$/ is defined at the beginning of the script, as below:

############################################################
BEGIN{  $/="\xff\x45"   }

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
     line = $_.unpack('H*')[0] # What should go instead of this line?
     # process the buffer
  end
end
############################################################

Thanks again for the help.

Best regards
Robert Klemme (robert_k78)
on 2013-09-14 20:01
(Received via mailing list)
On Sat, Sep 14, 2013 at 6:44 PM, Sever Siller <lists@ruby-forum.com>
wrote:

> Doing what you suggest means eliminating the "while gets" loop, but now
> that you say I need to process the buffer instead of the line:
>
> line = $_.unpack('H*')[0]
>
> What should go inside the loop "while f.read(1024,buffer)"?

Maybe you could first state what it is that you want to achieve.

> $/ is defined at the beginning of the script, as below:

Note that you do not need that: you can use #gets with an argument for
the delimiter.
http://ruby-doc.org/core-1.9.3/IO.html#method-i-gets
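
A minimal sketch of that idea, assuming the block delimiter is the two
bytes 0xFF 0x45 from the script above. The helper name hex_blocks, the
constant DELIMITER and the sample data are made up for illustration; only
IO#gets with a separator argument is the point here.

```ruby
require 'tempfile'

# Hypothetical helper (not from the original script): reads a binary file
# block by block, passing the delimiter to IO#gets instead of setting $/.
def hex_blocks(path, delimiter)
  blocks = []
  File.open(path, 'rb') do |f|
    # gets(delimiter) returns everything up to and including the delimiter,
    # or nil at end of file.
    while (block = f.gets(delimiter))
      blocks << block.unpack('H*')[0]
    end
  end
  blocks
end

# The delimiter is tagged as binary so that it matches the binary-mode file.
# String#b (Ruby 2.0+) returns an ASCII-8BIT copy of the string.
DELIMITER = "\xff\x45".b

# Made-up sample data: three blocks, the first two ending in the delimiter.
sample = Tempfile.new('blocks')
sample.binmode
sample.write("\x01\x02\xff\x45\x03\xff\x45\x04".b)
sample.close

p hex_blocks(sample.path, DELIMITER)
# → ["0102ff45", "03ff45", "04"]
```

The global $/ is never touched, so other code that relies on the default
line separator keeps working.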

Cheers

robert
Sever Siller (severuf)
on 2013-09-14 20:23
Hello Robert,

Thank you.

I think it is fixed now by adding
BEGIN{  $/="\xff\x45".force_encoding("BINARY")   }

"http://yehudakatz.com/2010/05/05/ruby-1-9-encoding...

Many thanks for all your help
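
For anyone hitting the same message, here is a small sketch of what the
error seems to be about, under some assumptions. The literal "\xff\x45" in
a UTF-8 source file is tagged UTF-8, while the stream being read is tagged
CP850 (presumably the default external encoding on that Windows console;
bare gets reads the files named on the command line through ARGF). On a
typical Linux setup both sides would be UTF-8, which would explain why no
error shows up there. The temp file and variable names are made up, and
CP850 is forced explicitly so the mismatch can be reproduced on any
platform. The sketch only shows that a separator and a stream with matching
(binary) encodings work together; it does not claim to reproduce the exact
Windows setup.

```ruby
require 'tempfile'

# Made-up sample file containing the delimiter bytes 0xFF 0x45.
sample = Tempfile.new('mismatch')
sample.binmode
sample.write("\x01\xff\x45\x02".b)
sample.close

# Separator tagged as UTF-8, as a literal in a UTF-8 source file would be.
utf8_rs = "\xff\x45".b.force_encoding(Encoding::UTF_8)
# The same bytes tagged as binary (ASCII-8BIT).
binary_rs = "\xff\x45".b

# Reading a CP850-tagged stream with a non-ASCII UTF-8 separator is
# expected to raise ArgumentError (encoding mismatch).
mismatch_error = nil
File.open(sample.path, 'r:CP850') do |f|
  begin
    f.gets(utf8_rs)
  rescue ArgumentError => e
    mismatch_error = e
  end
end
puts mismatch_error.message if mismatch_error

# Reading a binary-mode stream with a binary separator works.
first_block = File.open(sample.path, 'rb') { |f| f.gets(binary_rs) }
p first_block.unpack('H*')[0]
# → "01ff45"
```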
Robert Klemme (robert_k78)
on 2013-09-14 21:04
(Received via mailing list)
On Sat, Sep 14, 2013 at 8:23 PM, Sever Siller <lists@ruby-forum.com>
wrote:

> I think it is fixed now by adding
> BEGIN{  $/="\xff\x45".force_encoding("BINARY")   }

Btw. there's no point in using a BEGIN block here.

And, still I would prefer to use the value as an argument to #gets and
not change the system wide setting. That is much more robust.

>
> "http://yehudakatz.com/2010/05/05/ruby-1-9-encoding...
>
> Many thanks for all your help

You're welcome.

Cheers

robert
Sever Siller (severuf)
on 2013-09-14 21:35
Hello Robert,

How would your suggestion look, considering the current code is like this:

##########################################################
BEGIN{  $/="\xff\x45".force_encoding("BINARY")}

while gets
    line = $_.unpack('H*')[0]
    next unless line =~ /Regexp/
  #Some code using "line" content
end
##########################################################

Should it be something like:

while line = gets("\xff\x45".force_encoding("BINARY"))

Thanks in advance
Robert Klemme (robert_k78)
on 2013-09-15 13:37
(Received via mailing list)
On Sat, Sep 14, 2013 at 9:35 PM, Sever Siller <lists@ruby-forum.com>
wrote:
>   #Some code using "line" content
> end
> ##########################################################
>
> Should it be something like:
>
> while line = gets("\xff\x45".force_encoding("BINARY"))

Almost.  I'd put the result of this expression in a local variable.
That is more efficient and you can give it a telling name.
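
Roughly like this, as a sketch only. The name block_delimiter, the
StringIO stand-in for the real file and the pattern /\Aaa/ are made up for
illustration (the real script would use File.open(ARGV[0], 'rb') and its
own regexp). Building the delimiter once, outside the loop condition,
avoids re-evaluating the expression on every iteration.

```ruby
require 'stringio'

# Build the delimiter once and give it a descriptive name.
block_delimiter = "\xff\x45".b   # binary (ASCII-8BIT) copy of the two bytes

# Stand-in for the real input; made-up data with three blocks.
input = StringIO.new("\xaa\x01\xff\x45\xbb\x02\xff\x45\xaa\x03".b)

matching = []
while (block = input.gets(block_delimiter))
  line = block.unpack('H*')[0]
  next unless line =~ /\Aaa/     # placeholder for the script's /Regexp/
  matching << line               # "some code using line" would go here
end

p matching
# → ["aa01ff45", "aa03"]
```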

Cheers

robert