Hexdump (#171)

mansfiem · July 25, 2008, 7:28pm

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

The three rules of Ruby Quiz 2:

1. Please do not post any solutions or spoiler discussion for this
quiz until 48 hours have passed from the time on this message.

2. Support Ruby Quiz 2 by submitting ideas as often as you can! (A
permanent, new website is in the works for Ruby Quiz 2. Until then,
please visit the temporary website at

http://splatbang.com/rubyquiz/.)

3. Enjoy!

Suggestion: A [QUIZ] in the subject of emails about the problem
helps everyone on Ruby Talk follow the discussion. Please reply to
the original quiz message, if you can.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

hexdump (#171)

Quiz idea provided by Robert D…

This week’s quiz should be quick and easy for experienced Rubyists,
and a good lesson for beginners. Your task this week is to write a
utility that outputs a hex dump of the input.

There are a number of hex dump utilities in existence, that go by the
names hd, od, hexdump… I’m sure there are more. Pick one you’d
like to reproduce: If you’re on any variety of Unix or BSD (including
Mac OS X), you can get man pages from the command-line to see how they
work. On Windows, if you don’t have one installed, you can check out
this man page for hexdump and use that as a model.

You are not required to implement all the various command-line
switches, but I should be able to run your script on a file and, as a
minimum, see output resembling this (view with fixed-width font for
best results):

0000000 6573 2074 6c68 0a73 7973 746e 7861 6f20
0000010 0a6e 6f63 6f6c 7372 6863 6d65 2065 6564
0000020 6573 7472 0a0a 6573 2074 7865 6170 646e
0000030 6174 0a62 6573 2074 6174 7362 6f74 3d70
0000040 0a32 6573 2074 6873 6669 7774 6469 6874
0000050 323d 220a 6573 2074 6574 7478 6977 7464
0000060 3d68 3836 0a0a 2022 2051 6f63 6d6d 6e61
0000070 2064 6f74 7220 6665 726f 616d 2074 6170
0000080 6172 7267 7061 7368 6120 646e 6c20 7369
0000090 2e74 6e0a 6f6e 6572 616d 2070 2051 7167
00000a0 0a7d 0a0a
00000a4

Your submission should accept input either from a named file (part of
the command-line arguments) or from standard input if no filename is
provided.

Finally, when submitting, make sure to describe what existing hex dump
program you are emulating/reproducing (if any), and what arguments to
your script are needed, if any, to produce the basic output above.
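
For readers who want to see how the sample output above is laid out, here is a minimal sketch, not a reference solution. It assumes the format shown in the quiz: a 7-digit hex offset, 16 bytes per line printed as 16-bit little-endian words, and a final line holding the total byte count. The method name hexdump_words is made up for this illustration, and padding an odd trailing byte with a zero byte is an assumption about how the emulated tool behaves.

```ruby
# Hypothetical helper: formats a String as 16-bit little-endian hex words.
def hexdump_words(data)
  bytes = data.b                      # work on raw bytes, not characters
  lines = []
  offset = 0
  while offset < bytes.bytesize
    chunk = bytes.byteslice(offset, 16)
    chunk += "\0" if chunk.bytesize.odd?   # assumption: pad odd tail with a zero byte
    words = chunk.unpack("v*").map { |w| format("%04x", w) }
    lines << format("%07x %s", offset, words.join(" "))
    offset += 16
  end
  lines << format("%07x", bytes.bytesize)  # closing line: total length
  lines.join("\n")
end
```

Something like puts hexdump_words(ARGF.read) would then cover both the named-file and standard-input cases.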

mansfiem · July 25, 2008, 10:08pm

On Jul 25, 2008, at 12:22 PM, Matthew M. wrote:

There are a number of hex dump utilities in existence, that go by the
names hd, od, hexdump… I’m sure there are more. Pick one you’d
like to reproduce…

xxd is my favorite.

James Edward G. II

mansfiem · July 25, 2008, 11:40pm

On Jul 25, 11:22 am, Matthew M. wrote:

00000a0 0a7d 0a0a
00000a4

Is this really what you dumped, Matthew? I was hoping for something a
little more… comprehensible.

Chris

es tlh
systnxao
nocolsrhcme eedestr

es txeapdnat
bes tatsbot=p
2es thsfiwtdiht2="
es tettxiwtd=h86

" Qocmmna dotr feroam taparrgpasha dnl si.tn
oneram p Qqg
}

mansfiem · July 26, 2008, 5:31am

On Jul 25, 4:34 pm, Chris S. wrote:

0000080 6172 7267 7061 7368 6120 646e 6c20 7369

oneram p Qqg

I believe your endianness is off, sir.
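
To make the endianness point concrete, here is a small sketch that decodes the first line of the quiz dump both ways. The two method names are invented for this example. Reading each word high byte first reproduces the shuffled text from the earlier post, while reading it as a little-endian 16-bit value recovers the original.

```ruby
# Hypothetical helpers; both take space-separated 4-digit hex words.
def decode_big_endian(words)
  # "H4" writes the hex digits in the order they appear: high byte first.
  words.split.map { |w| [w].pack("H4") }.join
end

def decode_little_endian(words)
  # "v" packs a 16-bit integer low byte first.
  words.split.map { |w| [w.hex].pack("v") }.join
end

line = "6573 2074 6c68 0a73"
p decode_big_endian(line)     # => "es tlh\ns"
p decode_little_endian(line)  # => "set hls\n"
```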

mansfiem · July 25, 2008, 11:02pm

Will you be accepting golfed solutions? Of course you will.

– a,b=%Q=Z,O^NPO\r4_PV\PI\x15^-\x0\v=,email=%%%%c%115%%# Mikael
Hoilund, CTO
okay=%#;hmm=(0…a.size).map{|i|((a[i]-email[i]+2)%128).# of Meta.io
ApS from
chr}.join;!email.gsub!‘o’,“%c%c”%[3+?0.<<(2),?G.~@];aha=#############
Denmark
hmm.scan(/#{‘(.)’*5}/);!puts(email[1…-12]+aha.shift.zip(*aha).join)#
Ruby <3

mansfiem · July 27, 2008, 10:56pm

Well, here goes my reference implementation, in good ol’ RQ tradition.
Nothing fancy here, just 16 bytes per line with hex addresses and ASCII
output on the right, like the System V hd command.

http://pastie.org/242020

Robert

–
http://ruby-smalltalk.blogspot.com/

There’s no one thing that’s true. It’s all true.

mansfiem · July 26, 2008, 8:16am

On Jul 25, 9:25 pm, Matthew M. wrote:

0000040 0a32 6573 2074 6873 6669 7774 6469 6874
2es thsfiwtdiht2="
es tettxiwtd=h86

" Qocmmna dotr feroam taparrgpasha dnl si.tn
oneram p Qqg

I believe your endianness is off, sir.

Whoops. I knew it looked almost like a slightly shuffled
somethingorother.

Chris

mansfiem · July 26, 2008, 3:22pm

On Jul 25, 3:56 pm, Mikael Høilund wrote:

Will you be accepting golfed solutions? Of course you will.

Well, sure… Though in this case, I’d somewhat prefer to see nicely
written solutions that offered up more command-line options, such as
those provided by the various utilities. Things like grouping by 1, 2
or 4 bytes; ASCII display; binary/octal; etc.

But golfed solutions are okay, as usual…
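
As a rough idea of what the "grouping by 1, 2 or 4 bytes" option could look like, here is a sketch. The name format_groups is hypothetical, and it prints bytes in file order inside each group, which is only one of the conventions the real utilities use.

```ruby
# Hypothetical helper: hex-format a chunk, group_size bytes per group.
def format_groups(chunk, group_size)
  chunk.bytes
       .each_slice(group_size)                            # split into groups of bytes
       .map { |g| g.map { |b| format("%02x", b) }.join }  # two hex digits per byte
       .join(" ")
end

puts format_groups("set hls\n", 1)  # 73 65 74 20 68 6c 73 0a
puts format_groups("set hls\n", 4)  # 73657420 686c730a
```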

mansfiem · July 27, 2008, 11:51pm

Oh hi, I just thought I’d golf a solution. I’m sure other people can
do a much better job than I making a full hexdumping suite, so I just
had some fun. Can’t seem to get it lower than 78 characters,
unfortunately.

i=0;$<.read.scan(/.{0,16}/m){puts"%08x "%i+$&.unpack('H4'*8).join(' ');i+=16}

Expanded and parenthesified, clarified:

i = 0
ARGF.read.scan(/.{0,16}/m) {
  puts(("%08x " % i) + $&.unpack('H4'*8).join(' '))
  i += 16
}

ARGF (aliased as $<) is the file handle of all file names given in the
arguments concatenated, STDIN if none — exactly what we need. The
regex to scan matches between 0 and 16 characters (including newline)
greedily. Change it to 1,16 if you don’t want the empty line at the end.
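
A quick way to see the trailing empty match is to chunk a 20-byte string with both quantifiers. This is only an illustration; the method name chunk is invented here.

```ruby
# Hypothetical helper: split data into pieces of min_len..16 characters.
def chunk(data, min_len)
  data.scan(/.{#{min_len},16}/m)
end

p chunk("a" * 20, 0).map(&:size)  # => [16, 4, 0]  (empty match at the end)
p chunk("a" * 20, 1).map(&:size)  # => [16, 4]
```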

Instead of giving the scan block an argument, I used a trick I picked
up from the last Ruby Quiz I participated in (Obfuscated Email) and
used $& inside the block, which holds the last regex match.
Saves two characters \o/
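
For anyone who has not met $& before, this small sketch shows that it yields the same values as a block argument would. Both method names are made up for the comparison.

```ruby
# Hypothetical helpers: collect matches via a block argument and via $&.
def words_via_arg(text)
  found = []
  text.scan(/\w+/) { |m| found << m }
  found
end

def words_via_last_match(text)
  found = []
  text.scan(/\w+/) { found << $& }  # $& is the text of the current match
  found
end

p words_via_arg("hex dump")         # => ["hex", "dump"]
p words_via_last_match("hex dump")  # => ["hex", "dump"]
```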

The unpack returns an array of eight strings, each of four characters,
with the hexadecimal representation of the ASCII value of two
consecutive characters. Fun, fun, fun.
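
Here is what that unpack call returns on the first eight bytes of the quiz input, as a small illustration (the method name hex_pairs is invented). Note that H4 emits the bytes in file order, so the first word reads 7365 rather than the 6573 of the quiz sample, which shows 16-bit little-endian words.

```ruby
# Hypothetical helper: four hex digits (two bytes) per element.
def hex_pairs(chunk, pairs)
  chunk.unpack("H4" * pairs)
end

p hex_pairs("set hls\n", 4)  # => ["7365", "7420", "686c", "730a"]
```

When the chunk is shorter than 16 bytes, the leftover directives produce shorter or empty strings, which is why a short last line still prints without error.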

mansfiem · July 28, 2008, 4:27pm

I added an ascii column to your solution… now it’s about twice the
size

i = 0
$<.read.scan(/.{0,16}/m) {
  puts(("%08x " % i) + $&.unpack('H4'*8).join(' ') + ' [' +
    $&.split(//).collect { |c| c.inspect[1] == 92 ? '.' : c }.join +
    ']' )
  i += 16
}
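
A note for readers on Ruby 1.9 or later: there, indexing a string with an integer returns a one-character string rather than a number, so the comparison with 92 above never matches. A possible adaptation of the same idea, with a made-up method name, compares against a backslash instead. This is a sketch of the check only, not a rewrite of the solution.

```ruby
# Hypothetical helper: replace characters whose inspect form starts with
# a backslash escape (newlines, tabs, quotes, backslashes, ...) by '.'.
def ascii_column(chunk)
  chunk.each_char.map { |c| c.inspect[1] == "\\" ? "." : c }.join
end

puts ascii_column("ab\ncd")  # ab.cd
```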

mansfiem · July 30, 2008, 11:37pm

On 7/28/08, Martin B. wrote:

$<.read.scan(/.{0,16}/m) {
  puts(("%08x " % i) + $&.unpack('H4'*8).join(' ') + ' [' +
    $&.split(//).collect { |c| c.inspect[1] == 92 ? '.' : c }.join + ']' )
  i += 16
}

I can’t resist golf: I got Martin’s solution down to 95 bytes (If you
take out the ascii column it’s down to 71).

i=0;$<.read.scan(/.{0,16}/m){puts"%08x0 "%i+$&.unpack('H4'*8)*' '+' | '+$&.tr('^ -~','.');i+=1}

Tricks: *' ' is a shorter version of .join(' ') for arrays,
and $&.tr('^ -~','.') says translate any character not between ' ' and
'~' (32 to 126) to a '.'. That saved a ton over the
split/collect/inspect method. (By the way, map and dump save a few
bytes over collect and inspect.)
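
Both tricks can be tried in isolation. The sketch below wraps them in two throwaway methods whose names are invented for the example.

```ruby
# Hypothetical helpers wrapping the two tricks.
def join_with_space(parts)
  parts * ' '                 # Array#* with a String behaves like join
end

def printable(text)
  text.tr('^ -~', '.')        # anything outside ' '..'~' becomes '.'
end

puts join_with_space(%w[7365 7420])  # 7365 7420
puts printable("ab\ncd\t~")          # ab.cd.~
```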

I also did a more full-featured version that supports some command line
options

-Adam

#hexdump utility for RubyQuiz#171
USAGE=<<USAGE

Usage:
 #{$0.split(/[\/\\]/)[-1]} [-n length] [-s skip] [-g group] [-w width] [-a] file

 Dumps <length> bytes of <file> in hex format, starting at offset <skip>.
 Prints <width> bytes per line in groups of size <group>.
 Prints the ascii on the right unless <-a> specified

 Default is all bytes of $stdin in 16/2 format.

USAGE
begin
  width=16
  group=2
  skip=0
  length=Float::MAX
  do_ascii = true
  file = $stdin

  while (opt=ARGV.shift)
    if opt[0]==?-
      case opt[1]
      when ?n
        length=ARGV.shift.to_i
      when ?s
        skip=ARGV.shift.to_i
      when ?g
        group = ARGV.shift.to_i
      when ?w
        width = ARGV.shift.to_i
      when ?a
        do_ascii = false
      else
        raise ArgumentError,"invalid Option #{opt}"
      end
    else
      file = File.new(opt)
    end
  end

  n=0
  ascii=''
  file.read(skip)
  file.each_byte{|b|
    if n%width == 0
      print "%s\n%08x "%[ascii,n+skip]
      ascii='| ' if do_ascii
    end
    print "%02x"%b
    print ' ' if (n+=1)%group==0
    ascii << "%s"%b.chr.tr('^ -~','.') if do_ascii
    break if n>length
  }
  puts ' '*(((2+width-ascii.size)*(2*group+1))/group.to_f).ceil+ascii
  #this is probably the most complicated line
  #it pads out the line to get the remaining ascii to align:
  # (2+width-ascii.size) is the number of bytes missing (the 2 is for the '| ')
  # (2*group+1) is the width of a group of bytes with the space
  # /group.to_f divides by the number of groups
  # .ceil rounds up, otherwise we misalign on partial groups
rescue =>x
  puts USAGE, "ERROR: #{x}"
end
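
To check the padding arithmetic described in those comments, here is the same formula pulled out into a standalone method (pad_width is a name invented for this sketch) and evaluated for the default 16/2 layout. With 4 bytes on the last line, ascii holds '| ' plus 4 characters, so 12 bytes are missing; at 5 columns per 2-byte group that is 30 spaces. With 3 bytes, the half-finished group is what makes the .ceil necessary.

```ruby
# Hypothetical helper: number of spaces needed before the ascii column.
# ascii_size includes the 2-character '| ' prefix.
def pad_width(width, group, ascii_size)
  (((2 + width - ascii_size) * (2 * group + 1)) / group.to_f).ceil
end

p pad_width(16, 2, 6)  # => 30  (4 bytes printed, 12 missing)
p pad_width(16, 2, 5)  # => 33  (3 bytes printed, 32.5 rounded up)
```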

mansfiem · August 8, 2008, 11:37pm

Matthew M. wrote:

hexdump (#171)

Quiz idea provided by Robert D…

This week’s quiz should be quick and easy for experienced Rubyists,
and a good lesson for beginners. Your task this week is to write a
utility that outputs a hex dump of the input.

I did something a little different, I made a module that can be used to
extend IO objects. This means you can extend any File or socket objects
to become hex writers. Since I don’t think you can “un-extend” an
object, it would probably be best if you dup the IO object if you need
to switch between hex and normal output.

#!/usr/bin/env ruby

# UziMonkey

module HexWriter
  # Keep the object's original write reachable as old_write before the
  # module's own write takes over.
  def self.extend_object(o)
    class << o
      alias_method :old_write, :write
    end

    super
  end

  def write(s)
    s.each_byte do |b|
      if @bytes % 16 == 0 and @address != 0
        end_line
        new_line
      end

      write_byte b
    end
  end

  def new_file
    @address = 0
    new_line
  end

  def end_line
    # each missing byte would have taken three columns (" %02x")
    old_write "   " * (16 - @bytes)
    old_write " #{@ascii}\n"
  end

  def new_line
    @bytes = 0
    @address ||= 0
    @ascii = ""

    old_write "%08x" % @address
  end

  def write_byte(b)
    old_write " %02x" % b

    @ascii << ((b.chr =~ /[[:print:]]/).nil? ? '.' : b.chr)

    @bytes += 1
    @address += 1
  end
end

hex = STDOUT.dup.extend HexWriter
ARGV.each do |f|
  puts "File: #{f}"
  hex.new_file
  hex.write File.read(f)
  hex.end_line
end
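
A smaller illustration of the "extend one IO object" idea, for anyone curious how it behaves: the sketch below uses StringIO and a made-up module named HexBytes. It swaps in a different technique from the post above, calling super instead of aliasing the old write, which works because a module added with extend sits in front of the object's class in method lookup. Only the extended object changes; other instances keep the normal write.

```ruby
require "stringio"

# Hypothetical module: writes the hex digits of whatever is written.
module HexBytes
  def write(s)
    super(s.unpack1("H*"))   # super reaches the object's original write
  end
end

plain = StringIO.new
hexed = StringIO.new.extend(HexBytes)

plain.write("hi")
hexed.write("hi")

p plain.string  # => "hi"
p hexed.string  # => "6869"
```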

mansfiem · July 31, 2008, 1:54am

On Jul 30, 2008, at 23:36, Adam S. wrote:

On 7/28/08, Martin B. wrote:

I can’t resist golf: I got Martin’s solution down to 95 bytes (If you
take out the ascii column it’s down to 71).

i=0;$<.read.scan(/.{0,16}/m){puts"%08x0 "%i+$&.unpack('H4'*8)*' '+' | '+$&.tr('^ -~','.');i+=1}

That’s pretty neat! I’d totally forgotten about that trick. The way
you handle the counter is ;)-ish