Strange characters... how are they created?

This isn’t a Ruby-specific question, I suppose, but it’s the language
I’m using for this project so I figured I’d ask here.

I’ve created a .torrent file, and between the bencoding is a
representation of the contents of the file I want to share (a small text
file):

20:Ÿø\îµÊ^Ω¬ ‡m$æÉ* ú0*

My question is, how does the BT client go from plain text to that? The
BT protocol spec says that pieces of data get SHA1’d, but I’ve never
seen a SHA1 hash look like that.

What am I missing?

In case the sample characters I pasted don’t show up right, here’s a
screen shot:

http://rabbitcreative.com/strange-charas.png

On 6 Feb 2008, at 08:31, Daniel W. wrote:

My question is, how does the BT client go from plain text to that? The
BT protocol spec says that pieces of data get SHA1’d, but I’ve never
seen a SHA1 hash look like that.

That sort of depends how you display your SHA1 hash. The result will
just be 20 random looking bytes (as you’ve got before), which doesn’t
display particularly nicely, which is why it’s often written out in
hex. However if it’s never going to be read by anything approaching a
human you might as well just store the 20 bytes without trying to
encode them in any way.
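
To make the point about raw bytes versus hex concrete, here is a minimal sketch using Ruby's standard `digest/sha1` library. The input string is just the sample that appears later in this thread; nothing here is specific to any BitTorrent client.

```ruby
require 'digest/sha1'

data = 'd3:cow3:moo4:spam4:eggs' # sample input taken from this thread

raw = Digest::SHA1.digest(data)    # 20 raw bytes, not meant for display
hex = Digest::SHA1.hexdigest(data) # the same 20 bytes written as 40 hex characters

puts raw.bytesize # 20
puts hex.length   # 40

# The two forms carry the same information and convert both ways.
puts raw.unpack('H*').first == hex # true
puts [hex].pack('H*') == raw       # true
```

A .torrent file stores the raw form, which is why it looks like noise in a text editor.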

Fred

Frederick C. wrote:

That sort of depends how you display your SHA1 hash. The result will
just be 20 random looking bytes (as you’ve got before), which doesn’t
display particularly nicely, which is why it’s often written out in
hex. However if it’s never going to be read by anything approaching a
human you might as well just store the 20 bytes without trying to
encode them in any way.

Wow Frederick, you just opened my eyes! Thank you!

You said…

…which is why it’s often written out in hex.

And on a whim I tried:

Digest::SHA1.digest(…)

And lo and behold, I got much more bizarre looking output:

Digest::SHA1.digest('d3:cow3:moo4:spam4:eggs')
=> "?:\231\005?5&&U\203?@\217\261\216?%t"

I had no idea there was anything other than the hexdigest method, or
that the resulting hash could be represented so differently.

I feel like I’m moving forward, but I’m facing the wrong direction by a
degree or two.

The output, while more similar to what Transmission is generating, isn’t
quite as “funky.”

On Feb 6, 2008 4:11 AM, Daniel W. wrote:

I don’t know anything about bittorrent, but

s = '20:Ÿø\îµÊ^Ω¬ ‡m$æÉ* ú0*'
puts s.unpack('h*')

might point you in the right direction.

Todd

Daniel W. wrote:

The output, while more similar to what Transmission is generating, isn’t
quite as “funky.”

Scratch that “quite as funky” bit. I’ve hashed the SAME content and
gotten different results than what Transmission is generating. So I’m
still doing something wrong.

On Feb 6, 2008 6:22 PM, Daniel W. wrote:

though. Perhaps the characters being pasted mean something special to
Ruby?)

require 'digest/sha1'

output = File.open('output', 'w+')
source = IO.read('./README_FOR_APP', 512)
output << Digest::SHA1.digest(source)
output.close

Yields the same “style” output that Transmission generates.

Oh, I see, you want to create a digest. I misunderstood you.

Todd

Todd B. wrote:

I don’t know anything about bittorrent, but

s = '20:Ÿø\îµÊ^Ω¬ ‡m$æÉ* ú0*'
puts s.unpack('h*')

might point you in the right direction.

That’s an interesting operation, but nay, no directional change. (Also,
pasting it into IRB causes my machine to beep and pause. Dunno why
though. Perhaps the characters being pasted mean something special to
Ruby?)

I asked a friend of mine and he said that if you dump the output of the
digest method directly to a file, you’ll get the funky output.

My guess is the console does some kind of conversion – possibly with
character encoding?

Anywho, the following…

require 'digest/sha1'

output = File.open('output', 'w+')
source = IO.read('./README_FOR_APP', 512)
output << Digest::SHA1.digest(source)
output.close

Yields the same “style” output that Transmission generates.
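
For comparing against what a client writes, it may help to see how the spec's "pieces get SHA1'd" rule turns into the `20:` prefix from the original post. The following is only a sketch under assumptions: `piece_hashes` and `bencode_bytes` are hypothetical helper names, the 32 KiB piece length is an arbitrary choice, and the content string is a stand-in for the real file. The real piece length is whatever the client recorded in the .torrent.

```ruby
require 'digest/sha1'

# Hypothetical helper: split the file content into fixed-size pieces and
# concatenate the raw 20-byte SHA-1 digest of each piece.
def piece_hashes(data, piece_length)
  hashes = ''.b # binary string to collect raw digests
  offset = 0
  while offset < data.bytesize
    hashes << Digest::SHA1.digest(data.byteslice(offset, piece_length))
    offset += piece_length
  end
  hashes
end

# Hypothetical helper: a bencoded byte string is "<length>:<bytes>".
def bencode_bytes(str)
  "#{str.bytesize}:".b + str
end

content = "a small text file\n"           # stand-in for the shared file
pieces  = piece_hashes(content, 32 * 1024) # assumed piece length

puts pieces.bytesize                # 20, one piece means one digest
puts bencode_bytes(pieces).bytesize # 23, the digest plus the "20:" prefix
```

A file smaller than one piece yields a single digest of the whole file, so hashing only its first 512 bytes would match the client only if the file is no longer than 512 bytes.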

On Feb 6, 2008 6:22 PM, Daniel W. wrote:

though. Perhaps the characters being pasted mean something special to
Ruby?)

I think it has to do with your console. It also doesn’t work for me
when pasting into IRB, but seems to work when running with ruby from a
.rb file.

I asked a friend of mine and he said that if you dump the output of the
digest method directly to a file, you’ll get the funky output.

I guess that makes sense.

My guess is the console does some kind of conversion – possibly with
character encoding?

I think so.
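
A small sketch of the difference, as far as I understand it: irb prints the result of `inspect`, which escapes bytes it cannot show as text, whereas writing to a file stores the bytes untouched. The temp file below is only for illustration, and the exact escape style of `inspect` varies between Ruby versions.

```ruby
require 'digest/sha1'
require 'tempfile'

raw = Digest::SHA1.digest('d3:cow3:moo4:spam4:eggs')

# What irb shows: an escaped, printable rendering of the bytes.
puts raw.inspect

# What actually gets stored: the 20 bytes themselves.
round_trip = nil
Tempfile.create('digest') do |f|
  f.binmode # avoid any newline or encoding translation
  f.write(raw)
  f.flush
  round_trip = File.binread(f.path)
end

puts round_trip == raw # true
```

So the digest is the same either way; only the rendering differs.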

Anywho, the following…

require 'digest/sha1'

output = File.open('output', 'w+')
source = IO.read('./README_FOR_APP', 512)
output << Digest::SHA1.digest(source)
output.close

Yields the same “style” output that Transmission generates.

But not the same?

From your original post with my code, I get…

2303a3f98fc5ee5bace5f4ca0278d6426e9ca202af03a2

…which if you leave off the first three bytes (20:) should be a
reasonable sha1 digest of 160 bits length.
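
One thing to watch when comparing against a hexdigest: the `h*` directive emits the low nibble of each byte first, while `H*` emits the high nibble first, which is the order `hexdigest` uses. A short sketch follows; `split_bencoded_string` is a hypothetical helper for dropping the length prefix, not part of any library.

```ruby
require 'digest/sha1'

prefix = '20:'
puts prefix.unpack('h*').first # 2303a3 (low nibble first)
puts prefix.unpack('H*').first # 32303a (high nibble first)

# Hypothetical helper: given "<length>:<bytes>", return just the bytes.
def split_bencoded_string(bytes)
  length, rest = bytes.split(':', 2)
  rest.byteslice(0, length.to_i)
end

raw     = Digest::SHA1.digest('example') # arbitrary input for illustration
encoded = '20:'.b + raw

# Comparable with hexdigest once the prefix is gone and H* is used.
puts split_bencoded_string(encoded).unpack('H*').first == Digest::SHA1.hexdigest('example')
```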

Todd

Todd B. wrote:

Oh, I see, you want to create a digest. I misunderstood you.

It’s possible too I wasn’t very clear (it’s happened before), in which
case I apologize. =)