Error in Ruby text comparison?

'1sqHmb5b8G9mN' < '1Xv5LeB9bMdar'

Wouldn’t you think that is supposed to be TRUE ?

All my text editors and Excel and Numbers all sort it so that 1s…
comes before 1X…

But Ruby says the above comparison is false.

What am I missing?

– gw

Strange, this list:

s, S, s, a, B

in Excel, Numbers, and in Araelium Edit comes out as this when sorted

a, B, s, S, s

TextWrangler puts them as

a, B, s, s, S

Ruby sorts ['s','S','s','a','B'].sort as

["B", "S", "a", "s", "s"]

MySQL (as I have it set up) sorts like the OS X apps.

Trying to do a binary search with a ruby array based on text sorted by
something else is getting hosed.

Oh… duh, have Ruby sort it :P

Could get very expensive, but I guess I’ll have to do it.
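A binary search only works when it uses the same comparison that produced the order. Below is a minimal sketch of the two options, assuming Ruby 2.0 or later (for Array#bsearch) and a made-up list of keys; the names `keys`, `by_code`, `found` and `found_ci` are only for illustration.

```ruby
# Hypothetical keys, as delivered by a tool that sorts case-insensitively.
keys = ['1sqHmb5b8G9mN', '1Xv5LeB9bMdar', 'a', 'B', 's']

# Option 1: re-sort once with Ruby's default (character code) order,
# then search with the matching comparison, String#<=>.
by_code = keys.sort
found = by_code.bsearch { |k| '1Xv5LeB9bMdar' <=> k }

# Option 2: keep the external order and search with a comparison that
# agrees with it. casecmp is assumed to match the external tool here,
# which holds for these ASCII keys but not necessarily in general.
found_ci = keys.bsearch { |k| 'b'.casecmp(k) }

puts found     # 1Xv5LeB9bMdar
puts found_ci  # B
```

Re-sorting costs one O(n log n) pass, after which every lookup is O(log n), so it is only expensive if the list changes constantly.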

– gw

On Sat, 2007-10-27 at 20:27 +0900, Greg W. wrote:

a, B, s, S, s

TextWrangler puts them as

a, B, s, s, S

Ruby sorts ['s','S','s','a','B'].sort as

["B", "S", "a", "s", "s"]

irb to the rescue:
irb(main):001:0> 'a' < 'b'
=> true
irb(main):002:0> "A" < 'b'
=> true
irb(main):003:0> "a" < "B"
=> false
irb(main):004:0> ?a < ?b
=> true
irb(main):005:0> ?A < ?b
=> true
irb(main):006:0> ?a < ?B
=> false
irb(main):007:0> ?A
=> 65
irb(main):008:0> ?a
=> 97

Ruby’s sorting these strings by ASCII order, and as you can see here,
capital letters come first! So “A” is always less than ‘a’, etc.
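On Ruby 1.9 and later, `?a` returns a one-character string instead of an integer, so the character code has to be read with String#ord. A small sketch of the same check applied to the strings from the original question (the variable names are just for illustration):

```ruby
# Character codes behind the comparison (String#ord, Ruby 1.9+).
puts 'A'.ord  # 65
puts 'a'.ord  # 97
puts 'X'.ord  # 88
puts 's'.ord  # 115

left  = '1sqHmb5b8G9mN'
right = '1Xv5LeB9bMdar'

# Both strings start with "1", so the first differing character decides.
first_diff = (0...left.size).find { |i| left[i] != right[i] }
puts first_diff                                     # 1
puts left[first_diff].ord > right[first_diff].ord   # true ("s" is 115, "X" is 88)
puts left < right                                   # false
```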

Arlen

Hi,

At Sat, 27 Oct 2007 20:13:28 +0900,
Greg W. wrote in [ruby-talk:276099]:

'1sqHmb5b8G9mN' < '1Xv5LeB9bMdar'

Wouldn’t you think that is supposed to be TRUE ?

No.

But Ruby says the above comparison is false.

'1sqHmb5b8G9mN'.casecmp('1Xv5LeB9bMdar') < 0

returns true.
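The same method can drive a sort without touching the String class. A minimal sketch using the list from earlier in the thread; the variable names are illustrative, and the tie-break in the last example is one possible choice, not the rule any particular editor or spreadsheet uses.

```ruby
list = ['s', 'S', 's', 'a', 'B']

# Default order: by character code, so capitals come first.
default_order = list.sort

# Case-insensitive order via casecmp. Entries that differ only in case
# compare as equal, so their relative order is not guaranteed.
casecmp_order = list.sort { |a, b| a.casecmp(b) }

# Case-insensitive order with a deterministic tie-break: compare the
# downcased string first, then the original string.
stable_order = list.sort_by { |s| [s.downcase, s] }

p default_order  # ["B", "S", "a", "s", "s"]
p stable_order   # ["a", "B", "S", "s", "s"]
```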

Hi!

Ruby’s sorting these strings by ASCII order, and as you can see here, capital letters come first! So “A” is always less than ‘a’, etc.

This is because Ruby follows lexicographical order when sorting, comparing
strings character by character by their codes. If you need case-insensitive
comparisons, you can change the way the sorting works with:

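puts '1sqHmb5b8G9mN' < '1Xv5LeB9bMdar'
puts 'a' < 'B'

# Caution: reopening String changes comparisons for every string in the
# program, including those made inside libraries.
class String
  alias <=> casecmp
end

puts '1sqHmb5b8G9mN' < '1Xv5LeB9bMdar'
puts 'a' < 'B'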

Let’s see what ri tells us about casecmp:

ri casecmp


String#casecmp
str.casecmp(other_str) => -1, 0, +1


 Case-insensitive version of String#<=>.

    "abcdef".casecmp("abcde")     #=> 1
    "aBcDeF".casecmp("abcdef")    #=> 0
    "abcdef".casecmp("abcdefg")   #=> -1
    "abcdef".casecmp("ABCDEF")    #=> 0

Best regards,


Eustáquio “TaQ” Rangel
http://eustaquiorangel.com

“When someone says, ‘I want a programming language in which I need only
say what
I want done,’ give him a lollipop.”
Alan Perlis

Ruby, like several other languages, compares strings by their underlying
character codes. When you only work with English text this can be confusing,
because you expect a different behavior.

I think it is always a bad idea to compare text this way, because the
character sequence depends first on the language (e.g. in German the
letter “ö” is treated like “oe”, while in Swedish it comes after “z”),
and second, a single language may have different standards too (e.g.
telephone book order versus the order defined for the language itself,
as in German).

If you want to compare strings based on their usage in a language, you
should use an appropriate sequence definition.
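To make the point concrete, here is a rough sketch of two such sequence definitions as sort keys. It assumes Ruby 2.4 or later (for Unicode-aware `downcase`) and UTF-8 source; the word list, the `phone_book` table and the `swedish` alphabet are illustrative stand-ins, far smaller than real collation rules.

```ruby
words = ['Ofen', 'Öl', 'Zug', 'Oder']

# German phone-book convention: expand umlauts before comparing.
phone_book = { 'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss' }
german = words.sort_by { |w| w.downcase.gsub(/[äöüß]/, phone_book) }

# Swedish convention: å, ä, ö are separate letters after z.
swedish_alphabet = 'abcdefghijklmnopqrstuvwxyzåäö'.chars
swedish = words.sort_by do |w|
  # Characters outside the alphabet sort last in this sketch.
  w.downcase.chars.map { |c| swedish_alphabet.index(c) || swedish_alphabet.size }
end

p german   # ["Oder", "Öl", "Ofen", "Zug"]
p swedish  # ["Oder", "Ofen", "Zug", "Öl"]
```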

Wolfgang Nádasi-Donner

On 10/27/07, Wolfgang Nádasi-Donner wrote:

If You want to compare Strings based on the usage in a language, you
should better use an appropriate sequence definition.

What is the best way to create a new sort order?
For example, if the order is this:
%w[1 2 3 4 5 6 7 8 9 0 A a B b…]
with every other character sorted after the alphanumeric characters,
in the order Array#sort would give them.

On 10/27/07, Devi Web D. wrote:

For example, if the order is this:

Mind you, that was just a sample; I really want to understand the best
method to make any arbitrary sort order.
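One common approach is to turn each string into an array of ranks and let sort_by compare those arrays. The sketch below is only one way to do it; it assumes the elided order continues as "C c D d" through "Z z", and the names `order`, `rank` and `custom_key` are made up for the example.

```ruby
# Assumed alphabet: digits 1-9 then 0, followed by A a B b ... Z z.
order = %w[1 2 3 4 5 6 7 8 9 0] + ('A'..'Z').flat_map { |c| [c, c.downcase] }
rank  = Hash[order.each_with_index.to_a]  # character => position

# Sort key: listed characters rank by position; every other character
# ranks after them, ordered by its code point.
custom_key = lambda do |str|
  str.each_char.map { |c| rank.fetch(c) { order.size + c.ord } }
end

sorted = %w[b B a A 0 1 _ 10].sort_by { |s| custom_key.call(s) }
p sorted  # ["1", "10", "0", "A", "a", "B", "b", "_"]
```

Any other arbitrary order only needs a different `order` array; the key function stays the same.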