Question regarding design of the String Class


#1

Was there a reason the string class was implemented with str[i]
returning the code of position i in str? The reason I ask this is that
in other languages str[i] returns the string starting at position i.
For example C uses t = strcpy(str[i]) and Business Basic uses S$=T$(I)
to copy a string from position i.
I can see no way to do this in Ruby other than using something like: t =
str[i,9999]. It seemed strange that copying ranges of strings uses the
same format as C (t =strncpy(str[i],n)) but not when copying the
remainder.


#2

On Apr 22, 2007, at 10:00 PM, Michael W. Ryder wrote:

Was there a reason the string class was implemented with str[i]
returning the code of position i in str? The reason I ask this is
that in other languages str[i] returns the string starting at
position i. For example C uses t = strcpy(str[i]) and Business
Basic uses S$=T$(I) to copy a string from position i.
I can see no way to do this in Ruby other than using something
like: t = str[i,9999]. It seemed strange that copying ranges of
strings uses the same format as C (t =strncpy(str[i],n)) but not
when copying the remainder.

Try str[i,-1], or one of the myriad other ways to access ranges of a
string as defined in String#[]


#3

Roland Crosby wrote:

remainder.

Try str[i,-1], or one of the myriad other ways to access ranges of a
string as defined in String#[]

If I enter:
a = "This is a test."
b = a[1, -1]
puts b
irb returns nil. Obviously this is not what I want. If instead of -1 I
use 9999 it returns “his is a test.” which is what I was looking for. This
seems like a kludge and an inconsistency. Like I pointed out, other
languages just use b = a[1] to get the remainder of the string instead
of 104. The string class already has methods like each_byte for
converting characters in a string to a number, so why does it need
another shortcut for something that is probably very rarely used?


#4

Try:
b = a[1..-1]
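
The forms tried so far can be compared side by side. This is only a small sketch, assuming a reasonably current Ruby; the variable names are illustrative and not from the thread.

```ruby
a = "This is a test."

# A negative length is not valid in the [start, length] form, so this is nil.
with_negative_length = a[1, -1]

# Inside a range, -1 means "the last character", so this is the remainder.
remainder = a[1..-1]

# Three dots exclude the end of the range, so the final "." is dropped.
without_last = a[1...-1]

# An oversized length is simply clamped to the end of the string.
clamped = a[1, 9999]

puts with_negative_length.inspect  # nil
puts remainder                     # his is a test.
puts without_last                  # his is a test
puts clamped == remainder          # true
```

The two-dot range is the form that matches "from position 1 to the end".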

-Stephen


#5

On Apr 22, 2007, at 10:35 PM, Michael W. Ryder wrote:

when copying the remainder.
of the string instead of 104. The string class already has methods
like each_byte for converting characters in a string to a number,
so why does it need another shortcut for something that is probably
very rarely used.

Sorry, I meant a[1..-1] rather than a[1,-1]. I don’t know why Ruby
returns the character codes like that, but for what it’s worth, I
believe Ruby 1.9 is going to switch to returning a single-character
string when you put one integer in String#[].
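
For readers on Ruby 1.9 or later, where that change did land, a short sketch of the difference might look like this. It assumes such a release; on the 1.8 series discussed in this thread, s[1] gave the integer 104 instead.

```ruby
s = "This is a test."

char = s[1]          # a one-character String on Ruby 1.9 and later
code = s[1].ord      # the integer code, requested explicitly
byte = s.getbyte(1)  # the raw byte at that offset

puts char.inspect  # "h"
puts code          # 104
puts byte          # 104
```

The numeric value is still available, but it has to be asked for by name.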


#6

“Michael W. Ryder” removed_email_address@domain.invalid writes:

Was there a reason the string class was implemented with str[i]
returning the code of position i in str? The reason I ask this is
that in other languages str[i] returns the string starting at position
i. For example C uses t = strcpy(str[i]) and Business Basic uses
S$=T$(I) to copy a string from position i.

I can’t comment on what “Business Basic” uses, but your C code is
completely wrong. In C, str[i] returns a char which, since C has
“char” as one of its integral types, is equivalent to returning the
character code.

The usual usage of strcpy to copy only from the second (index 1)
character onward is:

strcpy(dest, src + 1);

(And incidentally, using strcpy instead of strncpy is a practice that
often leads to security vulnerabilities)

In other words, Ruby’s behavior with str[i] matches the behavior of C:
it returns the character at that position, where “character” is
viewed simply as a number.

#7

On 4/22/07, Daniel M. removed_email_address@domain.invalid wrote:

“char” as one of its integral types, is equivalent to returning the
character code.

Daniel, your points are well taken, but if the rusty old neurons in
my brain which contain knowledge of C aren’t mistaken, str[i] isn’t a
function, and therefore doesn’t ‘return’ anything.

C doesn’t really have a string type. A string literal is really an
array of chars, although in almost all cases (i.e. other than when
it’s used in a string initializer, or as the argument to sizeof), it’s
interpreted as a pointer to the first character, due to the
relationship between arrays and pointers in C.

So if str is declared either as:

char str[];
or
char *str;

the expression str[i] is equivalent to *((str) + (i)), it’s really a
pointer to a char, which because of the relationship between arrays
and pointers in c, can be interpreted as an array of chars.

And in Ruby the whole notion of pointers is meaningless.

The point here, of course, is that when learning Ruby, or any other
language, one needs to be aware that things one knows from other
languages often don’t carry over without conceptual modification, if
at all.

If all languages did everything exactly the same way, there’d be no
need for so many of them.

To sum it up, let Ruby be Ruby, don’t expect it to be Java, C++,
Visual Basic, or anything else.


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/


#8

Rick DeNatale wrote:

completely wrong. In C, str[i] returns a char which, since C has
interpreted as a pointer to the first character, due to the
and pointers in c, can be interpreted as an array of chars.

To sum it up, let Ruby be Ruby, don’t expect it to be Java, C++,
Visual Basic, or anything else.

I guess my point was that str[i] behaves totally different from all the
other implementations of []. All of the others return a string. This
seems to be an inconsistency. If there is a valid reason for it I have
no problem, it just makes it harder to transfer over 25 years of
experience to a new language.
In Business Basic or C if I want the numeric value of a character in a
string I specify that. Likewise if I want to copy a string from an
arbitrary position I don’t have to specify an ending character like
Ruby. I just find s = t[i, -1] to be much harder to understand in a
quick read than s = t[i]. Others may not have this problem.


#9

Roland Crosby wrote:

uses the same format as C (t =strncpy(str[i],n)) but not when
other languages just use b = a[1] to get the remainder of the string
instead of 104. The string class already has methods like each_byte
for converting characters in a string to a number, so why does it need
another shortcut for something that is probably very rarely used.

Sorry, I meant a[1..-1] rather than a[1,-1].

I figured that out right after I posted my reply. I had forgotten about
ranges as I have never programmed in a language that used them before.
The little differences can really get you, especially when you find so
many similarities.

I don’t know why Ruby


#10

On Apr 23, 2007, at 10:23 PM, Rick DeNatale wrote:

And in Ruby the whole notion of pointers is meaningless.

This is one thing I really really love about Ruby. (one of many)


#11

On Tue, Apr 24, 2007 at 02:40:09AM +0900, Michael W. Ryder wrote:

I guess my point was that str[i] behaves totally different from all the
other implementations of []. All of the others return a string.

You clearly know a lot of languages then :)

As pointed out before, in C, str[i] is an expression whose value is an
integer for the character at position i, exactly as in Ruby.

In Perl, it doesn’t do what you expect either:

$ perl -e '$a = "abcde"; print $a[2], "\n";'

$

(what this actually does is extract an element from the array @a, which I
have not initialised, and is completely unrelated to the scalar $a)

In Business Basic or C if I want the numeric value of a character in a
string I specify that. Likewise if I want to copy a string from an
arbitrary position I don’t have to specify an ending character like
Ruby. I just find s = t[i, -1] to be much harder to understand in a
quick read than s = t[i]. Others may not have this problem.

Personally I would be very surprised if str[i] returned all the characters
from ‘i’ to the end of the string. But then I don’t program in Business
Basic.

I do program in C though. If I wanted the string from position i to the end
of the string, I would write str + i, or possibly &str[i]

In Perl you have to be explicit and call substr()

Brian.


#12

Brian C. wrote:

On Tue, Apr 24, 2007 at 02:40:09AM +0900, Michael W. Ryder wrote:

I guess my point was that str[i] behaves totally different from all the
other implementations of []. All of the others return a string.

You clearly know a lot of languages then :)

I probably should have phrased that differently. What I meant was that all
of the other implementations of [] in Ruby for the String class return a
string; only str[i] returns a number.

have not initialised, and is completely unrelated to the scalar $a)

Business Basic has been doing this for the more than 25 years I have been
programming in it. For example if I enter: A$="abcdefg" and then say:
Print A$(3) it prints cdefg. Like Ruby, if I enter B$=A$(3,3) B$
contains cde. Other than the beginning number of the string they act
the same.
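
The two Business Basic forms described here can be mimicked in Ruby with a small wrapper. This is a sketch only: bb_substr is a hypothetical name made up for illustration, not part of Ruby, and it assumes the 1-based counting shown in the A$(3) example.

```ruby
# Hypothetical helper (not part of Ruby's String API): mimics the 1-based
# A$(start) and A$(start, length) forms described above.
def bb_substr(str, start, length = nil)
  from = start - 1  # Business Basic counts from 1, Ruby counts from 0
  length ? str[from, length] : str[from..-1]
end

a = "abcdefg"
puts bb_substr(a, 3)     # cdefg
puts bb_substr(a, 3, 3)  # cde
```

With no length given, the helper falls back to the start..-1 range, which is the "copy the rest" behavior being asked for.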

I do program in C though. If I wanted the string from position i to the end
of the string, I would write str + i, or possibly &str[i]

But you do not have to provide a length or ending position for the copy
which was part of my confusion. I specify a starting position and the
language copies the rest of the string. In Ruby just providing a
starting position gives me a numeric value.


#13

Robert D. wrote:

I probably should have phrased that differently. What I meant that all
But there are other tools around that make up for it.

(what this actually does is extract an element from the array @a,
order to convince a rubyist that other features might be nice because
they are present in language X, I’d rather chose X from Python, Lisp,
Smalltalk, Self, IO or Lua (and I am leaving out some by laziness and
ignorance)

I am not worried about the influence of other languages or trying to
make Ruby like language X, I am trying to understand the logic behind
some of the choices. It makes it hard when at first glance it looks
familiar, but then you get bitten because of the differences that are
not readily apparent.

this is sounding blunt, to take a break from too much comparing with
they were I really shifted into the paradigm of Ruby, and yes I still
get bitten by duck typing and no I do not introduce type checking, I
just write better tests.

I program for a living in Business Basic so I can’t take a break from
the language to learn Ruby. Instead, I am trying to convert functions
from Business Basic to Ruby and may consider then trying to convert some
of my programs to Ruby. I have already completed one of the function
conversions and may post the results later. Part of the reason for the
conversion would be the ability to “upgrade” the programs to use things
like browsers and SQL instead of the glass tty and flat files built into
Business Basic.


#14

One small nit:

“Rick DeNatale” removed_email_address@domain.invalid writes:

the expression str[i] is equivalent to *((str) + (i)), it’s really a
pointer to a char, which because of the relationship between arrays
and pointers in c, can be interpreted as an array of chars.

No. str[i] is indeed equivalent to *((str) + (i)), but that’s not a
pointer to a char.

((str) + (i)) is a pointer to a char.

*((str) + (i)) is a char.

*((str) + (i)) cannot be interpreted as an array of chars starting i
characters into the original string.

((str) + (i)) could be so interpreted. ((str) + (i)) is equivalent to
&(str[i]) which is a very different thing from str[i].


#15

On 4/23/07, Michael W. Ryder removed_email_address@domain.invalid wrote:

string, only str[i] returns a number.

I copy that; you have made a somewhat valid point, one that has been
discussed before, and I do not like it either that “ab”[0] == ?a (instead
of “a”). But it is not a clear-cut error either.

The overloading (in human terms not computer science terms) of [] to
get elements and substrings of a string might not be the best choice
either. And that there is String#each_byte and not
String#each_character might hurt too.
But there are other tools around that make up for it.
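
As a point of comparison for later readers, newer Ruby releases do provide a character-wise iterator next to each_byte. The sketch below assumes such a release (it is not the version under discussion in this thread).

```ruby
word = "abc"

# each_byte yields integers, one per byte.
bytes = []
word.each_byte { |b| bytes << b }

# each_char yields one-character strings.
chars = []
word.each_char { |c| chars << c }

p bytes  # [97, 98, 99]
p chars  # ["a", "b", "c"]
```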

(what this actually does is extract an element from the array @a, which I
have not initialised, and is completely unrelated to the scalar $a)

In Business Basic or C if I want the numeric value of a character in a
string I specify that. Likewise if I want to copy a string from an
arbitrary position I don’t have to specify an ending character like
Ruby. I just find s = t[i, -1] to be much harder to understand in a
quick read than s = t[i]. Others may not have this problem.

I guess that the influence of Basic and C on Ruby is minimal. In
order to convince a rubyist that other features might be nice because
they are present in language X, I’d rather choose X from Python, Lisp,
Smalltalk, Self, Io or Lua (and I am leaving out some through laziness and
ignorance)

I do program in C though. If I wanted the string from position i to the end
of the string, I would write str + i, or possibly &str[i]

But you do not have to provide a length or ending position for the copy
which was part of my confusion. I specify a starting position and the
language copies the rest of the string. In Ruby just providing a
starting position gives me a numeric value.

Well, if you want to get the maximum from Ruby I’d advise you, sorry if
this sounds blunt, to take a break from too much comparing with
other languages.
Paradigm shifts are tough; after that break you might still think that
“ab”[0] == ?a is not a good thing, but I am sure that you will be able
to bring your point across much better.

Sorry if I became lecturing, just thought it might help, after all ;).
I remember very well when I was lectured about duck typing, first I
was angry, and I said lots of stupid things (they were very clever in
my Ada world of course), but when I let go and looked at things as
they were I really shifted into the paradigm of Ruby, and yes I still
get bitten by duck typing and no I do not introduce type checking, I
just write better tests.

Welcome to Ruby.

Cheers
Robert


#16

On 4/24/07, Michael W. Ryder removed_email_address@domain.invalid wrote:

I program for a living in Business Basic so I can’t take a break from
the language to learn Ruby. Instead, I am trying to convert functions
from Business Basic to Ruby and may consider then trying to convert some
of my programs to Ruby. I have already completed one of the function
conversions and may post the results later. Part of the reason for the
conversion would be the ability to “upgrade” the programs to use things
like browsers and SQL instead of the glass tty and flat files built into
Business Basic.

I see, that makes it particularly difficult; well, it just might be a
slower process.
My hint would then be to concentrate on things that seem logical to
you and continue discussing things that do not on this list.
The problem with the single item you chose is that it is indeed a
little odd, but useful in some circumstances.
Maybe you should just live with the str[i..i] notation for a while and
focus on other things.
Sorry for not being more helpful :(

Cheers
Robert


#17

Lee wrote:

are indeed getting back.

If I were getting back a character or string I would not have a problem.
The problem is that every other use of [] in the string class returns
a string or nil, this one returns an integer. I was curious if this was
necessary for some other part of the language or just an “accident”.


#18

You know, you can always type:

s = 'Hello World'
puts s[1,s.length - 1]

Maybe that’s a bit wordy for you? I think Ruby’s use of [] really
follows convention that you are accessing an index. That is what most
programmers think of when they see []. If you provide a single
Fixnum, one would expect you would get back the contents of that
index, and since strings are arrays of characters, that is what you
are indeed getting back.
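
To make the comparison with arrays concrete, here is a brief sketch, assuming a current Ruby where String#chars returns an Array. The variable names are illustrative only.

```ruby
s = 'Hello World'

by_length = s[1, s.length - 1]  # the wordy form from this post
by_range  = s[1..-1]            # the range form suggested earlier

letters    = s.chars            # an actual Array of one-character strings
array_tail = letters[1..-1]     # Array#[] accepts the same range

puts by_length                    # ello World
puts by_length == by_range        # true
puts array_tail.join == by_range  # true
```

The same start..-1 range works on both the string and the array of its characters, which is the indexing convention this post appeals to.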


#19

Robert D. wrote:

I see, that makes it particularly difficult; well, it just might be a
slower process.
My hint would then be to concentrate on things that seem logical to
you and continue discussing things that do not on this list.
The problem with the single item you chose is that it is indeed a
little odd, but useful in some circumstances.
Maybe you should just live with the str[i..i] notation for a while and
focus on other things.
Sorry for not being more helpful :(

That’s what I plan on doing. Of course my next project will really test
me as I am going to have to see if there is a way to duplicate a
function in Business Basic that returns the position of a string in
another string. I know that if I “hard code” the search string I can
use str =~ /test/ to get what I want for some cases. The problem is
that I am trying to make test also a string and haven’t figured out how
to do this yet. Just using str.include? test returns either true or
false but not the position of test in str. To make this even more
complicated I want to be able to say something like a = pos(b,c,x) where
b and c are strings and x is an integer. This would mean that the
search only started on every x characters – i.e. if b = “1234” and c =
“234123411234” a would equal 9 rather than 3.
Thanks for the input.
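
One possible shape for such a function is sketched below. The name pos and its argument order are taken from the description above, but the helper itself is hypothetical and not part of Ruby. It relies on String#index, which accepts a string to search for and an optional starting offset, and it returns a 0-based index, so the position 9 counted from 1 appears here as 8.

```ruby
# Hypothetical helper (not part of Ruby): returns the 0-based index of the
# first occurrence of needle in haystack that starts on a multiple of step,
# or nil when there is none.
def pos(needle, haystack, step = 1)
  raise ArgumentError, "step must be positive" unless step > 0

  start = 0
  while (found = haystack.index(needle, start))
    return found if (found % step).zero?
    start = found + 1  # keep looking just past this match
  end
  nil
end

b = "1234"
c = "234123411234"

puts c.index(b)                     # 3, the first match anywhere
puts pos(b, c, 4)                   # 8, the first match on a 4-character boundary
puts c =~ /#{Regexp.escape(b)}/     # 3, the =~ form with the search text in a variable
```

Regexp.escape keeps any special characters in the search string from being read as pattern syntax when it is interpolated into a regular expression.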


#20

From: Michael W. Ryder [mailto:removed_email_address@domain.invalid]

If I were getting back a character or string I would not have
a problem. The problem is that every other use of [] in the string
class returns a string or nil, this one returns an integer. I was
curious if this was necessary for some other part of the language
or just an “accident”.

Evolution, perhaps? Ruby caters to problem domains from low-level/old to
high-level/newer, from specific to general, so…

http://redhanded.hobix.com/inspect/futurismUnicodeInRuby.html


“What exactly can I say to placate you? I can’t have you in despair.” -
_why