Ruby Hacking Guide - New chapters (and a bonus)

Vincent_I · April 5, 2006, 11:01am

Hi everyone,

Here they are, translations of chapters 3, 4 and 6 of the Ruby Hacking
Guide!
We know the translation is far from being perfect, and we welcome any
correction on the text or diagrams of any chapter (even chapter 2).
Please send them as patches (attached to the mail, not just in the
body of the message) on the rhg-discussion mailing list
(http://rubyforge.org/mailman/listinfo/rhg-discussion). The patches
should be done against the text files in the SVN repository
(http://rubyforge.org/scm/?group_id=1387).

We also introduced a new feature: previews. This means we put on the web
page chapters that have not been fully proofread and that may have
missing diagrams. They are labelled with a big ‘PREVIEW’. So do
not hesitate to check our web page often to be able to read the
chapters before they are announced (and send us corrections!). For
example, the preview chapters released today were made available on
the web page more than one week ago.

I may be repeating myself, but we still need people to help, especially
translators. If you can, even if it’s only for one chapter, come help
us. Proofreaders are also welcome: the more, the better.

I would like to especially thank the following people for making it
possible:

Clifford Caoile for translating chapter 3
Meinrad R. for making the diagrams
Jim Driscoll for his proofreading
and of course Minero A. for allowing us to translate his book.

So if you want to read it, the official web page is still
http://rhg.rubyforge.org/ :).

But wait! Today I also have a bonus: a quick translation of matz’
YAPC::Asia 2006 slides. The slides in Japanese are available here:
http://www.rubyist.net/~matz/slides/yapc2006/
They are mainly about multilingualisation in Ruby 2. Many thanks to
matz for letting me post this translation and correcting my stupid
mistranslations.

For those who have no idea what TRON or Mojikyo is, or what the
problems with Unicode in Japan are, you should check this:
http://www.jbrowse.com/text/unij.html

So here comes the translation. It’s far from being perfect, but it’s
still easier to understand than the Japanese version.

– beginning of the translation
YAPC::Asia 2006

Ruby on Perl(s)

Yukihiro “Matz” Matsumoto
[email protected]

Copyright (c) 2006 Yukihiro “Matz” Matsumoto, No rights reserved though.

How was Ruby born?

in a Lisp(ish) system
I added object oriented capabilities
and took in some Perl functions
–
That’s why
Perl is Ruby’s big brother
–
Or
Ruby’s big sister
–
Therefore

Hello World is
print "hello world\n"
in Perl, Ruby or Python

But in PHP it’s

<?php echo "hello world"?>

quite different on this point

Ruby and Perl are similar but

Perl has everything
Ruby’s heart is object oriented
–
Ruby and Perl are similar but
Perl uses (most of the time) a functional word order
Ruby uses a Japanese word order
–
Functional word order

print reverse(<>);

prints the reversed ARGV.

Japanese word order

ARGF.readlines.reverse.display

Take ARGF,
call readlines on it,
reverse readlines’ result,
display reverse’s result
(this is natural order in Japanese language)
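
As a rough illustration of that left-to-right reading (not from the slides), here is a minimal Ruby sketch. The method name reversed_text and the StringIO sample are hypothetical stand-ins so it runs without input files. It joins the lines before calling display because, on Ruby 1.9 and later, an Array written with display appears in its inspect form.

```ruby
require "stringio"

# Reads every line from io and returns them in reverse order as one string.
# Read the chain left to right: take io, read its lines, reverse them, join them.
def reversed_text(io)
  io.readlines.reverse.join
end

# StringIO stands in for ARGF here (an assumption made for the sketch).
sample = StringIO.new("first\nsecond\nthird\n")
reversed_text(sample).display   # writes the reversed lines to $stdout
```

With real input, passing ARGF instead of the StringIO object should behave the same way.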

Ruby and Perl are similar but

Larry is American (even if he studies the Japanese language)
Matz is Japanese (even if he studies the English language)
–
Ruby and Perl are similar but

Perl is Unicode-centered
Ruby is decentralized
–
Ruby and Perl are similar but
Perl uses UCS (Universal Character Set)
Ruby is (will be) CSI (Character Set Independent)
–
What are your complaints towards Unicode?
it’s thoroughly used, isn’t it.
resentment towards Han unification?
inferiority complex of Japanese people?
–
What are your complaints towards Unicode?
no, no, I do not have any complaints about Unicode
in the domains where Unicode is adequate
–
Then, why CSI?

In most applications, UCS is enough thanks to Unicode.
However, there are also applications for which this is not the case.

Fields for which Unicode is not enough
Big character sets

Konjaku-Mojikyo (Japanese encoding which includes many more characters
than Unicode)
TRON code
GB18030
–
Fields for which Unicode is not fitted
Legacy encodings
conversion to UCS is useless
big conversion tables
round-trip problem
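
One way to picture the round-trip concern (a sketch, not from the slides): convert legacy text to another encoding and back, then check that the bytes are unchanged. The helper name round_trip? is hypothetical, and the String#encode API used here is the one found in Ruby 1.9 and later, which postdates this talk.

```ruby
# Returns true when text survives a conversion to `via` and back unchanged.
# Conversions that cannot represent a character count as a failed round trip.
def round_trip?(text, via: Encoding::UTF_8)
  text.encode(via).encode(text.encoding) == text
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  false
end

# "日本語" in Shift_JIS, built from raw bytes so the source file stays ASCII.
sjis = [0x93, 0xFA, 0x96, 0x7B, 0x8C, 0xEA].pack("C*").force_encoding(Encoding::Shift_JIS)

puts round_trip?(sjis)                           # Shift_JIS -> UTF-8 -> Shift_JIS
puts round_trip?(sjis, via: Encoding::US_ASCII)  # US-ASCII cannot hold these characters
```

A check like this only detects a lossy conversion; it does not say which conversion table was at fault.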
–
If a language chooses the UCS system
you cannot write non-UCS applications
you can’t handle text that can’t be expressed with Unicode
–
If a language chooses the CSI system
CSI is a superset of UCS
Unicode just has to be handled in CSI
–
… is what we can say but
CSI is difficult
can it really be implemented?
–
That’s where Japan’s traditional arts come in

Adaptation for the Japanese language of applications

Modification of English language applications to be able to process
Japanese
–
Adaptation for the Japanese language of applications
What engineers of long ago experienced for sure
- Emacs (NEmacs)
- Perl (JPerl)
- Bash
–
Accumulation of know-how

In Japan, the know-how of adaptation for the Japanese language
(multi-byte text processing)
has been accumulated.

Accumulation of know-how

in the first place, just for local use,
text using 3 encodings circulates
(4 if including UTF-8)

Based on this know-how

multibyte text encodings
switching between encodings at the string level
processing them at practical speed
is finished
–
Available encodings

euc_tw euc_jp iso8859_* utf-8 utf-32le
ascii euc_kr koi8 utf-16le utf-32be
big5 gb2312 sjis utf-16be

…and many others
If an encoding is stateless, in principle it can be made available.

It means
For applications using only one encoding, code conversion is not needed

Moreover
Applications wanting to handle multiple encodings can choose an
internal encoding (generally Unicode) that includes all others

If you want to

you can also handle multiple encodings without conversion, leaving
characters as they are
but this is difficult so I do not recommend it
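
To make the two approaches concrete, here is a minimal sketch using the String API as it exists in Ruby 1.9 and later (force_encoding, encode). That API postdates this talk and is not shown in the slides, so treat it as an illustration under that assumption.

```ruby
# "日本語" in two legacy encodings, built from raw bytes so the source stays ASCII.
euc  = [0xC6, 0xFC, 0xCB, 0xDC, 0xB8, 0xEC].pack("C*").force_encoding(Encoding::EUC_JP)
sjis = [0x93, 0xFA, 0x96, 0x7B, 0x8C, 0xEA].pack("C*").force_encoding(Encoding::Shift_JIS)

# Each string carries its own encoding; nothing has been converted.
puts euc.encoding    # EUC-JP
puts sjis.encoding   # Shift_JIS

# Single-encoding application: character operations work directly on EUC-JP.
puts euc.length      # 3 characters stored in 6 bytes
reversed = euc.reverse

# Multi-encoding application: transcode everything to one internal encoding.
utf8_from_euc  = euc.encode(Encoding::UTF_8)
utf8_from_sjis = sjis.encode(Encoding::UTF_8)
puts utf8_from_euc == utf8_from_sjis   # same text once both are UTF-8
```

The two legacy strings compare as different until they are transcoded, which is the reason for picking one internal encoding when several are in play.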
–
However,
only the basic part is done,
it’s far from being ready for practical use
code conversion
guessing encoding
etc.
–
For the time being, today
I want to tell everyone:
UCS is practical
but not all-purpose
CSI is not impossible
–
The reason I’m saying that
They may add CSI to Perl6, just as they added
Methods called by “.”
Continuations
from Ruby.
Basically, they hate losing.
–
Thank you
– end of the translation

Vincent_I · April 5, 2006, 11:35am

Vincent I. wrote:

Hi everyone,

Here they are, translations of chapters 3, 4 and 6 of the Ruby Hacking Guide!

But wait! Today I also have a bonus: a quick translation of matz’
YAPC::Asia 2006 slides.

guys, thanks sooo much to all of you

Vincent_I · April 5, 2006, 11:45am

Vincent,

This is awesome. Thank you to you and everyone else in this project.
This is just excellent!

Zach

Vincent_I · April 5, 2006, 4:11pm

Great work doing the translations!

Vincent_I · April 5, 2006, 9:22pm

On Apr 5, 2006, at 4:58 AM, Vincent I. wrote:

(http://rubyforge.org/scm/?group_id=1387).
Thanks! I just started reading, and found this gem of an
explanation for Qnil:

By the way, what is the ‘Q’ of Qnil? ‘R’ I would have understood
but why ‘Q’? When I asked, the answer was ‘Because it’s like that
in Emacs’. I did not have the fun answer I was expecting…
Me either!

Vincent_I · April 5, 2006, 10:02pm

On 4/5/06, Logan C. [email protected] wrote:

Please send them as patches (attached to the mail, not just in the
in Emacs". I did not have the fun answer I was expecting…
Me either!

It surely has to do with StarTrek, does it not?

–
Two things are infinite: the universe and human stupidity; as for
the universe, I have not yet acquired absolute certainty.

Albert Einstein

Vincent_I · April 7, 2006, 6:36am

On Apr 6, 2006, at 7:29 AM, Christian N. wrote:

correction on the text or diagrams of any chapter (even chapter 2).
but why ‘Q’? When I asked, the answer was ‘Because it’s like that
in Emacs’. I did not have the fun answer I was expecting…
Me either!

Same reason ?x is used for character codes, by the way.

Whatever happened to #\x ??

Vincent_I · April 6, 2006, 1:32pm

Logan C. [email protected] writes:

(http://rubyforge.org/mailman/listinfo/rhg-discussion). The patches
should be done against the text files in the SVN repository
(http://rubyforge.org/scm/?group_id=1387).

Thanks! I just started reading, and found this gem of an
explanation for Qnil:

By the way, what is the ‘Q’ of Qnil? ‘R’ I would have understood
but why ‘Q’? When I asked, the answer was ‘Because it’s like that
in Emacs’. I did not have the fun answer I was expecting…
Me either!

Same reason ?x is used for character codes, by the way.

Vincent_I · April 7, 2006, 12:03pm

Logan C. [email protected] writes:

We know the translation is far from being perfect, and we welcome

By the way, what is the ‘Q’ of Qnil? ‘R’ I would have understood
but why ‘Q’? When I asked, the answer was ‘Because it’s like that
in Emacs’. I did not have the fun answer I was expecting…
Me either!

Same reason ?x is used for character codes, by the way.

Whatever happened to #\x ??

Debugger entered–Lisp error: (invalid-read-syntax “#”)

I’m not sure where the ?x really originates from; MacLisp didn’t
support it, and it meant something different in TECO.
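
For comparison, Ruby has the same ?x literal. A minimal sketch, assuming Ruby 1.9 or later, where ?x evaluates to a one-character string (in Ruby 1.8 it evaluated to the integer character code instead):

```ruby
char = ?x          # character literal; a one-character String on Ruby 1.9+
code = char.ord    # the character code, as Emacs Lisp's ?x would give directly

puts char.inspect
puts code
puts code.chr      # back from the code to the character
```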
