tfpt review "/shelveset:EncodingsFinal;REDMOND\tomat"
Outer DLR:
-
Adds Invariant, Ensures, Result, Parameter and Out stubs to
ContractUtils mimicking Dev10 contracts. These allow us to specify
post-conditions and object invariants in code rather than comments.
Ruby:
-
Implements infrastructure for the $KCODE variable. Only three
encodings can be assigned to $KCODE (UTF8, SJIS, EUC). These encodings
are implemented as special encodings (aka “k-codings”, the
RubyEncoding.KCode* singletons) and need to be special-cased. For
example, String#size on a string containing a single 2-byte UTF8
character returns 1 if its encoding is UTF8, but 2 if it is KCodeUTF8.
This emulates MRI 1.8, where strings have no associated encoding.
-
$KCODE is generally considered obsolete and is not available
in the Silverlight build.
-
Replaces List<byte> and StringBuilder MutableString
representations with byte[] and char[]. Reimplements basic
char/byte/string buffer operations and moves them to Utils.cs.
-
Improves the implementation of MutableString.GetHashCode - the
hash code is now cached on the string until the string is modified. The
hash code calculation includes the encoding if there are any non-ASCII
characters in the string. Otherwise the encoding is not part of the
hash.
-
Adds support for multi-byte identifiers in source code if the
file has a non-binary encoding or a k-coding. Any non-ASCII character is
considered a lowercase letter for the purpose of identifier
classification (constant, global var, instance var, class var, local,
method name).
-
Fixes \xXX escapes in encoded strings - subsequent escaped
bytes can form a single character or part of a character. In both cases
the string’s representation is switched to binary so that no information
is lost. StringContentBuilder takes care of constructing such strings.
At runtime a string with an incomplete character suffix can be
concatenated with a string containing the missing part of the character,
and together these bytes might form a valid character.
-
Adds a bunch of unit tests for MutableString and encodings.
-
Reimplements String#dump and String#inspect to handle encoded
strings correctly. Moves the implementation to MutableString so that we
can use it as a debug view for MutableString as well.
-
Fixes specs - KCODE was set to UTF8 by one spec and not
restored, which affected subsequent specs.
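
As a rough illustration of the \xXX item above, the following sketch
shows why keeping the raw bytes matters when a character is split across
two strings. It is written against modern Ruby (1.9 or later), not
against IronRuby internals: ASCII-8BIT strings stand in for the binary
representation, and the variable names are made up for this example.
It does not exercise StringContentBuilder or MutableString.

```ruby
# Sketch only: plain Ruby strings standing in for the binary
# representation described in the \xXX item. Names are hypothetical.

# "\xC3\xA9" is the UTF-8 encoding of U+00E9; split it across two strings.
head = "abc\xC3".b   # ends with an incomplete character
tail = "\xA9def".b   # starts with the missing continuation byte

# Neither half is valid UTF-8 on its own.
head_valid = head.dup.force_encoding(Encoding::UTF_8).valid_encoding?
tail_valid = tail.dup.force_encoding(Encoding::UTF_8).valid_encoding?

# Concatenating the raw bytes loses nothing, so the result can be
# reinterpreted as UTF-8 and the two bytes now form one character.
joined = (head + tail).force_encoding(Encoding::UTF_8)

puts head_valid              # false
puts tail_valid              # false
puts joined.valid_encoding?  # true
puts joined.bytesize         # 8
puts joined.size             # 7
```

The last two lines also mirror the String#size distinction mentioned
for k-codings: the same bytes count as 8 when treated as bytes and as 7
when the 2-byte sequence is treated as one character.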
Tomas