Ruby 1.8 vs 1.9

On Friday, November 26, 2010 05:51:38 am Phillip G. wrote:

On Fri, Nov 26, 2010 at 1:42 AM, David M. [email protected] wrote:

I’m really curious why anyone would go with an IBM mainframe for a
greenfield system, let alone pick EBCDIC when ASCII is fully supported.

Because that’s how the other applications written on the mainframe the
company bought 20, 30, 40 years ago expect their data, and the same
code still runs.

In other words, not quite greenfield, or at least, a somewhat
different
sense of greenfield.

But I guess that explains why you’re on a mainframe at all. Someone put
their
data there 20, 30, 40 years ago, and you need to get at that data,
right?

Legacy systems like that have so much money invested in them, with
code poorly understood (not necessarily because it’s bad code, but
because the original author has retired 20 years ago),

Which implies bad code, bad documentation, or both. Yes, having the
original
author available tends to make things easier, but I’m not sure I’d know
what
to do with the code I wrote 1 year ago, let alone 20, unless I document
the
hell out of it.

Want perpetual job security? Learn COBOL.

I considered that…

It’d have to be job security plus a large enough paycheck I could either
work
very part-time, or retire in under a decade. Neither of these seems
likely, so
I’d rather work with something that gives me job satisfaction, which is
why
I’m doing Ruby.

The big picture is that IEEE floating point is solidly grounded in
mathematics regarding infinity. Phillip wants to convince us that this
is not the case. He wants us to believe that the design of floating
point regarding infinity is wrong and that he knows better. He is
mistaken. That is all you need to know. Details follow.

The most direct refutation of his claims comes from the actual reason
why infinity was included in floating point:

http://docs.sun.com/source/806-3568/ncg_goldberg.html#918

Infinity prevents wildly incorrect results. It also removes the need
to check certain special cases.
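
To make that concrete in Ruby terms, here is a minimal sketch (Ruby’s Float
is an IEEE 754 double; Integer division is a separate story and still
raises):

    1.0 / 0.0          #=> Infinity
    -1.0 / 0.0         #=> -Infinity
    1 / 0              #=> raises ZeroDivisionError

    # With infinity, x / (x*x + 1) stays sensible even when x*x overflows:
    x = 1.0e200
    x * x              #=> Infinity
    x / (x * x + 1)    #=> 0.0, a good approximation of the true 1.0e-200
    # If overflow instead saturated at the largest finite float, the same
    # expression would come out around 5.6e-109 -- wildly wrong.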

Now it happens that floating point is backed by a mathematical model:
the extended real line. Phillip tells us that the extended real line
is only useful for the 1% of programmers who are mathematicians. He is
wrong. It is used every time infinity prevents an incorrect result or
simplifies a calculation. The mathematics behind floating point design
is slightly more than elementary, but that does not mean every
programmer is required to have full knowledge of it.

What follows is an examination of Phillip’s descent into absurdity,
apparently caused by a compelling need to justify the mantras he learned
in high school. If you are interested in the psychological phenomenon
of cognitive dissonance, or if you still think that Phillip is being
coherent, then keep reading.

This conversation began when Phillip said about 1/0,

It cannot be infinity. It does, quite literally not compute. There’s
no room for interpretation, it’s a fact of (mathematical) life that
something divided by nothing has an undefined result. It doesn’t
matter if it’s 0, 0.0, or -0.0. Undefined is undefined.

That other languages have the same issue makes matters worse, not
better (but at least it is consistent, so there’s that).

It’s clear here that Phillip is unaware that IEEE floating point was
designed to approximate the affinely extended real numbers, which have
a rigorous definition of infinity along with operations upon it.

Floating point infinity obeys all the rules laid out there. Also
notice the last paragraph in that link.
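
A few Ruby one-liners showing that behaviour (Float::INFINITY is 1.9; in
1.8 you can spell it 1.0/0.0):

    inf = Float::INFINITY
    inf + 1      #=> Infinity  (x + inf = inf on the extended reals)
    1.0 / inf    #=> 0.0       (x / inf = 0)
    inf - inf    #=> NaN       (undefined on the extended reals, so IEEE says NaN)
    0.0 / 0.0    #=> NaN       (0/0 is likewise undefined; only x/0 with x != 0 gives ±Infinity)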

The IEEE standard, however, does not define how mathematics work.
Mathematics does that. In math, x_0/0 is undefined. It is not
infinity (David kindly explained the difference between limits and
numbers), it is not negative infinity, it is undefined. Division by
zero cannot happen…

So, from a purely mathematical standpoint, the IEEE 754 standard is
wrong by treating the result of division by 0.0 any different than
dividing by 0…

Here Phillip further confirms that he is unaware that IEEE used the
mathematical definition of the extended reals. He thinks infinity was
defined on the whim of the IEEE designers. No, mathematics told them
how it worked.

This conversation only continues because Phillip is trying desperately to
cover up his ignorance rather than simply acknowledging it and moving
on.

I was polite when I corrected him the first time; however, when he
ignored this correction along with a similar one by Adam, obstinately
repeating his mistaken belief instead, directness became necessary. For
whatever reason he is compelled to “fake” expertise in this area despite
being repeatedly exposed for doing so. To wit:

Infinity defined this way has solid mathematical meaning and is
established on a firm foundation, described in the link above.

A firm foundation that is not used in algebraic math.

This sentence is not even meaningful. What is “algebraic math”? That
phrase makes no sense, especially to a mathematician. The extended
reals are of course an algebraic structure with algebraic properties,
so whatever “algebraic math” means here must apply to the extended
reals.

Right, IEEE does not define how mathematics works. IEEE took the
mathematical definition and properties of infinity and
incorporated it into the standard. Clearly, you were unaware of
this and repeatedly ignored the information offered to you about
it.

It took a definition and a set of properties. If we are
splitting hairs, let’s do it properly, at least.

The affinely extended reals is the two-point compactification of the
real line. The real projective line is the one-point compactification
of the real line. These compactifications are unique. In a desperate
display of backpedaling, Phillip only succeeds in confirming his
ignorance of the topic about which he claims expertise.

Pal, in algebraic maths, division by zero is undefined. End of
story. We are talking about algebraic math here…

More nonsensical “algebraic math” terminology. What is this? Do you
mean algebraic numbers? No, you can’t mean that, since floating point
is used for approximating transcendentals as well. Again, the extended
reals are an algebraic structure with algebraic properties. Your
introduction of the term “algebraic math” is just more backpedaling
done in a manifestly incompetent way. In trying to move the goalposts,
the goalposts fell on your head. As if that wasn’t bad enough, you
absurdly claim that I was moving goalposts.

For more on what happened to Phillip, see the Dunning-Kruger
effect. Don’t let it happen to you.

On Sat, Nov 27, 2010 at 9:04 AM, David M. [email protected]
wrote:

sense of greenfield.
You don’t expect anyone to throw their older mainframes away, do you? :wink:

But I guess that explains why you’re on a mainframe at all. Someone put their
data there 20, 30, 40 years ago, and you need to get at that data, right?

Oh, don’t discard mainframes. For a corporation the size of SAP (or
needing SAP software), a mainframe is still the ideal hardware to
manage the enormous databases collected over the years.

And mainframes with vector CPUs are ideal for all sorts of simulations
engineers have to do (like aerodynamics), or weather research.

Legacy systems like that have so much money invested in them, with
code poorly understood (not necessarily because it’s bad code, but
because the original author has retired 20 years ago),

Which implies bad code, bad documentation, or both. Yes, having the original
author available tends to make things easier, but I’m not sure I’d know what
to do with the code I wrote 1 year ago, let alone 20, unless I document the
hell out of it.

It gets worse 20 years down the line: the techniques used and the state of
the art back then are forgotten now. Nobody uses GOTO any more (or should,
anyway), error handling is done with exceptions these days instead of
error codes, and TDD didn’t even exist as a technique.

Together with a very, very conservative attitude, changes are
difficult to deal with, if they can be implemented at all.

Assuming the source code still exists, anyway.


Phillip G.

Though the folk I have met,
(Ah, how soon!) they forget
When I’ve moved on to some other place,
There may be one or two,
When I’ve played and passed through,
Who’ll remember my song or my face.

On Sat, Nov 27, 2010 at 7:50 PM, David M. [email protected]
wrote:

I suppose I expected people to be developing modern Linux apps that just
happen to compile on that hardware.

Linux is usually not the OS the vendor supports. Keep in mind, a day
of lost productivity on this kind of systems means losses in the
millions of dollars area.

But then, corporations the size of Google tend to store their information
distributed on cheap PC hardware.

That’s because they were incorporated when there was such a thing as
“cheap PC hardware”. Google is a young corporation, even in IT. And they need
loads of custom code to make their search engine and datacenters
perform and scale, too.

And mainframes with vector CPUs are ideal for all sorts of simulations
engineers have to do (like aerodynamics), or weather research.

When you say “ideal”, do you mean they actually beat out the cluster of
commodity hardware I could buy for the same price?

Sure, if you can shell out for about 14 000 Xeon CPUs and 7 000 Tesla
GPGPUs (Source: Tianhe-1 - Wikipedia ).

All three of which suggest to me that in many cases, an actual greenfield
project would be worth it. IIRC, there was a change to the California minimum
wage that would take 6 months to implement and 9 months to revert because it
was written in COBOL – but could the same team really write a new payroll
system in 15 months? Maybe, but doubtful.

So, you’d bet a corporation the size of Exxon Mobil, Johnson &
Johnson, General Electric and similar, just because you think it is
easier to do changes 40 years later in an unproven, unused, upstart
language?

The clocks in the sort of shops that still run mainframes tick very
differently from what you or I are used to.

But it’s still absurdly wasteful. A rewrite would pay for itself with only a
few minor changes that’d be trivial in a sane system, but major year-long
projects with the legacy system.

If the rewrite would pay for itself in the short term, then why hasn’t
it been done?


Phillip G.

Though the folk I have met,
(Ah, how soon!) they forget
When I’ve moved on to some other place,
There may be one or two,
When I’ve played and passed through,
Who’ll remember my song or my face.

On Saturday, November 27, 2010 11:41:59 am Phillip G. wrote:

On Sat, Nov 27, 2010 at 9:04 AM, David M. [email protected] wrote:

On Friday, November 26, 2010 05:51:38 am Phillip G. wrote:

On Fri, Nov 26, 2010 at 1:42 AM, David M. [email protected]
wrote:

You don’t expect anyone to throw their older mainframes away, do you? :wink:

I suppose I expected people to be developing modern Linux apps that just
happen to compile on that hardware.

But I guess that explains why you’re on a mainframe at all. Someone put
their data there 20, 30, 40 years ago, and you need to get at that data,
right?

Oh, don’t discard mainframes. For a corporation the size of SAP (or
needing SAP software), a mainframe is still the ideal hardware to
manage the enormous databases collected over the years.

Well, now that it’s been collected, sure – migrations are painful.

But then, corporations the size of Google tend to store their
information
distributed on cheap PC hardware.

And mainframes with vector CPUs are ideal for all sorts of simulations
engineers have to do (like aerodynamics), or weather research.

When you say “ideal”, do you mean they actually beat out the cluster of
commodity hardware I could buy for the same price?

the art back then are forgotten now. Nobody uses GOTO any more (or should,
anyway), error handling is done with exceptions these days instead of
error codes, and TDD didn’t even exist as a technique.

Together with a very, very conservative attitude, changes are
difficult to deal with, if they can be implemented at all.

Assuming the source code still exists, anyway.

All three of which suggest to me that in many cases, an actual
greenfield
project would be worth it. IIRC, there was a change to the California
minimum
wage that would take 6 months to implement and 9 months to revert
because it
was written in COBOL – but could the same team really write a new
payroll
system in 15 months? Maybe, but doubtful.

But it’s still absurdly wasteful. A rewrite would pay for itself with
only a
few minor changes that’d be trivial in a sane system, but major
year-long
projects with the legacy system.

So, yeah, job security. I’d just hate my job.

Robert K. wrote in post #963807:

But that basically is my point. In order to make your program
comprehensible, you have to add extra incantations so that strings are
tagged as UTF-8 everywhere (e.g. when opening files).

However this in turn adds nothing to your program or its logic, apart
from preventing Ruby from raising exceptions.

Checking input and ensuring that data reaches the program in proper
ways is generally good practice for robust software.

But that’s not what Ruby does!

If you do
s1 = File.open("foo", "r:UTF-8").gets
it does not check that the data is UTF-8. It just adds a tag saying
that it is.
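
A quick way to see that (the file name is made up; Ruby 1.9):

    # Write bytes that are Latin-1, not valid UTF-8:
    File.open("latin1.txt", "wb") { |f| f.write("sch\xF6n") }

    s = File.open("latin1.txt", "r:UTF-8") { |f| f.gets }
    s.encoding          #=> #<Encoding:UTF-8>  -- tagged on the way in
    s.valid_encoding?   #=> false              -- but nobody actually checked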

Then later, when you get s2 from somewhere else, and have a line like s3
= s1 + s2, it might raise an exception if the encodings are different.
Or it might not, depending on the actual content of the strings at that
time.

Say s2 is a string read from a template. It may work just fine, as long
as s2 contains only ASCII characters. But later, when you decide to
translate the program and add some non-ASCII characters into the
template, it may blow up.
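
Concretely (hypothetical strings standing in for s1 and the template; Ruby
1.9):

    s1 = "caf\u00e9"                            # a literal with \u is tagged UTF-8
    s2 = "template text".encode("ISO-8859-1")   # ASCII-only, tagged ISO-8859-1
    s1 + s2   # works: s2 happens to contain only ASCII characters

    s2 = "sch\u00f6n".encode("ISO-8859-1")      # add one real Latin-1 character...
    s1 + s2   # ...and now it raises Encoding::CompatibilityError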

If it blew up on the invalid data, I’d accept that. If it blew up
whenever two strings of different encodings encounter, I’d accept that.
But to have your program work through sheer chance, only to blow up some
time later when it encounters a different input stream - no, that sucks.

In that case, I would much rather the program didn’t crash, but at least
carried on working (even in the garbage-in-garbage-out sense).

Brian, it seems you want to avoid the complex matter of i18n - by
ignoring it. But if you work in a situation where multiple encodings
are mixed you will be forced to deal with it - sooner or later.

But you’re never going to want to combine two strings of different
encodings without transcoding them to a common encoding, as that
wouldn’t make sense.

So either:

  1. Your program deals with the same encoding from input through to
    output, in which case there’s nothing to do

  2. You transcode at the edges into and out of your desired common
    encoding

Neither approach requires each individual string to carry its encoding
along with it.
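
A minimal sketch of option 2, transcoding at the edges (file names and the
Latin-1 choice are made up; Ruby 1.9):

    # Read Latin-1 data, but hand the rest of the program UTF-8 strings:
    File.open("legacy-input.txt", "r:ISO-8859-1:UTF-8") do |f|
      f.each_line do |line|
        line.encoding   #=> #<Encoding:UTF-8>, everywhere inside the program
      end
    end

    # Transcode back on the way out:
    File.open("report.txt", "w:ISO-8859-1") { |f| f.puts "sch\u00f6n" }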

On Saturday, November 27, 2010 02:47:12 pm Phillip G. wrote:

On Sat, Nov 27, 2010 at 7:50 PM, David M. [email protected] wrote:

I suppose I expected people to be developing modern Linux apps that just
happen to compile on that hardware.

Linux is usually not the OS the vendor supports. Keep in mind, a day
of lost productivity on this kind of systems means losses in the
millions of dollars area.

In other words, you need someone who will support it, and maybe someone
who’ll
accept that kind of risk. None of the Linux vendors are solid enough? Or
is it
that they don’t support mainframes?

And mainframes with vector CPUs are ideal for all sorts of simulations
engineers have to do (like aerodynamics), or weather research.

When you say “ideal”, do you mean they actually beat out the cluster of
commodity hardware I could buy for the same price?

Sure, if you can shell out for about 14 000 Xeon CPUs and 7 000 Tesla
GPGPUs (Source: Tianhe-1 - Wikipedia ).

From that page:

“Both the original Tianhe-1 and Tianhe-1A use a Linux-based operating
system… Each blade is composed of two compute nodes, with each compute
node
containing two Xeon X5670 6-core processors and one Nvidia M2050 GPU
processor.”

I’m not really seeing a difference in terms of hardware.

All three of which suggest to me that in many cases, an actual greenfield
project would be worth it. IIRC, there was a change to the California
minimum wage that would take 6 months to implement and 9 months to
revert because it was written in COBOL – but could the same team really
write a new payroll system in 15 months? Maybe, but doubtful.

So, you’d bet the corporation

Nope, which is why I said “doubtful.”

just because you think it is
easier to do changes 40 years later in an unproven, unused, upstart
language?

Sorry, “unproven, unused, upstart”? Which language are you talking
about?

But it’s still absurdly wasteful. A rewrite would pay for itself with
only a few minor changes that’d be trivial in a sane system, but major
year-long projects with the legacy system.

If the rewrite would pay for itself in the short term, then why hasn’t
it been done?

The problem is that it doesn’t. What happens is that those “few minor
changes”
get written off as “too expensive”, so they don’t happen. Every now and
then,
it’s actually worth the expense to make a “drastic” change anyway, but
at that
point, again, 15 months versus a greenfield rewrite – the 15 months
wins.

So it very likely does pay off in the long run – being flexible makes
good
business sense, and sooner or later, you’re going to have to push
another of
those 15-month changes. But it doesn’t pay off in the short run, and
it’s hard
to predict how long it will be until it does pay off. The best you can
do is
say that it’s very likely to pay off someday, but modern CEOs get
rewarded in
the short term, then take their pensions and let the next guy clean up
the
mess, so there isn’t nearly enough incentive for long-term thinking.

And I’m not sure I could make a solid case that it’d pay for itself
eventually. I certainly couldn’t do so without looking at the individual
situation. Still wasteful, but maybe not worth fixing.

Also, think about the argument you’re using here. Why hasn’t it been
done? I
can think of a few reasons, some saner than others, but sometimes the
answer
to “Why hasn’t it been done?” is “Everybody was wrong.” Example: “If it
was
possible to give people gigabytes of email storage for free, why hasn’t
it
been done?” Then Gmail did, and the question became “Clearly it’s
possible to
give people gigabytes of email storage for free. Why isn’t Hotmail doing
it?”

On Wed, Nov 24, 2010 at 11:07 AM, James Edward G. II
[email protected] wrote:

On Nov 24, 2010, at 9:47 AM, Phillip G. wrote:

Convert your strings to UTF-8 at all times, and you are done. You have
to check for data integrity anyway, so you can do that in one go.

Thank you for being the voice of reason.

I’ve fought against Brian enough in the past over this issue, that I try to stay
out of it these days. However, his arguments always strike me as wanting to
unlearn what we have learned about encodings.

We can’t go back. Different encodings exist. At least Ruby 1.9 allows us to work
with them.

My experience with 1.9 so far is that some of my ruby scripts have
become much faster. I have other scripts which have needed to deal
with a much wider range of characters than “standard ascii”. I got
those string-related scripts working fine in 1.8. They all seem to
break in 1.9.

In my own opinion, the problem isn’t 1.9, is that I wrote these
string-handling scripts in ruby before ruby really supported all the
characters I had to deal with. I look forward to getting my scripts
switched over to 1.9, but there’s no question that getting to 1.9 is
going to require a bunch of work from me. That’s just the way it is.
Not the fault of ruby 1.9, but it’s still some work to fix the
scripts.

On Sunday, November 28, 2010 08:00:18 am Phillip G. wrote:

On Sun, Nov 28, 2010 at 1:56 AM, David M. [email protected] wrote:

In other words, you need someone who will support it, and maybe someone
who’ll accept that kind of risk. None of the Linux vendors are solid
enough? Or is it that they don’t support mainframes?

Both, and the Linux variant you use has to be certified by the
hardware vendor, too. Essentially, a throwback to the UNIX
workstations of yore: if you run something uncertified, you don’t get
the support you paid for in the first place.

Must be some specific legacy systems, because IBM does seem to be
supporting,
or at least advertising, Linux on System Z.

get them to play well with each other (like concurrency, and avoiding
bottlenecks that lead to a hold-up in several nodes of your cluster).

Probably. You originally called this a “Mainframe”, and that’s what
confused
me – it definitely seems to be more a cluster than a mainframe, in
terms of
hardware and software.

Sorry, “unproven, unused, upstart”? Which language are you talking about?

Anything that isn’t C, ADA or COBOL. Or even older.

Lisp, then?

This is a very,
very conservative mindset, where not even Java has a chance.

If age is the only consideration, Java is only older than Ruby by a few
months, depending how you count.

I’m not having a problem with it being a conservative mindset, but it
seems
irrationally so. Building a mission-critical system which is not allowed
to
fail out of a language like C, where an errant pointer can corrupt data
in an
entirely different part of the program (let alone expose
vulnerabilities),
seems much riskier than the alternatives.

About the strongest argument I can see in favor of something like C over
something like Lisp for a greenfield project is that it’s what everyone
knows,
it’s what the schools are teaching, etc. Of course, the entire reason
the
schools are teaching COBOL is that the industry demands it.

Don’t forget the engineering challenge. Doing the Great Rewrite for
software that’s 20 years in use (or even longer), isn’t something that
is done on a whim, or because this new-fangled “agile movement” is
something the programmers like.

I’m not disputing that.

Unless there is a very solid business case (something on the level of
“if we don’t do this, we will go bankrupt in 10 days” or similarly
drastic), there is no incentive to fix what ain’t broke (for certain
values of “ain’t broke”, anyway).

This is what I’m disputing. This kind of thinking is what allows
companies
like IBM to be completely blindsided by companies like Microsoft.

Google has a big incentive, and a big benefit going for it:

Which doesn’t change my core point. After all:

a) Google wants your data, so they can sell you more and better ads.

What’s Microsoft’s incentive for running Hotmail at all? I have to
imagine
it’s a similar business model.

b) The per MB cost of hard drives came down significantly in the
last 10 years.

Yes, but Google was the first to offer this. And while it makes sense in
hindsight, when it first came out, people were astonished. No one
immediately
said “Oh, this makes business sense.” They were too busy rushing to
figure out
how they could use this for their personal backup, since gigabytes of
online
storage for free was unprecedented.

Then, relatively quickly, everyone else did the same thing, because
people
were leaving Hotmail for Gmail for the storage alone, and no one wanted
to be
the “10 mb free” service when everyone else was offering over a hundred
times
as much.

I’m certainly not saying people should do things just because they’re
cool, or
because programmers like them. Clearly, there has to be a business
reason. But
the fact that no one’s doing it isn’t a reason to assume it’s a bad
idea.

On Sun, Nov 28, 2010 at 1:56 AM, David M. [email protected]
wrote:

In other words, you need someone who will support it, and maybe someone who’ll
accept that kind of risk. None of the Linux vendors are solid enough? Or is it
that they don’t support mainframes?

Both, and the Linux variant you use has to be certified by the
hardware vendor, too. Essentially, a throwback to the UNIX
workstations of yore: if you run something uncertified, you don’t get
the support you paid for in the first place.

“Both the original Tianhe-1 and Tianhe-1A use a Linux-based operating
system… Each blade is composed of two compute nodes, with each compute node
containing two Xeon X5670 6-core processors and one Nvidia M2050 GPU
processor.”

I’m not really seeing a difference in terms of hardware.

We are probably talking at cross purposes here:
You can build a vector CPU cluster out of commodity hardware, but it
involves a) a lot of hardware and b) a lot of customization work to
get them to play well with each other (like concurrency, and avoiding
bottlenecks that lead to a hold-up in several nodes of your cluster).

Sorry, “unproven, unused, upstart”? Which language are you talking about?

Anything that isn’t C, ADA or COBOL. Or even older. This is a very,
very conservative mindset, where not even Java has a chance.

So it very likely does pay off in the long run – being flexible makes good
business sense, and sooner or later, you’re going to have to push another of
those 15-month changes. But it doesn’t pay off in the short run, and it’s hard
to predict how long it will be until it does pay off. The best you can do is
say that it’s very likely to pay off someday, but modern CEOs get rewarded in
the short term, then take their pensions and let the next guy clean up the
mess, so there isn’t nearly enough incentive for long-term thinking.

Don’t forget the engineering challenge. Doing the Great Rewrite for
software that’s 20 years in use (or even longer), isn’t something that
is done on a whim, or because this new-fangled “agile movement” is
something the programmers like.

Unless there is a very solid business case (something on the level of
“if we don’t do this, we will go bankrupt in 10 days” or similarly
drastic), there is no incentive to fix what ain’t broke (for certain
values of “ain’t broke”, anyway).

Also, think about the argument you’re using here. Why hasn’t it been done? I
can think of a few reasons, some saner than others, but sometimes the answer
to “Why hasn’t it been done?” is “Everybody was wrong.” Example: “If it was
possible to give people gigabytes of email storage for free, why hasn’t it
been done?” Then Gmail did, and the question became “Clearly it’s possible to
give people gigabytes of email storage for free. Why isn’t Hotmail doing it?”

Google has a big incentive, and a big benefit going for it:
a) Google wants your data, so they can sell you more and better ads.
b) The per MB cost of hard drives came down significantly in the
last 10 years. For my external 1TB HD I paid about 50 bucks, and for
my internal 500GB 2.5" HD I paid about 50 bucks. For that kind of
money, you couldn’t buy a 500 GB HD 5 years ago.

Without cheap storage, free email accounts with Gigabytes of storage
are pretty much impossible.

CUDA and GPGPUs have become available only in the last few years, and
only because GPUs have become insanely powerful and insanely cheap at
the same time.

If you were building the architecture that requires mainframes today,
I doubt anyone would buy a Cray without some very serious
considerations (power consumption, ease of maintenance, etc) in favor
of the Cray.


Phillip G.

Though the folk I have met,
(Ah, how soon!) they forget
When I’ve moved on to some other place,
There may be one or two,
When I’ve played and passed through,
Who’ll remember my song or my face.

On Sun, Nov 28, 2010 at 5:33 PM, David M. [email protected]
wrote:

Must be some specific legacy systems, because IBM does seem to be supporting,
or at least advertising, Linux on System Z.

Oh, they do. But it’s this specific Linux, and you get locked into it.
Compile the kernel yourself, and you lose support.

And, of course, IBM does that to keep their customers locked in. While
Linux is open source, it’s another angle for IBM to stay in the game.
Not all that successful, considering that mainframes are pretty much a
dying breed, but it keeps this whole sector on life support.

Probably. You originally called this a “Mainframe”, and that’s what confused
me – it definitely seems to be more a cluster than a mainframe, in terms of
hardware and software.

Oh, it is. You can’t build a proper mainframe out of off the shelf
components, but a mainframe is a cluster of CPUs and memory, anyway,
so you can “mimic” the architecture.

Sorry, “unproven, unused, upstart”? Which language are you talking about?

Anything that isn’t C, ADA or COBOL. Or even older.

Lisp, then?

If there’s commercial support, then, yes. The environment LISP comes
from is the AI research in MIT, which was done on mainframes, way back
when.

This is a very,
very conservative mindset, where not even Java has a chance.

If age is the only consideration, Java is only older than Ruby by a few
months, depending how you count.

It isn’t. Usage on mainframes is a component, too. And perceived
stability and roadmap safety (a clear upgrade path is desired quite a
bit, I wager).

And, well, Java and Ruby are young languages, all told. Mainframes
exist since the 1940s at the very least, and that’s the perspective
that enabled “Nobody ever got fired for buying IBM [mainframes]”.

I’m not having a problem with it being a conservative mindset, but it seems
irrationally so. Building a mission-critical system which is not allowed to
fail out of a language like C, where an errant pointer can corrupt data in an
entirely different part of the program (let alone expose vulnerabilities),
seems much riskier than the alternatives.

That is a problem of coding standards and practices. Another reason
why change in these sorts of systems is difficult to achieve. Now
imagine a language like Ruby that comes with things like reflection,
duck typing, and dynamic typing.

About the strongest argument I can see in favor of something like C over
something like Lisp for a greenfield project is that it’s what everyone knows,
it’s what the schools are teaching, etc. Of course, the entire reason the
schools are teaching COBOL is that the industry demands it.

A vicious cycle, indeed. Mind, for system level stuff C is still the
goto language, but not for anything that sits above that. At least,
IMO.

Unless there is a very solid business case (something on the level of
“if we don’t do this, we will go bankrupt in 10 days” or similarly
drastic), there is no incentive to fix what ain’t broke (for certain
values of “ain’t broke”, anyway).

This is what I’m disputing. This kind of thinking is what allows companies
like IBM to be completely blindsided by companies like Microsoft.

Assuming that the corporation is actually an IT shop. Procter &
Gamble, or ThyssenKrupp aren’t. For them, IT is supporting the actual
business, and is much more of a cost center than a way to stay
competitive.

Or do you care if the steel beams you buy by the ton, or the cleaner
you buy are produced by a company that does its ERP on a mainframe or
a beowulf cluster?

Google has a big incentive, and a big benefit going for it:

Which doesn’t change my core point. After all:

a) Google wants your data, so they can sell you more and better ads.

What’s Microsoft’s incentive for running Hotmail at all? I have to imagine
it’s a similar business model.

Since MS doesn’t seem to have a clue, either…

Historically, MS bought Hotmail because everybody else started
offering free email accounts, and not just ISPs.

And Hotmail still smells of “me, too”-ism.

b) The per MB cost of hard drives came down significantly in the
last 10 years.

Yes, but Google was the first to offer this. And while it makes sense in
hindsight, when it first came out, people were astonished. No one immediately
said “Oh, this makes business sense.” They were too busy rushing to figure out
how they could use this for their personal backup, since gigabytes of online
storage for free was unprecedented.

Absolutely. And Google managed to give possible AdWords customers
another reason to use AdSense: “Look, there’s a million affluent,
tech-savvy people using our mail service, which allows us to mine the
data and to show your ads that much more effectively!”

Then, relatively quickly, everyone else did the same thing, because people
were leaving Hotmail for Gmail for the storage alone, and no one wanted to be
the “10 mb free” service when everyone else was offering over a hundred times
as much.

That, and Google was the cool kid on the block back then. Which counts
for quite a bit, too. And the market of freemail offerings was rather
stale, until GMail shook it up, and got lots of mind share really
fast.

But most people stuck with their AOL mail addresses, since they didn’t
care about storage, but cared about stuff working. The technorati
quickly switched (I’m guilty as charged), but aunts, and granddads
kept their AOL, EarthLink, or Yahoo! accounts.

I’m certainly not saying people should do things just because they’re cool, or
because programmers like them. Clearly, there has to be a business reason. But
the fact that no one’s doing it isn’t a reason to assume it’s a bad idea.

Of course. But if a whole sector, a whole user base, says “Thanks, but
no thanks”, it has its reasons, too. Cost is one, and the human nature
of liking stability and disliking change plays into it, as well.


Phillip G.

Though the folk I have met,
(Ah, how soon!) they forget
When I’ve moved on to some other place,
There may be one or two,
When I’ve played and passed through,
Who’ll remember my song or my face.

On 26.11.2010 01:42, David M. wrote:

16 bits.
The JLS is a bit difficult to read IMHO. Characters are 16 bit and a
single character covers the range of code points 0000 to FFFF.

http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.1

Characters with code points greater than FFFF are called “supplementary
characters” and while UTF-16 provides encodings for them as well, these
need two code units (four bytes). They write “The Java programming
language represents text in sequences of 16-bit code units, using the
UTF-16 encoding.”:

http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#95413

IMHO this is not very precise: all calculations based on char cannot
directly represent the supplementary characters. These use just a
subset of UTF-16. If you want to work with supplementary characters
things get really awful. Then you need methods like this one

http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#toChars(int)

And if you stuff this sequence into a String, all of a sudden
String.length() no longer returns the length in characters, which is
in line with what the JavaDoc states

http://download.oracle.com/javase/6/docs/api/java/lang/String.html#length()

Unfortunately the majority of programs I have seen never take this into
account and use String.length() as “length in characters”. This awful
mixture becomes apparent in the JavaDoc of class Character, which
explicitly states that there are two ways to deal with characters:

  1. type char (no supplementary supported)
  2. type int (with supplementary)

http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#unicode
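
For comparison, Ruby 1.9 sidesteps the char/int split because String
indexes characters rather than UTF-16 code units (a sketch; U+10437 is a
Deseret letter outside the BMP):

    s = "\u{10437}"                  # one supplementary character
    s.length                         #=> 1
    s.encode("UTF-16BE").bytesize    #=> 4, i.e. a surrogate pair (D801 DC37)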

You can produce corrupt strings and slice into a half-character in
Java just as you can in Ruby 1.8.

Wait, how?

You can convert a code point above FFFF via Character.toChars() (which
returns a char[] of length 2) and truncate it to 1. But: the resulting
sequence isn’t actually invalid since all values in the range 0000 to
FFFF are valid characters. This isn’t really robust. Even though the
docs say that the longest matching sequence is to be considered during
decoding, there is no reliable way to determine whether d80d dd53
represents a single character (code point 013553) or two separate
characters (code points d80d and dd53).
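
For comparison, the Ruby 1.8 flavour of slicing into a half-character looks
like this (a sketch with the bytes written out):

    s = "caf\xc3\xa9"   # the UTF-8 bytes of "café"; "é" is the two bytes \xc3\xa9
    s[0, 4]             # 1.8 (or an untagged string in 1.9): "caf\xc3" -- half a character
    s.force_encoding("UTF-8")[0, 4]   # 1.9, tagged: "café" -- indexes characters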

If you like you can play around a bit with this:

I mean, yes, you can deliberately build strings out of corrupt data, but if
you actually work with complete strings and string concatenation, and you
aren’t doing crazy JNI stuff, and you aren’t digging into the actual bits of
the string, I don’t see how you can create a truncated string.

Well, you can (see above) but unfortunately it is still valid. It just
happens to represent a different sequence.

Kind regards

robert

On Sunday, November 28, 2010 11:19:06 am Phillip G. wrote:

Probably. You originally called this a “Mainframe”, and that’s what
confused me – it definitely seems to be more a cluster than a
mainframe, in terms of hardware and software.

Oh, it is. You can’t build a proper mainframe out of off the shelf
components, but a mainframe is a cluster of CPUs and memory, anyway,
so you can “mimic” the architecture.

When I hear “mainframe”, I think of a combination of hardware and
software
(zOS) which you actually can’t get anywhere else, short of an emulator
(like
Hercules).

This is a very,
very conservative mindset, where not even Java has a chance.

If age is the only consideration, Java is only older than Ruby by a few
months, depending how you count.

It isn’t. Usage on mainframes is a component, too.

IBM does seem to be aggressively promoting not just Linux on mainframes,
but a
Unix subsystem and support for things like Java.

And perceived
stability and roadmap safety (a clear upgrade path is desired quite a
bit, I wager).

Is there “roadmap safety” in C, though?

And, well, Java and Ruby are young languages, all told. Mainframes
exist since the 1940s at the very least, and that’s the perspective
that enabled “Nobody ever got fired for buying IBM [mainframes]”.

Right, that’s why I mentioned Lisp. They’re old enough that I’d argue
the time
to be adopting is now, but I can see someone with a mainframe several
times
older wanting to wait and see.

I’m not having a problem with it being a conservative mindset, but it
seems irrationally so. Building a mission-critical system which is not
allowed to fail out of a language like C, where an errant pointer can
corrupt data in an entirely different part of the program (let alone
expose vulnerabilities), seems much riskier than the alternatives.

That is a problem of coding standards and practices.

There’s a limit to what you can do with that, though.

Another reason
why change in these sorts of systems is difficult to achieve. Now
imagine a language like Ruby that comes with things like reflection,
duck typing, and dynamic typing.

In practice, it doesn’t seem like any of these are as much of a problem
as the
static-typing people fear. Am I wrong?

Given the same level of test coverage, a bug that escapes through a Ruby
test
suite (particularly unit tests) might lead to something like an
“undefined
method” exception from a nil – relatively easy to track down. In Java,
it
might lead to NullPointerExceptions and the like. In C, it could lead to
anything, including silently corrupting other parts of the program.

Technically, it’s possible Ruby could do anything to any other part of
the
program via things like reflection – but this is trivial to enforce.
People
generally don’t monkey-patch core stuff, and monkey-patching is easy to
avoid,
easy to catch, and relatively easy to do safely in one place, and avoid
throughout the rest of your program.
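
The sort of contained monkey-patch meant here might look like this (the
names are made up; the point is that it lives in one greppable place):

    module CoreExt
      module Squish
        def squish
          gsub(/\s+/, " ").strip   # collapse runs of whitespace
        end
      end
    end

    String.send(:include, CoreExt::Squish)   # the one place core classes get touched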

Contrast to C – it’s not like you can avoid pointers, arrays, pointer
arithmetic, etc. And Ruby at least has encapsulation and namespacing –
I
really wouldn’t want to manage a large project in C.

About the strongest argument I can see in favor of something like C over
something like Lisp for a greenfield project is that it’s what everyone
knows, it’s what the schools are teaching, etc. Of course, the entire
reason the schools are teaching COBOL is that the industry demands it.

A vicious cycle, indeed.

I have to wonder if it would be worth it for any of these companies to
start
demanding Lisp. Ah, well.

Mind, for system level stuff C is still the
goto language, but not for anything that sits above that. At least,
IMO.

For greenfield system-level stuff, I’d be seriously considering
something like
Google’s Go. But my opinion probably isn’t worth much here, as I don’t
really
do system-level stuff if I can avoid it (which is almost always). If I
had to,
I’d pass as much off to userland as I could get away with.

Or do you care if the steel beams you buy by the ton, or the cleaner
you buy are produced by a company that does its ERP on a mainframe or
a beowulf cluster?

Not particularly, but I do care if someone else can sell me those beams
cheaper. Even just as a cost center, it matters how much it costs.

And who knows? Maybe someone else just implemented a feature that
actually
does matter to me. Maybe they anticipate when their customers need more
steel
and make them an offer then, or maybe they provide better and tighter
estimates as to when it’ll be ready and how long it’ll take to ship –
maybe
it’s an emergency, I need several tons RIGHT NOW, and someone else
manages
their inventory just a bit better, so they can get it to me days
earlier.

Granted, it’s a slower industry, so maybe spending years (or decades!)
on
changes like the above makes sense. Maybe no one is offering or asking
for the
features I’ve suggested – I honestly don’t know. But this is why it can
matter than one organization can implement a change in a few weeks, even
a few
months, while another would take years and will likely just give up.

But most people stuck with their AOL mail addresses, since they didn’t
care about storage, but cared about stuff working. The technorati
quickly switched (I’m guilty as charged), but aunts, and granddads
kept their AOL, EarthLink, or Yahoo! accounts.

Most of them, for awhile.

But even granddads have grandkids emailing them photos, so there goes
that 10
megs. Now they have to delete stuff, possibly download it and then
delete it.
A grandkid hears them complaining and suggests switching to Gmail.

On Sun, Nov 28, 2010 at 9:19 PM, David M. [email protected]
wrote:

Is there “roadmap safety” in C, though?

Since it is, technically, a standardized language, with defined
behavior in all cases (as if), it is.

Though, considering C++0x was supposed to be finished two years ago…

That is a problem of coding standards and practices.

There’s a limit to what you can do with that, though.

One of cost. Nobody wants to spend the amount of money that NASA
spends on the source for the Space Shuttle, but that code is
guaranteed bug-free. Not sure which language is used, though I guess
it’s Ada.

In practice, it doesn’t seem like any of these are as much of a problem as the
static-typing people fear. Am I wrong?

Nope. But perceived risk outweighs actual risk. See also: US policy
since 2001 vis a vis terrorism.

throughout the rest of your program.
You know that, I know that, but the CTO of Johnson and Johnson
doesn’t, and probably doesn’t care. Together with the usual
bureaucratic infighting and processes to change anything, you’ll be
SOL most of the time. Alas.

Contrast to C – it’s not like you can avoid pointers, arrays, pointer
arithmetic, etc. And Ruby at least has encapsulation and namespacing – I
really wouldn’t want to manage a large project in C.

Neither would I. But then again, there’s a lot of knowledge for
managing large C code bases. Just look at the Linux kernel, or Windows
NT.

and make them an offer then, or maybe they provide better and tighter
estimates as to when it’ll be ready and how long it’ll take to ship – maybe
it’s an emergency, I need several tons RIGHT NOW, and someone else manages
their inventory just a bit better, so they can get it to me days earlier.

Production, these days, is Just In Time. To stay with our steel
example: Long before the local county got around to nodding your
project through so that you can begin building, you already know what
components you need, and when (since you want to be under budget,
and on time, too), so you order 100 beams of several kinds of steel,
and your aggregates, and bribe the local customs people, long before
you actually need the hardware.

There’s (possibly) prototyping, testing (few 100MW turbines can be
built in series, because demands change with every application), and
nobody keeps items like steel beams (or even cars!) in storage
anymore. :wink:

Similar with just about anything that is bought in large quantities
and / or with loads of lead time (like the 787, or A380).

In a nutshell: being a day early, or even a month, doesn’t pay off
enough to make it worthwhile to restructure the whole company’s
production processes, just because J. Junior Developer found a way to
shave a couple of seconds off of the DB query to send off ordering
iron ore. :wink:

Granted, it’s a slower industry, so maybe spending years (or decades!) on
changes like the above makes sense. Maybe no one is offering or asking for the
features I’ve suggested – I honestly don’t know. But this is why it can
matter than one organization can implement a change in a few weeks, even a few
months, while another would take years and will likely just give up.

Since it takes years to build a modern production facility, it is
a slower industry, all around. IT is special in that it iterates through
hardware, software, and techniques much faster that the rest of the
world.

And an anecdote:
A large-ish steel works corp introduced a PLC system to monitor their
furnaces down to the centidegree Celsius, and the “recipe” down to the
gram. After a week, they deactivated the stuff, since the steel
produced wasn’t up to spec, and the veteran cookers created much
better steel, and cheaper.

Most of them, for awhile.

But even granddads have grandkids emailing them photos, so there goes that 10
megs. Now they have to delete stuff, possibly download it and then delete it.
A grandkid hears them complaining and suggests switching to Gmail.

Except that AOhoo! upgraded their storage. And you’d be surprised
how… stubborn non-techies can be. One reason why I don’t do family
support anymore. :stuck_out_tongue:


Phillip G.

Though the folk I have met,
(Ah, how soon!) they forget
When I’ve moved on to some other place,
There may be one or two,
When I’ve played and passed through,
Who’ll remember my song or my face.

On Sun, Nov 28, 2010 at 6:20 PM, Robert K.
[email protected] wrote:

On 26.11.2010 01:42, David M. wrote:

On Wednesday, November 24, 2010 08:40:22 pm Jrg W Mittag wrote:

I mean, yes, you can deliberately build strings out of corrupt data, but
if
you actually work with complete strings and string concatenation, and you
aren’t doing crazy JNI stuff, and you aren’t digging into the actual bits
of
the string, I don’t see how you can create a truncated string.

Well, you can (see above) but unfortunately it is still valid. It just
happens to represent a different sequence.

After reading RFC 2781 - UTF-16, an encoding of ISO 10646 I am not
sure any more whether the last statement still holds. It seems the
presented algorithm can only work reliably if certain code points are
unused. And indeed checking with
Character Name Index shows that D800 and DC00
are indeed reserved. Interestingly enough Java’s
Character.isDefined() returns true for D800 and DC00:

Cheers

robert

And you’d be surprised
how… stubborn non-techies can be.

Not terribly. I’m more surprised how stubborn techies can be.

IME the main problems are:

  • Operational. You have a whole workforce trained up to use
    mainframe-based system A; getting them all to change to working with new
    system B can be expensive. This is in addition to their “business as
    usual” work.

  • Change resistance. If system B makes even minor aspects of life for
    some of the users more difficult than it was before, those users will
    complain very loudly.

  • Functional. System A embodies in its code a whole load of knowledge
    about business processes, some of which is probably obsolete, but much
    is still current. It’s probably either not documented, or there are
    errors and omissions in the documentation. Re-implementing A as B needs
    to reverse-engineer the behaviour and decide which is current and
    which is obsolete, or else re-specify it from scratch.

And to be honest, over time new System B is likely to become as
undocumented and hard to maintain as System A was, unless you have a
highly skilled and strongly directed development team.

So, unless System B delivers some killer feature which could not instead
be implemented as new system C alongside existing system A, it’s hard to
make a business case for reimplementing A as B.

The market ensures that IBM prices their mainframe solutions just at the
level where the potential cost saving of moving away from A is
outweighed by the development and rollout cost of B, for most users
(i.e. those who have not migrated away already).

On Monday, November 29, 2010 04:53:48 am Brian C. wrote:

usual" work.
This is what I was talking about, mostly. I’m not even talking about
stuff
like switching to Linux or Dvorak, but I’m constantly surprised by
techies who
use IE because it’s there and they can’t be bothered to change, or C++
because
it’s what they know and they don’t want to learn a new language – yet
they’re
perfectly willing to learn a new framework, which is a lot more work.

  • Functional. System A embodies in its code a whole load of knowledge
    about business processes, some of which is probably obsolete, but much
    is still current. It’s probably either not documented, or there are
    errors and omissions in the documentation. Re-implementing A as B needs
    to reverse-engineer the behaviour and decide which is current and
    which is obsolete, or else re-specify it from scratch.

This is probably the largest legitimate reason not to rewrite. In fact,
if
it’s just a bad design on otherwise good technology, an iterative
approach is
slow, torturous, but safe.

And to be honest, over time new System B is likely to become as
undocumented and hard to maintain as System A was, unless you have a
highly skilled and strongly directed development team.

Well, technologies do improve. I’d much rather have an undocumented
and hard
to maintain Ruby script than C program any day, let alone COBOL.

On Sunday, November 28, 2010 04:29:34 pm Phillip G. wrote:

On Sun, Nov 28, 2010 at 9:19 PM, David M. [email protected] wrote:

In practice, it doesn’t seem like any of these are as much of a problem
as the static-typing people fear. Am I wrong?

Nope. But perceived risk outweighs actual risk. See also: US policy
since 2001 vis a vis terrorism.

Sounds like we don’t actually disagree.

monkey-patching is easy to avoid, easy to catch, and relatively easy to
do safely in one place, and avoid throughout the rest of your program.

You know that, I know that, but the CTO of Johnson and Johnson
doesn’t,

Then why the fsck is he CTO of anything?

and probably doesn’t care.

This is the part I don’t get.
How do you get to be CTO by not caring about technology?

Together with the usual
bureaucratic infighting and processes to change anything, you’ll be
SOL most of the time. Alas.

Which is, again, a point I’d hope the free market would resolve. If
there’s a
way to build a relatively large corporation without bureaucracy and
process
crippling actual progress, you’d think that’d be a competitive
advantage.

Contrast to C – it’s not like you can avoid pointers, arrays, pointer
arithmetic, etc. And Ruby at least has encapsulation and namespacing – I
really wouldn’t want to manage a large project in C.

Neither would I. But then again, there’s a lot of knowledge for
managing large C code bases. Just look at the Linux kernel, or Windows
NT.

In each case, there wasn’t really a better option, and likely still
isn’t.

Still, I don’t know about Windows, but on Linux, there seems to be a
push to
keep the kernel as small as it can be without losing speed or
functionality.
There were all sorts of interesting ideas in filesystems, but now we
have
fuse, so there’s no need for ftpfs in the kernel. Once upon a time,
there was
a static HTTP server in the kernel, but even a full apache in userspace
is
fast enough.

And the reason is clear: Something blows up in a C program, it can
affect
anything else in that program, or any memory it’s connected to.
Something
blows up in the kernel, it can affect anything.

I’m also not sure how much of that knowledge really translates. After
all, if
an organization is choosing C because it’s the “safe” choice, what are
the
chances they’ll use Git, or open development, or any of the other ways
the
Linux kernel is managed?

Production, these days, is Just In Time. To stay with our steel
example: Long before the local county got around to nodding your
project through so that you can begin building, you already know what
components you need, and when (since you want to be under budget,
and on time, too), so you order 100 beams of several kinds of steel,

So what happens if they cancel your project?

In a nutshell: being a day early, or even a month, doesn’t pay off
enough to make it worthwhile to restructure the whole company’s
production processes, just because J. Junior Developer found a way to
shave a couple of seconds off of the DB query to send off ordering
iron ore. :wink:

Shaving a couple seconds off is beside the point. The question is
whether
there’s some fundamental way in which the process can be improved –
something
which can be automated which actually costs a large amount of time, or
some
minor shift in process, or small amount of knowledge…

Another contrived example: Suppose financial records were kept as text
fields
and balanced by hand. The computer still helps, because you have all the
data
in one place, easily backed up, multiple people can be looking at the
same
data simultaneously, and every record is available to everyone who needs
it
instantly.

But as soon as you want to analyze any sort of financial trend, as soon
as you
want to mine that data in any meaningful way, you have a huge problem.
The
query running slowly because it’s text is probably minor enough. The
problem
is that your data is mangled – it’s got points where there should be
commas,
commas where there should be points, typo after typo, plus a few
“creative”
entries like “a hundred dollars.” None of these were issues before –
the
system did work, and had no bugs. But clearly, you want to at least
start
validating new data entered, even if you don’t change how it’s stored or
processed just yet.

In a modern system, adding a validation is a one-liner. Some places,
that
could take a week to go through the process. Some places, it could be
pushed
to production the same day. (And some places arguably don’t have enough
process, and could see that one-liner in production thirty seconds after
someone thought of it.)
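
The kind of one-liner meant here, in Rails-ish Ruby (the model and column
names are hypothetical):

    class LedgerEntry < ActiveRecord::Base
      validates_numericality_of :amount   # rejects "a hundred dollars" and friends
    end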

To retrofit that onto an ancient COBOL app could take a lot more work.

I don’t know enough about steel to say whether it’s relevant here, but I
have
to imagine that even here, there are opportunities to dramatically
improve
things. Given an opportunity to make the change, in choosing whether to
rewrite or not, I’d have to consider that this isn’t likely to be the
last
change anyone ever makes.

The depressing thing is that in a modern corporation, this sort of
discussion
would be killed reflexively by that conservative-yet-short-term
mentality. A
rewrite may or may not be a sound investment down the road, but if it
costs
money and doesn’t pay off pretty immediately, it’s not worth the risk at
pretty much any level of the company. Not so much because it might not
pay off
ever, but more because investors will see you’ve cost the company money
(if
only in the short term) and want you gone.

And an anecdote:
A large-ish steel works corp introduced a PLC system to monitor their
furnaces down to the centidegree Celsius, and the “recipe” down to the
gram. After a week, they deactivated the stuff, since the steel
produced wasn’t up to spec, and the veteran cookers created much
better steel, and cheaper.

Cool.

I can only wonder how well that works when the veteran cookers retire.
Does
that knowledge translate?

I’ve definitely learned something about steel today, though. Interesting
stuff. Also good to know what I want to avoid…

Most of them, for awhile.

But even granddads have grandkids emailing them photos, so there goes
that 10 megs. Now they have to delete stuff, possibly download it and
then delete it. A grandkid hears them complaining and suggests switching
to Gmail.

Except that AOhoo! upgraded their storage.

Which is kind of my point. Why did they upgrade? While it’s true that
it’s
relatively cheap, and they may also be monetizing their customer’s data,
I
have to imagine at least part of the reason is that they were feeling
the
pressure from Gmail.

And you’d be surprised
how… stubborn non-techies can be.

Not terribly. I’m more surprised how stubborn techies can be.

On Mon, Nov 29, 2010 at 10:38 AM, David M. [email protected] wrote:

Then why the fsck is he CTO of anything?

and probably doesn’t care.

This is the part I don’t get.
How do you get to be CTO by not caring about technology?

Because C-level execs working for any of the S&P 500 don’t deal with
minutiae and details. They set policy: whether or not to even look
into the cloud services, if and how to centralize IT support, etc.

The CTO supports the CEO, and you hardly expect the CEO to be
well-versed with a tiny customer, either, would you?

Oh, and he’s the fall guy in case the database gets deleted. :P

Together with the usual
bureaucratic infighting and processes to change anything, you’ll be
SOL most of the time. Alas.

Which is, again, a point I’d hope the free market would resolve. If there’s a
way to build a relatively large corporation without bureaucracy and process
crippling actual progress, you’d think that’d be a competitive advantage.

There isn’t. The bureaucratic overhead is a result of a) keeping a
distributed workforce on the same page, b) providing consistent results,
and c) keeping the business running even if the original first five
employees have long since quit.

It’s why McD and BK can scale, but a Michelin star restaurant can’t.

And the reason is clear: Something blows up in a C program, it can affect
anything else in that program, or any memory it’s connected to. Something
blows up in the kernel, it can affect anything.

I’m also not sure how much of that knowledge really translates. After all, if
an organization is choosing C because it’s the “safe” choice, what are the
chances they’ll use Git, or open development, or any of the other ways the
Linux kernel is managed?

None to zero. But C is older than Linux or Git, too. It’s been around for
quite a few years now, and is well understood.

Production, these days, is Just In Time. To stay with our steel
example: Long before the local county got around to nodding your
project through so that you can begin building, you already know what
components you need, and when (since you want to be under budget,
and on time, too), so you order 100 beams of several kinds of steel,

So what happens if they cancel your project?

At that late a stage, a project doesn’t get canceled anymore. It can
be postponed, or paused, but it rarely gets canceled.

You don’t order a power plant or a skyscraper on a whim, but because
it is something that is necessary.

And the postponing (or cancelling, as rarely as it happens) has extreme
repercussions. But that’s why there are breach-of-contract fees and such
included, to cover the work already done.

Shaving a couple seconds off is beside the point. The question is whether
there’s some fundamental way in which the process can be improved – something
which can be automated which actually costs a large amount of time, or some
minor shift in process, or small amount of knowledge…

That assumes that anything can be optimized. Considering the accounting
standards and practices that are needed, the ISO 900x certifications,
etc., there is little in the way of optimizing the actual processes of
selling goods. Keep in mind that IT isn’t the lifeblood of any non-IT
corporation, but a means to an end.

commas where there should be points, typo after typo, plus a few “creative”

To retrofit that onto an ancient COBOL app could take a lot more work.

Why do you think the Waterfall Process was invented? Or IT processes
in the first place? To discover and deliver the features required.

That’s also why new software is generally preferred to changing existing
software: it’s easier to implement changes that way, and to plug into the
ERP systems that already exist.

I don’t know enough about steel to say whether it’s relevant here, but I have
to imagine that even here, there are opportunities to dramatically improve
things. Given an opportunity to make the change, in choosing whether to
rewrite or not, I’d have to consider that this isn’t likely to be the last
change anyone ever makes.

If a steel cooker goes down, it takes 24 to 48 hours to get it going
again. It takes about a week for the ore to smelt, and to produce
iron. Adding in carbon to create steel makes this process take even
longer.

So, what’d be the point of improving a detail, when it doesn’t speed
up the whole process significantly?

The depressing thing is that in a modern corporation, this sort of discussion
would be killed reflexively by that conservative-yet-short-term mentality. A
rewrite may or may not be a sound investment down the road, but if it costs
money and doesn’t pay off pretty immediately, it’s not worth the risk at
pretty much any level of the company. Not so much because it might not pay off
ever, but more because investors will see you’ve cost the company money (if
only in the short term) and want you gone.

Agreed.

I can only wonder how well that works when the veteran cookers retire. Does
that knowledge translate?

Yup. Their subordinates acquire the knowledge. That’s how trades are
taught in Europe (in general): in a master-apprentice system, where an
accomplished tradesman teaches their apprentice what they know. (It used
to be that a freshly minted “Geselle”, as we call non-Masters,
non-apprentices in Germany, went on a long walk through Europe to acquire
new skills and refine existing ones, before settling down and taking on
their own apprentices; that’s how the French style of cathedral building
came to England, for example.)

I’ve definitely learned something about steel today, though. Interesting
stuff. Also good to know what I want to avoid…

Just take what I say with a grain of salt. The closest I got to an
iron smelter was being 500 yards away from one when it had to do an
emergency shutdown because the power cable broke.

Which is kind of my point. Why did they upgrade? While it’s true that it’s
relatively cheap, and they may also be monetizing their customers’ data, I
have to imagine at least part of the reason is that they were feeling the
pressure from Gmail.

Absolutely! GMail did a lot of good in reinvigorating a stagnant market.


Phillip G.

Though the folk I have met,
(Ah, how soon!) they forget
When I’ve moved on to some other place,
There may be one or two,
When I’ve played and passed through,
Who’ll remember my song or my face.

On Tuesday, November 30, 2010 02:31:29 am Phillip G. wrote:

into the cloud services, if and how to centralize IT support, etc.

To do that effectively would require some understanding of these, however.
In particular, “cloud” has several meanings, some of which might make
perfect sense, and some of which might be dropped on the floor.

The CTO supports the CEO, and you hardly expect the CEO to be
well-versed with a tiny customer, either, would you?

I’d expect the CEO to know and care at least about management, and
hopefully marketing and the company itself.

Oh, and he’s the fall guy in case the database gets deleted. :P

Ideally, the person who actually caused the database to get deleted would
be responsible – though management should also bear some responsibility.

distributed workforce on the same page,

Yet Google seems to manage with less than half the, erm, org-chart-depth
that Microsoft has. Clearly, there’s massive room for improvement.

b) providing consistent results,

This almost makes sense.

c) keeping the business running even if the original first five
employees have long since quit.

This really doesn’t. How does bureaucracy ensure that any more than, say,
the apprenticeship you described in the steel industry?

You don’t order a power plant or a skyscraper on a whim, but because
it is something that is necessary.

Nothing’s stopping you from switching contractors, or switching to a
different approach entirely – there’s more than one way to get power.

And the postponing (or cancelling, as rarely as it happens) has extreme
repercussions. But that’s why there are breach-of-contract fees and such
included, to cover the work already done.

Then what’s the point of the “final approval” that you’re waiting for?

IT isn’t the lifeblood of any non-IT corporation, but a means to an end.

That seems to be true almost by definition, but major improvements in IT
do affect non-IT companies. Shipping companies and airlines benefit from
improved ways to find routes, track packages or flights, and adapt quickly
to changing conditions (like weather). Supermarkets and retail outlets
benefit from improved ways to manage inventory – to track it, anticipate
spikes and problems, and react to things like a late shipment.

It may be that all the important problems in these areas are solved, but
again, it seems risky to assume that.

But as soon as you want to analyze any sort of financial trend, as soon
as you want to mine that data in any meaningful way, you have a huge
problem…

Why do you think the Waterfall Process was invented? Or IT processes
in the first place? To discover and deliver the features required.

The point of this example is that you don’t necessarily know up front
what the “requirements” are. It’s not required that you be able to perform
such analysis, and it might not have been feasible when the original
program was written, but it’s certainly valuable today.
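
And to make “mine that data” concrete (a sketch only, with invented sample
entries rather than anything from a real system): even without migrating
the stored text, some analysis can be retrofitted by normalizing the
entries that do parse and flagging the ones that don’t:

    # Sketch: retrofit a simple total/trend analysis onto free-text amounts
    # without changing how they are stored. Sample data is invented.
    entries = ["100.25", "1.000,50", "a hundred dollars", "-42"]

    parsed, rejected = [], []
    entries.each do |raw|
      text = raw.strip
      if text =~ /\A-?\d{1,3}(\.\d{3})*,\d+\z/
        # Normalize a common European format ("1.000,50" -> "1000.50").
        text = text.delete(".").tr(",", ".")
      end
      if text =~ /\A-?\d+(\.\d+)?\z/
        parsed << text.to_f
      else
        rejected << raw  # needs human attention, e.g. "a hundred dollars"
      end
    end

    puts "total of parseable entries: #{parsed.inject(0) { |sum, x| sum + x }}"
    puts "entries needing cleanup:    #{rejected.inspect}"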