Writing to Binary Files

I’m having some serious issues writing to a binary file. Can anyone
give me a hand?
And could I be having this problem because I’m trying to do this on a
Linux machine?
Thanks!

doug meyer wrote:

I’m having some serious issues writing to a binary file. Can anyone
give me a hand?

Can you show us some code?

And could I be having this problem because I’m trying to do this on a
Linux machine?

Unlikely… I do this fairly regularly.

Well, try to tell us what you are doing.

If the answer to your question is as simple as

textFile = File.open(File.expand_path("~/foo.text"), "w")
binaryFile = File.open(File.expand_path("~/bar.dump"), "wb")

Then everyone will tell you to read up ;), but since it’s probably
more complex, tell us what the issue is.

(and if it’s writing Marshal.dump stuff, your problem isn’t writing,
it’s reading)
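
A minimal sketch of the round trip described above, assuming the data is a plain Marshal-able object. The helper names `dump_to` and `load_from` and the sample record are made up for illustration; only `File.open`, `Marshal.dump` and `Marshal.load` are standard Ruby.

```ruby
require "tmpdir"

# Write any Marshal-able object to path as raw bytes.
def dump_to(path, object)
  File.open(path, "wb") { |f| f.write(Marshal.dump(object)) }
end

# Read the object back. "rb" keeps the bytes untranslated on every platform.
def load_from(path)
  File.open(path, "rb") { |f| Marshal.load(f.read) }
end

record = { "id" => 42, "samples" => [1.5, 2.5, 3.5] } # made-up sample data
path = File.join(Dir.mktmpdir, "bar.dump")

dump_to(path, record)
puts load_from(path) == record
```

Using "b" on both sides costs nothing on Linux and avoids surprises if the same script is later run on Windows.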
–Kyle

Hi,

On Wednesday, 22 Aug 2007, 06:35:32 +0900, doug meyer wrote:

I’m having some serious issues writing to a binary file. Can anyone
give me a hand?
And could I be having this problem because I’m trying to do this on a
Linux machine?

I’ll just repeat what I wrote this afternoon
(http://www.ruby-forum.com/topic/122170#544613).

The distinction between “text” and “binary” is the archetypal
misdesign in DOS and Windows. It means nothing more than
that in “text” mode line ends are translated from “\n” to
“\r\n”, which is of no use except to disturb file positions and
string lengths. The only purpose of this is to deter
programmers from doing anything in a non-Microsoft way.
Anywhere else you don’t need to care.
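
To make the point about positions and lengths concrete, here is a rough sketch that only simulates the translation with `gsub`. The helper name `text_mode_bytes` is hypothetical; it stands in for what a DOS/Windows text-mode write would put on disk, and no real file is involved.

```ruby
# Simulate the "\n" -> "\r\n" translation applied by a text-mode write.
def text_mode_bytes(text)
  text.gsub("\n", "\r\n")
end

text = "one\ntwo\nthree\n"
on_disk = text_mode_bytes(text)

puts text.bytesize            # 14
puts on_disk.bytesize         # 17
puts text.index("three")      # 8
puts on_disk.index("three")   # 10
```

The string in memory and the bytes on disk disagree about both the total length and the offset of every line after the first, which is why seeking by byte offset in a translated file is unreliable.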

Sorry for the flame but that’s the way it is.

Bertram

Hi,

On Wednesday, 22 Aug 2007, 08:51:17 +0900, Michael T. Richter wrote:

switches) which itself was inherited from IBM’s (and others’) various
operating systems.

So the system that had to do it different? Wasn’t Microsoft’s. Nor
even IBM’s. UNIX was the one that had to be different from everybody
else. And it is UNIX that is to blame for this artificial text/binary
file distinction.

You’re right. I almost forgot that. It’s not fair to imply
MS invented anything.

Bertram

Bertram S. wrote:

Bertram

So where do files like database or spreadsheet files fit? They
obviously aren’t text files, sending one to a printer will only produce
garbage or a locked printer. Binary is a useful distinction to denote
that a file is not “ready” for use without intervention, either through
the OS or a program. Whether Unix treats text differently than anyone
else is not of concern as a simple test will show if a file is a Unix
“text” file or other text file. I have zero problems moving text files
back and forth between Unix and Windows programs. I can even move my
“binary” database files between the two without any problems. I can’t
use either of them to read the database files without a program but that
is why it is treated differently.

On Wed, 2007-22-08 at 08:13 +0900, Bertram S. wrote:

The distinction between “text” and “binary” is the archetypal
misdesign in DOS and Windows.

And this explains the distinction between opening binary vs. opening
text in UNIX APIs since LONG before MS-DOS how?

It means nothing more than
that in “text” mode line ends are translated from “\n” to
“\r\n”, which is of no use except to disturb file positions and
string lengths. The only purpose of this is to deter
programmers from doing anything in a non-Microsoft way.
Anywhere else you don’t need to care.

Sorry for the flame but that’s the way it is.

It would help if you actually said things the way they were. This “text
mode” vs. “binary mode” thing is a UNIX “innovation” (one of many which
has plagued the computing world since UNIX’s misdesign). Let me
introduce to you what “the way it is” really is…

Way back in the bad old days, people talked to computers on teletype
machines: combination printer/keyboard. We didn’t have these fancy,
schmancy glass-screened terminals all over the place. On these
terminals “carriage return” meant “move the printer head to the far
left”. “Line feed” meant “scroll the paper down one line”. These were
completely separate actions requiring completely separate control codes.
("\n" is the “line feed” or “newline”. “\r” is the “carriage return”.)
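
For reference, the two control codes can be inspected directly from Ruby. This is just an illustrative snippet; the method name `control_codes` is made up.

```ruby
# Map the two control characters to their ASCII code points.
def control_codes
  { "CR" => "\r".ord, "LF" => "\n".ord }
end

control_codes.each do |name, code|
  puts format("%s = %d (0x%02X)", name, code, code)
end
# CR = 13 (0x0D)
# LF = 10 (0x0A)
```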

Most systems of the day wrote everything in a single format. There was
no binary/text distinction. Each line was ended by a carriage return
and a line feed. (I still have some of these systems up and running on
my laptop thanks to good old SIMH.) When you printed these files,
whatever their contents were was run straight to the teletype and
printed out verbatim. That meant each line ended with “\r\n”.

UNIX, of course, being the half-bastard-child of real operating systems
(MULTICS and ITS) that it was, had to do things differently. To save on
space (!) its creators, in their nigh-infinite wisdom and judgement,
plagued the world with the notion of only using “\n” to terminate text
lines in text files. (Apparently saving one byte out of every line was
important! Never mind that OSes on smaller machines than ever ran UNIX
had no problem with that “wasted” carriage return…) Of course this
meant that you couldn’t just copy the bits of a document directly to the
teletype. Oh, no. You had to open the file in a special text mode so
the OS would convert things behind the scenes for you, switching every
“\n” into a “\r\n” before sending it off to the teletype. This was
perceived (incorrectly) as a Great Innovation.

Later, as the UNIX infection set in, “smart” terminals (teletypes and
glass screen) started to, if set appropriately, automatically convert
line feeds into carriage return/line feed combinations. This was a
feature added to make up for a misfeature in UNIX systems, though, not
something that was really necessary. (Indeed it breaks the definition
of a line feed according to the ASCII definition thereof.)

MS-DOS arrived on the scene from a different direction. It came from
the CP/M side of things which was itself heavily influenced by IBM’s
operating systems (scaled down, of course, to the teensy CPU that ran
it). CP/M? Used the more traditional (at the time) CR/LF combinations
found in pretty much every operating system of the day other than UNIX.
MS-DOS was a hack off of a CP/M clone for the new 8086 processor and, as
such, inherited CP/M’s approach to text files (and command line
switches) which itself was inherited from IBM’s (and others’) various
operating systems.

So the system that had to do it different? Wasn’t Microsoft’s. Nor
even IBM’s. UNIX was the one that had to be different from everybody
else. And it is UNIX that is to blame for this artificial text/binary
file distinction.

From: Michael T. Richter

behind the scenes for you, switching every “\n” into a “\r\n” before sending
it off to the teletype. This was perceived (incorrectly) as a Great
Innovation.

It’s sounding like it really was a Great Innovation in those days to
separate the model from the view. In retrospect it’s taken for granted
as good design. The specific print head movement characteristics of a
particular piece of display hardware have no business polluting the
internal representation of a portable text file format.

Later, as the UNIX infection set in, “smart” terminals (teletypes and glass
screen) started to, if set appropriately, automatically convert line feeds
into carriage return/line feed combinations. This was a feature added to
make up for a misfeature in UNIX systems, though, not something that was
really necessary.

A misfeature would be taking completely independent teletype carriage
return and line feed output control bytes, and perverting them into an
atomic LINE ENDING SEQUENCE PAIR.

A teletype doesn’t need the carriage return and linefeed characters to
follow one another back-to-back. They’re just independent ways to move
the print head. As I recall, I used to send carriage returns
independently from linefeeds to our teletype whenever I pleased, if I
wanted to write over the same line twice, for instance… for bold-face,
underline, overstrike, whatever.
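
As a rough illustration of that trick, the sketch below builds the byte sequence for an underlined word: the text, a bare carriage return, then underscores over the same columns. The helper name `overstrike` is made up, and the effect is an assumption about hard-copy devices; a modern glass terminal will typically replace the first pass instead of printing over it.

```ruby
# Build a line that a hard-copy teletype would print underlined:
# the text, a bare "\r" to return the head, then marks over the same columns.
def overstrike(text, mark = "_")
  text + "\r" + mark * text.length
end

p overstrike("bold")   # "bold\r____"
```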

Taking two independent print head control characters, and artificially
gluing them together into an atomic line-ending marker, just adds noise
to what should have been a portable, device-independent text file
format.

Thankfully the Unix guys got it right.

Regards,

Bill

From: “John J.”

Portable? In those days, networking and portability were not real
concerns on the minds of anyone.
Back then, people really believed their code would disappear in a few
years, to be replaced by something else!

Well I tried. :) I started as a youngster in the late '70s
but we DID have a real teletype printer. Complete with margin
bell at the end of the line. A fast touch typist could probably
have out-paced the thing.

All in all, it should be evident that there are two (3 if you
count the old non-unix mac os) line ending paradigms to care about.
Not too bad! Consider how many other things are splintered more!

Indeed, but I’d call unix \n or old-mac \r format equally reasonable.
It’s the unnecessarily redundant \r\n that seems clunky to me.

This is not a matter of pointing fingers or saying my OS is better
than yours, this is simply a matter of doing what needs to be done.

Oh, I wasn’t intending to get into the OS vs. OS finger-pointing.
It’s like the Atari ST vs. the Amiga. One of them totally sucked!

Regards,

Bill

2007/8/22, Bertram S.:

The distinction between “text” and “binary” is the archetypal
misdesign in DOS and Windows. It means nothing more than
that in “text” mode line ends are translated from “\n” to
“\r\n”, which is of no use except to disturb file positions and
string lengths. The only purpose of this is to deter
programmers from doing anything in a non-Microsoft way.
Anywhere else you don’t need to care.

Sorry for the flame but that’s the way it is.

I prefer to just take it as given that different operating systems
treat line endings differently and go from there. I can’t remember
that this caused an issue for me: I open binary files with “rb” and
text files with “r” on all platforms. I even find this helps
understanding the code better (documentation). And I cannot remember a
single case where someone processed a text file and needed exact file
positions; line numbers are typically more interesting.

Relax :)

robert

It’s sounding like it really was a Great Innovation in those days to
separate the model from the view. In retrospect it’s taken for granted
as good design. The specific print head movement characteristics of a
particular piece of display hardware have no business polluting the
internal representation of a portable text file format.

Portable? In those days, networking and portability were not real
concerns on the minds of anyone.
Back then, people really believed their code would disappear in a few
years, to be replaced by something else!

All in all, it should be evident that there are two (3 if you
count the old non-unix mac os) line ending paradigms to care about.
Not too bad! Consider how many other things are splintered more!

This is not a matter of pointing fingers or saying my OS is better
than yours, this is simply a matter of doing what needs to be done.

Even now, a “simple” text file could have any kind of crazy internal
formatting that is meaningful only to some particular program.
The history is interesting, but hardly important to writing the code
for now.

“Michael T. Richter” writes:

On Wed, 2007-22-08 at 08:13 +0900, Bertram S. wrote:

The distinction between “text” and “binary” is the archetypal
misdesign in DOS and Windows.

And this explains the distinction between opening binary vs. opening
text in UNIX APIs since LONG before MS-DOS how?

Hmm. So which Unix system call do you use to open a file in text
mode?

mode” vs. “binary mode” thing is a UNIX “innovation” (one of many which

plagued the world with the notion of only using “\n” to terminate text
lines in text files. (Apparently saving one byte out of every line was

In fact, Multics used LF as newline and Unix copied the convention
from it.

important! Never mind that OSes on smaller machines than ever ran UNIX
had no problem with that “wasted” carriage return…) Of course this
meant that you couldn’t just copy the bits of a document directly to the
teletype. Oh, no. You had to open the file in a special text mode so
the OS would convert things behind the scenes for you, switching every
“\n” into a “\r\n” before sending it off to the teletype. This was
perceived (incorrectly) as a Great Innovation.

You seem to be claiming that open(2) and read(2) had additional
functionality that was later removed (?). That’s interesting. Do you
have a reference?

Later, as the UNIX infection set in, “smart” terminals (teletypes and
glass screen) started to, if set appropriately, automatically convert
line feeds into carriage return/line feed combinations. This was a
feature added to make up for a misfeature in UNIX systems, though, not
something that was really necessary. (Indeed it breaks the definition
of a line feed according to the ASCII definition thereof.)

Translation in Unix is done by the tty driver in the kernel, not the
tty device itself…

MS-DOS arrived on the scene from a different direction. It came from
the CP/M side of things which was itself heavily influenced by IBM’s
operating systems (scaled down, of course, to the teensy CPU that ran
it). CP/M? Used the more traditional (at the time) CR/LF combinations
found in pretty much every operating system of the day other than UNIX.

You seem to be forgetting various systems that used other newline
conventions, e.g. Macintosh. And that EBCDIC has a newline character.

MS-DOS was a hack off of a CP/M clone for the new 8086 processor and, as
such, inherited CP/M’s approach to text files (and command line
switches) which itself was inherited from IBM’s (and others’) various
operating systems.

…And that MS-DOS also copied from CP/M the practice of marking the
end of a text file within the last cluster with ^Z which,

  1. required distinguishing between text and binary files independent
    of newline conventions, and
  2. indeed breaks the definition of ^Z according to the ASCII
    definition thereof.
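
A minimal sketch of why the ^Z convention matters for binary data: byte 26 (ASCII SUB) is a perfectly ordinary value inside a binary payload, so it must survive a round trip. The helper `binary_round_trip` and the payload are made up; the claim that a text-mode read may stop at this byte applies to CP/M-style conventions and is not something this snippet demonstrates.

```ruby
require "tmpdir"

CTRL_Z = "\x1A".b # ASCII 26 (SUB), the CP/M and MS-DOS end-of-text marker

# Write bytes and read them back without any text-mode interpretation.
def binary_round_trip(path, bytes)
  File.binwrite(path, bytes)
  File.binread(path)
end

payload = "AB".b + CTRL_Z + "CD".b
path = File.join(Dir.mktmpdir, "ctrl_z.bin")

result = binary_round_trip(path, payload)
puts result.bytesize     # 5
puts result == payload   # true
```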

Steve
