I have a gazillion little files in memory (each is really just a chunk
of data, but it represents what needs to be a single file) and I need
to throw them all into a .tar.gz archive. In this case, it must be
in .tar.gz format and it must unzip into actual files, although I pity
the fellow who actually has to unzip this monstrosity.
Here are the solutions I’ve come up with so far:
1. Not portable, extremely slow: write out all these “files” into a
directory and make a system call to tar (tar -czf …).

2. Portable, but still just as slow: write out all these “files” into a
directory and use archive-tar-minitar to make the archive.

3. Not portable, but fast: stream the data into tar/gzip to create the
archive without ever first writing out files. I’ve been looking around
on this and the closest I’ve come is:

tar cvf - some_directory | gzip - > some_directory.tar.gz

Note that this would still require me to write the files to a directory
(which must be avoided at all costs), but at least the problem now
becomes how to write data into a tar file. I’ve been googling and still
haven’t turned up anything.

4. Hack archive-tar-minitar so that I can write my data directly into
the format (see the sketch just after this list). Looking at the source
code, this doesn’t seem terribly hard, but it isn’t terribly easy
either. Am I missing a method already written for this kind of thing?
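For #4, the kind of thing I’m imagining looks roughly like this. This
is an untested sketch: I’m assuming minitar’s Writer#add_file_simple
takes :mode/:size options and yields a writable stream, and the file
names and contents below are just placeholders.

require 'stringio'
require 'zlib'
require 'archive/tar/minitar'

# Pretend these are the in-memory "files" (names and contents are placeholders)
files = {
  'stuff/one.txt' => 'first chunk of data',
  'stuff/two.txt' => 'second chunk of data'
}

tgz = StringIO.new

Zlib::GzipWriter.wrap(tgz) do |gz|
  Archive::Tar::Minitar::Writer.open(gz) do |tar|
    files.each do |name, data|
      # assuming add_file_simple takes :mode/:size and yields a stream to write into;
      # the size has to be known up front so the tar header can be written first
      tar.add_file_simple(name, :mode => 0644, :size => data.bytesize) do |io|
        io.write(data)
      end
    end
  end
end

# tgz.string should now be the complete .tar.gz, built without touching the filesystem

If that’s basically how the library already works, then maybe I don’t
even need to hack it.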
Others?
Right now, anything resembling #3 or #4 would work for me.
My feeling is that it shouldn’t be that hard to write data into
a .tar.gz format in either Linux or Ruby without actually having any
files (i.e., everything in memory or streamed in).
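For instance, RubyGems itself ships a Gem::Package::TarWriter class; if
it behaves the way I think it does (add_file_simple takes a name, mode,
and size and yields a writable IO), then an untested sketch like this
might be all it takes. The file names, contents, and output path are
made up:

require 'rubygems/package'
require 'zlib'

# Placeholder in-memory "files": archive path => contents
files = { 'one.txt' => 'alpha', 'two.txt' => 'beta' }

# Stream straight into the output file; the data never hits disk as loose files
Zlib::GzipWriter.open('everything.tar.gz') do |gz|
  Gem::Package::TarWriter.new(gz) do |tar|
    files.each do |name, data|
      # assuming add_file_simple(name, mode, size) yields an IO for the contents;
      # giving the size up front means nothing has to be seekable
      tar.add_file_simple(name, 0644, data.bytesize) { |io| io.write(data) }
    end
  end
end

Here the gzip stream writes straight into the final archive, so the
only thing that ever touches the filesystem is the finished .tar.gz.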
Thanks a lot for any suggestions or ideas!