Best way to download >1GB files

What is the best way to download files from the internet (HTTP) that
are greater than 1GB?

Here’s the story in whole…
I was trying to use Ruby Net::HTTP to manage a download from
Wikipedia… Specifically all current versions of the English one…
But anyways, as I was downloading it, I got a memory error as I ran
out of RAM.

My current code:
    open(@opts[:out], "w") do |f|
      http = Net::HTTP.new(@url.host, @url.port)
      c = http.start do |http|
        a = Net::HTTP::Get.new(@url.page)
        http.request(a)
      end
      f.write(c.body)
    end

I was hoping there’d be some method that I can attach a block to, so
that for each byte it will call the block.

Is there some way to write the bytes to the file as they come in, not
at the end?

Thanks,
---------------------------------------------------------------|
~Ari
“I don’t suffer from insanity. I enjoy every minute of it” --1337est
man alive

thefed wrote:

    http = Net::HTTP.new(@url.host, @url.port)

Is there some way to write the bytes to the file as they come in, not at
the end?

Not precisely what you asked for, but this is how ara t. howard told me
to download large files, using open-uri. This gets one 8kb sized chunk
at a time:

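     require "open-uri"

     # On Ruby 3.0 and later, call URI.open(uri) instead of open(uri).
     open(uri) do |fin|
       # "wb" keeps the data binary-safe on every platform
       open(File.basename(uri), "wb") do |fout|
         while (buf = fin.read(8192))
           fout.write buf
         end
       end
     end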
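
If you would rather stay with Net::HTTP, here is a rough sketch of the block-based approach the original question was hoping for. It relies on Net::HTTP#request yielding the response before the body has been read, and on Net::HTTPResponse#read_body yielding chunks as they arrive. The helper name `download_to_file` and its arguments are made up for illustration, and redirects are not handled (`response.value` raises on anything that is not a 2xx status).

```ruby
require "net/http"
require "uri"

# Stream an HTTP response body to disk one chunk at a time.
# `download_to_file` is a hypothetical helper, not part of Net::HTTP.
def download_to_file(url, out_path)
  uri = URI.parse(url)
  bytes_written = 0

  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    request = Net::HTTP::Get.new(uri.request_uri)

    # Given a block, request yields the response before the body is read.
    http.request(request) do |response|
      response.value # raises unless the status is 2xx

      File.open(out_path, "wb") do |file|
        # read_body with a block yields each chunk as it comes off the socket,
        # so the whole body is never held in memory at once.
        response.read_body do |chunk|
          bytes_written += file.write(chunk)
        end
      end
    end
  end

  bytes_written
end
```

Usage would look something like `download_to_file("http://example.com/big.bin", "big.bin")`, where the URL is only a placeholder. The return value is the number of bytes written.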

On Dec 31, 2007, at 5:15 PM, Tim H. wrote:

    end

But doesn’t open-uri download the whole thing to your compy? I was
about to use it, but then I ran it in irb and saw it returned a file
object.

-------------------------------------------------------|
~ Ari
seydar: it’s like a crazy love triangle of Kernel commands and C code

thefed wrote:

But doesn’t open-uri download the whole thing to your compy? I was about
to use it, but then I ran it in irb and saw it returned a file object.

Isn’t that what you want to happen? I thought your question was about
how to download it in small chunks so it’s not all in memory at the same
time. This code downloads the whole file, but 8kb at a time.

On Dec 31, 2007, at 7:20 PM, Tim H. wrote:

thefed wrote:

But doesn’t open-uri download the whole thing to your compy? I was
about to use it, but then I ran it in irb and saw it returned a
file object.

Isn’t that what you want to happen? I thought your question was
about how to download it in small chunks so it’s not all in memory
at the same time. This code downloads the whole file, but 8kb at a
time.

No, I thought when you use Kernel#open with open-uri, it FIRST
downloads the entire 1GB file to your temp folder, and THEN runs your
block on that file in temp

On Dec 31, 2007, at 7:23 PM, Bryan D. wrote:

Is there some reason to not use wget or curl? Those are both
written already. What are you hoping to do with the files you
download?

I’m trying to write wget/axel in ruby. Plus add torrent support!

On 01/01/2008, thefed wrote:

On Dec 31, 2007, at 7:23 PM, Bryan D. wrote:

Is there some reason to not use wget or curl? Those are both
written already. What are you hoping to do with the files you
download?

I’m trying to write wget/axel in ruby. Plus add torrent support!

Is there some particular reason not to use Aria2? It’s already written
;)

Yes, the UI sucks, and it cannot download multifile torrents from the
web as well, but to compete with that you would have to make something
really good :)

Thanks

Michal

Is there some reason to not use wget or curl? Those are both written
already. What are you hoping to do with the files you download?

-Bryan

On Jan 1, 2008, at 1:38 PM, Michal S. wrote:

Is there some particular reason not to use Aria2, it’s already
written ;)

Yes, the UI sucks, and it cannot download multifile torrents from the
web as well but to compete with that you would have to make something
really good :)

Well then I have a competitor!

I’m really writing this just for practice, but also because I think
the world needs a ruby downloader.

Maybe to give myself a fighting chance against aria2, I’ll lower the
version numbers instead of raising them.

  • Ari

thefed wrote:

same time. This code downloads the whole file, but 8kb at a time.

No, I thought when you use Kernel#open with open-uri, it FIRST downloads
the entire 1GB file to your temp folder, and THEN runs your block on
that file in temp

Interesting. I just tried downloading a 6.1MB file with open-uri and
didn’t see that behavior. I’m using Ruby 1.8.6 on OS X 10.5.
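
A quick way to check what open-uri actually hands to the block, as a sketch only. As far as I understand its implementation, open-uri reads the entire response before yielding, keeping small bodies in a StringIO and spilling anything over roughly 10 KB into a Tempfile. That would mean memory use stays low, but the data is already sitting in the temp folder by the time the read loop copies it. `open_uri_io_info` is a hypothetical helper name, and `URI.open` is the spelling for Ruby 2.5 and later.

```ruby
require "open-uri"

# Report what kind of IO object open-uri yields for a URL, and its size.
# `open_uri_io_info` is a hypothetical helper for illustration.
def open_uri_io_info(url)
  URI.open(url) do |io|
    # By the time this block runs, open-uri has finished reading the response.
    [io.class.name, io.size]
  end
end
```

If the class name comes back as Tempfile for a large download, the file has been written to temp space first and the block is only copying it.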

On Jan 1, 2008, at 1:56 PM, Tim H. wrote:

8kb at a time.
No, I thought when you use Kernel#open with open-uri, it FIRST
downloads the entire 1GB file to your temp folder, and THEN runs
your block on that file in temp

Interesting. I just tried downloading a 6.1MB file with open-uri
and didn’t see that behavior. I’m using Ruby 1.8.6 on OS X 10.5.

That’s good then! I’ll test it out myself juuuust to make sure. I
don’t want to waste 4GB of space when I only need 2GB.

open-uri uses Net::HTTP, of course. Am I correct?

Net::HTTP wraps connections in a Timeout, which is REALLY screwing
with me downloading large files.

Will probably get some monkeys to patch that for me.
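
Before reaching for a monkey patch, it may be enough to raise the timeouts on the Net::HTTP object itself. A minimal sketch, assuming the problem is the read timeout (60 seconds by default, applied to each individual read rather than to the whole download). `build_http` and the numbers are made up for illustration; `open_timeout=` and `read_timeout=` are the real Net::HTTP setters.

```ruby
require "net/http"

# Build a Net::HTTP object with more generous timeouts.
# `build_http` and its default values are hypothetical; tune them as needed.
def build_http(host, port, read_timeout: 300, open_timeout: 30)
  http = Net::HTTP.new(host, port) # does not connect yet
  http.open_timeout = open_timeout # seconds to wait for the connection to open
  http.read_timeout = read_timeout # seconds to wait for each read, not the whole transfer
  http
end
```

The returned object can then be used with `http.start` in the usual way.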

  • Ari

the world needs a ruby downloader.

Maybe to give myself a fighting chance against aria2, I’ll lower the
version numbers instead of raising them.

There’s also ruby-torrent:

http://rubytorrent.rubyforge.org/

I think you should definitely use BitTorrent rather than HTTP. I spoke
to the maintainer/developer a while ago and I think ruby-torrent isn’t
being actively worked on, but it could definitely save you some
headaches if you start there.


Giles B.

Podcast: http://hollywoodgrit.blogspot.com
Blog: http://gilesbowkett.blogspot.com
Portfolio: http://www.gilesgoatboy.org
Tumblelog: http://giles.tumblr.com

(I have homework to do)
Are you insane? Firstly it already has a RubyForge page with download
files, secondly I mentioned having spoken to the maintainer - which
would mean the maintainer was not me - and thirdly who would say yes
to that?

(And fourth, kind of a tangent, but who expects an O’Reilly book on
Ruby to have accurate information?)


Giles B.


On Jan 1, 2008, at 3:43 PM, Giles B. wrote:

(I have homework to do)

Are you insane?

If it’s a gem, it means EASY INSTALL

and thirdly who would say yes
to that?

Someone who’s looking to start off the new year with a good deed :)

-------------------------------------------------------|
~ Ari
crap my sig won’t fit

Wouldn’t it be cool if we could keep Zed S. in a cage and feed him
newbies?

On 1/1/08, thefed wrote:

to that?

Someone who’s looking to start off the new year with a good deed :)

-------------------------------------------------------|
~ Ari
crap my sig won’t fit


Giles B.


On Jan 1, 2008, at 2:46 PM, Giles B. wrote:

There’s also ruby-torrent:

http://rubytorrent.rubyforge.org/

Heh, that’s what I’m using in snoopy right now. Although, ruby-torrent
is about 1000 LOC, and since it’s not a gem, snoopy could get
pretty fat.

Would you mind packaging it and releasing it as a gem?

(I have homework to do)

-------------------------------------------------------|
~ Ari
if god gives you lemons
YOU FIND A NEW GOD

Giles B. wrote:

Wouldn’t it be cool if we could keep Zed S. in a cage and feed him newbies?

You mean AFTER you have sniped at the newbies, right? The kettle, the
pot, et cetera.

Csmr

On Jan 1, 2008, at 4:37 PM, Giles B. wrote:

Wouldn’t it be cool if we could keep Zed S. in a cage and feed
him newbies?

not funny >:-(

You asked about if frames are tagged as “leather” or “no leather”

On Jan 2, 2008 10:26 AM, Giles B. wrote:

newbies, I say wouldn’t it be great if we could just let Zed handle
it. Because he’s better at it. Where does a pot and a kettle enter the
equation?

So I guess it’s a case of a pot saying about a kettle, that “black is
beautiful!”


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

On Jan 2, 2008, at 10:26 AM, Giles B. wrote:

You mean AFTER you have sniped at the newbies, right? The kettle, the
pot, et cetera.

What are you talking about? I don’t get it. Yes, after I snipe at
newbies, I say wouldn’t it be great if we could just let Zed handle
it. Because he’s better at it. Where does a pot and a kettle enter the
equation?

Thou art nub, good sir