When I open http://cnn.com in FF5 when all images are disabled, Firebug
says the size is about 500KB.
Ruby returns the following:
open("http://cnn.com").size # 85161
Unfortunately when I open this url:
Firebug says it is 117 KB and Ruby returns a size of 212888.
I don’t understand what open-uri actually loads or how it actually works.
I also don’t understand the difference between Firebug and open-uri,
especially in the last case, where the size is larger using ‘open-uri’
than with FF5.
Can anyone explain these two things to me?
Jeroen van Ingen wrote in post #1006962:
When I open http://cnn.com in FF5 when all images are disabled, Firebug
says the size is about 500KB.
Ruby returns the following:
open("http://cnn.com").size # 85161
Unfortunately when I open this url:
Marktplaats - The place to buy and sell new and second-hand items
Firebug says it is 117 KB and Ruby returns a size of 212888.
I don’t understand what open-uri actually loads or how it actually works.
I also don’t understand the difference between Firebug and open-uri,
especially in the last case, where the size is larger using ‘open-uri’
than with FF5.
Can anyone explain these two things to me?
Perhaps the difference is attributable to things like this:
which causes the specified external files to be downloaded and added to
the page.
open-uri is a wrapper for net/http, etc., which fetches the raw
html; it doesn’t “execute” any of the html. The size() that open()
reports is the number of bytes in the downloaded body. (For a plain
String, size() counts characters in Ruby 1.9 and later, while
bytesize() counts bytes.)
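To make the byte-versus-character point concrete, here is a minimal sketch that needs no network access. The sample string is made up and assumed to be UTF-8; a StringIO stands in for the IO-like object that open-uri hands back (a StringIO for small bodies, a Tempfile for larger ones).

```ruby
require 'stringio'

# Made-up sample body with one non-ASCII character (assumed UTF-8).
body = "caf\u00e9"

chars = body.size       # number of characters (Ruby 1.9 and later)
bytes = body.bytesize   # number of bytes; the accented letter takes two

# Stand-in for what open-uri returns: an IO-like object wrapping the body.
# Its size is the byte length of the body, not the character count.
io = StringIO.new(body)
io_size = io.size

puts "chars=#{chars} bytes=#{bytes} io_size=#{io_size}"
```

For a page that is plain ASCII the three numbers are the same, so the distinction only shows up on pages with accented or other multi-byte characters.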
What exactly do you mean by ‘does not execute the HTML’?
Do you mean it does not fetch external files like this?
Exactly. net/http fetches a page of text. A browser also fetches a
page of text, but then it ‘renders’ the html that is in the text. Some
of the html tells the browser to fetch image files; other html tells
the browser to download css and js files. Some of the html tells the
browser to display text in this column, and other html tells the
browser to display radio buttons, checkboxes, and drop-down selects
for a form in another column.
If you go to a webpage and click on View Source, that is the entirety of
the text that net/http downloads. As far as net/http is concerned, the
page may as well be a page full of the letter ‘A’ typed over and over
again.
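As a rough illustration of that difference, the sketch below takes a page of text and lists the files a browser would go on to request after reading it. The sample page and the helper name external_resources are made up for the example, and a regex is only good enough for a sketch; real-world html calls for a proper parser.

```ruby
# Hypothetical sample page; a real one would come from open-uri or net/http.
html = <<-HTML
<html>
  <head>
    <link rel="stylesheet" href="/css/site.css">
    <script src="/js/app.js"></script>
  </head>
  <body>
    <img src="/img/logo.png">
    <p>Hello</p>
  </body>
</html>
HTML

# Collect the src/href values of tags that make a browser fetch another file.
# net/http never looks at these; to it the page is just text.
def external_resources(html)
  pattern = /<(?:script|img|link)\b[^>]*?\b(?:src|href)\s*=\s*["']([^"']+)["']/i
  html.scan(pattern).flatten
end

resources = external_resources(html)
puts "raw html: #{html.bytesize} bytes"
puts "a browser would also request: #{resources.join(', ')}"
```

The size open-uri reports covers only the first number. A tool like Firebug adds up the page plus every file in the second list, which is one way the two totals can end up far apart.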
Thanks, I figured it out by creating a simple script that counts the
characters in the source of a webpage and compares that with the result
of open-uri. The numbers are not exactly the same, but they are close.
Now I understand how open-uri works. Thanks.
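A comparison along those lines might look like the sketch below. The helper name compare_counts and the sample strings are made up. The CRLF line endings are just one guess at why the two counts could differ slightly; that cause was not verified in this thread.

```ruby
# Hypothetical helper: compares text copied from View Source with a fetched body.
def compare_counts(view_source_text, fetched_body)
  {
    :source_chars  => view_source_text.size,
    :fetched_chars => fetched_body.size,
    :fetched_bytes => fetched_body.bytesize,
    :difference    => (fetched_body.size - view_source_text.size).abs
  }
end

# Made-up data: same markup, but the fetched copy uses CRLF line endings.
pasted  = "<html>\n<body>hi</body>\n</html>\n"
fetched = "<html>\r\n<body>hi</body>\r\n</html>\r\n"

report = compare_counts(pasted, fetched)
puts report.inspect
```

With a real page, fetched would be the string read from open-uri and pasted the text saved from the browser.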