Reading web service responses in chunks?


#1

I am making a web service call and getting back very large responses
(sometimes 5 GB). When I get such a response it eats all of my RAM. I
need to read the response in chunks so I can store it in a file, and I
have no idea how to do this. Any help is greatly appreciated.

Here is my code:

require 'soap/wsdlDriver'

soap = SOAP::WSDLDriverFactory.new("some url").create_rpc_driver
soap.wiredump_file_base = "soapfile"

response = soap.GetWhatever(:whatever => "whatever")

Ironically, the response isn’t dumped to the file until the entire
response has been read into memory, and this is what’s killing my
server. Is there a more efficient way of doing this?

Thanks for your help and time.


#2

On Wed, Feb 28, 2007 at 07:08:32AM +0900, Ben J. wrote:

> soap = SOAP::WSDLDriverFactory.new("some url").create_rpc_driver
> soap.wiredump_file_base = "soapfile"
>
> response = soap.GetWhatever(:whatever => "whatever")
>
> Ironically, when reading the response it doesn’t dump it into the file
> until it gets the entire response into memory, this is what’s killing my
> server. Is there a more efficient way of doing this?

You just want to get the whole response into a file? Then I’d suggest:

  1. build the SOAP XML request as a string

  2. connect to the server using HTTP

  3. post the XML you built in step 1

  4. read the response as a stream and write it to a file.
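
The four steps above can be sketched roughly as follows. This is only an
illustration under assumptions: build_soap_request and stream_soap_to_file
are hypothetical helper names, the operation name, namespace and SOAPAction
value are placeholders that would have to come from the real service's WSDL,
and the envelope assumes SOAP 1.1.

```ruby
require 'net/http'
require 'uri'

# Minimal XML escaping for parameter values (hypothetical helper).
def xml_escape(value)
  value.to_s.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
end

# Step 1: build the SOAP XML request as a string.
# The operation and namespace are placeholders; take them from the WSDL.
def build_soap_request(operation, namespace, params = {})
  args = params.map { |k, v| "<#{k}>#{xml_escape(v)}</#{k}>" }.join
  <<~XML
    <?xml version="1.0" encoding="utf-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <#{operation} xmlns="#{namespace}">#{args}</#{operation}>
      </soap:Body>
    </soap:Envelope>
  XML
end

# Steps 2-4: connect, post the XML, and write the body to a file
# segment by segment. Returns the number of bytes written.
def stream_soap_to_file(url, soap_xml, out_path, soap_action: '')
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.request_uri)
  request['Content-Type'] = 'text/xml; charset=utf-8'
  request['SOAPAction'] = %("#{soap_action}")
  request.body = soap_xml

  bytes = 0
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    # Passing a block to #request hands over the response before the
    # body has been read, so the body can be consumed piece by piece.
    http.request(request) do |response|
      response.value # raises unless the status is 2xx
      File.open(out_path, 'wb') do |file|
        response.read_body do |segment|
          file.write(segment)
          bytes += segment.bytesize
        end
      end
    end
  end
  bytes
end
```

Only one segment is held in memory at a time, so memory use should stay
roughly constant however large the response is.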

To get the response as a stream, you can probably still use Net::HTTP
for this. If the response from the server is chunked (use tcpdump to
check this), you can call HTTPResponse#read_body with a block, and you
will get the chunks passed to you in turn. The following example is
given in the documentation:

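  # using block
  http.request_post('/cgi-bin/nice.rb', 'datadatadata...') { |response|
    # the documentation's example calls response.status here;
    # Net::HTTPResponse exposes the status code as #code
    p response.code
    p response['content-type']
    response.read_body do |str|   # read body now
      print str
    end
  }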

If the response is not chunked, then just pull the @socket out of the
response object and call read(65536) on it in a loop.
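
The read loop itself might look something like this. It is a sketch that
works on any IO-like object; copy_in_blocks is a made-up name, and how you
get hold of the socket is deliberately left out. It assumes the server
closes the connection at the end of the body. On a keep-alive connection
you would instead have to stop after Content-Length bytes.

```ruby
# Copy everything from +input+ to +output+ in fixed-size blocks, so that
# at most +block_size+ bytes are held in memory at once.
# Returns the total number of bytes copied.
def copy_in_blocks(input, output, block_size = 65_536)
  total = 0
  # IO#read(n) returns nil at end of stream, which ends the loop.
  while (block = input.read(block_size))
    output.write(block)
    total += block.bytesize
  end
  total
end
```

Ruby's built-in IO.copy_stream(src, dst) does much the same job if you
don't need to do anything with each block.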

If you want to parse the response on the fly, then you could use REXML
in stream parsing mode: see
http://www.germane-software.com/software/XML/rexml/docs/tutorial.html
and scroll down to “Stream Parsing”.
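
As a rough idea of what stream parsing looks like, here is a small
listener that counts elements without building a document tree. The
ItemCounter class, the count_items helper and the tag names are invented
for illustration; only REXML::StreamListener and
REXML::Document.parse_stream come from REXML itself.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Counts start tags with a given name and adds up the size of all text
# nodes, without keeping any of the document in memory.
class ItemCounter
  include REXML::StreamListener

  attr_reader :count, :text_bytes

  def initialize(tag)
    @tag = tag
    @count = 0
    @text_bytes = 0
  end

  # Called once for every opening tag.
  def tag_start(name, _attrs)
    @count += 1 if name == @tag
  end

  # Called for character data between tags.
  def text(data)
    @text_bytes += data.bytesize
  end
end

# +io+ can be a String or any IO-like object (File, StringIO, pipe end).
def count_items(io, tag)
  listener = ItemCounter.new(tag)
  REXML::Document.parse_stream(io, listener)
  listener
end
```

Usage would be along the lines of
count_items(File.open('soapfile.xml'), 'item').count, where the file name
and tag are again only placeholders.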

You then may need an IO.pipe or similar object which accepts the HTTP
chunks on one side and gives a readable stream on the other.
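
One possible shape for that pipe, as a sketch only: a producer thread
writes chunks into one end while the caller reads from the other.
with_chunk_stream is a hypothetical helper, and here the chunks come from
any Enumerable. In a real program the producer thread would instead call
response.read_body and write each segment it is given.

```ruby
# Feeds +chunks+ into a pipe from a background thread and yields the
# readable end to the caller. Returns whatever the block returns.
def with_chunk_stream(chunks)
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  producer = Thread.new do
    begin
      chunks.each { |chunk| writer.write(chunk) }
    rescue Errno::EPIPE
      # The consumer closed its end early; stop producing.
    ensure
      writer.close # signals end-of-stream to the reader
    end
  end

  yield reader
ensure
  reader.close if reader && !reader.closed?
  producer.join if producer
end
```

The pipe's buffer is small and fixed in size, so the producer blocks
whenever the consumer falls behind and memory use stays bounded. The
readable end could be handed directly to a stream parser.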

But this may still be a problem if your 5GB response consists mainly of
a single element, …5GB of data…. I’m not sure if REXML will call text()
repeatedly with blocks of the data, or will try to slurp the whole 5GB
in before calling text() once.

HTH,

Brian.