Forum: Ruby YARV speed test: XML processing

Stephen Bannasch (Guest)
on 2006-05-25 19:54
(Received via mailing list)
I have a small (200 lines) program that processes XML files and
creates ruby object trees. It uses the native Ruby xml library REXML
a great deal.

On my 2 GHz Intel Mac running Mac OS X, using the latest YARV sped up my
code by roughly a factor of 2.

ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]
53 seconds

ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
28 seconds

My program processes BlackBoard XML course archives and produces
numerous statistics about the discussion threads. In the original
program it then produced an excel output file. In these tests that
function was removed. In my test I processed a course with 26
separate XML files with discussion threads and created ruby objects
representing the info I am interested in.
Daniel Sheppard (Guest)
on 2006-05-26 02:02
(Received via mailing list)
> ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
> YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
> 28 seconds

I've never looked at yarv's internals, but I'm guessing that it creates
ruby objects directly as it processes the stream.

Unless you're using the stream api in rexml, you're comparing apples and
oranges. It stands to reason that parsing a stream directly is much
faster than parsing a stream to a document tree and processing the tree.


I'd be surprised if it still wasn't faster to use Yarv to build a ruby
object tree rather than building from an xml stream, but the difference
probably won't be as great.
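
For reference, the "stream api" mentioned here is REXML's event-based interface, which reports tags as they are read instead of building a document tree first. A minimal sketch, assuming made-up XML (the BlackBoard archive format is not shown in this thread) and a hypothetical `PostCounter` listener:

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: counts <post> elements per author as they stream by.
class PostCounter
  include REXML::StreamListener

  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # Called by REXML for every opening tag; attrs is a Hash of attribute values.
  def tag_start(name, attrs)
    @counts[attrs['author']] += 1 if name == 'post'
  end
end

# Made-up sample data, not the real archive format.
stream_xml = '<forum><post author="alice"/><post author="bob"/><post author="alice"/></forum>'

counter = PostCounter.new
REXML::Document.parse_stream(stream_xml, counter)
puts counter.counts.inspect
```

No tree is kept in memory here, which is why this style is usually cheaper than parsing to a DOM and then walking it.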
Stephen Bannasch (Guest)
on 2006-05-26 05:13
(Received via mailing list)
Hi Daniel,

>probably won't be as great.
I'm not using the stream api. I'm creating a dom tree and using rexml's
version of xpath to process the tree. I do processing on the results and
create a set of ruby objects. Unless there is something important I'm
missing I assume that yarv is just executing my algorithms faster. I
can't see how it would know to use the stream api.
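
The approach described above (build a DOM tree, query it with REXML's XPath, turn the results into ruby objects) might look roughly like this. It is only a sketch: the XML layout, the `DiscussionThread` struct and the `build_threads` helper are assumptions for illustration, not the actual program.

```ruby
require 'rexml/document'

# Made-up sample data; the real BlackBoard archive format is not shown here.
SAMPLE_XML = <<~DATA
  <forum>
    <thread title="Week 1">
      <post author="alice"/>
      <post author="bob"/>
    </thread>
    <thread title="Week 2">
      <post author="alice"/>
    </thread>
  </forum>
DATA

# Hypothetical plain ruby object holding the info of interest.
DiscussionThread = Struct.new(:title, :authors)

# Parse the whole document into a tree, then select nodes with XPath.
def build_threads(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//thread').map do |node|
    authors = REXML::XPath.match(node, 'post').map { |post| post.attributes['author'] }
    DiscussionThread.new(node.attributes['title'], authors)
  end
end

threads = build_threads(SAMPLE_XML)
threads.each { |t| puts "#{t.title}: #{t.authors.size} posts" }
```

All of this is ordinary ruby code, so the same script runs unchanged under either interpreter.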

I hadn't used yarv before and wanted to see how it works and I picked a
script that runs pretty slowly and spends most of its time in native
Ruby so I could see how much faster it would be. I think a 2x speed up
is nice.
Sam Roberts (Guest)
on 2006-05-26 08:10
(Received via mailing list)
Quoting daniels@pronto.com.au, on Fri, May 26, 2006 at 08:59:38AM +0900:
> > ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
> > YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
> > 28 seconds
>
> I've never looked at yarv's internals, but I'm guessing that it creates
> ruby objects directly as it processes the stream.

I think that you are mistaking yarv for an XML parser. It's not. It's a
virtual machine for ruby code. Do a quick google.

The benchmarks are on the same code, same rexml, same xml processing
technique, same numbers of objects created.

Cheers,
Sam
Stephen Bannasch (Guest)
on 2006-05-26 09:24
(Received via mailing list)
>Quoting daniels@pronto.com.au, on Fri, May 26, 2006 at 08:59:38AM +0900:
>I think that you are mistaking yarv for an XML parser. It's not. It's a
>virtual machine for ruby code. Do a quick google.

I know it's a ruby VM. I checked out the latest yarv from the subversion
repository, compiled it, and then compared the speed of ruby 1.8.4 and
the newer yarv-based ruby by running my program with both. The xml
parser is rexml, a ruby library that can be used by either ruby or yarv.

I think you have misinterpreted my original post. I was comparing the
time of execution like this:

  ruby test.rb

and then ...

  /usr/local/yarv/bin/ruby-yarv test.rb
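
The two commands above time the whole process from the outside. As a rough sketch of how the same comparison could be made from inside a script, the standard `benchmark` library can wrap the work and report which interpreter ran it. The `parse_many` workload below is a hypothetical stand-in for test.rb, which is not shown in this thread:

```ruby
require 'benchmark'
require 'rexml/document'

# Hypothetical workload: parse a small document repeatedly and count nodes.
def parse_many(count)
  xml = '<forum>' + '<post author="a"/>' * 50 + '</forum>'
  total = 0
  count.times do
    doc = REXML::Document.new(xml)
    total += REXML::XPath.match(doc, '//post').size
  end
  total
end

posts = nil
# Benchmark.realtime returns the elapsed wall-clock time in seconds.
elapsed = Benchmark.realtime { posts = parse_many(20) }

# RUBY_DESCRIPTION identifies the interpreter, so runs can be told apart.
puts format('%s: %d posts in %.3f s', RUBY_DESCRIPTION, posts, elapsed)
```

Running the same file with each interpreter then gives one line per run to compare.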
Logan Capaldo (Guest)
on 2006-05-28 01:40
(Received via mailing list)
On May 26, 2006, at 3:23 AM, Stephen Bannasch wrote:

>
> - Stephen Bannasch
>   Concord Consortium, http://www.concord.org
>

He knows that you know. He doesn't think Daniel Sheppard knows.
That's who he was replying to.