I have a small (200 lines) program that processes XML files and creates Ruby object trees. It uses REXML, the native Ruby XML library, a great deal. On my 2GHz Intel Mac running Mac OS X, using the latest YARV sped up my code by roughly a factor of 2:

ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]
53 seconds

ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
28 seconds

My program processes BlackBoard XML course archives and produces numerous statistics about the discussion threads. The original program then wrote an Excel output file; that function was removed for these tests. In my test I processed a course with 26 separate XML files containing discussion threads and created Ruby objects representing the information I am interested in.
on 2006-05-25 21:54
on 2006-05-26 04:02
> ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
> YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
> 28 seconds

I've never looked at YARV's internals, but I'm guessing that it creates Ruby objects directly as it processes the stream. Unless you're using the stream API in REXML, you're comparing apples and oranges. It stands to reason that parsing a stream directly is much faster than parsing a stream into a document tree and then processing the tree. I'd be surprised if it still wasn't faster to use YARV to build a Ruby object tree rather than building from an XML stream, but the difference probably won't be as great.
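For readers unfamiliar with the stream API mentioned here, the following is a minimal sketch of what stream-style processing with REXML can look like. The element and attribute names (`post`, `author`) and the `PostCounter` class are made up for illustration; the real BlackBoard archive format is not shown in this thread.

```ruby
require "rexml/document"
require "rexml/streamlistener"

# Hypothetical listener: counts <post> elements per author as parse events
# arrive, without ever building a document tree.
class PostCounter
  include REXML::StreamListener

  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # Called by the parser for every opening tag; attrs is a Hash of
  # attribute names to values.
  def tag_start(name, attrs)
    @counts[attrs["author"]] += 1 if name == "post"
  end
end

# Feeds the XML string through the stream parser and returns the tallies.
def count_posts(xml)
  listener = PostCounter.new
  REXML::Document.parse_stream(xml, listener)
  listener.counts
end
```

The listener only sees one event at a time, which is why this style tends to use less memory than building a tree, at the cost of having to track any needed context by hand.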
on 2006-05-26 07:13
Hi Daniel,

> probably won't be as great.

I'm not using the stream API. I'm creating a DOM tree and using REXML's version of XPath to process the tree. I do processing on the results and create a set of Ruby objects. Unless there is something important I'm missing, I assume that YARV is just executing my algorithms faster. I can't see how it would know to use the stream API.

I hadn't used YARV before and wanted to see how it works, so I picked a script that runs pretty slowly and spends most of its time in native Ruby code to see how much faster it would be. I think a 2x speedup is nice.
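A minimal sketch of the tree-plus-XPath approach described in this post, assuming a made-up XML layout: the element names, the `Post` and `DiscussionThread` structs, and `build_threads` are all hypothetical, not taken from the actual program.

```ruby
require "rexml/document"

# Hypothetical sample data standing in for one discussion-thread file.
SAMPLE_XML = <<~XML_DOC
  <forum>
    <thread id="t1">
      <post author="ann">Hello</post>
      <post author="bob">Hi</post>
    </thread>
    <thread id="t2">
      <post author="ann">Question</post>
    </thread>
  </forum>
XML_DOC

# Plain Ruby objects that hold only the information of interest.
Post = Struct.new(:author, :text)
DiscussionThread = Struct.new(:id, :posts)

# Parses the whole document into a tree, then walks it with XPath queries.
def build_threads(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//thread").map do |thread_el|
    posts = REXML::XPath.match(thread_el, "post").map do |post_el|
      Post.new(post_el.attributes["author"], post_el.text)
    end
    DiscussionThread.new(thread_el.attributes["id"], posts)
  end
end

threads = build_threads(SAMPLE_XML)
threads.each { |t| puts "#{t.id}: #{t.posts.size} post(s)" }
```

All of this work (parsing, XPath evaluation, object creation) runs as Ruby code inside REXML and the script itself, so a faster interpreter would be expected to speed it up without any change to the parsing technique.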
on 2006-05-26 10:10
Quoting firstname.lastname@example.org, on Fri, May 26, 2006 at 08:59:38AM +0900:
> > ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
> > YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
> > 28 seconds
>
> I've never looked at yarv's internals, but I'm guessing that it creates
> ruby objects directly as it processes the stream.

I think that you are mistaking YARV for an XML parser. It's not. It's a virtual machine for Ruby code. Do a quick Google search.

The benchmarks are on the same code, same REXML, same XML processing technique, same number of objects created.

Cheers,
Sam
on 2006-05-26 11:24
> Quoting email@example.com, on Fri, May 26, 2006 at 08:59:38AM +0900:
> I think that you are mistaking yarv for an XML parser. Its not. Its a
> virtual machine for ruby code. Do a quick google.

I know it's a Ruby VM. I checked out the latest YARV from the Subversion repository, compiled it, and then compared the speed of Ruby 1.8.4 and the newer Ruby built with YARV by running my program under both. The XML parser is REXML, a Ruby library that can be used by either Ruby or YARV.

I think you have misinterpreted my original post. I was comparing the time of execution like this:

ruby test.rb

and then ...

/usr/local/yarv/bin/ruby-yarv test.rb
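One way to make such a comparison repeatable is to time the work inside the script and run the same file under each interpreter. This is only a sketch: `time_run` is a hypothetical helper, and the block body is a stand-in for the real processing, which is not shown in this thread.

```ruby
require "benchmark"

# Runs the given block once, prints the wall-clock time along with the
# interpreter version, and returns the elapsed seconds.
def time_run(label)
  seconds = Benchmark.realtime { yield }
  puts format("%s (ruby %s): %.2f s", label, RUBY_VERSION, seconds)
  seconds
end

# Stand-in workload; replace with the real XML processing.
time_run("sample workload") do
  (1..100_000).map { |i| i.to_s }.size
end
```

Running the same file with each `ruby` binary in turn then prints one timing line per interpreter.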