Wow, YAML / Psych in 1.9.3 is *slow*!

I just started trying Ruby 1.9.3, coming from Ruby 1.8.7, and was
surprised to discover YAML’s sluggishness. The chief problem seems to be
the Psych library.

In my main use case, a certain routine takes about 14 seconds with Psych.

Fortunately, I can switch to the “syck” library:

require 'yaml'
begin
  YAML::ENGINE.yamler = 'syck'
rescue
  # fall back to the default engine if the switch fails
end

Using syck, the same routine takes about 7 seconds.

That’s actually a bit faster than under Ruby 1.8.7 with the old YAML,
where the same routine takes 8 or 9 seconds. So to get the juicy
goodness of improved speed in Ruby 1.9.3, I definitely need to use
“syck”.

A little googling suggests I’m not the only person to make this sort of
observation.

m.

File. A. Bug.

Ryan D. wrote:

File. A. Bug.

With? Whom?

m.

The psych devs.

Are you asking for someone to look up their email address for you?

On Sep 14, 2012, at 16:20, Matt N. wrote:

Ryan D. wrote:

File. A. Bug.

With? Whom?

Given that you’ve now blogged about psych and cited its source… I’m
guessing you know.

I think the question is a reasonable one. Who is to “blame” in a case
like this? The psych people? The YAML people, who have so forcibly
replaced syck with psych and made this horrible warning notice appear in
1.9.3 if you didn’t build psych into Ruby? The Ruby people, who have
accepted that situation and allowed it to be built into the core? I
don’t think the answer is obvious.

m.

PS Don’t be a ninny. Assume that your interlocutor has a brain. Give me
the same benefit of the doubt that I give you.

Where is the performance problem? File the issue there.

James H. wrote:

Where is the performance problem? File the issue there.

In the end I filed in two places: Psych on GitHub, and Ruby itself.
Here’s why. There are really two problems. First, Psych is demonstrably
slow. I was able to write a simple real-world case showing that it takes
Psych more than 3 times as long as Syck to do a load_file:
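
The test case itself isn't reproduced in this post. A minimal sketch of that kind of measurement might look like the following. It assumes a generated sample file rather than real data, and the helper name time_load_file is made up for illustration. YAML::ENGINE exists only on Ruby 1.9.x, so on later Rubies the sketch simply times the default engine.

```ruby
require 'yaml'
require 'tempfile'

# Hypothetical helper: run YAML.load_file on +path+ several times with
# whatever engine is currently active and return the elapsed seconds.
def time_load_file(path, runs = 5)
  start = Time.now
  runs.times { YAML.load_file(path) }
  Time.now - start
end

# Build a throwaway sample file; substitute your own real-world data.
file = Tempfile.new(['sample', '.yml'])
file.write((1..2000).map { |i| "- id: #{i}\n  name: item#{i}\n" }.join)
file.close

# On Ruby 1.9.x compare both engines; elsewhere only the default exists.
engines = defined?(YAML::ENGINE) ? %w[syck psych] : [nil]

results = {}
engines.each do |engine|
  YAML::ENGINE.yamler = engine if engine
  results[engine || 'default'] = time_load_file(file.path)
end

results.each { |name, secs| puts format('%-8s %.3f s', name, secs) }
```

Timings from a synthetic file like this only hint at the ratio; the numbers that matter are the ones from your own documents.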

However, there’s also a deeply disturbing philosophical problem. Psych
has been crammed down the throats of users before it’s ready for prime
time. If you build Ruby 1.9.3 without libyaml being present, Psych won’t
be installed and every time yaml is used, including every time you touch
“gem” in any way, you get a nasty warning. So I also filed at rubybugs
asking that this force-feeding of Psych be backed out. It won’t be, of
course, but the point is made.

m.

Matt N. wrote in post #1076014:

I just started trying Ruby 1.9.3, coming from Ruby 1.8.7, and was
surprised to discover YAML’s sluggishness. The chief problem seems to be
the Psych library.

[…]

Fortunately, I can switch to the “syck” library:

begin
  YAML::ENGINE.yamler = 'syck'
rescue
end

Using syck, the same routine takes about 7 seconds.

Wow indeed! With a test app with some of my data (having to load 820
yaml documents) switching to syck took execution from 16.7 to 1.3
seconds!
Are there any known side-effects with doing this?

Patrick

On Sat, Sep 15, 2012 at 4:50 PM, Matt N. wrote:

However, there’s also a deeply disturbing philosophical problem. Psych
has been crammed down the throats of users before it’s ready for prime
time.

You do realize Syck is completely broken and unmaintained, right?

Patrick B. писал 19.09.2012 19:14:

YAML::ENGINE.yamler = 'syck'
rescue

Using syck, the same routine takes about 7 seconds.

Wow indeed! With a test app with some of my data (having to load 820
yaml documents) switching to syck took execution from 16.7 to 1.3
seconds!
Are there any known side-effects with doing this?

Yes; namely, not being compatible with the YAML specification.
Psych was written for a reason, and that reason was Syck’s brokenness.
For years it has generated invalid YAML (or refused to consume
valid YAML; I have seen different opinions on this) in common cases.
For example, this:
http://blog.rubygems.org/2011/08/31/shaving-the-yaml-yak.html

Peter Z. wrote in post #1076749:

Patrick B. писал 19.09.2012 19:14:

YAML::ENGINE.yamler = 'syck'
rescue

Using syck, the same routine takes about 7 seconds.

Wow indeed! With a test app with some of my data (having to load 820
yaml documents) switching to syck took execution from 16.7 to 1.3
seconds!
Are there any known side-effects with doing this?

Yes; namely, not being compatible with the YAML specification.
Psych was written for a reason, and that reason was Syck’s brokenness.
For years it has generated invalid YAML (or refused to consume
valid YAML; I have seen different opinions on this) in common cases.
For example, this:
Shaving a YAML Yak - RubyGems Blog

So then what can be done to improve the speed of the ‘correct’ YAML
library? The difference is, to put it mildly, significant. :(