Forum: Ruby-core [ruby-trunk - Bug #6312][Open] Psych needlessly noisy parsing string node starting with number-ish s

E555a767a33427bfee0bb0878566293c?d=identicon&s=25 riffraff (gabriele renzi) (Guest)
on 2012-04-17 18:27
(Received via mailing list)
Issue #6312 has been reported by riffraff (gabriele renzi).

----------------------------------------
Bug #6312: Psych needlessly noisy parsing string node starting with
number-ish string
https://bugs.ruby-lang.org/issues/6312

Author: riffraff (gabriele renzi)
Status: Open
Priority: Low
Assignee:
Category:
Target version:
ruby -v: ruby 1.9.3p188 (2012-04-17 revision 35365)
[x86_64-darwin10.8.0]


For example:

     $ ruby -d -rpsych -e 'Psych.load("4 weddings")' 2>&1 | tail -n 2
     Exception `ArgumentError' at
/Users/riffraff/.rvm/rubies/ruby-1.9.3-head/lib/ruby/1.9.1/psych/scalar_scanner.rb:99
- invalid value for Integer(): "4 weddings"
     4 weddings


(tail because there are a bunch more printout due to load/name errors)

This is due to assuming by default that anything that is not another
scalar type should be considered first as a YAML !!int, and only if that
fails with an exception, as a string.

There was already a specific fix for one instance of this issue (#5186),
but it would be nicer to avoid it altogether.

Small patch attached importing the spec from yaml.org for what an int
should be. All psych tests still passing for me.

Notes:

* I did add a tiny test and some setup/teardown in the specific file so
that the debug would be visible on screen.
It could make sense to replace STDERr with a StringIO and check that but
it feels fragile, and I don't know how to test "does not cause debug
printouts" otherwise.

* checking for the INT regex makes the check for two "." in integer
unnecessary. I have added it back to the float case as  r32957 had fixed
the issue but it's been reintroduced (the yaml.org float regexp is wrong
or we don't parse the same floats)

* psych treats '1,2' as '12'.  This seem like a bug as I could not see
it in the spec, but I have changed the regexp accordingly.

* if the "1,2" == "12" parsing is removed then the String#gsub calls
become unnecessary

* there seem to be many capturing groups in this file which are not
really necessary

* sexagesimal formatting is handled by itself in another node, but it's
still in the FLOAT regex so I left it in the INT one too

Hope this is somewhat helpful.
F24ff61beb80aa5f13371aa22a35619c?d=identicon&s=25 mame (Yusuke Endoh) (Guest)
on 2012-04-19 20:30
(Received via mailing list)
Issue #6312 has been updated by mame (Yusuke Endoh).

Status changed from Open to Assigned
Assignee set to tenderlovemaking (Aaron Patterson)


----------------------------------------
Bug #6312: Psych needlessly noisy parsing string node starting with
number-ish string
https://bugs.ruby-lang.org/issues/6312#change-26011

Author: riffraff (gabriele renzi)
Status: Assigned
Priority: Low
Assignee: tenderlovemaking (Aaron Patterson)
Category:
Target version:
ruby -v: ruby 1.9.3p188 (2012-04-17 revision 35365)
[x86_64-darwin10.8.0]


For example:

     $ ruby -d -rpsych -e 'Psych.load("4 weddings")' 2>&1 | tail -n 2
     Exception `ArgumentError' at
/Users/riffraff/.rvm/rubies/ruby-1.9.3-head/lib/ruby/1.9.1/psych/scalar_scanner.rb:99
- invalid value for Integer(): "4 weddings"
     4 weddings


(tail because there are a bunch more printout due to load/name errors)

This is due to assuming by default that anything that is not another
scalar type should be considered first as a YAML !!int, and only if that
fails with an exception, as a string.

There was already a specific fix for one instance of this issue (#5186),
but it would be nicer to avoid it altogether.

Small patch attached importing the spec from yaml.org for what an int
should be. All psych tests still passing for me.

Notes:

* I did add a tiny test and some setup/teardown in the specific file so
that the debug would be visible on screen.
It could make sense to replace STDERr with a StringIO and check that but
it feels fragile, and I don't know how to test "does not cause debug
printouts" otherwise.

* checking for the INT regex makes the check for two "." in integer
unnecessary. I have added it back to the float case as  r32957 had fixed
the issue but it's been reintroduced (the yaml.org float regexp is wrong
or we don't parse the same floats)

* psych treats '1,2' as '12'.  This seem like a bug as I could not see
it in the spec, but I have changed the regexp accordingly.

* if the "1,2" == "12" parsing is removed then the String#gsub calls
become unnecessary

* there seem to be many capturing groups in this file which are not
really necessary

* sexagesimal formatting is handled by itself in another node, but it's
still in the FLOAT regex so I left it in the INT one too

Hope this is somewhat helpful.
054b5f6b8afdd5f6190bad08e46cd782?d=identicon&s=25 zzak (Zachary Scott) (Guest)
on 2013-08-14 22:13
(Received via mailing list)
Issue #6312 has been updated by zzak (Zachary Scott).


I get a NameError exception when I try to run your example on trunk:

Exception `NameError' at
/Users/zzak/.rubies/ruby-trunk/lib/ruby/2.1.0/psych/core_ext.rb:29 -
method `yaml_as' not defined in Module
Exception `NameError' at
/Users/zzak/.rubies/ruby-trunk/lib/ruby/2.1.0/psych/deprecated.rb:81 -
undefined method `to_yaml_properties' for class `Object'
----------------------------------------
Bug #6312: Psych needlessly noisy parsing string node starting with
number-ish string
https://bugs.ruby-lang.org/issues/6312#change-41159

Author: riffraff (gabriele renzi)
Status: Assigned
Priority: Low
Assignee: tenderlovemaking (Aaron Patterson)
Category:
Target version:
ruby -v: ruby 1.9.3p188 (2012-04-17 revision 35365)
[x86_64-darwin10.8.0]
Backport:


For example:

     $ ruby -d -rpsych -e 'Psych.load("4 weddings")' 2>&1 | tail -n 2
     Exception `ArgumentError' at
/Users/riffraff/.rvm/rubies/ruby-1.9.3-head/lib/ruby/1.9.1/psych/scalar_scanner.rb:99
- invalid value for Integer(): "4 weddings"
     4 weddings


(tail because there are a bunch more printout due to load/name errors)

This is due to assuming by default that anything that is not another
scalar type should be considered first as a YAML !!int, and only if that
fails with an exception, as a string.

There was already a specific fix for one instance of this issue (#5186),
but it would be nicer to avoid it altogether.

Small patch attached importing the spec from yaml.org for what an int
should be. All psych tests still passing for me.

Notes:

* I did add a tiny test and some setup/teardown in the specific file so
that the debug would be visible on screen.
It could make sense to replace STDERr with a StringIO and check that but
it feels fragile, and I don't know how to test "does not cause debug
printouts" otherwise.

* checking for the INT regex makes the check for two "." in integer
unnecessary. I have added it back to the float case as  r32957 had fixed
the issue but it's been reintroduced (the yaml.org float regexp is wrong
or we don't parse the same floats)

* psych treats '1,2' as '12'.  This seem like a bug as I could not see
it in the spec, but I have changed the regexp accordingly.

* if the "1,2" == "12" parsing is removed then the String#gsub calls
become unnecessary

* there seem to be many capturing groups in this file which are not
really necessary

* sexagesimal formatting is handled by itself in another node, but it's
still in the FLOAT regex so I left it in the INT one too

Hope this is somewhat helpful.
E555a767a33427bfee0bb0878566293c?d=identicon&s=25 riffraff (gabriele renzi) (Guest)
on 2013-09-05 00:14
(Received via mailing list)
Issue #6312 has been updated by riffraff (gabriele renzi).


Sorry I hadn't noticed this issue was updated, possibly rubymine is not
sending notifications anymore?

Anyway It's been 15 months since I submitted this, so please forgive me
if I'm not very fresh on the subject, but as far as I can tell the issue
that I reported appear to have been accidentally fixed by a change a
couple weeks ago:

https://github.com/tenderlove/psych/issues/156

So I guess this issue can be closed.

The errors you see are another example of the basic problem though:
needlessly raising and catching exception causes noise when running with
-d.
The lines in question are

  remove_method :yaml_as rescue nil  #in core_ext.rb

and

  undef :to_yaml_properties rescue nil # in deprecated.rb

both can, AFAICT, be replaced with explicit testing (undef if defined)
which avoids raising exceptions and eliminates this noise, with the
possible side effect of being  more efficient (no need to create
exceptions, fill stacktrace etc).


Please excuse me if I am missing something, but as I wrote, it's been a
while since I looked into this, and it's quite late in the night at the
time of this writing.
----------------------------------------
Bug #6312: Psych needlessly noisy parsing string node starting with
number-ish string
https://bugs.ruby-lang.org/issues/6312#change-41621

Author: riffraff (gabriele renzi)
Status: Assigned
Priority: Low
Assignee: tenderlovemaking (Aaron Patterson)
Category:
Target version:
ruby -v: ruby 1.9.3p188 (2012-04-17 revision 35365)
[x86_64-darwin10.8.0]
Backport:


For example:

     $ ruby -d -rpsych -e 'Psych.load("4 weddings")' 2>&1 | tail -n 2
     Exception `ArgumentError' at
/Users/riffraff/.rvm/rubies/ruby-1.9.3-head/lib/ruby/1.9.1/psych/scalar_scanner.rb:99
- invalid value for Integer(): "4 weddings"
     4 weddings


(tail because there are a bunch more printout due to load/name errors)

This is due to assuming by default that anything that is not another
scalar type should be considered first as a YAML !!int, and only if that
fails with an exception, as a string.

There was already a specific fix for one instance of this issue (#5186),
but it would be nicer to avoid it altogether.

Small patch attached importing the spec from yaml.org for what an int
should be. All psych tests still passing for me.

Notes:

* I did add a tiny test and some setup/teardown in the specific file so
that the debug would be visible on screen.
It could make sense to replace STDERr with a StringIO and check that but
it feels fragile, and I don't know how to test "does not cause debug
printouts" otherwise.

* checking for the INT regex makes the check for two "." in integer
unnecessary. I have added it back to the float case as  r32957 had fixed
the issue but it's been reintroduced (the yaml.org float regexp is wrong
or we don't parse the same floats)

* psych treats '1,2' as '12'.  This seem like a bug as I could not see
it in the spec, but I have changed the regexp accordingly.

* if the "1,2" == "12" parsing is removed then the String#gsub calls
become unnecessary

* there seem to be many capturing groups in this file which are not
really necessary

* sexagesimal formatting is handled by itself in another node, but it's
still in the FLOAT regex so I left it in the INT one too

Hope this is somewhat helpful.
This topic is locked and can not be replied to.