Forum: Ruby String#de_inspect (and Kernel#suspicious)

Erik V. (Guest)
on 2006-01-25 12:27
(Received via mailing list)
If you do an inspect on a collection of Ruby objects, like a
hash, you end up with a string. It's possible to store this
string in a file, read it again somewhere in the future,
evaluate it and end up with the same collection of Ruby objects
in core.

So I've written this String#de_inspect, which uses
Kernel#suspicious (slow!) to prevent malicious code from
being evaluated.

A kind of human-readable marshaling. That it is human-readable
is important to me in this situation.

(You can only dump objects which inspect to Ruby code, e.g.
Strings, Numerics, Symbols, Arrays, Hashes, nil, true and
false.)
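
A minimal sketch of the round trip described above, assuming trusted
data and a throwaway temp file. The sample records are hypothetical, and
the plain eval here has none of the protection discussed in this thread:

```ruby
require "tempfile"

# Hypothetical sample records. Only plain literals are used, so that
# #inspect produces text that is itself valid Ruby.
records = [
  { "user" => "erik", "visits" => 3 },
  [1, 2.5, :sym, nil, true, false],
  "a string with \"quotes\" and a\nnewline"
]

restored = []

Tempfile.create("journal") do |f|
  # One record per line: #inspect escapes embedded newlines.
  records.each { |record| f.puts(record.inspect) }

  f.rewind

  # Unprotected eval: acceptable for trusted data only.
  f.each_line { |line| restored << eval(line) }
end

puts(restored == records ? "round trip ok" : "round trip FAILED")
```

The restored objects are equal to the originals but are new objects,
which is the same behaviour you get from Marshal or YAML.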

I've attached the code and an example, though the example isn't
important.

Thoughts? Comments?

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

 module Kernel
   def suspicious(*parms, &block) # Just forget about the parms...
     Thread.new(*parms) do |*parms|
       $SAFE = 5

       block.call(*parms)
     end.value
   end
 end

 class String
   def de_inspect
     suspicious do
       eval(self, Module.new.module_eval{binding})
     end
   end
 end

 def journal(file)
   File.open(file) do |f|
     while (line = f.gets)
       yield(line.de_inspect)
     end
   end
 end

 journal("journal") do |x|
   p x
 end
Robert K. (Guest)
on 2006-01-25 12:36
(Received via mailing list)
Erik V. wrote:
> A kind of human-readable marshaling. That it is human-readable
> is important to me in this situation.
>
> (You can only dump objects which inspect to Ruby code, e.g.
> Strings, Numerics, Symbols, Arrays, Hashes, nil, true and
> false.)
>
> I've attached the code and an example, though the example isn't
> important.
>
> Thoughts? Comments?

A question: what is the advantage of this over YAML?

Kind regards

    robert
Erik V. (Guest)
on 2006-01-25 13:48
(Received via mailing list)
> A question: what is the advantage of this over YAML?

1) It's faster (see below), probably because it uses the highly
   optimized parser/lexer/whatever of the Ruby interpreter
   itself. (You can turn off suspicious mode if the data can be
   trusted, which makes it faster than YAML. With suspicious
   mode enabled, it's about as fast as YAML.)

2) It uses less memory with suspicious mode turned off (see
   below).

3) It's small, whereas YAML is relatively huge. (Is being small
   an advantage? Not necessarily, but I mention it anyway...)

4) You can store not only raw data, but code as well. (I know,
   this is really DANGEROUS, like macros in Word. That's why I
   introduced Kernel#suspicious.)

5) In my real situation, I raise an exception if the line read
   from the journal doesn't end with \r, \n or both. A missing
   line ending indicates a corrupted journal. Half a line in the
   journal could be valid Ruby code and, as such, appear to be
   valid data. That's why I check for the "commit". (Maybe YAML
   does this too. I don't know.)
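
The "commit" check from point 5 could look roughly like this. This is a
sketch, not the real code: the names CorruptJournalError and
each_committed_line are made up for illustration, and a StringIO stands
in for the journal file.

```ruby
require "stringio"

# Hypothetical error class for a record that was never fully written.
class CorruptJournalError < StandardError; end

# Yields each complete line (without its terminator). A line that does
# not end in \r or \n is treated as a half-written record.
def each_committed_line(io)
  io.each_line do |line|
    unless line.end_with?("\n", "\r")
      raise CorruptJournalError, "incomplete record: #{line.inspect}"
    end

    yield line.chomp
  end
end

# The second record below was cut off mid-write. On its own, "[3, " is
# not valid Ruby, but a cut like "[3" would be, hence the check.
journal = StringIO.new("[1, 2]\n[3, ")

begin
  each_committed_line(journal) { |line| puts "committed: #{line}" }
rescue CorruptJournalError => e
  puts "journal is corrupt (#{e.message})"
end
```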

In my case, where the data is only accessible via a dedicated
daemon on a server, I can turn off the suspicious mode. That's
the big win.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

 $ wc test.rbo test.yaml        # SAME DATA!
    3077   26739  681698 test.rbo
   29816   62709  697071 test.yaml

 $ ruby test.rb 10              # 10 times
        CPU    ELAPSED      COUNT CPU/INSTANCE   LABEL
   3.770000   4.016498          1   3.770000   :yaml
   3.630000   3.862287          1   3.630000   :rbo
   1.140000   1.140604          1   1.140000   :rbo_fast

 $ ruby testmem.rb rbo          # Disable GC, load testset once.
 VmSize:    21988 kB

 $ ruby testmem.rb rbo_fast     # Disable GC, load testset once.
 VmSize:    10904 kB

 $ ruby testmem.rb yaml         # Disable GC, load testset once.
 VmSize:    18004 kB
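
For anyone who wants to repeat a comparison like this: the following is
not the test.rb used above, only a rough sketch of how such a timing
could be set up with the standard Benchmark library. The test data is
made up, and only the unprotected (rbo_fast) variant is measured.

```ruby
require "benchmark"
require "yaml"

# Hypothetical test set: 1,000 small records made of plain literals.
records = Array.new(1_000) do |i|
  { "id" => i, "name" => "item#{i}", "sizes" => [i, i + 1] }
end

rbo_text  = records.inspect    # the "rbo" representation
yaml_text = YAML.dump(records) # the same data as YAML

from_yaml = nil
from_rbo  = nil

Benchmark.bm(10) do |bm|
  bm.report("yaml")     { 10.times { from_yaml = YAML.load(yaml_text) } }
  bm.report("rbo_fast") { 10.times { from_rbo  = eval(rbo_text) } }
end

# Both loaders should hand back data equal to the original.
puts "same data: #{from_yaml == records && from_rbo == records}"
```

The numbers will differ per machine, Ruby version and data shape, so
treat any result as an indication only.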
Mauricio F. (Guest)
on 2006-01-25 14:00
(Received via mailing list)
On Wed, Jan 25, 2006 at 07:23:12PM +0900, Erik V. wrote:
> So I've written this String#de_inspect, which uses
> Kernel#suspicious (slow!) to avoid any malicious code from
> being evaluated.
[...]


### code by Mr. Evil

File.open("journal", "w") do |f|
  f.puts <<-EOF.gsub("\n", ";")
    def (o=Object.new).inspect
      puts "gotcha! I'm running in $SAFE=\#{$SAFE}"
      puts "Fear my rm -rf"
      '"Just an innocent little string"'
    end
    o
  EOF
end

# back to your code
module Kernel
  def suspicious(*parms, &block) # Just forget about the parms...
    Thread.new(*parms) do |*parms|
      $SAFE = 5

      block.call(*parms)
    end.value
  end
end

class String
  def de_inspect
    suspicious do
      eval(self, Module.new.module_eval{binding})
    end
  end
end

def journal(file)
  File.open(file) do |f|
    while (line = f.gets)
      yield(line.de_inspect)
    end
  end
end

journal("journal") do |x|
  p x
end
# >> gotcha! I'm running in $SAFE=0
# >> Fear my rm -rf
# >> "Just an innocent little string"
Robert K. (Guest)
on 2006-01-25 14:06
(Received via mailing list)
Erik V. wrote:
> 3) It's small, whereas YAML is relatively huge. (Is being small
> [...]
>    valid data. That's why I check for the "commit". (Maybe YAML
>    does this too. I don't know.)

That's quite an impressive list.  I'm glad I asked.

Kind regards

    robert
Erik V. (Guest)
on 2006-01-25 14:31
(Received via mailing list)
Okay, the block was defined in SAFE mode 0... :)

In the first version, I didn't introduce Kernel#suspicious (see
below). That one worked fine.

Then I naively abstracted the thread thing, which didn't do
what I expected. Oops...

Back to version 1...

Thanks.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

 File.open("journal", "w") do |f|
   f.puts "[:SAFE => $SAFE]"
 end

 # back to your code

 class String
   def de_inspect
     Thread.new do
       $SAFE = 5

       eval(self, Module.new.module_eval{binding})
     end.value
   end
 end

 def journal(file)
   File.open(file) do |f|
     while (line = f.gets)
       yield(line.de_inspect)
     end
   end
 end

 journal("journal") do |x|
   p x
 end
Joel VanderWerf (Guest)
on 2006-01-25 21:38
(Received via mailing list)
Erik V. wrote:
> A kind of human-readable marshaling. That it is human-readable
> is important to me in this situation.
>
> (You can only dump objects which inspect to Ruby code, e.g.
> Strings, Numerics, Symbols, Arrays, Hashes, nil, true and
> false.)

For the object->string direction, it may be useful to use amarshal,
rather than inspect:

http://cvs.m17n.org/~akr/amarshal/

One advantage is with cyclic references. Using #inspect will not
preserve enough information to reconstruct the reference.
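
A small illustration of the cyclic-reference problem, using the built-in
Marshal for comparison because AMarshal may not be installed. This is
only a sketch: it shows that the text from #inspect cannot rebuild the
cycle, while a format that tracks object identity can.

```ruby
cyclic = [1]
cyclic << cyclic        # the array now contains itself

# #inspect prints a placeholder where the array refers to itself.
text = cyclic.inspect
puts text

# The placeholder carries no information about what it pointed to, so
# evaluating the text cannot give the self-reference back.
from_inspect = begin
  eval(text)
rescue ScriptError, StandardError
  nil
end

# Marshal records object identity, so the copy contains itself again.
from_marshal = Marshal.load(Marshal.dump(cyclic))
cycle_kept   = from_marshal[1].equal?(from_marshal)

puts "inspect round trip restored the cycle: " \
     "#{from_inspect.is_a?(Array) && from_inspect[1].equal?(from_inspect)}"
puts "Marshal round trip restored the cycle: #{cycle_kept}"
```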
Erik V. (Guest)
on 2006-01-25 23:11
(Received via mailing list)
> For the object->string direction, it may be useful to use
> amarshal, rather than inspect:
>
> http://cvs.m17n.org/~akr/amarshal/
>
> One advantage is with cyclic references. Using #inspect will
> not preserve enough information to reconstruct the reference.

 % ruby -ramarshal -e 'AMarshal.dump([1,2,3], STDOUT)'
 v = []
 v[0] = Array.allocate()
 v[0] << 1
 v[0] << 2
 v[0] << 3
 v[0]

This idea is really nice, indeed.

gegroet,
Erik V. - http://www.erikveen.dds.nl/