Ruby-internal

  • About ruby-internal
  • Sample code
  • Installation
  • Ruby-internal and irb
  • YARV support
  • Other tools
  • Nightly build scoreboard
  • Future directions
  • Links

    About

    Ruby-internal is a Ruby module that provides direct access to Ruby's (MRI or YARV) internal data structures.

    How is ruby-internal useful? You can:

    Installing Ruby-internal

    To install it:

      $ gem install ruby-internal
    

    Sample code

    This will dump the class Foo (including its instance methods, class variables, etc.) and re-load it:

      require 'internal/node'
    
      class Foo
        def foo; puts "this is a test..."; end
      end
    
      s = Marshal.dump(Foo)
      p Marshal.load(s) #=> Foo
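
    For contrast, stock Marshal (with ruby-internal not loaded) records only a class's name, so loading depends on the class already being defined. A minimal sketch using core Ruby only; the PlainFoo class and the undumpable? helper are made up for illustration:

```ruby
# Stock Ruby only; ruby-internal is NOT loaded here.
class PlainFoo
  def foo; "this is a test..."; end
end

# Hypothetical helper: true when stock Marshal refuses to dump klass.
def undumpable?(klass)
  Marshal.dump(klass)
  false
rescue TypeError
  true
end

dumped = Marshal.dump(PlainFoo)

puts dumped.include?("PlainFoo")            # is the class name recorded?
puts dumped.include?("this is a test")      # is the method body recorded?
puts Marshal.load(dumped).equal?(PlainFoo)  # does load resolve to the live class?
puts undumpable?(Class.new)                 # anonymous classes have no name to record
```

    The sample above differs in that the dump is described as carrying the instance methods and class variables themselves.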
    

    Ruby-internal and irb

    Ruby-internal is very useful as a tool for digging into the internals of Ruby and figuring out what the interpreter is doing with your code. To use ruby-internal with irb, put the following in your .irbrc:
      require 'pp'
      require 'internal/node/pp'
      require 'internal/classtree'
      require 'internal/method/signature'
    
    Now you can print node trees:
      irb(main):001:0> pp (proc { 1 + 1 }.body)
      NODE_NEWLINE at (irb):1
      |-nth = 1
      +-next = NODE_CALL at (irb):1
        |-recv = NODE_LIT at (irb):1
        | +-lit = 1
        |-args = NODE_ARRAY at (irb):1
        | |-alen = 1
        | |-head = NODE_LIT at (irb):1
        | | +-lit = 1
        | +-next = false
        +-mid = :+
      => nil
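
    Where ruby-internal is not available for your interpreter, the standard library's Ripper gives a comparable, though differently shaped, syntax tree. A rough sketch, assuming only Ripper; tree_lines is a hypothetical helper, and the output uses Ripper's node names rather than the NODE_* names shown above:

```ruby
require 'ripper'

# Render a Ripper s-expression as indented lines, one per node or leaf.
# tree_lines is a hypothetical helper, not part of ruby-internal or Ripper.
def tree_lines(sexp, depth = 0)
  indent = "  " * depth
  if sexp.is_a?(Array) && sexp.first.is_a?(Symbol)
    # A node: [:type, child, child, ...]
    [indent + sexp.first.to_s] +
      sexp.drop(1).flat_map { |child| tree_lines(child, depth + 1) }
  elsif sexp.is_a?(Array)
    # A plain list of nodes, or a [line, column] pair
    sexp.flat_map { |child| tree_lines(child, depth) }
  else
    # A leaf: token text, operator symbol, number, or nil
    [indent + sexp.inspect]
  end
end

puts tree_lines(Ripper.sexp("1 + 1"))
```

    Ripper works on source text, so unlike the proc example above it cannot inspect code that has already been compiled.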
    
    And view class hierarchies:
      irb(main):004:0> puts Object.new.classtree
      #<Object:0x40330ce8>
      +-class = Object
        |-class = #<Class:Object>
        | |-class = Class
        | | |-class = #<Class:Class>
        | | | |-class = #<Class:Class> (*)
        | | | +-super = #<Class:Module>
        | | |   |-class = Class (*)
        | | |   +-super = #<Class:Object> (*)
        | | +-super = Module
        | |   |-class = #<Class:Module> (*)
        | |   +-super = Object (*)
        | +-super = Class (*)
        +-super = #<PP::ObjectMixin:0x40349568>
          +-class = PP::ObjectMixin
            |-class = Module (*)
            +-super = #<Kernel:0x4033507c>
              +-class = Kernel
      => nil
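
    Some of the class and super links in that tree can be followed with core reflection alone. A small sketch, assuming nothing beyond core Ruby; superclass_chain is a hypothetical helper. Note that superclass skips included modules such as Kernel, which classtree does show:

```ruby
# Follow superclass links from klass up to the root of the hierarchy.
# superclass_chain is a hypothetical helper, not part of ruby-internal.
def superclass_chain(klass)
  chain = []
  while klass
    chain << klass
    klass = klass.superclass
  end
  chain
end

obj = Object.new
p superclass_chain(obj.class)               # the "super" links of the object's class
p superclass_chain(Class)                   # Class, then Module, then Object, ...
p superclass_chain(Object.singleton_class)  # the metaclass side of the tree
p Object.ancestors.include?(Kernel)         # included modules appear in ancestors
```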
    
    View method signatures:
      irb(main):015:0> def foo(a, b, *rest, &block); end; method(:foo).signature
      => #<MethodSig::Signature:0x4037093c @origin_class=Object, @arg_info={:b=>"b",
      :block=>"&block", :a=>"a", :rest=>"*rest"}, @name="foo", @arg_names=[:a,
      :b, :rest, :block]>
      irb(main):016:0> proc { |x, y, *rest| }.signature
      => #<Proc::Signature:0x4036cf30 @args=#<Proc::Arguments:0x4036d020 @rest_arg=2,
      @multiple_assignment=true, @names=[:x, :y, :rest]>, @arg_info={:x=>"x", :y=>"y",
      :rest=>"*rest"}>
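
    Current Ruby versions expose similar information without ruby-internal, through Method#parameters and Proc#parameters. A minimal sketch using core Ruby only; the SignatureDemo class is made up for illustration, and the result is a list of [kind, name] pairs rather than a Signature object:

```ruby
# SignatureDemo is a made-up class used only to hold the sample method.
class SignatureDemo
  def foo(a, b, *rest, &block); end
end

method_params = SignatureDemo.instance_method(:foo).parameters
p method_params
# => [[:req, :a], [:req, :b], [:rest, :rest], [:block, :block]]

# Arguments of a non-lambda proc are reported as optional.
proc_params = proc { |x, y, *rest| }.parameters
p proc_params
# => [[:opt, :x], [:opt, :y], [:rest, :rest]]
```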
    
    And reconstruct compiled methods:
      irb(main):001:0> def foo(a, b, *rest, &block)
      irb(main):002:1>   begin
      irb(main):003:2*     if not a and not b then
      irb(main):004:3*       raise "Need more input!"
      irb(main):005:3>     end
      irb(main):006:2>     return a + b
      irb(main):007:2>   ensure
      irb(main):008:2*     puts "In ensure block"
      irb(main):009:2>   end
      irb(main):010:1> end
      => nil
      irb(main):011:0> m = method(:foo)
      => #<Method: Object#foo>
      irb(main):012:0> puts m.as_code
      def foo(a, b, *rest, &block)
        begin
          (raise("Need more input!")) if (not a and not b)
          return a + b
        ensure
          puts("In ensure block")
        end
      end
      => nil
    

    YARV support

    Ruby-internal works with YARV, too. The difference is that under YARV some structures are nodes and others are instruction sequences. Where pre-YARV Ruby gives you a pure AST, YARV gives you structures that look like this:

      irb(main):001:0> def foo; 1 + 1; end
      => nil
      irb(main):002:0> pp method(:foo).body  
      NODE_METHOD at (irb):1
      |-noex = PUBLIC
      |-body = <ISeq:foo@(irb)>
      | |-0000 trace            8
      | |-0002 trace            1
      | |-0004 putobject        1
      | |-0006 putobject        1
      | |-0008 opt_plus         
      | |-0009 trace            16
      | +-0011 leave            
      +-cnt = 0
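
    On current MRI, the instruction sequence behind a method can also be reached without ruby-internal, through RubyVM::InstructionSequence.of. A sketch assuming an MRI interpreter (RubyVM is MRI-specific); IseqDemo is a made-up class, and the instructions printed vary between Ruby versions:

```ruby
# IseqDemo is a made-up class used only to hold the sample method.
class IseqDemo
  def foo; 1 + 1; end
end

# Look up the compiled instruction sequence for the method (MRI only).
iseq = RubyVM::InstructionSequence.of(IseqDemo.new.method(:foo))

# Print a human-readable listing of the method's bytecode.
puts iseq.disasm
```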
    

    You can also access the original AST with Node.compile_string:

      irb(main):001:0> n = Node.compile_string('1+1')
      => #<Node::SCOPE:0x40420af0>
      irb(main):002:0> pp n
      NODE_SCOPE at (compiled):1
      |-rval = NODE_CALL at (compiled):1
      | |-recv = NODE_LIT at (compiled):1
      | | +-lit = 1
      | |-args = NODE_ARRAY at (compiled):1
      | | |-alen = 1
      | | |-head = NODE_LIT at (compiled):1
      | | | +-lit = 1
      | | +-next = false
      | +-mid = :+
      |-tbl = nil
      +-next = false
    

    compile it to a bytecode sequence:

      irb(main):003:0> is = n.bytecode_compile()
      => <ISeq:<main>@(compiled)>
      irb(main):004:0> puts is.disasm
      == disasm: <ISeq:<main>@(compiled)>=====================================
      0000 trace            1                                               (   1)
      0002 putobject        1
      0004 putobject        1
      0006 opt_plus         
      0007 leave            
      => nil
    

    iterate over the bytecode sequence:

      irb(main):004:0> is.each { |i| puts "#{i.inspect} #{i.length} #{i.operand_types.inspect}" }
      #<VM::Instruction::TRACE:0x40412324 @operands=[1]> 2 [:num]
      #<VM::Instruction::PUTOBJECT:0x404121d0 @operands=[1]> 2 [:value]
      #<VM::Instruction::PUTOBJECT:0x4041207c @operands=[1]> 2 [:value]
      #<VM::Instruction::OPT_PLUS:0x40411f28 @operands=[]> 1 []
      #<VM::Instruction::LEAVE:0x40411e24 @operands=[]> 1 []
      => nil
    

    then decompile it:

      irb(main):005:0> require 'as_expression'
      => true
      irb(main):006:0> is.as_expression
      => "1 + 1"
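
    Most of this pipeline has a rough counterpart in stock MRI through RubyVM::InstructionSequence. A sketch assuming an MRI interpreter and no ruby-internal; it compiles straight from source rather than from a Node, and stock Ruby has no built-in equivalent of as_expression, so there is no decompile step:

```ruby
# MRI-only: RubyVM is not available on other Ruby implementations.
iseq = RubyVM::InstructionSequence.compile("1 + 1")

# Human-readable listing; instruction names differ between Ruby versions.
puts iseq.disasm

# to_a returns a nested array whose last element is the instruction list.
# Instructions are arrays ([name, *operands]); the remaining entries are
# line numbers, labels, and event markers.
instructions = iseq.to_a.last.select { |entry| entry.is_a?(Array) }
instructions.each { |name, *operands| puts "#{name} #{operands.inspect}" }

# Run the compiled sequence and print its result.
p iseq.eval
```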
    

    There are still a few missing features (particularly in the decompiler), but expect to see more exciting tools for working with bytecode in the future!

    Other tools

    Ruby-internal comes with two useful tools, nwdump and nwobfusc. The nwdump tool works in much the same way as the older Pragmatic nodedump tool. If you require it from the command line:
      $ ruby -rinternal/node/dump test.rb
    
    it will dump your program's syntax tree. The nwobfusc tool is similar:
      $ ruby -rinternal/obfusc test.rb > test2.rb
    
    but its output is an obfuscated version of your program. The obfuscated program must be run with the same versions of ruby-internal and the interpreter that were used to produce it.

    Nightly build scoreboard

    To ensure that ruby-internal works on the widest possible variety of ruby versions, a nightly build has been set up to build it and run the tests on every stable release of ruby since 1.6.8:

    Future directions

    Links to related projects