Hash is all ponies

03 September 2008

We’ve been hard at work replacing the C virtual machine in Rubinius with one written in C++. Along the way, we decided to use a simpler data structure internally where we had been using a Hash before. This allowed us to rewrite Hash in pure Ruby. Since our goal is to write as much of Ruby in Ruby as possible, I was happy to work on this rewrite.

You can check out the resulting code here and here. It hasn’t been polished much yet, but the idea was to use expressive and natural Ruby. I deliberately made no performance-driven choices, so Hash should provide a great case study for looking at optimizations in our compiler and VM. For example, Hash#keys can be succinctly written as:

def keys
  inject([]) { |ary, entry| ary << entry.first }
end

Since #inject comes from the Enumerable module, we know it’s using #each internally. And for Hash, I implemented an external iterator of sorts to help hide the implementation details. That iterator makes a method call for each item. So we’ve piled on two block calls (the block that #inject passes to #each, and the block we pass to #inject) plus a method call for each item. The point is, we want to write that expressive Ruby code in our applications and let the implementation optimize it. So far, we really haven’t had any Ruby implementation that gives us that.
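To make the layering concrete, here is a minimal sketch of the same shape. It is not the Rubinius code: the TinyHash class, its Iterator, and the to_iter method are hypothetical names invented for illustration, and the storage is a plain array of pairs rather than a real hash table.

```ruby
# Hypothetical toy class for illustration; not the Rubinius Hash.
class TinyHash
  include Enumerable

  # Hypothetical external iterator that hides how entries are stored.
  class Iterator
    def initialize(entries)
      @entries = entries
      @index = 0
    end

    # Returns the next [key, value] entry, or nil when exhausted.
    def next
      entry = @entries[@index]
      @index += 1
      entry
    end
  end

  def initialize
    @entries = [] # array of [key, value] pairs, kept simple on purpose
  end

  def []=(key, value)
    entry = @entries.find { |k, _v| k == key }
    if entry
      entry[1] = value
    else
      @entries << [key, value]
    end
    value
  end

  def to_iter
    Iterator.new(@entries)
  end

  # Per entry: one method call (Iterator#next) and one block call (yield).
  def each
    iter = to_iter
    while (entry = iter.next)
      yield entry
    end
    self
  end

  # Enumerable#inject drives #each and calls this block once per entry,
  # which is the second block call per item.
  def keys
    inject([]) { |ary, entry| ary << entry.first }
  end
end

h = TinyHash.new
h[:a] = 1
h[:b] = 2
p h.keys
```

Nothing in #keys knows about the storage; it only relies on #each, which is what makes the per-item overhead a job for the compiler and VM rather than for the author of the method.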

Another thing to check out is the structure of our kernel directory. After talking with the folks working on MagLev, we’ve restructured our runtime kernel (aka Ruby standard library) to make it easier to share with other implementations. The bootstrap and delta directories are implementation specific, while the common directory is intended to be shared. The bootstrap directory provides just enough functionality to load the common directory. The delta directory is the place for implementation-specific specialization, where individual methods can be replaced with more performant or otherwise specialized versions.
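As a rough sketch of how a delta-style override could work, the snippet below defines a portable method and then reopens the class to replace it. The PairList class and the directory comments are assumptions made for illustration; they do not reflect actual files in the Rubinius kernel.

```ruby
# Hypothetical example; class name and paths are illustrative only.

# --- the kind of code that might live in a shared "common" directory ---
class PairList
  include Enumerable

  def initialize(pairs)
    @pairs = pairs
  end

  def each
    @pairs.each { |pair| yield pair }
    self
  end

  # Portable version: relies only on #each through Enumerable#map.
  def keys
    map { |key, _value| key }
  end
end

# --- the kind of code that might live in an implementation's "delta" ---
class PairList
  # Specialized replacement that reads the internal storage directly,
  # skipping the block calls made by the portable version.
  def keys
    result = []
    i = 0
    while i < @pairs.size
      result << @pairs[i][0]
      i += 1
    end
    result
  end
end

list = PairList.new([[:a, 1], [:b, 2]])
p list.keys
```

Because the later definition simply replaces the earlier one, callers see the same behavior either way; only the implementation that loads the override gets the specialized code path.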

As always, we love to hear from folks about what’s bugging them with Ruby. Drop by #rubinius on freenode.net or check out the Rubinius project on GitHub. Even though we’re in the middle of this VM rewrite, there’s plenty still to be done on the Ruby standard library.

Update: Evan points out that this is a better implementation of #keys:

def keys
  map { |key, value| key }
end
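A quick way to see that the two formulations agree is to run both against the built-in Hash. This is only a sanity check, on the assumption that the built-in Hash yields [key, value] pairs the same way the pure-Ruby one does; the variable names are made up for the example.

```ruby
# Illustrative comparison using the built-in Hash.
h = { :a => 1, :b => 2 }

# Accumulator style: each entry arrives as a [key, value] pair.
via_inject = h.inject([]) { |ary, entry| ary << entry.first }

# Map style: the block destructures each pair into key and value.
via_map = h.map { |key, _value| key }

p via_inject
p via_map
```

The map version states the intent directly and needs no explicit accumulator.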