Autovivification in Ruby
Autovivification is the concept that a hash style data structure can make inferences about its internal structure as it is being created
What is Autovivication?
Autovivification is the concept that a hash style data structure can make inferences about its internal structure as it is being created. For instance, creating a nested hash under one of the keys would normally require you to explicitly state that you wanted to create a hash there. Autovivification allows you to skip this step and create the nested structure on the fly.
To make the idea more tangible, here is a quick code example in Perl that automagically fills out a hash:
After using Ruby for 4 weeks, I’ll have you know I forgot every single semi-colon when writing this
In Ruby, a similar piece of code could look like this:
Notice how we had to create hashes down the way as we went? Not with autovivification.
A Bit More Background
The idea was introduced in the Perl programming language as default, but can be written into other languages. As a former Perl programmer who relied on this feature heavily, I thought I would do that comparison between Perl and Ruby, how we can make Ruby act similarly, and when to use autovivification in Ruby (hint: probably never).
Autoviv in Ruby
The Wikipedia article on Autovivification has a section on how to do it in Ruby. I have copied over that example, but spread it out to try to make it a bit more readable.
Let’s break it down into the three main lines, which cover three very important pieces to how this works.
Create a proc
We’ll go into more detail on this in section 3, but we need a way to save this block of code, and we know in Ruby that we can save a block by creating it as either a proc or lamdba.
Override the Newly Created Hash’s Behavior
According to Ruby’s documentation on creating a new hash, you can set the default value to be any item if passed in as a parameter. However, Ruby offers another, more powerful option. If you pass in a block, you can have access to the hash object and the key, and set the default value yourself. This allows us to do something very important, as we will see in magical step .
Set the Key to Another Autovivicious Hash
Since we have access to the hash and the key in the block we passed to Hash.new, we can now define the default value for that hash key we just tried to access. How about an autovivifying hash? Oh, we have a way to make one of those already – the proc we are writing! Let’s call it! So now, anytime we try to access a nil key in our autovivificious hash, we will create there instead of nil – another autovivificious hash!
Yeah, But Why Can’t We Give a Default Value of a Hash.new?
So now the “notauto_hash” hash will set its keys to be a new hash every time right? Sort of. It will create a new hash only for _the first level of keys. Okay, sure, so let’s go ahead and do the same for the second level of keys…
But wait… that’s only two keys deep…
This is approximately what our “tree” proc is doing up above, simply calling Hash.new each time a new key depth is needed. It just does so with a fancy looking recursive proc.
When Should I Use This?
I’m not really certain on the Ruby community’s consensus of autovivification, but I am going to go out on a limb and say don’t do this, unless there is a very good reason. From my time working mostly with Perl, I can say that it is sometimes dangerous using this feature, unless it is absolutely clear what data is entering the hash, when, and how. From Ruby, the first impression that I got was that Ruby would get very confused when you tried to treat a trailing key accessor as an array, instead, the autoviv thought it was being clever and would create a hash instead.
Some Follow Ups
The research here led me to a few interesting resources that I would like to talk about in another post. Namely: Rubinius and Ruby Facets. I would also like to mess around with overriding some defaults – Can I make the hash class autovivify by default? Can I have the curly braces take over for square braces to look more like Perl? Can I have the square braces ignored so that I don’t confuse it with an array? I do think two blog posts on autoviv are two more than necessary, so that will probably be all.
Disclaimer: The information in this blog is current as of October 18, 2013. Current policies, offerings, procedures, and programs may differ.