Parrot Virtual Machine/Squaak Tutorial/Scope and Subroutines

In Episode 5, we looked at variable declarations and implementing scope. We covered a lot of information then, but did not tell the full story, in order to keep that post short. In this episode we'll address the missing parts, which will also result in implementing subroutines.

Variables
In the previous episode, we entered local variables into the current block's symbol table. As we've seen earlier, using the do-block statement, scopes may nest. Consider this example:

do      var x = 42 do         print(x) end end

In this example, the print statement should print 42, even though x was not declared in the scope where it is referenced. How does the compiler know it's still a local variable? That's simple: it should look in all scopes, starting at the innermost scope. Only when the variable is found in any scope, should its scope be set to "lexical", so that the right instructions are being generated.

The solution I came up with is shown below. Please note that I'm not 100% sure if this is the "best" solution, but my personal understanding of the PAST compiler is limited. So, while this solution works, I may teach you the wrong "habit". Please be aware of this.

method identifier($/) { our @?BLOCK;

my $name := ~$ ; my $scope := 'package'; # default value

# go through all scopes and check if the symbol # is registered as a local. If so, set scope to       # local. for @?BLOCK { if $_.symbol($name) { $scope := 'lexical'; }       }

make PAST::Var.new( :name($name),                           :scope($scope),                            :viviself('Undef'),                            :node($/) ); }

Viviself
You might have noticed the viviself attribute before. This attribute will result in extra instructions that will initialize the variable if it doesn't exist. As you know, global variables spring into life automatically when they're used. Earlier we mentioned that uninitialized variables have a default value of "Undef": the viviself attribute does this. For local variables, we use this mechanism to set the (optional) initialization value. When the identifier is a parameter, the parameter will be initialized automatically if it doesn't receive a value when the subroutine it belongs to is invoked. Effectively this means that all parameters in Squaak are optional!

Subroutines
We already mentioned subroutines before, and introduced the  node type. We also briefly mentioned the  attribute that can be set on a   node, which indicates whether the block is to be executed immediately (for instance, a do-block or if statement) or it represents a declaration (for instance, subroutines). Let us now look at the grammar rule for subroutine definitions:

rule sub_definition { 'sub' *       'end' {*}   }

rule parameters { '(' [ [',' ]* ]? ')'       {*}    }

This is rather straightforward, and the action methods for these rules are quite simple, as you will see. First, however, let's have a look at the rule for sub definitions. Why is the sub body defined as  and not as a  ? Surely, a subroutine defines a new scope, which was already covered by Well, you're right in that. However, as we will see, by the time that a new  node would be created, we are too late! The parameters would already have been parsed, and not entered into the block's symbol table. That's a problem, because parameters are most likely to be used in the subroutine's body, and as they are not registered as local variables (which they are), any usage of parameters would not be compiled down to the right instructions to fetch any parameters.

So, how do we solve this in an efficient way?

The solution is simple. The only place where parameters live, is in the subroutine's body, represented by a  node. Why don't we create the  node in the action method for the   rule. By doing so, the block is already in place and the parameters are registered as local symbols right in time. Let's look at the action methods.

method parameters($/) { our $?BLOCK; our @?BLOCK;

my $past := PAST::Block.new( :blocktype('declaration'),                                    :node($/) );

# now add all parameters to this block for $ { my $param := $( $_ ); $param.scope('parameter'); $past.push($param);

# register the parameter as a local symbol $past.symbol($param.name, :scope('lexical')); }

# now put the block into place on the scope stack $?BLOCK := $past; @?BLOCK.unshift($past);

make $past; }

method sub_definition($/) { our $?BLOCK; our @?BLOCK;

my $past := $( $ ); my $name := $( $ );

# set the sub's name $past.name( $name.name );

# add all statements to the sub's body for $ { $past.push( $( $_ ) ); }

# and remove the block from the scope # stack and restore the current block @?BLOCK.shift; $?BLOCK := @?BLOCK[0];

make $past; }

First, let's check out the parse action for parameters. First, a new  node is created. Then, we iterate over the list of identifiers (which may be empty), each representing a parameter. After retrieving the result object for a parameter (which is just an identifier), we set its scope to "parameter", and we add it to the block object. After that, we register the parameter as a symbol in the block object, specifying the scope as "lexical". Parameters are just a special kind of local variables, and there's no difference in a parameter and a declared local variable in a subroutine, except that a parameter will usually be initialized with a value that is passed when the subroutine is invoked.

After handling the parameters, we set the current block (referred to by our package variable ) to the   node we just created, and push it on the scope stack (referred to by our package variable  ).

After the whole subroutine definition is parsed, the action method  is invoked. This will retrieve the result object for parameters, which is the  node that will represent the sub. After retrieving the result object for the sub's name, we set the name on the block node, and add all statements to the block. After this, we pop off this block node of the scope stack, and restore the current block.

Pretty easy, huh?

Subroutine invocation
Once you defined a subroutine, you'll want to invoke it. In the exercises of Episode 5, we already gave some tips on how to create the PAST nodes for a subroutine invocation. In this section, we'll give a complete description. First we'll introduce the grammar rules.

rule sub_call { {*}   }

Not only allows this to invoke subroutines by their name, you can also store the subroutines in an array or hash field, and invoke them from there. Let's take a look at the action method, which is really quite straightforward.

method sub_call($/) { my $invocant := $( $ ); my $past    := $( $ ); $past.unshift($invocant); make $past; }

method arguments($/) { my $past := PAST::Op.new( :pasttype('call'), :node($/) ); for $ { $past.push( $( $_ ) ); }       make $past; }

The result object of the sub_call method should be a PAST::Op node (of type 'call'), which contains a number of child nodes: the first one is the invocant object, and all remaining children are the arguments to that sub call.

In order to "move" the result objects of the arguments to the  method, we create the PAST::Op node in the method arguments, which is then retrieved by. In, the invocant object is set as the first child (using unshift). This is all too easy, isn't it? :-)

What's Next?
In this episode we finished the implementation of scope in Squaak, and implemented subroutines. Our language is coming along nicely! In the next episode, we'll explore how to implement operators and an operator precedence table for efficient expression parsing.

In the mean time, should you have any problems or questions, don't hesitate to leave a comment!