Raku Programming/Grammars

Grammars
Regular expressions by themselves are useful but limited. It can be difficult to reuse regexes, difficult to group them into logical bunches, and very difficult to inherit regexes from one bunch to another. This is where grammars come in. Grammars are to regexes what classes are to data and code routines. Grammars allow regexes to act like normal first-class components of the programming language and make use of the cool features of the class system. Grammars can be inherited and overloaded like classes. In fact, the Raku grammar itself can be modified to add new features to the language on the fly. We will see examples of that later.

Rules, Tokens and Protos
Grammars are broken into components called rules, tokens and protos. Tokens are like the regexes we've already seen. Rules are like subroutines because they can call other rules or tokens. Protos are like default multisubs: they define a rule prototype that can be overridden.

Tokens
Tokens are regexes that don't backtrack, meaning that once a portion of the expression has been matched, that portion will not be altered even if it prevents a larger portion of the expression from matching. While this sacrifices some of the flexibility of regexes, it allows more complex parsers to be created efficiently.
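The difference can be seen in a minimal sketch that declares the same pattern once as a regex and once as a token. The grammar names Backtracking and Ratcheting are hypothetical and chosen only for this illustration.

```raku
# Two hypothetical grammars that differ only in the declarator used for TOP.
grammar Backtracking { regex TOP { \w+ 'b' } }
grammar Ratcheting   { token TOP { \w+ 'b' } }

# regex: \w+ first grabs "aaab", then gives back the final "b" so 'b' can match.
say so Backtracking.parse('aaab');   # True

# token: \w+ keeps all of "aaab", so the literal 'b' has nothing left to match.
say so Ratcheting.parse('aaab');     # False
```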

Rules
Rules are ways to combine tokens and other rules together. Rules are all given names, and can refer to other rules or tokens in the same grammar using < > angle brackets. Like tokens, they do not backtrack, but whitespace within them is significant instead of being ignored: whitespace after an atom in a rule stands for an implicit call to the whitespace rule <.ws>.
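A minimal sketch of a rule that calls two other named components might look like the following. The grammar name SimpleURL and the deliberately loose definitions of protocol and address are assumptions made so that the example runs on its own.

```raku
grammar SimpleURL {
    # The atoms are written without spaces between them, so the input
    # may not contain whitespace at those points either.
    rule TOP { <protocol>'://'<address> }

    # Placeholder sub-rules (assumptions) so that the sketch runs.
    token protocol { \w+ }
    token address  { \S+ }
}

say so SimpleURL.parse('https://example.org');    # True
say so SimpleURL.parse('https ://example.org');   # False
```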

A rule such as <protocol>'://'<address> matches a URL string where a protocol name such as "ftp" or "https" is followed by the literal symbol "://" and then a string representing an address. This rule depends on two sub-rules, <protocol> and <address>. These could be defined as either tokens or rules, so long as they are in the same grammar.
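One way the complete grammar could be written is sketched below. The exact definitions of address and label are assumptions for illustration and are much simpler than what real URLs allow.

```raku
grammar URL {
    rule  TOP      { <protocol>'://'<address> }
    token protocol { 'http' | 'https' | 'ftp' | 'file' }

    # <address> is itself built from a smaller token:
    # one or more labels separated by dots.
    token address  { <label>+ % '.' }
    token label    { <[\w\-]>+ }
}

my $match = URL.parse('ftp://ftp.example.org');
say ~$match<protocol>;   # ftp
say ~$match<address>;    # ftp.example.org
```

Each named component that takes part in the match becomes a named capture, which is why the protocol and address can be looked up by name afterwards.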

Protos
Protos define a type of rule or token. For example, we could define a proto-token <protocol> and then define several tokens representing different protocols. Within one of these tokens, we can refer to its name as <sym>.
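A sketch of this arrangement, assuming a hypothetical grammar named ProtoURL with a simplified address token, could look like this:

```raku
grammar ProtoURL {
    rule TOP { <protocol>'://'<address> }

    # The proto declares the category; each :sym<...> token is one member.
    proto token protocol {*}
    token protocol:sym<http>  { <sym> }
    token protocol:sym<https> { <sym> }
    token protocol:sym<ftp>   { <sym> }
    token protocol:sym<file>  { <sym> }

    # Simplified placeholder (assumption) for the address part.
    token address { \S+ }
}

my $match = ProtoURL.parse('https://example.org');
say ~$match<protocol>;   # https
```

Inside each candidate, <sym> matches the literal text given between the angle brackets of :sym<...>, so the name only has to be written once.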

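This would be equivalent to declaring a single token, token protocol { 'http' | 'https' | 'ftp' | 'file' }, but is more extensible, allowing new types of protocol to be specified later. For example, if we wanted to define a new grammar which also supported the "spdy" protocol, we could derive it from the existing grammar and add a single protocol:sym<spdy> token to it.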
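The following sketch shows such an extension. The names BaseURL and SpdyURL are hypothetical, and the base grammar is repeated in shortened form so that the example is self-contained.

```raku
grammar BaseURL {
    rule TOP { <protocol>'://'<address> }
    proto token protocol {*}
    token protocol:sym<http>  { <sym> }
    token protocol:sym<https> { <sym> }
    token address { \S+ }   # simplified placeholder (assumption)
}

# The derived grammar inherits everything and adds one more candidate.
grammar SpdyURL is BaseURL {
    token protocol:sym<spdy> { <sym> }
}

say so BaseURL.parse('spdy://example.org');    # False
say so SpdyURL.parse('spdy://example.org');    # True
say so SpdyURL.parse('https://example.org');   # True
```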

Matching Grammars
Once we have a grammar like the one defined above, we can match it against a string with the .parse method.
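As a minimal sketch, assuming a shortened version of the URL grammar with a placeholder address token, a call to .parse could look like this:

```raku
grammar URL {
    rule  TOP      { <protocol>'://'<address> }
    token protocol { 'http' | 'https' | 'ftp' | 'file' }
    token address  { \S+ }   # simplified placeholder (assumption)
}

my $input  = 'https://example.org';
my $result = URL.parse($input);

if $result {
    say 'Matched: ' ~ $result;   # Matched: https://example.org
}
else {
    say 'No match';
}

# .parse starts at the TOP rule and must consume the whole string.
say so URL.parse('https://example.org and more');   # False
```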

Match Objects
A match object is a special data type that represents the parse state of a grammar. The current match object is stored in the special variable $/.
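The sketch below inspects a match object returned by .parse and then reads the same captures through $/. The grammar is a shortened, assumed version of the URL example.

```raku
grammar URL {
    rule  TOP      { <protocol>'://'<address> }
    token protocol { 'http' | 'https' | 'ftp' | 'file' }
    token address  { \S+ }   # simplified placeholder (assumption)
}

my $match = URL.parse('https://example.org');

say ~$match;                 # https://example.org
say ~$match<protocol>;       # https
say $match<address>.from;    # 8

# .parse also stores its result in $/, and $<name> is short for $/<name>.
say ~$/<address>;            # example.org
say ~$<protocol>;            # https
```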

Parser Actions
A grammar can be turned into an interactive parser by combining it with a class of parser actions. As the grammar matches a rule or token, the action method with the same name is called with the current match object. The actions are supplied to .parse through its actions named argument.
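A minimal sketch of an actions class follows. The class name URLActions, the keys of the resulting hash and the simplified grammar are all assumptions for illustration. Each method uses make to attach a value to its match, and .made retrieves that value later.

```raku
grammar URL {
    rule  TOP      { <protocol>'://'<address> }
    token protocol { 'http' | 'https' | 'ftp' | 'file' }
    token address  { \S+ }   # simplified placeholder (assumption)
}

class URLActions {
    # Called after <protocol> matches: store an upper-cased copy.
    method protocol($/) { make $/.Str.uc }

    # Called after <address> matches: store the dot-separated labels.
    method address($/)  { make $/.Str.split('.').List }

    # Called last: combine the values made by the sub-rules.
    method TOP($/) {
        make %( protocol => $<protocol>.made, labels => $<address>.made );
    }
}

my $match = URL.parse('https://www.example.org', actions => URLActions);
my $url   = $match.made;

say $url<protocol>;             # HTTPS
say $url<labels>.join(' | ');   # www | example | org
```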