Grammar Rules |

The grammar rules determine the syntax of a language. A rule has the following general form:

RESULT: COMPONENTS... ;

where `RESULT` is the
nonterminal symbol that this rule describes and `COMPONENTS` are various terminal
and nonterminal symbols that are put
together by this rule. For example,

exp: exp '+' exp ;

says that two groupings of type `exp`,
with a `+` token in between,
can be combined into a larger grouping of type `exp`. Whitespace in rules is
significant only to separate symbols. You can add extra
whitespace as you wish.

Scattered among the components can be actions that determine the semantics of the rule. An action looks like this:

{Eiffel Instructions}

Usually there is only one action and it follows the components.

Multiple rules for the same `RESULT`
can be written separately or can be joined with the vertical-bar
character `|` as follows:

RESULT: RULE1-COMPONENTS... | RULE2-COMPONENTS... ... ;

They are still considered distinct rules even when joined in this way.

In order to avoid mistakes such as giving the same `RESULT` name to two unrelated
rules in the grammar, *geyacc* generates a warning
whenever rules for the same `RESULT`
have not been joined.

If `COMPONENTS` in a rule
is empty, it means that `RESULT`
can match the empty string. For example, here is how to define a
comma-separated sequence of zero or more `exp`
groupings:

expseq: -- Empty | expseq1 ; expseq1: exp | expseq1 ',' exp ;

It is customary to write a comment`
-- Empty `in each rule with no components.

A rule is called "recursive" when its RESULT
nonterminal appears also on its right hand side. Nearly all *geyacc*
grammars need to use recursion, because that is the only way to
define a sequence of any number of something. Consider this
recursive definition of a comma-separated sequence of one or more
expressions:

expseq1: exp | expseq1 ',' exp ;

Since the recursive use of `expseq1`
is the leftmost symbol in the right hand side, we call this * left
recursion*. By contrast, here the same construct is
defined using

expseq1: exp | exp ',' expseq1 ;

Any kind of sequence can be defined using either left
recursion or right recursion, but **you should always use
left recursion**, because it can parse a sequence of any
number of elements with bounded stack space. Right recursion uses
up space on the *geyacc* stack in proportion to the number
of elements in the sequence, because all the elements must be
shifted onto the stack before the rule can be applied even once.
See the parser algorithm for further
explanation of this.

* Indirect* or

expr: primary | primary '+' primary ; primary: constant | '(' expr ')' ;

defines two mutually-recursive nonterminals, since each refers to the other.

Copyright © 1998-2005, Eric
Bezault mailto:ericb@gobosoft.comhttp://www.gobosoft.comLast Updated: 22 February 2005 |