Geyacc Declarations
The geyacc declarations subsection of a geyacc grammar defines the symbols used in formulating the grammar.
All token type names (but not single-character literal tokens such as '+' and '*') must be declared. Nonterminal symbols must be declared if the Eiffel type used for the semantic value needs to be specified. The first rule in the file also specifies the start symbol, by default. If you want some other symbol to be the start symbol, you must declare it explicitly.
The basic way to declare a token type name (terminal symbol) is as follows:
%token NAME
Geyacc will convert this into an integer constant feature in the parser, so that the routine read_token (if it is defined in this class) can use the name NAME to stand for this token type's code. Any number of terminal symbols can be specified in the same %token declaration. Use spaces to separate the symbol names.
Alternatively, you can use %left, %right, or %nonassoc instead of %token, if you wish to specify precedence.
You can explicitly specify the numeric code for a token type by writing an integer value immediately after the token name:
%token NUM 300
It is generally best, however, to let geyacc choose the numeric codes for all token types. Geyacc will automatically select codes that don't conflict with each other or with character codes.
When the type of semantic values needs to be specified (that is, when it differs from the default detachable ANY), the syntax of %token may be extended to include the Eiffel type (possibly an anchored type) delimited by angle brackets:
%token <INTEGER> NUM -- Define token NUM of type INTEGER
%token <STRING> NAME -- Define token NAME of type STRING
%token <like token> DIGIT -- Define token DIGIT of type like token
All tokens specified in the same %token declaration will have the same value type.
A literal string token can be associated with a token type name by writing the literal string at the end of a %token declaration which declares the name. For example:
%token LE "<="
%token <OPERATOR> ASSIGN 310 ":="
Once the literal string has been associated with the token name, you can use them interchangeably in the grammar rules. Note that literal strings may contain Unicode characters.
Use the %left, %right or %nonassoc declaration to declare a token and specify its precedence and associativity, all at once. These are called precedence declarations.
The syntax of a precedence declaration is the same as that of %token:
%left SYMBOLS ...
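And indeed any of these declarations serves the purposes of %token. But in addition, they specify the associativity and relative precedence for all the SYMBOLS:
The associativity of an operator op determines how repeated uses of the operator nest: %left groups x op y op z as (x op y) op z, %right groups it as x op (y op z), and %nonassoc makes x op y op z a syntax error.
The precedence of an operator determines how it nests with other operators: tokens declared in the same precedence declaration have equal precedence, and tokens declared in a later precedence declaration have higher precedence than those declared earlier.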
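As a small illustration, the following declarations sketch a conventional arithmetic operator hierarchy. The choice of operators is an assumption made for this example and is not taken from any particular geyacc grammar. It relies on the usual yacc rule that precedence declarations appearing later in the file bind more tightly.

```
%nonassoc '<' '>'   -- lowest precedence; a < b < c is a syntax error
%left '+' '-'       -- a - b - c groups as (a - b) - c
%left '*' '/'       -- binds more tightly than '+' and '-'
%right '^'          -- highest precedence; a ^ b ^ c groups as a ^ (b ^ c)
```

With these declarations, an input such as a + b * c would be expected to group as a + (b * c), because '*' is declared after '+'.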
You can declare the value type of each nonterminal symbol for which values are used (when it's different from the default detachable ANY). This is done with a %type declaration, like this:
%type <TYPE> NONTERMINAL ...
Here NONTERMINAL is the name of a nonterminal symbol, and TYPE is the name of the Eiffel type associated with this symbol. Anchored types such as:
%type <like foo> NONTERMINAL ...
where foo is a feature name, are also accepted. Any number of nonterminal symbols can be specified in the same %type declaration, if they have the same value type. Use spaces to separate the symbol names.
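A short sketch of how %token and %type declarations might sit together is shown here. The nonterminal names expr, term and name, and the feature last_string, are hypothetical and serve only to illustrate the syntax described above.

```
%token <INTEGER> NUM             -- terminal symbol with INTEGER semantic values
%type <INTEGER> expr term        -- two nonterminals sharing the same value type
%type <like last_string> name    -- anchored type; last_string is a hypothetical feature
```

Nonterminals that are not listed in any %type declaration keep the default value type, detachable ANY.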
Geyacc normally warns if there are any conflicts in the grammar, but most real grammars have harmless shift/reduce conflicts which are resolved in a predictable way and would be difficult to eliminate. It is desirable to suppress the warning about these conflicts unless the number of conflicts changes. You can do this with the %expect declaration. The declaration looks like this:
%expect N
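Here N is a decimal integer. The declaration says there should be no warning if there are N shift/reduce conflicts and no reduce/reduce conflicts. The usual warning is given if there are either more or fewer conflicts, or if there are any reduce/reduce conflicts. In general, using %expect involves these steps:
1. Run geyacc on the grammar without %expect and note the number of conflicts it reports.
2. Check each conflict to make sure that geyacc's default resolution is what you really want. If not, rewrite the grammar and go back to the first step.
3. Add an %expect declaration, copying the number N from the number reported by geyacc.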
Now geyacc will stop annoying you about the conflicts you have checked, but it will warn you again if changes in the grammar result in additional conflicts.
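The classic "dangling else" grammar is a convenient way to see %expect in use. The fragment below is a sketch only: the token names IF, THEN, ELSE, COND and OTHER and the nonterminal stmt are invented for this example, and the surrounding Eiffel class text that a complete grammar file needs is left out.

```
%token IF THEN ELSE COND OTHER
%expect 1   -- the single shift/reduce conflict on ELSE has been checked

%%

stmt: IF COND THEN stmt
	| IF COND THEN stmt ELSE stmt
	| OTHER
	;
```

After an inner stmt has been read, the parser can either shift ELSE or reduce the shorter rule, which gives one shift/reduce conflict. Shifting attaches the ELSE to the nearest IF, which is normally the intended behavior, so declaring the conflict as expected is reasonable here.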
Geyacc assumes by default that the start symbol for the grammar is the first nonterminal specified in the grammar specification section. The programmer may override this default with the %start declaration as follows:
%start SYMBOL
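As an illustration, in the hypothetical fragment below the first rule defines item, so item would be the start symbol by default. The %start declaration makes program the start symbol instead. The names ID, item and program are assumptions for this sketch.

```
%token ID
%start program   -- without this line, item would be the start symbol

%%

item: ID
	;

program: item
	| program item
	;
```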
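Here is a summary of all geyacc declarations:
%token: declare a terminal symbol (token type name) with no precedence or associativity specified.
%left: declare a terminal symbol that is left-associative.
%right: declare a terminal symbol that is right-associative.
%nonassoc: declare a terminal symbol that is nonassociative.
%type: declare the type of semantic values for a nonterminal symbol.
%expect: declare the expected number of shift/reduce conflicts.
%start: specify the grammar's start symbol.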
Copyright © 2000-2019, Eric Bezault
mailto:ericb@gobosoft.com
http://www.gobosoft.com
Last Updated: 23 September 2019