The Generated Parser

For each parser description file given as input, geyacc generates an Eiffel class as output. The deferred class YY_PARSER, which is part of the Gobo Eiffel Parse Library, provides an abstraction for parsers, and every parser class generated by geyacc is a descendant of this class. The main feature of YY_PARSER is the routine parse which, when called, reads tokens, executes actions and returns when it encounters the end of input or an unrecoverable syntax error. If parsing was successful, parse sets syntax_error to false. Otherwise, if an unrecoverable error is detected, syntax_error is set to true and the error is reported by calling report_error. By default report_error just prints a message on the screen, but it can easily be redefined to suit your needs. Also of interest is the feature error_count, which keeps track of the number of syntax errors (both recovered and fatal) detected during the last parse.
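As a minimal sketch of how a client might drive a generated parser, the class below creates a parser, calls parse, and then inspects syntax_error and error_count. The class name PARSER_DRIVER is hypothetical, and MY_PARSER stands for any geyacc-generated class with an argumentless creation routine make, such as the one shown later in this section. How the scanner obtains its input is left out.

```eiffel
class PARSER_DRIVER
        -- Hypothetical root class, not part of Gobo.
        -- MY_PARSER is assumed to be a class generated by geyacc.

create

    make

feature {NONE} -- Initialization

    make is
            -- Run the parser once and report the outcome.
        local
            a_parser: MY_PARSER
        do
            create a_parser.make
            a_parser.parse
            if a_parser.syntax_error then
                    -- An unrecoverable syntax error stopped the parse.
                print ("Parsing failed%N")
            else
                print ("Parsing succeeded%N")
            end
                -- Counts recovered as well as fatal syntax errors.
            print ("Syntax errors detected: " + a_parser.error_count.out + "%N")
        end

end
```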

The lexical analyzer routine, read_token, recognizes tokens from the input stream and makes them available to the parser in last_token. Geyacc does not provide a default implementation for read_token and last_token, so you must define these two deferred features. In simple programs, read_token is often defined at the end of the geyacc grammar file, in the user code section. If read_token is defined in a separate class, you need to arrange for the token-type integer constant definitions to be available there. To do this, use the option -t when you run geyacc, so that it will write these integer constant definitions along with the feature token_name into a separate class.

The value that read_token returns in last_token must be the numeric code for the type of token it has just found, or 0 for end-of-input. When a token is referred to in the grammar rules by a name, that name in the parser file becomes an integer constant whose value is the proper numeric code for that token type. So read_token can use the name to indicate that type. When a token is referred to in the grammar rules by a character literal, the numeric code for that character is also the code for the token type. So read_token can simply return that character code. The null character must not be used this way, because its code is zero and that is what signifies end-of-input. Here is an example showing these things:

read_token is
        -- Read a token from input stream.
        -- Make result available in last_token.
    local
        c: CHARACTER
    do
        ...
        if c = EOF then
                -- Detect end of file.
            last_token := 0
        elseif c = '+' or c = '-' then
                -- Assume token type for `+' is '+'.
            last_token := c.code
        elseif ... then
                -- Return the type of the token.
            last_token := INT
        else
            ...
        end
    end

This interface has been designed so that the output from the gelex utility can be used without change as the definition of read_token.

YY_PARSER is actually a generic class whose generic parameter specifies the type of the semantic values associated with each token. When scanning the input stream for a new token, read_token must also store the semantic value of the token being read in the feature last_value. As with read_token and last_token, last_value is a deferred feature for which you must provide an implementation.
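The fragment below is one possible hand-written way to effect the three deferred features together, assuming the generic parameter is INTEGER. It is only a sketch: the attributes input and position are hypothetical, input is assumed to be attached to a string before read_token is called, INT is assumed to be a token declared in the grammar, and only single-digit integers are recognized to keep the code short.

```eiffel
feature -- Access

    last_token: INTEGER
            -- Code of the last token read

    last_value: INTEGER
            -- Semantic value of the last token read

    input: STRING
            -- Text being scanned (hypothetical attribute,
            -- assumed non-void when `read_token' is called)

    position: INTEGER
            -- Index in `input' of the last character read
            -- (hypothetical attribute)

feature -- Scanning

    read_token is
            -- Read a token from `input'. Make its code available in
            -- `last_token' and its semantic value in `last_value'.
        local
            c: CHARACTER
        do
            position := position + 1
            if position > input.count then
                    -- End of input.
                last_token := 0
                last_value := 0
            else
                c := input.item (position)
                if c.is_digit then
                        -- Single-digit integer: the digit is the semantic value.
                    last_token := INT
                    last_value := c.code - ('0').code
                else
                        -- Any other character is its own token type.
                    last_token := c.code
                    last_value := 0
                end
            end
        end
```

In practice a scanner generated by gelex would normally supply these features instead.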

Class YY_PARSER is equipped with a procedure make which initializes the parser. This routine should be used as the creation routine in descendant classes. Also available to descendants of YY_PARSER is a set of features which can be called from the semantic actions. An implementation for most of these routines is provided in class YY_PARSER_SKELETON.

Geyacc does not automatically generate the note, class header, formal generics, obsolete, inheritance, creation and invariant clauses. These have to be specified in Eiffel declarations in the first section and in the user code section of the parser description file. The following example shows a typical parser description file:

%{
class MY_PARSER

inherit

    YY_PARSER_SKELETON [INTEGER]
        rename
            make as make_parser_skeleton
        redefine
            report_error
        end

    MY_SCANNER
        rename
            make as make_scanner
        export
            {NONE} all
        end

create

    make
%}
%token ...

%%

...rules...

%%

feature {NONE} -- Initialization

    make is
            -- Create a new parser.
        do
            make_scanner
            make_parser_skeleton
            create error_messages.make
        end

feature -- Access

    error_messages: LINKED_LIST [STRING]
            -- Error messages reported so far

feature {NONE} -- Error reporting

    report_error (a_message: STRING) is
            -- Store error message in error_messages.
        do
            error_messages.extend (a_message)
        end

invariant

    error_messages_not_void: error_messages /= Void

end 

The generated parser class, named MY_PARSER, inherits its lexical analyzer features (read_token, last_token and last_value) from class MY_SCANNER, which has probably been generated using gelex. The routine report_error, inherited from YY_PARSER_SKELETON, has been redefined to keep track of the error messages reported so far. Since the generic parameter of class YY_PARSER_SKELETON is INTEGER, the semantic values of the tokens are integers. Finally, the creation routine make initializes the lexical analyzer and the parser, and creates error_messages so that the invariant holds.
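A client of MY_PARSER could then read the collected messages after parsing. The routine below is a hedged sketch meant to live in some client class; the name parse_and_show_errors is hypothetical, and the list is traversed with the usual LINKED_LIST cursor features.

```eiffel
    parse_and_show_errors (a_parser: MY_PARSER) is
            -- Run `a_parser' and print the error messages it collected.
            -- (Hypothetical client routine, for illustration only.)
        require
            a_parser_not_void: a_parser /= Void
        local
            l_messages: LINKED_LIST [STRING]
        do
            a_parser.parse
            l_messages := a_parser.error_messages
            from
                l_messages.start
            until
                l_messages.after
            loop
                print (l_messages.item)
                print ("%N")
                l_messages.forth
            end
        end
```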


Copyright 1998-2005, Eric Bezault
mailto:ericb@gobosoft.com
http://www.gobosoft.com
Last Updated: 16 February 2005
