Options

Gelex provides an option mechanism for controlling how the scanner is generated. Options are specified either in the declarations section of the gelex input file or on the command line.

Declaration Options

Options are specified in the declarations section using unindented lines beginning with %option followed by a whitespace-separated list of options. More than one %option directive can be used if necessary. Most options are given simply as names, optionally preceded by the word no (with no intervening whitespace) to negate their meaning. The following is an excerpt from a scanner description file:

%option ecs meta-ecs case-insensitive nodefault
%option nowarn outfile="my_scanner.e"
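The syntax rules above can be illustrated with a minimal Python sketch. It is not gelex's own parser: the function name parse_option_lines is hypothetical, and the sketch assumes that no option name itself begins with "no".

```python
import shlex


def parse_option_lines(lines):
    """Collect settings from unindented %option lines.

    Hypothetical helper for illustration only; not part of gelex.
    Assumes no option name itself starts with "no".
    """
    settings = {}
    for line in lines:
        if not line.startswith("%option"):
            continue  # indented or unrelated lines are ignored
        # shlex honours the double quotes around values such as file names
        for word in shlex.split(line[len("%option"):]):
            if "=" in word:
                name, value = word.split("=", 1)
                settings[name] = value
            elif word.startswith("no"):
                settings[word[2:]] = False  # "nowarn" negates "warn"
            else:
                settings[word] = True
    return settings


spec = [
    '%option ecs meta-ecs case-insensitive nodefault',
    '%option nowarn outfile="my_scanner.e"',
]
print(parse_option_lines(spec))
```

For the excerpt above, the sketch records default and warn as switched off, the three named options as switched on, and outfile as my_scanner.e.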

The gelex options have the following meanings:

backup
Generate backing-up information to standard output. This is a list of scanner states which require backing up and the input characters on which they do so. By adding rules one can remove backing-up states. If all backing-up states are eliminated and the full option is used, the generated scanner will run faster. Only users who wish to squeeze every last CPU cycle out of their scanners need worry about this option. [default: nobackup]
case-insensitive
Generate a case-insensitive scanner. The case of ASCII letters (letters a to z) given in the gelex input patterns will be ignored, and tokens in the generated scanner input will be matched regardless of case. The matched text given in text preserves its original case. [default: case-sensitive]
nodefault
Cause the default rule (that unmatched scanner input is echoed to the standard output) to be suppressed. If the scanner encounters input that does not match any of its rules, it aborts with an error. This option is useful for finding holes in a scanner's rule set. [default: default]
ecs
Direct gelex to construct equivalence classes, i.e. sets of characters which have identical lexical properties (for example, if the only appearance of digits in the gelex input is in the character class [0-9] then the digits '0', '1', ..., '9' will all be put in the same equivalence class). Equivalence classes usually give dramatic reductions in the final table/object file sizes (typically a factor of 2-5) and are pretty cheap performance-wise (one array look-up per character scanned). [default: ecs]
full
Specify that the full scanner tables should be generated: gelex should not compress the tables by taking advantage of similar transition functions for different states. The result is large but fast. [default: nofull]
line
Direct gelex to generate code for line and column counting. [default: noline]
meta-ecs
Direct gelex to construct meta-equivalence classes, which are sets of equivalence classes (or characters, if equivalence classes are not being used) that are commonly used together. Meta-equivalence classes are often a big win when using compressed tables, but they have a moderate performance impact (one or two "if" tests and one array look-up per character scanned). This option does not make sense together with option full since there is no opportunity for meta-equivalence classes if the table is not being compressed. [default: meta-ecs]
outfile="filename"
Direct gelex to write the scanner class to the file filename instead of the standard output.
position
Direct gelex to generate code for position counting (i.e. the number of characters read since the beginning of the input source). [default: noposition]
post-action
Specify that the feature post_action should be called after each semantic action. [default: nopost-action]
post-eof-action
Specify that the feature post_eof_action should be called after each end-of-file (i.e. <<EOF>>) semantic action. [default: nopost-eof-action]
pre-action
Specify that the feature pre_action should be called before each semantic action. [default: nopre-action]
pre-eof-action
Specify that the feature pre_eof_action should be called before each end-of-file (i.e. <<EOF>>) semantic action. [default: nopre-eof-action]
reject
Specify that the feature reject is used in semantic actions. [default: noreject]
Unicode characters appearing in patterns other than (b:r) are internally converted to their UTF-8 sequence of bytes. Character classes are converted accordingly to the equivalent patterns. This option is useful when the file to be scanned uses the UTF-8 encoding. However, it produces larger scanners (size of generated tables and size of the resulting executable), with no visible speed improvement (when measured with the Eiffel parser of gec scanning 20,000 Eiffel class text files) compared to the default mode, which is also capable of scanning Unicode characters.
nowarn
Suppress warning messages. [default: warn]
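The idea behind equivalence classes can be shown with a simplified Python sketch. It is an assumption-laden illustration rather than the algorithm gelex uses: the function equivalence_classes is hypothetical, and it simply groups characters that are members of exactly the same character sets.

```python
def equivalence_classes(char_sets, alphabet):
    """Group characters that belong to exactly the same character sets.

    Hypothetical illustration; not the algorithm used by gelex.
    """
    groups = {}
    for ch in alphabet:
        # Two characters with the same membership signature are
        # indistinguishable to every pattern, so they share a class.
        signature = tuple(ch in s for s in char_sets)
        groups.setdefault(signature, []).append(ch)
    return list(groups.values())


digits = set("0123456789")
letters = set("abcdefghijklmnopqrstuvwxyz")
print(equivalence_classes([digits, letters], "ab01+2"))
```

In this sketch the digits end up in one class, the letters in another, and '+' in a third, so a transition table would need three columns instead of six.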

Command-line Options

Most of these options can also be specified on the command line. The following is the gelex command-line usage message:

gelex [--version][--help][-bcefhimsVwxz?][-a size][--array-size=size][--backup]
      [--outfile=filename][-o filename][--pragma=[no]line][--nodefault][--nowarn] filename

The command-line options have the following meanings:

--version
Print the gelex version number to the standard output and exit.
--help
Print the gelex usage message to the standard output and exit.
-a size, --array-size=size
Some Eiffel compilers have difficulty processing large manifest arrays. This option directs gelex to split manifest arrays with more than size elements into several smaller arrays. This option can be disabled by setting size to 0. [default: 200]
--backup
Equivalent to backup.
Equivalent to nofull.
Equivalent to ecs.
Equivalent to full.
Equivalent to case-insensitive.
The generated code uses an inspect instruction to find out which action to execute. This is the default. Otherwise, a binary-search implemented with if instructions is used.
Equivalent to meta-ecs.
-o filename, --outfile=filename
Equivalent to outfile="filename".
--pragma=[no]line
The code generated for the semantic actions contains (or omits) an indication of the line number where each action originally appeared in the input description file.
--nodefault
Equivalent to nodefault.
--nowarn
Equivalent to nowarn.
Write each semantic action into a separate routine. The default is to write all actions into the same routine, which can become too large for C back-end compilers to handle.
--
Mark the end of the options. Useful when the scanner description filename begins with the character '-'.
filename
Name of the gelex input file containing the scanner description. By convention, gelex input filenames have the extension '.l', as in my_scanner.l. Following this convention is not required, though.
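The effect of the array-size option described above can be sketched in Python as follows. The function split_array is hypothetical, and the sketch assumes that each resulting piece holds at most size elements; how gelex actually lays out the smaller arrays is not stated here.

```python
def split_array(items, size=200):
    """Split a list into pieces of at most `size` elements.

    Hypothetical illustration; a size of 0 disables splitting,
    mirroring the documented behaviour of the option.
    """
    if size <= 0 or len(items) <= size:
        return [list(items)]
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


table = list(range(450))
pieces = split_array(table)
print([len(piece) for piece in pieces])
```

With the default size of 200, a 450-element table is cut into pieces of 200, 200 and 50 elements in this sketch.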

Note that options specified on the command line override those specified in the %option directives of the input file.
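This precedence can be pictured as three layers, sketched below in Python. The function effective_options and the dictionary layout are hypothetical; the default values are taken from the option descriptions on this page.

```python
# Defaults as documented for a few of the options (illustrative subset).
DEFAULTS = {"ecs": True, "meta-ecs": True, "full": False, "warn": True}


def effective_options(file_options, command_line_options):
    """Layer the settings: defaults, then %option directives, then command line.

    Hypothetical illustration of the precedence rule; not gelex code.
    """
    merged = dict(DEFAULTS)
    merged.update(file_options)          # %option directives override defaults
    merged.update(command_line_options)  # the command line overrides both
    return merged


from_file = {"warn": False, "outfile": "my_scanner.e"}
from_command_line = {"full": True, "outfile": "other_scanner.e"}
print(effective_options(from_file, from_command_line))
```

Here outfile is set in both places, and the value given on the command line is the one that survives.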

Copyright 2000-2019, Eric Bezault
Last Updated: 26 September 2019