sid should be invoked in the following manner:
sid options input-files output-files
The options are described later. The number of input files depends on the output language. The first input file is the sid grammar file. For either C dialect there should be one other input file that provides C-specific information to sid. The number of output files is also language specific. At present, two output files should be specified for the C languages. The first should be a .c file into which the parser is written; the second should be a .h file into which the terminal definitions and external function declarations are written.
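The argument order for the C languages can be sketched in Python as follows. This is only an illustration of the layout described above: the helper name build_sid_command and the file names parser.sid, parser.act, parser.c and parser.h are hypothetical and not part of sid.

```python
from typing import List, Sequence


def build_sid_command(grammar: str, c_info: str, out_c: str, out_h: str,
                      options: Sequence[str] = ()) -> List[str]:
    """Assemble an argument list of the form:
    sid options input-files output-files
    (hypothetical helper; it only arranges the arguments)."""
    # Output files: the .c parser first, then the .h declarations file.
    if not out_c.endswith(".c") or not out_h.endswith(".h"):
        raise ValueError("expected a .c file followed by a .h file")
    # Input files: the grammar first, then the C-specific information file.
    inputs = [grammar, c_info]
    outputs = [out_c, out_h]
    return ["sid", *options, *inputs, *outputs]


# All file names below are placeholders chosen for the example.
command = build_sid_command("parser.sid", "parser.act", "parser.c", "parser.h",
                            options=["--inline", "noall,basics,tail"])
print(" ".join(command))
```

If sid is installed, a list built this way could be passed to subprocess.run; the sketch itself only prints the command line.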
The options list should consist of zero or more of the following options. There are short forms for each of these options as well; see the sid manual page for more information on invocation.
--dump-file FILE
This option causes intermediate dumps of the grammar to be written to the named file. The format of the dump files is similar to the format of the grammar specification, with the following exceptions:
Predicates are written with the predicate result replaced by the predicate identifier (this will always be zero), and the result is followed by a ? to indicate that it was a predicate. As an example, the predicate:
( b, ? ) = <pred> ( a )
would be printed out as:
( b : Type1T, 0 : Type2T ) ? = <pred> ( a : Type3T )
Items that are considered to be inlinable are prefixed by a +. Items that are tail calls which will be eliminated are prefixed by a *.
Nested rules are written at the outer level, with names of the form outer-rule::....::inner-rule.
Types are provided on call parameter and result tuples.
Inline rules are given a generated name, and are written out as a call to the generated rule (and a definition elsewhere).
--factor-limit LIMIT
This option limits the number of rules that can be created during the factorisation process. It is probably best not to change this.
--help
This option writes a summary of the command line options to the standard error stream.
--inline INLINES
This option controls what inlining will be done in the output parser. The INLINES argument should be a comma-separated list of the following words:
SINGLES
This causes single alternative rules to be inlined. This inlining is no longer performed as a modification to the grammar (it was in version 1.0).
BASICS
This causes rules that contain only basics (and no exception handlers or empty alternatives) to be inlined. The restriction on exception handlers and empty alternatives is rather arbitrary, and may be changed later.
TAIL
This causes tail recursive calls to be inlined. Without this, tail recursion elimination will not be performed.
OTHER
This causes other calls to be inlined wherever possible. Unless the MULTI inlining is also specified, this will be done only for productions that are called once.
MULTI
This causes calls to be inlined, even if the rule being called is called more than once. Turning this inlining on implies OTHER. Similarly, turning off OTHER inlining will turn off MULTI inlining. For grammars of any size, this is probably best avoided; if used, the generated parser may be huge (e.g. a C grammar has produced a file that was several hundred MB in size).
ALL
This turns on all inlining.
In addition, prefixing a word with NO turns off that inlining phase. The words may be given in any case. They are evaluated in the order given, so:
--inline noall,singles
would turn on single alternative rule inlining only, whilst:
--inline singles,noall
would turn off all inlining. The default is as if sid were invoked with the option:
--inline noall,basics,tail
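The evaluation rules for the inlining words can be modelled with a short Python sketch. It is a model of the behaviour described above, not sid's implementation; the function name evaluate_inlines is hypothetical, and the sketch assumes that a user-supplied list is applied on top of the default state.

```python
PHASES = ("singles", "basics", "tail", "other", "multi")
DEFAULT_SPEC = "noall,basics,tail"


def evaluate_inlines(spec, initial=()):
    """Return the set of inlining phases enabled after applying spec,
    a comma-separated list of words, from left to right."""
    enabled = set(initial)
    for raw in spec.split(","):
        word = raw.strip().lower()  # words may be given in any case
        if word in PHASES or word == "all":
            name, turn_on = word, True
        elif word.startswith("no") and (word[2:] in PHASES or word[2:] == "all"):
            name, turn_on = word[2:], False  # a NO prefix turns the phase off
        else:
            raise ValueError(f"unknown inlining word: {raw!r}")
        if name == "all":
            enabled = set(PHASES) if turn_on else set()
        elif turn_on:
            enabled.add(name)
            if name == "multi":
                enabled.add("other")  # turning MULTI on implies OTHER
        else:
            enabled.discard(name)
            if name == "other":
                enabled.discard("multi")  # turning OTHER off turns MULTI off
    return enabled


# Assumption: explicit words are applied on top of the default state.
defaults = evaluate_inlines(DEFAULT_SPEC)
print(sorted(defaults))                                    # ['basics', 'tail']
print(sorted(evaluate_inlines("noall,singles", defaults)))  # ['singles']
print(sorted(evaluate_inlines("singles,noall", defaults)))  # []
```

Because the words are applied in order, the position of noall decides whether it clears earlier words or is itself followed by words that re-enable phases.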
--language LANGUAGE
This option specifies the output language. Currently this should be one of ansi-c, pre-ansi-c and test. The default is ansi-c.
The ansi-c and pre-ansi-c languages are basically the same. The only difference is that ansi-c uses function prototypes by default, and pre-ansi-c does not. The C language specific options are:
prototypes, proto, no-prototypes, no-proto
These enable or disable the use of function prototypes. By default this is enabled for ansi-c and disabled for pre-ansi-c.
numeric-ids, numeric, no-numeric-ids, no-numeric
These enable or disable the use of numeric identifiers. Numeric identifiers replace the identifier name with a number, which is mainly of use in stopping identifier names getting too long. The disadvantage is that the code becomes less readable, and more difficult to debug. Numeric identifiers are not used by default.
casts, cast, no-casts, no-cast
These enable or disable casting of action and assignment operator immutable parameters. If enabled, a parameter is cast to its own type when it is substituted into the action. This will cause some compilers to complain about attempts to modify the parameter (which can help pick out attempts at mutating parameters that should not be mutated). The disadvantage is that not all compilers will reject attempts at mutation, and that ISO C doesn't allow casting to structure and union types, which means that some code may be illegal. Parameter casting is disabled by default.
unreachable-macros, unreachable-macro, unreachable-comments, unreachable-comment
These choose whether unreachable code is marked by a macro or a comment. The default is to mark unreachable code with a comment /*UNREACHED*/; however, a macro UNREACHED ; may be used instead, if desired.
The test language only takes one input file, and produces no output file. It may be used to check that a grammar is valid. In conjunction with the dump file, it may be used to check the transformations that would be applied to the grammar. There are no language specific options for the test language.
--show-errors
This option writes a copy of the current error messages to the standard output. See the manual entry for more details about changing the error message content.
--switch OPTION
This passes through OPTION as a language specific option. The valid options are described above.
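The effect of a series of language specific options for the C languages can be summarised with a small Python model. It is a sketch built only from the defaults and option names listed above; the function name c_language_settings and the dictionary keys are hypothetical, and each entry in switches stands for one --switch OPTION.

```python
def c_language_settings(language, switches=()):
    """Model the C language specific settings after applying switches in order."""
    if language not in ("ansi-c", "pre-ansi-c"):
        raise ValueError(f"no C language specific options for {language!r}")
    settings = {
        "prototypes": language == "ansi-c",  # on for ansi-c, off for pre-ansi-c
        "numeric_ids": False,                # numeric identifiers are off by default
        "casts": False,                      # parameter casting is off by default
        "unreachable": "comment",            # /*UNREACHED*/ comment by default
    }
    # Both spellings of each on/off option map to the same setting.
    toggles = {
        "prototypes": "prototypes", "proto": "prototypes",
        "numeric-ids": "numeric_ids", "numeric": "numeric_ids",
        "casts": "casts", "cast": "casts",
    }
    for option in switches:
        if option in toggles:
            settings[toggles[option]] = True
        elif option.startswith("no-") and option[3:] in toggles:
            settings[toggles[option[3:]]] = False
        elif option in ("unreachable-macros", "unreachable-macro"):
            settings["unreachable"] = "macro"
        elif option in ("unreachable-comments", "unreachable-comment"):
            settings["unreachable"] = "comment"
        else:
            raise ValueError(f"unknown language specific option: {option!r}")
    return settings


print(c_language_settings("ansi-c"))
print(c_language_settings("pre-ansi-c", ["prototypes", "unreachable-macro"]))
```

The second call models a pre-ansi-c run in which prototypes are switched on and unreachable code is marked with the macro.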
--tab-width NUMBER
This option specifies the number of spaces that a tab occupies. It defaults to 8. It is only used when indenting output.
--version
This option causes the version number and supported languages to be written to the standard error stream.