Module Genlex
: sig end
A generic lexical analyzer.
This module implements a simple 'standard' lexical analyzer, presented as a function from character
streams to token streams. It implements roughly the lexical conventions of OCaml, but is parameterized by
the set of keywords of your language.
Example: a lexer suitable for a desk calculator is obtained by
let lexer = make_lexer ["+"; "-"; "*"; "/"; "let"; "="; "("; ")"]
The associated parser would be a function from token stream to, for instance, int, and would have rules
such as:
let rec parse_expr = parser
  | [< n1 = parse_atom; n2 = parse_remainder n1 >] -> n2
and parse_atom = parser
  | [< 'Int n >] -> n
  | [< 'Kwd "("; n = parse_expr; 'Kwd ")" >] -> n
and parse_remainder n1 = parser
  | [< 'Kwd "+"; n2 = parse_expr >] -> n1 + n2
  | [< >] -> n1
Note that the parser keyword and the associated notation for streams are only available through
camlp4 extensions. Sources that use them must therefore be preprocessed, e.g. by using the
"-pp" command-line switch of the compilers.
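For readers without a camlp4 setup, the same grammar can be sketched in plain OCaml over a token list instead of a token stream. This is only an illustration under stated assumptions: the token type is redeclared locally so the block stands alone, and the Parse_error exception and the convention of returning the leftover tokens are choices made for this sketch, not part of Genlex.

```ocaml
(* Local copy of the token type, so this sketch does not depend on Genlex. *)
type token =
  | Kwd of string
  | Ident of string
  | Int of int
  | Float of float
  | String of string
  | Char of char

(* Hypothetical exception for this sketch; the stream-based parser would
   signal failure through the Stream exceptions instead. *)
exception Parse_error of string

(* Each function takes the remaining tokens and returns the parsed value
   together with the tokens it did not consume. *)
let rec parse_expr tokens =
  let n1, rest = parse_atom tokens in
  parse_remainder n1 rest

and parse_atom = function
  | Int n :: rest -> (n, rest)
  | Kwd "(" :: rest ->
    (match parse_expr rest with
     | n, Kwd ")" :: rest' -> (n, rest')
     | _ -> raise (Parse_error "expected )"))
  | _ -> raise (Parse_error "expected an integer or (")

and parse_remainder n1 = function
  | Kwd "+" :: rest ->
    let n2, rest' = parse_expr rest in
    (n1 + n2, rest')
  | rest -> (n1, rest)
```

Applying parse_expr to the tokens of 1 + (2 + 3) yields the sum and an empty list of leftover tokens. Threading the remaining input explicitly plays the role that the stream's internal position plays in the camlp4 version.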
type token =
| Kwd of string
| Ident of string
| Int of int
| Float of float
| String of string
| Char of char
The type of tokens. The lexical classes are: Int and Float for integer and floating-point numbers; String
for string literals, enclosed in double quotes; Char for character literals, enclosed in single quotes;
Ident for identifiers (either sequences of letters, digits, underscores and quotes, or sequences of
'operator characters' such as +, *, etc.); and Kwd for keywords (either identifiers or single 'special
characters' such as (, }, etc.).
val make_lexer : string list -> char Stream.t -> token Stream.t
Construct the lexer function. The first argument is the list of keywords. An identifier s is returned as
Kwd s if s belongs to this list, and as Ident s otherwise. A special character s is returned as Kwd s if
s belongs to this list, and causes a lexical error (exception Stream.Error with the offending lexeme as
its parameter) otherwise. Blanks and newlines are skipped. Comments delimited by (* and *) are skipped
as well, and can be nested. A Stream.Failure exception is raised if end of stream is unexpectedly
reached.
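The classification rules can be observed with a small experiment. The sketch below assumes that the Genlex and Stream modules are available, which holds for OCaml versions before 5.0 (later versions would need the separate camlp-streams library). The names tokens_of_string, tokens and rejects_brace are hypothetical helpers introduced here for illustration.

```ocaml
(* Assumes Genlex and Stream are available: OCaml before 5.0, or the
   camlp-streams library on later versions. *)
let lexer = Genlex.make_lexer ["let"; "="; "+"; "("; ")"]

(* Hypothetical helper: lex a whole string and collect the tokens in order. *)
let tokens_of_string s =
  let acc = ref [] in
  Stream.iter (fun tok -> acc := tok :: !acc) (lexer (Stream.of_string s));
  List.rev !acc

(* "let", "=", "+", "(" and ")" are in the keyword list, so they come back
   as Kwd; "x" is not, so it comes back as Ident. The nested comment is
   skipped entirely. *)
let tokens = tokens_of_string "let x = (1 + 2.5) (* a (* nested *) comment *)"

(* "{" is a special character that is not in the keyword list, so reading
   the first token raises Stream.Error. *)
let rejects_brace =
  match Stream.next (lexer (Stream.of_string "{")) with
  | _ -> false
  | exception Stream.Error _ -> true
```

Because the token stream is produced lazily, the lexical error for an unlisted special character only surfaces when the offending token is actually requested, not when the lexer is applied to the character stream.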
OCamldoc 2022-01-24 Genlex(3o)