Additional reserved symbol names
Marpa::XS reserved, for its internal use, all symbol names ending with the right square bracket ("]").
In addition, Marpa::R2 reserves symbols ending with the right parenthesis (")"), the right angle
bracket (">"), and the right curly bracket ("}"). Any other valid Perl string remains an acceptable
symbol name.
The return value of the read() method has changed
The return value of the Marpa::R2 recognizer's read() method differs from its Marpa::XS equivalent. In
Marpa::XS it returned the number of distinct terminals (by symbol ID) allowed in the next read(). In
Marpa::R2 it returns the number of recognizer events that occurred during the read. Examples of
recognizer events are exhaustion, the Earley sets exceeding a designated "warning" level, and other
circumstances settable by the user. For more detail, see the documentation of the recognizer's "read"
method.
Rule LHS's are no longer a source of action names
In Marpa::XS, if there was no explicit action name for a rule, Marpa would try to find a closure that had
the same name as the rule's LHS. The use of rule LHS's as action names had a potential for unpleasant
surprises. A surprise could occur if the rule's LHS coincided with a function name without the
programmer realizing or intending it. This kind of 'action at a distance' bug can be very hard to
detect and trace.
It was originally thought that implicitly using the LHS as the name of an action would be convenient
enough to outweigh the dangers. But in fact, this feature wound up being little used. And accidental
resolution via a rule LHS was a danger for all users, whether they used the feature or not. For these
reasons, as well as potential optimization and efficiency considerations, Marpa::R2 no longer does
implicit action resolution using a rule LHS.
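With implicit LHS resolution gone, a rule that needs semantics names its action explicitly. The
following is a minimal sketch, assuming the NAIF interface (Marpa::R2::Grammar and
Marpa::R2::Recognizer). The package My_Actions, the closure do_sum, and the symbol names are
hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;
use Marpa::R2;

{
    # Hypothetical semantics package for this example.
    package My_Actions;

    # The first argument is the per-parse variable; the rest are child values.
    sub do_sum {
        my ( undef, $left, undef, $right ) = @_;
        return $left + $right;
    }
}

my $grammar = Marpa::R2::Grammar->new(
    {   start   => 'Sum',
        actions => 'My_Actions',
        rules   => [
            # The action is named explicitly. A closure that merely shared
            # the name of the LHS ("Sum") would not be picked up.
            {   lhs    => 'Sum',
                rhs    => [qw(Number Plus Number)],
                action => 'do_sum',
            },
        ],
    }
);
$grammar->precompute();

my $recce = Marpa::R2::Recognizer->new( { grammar => $grammar } );
$recce->read( 'Number', 2 );
$recce->read( 'Plus',   '+' );
$recce->read( 'Number', 3 );

# value() returns a reference to the parse result, or undef if there is none.
my $value_ref = $recce->value();
my $sum = defined $value_ref ? ${$value_ref} : undef;
print "sum = $sum\n" if defined $sum;
```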
Different rules with the same rank now appear in arbitrary order
In ranking parse trees, if two rule instances are for different rules but have the same rule rank, they
will now appear in arbitrary order. This is probably the behavior that programmers have always expected.
In Marpa::XS, when the "null_ranking" named argument of rules was in use for one of the rules, specific
guarantees were made for the order in some of the cases. The intent was to be orthogonal with the
guarantees made for the ranking of null variants within the same rule. These additional guarantees
proved useless in practice, cumbersome to implement, and, when documented, opaque and unintuitive. In
Marpa::R2 they have been dropped.
Null actions now come from the rules
In Marpa::XS null actions were specified by symbol. This created a dual semantics -- one for non-nulled
rules, and another for nulled rules. The conventions and behaviors of the two semantics were quite
dissimilar. The rules for their coordination were complicated, and a programmer expecting one
semantics could be surprised by a result from the other.
In Marpa::R2 the semantics of nulled rules is the same as that of non-nulled rules, and the semantics of
nulled symbols comes from the semantics of the nulled rules. This requires rule evaluation closures to
be aware they might be called for nulled rules. But it greatly simplifies the semantics conceptually.
For more detail, see Marpa::R2::Deprecated::NAIF::Semantics::Null.
Actions can now be constants
If an action name resolves to a constant, that constant is the action. The effect is the same as if the
action name resolved to a function that returned that constant, except that it is more efficient.
Perl cannot reliably distinguish between non-existent symbols and symbols whose value is "undef", so
constants whose value is "undef" are not allowed. The "::undef" reserved action name can be used
instead.
Action names beginning with "::" are reserved
Action names which start with "::" are reserved. "::undef" is a safe way of specifying a constant whose
value is "undef". Use of a reserved name which has not yet been defined causes an exception to be
thrown.
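As an illustration of the reserved "::undef" action name, here is a minimal sketch, again assuming the
NAIF interface. The grammar and its symbol names are hypothetical.

```perl
use strict;
use warnings;
use Marpa::R2;

my $grammar = Marpa::R2::Grammar->new(
    {   start => 'Top',
        rules => [
            # "::undef" asks for a constant undef as the value of this rule,
            # so no semantics package or closure is needed here.
            { lhs => 'Top', rhs => [qw(Item)], action => '::undef' },
        ],
    }
);
$grammar->precompute();

my $recce = Marpa::R2::Recognizer->new( { grammar => $grammar } );
$recce->read( 'Item', 'ignored' );

# A successful parse yields a defined reference, even though the
# value it points to is undef.
my $value_ref    = $recce->value();
my $parse_ok     = defined $value_ref;
my $value_is_set = $parse_ok && defined ${$value_ref};
print 'parse ok: ', ( $parse_ok ? 'yes' : 'no' ), "\n";
```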
The "default_null_value" named argument for grammars has been removed
Symbols no longer have null values, so the "default_null_value" named argument of grammars has been
removed.
The "null_value" symbol property has been removed
Symbols no longer have null values. Use of the "null_value" symbol property now causes an exception.
The token value argument of read() has changed
The Marpa::R2 recognizer's read() method differs from its Marpa::XS equivalent. In Marpa::R2, if
read()'s token value argument is omitted, then the value of the token will be a Perl "undef". If
read()'s token value is given explicitly, then that explicit value will be the value of the token. In
particular, an explicit "undef" token value argument will behave differently from an omitted token value
argument. For details, see the documentation of the recognizer's "read" method.
The token value argument of alternative() has changed
The Marpa::R2 recognizer's alternative() method differs from its Marpa::XS equivalent. Its token value
argument must now be a reference to the token value, not the token value itself, as in Marpa::XS. If
alternative()'s token value argument is omitted or a Perl "undef", then the value of the token will be a
Perl "undef". If alternative()'s token value argument is a reference to "undef", then the value of the
token is a Perl "undef". For details, see the documentation of the "alternative" method.
Marpa::R2::Recognizer::value() does not accept named arguments
In the Marpa::XS recognizer, the new(), set() and value() methods all accepted named arguments. As of
Marpa::R2, the value() method will no longer do so.
Allowing named arguments for the value() method was a holdover from a previous interface, and it also
seemed like it might be a convenience. But it was even more important that the value() method be
convenient as the termination test controlling a loop over the parse results, so a lot of special logic
had to be added to deal with arguments that only made sense before the first pass of the loop.
Eliminating named arguments from the value() method eliminates a variety of special cases and, as a
result, the documentation of the value() method is now simpler, shorter and clearer. Anything that could
be done by providing named arguments to the value() method can be done using the recognizer's set()
method, and the code will be clearer for it.
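A minimal sketch of the resulting idiom, assuming the NAIF interface: settings go through set(), and
value() is left free to act as the loop's termination test. The grammar is hypothetical, and
"max_parses" is used here only as an example of a recognizer named argument.

```perl
use strict;
use warnings;
use Marpa::R2;

{
    # Hypothetical semantics package for this example.
    package My_Actions;

    sub do_first {
        my ( undef, $value ) = @_;
        return $value;
    }
}

my $grammar = Marpa::R2::Grammar->new(
    {   start   => 'Top',
        actions => 'My_Actions',
        rules   => [
            { lhs => 'Top', rhs => [qw(Word)], action => 'do_first' },
        ],
    }
);
$grammar->precompute();

my $recce = Marpa::R2::Recognizer->new( { grammar => $grammar } );
$recce->read( 'Word', 'hello' );

# Named arguments are given to set(), not to value().
$recce->set( { max_parses => 10 } );

# value() takes no arguments and returns undef when the parse
# results are exhausted, which ends the loop.
my @results;
while ( defined( my $value_ref = $recce->value() ) ) {
    push @results, ${$value_ref};
}
print scalar(@results), " parse result(s)\n";
```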
Marpa's grammar rewriting is now invisible
Internally, Marpa rewrites its grammars. In Marpa::XS, most details of these rewrites were invisible,
but not all. In Marpa::R2, all internal rules and symbols are now completely invisible to the user, even
in the tools for debugging grammars.
By default, the non-LHS symbols are the terminals
Traditionally, a symbol has been a terminal if it is not on the LHS of any rule, and vice versa. This is
now the default in Marpa::R2, replacing the more complicated, and less intuitive, scheme that was in
Marpa::XS. Marpa::R2 still allows the user to use any non-nulling symbol as a terminal, including those
symbols that appear on the LHS of a rule, but this is now an option, and never the default. For more,
see "Terminal symbols" in Marpa::R2::Deprecated::NAIF::Grammar.
The lhs_terminals grammar named argument has been eliminated
The lhs_terminals named argument of grammar objects implemented what is now the default behavior. Since
it no longer performs a function, its use is now a fatal error.
Nulling symbols cannot be terminals
In Marpa::XS, it was possible for a symbol to be both nulling and a terminal. In practice that meant
that the symbol was nulling, but that, on input, that property could be overridden, and a specific
instance of the nulling symbol could be made non-nulling. This behavior was worse than useless and
non-intuitive -- it was dangerous and logically inconsistent.
Marpa::R2 will not allow a nulling symbol to be used as a terminal. To the extent that the Marpa::XS
behavior made sense, it can be duplicated by creating a symbol which is the LHS of two rules, one empty,
and the other rule with a RHS consisting of exactly one terminal symbol.
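The workaround just described can be sketched as follows, assuming the NAIF interface. The symbol
Optional_Item is the LHS of two rules, one empty and one whose RHS is the single terminal Item. All
symbol names, the package My_Actions, and its closures are hypothetical.

```perl
use strict;
use warnings;
use Marpa::R2;

{
    # Hypothetical semantics package for this example.
    package My_Actions;

    sub do_first {
        my ( undef, $value ) = @_;
        return $value;
    }

    # Report whether the optional item was present in the input.
    sub do_top {
        my ( undef, $item ) = @_;
        return defined $item ? "item=$item" : 'no item';
    }
}

my $grammar = Marpa::R2::Grammar->new(
    {   start   => 'Top',
        actions => 'My_Actions',
        rules   => [
            { lhs => 'Top', rhs => [qw(Optional_Item End)], action => 'do_top' },

            # Empty rule: Optional_Item may be nulled.
            { lhs => 'Optional_Item', rhs => [], action => '::undef' },

            # Non-empty rule: exactly one terminal on the RHS.
            { lhs => 'Optional_Item', rhs => [qw(Item)], action => 'do_first' },
        ],
    }
);
$grammar->precompute();

# Parse a list of [ symbol, value ] tokens and return the parse value.
sub parse_tokens {
    my (@tokens) = @_;
    my $recce = Marpa::R2::Recognizer->new( { grammar => $grammar } );
    $recce->read( @{$_} ) for @tokens;
    my $value_ref = $recce->value();
    return defined $value_ref ? ${$value_ref} : undef;
}

my $with_item    = parse_tokens( [ 'Item', 'x' ], [ 'End', ';' ] );
my $without_item = parse_tokens( [ 'End', ';' ] );
print "$with_item\n$without_item\n";
```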
A sequence must have a unique LHS
The LHS of a sequence rule may not be on the LHS of any other rule, whether another sequence rule, or a
BNF rule. This is not as severe a restriction as it might sound -- while sequences cannot share the same
LHS with other rules directly, they can do so indirectly. For details, see "Duplicate rules" in
Marpa::R2::Deprecated::NAIF::Grammar.
In Marpa::XS, the definition of when a sequence was a duplicate was more liberal, but it was also
complicated and non-intuitive. The new definition is simpler and more intuitive, and its greater
restrictiveness is easy to work around.
The terminal status of a symbol is locked once set
Once a symbol is marked as a terminal or a non-terminal, its terminal status cannot be changed. We doubt
this will affect any actual applications. It would only affect an application that changes a symbol from
its default status to non-terminal, and then only if it attempted to mark the same symbol as a
terminal at another point. Few Marpa::R2 applications change symbols from their default terminal status,
and none to our knowledge mark symbols as non-terminals.
Evaluation of infinite loops has been changed
Infinite loops (cycles) are still, by default, fatal errors. For those considering programming with
them, and evaluating parses from grammars with cycles, the semantics of cycles is now more closely
specified. For details of the new semantics, see Marpa::R2::Deprecated::NAIF::Semantics::Infinite.
The range of values allowed for ranks has been clarified
Symbols and rules have numeric ranks. Previously, no mention was made of the range of values allowed.
This is implementation-defined, except that the magnitudes of the ends of the range will always be at
least the 27th power of 2, less 1. That is, numbers in the range between -134,217,727 and 134,217,727
will always be allowed as ranks.
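The decimal bounds quoted above equal 2**27 - 1 in magnitude. As a sketch, an application that wants
to stay inside the guaranteed range could check its ranks before using them. The constant and the
helper rank_is_portable() are hypothetical names and are not part of Marpa.

```perl
use strict;
use warnings;

# Magnitude of the guaranteed rank bounds: 2**27 - 1 = 134,217,727.
use constant GUARANTEED_MAX_RANK => 2**27 - 1;

# Hypothetical helper: true if the rank lies within the range that
# every implementation is guaranteed to accept.
sub rank_is_portable {
    my ($rank) = @_;
    return abs($rank) <= GUARANTEED_MAX_RANK ? 1 : 0;
}

my $max_rank     = GUARANTEED_MAX_RANK;
my $inside_ok    = rank_is_portable(-134_217_727);
my $outside_fail = rank_is_portable(134_217_728);
print "guaranteed magnitude: $max_rank\n";
```

Ranks outside this range may still work, but whether they do depends on the implementation.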