A big reason that parser generators such as yacc and bison
became popular is that they create parsers that are much more reliable
than handwritten parsers. If you feed a grammar to bison and it has no
conflicts, you can be completely sure that the language that the
generated parser accepts is exactly the one described by the grammar. It
won’t have any of the holes that handwritten parsers tend to have,
particularly when diagnosing erroneous input. If you use precedence
declarations sparingly to resolve conflicts in known situations,
expression grammars, and if/then/else, you can still be sure that your
parser is handling the language as you think it is.
On the other hand, if you use GLR parsing, you can hand any
grammar to bison, and it will create a parser that parses something,
resolving the conflicts at parse time. But the more conflicts it has,
the less likely it is that the language it’s parsing is the language you
want, and the less likely it is that your parser will resolve the
conflicts the way you want. Before switching to GLR, be sure you
understand why your grammar has the conflicts you’re expecting GLR to
handle and that you understand how you are resolving them. Otherwise,
you risk the embarrassing situation of finding out much later that your
parser gives up unexpectedly when it runs into a conflict you didn’t
anticipate or that because of an incorrect conflict resolution, the
language it’s parsing isn’t quite the one you wanted.
GLR parsers can in theory be extremely slow, since running N
parses in parallel is roughly N times as slow as a single parse, and a
particularly ambiguous grammar could split on each token. Useful GLR
grammars typically have only a few ambiguities that are resolved within
a few tokens, so the performance is adequate.
A normal bison LALR parser doesn’t have to deal with shift/reduce or
reduce/reduce conflicts, since any conflicts were resolved one way or the
other when the parser was built. (See Chapter 8.) But when
a GLR parser encounters a conflict, it conceptually splits and continues
both possible parses, with each parser consuming the input tokens in
parallel. When there are several conflicts, it can create a tree of
partial parses, splitting each time there is a conflict.
You are currently reading a PREVIEW of this book.
Get instant access to over
$1 million worth of books and videos.