Summary of the Newer Streamlined Syntax

The following is a summary of the newer streamlined syntax that was recently announced here.

I anticipate that there will soon be a utility available that automatically converts the legacy syntax to the streamlined syntax. In any case, there is no need to manually convert all of your code to the more streamlined syntax, since all the legacy syntax continues to work. Moreover, the two syntaxes can co-exist perfectly well in the same file.

This page does not describe the new SCAN construct which is meant to supersede LOOKAHEAD. That will be outlined separately.

Nonterminals

There is no need to write empty parentheses after a nonterminal for a production that takes no parameters. Thus:

Foo() Bar() Baz()

can now be written as:

Foo Bar Baz

BNF Productions

  1. There is no need to write void in front of productions with no return value.
  2. A production which takes no parameters no longer needs empty parentheses.
  3. There is no need for an empty code block, i.e. {} as the first thing in your production's definition. (I mean, assuming that you don't actually need to put some code at the top of your production.)
  4. Rather than enclosing the definition of your production in braces (like a Java action), you preferably write it with no opening delimiter and terminate it with a semicolon.

The above four points (along with the earlier point about no-args nonterminals not needing parentheses) combine such that, where you would previously write:

void Foobar() :
{}
{
    Foo() Bar() 
}

you can now write:

Foobar : Foo Bar;
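
Because the legacy and streamlined forms can co-exist in the same file, a partly converted grammar might look something like the following sketch. The production names LegacyFoobar and NewFoobar are made up for illustration, and Foo and Bar are assumed to be defined elsewhere in the grammar.

```
// Legacy form: void, empty parentheses, empty code block, braces
void LegacyFoobar() :
{}
{
    Foo() Bar()
}

// Streamlined form of the same expansion, living in the same file
NewFoobar : Foo Bar;
```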

A list of lexical specifications, a.k.a. Token Productions, can be written without the curly braces.

In this case, they are written with no opening { and the list is ended with a semicolon. Thus, instead of writing:

TOKEN #Delimiter :
{
    <LPAREN : "(">
    |
    <RPAREN : ")">
    |
    <LBRACE : "{">
    |
    <RBRACE : "}">
}

the newer, preferable syntax is:

TOKEN #Delimiter :
  <LPAREN: "(" > 
  | 
  <RPAREN: ")" >
  | 
  <LBRACE: "{" > 
  |
  <RBRACE: "}" > 
;

This is considered preferable, not because it saves much space (it doesn't!) but because one aspect of the newer syntax is that the {...} are reserved for elements that really are embedded Java actions. As you can see, in the newer syntax for BNF productions, the only use of {...} is for actual Java code.
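
As a rough illustration of that last point, here is a sketch of a streamlined production that does contain an embedded Java action. The println call is only a placeholder, Foo and Bar are assumed to be defined elsewhere, and the action is deliberately placed after the first nonterminal rather than at the very start of the production.

```
Foobar :
    Foo
    { System.out.println("Matched Foo"); }   // braces here hold actual Java code
    Bar
;
```

The only braces in the production belong to the Java action; the production body itself is delimited by the colon and the closing semicolon.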

The Options at the top of a file do not need to be in any sort of block.

Since the options, like TREE_BUILDING_ENABLED=false, can only occur at the very top of a file anyway, there is no need for them to be in some special construct options {...}. Thus, where you would previously have:

options {
    BASE_SRC_DIR="..";
    PARSER_PACKAGE="com.acme.foolang";
}

You can now simply put:

BASE_SRC_DIR="..";
PARSER_PACKAGE="com.acme.foolang";

at the top of your file.
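
Putting the pieces together, the start of a file in the streamlined style might look roughly like this sketch. The particular setting values and the shortened token list are only for illustration, and Foo and Bar are assumed to be defined further down in the grammar.

```
// Settings come first, with no enclosing options block
BASE_SRC_DIR="..";
PARSER_PACKAGE="com.acme.foolang";
TREE_BUILDING_ENABLED=false;

// Token productions: no braces, terminated by a semicolon
TOKEN #Delimiter :
  <LPAREN: "(" >
  |
  <RPAREN: ")" >
;

// BNF production: no void, no parentheses, no braces
Foobar : Foo Bar;
```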

The syntax for INJECT is also streamlined.

N.B. This, of course, is not a change from legacy JavaCC, since legacy JavaCC never had an INJECT statement!

You can (optionally) dispense with the parentheses in: INJECT(ClassDeclaration) :

The first block after the colon does not need braces around it. Either part of the injection can be left out. Thus, if the only point is to indicate that a Node extends a class (or implements an interface or you want to use an Annotation), where you previously had to write:

  INJECT(MyNode) :
  {
      extends AbstractBaseNode
  }
  {}

(Actually, the final empty block {} has been optional for some time in these spots, but I don't believe I ever documented that! But now it is much more streamlined.)

You can now write:

 INJECT MyNode : extends AbstractBaseNode

A more complex INJECT that does insert some code might now look something like:

 INJECT MyNode :
     import java.util.List;
     extends AbstractBaseNode
     implements Nullable
{
    private List<Foo> foos;

    public List<Foo> getFoos() {return foos;}
    
    public void setFoos(List<Foo> foos) {this.foos = foos;}
}

Note that, of the statements immediately following the colon (and immediately preceding the opening brace), the first ends with a semicolon and the other two do not. This is because the extends and implements clauses in Java do not end with a semicolon, while an import statement does. However, if the above looks funny to you, you can (optionally) end the other two lines with a semicolon and the parser will not complain!
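
For comparison, here is a sketch of the same kind of injection with the optional semicolons added after the extends and implements lines. The class body is shortened for brevity; the names are the same illustrative ones used above.

```
INJECT MyNode :
    import java.util.List;
    extends AbstractBaseNode;   // trailing semicolon is optional here
    implements Nullable;        // and here
{
    private List<Foo> foos;
}
```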

New SCAN construct which replaces LOOKAHEAD

This has its own separate page here.

New up to here syntax

The up to here syntax provides a way to specify lookahead in a much cleaner, more intuitive way. See here for more information.