Differences
This shows you the differences between two versions of the page.
include [2020/02/14 00:24] – revusky | include [2023/03/03 16:20] (current) – revusky | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== The INCLUDE Statement ====== | + | < |
- | JavaCC 21's **INCLUDE** statement allows you to break up your grammar file into multiple physical files. It would look like this typically: | + | # The INCLUDE |
- | INCLUDE(" | + | Congo' |
- | //This feature is not present in legacy JavaCC.// | + | INCLUDE " |
+ | |||
+ | *This feature is not present in legacy JavaCC.* | ||
The motivation behind **INCLUDE** should be obvious. By allowing you to reuse a base grammar or generally useful fragment in various files, you can avoid the copy-paste-modify *antipattern* that would have been necessary when using legacy JavaCC. Generally speaking, being able to organize a large grammar into multiple physical files can be a big win in terms of maintainability. | The motivation behind **INCLUDE** should be obvious. By allowing you to reuse a base grammar or generally useful fragment in various files, you can avoid the copy-paste-modify *antipattern* that would have been necessary when using legacy JavaCC. Generally speaking, being able to organize a large grammar into multiple physical files can be a big win in terms of maintainability. | ||
Line 11: | Line 13: | ||
Still, as they say, the devil is in the details, and there are various wrinkles that need to be covered here. | Still, as they say, the devil is in the details, and there are various wrinkles that need to be covered here. | ||
- | ===== The DEFAULT_LEXICAL_STATE setting | + | ## The DEFAULT_LEXICAL_STATE setting |
- | In legacy JavaCC, if you defined a token production without specifying a lexical state, any lexical definitions belonged to a lexical state called " | + | In legacy JavaCC, if you defined a token production without specifying a lexical state, any lexical definitions belonged to a lexical state called " |
- | Thus, JavaCC 21 introduces | + | Thus, CongoCC has a setting called **DEFAULT_LEXICAL_STATE**. That means that any lexical specifications where the lexical state is unspecified are in that state. Thus, a JSON grammar would likely have something like this at the top: |
+ | |||
+ | |||
+ | DEFAULT_LEXICAL_STATE=JSON; | ||
- | options { | ||
- | | ||
- | } | ||
| | ||
In that case, any grammar for a language that wants to handle embedded JSON data would presumably define its own " | In that case, any grammar for a language that wants to handle embedded JSON data would presumably define its own " | ||
Line 25: | Line 27: | ||
Actually, at the moment, **DEFAULT_LEXICAL_STATE** is the only setting you can put in an **INCLUDE**d grammar that has any effect. All of the other options are simply ignored, since they are presumably set in the top-level *including* grammar. In legacy JavaCC, if you defined a token production without specifying a lexical state, those patterns are matched in a lexical state called " | Actually, at the moment, **DEFAULT_LEXICAL_STATE** is the only setting you can put in an **INCLUDE**d grammar that has any effect. All of the other options are simply ignored, since they are presumably set in the top-level *including* grammar. In legacy JavaCC, if you defined a token production without specifying a lexical state, those patterns are matched in a lexical state called " | ||
- | ===== Wrinkles with Code Injection | + | ## Wrinkles with Code Injection |
- | JavaCC still supports the legacy JavaCC constructs of **PARSER_BEGIN...PARSER_END** and **TOKEN_MGR_DECLS**. (For how much longer, I am not making any promises...). However, those constructs are ignored | + | You can |
- | You can still //inject// code into the generated parser or lexer class, from within an included grammar, but you need to write something like: | + | |
- | + | ||
- | | + | |
- | { | + | |
- | ... | + | |
- | } | + | |
{ | { | ||
... | ... | ||
Line 41: | Line 38: | ||
or: | or: | ||
- | INJECT(LEXER_CLASS) : | + | INJECT LEXER_CLASS : |
- | { | + | |
- | ... | + | |
- | } | + | |
{ | { | ||
... | ... | ||
} | } | ||
- | JavaCC | + | CongoCC |
- | INJECT(JSONParser) : | + | INJECT JSONParser : |
{ | { | ||
... | ... | ||
} | } | ||
- | { | ||
- | ... | ||
- | } | ||
- | because the parser class we are generating is not '' | + | because the parser class we are generating is not JSONParser, it is FooParser! However, the person writing |
- | So, do not be surprised when the code within PARSER_BEGIN...PARSER_END is ignored if it is within | + | In fact, the aliases **PARSER_CLASS**, |
- | In fact, the aliases | + | To see a concrete example of **INCLUDE** in use, you can take a look at https:// |
- | To see a concrete example of **INCLUDE** in use, you can take a look at https:// | + | </markdown> |
===== INCLUDE with Java Source files ===== | ===== INCLUDE with Java Source files ===== | ||
Line 72: | Line 63: | ||
to only contain Java source code. Thus, writing: | to only contain Java source code. Thus, writing: | ||
- | | + | |
is exactly the same as if you wrote: | is exactly the same as if you wrote: |