Differences

This shows you the differences between two versions of the page.

--- deprecated_settings [2020/05/10 12:42] revusky
+++ deprecated_settings [2021/12/21 14:03] (current) revusky
Line 1:
-====== Obsolete Settings from Legacy JavaCC ======
+====== Obsolete Settings from Legacy JavaCC (and JJTree) ======
  
 As a result of quite a bit of forward evolution, some of the settings from legacy JavaCC (and JJTree) are obsolete in JavaCC21. We don't anticipate that any of them will be missed.
Line 5:
  
   * **STATIC**: JavaCC21 does not support static parsers. In other words, this is always set to false (and thus ignored).
+  * **LOOKAHEAD**: Legacy JavaCC allowed you to specify a //default// numerical lookahead other than 1 token. In JavaCC21, this setting is gone and is always effectively equal to 1. Of course, you can still specify numerical lookahead other than 1 at [[choice points]] as needed.
+  * **CHOICE_AMBIGUITY_CHECK**: This was a parameter in legacy JavaCC that allowed you to specify how far to scan ahead when checking for "ambiguities" in the grammar. For now, the whole concept has been removed from JavaCC 21. Most of the conditions reported as "choice ambiguities" were not really ambiguities in the grammar anyway. The logic of JavaCC is that if more than one choice matches, the first one wins. At some point, we may put in some code to check for unreachable code (at least the simple cases that can be statically proven) but it is not a high priority, since the whole thing is of very marginal practical value.
+  * **OTHER_AMBIGUITY_CHECK**: Basically the same comments apply here as to **CHOICE_AMBIGUITY_CHECK**. The code for these so-called "ambiguity checks" has been ripped out. In any case, in real-world practice, nobody was using these settings anyway.
+  * **FORCE_LA_CHECK**: Frankly, we are unsure what this setting ever did. At least in this case, ignorance is bliss. So the setting is gone. Besides, lookahead was always fundamentally broken in legacy JavaCC, so all of these sophisticated checks were surely for nothing anyway!
   * **UNICODE_INPUT**: Effectively, this is now always set to true, so it is superfluous (and thus ignored).
-  * **USER_CHAR_STREAM**: This was a setting that allowed you to define your own implementation of the ''CharStream'' interface. It seems unlikely that very many (if any) people were using this and it has been removed in order to simplify the codebase.
+  * **USER_CHAR_STREAM**: This was a setting that allowed you to define your own implementation of the ''CharStream'' interface. In default usage of JavaCC 21, this whole concept is irrelevant, since the generated parser just slurps the whole file into memory at once anyway. See [[https://javacc.com/2020/05/05/gigabyte-is-the-new-megabyte/|The Gigabyte is the new Megabyte]].
-  * **BUILD_LEXER**: It is rather hard to fathom what the point of this setting ever was. Presumably, the case where you don't build a lexer is the one in which you define your own XXXLexer implementation. However, the USER_DEFINED_LEXER setting (previously called USER_TOKEN_MANAGER) always existed, so it is not clear why this setting was ever needed.
+  * **BUILD_LEXER**: It is rather hard to fathom what the point of this setting ever was. On modern hardware, a full rebuild of both the parser and lexer is not very expensive. This kind of thing does not seem to have any value and is really just confusing.
-  * **BUILD_PARSER**: Another bizarre setting really. If all you want to do is build a lexer, and not a parser, then just don't define any grammatical productions in your grammar and all we build is a parser!
+  * **BUILD_PARSER**: Another bizarrely pointless setting really. If all you want to do is build a lexer, and not a parser, then just don't define any grammatical productions in your grammar and all we build is a lexer!
+  * **DEBUG_LEXER**: This setting, along with ''DEBUG_PARSER'', was removed in mid-December 2021. It is very hard to imagine current-day developers using this sort of approach, as opposed to using an actual debugger!
+  * **DEBUG_PARSER**: This is now gone. It is actually not so hard to debug the generated parser, since the code is much more readable than before and contains location info to trace back where in the grammar file the generated code originated.
   * **KEEP_LINE_COL**: JavaCC21 always puts location information in Tokens and Node objects. (//Really, why would you ever want to throw away location info?//) For more thoughts on this issue, see [[https://javacc.com/2020/05/05/gigabyte-is-the-new-megabyte/|The Gigabyte is the new Megabyte]].
-  * **ERROR_REPORTING**: This was an option that was true by default, but you could turn it off in order to generate a somewhat smaller .class file, except that error messages would be much less informative because of information being thrown away. I did some experimenting and found that the generated XXXParser.class was typically about 10% smaller with ERROR_REPORTING off. The tradeoff looks terrible and, as with KEEP_LINE_COL, it looks utterly foolish to ever turn this off. So, the setting is now gone and the option is always effectively on.
+  * **ERROR_REPORTING**: This was an option that was true by default, but you could turn it off in order to generate a somewhat smaller .class file, except that error messages would be much less informative because of information being thrown away. I did some experimenting and found that the generated XXXParser.class was typically about 10% smaller with ERROR_REPORTING off. The tradeoff looks terrible and, as with KEEP_LINE_COL, it looks utterly foolish to ever turn this off. So, the setting is now gone and the option is always effectively on. (Further note: the legacy error reporting code has been practically rewritten anyway. The prior comment applies in any case; there is no reason for any sane person to want to turn it off.)
   * **SANITY_CHECK**: By default, the parser generator does various sanity checks before generating the various files. This setting in the legacy JavaCC tool allowed you to turn this off. (//Why would anybody turn this off?//) This setting is gone and is now effectively always true.
-  * **CACHE_TOKENS**: I never even understood what the point of this setting was. It must have been some kind of //speculative// peephole optimization, except I don't think it was even correct. There would be problems with switches of lexical state in some cases. The setting is now gone and is always effectively false. (Which was the default before, which everybody was using anyway.)
+  * **CACHE_TOKENS**: I never even understood what the point of this setting was. It must have been some kind of //speculative// peephole optimization, except I don't think it was even correct. There would be problems with switches of lexical state in some cases. Also, I doubt it offered any noticeable performance gain. The setting is now gone and is always effectively false. (Which was the default before, which everybody was using anyway.)
+  * **TOKEN_FACTORY** : This setting has been removed (as of 11/11/2021) since it is really not very useful now that we have INJECT. I doubt it was really very widely used (if at all).
   * **TRACK_TOKENS** : There is no real reason for this setting to exist any more, since, by default, Tokens are added to the AST and they have their line/column information. In fact, all Node objects have line/column information.
-  * **COMMON_TOKEN_ACTION** : This feature is still supported but the configuration setting is no longer necessary, since JavaCC21 deduces it from the presence (or absence) of the appropriately named method in your generated lexer class.
-  * **NODE_SCOPE_HOOK** : As with COMMON_TOKEN_OPTION, the feature is still supported but the configuration option is no longer necessary, since JavaCC21 deduces it from the presence or absence of the appropriately named method or methods in your generated parser class. See [[Node Life Cycle Hooks]] for more information.
-  * **NODE_EXTENDS** : Since JavaCC21 has the [[INCLUDE]] statement, there is no need for this configuration option to exist. If you want to specify that your BaseNode class extends some specific class, simply use [[Code Injection in JavaCC 21|code injection]] to specify this.
+  * **USER_DEFINED_TOKEN_MANAGER** : This setting was removed in October 2021.
+  * **COMMON_TOKEN_ACTION** : This feature is still supported but the configuration setting is no longer necessary, since JavaCC21 deduces it from the presence (or absence) of the appropriately named method in your generated lexer class. If you have a method with the signature ''void CommonTokenAction(Token t)'', it will be called at the appropriate point. However, you would be better off using the newer alternative, which you use by creating a method with the signature ''Token TOKEN_HOOK(Token t)''. It is more flexible because, for one thing, it allows you to define multiple token hook routines. See [[https://javacc.com/2020/10/16/token-hooks-revisited/ | here]] for more information. Also, since this method has a return value, it allows you to instantiate a new Token object (of whatever subclass) and return it. In any case, there is no need for the configuration setting, since these methods are used if present and ignored if not. (//Duh!//)
+  * **NODE_SCOPE_HOOK** : As with ''COMMON_TOKEN_ACTION'', the feature is still supported but the configuration option is no longer necessary, since JavaCC21 deduces it from the presence or absence of the appropriately named method or methods in your generated parser class. See [[Node Life Cycle Hooks]] for more information.
+  * **NODE_EXTENDS** : Since JavaCC21 has ''INJECT'', there is no need for this configuration option to exist. If you want to specify that your BaseNode class extends some specific class, simply use [[Code Injection in JavaCC 21|code injection]] to specify this. Something like:
  
-<html><pre>
-   INJECT(BaseNode) {extends SomeClass;} {}
-</pre></html>
+<code>
+   INJECT BaseNode : extends SomeClass
+</code>
  
-The following configuration options are still supported but are deprecated in JavaCC21:
+In general, code injection can be used to specify that any generated class should extend a given class or implement whatever interface(s). There is no need for a plethora of configuration settings for this.
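The token hook described under **COMMON_TOKEN_ACTION** above can be pictured in plain Java. This is only a sketch of the idea, not generated code: ''Token'', ''KeywordToken'' and ''SketchLexer'' here are hypothetical stand-ins written for illustration, and the only thing taken from the text is the method shape ''Token TOKEN_HOOK(Token t)''. It shows why a hook with a return value is more flexible: it can hand back a different Token object.

```java
// Hypothetical stand-in for the generated Token class (assumption, not JavaCC21 output).
class Token {
    final String image;
    Token(String image) { this.image = image; }
}

// Hypothetical Token subclass that the hook may substitute for the original token.
class KeywordToken extends Token {
    KeywordToken(String image) { super(image); }
}

// Hypothetical stand-in for a generated lexer with an injected hook method.
class SketchLexer {
    // Same shape as the signature quoted in the text: Token TOKEN_HOOK(Token t).
    // Because it returns a Token, it can return a new object instead of the argument.
    Token TOKEN_HOOK(Token t) {
        if (t.image.equals("if") || t.image.equals("else")) {
            return new KeywordToken(t.image);
        }
        return t;
    }

    // Stand-in for the point where a lexer would apply the hook to each new token.
    Token nextToken(String rawImage) {
        return TOKEN_HOOK(new Token(rawImage));
    }
}
```

A ''void'' hook in the style of ''CommonTokenAction'' could only inspect or mutate the token it is given; it could not replace it.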
  
-  * **OUTPUT_DIRECTORY**: This is deprecated in favor of the new BASE_SRC_DIR option. See [[JavaCC21 Conventions]] for more information on the preferred way to specify your directory layout when using JavaCC21.
+The following configuration option is still supported but is deprecated in JavaCC21:
-  * **NODE_PREFIX**: Use of this is not encouraged in JavaCC21. By default, it is simply the empty string. (In JavaCC (or JJTree to be precise) it was "AST" by default.)
+
  
-The following option has been renamed for consistency, but the older name is still supported:
+  * **NODE_PREFIX**: Use of this is not encouraged in JavaCC21. By default, it is simply the empty string. (In JavaCC (or JJTree to be precise) it was "AST" by default.) I guess that prefixing all the Node classes with "AST" is a (crude) way of defining a namespace. However, one would think these people noticed that Java has this thing called ''packages''.
  
-USER_TOKEN_MANAGER is now USER_DEFINED_LEXER.
+The use of both ''PARSER_BEGIN....PARSER_END'' and ''TOKEN_MGR_DECLS'' is deprecated in favor of the new [[Code Injection in JavaCC 21|code injection feature]]. Injecting code into the generated parser and lexer is simply a specific case of code injection, so there is no need for these separate constructs. However, they will continue to work for the foreseeable future.
  
-The use of both ''PARSER_BEGIN....PARSER_END'' and ''TOKEN_MGR_DECLS'' is deprecated in favor of the new [[Code Injection in JavaCC 21|code injection feature]]. Injecting code into the generated parser and lexer is simply a specific case of code injection, so there is no need for these separate constructs.
+To specify the parser and lexer class names, you may use the ''PARSER_CLASS'' and ''LEXER_CLASS'' configuration options. However, it is not mandatory, since a ''Foo.javacc'' file will automatically generate a parser class called ''FooParser'' and a lexer class called ''FooLexer''. There will rarely be any practical value in overriding that.
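The naming convention just described (''Foo.javacc'' gives ''FooParser'' and ''FooLexer'') is simple enough to sketch in a few lines of Java. This is an illustration of the rule as stated, not the tool's actual implementation; the class ''DefaultNames'' and its methods are hypothetical names invented for this example.

```java
// Hypothetical helper illustrating the default naming rule (not JavaCC21 source code).
class DefaultNames {
    // Strip any directory part and the extension: "src/Foo.javacc" becomes "Foo".
    static String baseName(String grammarFile) {
        String name = grammarFile.substring(grammarFile.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? name : name.substring(0, dot);
    }

    // Default parser class name: base name plus "Parser".
    static String parserClass(String grammarFile) {
        return baseName(grammarFile) + "Parser";
    }

    // Default lexer class name: base name plus "Lexer".
    static String lexerClass(String grammarFile) {
        return baseName(grammarFile) + "Lexer";
    }
}
```

With this sketch, ''DefaultNames.parserClass("Foo.javacc")'' yields ''FooParser'' and ''DefaultNames.lexerClass("Foo.javacc")'' yields ''FooLexer'', which is why the explicit options are rarely needed.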
+
+There are a host of settings that were added //after// the FreeCC fork, which was in mid-2008. See [[ancient history]] for more information on all this. No settings added to legacy JavaCC after about 2008 are currently supported in JavaCC 21. Most of them are of very marginal value. Moreover, it is safe to say that nobody uses them because they are not documented anywhere that I can find! Just for example, the **GRAMMAR_ENCODING** option was added at some point after 2008 (I don't know when exactly) to specify what encoding your grammar file is in. I am certain that nobody uses this. (Or just about nobody, surely.) Everybody stores their grammar files in the system default encoding, which is ''UTF-8'' on any remotely modern system that any serious developer would be working on. Adding these kinds of options that nobody uses is actually very typical of a [[nothingburger]] project. (Adding all these options and not even documenting them is nothingburger-ism squared!)
  
-To specify the parser class name, you may use the PARSER_CLASS configuration option. However, it is not mandatory, since a ''Foo.javacc'' file will automatically generate a parser class called ''FooParser'' and a lexer class called ''FooLexer''. 
  
 See [[new settings in JavaCC 21]] for information on settings introduced in JavaCC21 that were not present in legacy JavaCC.