Differences

This shows you the differences between two versions of the page.

lookbehind [2020/09/25 10:55] – [Recap] revusky
contextual_predicates [2023/03/03 14:23] (current) revusky
Line 1: Line 1:
-====== LOOKBEHIND ======
+====== Contextual Predicates ======
  
-A //lookbehind predicate// allows you to add conditions at [[choice points]] based on scanning back in the call/lookahead stack. Note that this is a completely new feature in JavaCC 21 that does not exist in the legacy JavaCC tool.
+A //contextual predicate// allows you to add conditions at [[choice points]] based on scanning back in the call/lookahead stack. We are not aware of any other parser generator tool that has this feature.
  
 The easiest way to describe this is with some actual examples.
  
-Probably the most typical usage will be to guarantee that a production is not //re-entrant//, i.e. that it is not allowed to nest recursively. This can now be expressed very cleanly with a //lookbehind predicate// as follows:
+==== Specifying that a production is non-reentrant ====
+
+Probably the most typical usage will be to guarantee that a production is not //re-entrant//, i.e. that it is not allowed to nest recursively. This can now be expressed very cleanly with a //contextual predicate// as follows:
  
 <code>
Line 11: Line 13:
 </code>
  
-First of all, the tilde "~" character that starts the predicate indicates negation. The above predicate indicates that we scan backward in the call stack to see whether we have previously entered a ''Foo'' production. If that is //not// the case (because the condition is negated with the "~") then we can enter the ''Foo'' production. Note that a //lookbehind predicate// starts with either a backslash "\" or a forward slash "/". The above predicate uses a backslash and that means that we scan backwards from the current production up towards the root; a forward slash means that we are scanning forward from the root.
+First of all, the tilde "~" character that starts the predicate indicates negation. The above predicate indicates that we scan backward in the call stack to see whether we have previously entered a ''Foo'' production. If that is //not// the case (because the condition is negated with the "~") then we can enter the ''Foo'' production.
+
+The above sort of predicate will probably be the most commonly used pattern. However, more complex conditions can be formed.
+
+==== Scanning Forward vs. Backward, Ellipsis and Wild-card ====
+
+Note that the elements in a //contextual predicate// are separated either with a backslash "\" or a forward slash "/". The previous example used a backslash and that means that we scan backwards from the current production up towards the root; a forward slash means that we are scanning forward from the root.
  
-In the above example, the ellipsis "..." that follows the backslash means that there can be an arbitrary number of intervening productions in the call stack. If, for example, we wrote:
+In the above example, the ellipsis "..." that follows the backslash means that there can be an arbitrary number of intervening productions in the call stack. The //wild-card// or simply //dot// means that we match the occurrence (exactly one!) of any production. If, for example, we wrote:
  
 <code>
Line 29: Line 37:
 would mean that we enter the ''Foo'' production if the parent of the current production //is// a ''Bar''. (Note that this predicate does not start with a "~", so it is //not// negated.)
  
-So, consider the following predicate:
+Now, consider the following predicate that uses a forward slash:
  
 <code>
Line 36: Line 44:
  
 This means that we enter the ''Baz'' production only if the root production is a ''Foo'' and we then directly entered a ''Bar''.
+
+==== Optional Ending Slash ====
  
 If the predicate begins with a forward slash, it may //optionally// end with a backslash, and vice versa: if it begins with a backslash, it may //optionally// end with a forward slash. For example, consider the following predicate:
Line 55: Line 65:
 </code>
  
+==== Summary ====
 
-===== Recap =====
-
-A //lookbehind predicate// starts optionally with a tilde "~" to indicate negation. The first character after the tilde (or simply the first character if there is no tilde) must be either a backslash or a forward slash. The backslash indicates that we are scanning backwards from the current production and the forward slash is that we are scanning forward from the current production.
+A //contextual predicate// starts optionally with a tilde "~" to indicate negation. The first character after the tilde (or simply the first character if there is no tilde) must be either a backslash or a forward slash. The backslash indicates that we are scanning backwards from the current production and the forward slash means that we are scanning forward from the root.
  
 An ellipsis "..." means that we can have an arbitrary number (including zero) of intervening productions. A dot "." means that we have exactly one production of any type.
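
The matching rules summarized above can be modelled outside the parser generator. The following Python sketch is only an illustration of those rules, not the code that the tool generates. The names ''check_predicate'' and ''_match'' are hypothetical. The sketch assumes that the call stack is a list of production names with the root first and the innermost production last, that a pattern only has to match a prefix of the scanned frames, and it does not model the optional ending slash.

```python
def _match(elements, frames):
    """Match pattern elements against the start of the frame sequence."""
    if not elements:
        # Assumption: the pattern does not have to consume the whole stack.
        return True
    head, rest = elements[0], elements[1:]
    if head == "...":
        # Ellipsis: skip zero or more intervening productions.
        return any(_match(rest, frames[i:]) for i in range(len(frames) + 1))
    if not frames:
        return False
    if head == "." or head == frames[0]:
        # Dot: exactly one production of any type; a name must match exactly.
        return _match(rest, frames[1:])
    return False


def check_predicate(predicate, call_stack):
    """Evaluate a predicate string against a root-first list of production names."""
    negated = predicate.startswith("~")
    body = predicate[1:] if negated else predicate
    separator = body[:1]
    if separator not in ("\\", "/"):
        raise ValueError("predicate must start with a backslash or a forward slash")
    elements = body[1:].split(separator)
    # Backslash: scan from the innermost production towards the root.
    # Forward slash: scan from the root towards the innermost production.
    frames = list(reversed(call_stack)) if separator == "\\" else list(call_stack)
    return _match(elements, frames) != negated
```

Under these assumptions, ''check_predicate(r"~\...\Foo", stack)'' is true exactly when no ''Foo'' appears anywhere on the stack, which is the non-reentrant check described at the top of the page.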
Line 78: Line 87:
 In the above we specify that ''Foo'' must be //non-reentrant// and also that the next 2 tokens must be "bar" followed by "baz", or else we jump out of the loop.
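
As a rough model of that combined condition, the Python sketch below checks both parts together. It is an illustration only: the function name ''may_enter_foo'' and the representation of the call stack and the upcoming tokens as plain lists of strings are assumptions, not part of the tool.

```python
def may_enter_foo(call_stack, upcoming_tokens):
    """Return True if both the contextual and the lookahead condition hold."""
    # Contextual part: Foo must not already be on the call stack (non-reentrant).
    not_reentrant = "Foo" not in call_stack
    # Lookahead part: the next two tokens must be "bar" followed by "baz".
    lookahead_ok = list(upcoming_tokens[:2]) == ["bar", "baz"]
    return not_reentrant and lookahead_ok
```

If either part fails, the function returns false, which corresponds to jumping out of the loop in the grammar.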
  
+NB. If you have a ''SCAN'' statement that does not specify either numerical or syntactic lookahead, then the generated code will scan ahead an //unlimited// number of tokens. (Unless the expansion to be parsed is constrained by an [[up to here]] marker.) This is a key characteristic of the newer [[scan statement]].
 
-NB. If you have a ''SCAN'' statement that does not specify either numerical or syntactic lookahead, then the generated code will scan ahead an //unlimited// number of tokens. (Unless the expansion to be parsed is constrained by an [[up-to-here]] marker.) This is a key characteristic of the newer [[scan statement]].
-
-Note also that //lookbehind predicates//, like syntactic lookahead in JavaCC 21, can be nested arbitrarily and work in an arbitrarily nested scanahead routine.
+Note also that //contextual predicates//, like syntactic lookahead in CongoCC, can be nested arbitrarily and work in an arbitrarily nested scanahead routine.