
Why the Bifurcation of Effort?

Some Ancient History

JavaCC is a Java-based tool of 1990s vintage. As I (Jon Revusky) understand it, it was originally developed in 1996 at a company called Metamata. JavaCC was free as in beer, but was not open source.

Now, first of all, let's put this part of the timeline in perspective. JDK 1.0, now called Java 1, was released in January of 1996. JDK 1.1 came a year later, in February of 1997. At the risk of dating myself, I was involved with Java pretty much from the start. I believe the first JDK I downloaded (it took forever over a dial-up modem connection!) was JDK 1.0.2. Inner classes were introduced a year later in JDK 1.1. But this was still Java 1. Java 2 (or JDK 1.2, as it was called at the time) was released at the end of 1998.

Doubtless, the original JavaCC project could have benefited from the collections library introduced in Java 2 but that came out nearly two years later, at the end of 1998. Well, the key point to take away from all this is this: the entirety of the legacy JavaCC codebase was written against Java 1, the oldest, most primitive version of the Java language.

Now, Metamata, where JavaCC was originally developed, was acquired by another company called Webgain, which was in turn acquired by Sun Microsystems. That is how Sun came to own the copyright to JavaCC. As best I understand, it was never developed internally at Sun. At least, nothing significant was done with it. As best I can determine, there was no active development on JavaCC after 1997, more or less. By the time Sun Microsystems released the code as open source in 2003, it was an orphan project on which nothing had been done for about six years.

Late 2001. My Own Involvement with JavaCC

My involvement with JavaCC began in 2001, just as a user. At that point, I became heavily involved in the FreeMarker project, a template engine used in Java, mostly in the web application space. The original authors had lost interest and I saw a lot of possibilities for doing something with a template engine. The FreeMarker 1 codebase was a good proof of concept and was actually useful as is. I myself had used it in a number of projects in companies around 1999 to 2000. However, turning it into a more solid, capable tool would require quite a bit of refactoring. The very first thing to do would be to replace the kludgy hand-written parser with one generated by a parser generator tool.

Back then, I was hardly committed to using JavaCC specifically. At the time it seemed (as it probably does now) that the two main candidates were JavaCC and ANTLR. Though I looked at both, for some reason it was just easier to get going with JavaCC. Pretty soon, I had all of the constructs of FTL (FreeMarker template language, at least, as it existed at the time) specified in a JavaCC grammar and we were able to throw away the old hand-written parser. After that (rightly or wrongly) I never paid much attention to ANTLR.

One aspect of all of this is that, though I became very comfortable using JavaCC, and became a “power user”, as it were, I never seriously considered using JJTree. I've tried to think back about why that is. I was certainly aware of it at the time. I think the fundamental reason was that it just did not seem like there was a clean development process for using it. It seemed to be based on some fundamental confusion about which files were generated and which ones were source files that you work on. You would generate all of the ASTXXX.java Node classes and then, if you wanted to put some code in those classes, then what? Presumably you would post-edit the generated Java source files. That just seemed wrong. But also, it led to a build process that seemed very baroque. You would write your JJTree grammar (which was really just a JavaCC grammar with a few extra tree-building annotations), then run JJTree over that to produce a JavaCC grammar, and then run JavaCC over that to generate Java source files, which would in turn be compiled by javac.

Even though the JJTree approach of having the tool automatically generate the code for the various ASTXXX classes seemed attractive, there just seemed to be something wrong with how it was set up. At the time, I did not realize that, some years later, I would put some serious work into remedying that whole problem.

Mid 2003, Sun releases JavaCC as open source

For the first nearly two years after I took my initial steps with JavaCC, it was still a closed source product, so I could not have taken any interest in the source code even if I had wanted to. (Probably, I would not have wanted to muck with JavaCC source code anyway, since I had my hands full with FreeMarker.)

However, in 2003, Sun decided to release the JavaCC source under a very liberal (BSD-style) open source license. And they set it up as a project on java.net. Well, to understand all of this, I guess you have to understand that Sun Microsystems was a company that wanted to position itself as the standard bearer of… well… openness… Open Systems, Open Standards, Open Source. As part of that whole campaign, Sun created a platform for open source Java projects called “java.net”.

Well, we now know that java.net never amounted to much. Everybody and his pet dog uses GitHub. But we know that now. Obviously, the people at Sun thought that java.net would amount to some big deal and maybe they even visualized it becoming something like what GitHub is now, albeit (judging) mostly focused on Java technology. Of course, Java.net never became a GitHub. In fact, it never even became a Sourceforge.

Well, Sourceforge is still around, God bless, but Oracle (which had acquired Sun) eventually turned off the lights on Java.net – I guess, since it had become irrelevant and was more or less an embarrassment. But that was in April of 2017, so we are getting ahead of ourselves. Back in 2003, Java.net was new and was hot shit (or was supposed to be) and Sun wanted to prime it with some sexy open source Java projects and released the JavaCC code there to much fanfare. (It was not reported in Paris Match or People magazine, but all the computer rags mentioned it.) I have a hunch that the people behind this operation did not know what a god-awful mess the code was. But that is maybe not the point. If this was brought to their attention, they could respond: Well, that is the point of open-sourcing it! Now, any motivated java hacker out there can get to work cleaning up the code!

2008, the year that China holds the Olympic Games, there is a Global Financial Crisis, and Revusky downloads the JavaCC code

Now, personally speaking, in 2003, when the JavaCC source code was open-sourced, I'm not sure what my reaction was, or whether I gave the matter much thought. But maybe I did think, in the back of my mind, that this was a project that I would like to get involved with. To tell the truth, I have a hard time remembering exactly why I downloaded the code (actually, I checked out the CVS repository on java.net; remember CVS?) and started eyeballing it.

One of the first things I did was to import all the JavaCC code into Eclipse as a project and many of the first refactorings of the code were not even my idea per se. You see, as I pointed out earlier, the JavaCC codebase was written against Java 1, well before most of the conventions in Java code became established. So, the Eclipse IDE emitted hundreds of warnings about problems (not errors exactly, which would prevent compilation) in the code. So, a lot of initial work on the codebase just amounted to getting rid of all (or most, anyway) of the warnings that Eclipse emitted.

So, it may well be that the way my hacking of the Java code began was just from importing all the code into a modern (for that time) IDE and trying to get rid of all the warning messages. Of course, once I got rid of most of the warning messages, I started seeing all kinds of places in which to clean up the code. So it started taking on a life of its own.

Not long afterwards, I tried to establish contact with the JavaCC “community”, thinking (naively, I suppose) that they would be eager to incorporate all of these improvements into the codebase. Of course, a lot of this would have been just completely unambiguous improvements, such as moving towards using the more modern Java APIs. At the time, in 2008, the current version of Java would have been Java 6. Generics, for example, had been introduced in the previous cycle, Java 5. Meanwhile, all of the existing JavaCC codebase had been written against Java 1.
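
To make the generics point concrete, here is a minimal sketch of the kind of change involved. It is not actual JavaCC code: the class name GenericsSketch and both method names are hypothetical, chosen only for illustration. The first method is written the way Java 1 code had to be written (raw Vector, Enumeration, a cast on every read); the second does the same job with the Java 5 generic collections and the enhanced for loop.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

// Hypothetical illustration only; none of these names come from JavaCC.
public class GenericsSketch {

    // Java 1 style: raw Vector, Enumeration, and an explicit cast.
    // Without the @SuppressWarnings, a modern compiler or IDE flags the raw types.
    @SuppressWarnings("rawtypes")
    static String joinLegacy(Vector names) {
        StringBuffer buf = new StringBuffer();
        for (Enumeration e = names.elements(); e.hasMoreElements();) {
            String name = (String) e.nextElement(); // fails at runtime if a non-String sneaks in
            if (buf.length() > 0) {
                buf.append(", ");
            }
            buf.append(name);
        }
        return buf.toString();
    }

    // Java 5+ style: the element type is checked at compile time, no cast needed.
    static String joinModern(List<String> names) {
        StringBuilder buf = new StringBuilder();
        for (String name : names) {
            if (buf.length() > 0) {
                buf.append(", ");
            }
            buf.append(name);
        }
        return buf.toString();
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        Vector legacy = new Vector();
        legacy.addElement("Token");
        legacy.addElement("Node");

        List<String> modern = new ArrayList<String>();
        modern.add("Token");
        modern.add("Node");

        System.out.println(joinLegacy(legacy)); // prints "Token, Node"
        System.out.println(joinModern(modern)); // prints "Token, Node"
    }
}
```

Both methods behave the same at runtime; the difference is that the second one lets the compiler catch type mistakes that the first one only reveals as a ClassCastException.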

My changes to the codebase were so unambiguously improvements that I believed (again, quite naively) that they would be incorporated pretty much immediately. I certainly did not anticipate the passive aggression (later open aggression) with which I was met in that community.

Of course, in retrospect, perhaps all of this is hardly surprising. I think back on this and I remember some conclusions that I had drawn from diving into the JavaCC code, I think in the spring of 2008.

JavaCC and the Art of the Nothingburger

Everybody uses Git nowadays but back then the standard thing was CVS, which was what Java.net used. CVS may be more crufty and capricious than Git, but you can still look at the history of a codebase pretty well with it. At some point in mid-2008, I set myself the task of figuring out what the JavaCC “community” had achieved, what work they had done on the code in the five years since it had been open-sourced on java.net.

I suppose, dear reader, you are waiting with bated breath for the answer to this, so I won't keep you in suspense. Here is the result of my code review of the time:

To all intents and purposes, no work had been done in those five years. Nothing.

Now, to be clear, that is not to say that there was no commit record.