JavaCC Project History

There are certain questions that I anticipate will be coming up over and over again, so I shall attempt to answer them forthrightly. In the following, I shall outline the history of all of this. People are free to dismiss this as my (Revusky's) opinionated and self-interested version of things. But I would just say this: while not everybody who comes to know me personally ends up liking me, they would (at least, if they are at all honest) have to recognize (however grudgingly) that I am scrupulously honest. There could be some slight inaccuracies below, but in that case, they are just honest mistakes and if they are brought to my attention, I will correct them.

Now, actually, the history of JavaCC is not so easy to piece together. In its early days (the only time it was actively developed) it was bounced around to a few different companies. The version history here provides a history of releases and version numbers but does not provide any dates! There is a change log but no mention of who was responsible for any of the changes. (To be clear, the last two sentences refer to development prior to mid-2003, when Sun open-sourced JavaCC on java.net.)

Since an earlier draft of this page, I have had a bit of correspondence with Sriram Sankar, who was the project lead on “Jack”, the parser generator developed internally at Sun Microsystems, that would later be renamed to JavaCC. I had been trying to figure out when JJTree, the tree-building functionality, was added to the package. I had previously believed that JJTree must have been added in 1999 or so. I also had a strong suspicion that JJTree was not written by the same person or team that implemented the core JavaCC functionality. Dr. Sankar told me that JJTree was added to the package quite early, in early 1997. He also confirmed my other suspicion: JJTree was mainly written by one Rob Duncan, a name I don't ever recall hearing in connection with JavaCC development. In any case, seeing as JJTree is pretty clearly the last feature of any significance ever added to the package, that means that the time window in which JavaCC was really an actively developed project (by any reasonable definition) is quite short!

Let's start at the beginning

In its origins, the JavaCC parser generator is a Java-based tool of 1990s vintage. My original understanding of things was that it was developed in 1996 in a company called Metamata. That was a misstatement that was in an earlier version of this text. I tried to double-check some of this recently and now I realize that JavaCC did begin as a research project at Sun Microsystems itself. However, it was first released under another name, “Jack”. I came across an old mailing list archive in which Sriram Sankar, the man who, I believe, is really the principal original author, makes what seems to be a first announcement. The tool was still known as “Jack” a bit over a month later, when there was an article in JavaWorld entitled “You don't know Jack”. So, the JavaCC naming came a bit later. According to the Wikipedia page on JavaCC, admittedly not necessarily a very reliable information source, the Jack developers created a company called Metamata, taking Jack with them and subsequently renaming the tool to “JavaCC”.

I find this whole thing a tad confusing because it seems to me that if the “Jack” developers were on the Sun payroll when they did the original work, this would be Sun's intellectual property and I cannot understand how they were allowed to take this to a new company. However, that is not so important perhaps. What I do know is that, in this entire period, JavaCC was free for anybody to download and use. I vaguely recall downloading it myself from the Metamata site but not getting around to using it until quite a bit later. The tool was free to download and use. However, it was closed source. So, at that point, it was free as in beer, but it was not free software in the open source sense.

Now, let's put this part of the timeline in perspective. JDK 1.0 (now called Java 1 apparently) was released in January of 1996. JDK 1.1 was a year later, in February of 1997. At the risk of dating myself, I was involved with Java pretty much from the start. I believe the first JDK I downloaded (it took forever over a dial-up modem connection!) was JDK 1.0.2. This was the original Java language. Inner classes were introduced a year later in JDK 1.1. But even that was still Java 1. Java 2 (or JDK 1.2, as it was called at the time) was released at the end of 1998.

Doubtless, the original Jack project could have benefited from the collections library introduced in Java 2 but that came out nearly two years later, at the end of 1998. Well, the key point to take away from all this is this: the entirety of the legacy JavaCC codebase was written against Java 1, the oldest, most primitive version of the Java language.

Now, Metamata, Jack's (now JavaCC's) new home, was acquired by some other company called Webgain, that went belly up after the 2001 dot-com bust. However, in the ensuing liquidation of assets (and I am hardly sure of the exact details) JavaCC ended up back at its birthplace, Sun Microsystems. As best I can deduce, after that point, JavaCC was not actively developed at Sun in any significant way. In fact, well before that, development of the tool had crawled to a near standstill. The archived Freshmeat site from that period lists two JavaCC releases, 2.0 dated 4 November 2000 and then a year and a half later, on 16 April 2002, a 2.1. The modifications in the official version history list some changes but they are pretty thin if this is supposed to be a year and a half of development effort.

It is fairly easy to deduce that Sun did not ever allocate any real development resources to JavaCC in this period. By the time Sun released the code as open source in 2003, it was really an orphan project that had not been actively developed for some time. Whoever made such decisions decided that they might as well release JavaCC as open source. In retrospect, this is probably a very typical pattern for nothingburger projects.

Late 2001. My Own Involvement with JavaCC

My first run at using JavaCC was in September of 2001. I remember that fairly well because there were some other events (albeit of lesser importance) taking place around then. On 17 September 2001, I announced on the FreeMarker-devel mailing list a first version of a JavaCC grammar for FreeMarker. The JavaCC version I was using then was 2.0, which had been released nearly a year earlier (November 2000) back when it was still at Webgain.

This was actually the watershed moment when I became heavily involved with FreeMarker, a template engine used in Java mostly in the web application space. There was some community around it, but the original authors had lost interest and I saw a lot of possibilities of doing something with a template engine. Now, the FreeMarker 1 codebase was a good proof of concept and was actually useful as is; I myself had used it in a number of projects in companies around 1999 to 2000. However, I realized that, to take it to the next level, to turn it into a more solid, capable tool, would require quite a bit of refactoring. My assessment was that, in order to lay the groundwork for rapid forward progress, the very first thing to do would be to replace the kludgy hand-written parser with one generated by a parser generator tool.

Yes, something like JavaCC, of course. But, at the time, I was hardly committed to using JavaCC specifically. At the time it seemed (as it probably does now) that the two main candidates were JavaCC and ANTLR. Though I looked at both, for some reason it was just easier to get going with JavaCC. I started with some toy examples, but by a series of incremental changes, pretty soon, I had all of the constructs of FTL (FreeMarker template language, at least, as it existed at the time) specified in a JavaCC grammar and we were able to throw away the old hand-written parser in short order. After that (rightly or wrongly) I never paid much attention to ANTLR.

One aspect of all of this is that, though I became very comfortable using JavaCC – by most reasonable definitions, a “power user” – I never seriously considered using JJTree. I've tried to think back about why that is. I was certainly aware of it at the time. I think the fundamental reason was that it just did not seem like there was a clean development process for using it. It seemed to be based on some fundamental confusion about which files were generated and which ones were source files that you work on. You would generate all of the ASTXXX.java Node classes and then if you wanted to put some code in those classes, then what? Presumably you would post-edit the generated Java source files. That just seemed wrong. But also, it led to a build process that seemed very baroque. You would write your JJTree grammar (which was really just a JavaCC grammar with a few extra tree-building annotations), run JJTree over that to produce a JavaCC grammar, and in turn run JavaCC over that to generate Java source files that would in turn be compiled by javac.

Even though the JJTree approach, of having the tool automatically generate the code for the various ASTXXX classes seemed attractive, there just seemed to be something wrong with how it was set up. Little did I know that, some years later, I would put some serious work into remedying that whole problem.

Mid 2003, Sun releases JavaCC as open source

For nearly two years after I took my initial steps with JavaCC, it was still a closed source product, so I could not have taken any interest in the source code even if I had wanted to do so. (Probably, I would not have wanted to muck with JavaCC source code anyway, since I had my hands full with FreeMarker.)

However, in 2003, Sun decided to release the JavaCC source under a very liberal (BSD-style) open source license. And they set it up as a project on java.net. Well, to understand all of this, I guess you have to understand that Sun Microsystems was a company that wanted to position itself as the standard bearer of… well… openness: Open Systems, Open Standards, Open Source. They were also pushing Java pretty hard. So, as part of that whole campaign, Sun created a platform for open source Java projects called “java.net”.

Well, we now know that java.net never amounted to much. Everybody and his pet dog uses Github, right? But we know that now. Obviously, the people at Sun thought that java.net would amount to some big deal and maybe they even visualized it becoming something like what Github is now, albeit (judging from its very name) mostly focused on Java technology. Of course, Java.net never became a Github. In fact, it never even became a Sourceforge.

Well, Sourceforge is still around, God bless, but Oracle (which had acquired Sun) eventually turned off the lights on Java.net – I guess, since it had become irrelevant and was more or less an embarrassment. But that was in April of 2017, so we are getting ahead of ourselves. Back in 2003, Java.net was new and was hot shit (or was supposed to be) and Sun wanted to prime it with some sexy open source Java projects and released the JavaCC code there to much fanfare. (This was not reported in Paris Match or People magazine, but all the computer rags mentioned it.) I have a hunch that the people behind this operation did not know what a god-awful mess the JavaCC code was. But that is maybe not the point. If this was brought to their attention, they could respond:

Well, that is the point of open-sourcing it! Now, any motivated java hacker out there can get to work cleaning up the code!

I'm sure that's what they would have said, but it would have been, at best, only a half-truth. It is true that one could now download the code. And one could clean it up. And improve it. And extend it. I did exactly that, but it turned out that there was no possibility of getting this work incorporated into the canonical project hosted on Java.net – which, let's face it, due to how the world really works, is what most people would use.

Or, in other words, you could download the code and do things, but without the right kind of people in charge of the “community” on Java.net, there would be no possibility of that work getting any real attention or usage. It would be like the cliche: if a tree falls in the forest and there is nobody there to hear it, was there a sound? (Well, I suppose there was, but there might as well not be!)

2008, the year that China holds the Olympic Games, there is a Global Financial Crisis, and Revusky downloads the JavaCC code

Now, personally speaking, in 2003, when the JavaCC source code was open-sourced, I'm not sure what my reaction was, or whether I gave the matter much thought. But maybe I did think, in the back of my mind, that this was a project that I would like to get involved with. To tell the truth, I have a hard time remembering exactly why I downloaded the code (actually, I checked out the CVS repository on java.net; remember CVS?) and started eyeballing it. Well, I must have had some ideas of things I could do with it, but they may have been vague initially.

Also, before I checked out the code and started looking at it, I had no idea at all of what an awful mess the codebase was. Once I did, I became aware of that state of affairs quite quickly!

One of the first things I did was to import all the JavaCC code into Eclipse as a project and many of the first refactorings of the code were not even my idea per se. You see, as I pointed out earlier, the JavaCC codebase was written against Java 1, well before most of the conventions in Java code became established. So, the Eclipse IDE emitted hundreds of warnings about problems (not errors exactly, which would prevent compilation) in the code. So, a lot of initial work on the codebase just amounted to getting rid of all (or most, anyway) of the warnings that Eclipse emitted.

So, it may well be that the way my hacking of the Java code began was just from importing all the code into a modern (for that time) IDE and trying to get rid of all the warning messages. Of course, once I got rid of most of the warning messages, I started seeing all kinds of places in which to clean up the code. And then it started taking on a life of its own.
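To give a flavor of that kind of warning-driven cleanup, here is a minimal sketch. It is not taken from the JavaCC codebase: the class RawTypeCleanup and the methods joinLegacy and joinModern are invented purely for illustration. The first method is written in the style that was available before the collections library and generics existed (a raw Vector, an Enumeration, an explicit cast), which is the sort of thing a modern IDE flags with raw-type warnings. The second is the same logic written against the generic collections API.

```java
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

// Hypothetical illustration only; none of these names come from JavaCC.
public class RawTypeCleanup {

    // Early-JDK style: raw Vector, Enumeration, explicit cast.
    // Without the annotation, a modern compiler or IDE reports raw-type warnings here.
    @SuppressWarnings("rawtypes")
    static String joinLegacy(Vector names) {
        StringBuffer buf = new StringBuffer();
        String separator = "";
        for (Enumeration e = names.elements(); e.hasMoreElements();) {
            String name = (String) e.nextElement(); // unchecked cast, can fail at runtime
            buf.append(separator).append(name);
            separator = ", ";
        }
        return buf.toString();
    }

    // Same logic against the generic collections API (Java 5 and later).
    static String joinModern(List<String> names) {
        StringBuilder buf = new StringBuilder();
        String separator = "";
        for (String name : names) { // element type is checked by the compiler
            buf.append(separator).append(name);
            separator = ", ";
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Token", "Node", "Parser");
        System.out.println(joinLegacy(new Vector<String>(names)));
        System.out.println(joinModern(names));
    }
}
```

Both calls print the same comma-separated list. The difference is that in the second version the compiler checks the element types, so the cast and the warnings go away.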

Not long into this whole process, I tried to establish contact with the JavaCC “community” thinking (naively, in retrospect) that they would be eager to incorporate all of these improvements into the codebase. Of course, a lot of this would have been just completely unambiguous improvements such as moving towards using the more modern Java APIs. At the time, in 2008, the current version of Java would have been Java 6. Generics, for example, had been introduced in the previous cycle, Java 5. Meanwhile, all of the existing JavaCC codebase had been written against Java 1. My changes to the codebase were so unambiguously improvements that I believed (again, quite naively) that they would be incorporated pretty much immediately. I certainly did not anticipate the passive aggression (later to become open aggression) with which I was met in that community.

Of course, in retrospect, perhaps all of this is hardly surprising. I don't know whether I had any specific terminology at the time for this, but nowadays, I would say that JavaCC was (and, as far as I can see, still is) a Nothingburger Project. Now, that might seem contradictory at first blush, since there was a real tool there, that was quite useful; I myself had been using it for over six years at that point. And it is still widely used to this very day. However, let's be clear: that tool was developed as closed source inside a company. What I mean is that, as an ongoing open source project and community, set up with the goal of developing and extending that initial work, JavaCC was indeed a Nothingburger.

I felt the need to write a separate article about the concept but I think a key characteristic of a Nothingburger project is this:

Even if there is an outward appearance of activity, none of it really pertains to the core codebase. It is all ancillary activity that revolves around things like setting up the build system or some other reorganization that does not touch the core of the code. In fact, the ostensible maintainers of a “Nothingburger project” tend to treat the core code as a no-go zone, a sort of black box, if you will.

This last point is quite important, I think. The typical situation is that the original author or authors (having scratched whatever itch) leave the project, and they were the only people who really understood how the code worked, how it was put together.

Then, the remaining people develop this kind of bizarre pseudo-religious reverence for the code. It is beyond the capacity of normal, mortal men to understand the core code, and certainly, to do anything with it.

When I attempted to donate work to the JavaCC Nothingburger project, it is not that they reviewed my contribution and found it wanting. They simply refused to look at it! One of them (and I will eschew any phony nice-guy-ism and name him) was a character by the name of Tim Pizey. He argued that there was no need to review my contribution because the mere fact that I considered myself capable of making any fundamental contribution itself showed how poor my judgment was and meant that any contribution I made could not be any good. Ergo, no need to review it.

In a way, Pizey's argument was actually comical. It was reminiscent of Groucho Marx's famous quip that he would never sink so low as to join the kind of shitty club that would accept a person like him as a member. Actually, this is the kind of humor that computer hackers, in particular, might appreciate, since it, of course, leads basically to an infinite loop. However, the key difference is that Groucho was making a joke, after all. Pizey, however, was dead serious. (That, in retrospect, could make it all the more funny, I guess.)

Mid 2008 to early 2009, the FreeCC "fork"

By mid-2008, I had put enough energy into the JavaCC codebase that I did not want to abandon it. The only option left to me was to create my own fork of the codebase. (In retrospect, it was not really a fork per se, but I'll get to that point later.) Initially, I named my version “KawaDD” which comes from advancing all of the consonants in “JavaCC” by one. That didn't last long because I soon decided that I didn't like the name that much. Besides, my “fork” was so intimately connected with the FreeMarker template engine, that it seemed much better to call it “FreeCC”. I decided to put it on Google Code, even though, up until that point, I had used Sourceforge.net for any of my open source endeavors. I just figured (quite mistakenly) that something offered by Google would be reliably there in the future. Google Code was shut down in early 2016 – over a year before Oracle shut down Java.net.

By the autumn of 2008, FreeMarker's parser was being generated with FreeCC. The interesting aspect of this is that FreeCC itself used FreeMarker templates to generate code, so it was a quite nifty circular dependency, as described by my FreeMarker collaborator, Attila Szegedi here.

2009 to 2019, My extended Programming Hiatus

When I recently happened on that blog post by Attila, I could not help but feel a twinge of remorse. The overall project that FreeMarker and FreeCC together comprised was really quite appealing. There was quite a lot of possibility of doing some interesting and useful things. Yet, soon after that point, I drifted away from all of it.

It goes beyond that. From early 2009 to late 2019, to all intents and purposes, I did not write a single line of code. And it even goes beyond that. It is not just that I drifted away, but that the whole thing somehow just became aversive to me. Currently, I guess I am still trying to understand how this could happen with something that was such a big part of my life and gave me such great creative satisfaction. I think a lot of the aversive reaction was just a normal, human reaction to the level of injustice I was dealing with. Different people are different, but I think we are all wired (at least originally) with some sense of justice or fairness. I tried to donate a very significant body of work to the so-called “JavaCC community” and was met first with blatant passive aggression, then it was with overt insults, and then they started a campaign of outright character assassination. I put a brave face on this at the time, I believe, but I think it must have been deeply traumatic.

Another thing that was going on at the time – and I have only come to grips with this recently – is that my other main FreeMarker collaborator, Daniel Dekany, was actively undermining me and sabotaging my work during this period as well. I only came to realize this in the last few months because I really tried to piece together the timeline of what had happened. There is no question of this. However, I will outline it in more detail elsewhere. Though I did not openly acknowledge this situation even to myself at the time, I think I did know it subconsciously and was repressing it.

Well, all things eventually pass. So…

Late 2019, My Return to Open source hacking

Readers (at least if you managed to get this far!) are likely wondering how I got back into all of this.

I have some other ongoing projects relating to publishing (controversial) political material on the web and, for the last couple of years, I have been thinking I would like to learn PHP and get into how things like Wordpress work. Yet, in over two years, I have not been able to sit down and learn PHP or Javascript or any of it. At some point, it dawned on me that, given the way my brain is wired together, the best way for me to learn PHP or Javascript could well be to write a formal grammar and have FreeCC generate a parser. And, it even occurred to me that a good PHP and/or Javascript parser might even have a market out there! But that was not such an important consideration either. It really just seemed that this was how I could learn those things. But perhaps the bottom line is that I was finally healing from my earlier trauma and had the itch to hack some code again!

So, I dug up my old FreeCC work and started mucking with it. One thing led to another. I realized that my old FreeCC only supported a very old version of the Java language. (JDK 1.5 basically, and truth be told, only partially.) So I set myself the task of updating it. I ran into a bug and saw that it had been reported in the old issue tracker (by one of FreeCC's handful of users back when) and hunted the bug down and nailed it. (So, you see, all this happened in the first few days or week of downloading this and hacking it.) I then went through and systematically hunted down and nailed all the other bugs in the issue tracker. (There were only 4 other unresolved bugs in the tracker.) And I set about updating Java language support to Java 8. Well, in short, I got into it again and just got hooked. On my birthday (and I won't say here how old) on 26 December of 2019, I finally got Java 8 support squared away. (By the way, since then, I did some further work on this, and at the time of this writing, JavaCC 21 supports the Java language all the way up to JDK 13.) The generated parser could successfully parse all of the over 7000 source files in the JDK 1.8 src.zip. So, finally, I just put out a release and announced it here. Quite something. The previous FreeCC release, 0.9.3 was in January of 2009 and 0.9.4 was in December of 2019! So, I had got that far and I just decided that I was going to try to reactivate all of this. A few people have asked me why. I'm not doing it in any anticipation of making money. (Though if I did, that would be okay too…) I guess, mostly it's just to get a sense of closure on all of this. Something not at all right happened and I would like to get a sense of putting it right. Well, that, and also, I do get immense creative pleasure out of this work. And that has been a revelation. For a decade or so, the whole thing had become so aversive to me that I could hardly look at a line of code without feeling queasy. 
It is so wonderful for me now to reclaim that part of my life.

Another name change. JavaCC (21)

Now, we get to the question that people have already asked me in private, regarding the renaming of what was previously the FreeCC project to JavaCC. (Or JavaCC 21 when it is necessary to distinguish it from the legacy JavaCC project.)

Now, first of all, to be clear about one aspect of all of this, FreeCC (now JavaCC 21) was never a “fork” in any real sense. A fork, i.e. a bifurcation, contains the implicit idea that there are two (or possibly more) lines of active development. That is simply not the case here. As far as I know (and I would surely know it by now if this were not the case) the body of work that I did from Spring of 2008 to very early 2009, and am now resuming, is the only work of any significance that has been done on the JavaCC codebase that Sun open sourced back in 2003. Soon, that will have been 17 years ago and I do not believe that the amount of work done by the ostensible project maintainers amounts to what a single motivated person could do in a single month. There are some other people who got frustrated with the obstructionism of the canonical project maintainers and created their own “forks” of the codebase. However, I do not believe that any of them constitute a body of work remotely comparable to what was done on FreeCC.

As I have stated quite bluntly above, the legacy JavaCC project is just one of these Nothingburger projects. (It is not the only one out there!) Properly understood, it is not even about the people involved in the project currently. I do not recognize the names of most of the people involved in that currently. However, it doesn't matter. There is no remotely realistic prospect of them ever doing anything for a very simple reason:

Nothing significant can be done with the legacy JavaCC codebase without a massive cleanup and refactoring.

I undertook that cleanup and refactoring back in 2008 and it is largely done. The only basis for moving forward on the project is my version of the codebase, the one that has been cleaned up.

A couple of people have expressed misgivings about my taking the JavaCC name. One person said that this would “create confusion”. I agree that there is potentially an issue there. However, my position is that the people creating confusion are the people running the legacy JavaCC project. In general, the people running a Nothingburger project are the ones “creating confusion” because, like it or not, a cold-blooded analysis of the situation is that they are basically perpetrating a fraud, trying to portray an inactive project as something active.

A Nothingburger is essentially a fraud, because it amounts to artfully arranging your bun and your condiments and creating a trompe l'oeil so that people get tricked into thinking there is actually some beef in there.

Meanwhile, JavaCC 21 is what it is being presented as, the active continuation of work on the codebase that Sun open sourced back in 2003.

Other people expressed misgivings to me about trademark sorts of issues. I don't think there's any there there. I made a point of investigating this.

Nobody has ever filed a trademark in any jurisdiction on the name JavaCC.

In any case, the problem here is that this is the only feasible course of action. In the open source world, it frequently happens that people show up in a community, one of these Nothingburger projects and propose some ideas and they are arrogantly dismissed and the people are told that they are free to go “fork” the project.

Well, this is technically true. You can create your own “fork” of an open source project, but the problem is that a project of the vintage of JavaCC has such an immense visibility advantage that it is not really feasible to fork off one's own version and “compete” with that. Now, your version is bound to be technically superior. But that does not really matter. Your version will receive almost zero attention and usage. It's fairly easy to see why this would be the case. You just have to visualize some person sitting in a cubicle in the corporate world out there, who is tasked with figuring out the tool stack to use for whatever project. That person will almost never download something like JavaCC and then download FreeCC and make a technical comparison between the two. For starters, in most cases, he never will have heard of FreeCC in the first place. But, more importantly, once something is well established as a standard thing in its space, rightly or wrongly, it will be perceived in the corporate world as the “safe” option. Our imaginary cubicle occupier is taking no risk by advocating the use of JavaCC, but if he did advocate something called “FreeCC” instead, he would be sticking his neck out and people would ask him: “Why not just use the standard thing?” (Which would be JavaCC in this case.)

Moreover, all of this is like a big ball of wax. Potential collaborators are much less likely to volunteer serious time and effort to contribute work to some project nobody has heard of. You see, people, generally speaking, don't like wasting their time. They know, on the other hand, that any work they donate to something well known will receive plenty of attention and usage.

In any case, after a decade's absence, I fully intend not to repeat mistakes I made in the past. Back when, I should never have tacitly accepted, as I unfortunately did, that these Nothingburger artists had any exclusive right to the name. My work is just as much JavaCC as theirs; it is a continuation of work on the codebase that Sun Microsystems open sourced in 2003. Moreover, it is (by far) the most advanced version of JavaCC available. So, in closing, the name change to JavaCC (JavaCC 21) is, in essence, a recognition of the basic mistake I made back in 2008, that I am not going to repeat this time. I'll doubtless make some mistakes this time around, but at least they will be new ones!

ancient_history.txt · Last modified: 2020/10/02 01:21 by revusky