Grammar compiler-compiler for Java
My company is trying to write some software for Android. We would like to work in Java, but one component of the company's software is written in C++ and so needs to be ported (or at least a port needs to be attempted before resorting to the NDK). This code was created using Accent, and it defines a grammar for grammars. As near as I can tell, the original writer (now gone) wrote a grammar that specifies how to specify a grammar, then built a compiler-compiler from that grammar with Accent. The compiler-compiler takes a grammar in the specified format and produces binary code to parse strings conforming to that grammar. Here's an example snippet of the grammar:
//include rules from this file (such as <alpha>)
include "alphabet.bnf"
<<topSymbol>> = <alpha> <alpha> <alpha>? .//two letters with an optional third
//square brackets enclose an XML statement clarifying semantics of the rule
[
<topSymbol>
  <letter>
    <command val="doSomethingToLetter"/>
  </letter>
  <!--etc.-->
</topSymbol>
]
My question is how to do this with Java, using ANTLR or some other tool. A compiler-compiler-compiler seems rather complicated to me. Alternatively, I would like to know how to easily compile/parse this type of grammar, which contains both grammatical and semantic (XML) information.
If the original designer knew what he was doing and the grammar-driven design is warranted, then you want to preserve that concept. In that case, going with another parser generator (or at least a parsing scheme of some kind) is the right approach. Either JavaCC or ANTLR would be fine as a parser generator; you'll have to hand-translate the grammar. You might hand-code a recursive descent parser if the grammar is simple enough.
If the original designer was simply over the top, then you can probably replace the grammar-driven aspect, but you won't be able to do that without understanding what he was trying to achieve. The fact that this "seems rather complicated to me" suggests you don't really understand parsing and parser generator technology, and that you are driven by a desire to do something you understand rather than preserve something you don't. But it's a bad idea to tear apart something that is well designed and implemented just because you don't understand it. I strongly suggest you learn more about these kinds of technologies and ask why it was implemented this way. Ultimately you may be right and should replace his approach with something else, but make that choice based on knowledge, not fear.
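To give a feel for the hand-coded recursive descent option, here is a minimal sketch for the single rule shown in the question, <topSymbol> = <alpha> <alpha> <alpha>?. The class name TopSymbolParser is made up for illustration, and it assumes <alpha> means one ASCII letter; the real definition lives in alphabet.bnf, which the question does not show. It only recognizes input and ignores the XML semantic annotations entirely.

```java
/**
 * Hand-written recursive descent recognizer for the single rule
 *   <topSymbol> = <alpha> <alpha> <alpha>?
 * Assumption: <alpha> is one ASCII letter (alphabet.bnf is not shown).
 */
final class TopSymbolParser {
    private final String input;
    private int pos = 0;

    private TopSymbolParser(String input) {
        this.input = input;
    }

    /** Returns true if the whole input matches <topSymbol>. */
    static boolean matches(String input) {
        TopSymbolParser parser = new TopSymbolParser(input);
        return parser.topSymbol() && parser.pos == input.length();
    }

    // <topSymbol> = <alpha> <alpha> <alpha>?
    private boolean topSymbol() {
        if (!alpha() || !alpha()) {
            return false;
        }
        alpha(); // optional third letter: fine whether or not it matched
        return true;
    }

    // <alpha>: consume one ASCII letter if one is present (assumed definition)
    private boolean alpha() {
        if (pos < input.length() && isAsciiLetter(input.charAt(pos))) {
            pos++;
            return true;
        }
        return false;
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ab", "abc", "a", "abcd", "a1"}) {
            System.out.println(s + " -> " + matches(s));
        }
    }
}
```

Each grammar rule becomes one method, which is why this scales poorly once the grammar grows or changes often; that is the point at which a parser generator starts to pay off.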
My question is how to do this with Java, using ANTLR or some other tool. A compiler-compiler-compiler seems rather complicated to me.
It sounds complicated to me too!
Alternatively, I would like to know how to easily compile/parse this type of grammar, which contains both grammatical and semantic (XML) information.
No ... there is no easy answer to this. It sounds like your ex-colleague has gone over the top on the complexity front. You are going to have to:
- either get your head around what his code does and how it does it, learn how ANTLR works, and hand-translate the grammar,
- or ditch his code AND design and find a simpler way to do what it is doing.
(Actually, there is a good chance that the code is not as complicated as it seems ... once you get your head around it, and compiler-compiler technology.)
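As a rough illustration of what "grammar-driven" means here, the sketch below reads one rule line in the question's format at run time and matches input against it, instead of generating code. It is not a reconstruction of the original C++: the class TinyRule is hypothetical, it supports only a sequence of symbols with an optional `?`, it treats <alpha> as a built-in meaning one ASCII letter, and it skips the bracketed XML part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Interprets one rule of the form
 *   <<name>> = <symbol> <symbol>? ... .
 * Assumptions (not confirmed by the original code): only sequences with an
 * optional '?' are supported, and <alpha> is a single ASCII letter.
 */
final class TinyRule {
    /** One right-hand-side item: a symbol name and whether it is optional. */
    private static final class Item {
        final String symbol;
        final boolean optional;

        Item(String symbol, boolean optional) {
            this.symbol = symbol;
            this.optional = optional;
        }
    }

    private static final Pattern HEAD =
            Pattern.compile("<<(\\w+)>>\\s*=\\s*(.*?)\\s*\\.");
    private static final Pattern ITEM = Pattern.compile("<(\\w+)>(\\?)?");

    final String name;
    private final List<Item> items = new ArrayList<>();

    private TinyRule(String name) {
        this.name = name;
    }

    /** Parses one rule line; a trailing "//" comment is ignored. */
    static TinyRule parse(String line) {
        int comment = line.indexOf("//");
        String text = (comment >= 0 ? line.substring(0, comment) : line).trim();
        Matcher head = HEAD.matcher(text);
        if (!head.matches()) {
            throw new IllegalArgumentException("Not a rule: " + line);
        }
        TinyRule rule = new TinyRule(head.group(1));
        for (String token : head.group(2).split("\\s+")) {
            Matcher item = ITEM.matcher(token);
            // Only <alpha> is known in this sketch, so matching can assume it.
            if (!item.matches() || !item.group(1).equals("alpha")) {
                throw new IllegalArgumentException("Unsupported item: " + token);
            }
            rule.items.add(new Item(item.group(1), item.group(2) != null));
        }
        return rule;
    }

    /** Returns true if the whole input matches the rule. */
    boolean matches(String input) {
        return matchFrom(0, input, 0);
    }

    // Tries item 'index' at position 'pos', backtracking over optional items.
    private boolean matchFrom(int index, String input, int pos) {
        if (index == items.size()) {
            return pos == input.length();
        }
        Item item = items.get(index);
        if (pos < input.length() && isAsciiLetter(input.charAt(pos))
                && matchFrom(index + 1, input, pos + 1)) {
            return true;
        }
        return item.optional && matchFrom(index + 1, input, pos);
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

Calling TinyRule.parse on the <<topSymbol>> line from the question and then matches("abc") exercises both halves: reading the grammar and applying it. A real port would also need rule references, alternatives, and the XML actions, which is where the effort goes.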
Your best bet is to translate the grammar you have into the grammar format of ANTLR, JavaCC, or some other tool.
Another possibility is to call your C++ code using JNI, but that's fraught with peril.
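For reference, the Java half of such a JNI bridge is small; the peril is mostly on the C++ side (memory ownership, exceptions, building for each Android ABI). The sketch below is only an assumption about what the bridge could look like: the library name "legacyparser" and the method parseToXml are invented for illustration and do not come from the original code.

```java
/**
 * Java side of a hypothetical JNI bridge to the existing C++ parser.
 * "legacyparser" and parseToXml are made-up names for illustration.
 */
final class LegacyParserBridge {
    private static boolean loaded = false;

    private LegacyParserBridge() {
    }

    /** Loads the native library on first use instead of at class-load time. */
    static synchronized void ensureLoaded() {
        if (!loaded) {
            // Throws UnsatisfiedLinkError if the library cannot be found.
            System.loadLibrary("legacyparser");
            loaded = true;
        }
    }

    /** Implemented in C++; assumed to return the parse result as XML text. */
    private static native String parseToXml(String grammarText, String input);

    /** Public entry point: make sure the library is loaded, then call into C++. */
    static String parse(String grammarText, String input) {
        ensureLoaded();
        return parseToXml(grammarText, input);
    }
}
```

With the class in the default package, the matching C++ function would be exported as Java_LegacyParserBridge_parseToXml and take a JNIEnv*, a jclass, and two jstring arguments; the exported name changes if the class is moved into a package.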
I'm not aware of anything that can help. You'll just have to get a shovel and start digging.