ANTLR Lexer Substring

Is there a way of defining a Lexer rule like:

DESCRIPTOR :   'INIT'(.)*'END';

So DESCRIPTOR returns the content between the two labels INIT and END?

I guess I can use return values, such as:

DESCRIPTOR returns [String content]
@init {
   content="";
}: 'INIT'(.)*'END';

But I don't see how I can access that value.

Answers


Note that lexer rules (the ones that start with a capital letter) cannot have a returns clause; only parser rules can.

But the rule:

DESCRIPTOR :   'INIT' (.)* 'END';

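works just fine. By default, ANTLR 3 matches .* and .+ reluctantly (non-greedily), so the rule above matches "INIT" followed by zero or more characters up to the first "END". (In ANTLR 4 these operators are greedy, and the same behaviour needs the explicit non-greedy form .*? instead.)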
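
To see what reluctant matching means in practice, here is a minimal sketch that uses java.util.regex as an analogy, not ANTLR itself. The class name ReluctantDemo, the helper firstMatch and the sample input are made up for illustration; the regex engine only stands in for the way the lexer rule stops at the first "END".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: a regex analogy for the lexer rule, not ANTLR code.
class ReluctantDemo {

    // Returns the first match of regex in input, or null if there is none.
    // DOTALL lets '.' match line breaks too, like '.' in an ANTLR lexer rule.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "INIT a END INIT b END";

        // Reluctant: stops at the first END.
        System.out.println(firstMatch("INIT.*?END", input)); // prints "INIT a END"

        // Greedy: runs on to the last END.
        System.out.println(firstMatch("INIT.*END", input));  // prints "INIT a END INIT b END"
    }
}
```

With reluctant matching, an input containing two INIT ... END blocks yields two separate DESCRIPTOR tokens rather than one token spanning both.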

EDIT

Ah, you want to strip the "INIT" and "END" from the token. You can do that as follows:

DESCRIPTOR 
 : 'INIT' .* 'END' {setText($text.substring(4, $text.length() - 3));}
 ;

where $text is short for getText() (i.e. the entire string the token matched).
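
The offsets 4 and 3 in the action are simply the lengths of "INIT" and "END". The following is a small sketch of the same arithmetic in plain Java, outside of any grammar; the class DescriptorText, the method strip and the sample input are hypothetical names chosen for illustration.

```java
// Hypothetical helper that mirrors the substring call in the lexer action.
class DescriptorText {
    static final String PREFIX = "INIT"; // length 4
    static final String SUFFIX = "END";  // length 3

    // Same as text.substring(4, text.length() - 3), with the offsets
    // derived from the delimiters instead of hard-coded.
    static String strip(String text) {
        return text.substring(PREFIX.length(), text.length() - SUFFIX.length());
    }

    public static void main(String[] args) {
        // Brackets make the surrounding whitespace visible.
        System.out.println("[" + strip("INIT some content END") + "]"); // prints "[ some content ]"
    }
}
```

Note that any whitespace between the labels and the content is kept; calling trim() on the result would remove it if that is not wanted.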

