I started using ANTLR to generate a simple parser for interpolated strings. Examples of input strings follow (one per line):
hello {user.name}!
welcome on planet {getplanetname(" stupid string param :-} ")}
plain string without interpolated expression
string escaped {{ brackets }}
The grammar decides whether a plain string (plainstring) or an expression (expressionstring) follows:

grammar t;

patternstring: (plainstring | expressionstring)+ ;

plainstring: (cbo_escapesequence | cbc_escapesequence | plainstringliteral)+ ;

expressionstring: cbo expression cbc
    | curlybrackets_empty ;

expression: expressionsegment+ ;

expressionsegment: ~('"' | '\'' | '{' | '(' | '[' | '}' | ')' | ']' | cbo_escapesequence | cbc_escapesequence)+
    | '(' expressionsegment+ ')'
    | '(' ws ')'
    | '()'
    | '[' expressionsegment+ ']'
    | '[' ws ']'
    | '[]'
    | '{' expressionsegment+ '}'
    | curlybrackets_empty
    | stringliteral
    | charliteral ;

stringliteral: '"' (~('"') | '\\"')+ '"'
    | '""' ;

charliteral: '\'' (~('\'') | '\\\'')+ '\'' ;

fragment ws: (' ' | '\r' | '\n' | '\t')+;
plainstringliteral: ~('{' | '}');
curlybrackets_empty: (cbo ws cbc | cbo cbc);
cbo: '{';
cbc: '}';
fragment cbo_escapesequence: '{{';
fragment cbc_escapesequence: '}}';

This works, except for strings like the following:
{{{new[]{1, 2, 3, 4}}}}
which gives me the following AST:
patternstring => '{{{new[]{1, 2, 3, 4}}}}'
  expressionstring => '{{{new[]{1, 2, 3, 4}}}}'
    expression => '{{new[]{1, 2, 3, 4}}}'
      expressionsegment => '{{new[]{1, 2, 3, 4}}}'
        expressionsegment => '{new[]{1, 2, 3, 4}}'
          expressionsegment => 'new[]'
          expressionsegment => '{1, 2, 3, 4}'
            expressionsegment => '1, 2, 3, 4'

whereas I expect (and want) the following AST:
patternstring => '{{{new[]{1, 2, 3, 4}}}}'
  plainstring => '{{'
  expressionstring => '{new[]{1, 2, 3, 4}}'
    expression => 'new[]{1, 2, 3, 4}'
      expressionsegment => 'new[]'
      expressionsegment => '{1, 2, 3, 4}'
        expressionsegment => '1, 2, 3, 4'
  plainstring => '}}'

Meaning, plainstring should be more greedy and take the escaped brackets whenever possible. How can I fix this in the above grammar?
I think the issue is due to the explicit definition of rules for the opening and closing curly braces while also referencing them as string literals in some of the parser rules. After modifying the expression segment rule to reference the lexer rules, the issue seems to be resolved. Please try out this grammar and see if the issue is fixed:
expressionstring: cbo expression cbc
    | curlybrackets_empty ;

expression: expressionsegment+ ;

expressionsegment: l_paren expressionsegment+ r_paren
    | l_bracket expressionsegment+ r_bracket
    | cbo expressionsegment+ cbc
    | l_paren ws r_paren
    | l_bracket ws r_bracket
    | l_paren r_paren
    | l_bracket r_bracket
    | curlybrackets_empty
    | stringliteral
    | charliteral
    | ~(double_quote | single_quote | cbc | cbo | l_paren | l_bracket | r_paren | r_bracket)+ ;

stringliteral: '"' (~('"') | '\\"')+ '"'
    | '""' ;

charliteral: '\'' (~('\'') | '\\\'')+ '\'' ;

ws: (' ' | '\r' | '\n' | '\t')+;
plainstringliteral: ~('{' | '}');
curlybrackets_empty: (cbo ws cbc | cbo cbc);
cbo: '{';
cbc: '}';
l_paren: '(';
r_paren: ')';
l_bracket: '[';
r_bracket: ']';
single_quote: '\'';
double_quote: '"';

As you can see, the parse tree seems to reflect what you are looking for.
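For reference, the split the question asks for can be written down outside ANTLR as a small hand-rolled scanner. This is only a minimal Python sketch of the intended behaviour, not a replacement for the grammar. It assumes that '{{' and '}}' are always escapes in plain text, that a single '{' opens an expression, and that only curly braces and quoted literals need tracking inside an expression. The names split_interpolated, _find_closing_brace and _skip_quoted are hypothetical helpers made up for this illustration.

```python
def _skip_quoted(text, start):
    """Return the index just past the quoted literal opening at text[start]."""
    quote = text[start]
    i = start + 1
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif text[i] == quote:
            return i + 1
        else:
            i += 1
    raise ValueError(f"unterminated literal starting at index {start}")


def _find_closing_brace(text, start):
    """Return the index of the '}' that matches the '{' at text[start]."""
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch in "\"'":
            # Braces inside string or char literals must not count.
            i = _skip_quoted(text, i)
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError(f"unterminated expression starting at index {start}")


def split_interpolated(text):
    """Split text into ('plain', s) and ('expr', s) parts (assumed rules)."""
    parts = []
    plain = []

    def flush_plain():
        if plain:
            parts.append(("plain", "".join(plain)))
            plain.clear()

    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("{{", "}}"):
            # Greedy: a doubled brace in plain context is taken as an escape.
            plain.append(pair)
            i += 2
        elif text[i] == "{":
            flush_plain()
            end = _find_closing_brace(text, i)
            parts.append(("expr", text[i:end + 1]))
            i = end + 1
        elif text[i] == "}":
            raise ValueError(f"unmatched '}}' at index {i}")
        else:
            plain.append(text[i])
            i += 1
    flush_plain()
    return parts


if __name__ == "__main__":
    for part in split_interpolated("{{{new[]{1, 2, 3, 4}}}}"):
        print(part)
```

Run on the problem input, this yields a plain part '{{', an expression part '{new[]{1, 2, 3, 4}}' and a plain part '}}', which is the top-level shape of the expected tree. The point of the sketch is that the escape check happens only in plain context, while inside an expression the braces are simply counted; whether the grammar above reproduces exactly this on every input still needs to be checked in ANTLR itself.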
