The hand-coded BNF grammar in lib/Chalk/Grammar/BNF.pm is currently sensitive to trailing whitespace in epsilon productions. This causes parse failures for valid BNF syntax.
Works: Rule ->\n (no trailing whitespace)
Fails: Rule -> \n (trailing space before newline)
The GrammarRule pattern uses qr/\s*/ after ->, which matches ALL whitespace including newlines:
['GrammarRule', [
    qr/[A-Z][a-zA-Z0-9_]*/,  # LHS
    qr/\s*/,                 # Optional whitespace
    '->',
    qr/\s*/,                 # ← This consumes the newline!
    'RHS'
]],
When trailing whitespace exists, the qr/\s*/ matches both the space AND the newline, leaving nothing for the Line rule to match:
['Line', ['GrammarRule', "\n"]], # ← No newline left to match
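The effect can be reproduced outside the parser with plain Perl regexes. The following is a minimal sketch, not Chalk code: `remainder_after_arrow` is a hypothetical helper that only shows what text is left over after `->` plus a given whitespace pattern has matched. It says nothing about how the rest of the grammar reacts to a different pattern.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Chalk): match '->' followed by the given
# whitespace pattern at the start of $input and return whatever is left over.
sub remainder_after_arrow {
    my ($input, $ws) = @_;
    return unless $input =~ /\A->$ws/;
    return substr($input, $+[0]);    # $+[0] is the end offset of the match
}

my $input = "-> \n";                 # arrow, trailing space, newline

# \s* swallows the space and the newline, so nothing remains for "\n" in Line.
my $after_greedy = remainder_after_arrow($input, qr/\s*/);

# [^\S\n]* matches whitespace except the newline, so "\n" is still available.
my $after_horizontal = remainder_after_arrow($input, qr/[^\S\n]*/);

printf "greedy leaves %d char(s), horizontal-only leaves %d char(s)\n",
    length $after_greedy, length $after_horizontal;
```

With `qr/\s*/` the remainder is empty; with `qr/[^\S\n]*/` the newline survives for the Line rule.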
grammar/perl.bnf (lines 243, 288, 371, 462)
Changing qr/\s*/ to qr/[ \t]*/ (horizontal whitespace only) was attempted but caused other parsing failures. The issue is more complex than a simple pattern change.
Option 1: Restructure GrammarRule pattern
qr/(?:[^\S\n]*)/
Option 2: Preprocess BNF content
Option 3: Restructure grammar rules
GrammarRule -> LHS '->' RHS | LHS '->' EpsilonMarker
Start with Option 1 (pattern fix) with comprehensive testing. The core issue is that qr/\s*/ is too greedy at the end of GrammarRule. We need a pattern that:
->
# Should parse successfully with semantic actions
my $bnf = <<'BNF';
Rule1 ->
Rule2 -> Element
Rule3 -> Element1 Element2
BNF
my $g = Chalk::BNF::parse_with_semantic_actions($bnf);
# Currently fails if Rule1 has trailing space
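Option 2 (preprocessing) could look roughly like the sketch below, which strips trailing whitespace from every line before the text reaches the parser. `strip_trailing_whitespace` is a hypothetical helper, not an existing Chalk function, and whether preprocessing fits the parser's design is an open question here.

```perl
use strict;
use warnings;

# Hypothetical preprocessing step (not part of Chalk): remove whitespace that
# sits directly before a newline or at the very end of the input.
# Note: [^\S\n] also matches "\r", so CRLF line endings become plain LF.
sub strip_trailing_whitespace {
    my ($bnf) = @_;
    (my $clean = $bnf) =~ s/[^\S\n]+$//mg;
    return $clean;
}

my $bnf     = "Rule1 -> \nRule2 -> Element\n";   # Rule1 has a trailing space
my $cleaned = strip_trailing_whitespace($bnf);

print $cleaned eq "Rule1 ->\nRule2 -> Element\n"
    ? "trailing whitespace removed\n"
    : "unexpected result\n";
```

The cleaned string would then be passed to the parser in place of the raw BNF text. This leaves the grammar rules untouched, at the cost of the parser never seeing the original whitespace.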
feature/semantic-actions-architecture