capy grammar limitations โ reported, and now resolved¶
Status: all resolved upstream. This doc originally catalogued nine places where perch's surface syntax was shaped by limits in the capy grammar engine rather than by design. They were reported upstream; the engine shipped fixes for all nine (commits
e1fba0b+5102dec). perch now pinsgithub.com/olivierdevelops/capy v0.20.1-0.20260529061856-5102decfe5d0and has adopted the ones that improve the language. This page is kept as the record of what was wrong, how it was fixed, and what perch took up.
| # | Limitation | Fix in capy | Adopted by perch |
|---|---|---|---|
| 1 | No backtracking after a block-opener commits | automatic backtracking | โ (enables flat/block keyword sharing) |
| 2 | Nondeterministic candidate ordering (map iteration) | total order, name tiebreaker | โ (no more flaky parses) |
| 3 | No JSON-safe interpolation for ident-or-string | ${asString x} |
โ
(exec argv) |
| 4 | Can't lex flags/paths/globs as one token | word capture (+ tail) |
โ
(exec bare flags) |
| 5 | No context-sensitivity / lookahead | when_followed_by / when_not_followed_by indent |
โ
(bare os/arch in requires) |
| 6 | No varargs / overload-ladder boilerplate | tail capture (quote-preserving, capy โฅ ac128fb) |
โ
(one exec BIN tail function, no arity cap) |
| 7 | # line comments don't parse |
comments { line "#" } |
โ |
| 8 | try/rescue/finally don't parse |
block_sections |
โ
(try โฆ rescue โฆ finally โฆ end ships) |
| 9 | Dotted access not captured bare | dotted_ident |
โ
(bare match err.kind ships) |
What perch adopted, concretely¶
-
#comments (ยง7).lib.capynow declarescomments { line "#" }. Leading and trailing#comments parse and are ignored โ examples and user files can use them freely. -
execwith bare flags and spaced args (ยง3 + ยง4). Theexecgrammar useswordcaptures +${asString}, so a token can be a bare flag/path/glob or a quoted string with embedded spaces, each landing in exactly one argv slot:
git log --oneline -10 # bare flags โ no quotes
docker run -d --name web nginx # bare paths/names
git commit -m "fix the bug" # quoted token kept as ONE slot
This replaced the previous quote-everything ladder (exec docker "run" "-d").
-
Deterministic flat/block keyword sharing (ยง1 + ยง2 + ยง5). The
requiresblock'sos "linux"/arch "amd64"allowlist entries now share the bareos/archkeyword with theos "โฆ" โฆ end/arch "โฆ" โฆ endconditional blocks, disambiguated bywhen_not_followed_by indent(flat entry) vswhen_followed_by indent(block). This undid the earlierrun_on/run_archrename that the collision had forced. -
try/rescue/finally(ยง8) viablock_sections.tryis declared withblock_sections rescue finally closer end; the grammar reconstructs the flat_enter / _catch / _finally / _leavemarker stream the existingopTryhandler already consumes, so the interpreter was unchanged. One semantic refinement: because the_catchmarker is now always emitted,opTrytreats an empty rescue body as "no catch arm," sotry โฆ endandtry โฆ finally โฆ endcorrectly re-raise โ only a non-emptyrescueswallows. The error binding is fixed toerr(the universal convention). -
Bare
match err.kind(ยง9) viadotted_ident. Thematch-ident grammar usesdotted_ident, which captures both a plain binding (os) and a dotted member path (err.kind) as one token. Error bindings are stored under their literal dotted key, somatch err.kindresolves directly. The string formmatch "${err.kind}"still works. -
ยง6
tail(unbounded varargs). Originallytailstripped quotes when rejoining tokens, which lost the slot boundary for spaced args (exec git commit -m "fix the bug"โcommit -m fix the bug), so perch kept a cappedword-ladder. capyac128fbmadetailquote-preserving, soexeccollapsed to a singleexec BIN tailfunction (no arity cap). The argv string is shell-split at load time (loader.go shellSplitArgs) on the literal source โ before interpolation โ so the ยง3.3 keystone holds: a${x}token stays one slot even if its value has spaces. (Minor quirk: redundantly quoting single-word tokens โexec docker "run" "-d"โ confuses the splitter; just write them bare,exec docker run -d. Quote only tokens that contain spaces.)