Op catalog
The built-in "standard library": every op the perch runtime can dispatch. Each op is implemented in `infra/ops/` and registered in `infra/capyloader/lib.capy`.
Ops fall into two shapes:
- Statement ops: invoked as a body line for their side effects, e.g. `go build` (a bare declared bin) or `mkdir "./out"`.
- Capturable ops: invoked via `NAME = OP ARGS` to capture their return value, e.g. `h = sha256_file "./bin"`.
Most ops support both shapes (return value is discarded if you don't assign it).
External vs pure. Ops that touch something outside the program, whether subprocess (`shell`, `pkg_install`, `bin_version`, …), network (`http_*`, `dns_lookup`, …), filesystem (`read_file`, `write_file`, `cp`, …), or environment (`get_env`, `set_env`, …), are gated by the `requires` manifest when a file declares one: each op verifies its declaration immediately before executing, and undeclared access raises an error. Pure ops (strings, JSON, regex, hashing of in-memory values, version compare, path-string manipulation) and benign host-fact reads (`get_os`, `hostname`, dir-name helpers) are never gated. The authoritative per-op classification is in capability-gating.md.
Argument forms: quoted string vs bare ident
A single-arg op accepts its argument two ways:

    url = get_env "API_URL"
    print "${url}"            # string form: interpolation
    print url                 # bare ident: resolves the binding directly (no ${...})
    body = http_get "${url}"  # string form
    body = http_get url       # bare ident: same result

Bare idents work for plain binding names. Dotted bindings (`err.kind`, `err.message`) still need the string form, because the tokenizer treats `.` as a separator: use `match "${err.kind}"`, not `match err.kind`. Plain `match os` / `match status` work bare.
Process & I/O

| Op | Signature | Notes |
|---|---|---|
| `print MSG` | (string) | Prints MSG + newline to stdout. |
| `println MSG` | (string) | Alias for `print`. |
| `eprintln MSG` | (string) | Prints to stderr. |
| `shell CMD` | (string) | Deprecated; prefer `exec`. Runs CMD via bash (POSIX) or cmd.exe (Windows). Use only when you need genuine shell features (a value that must word-split, or a one-off awk/sed chain). |
| `shell_output CMD` | (string) → string | Deprecated; `exec` captures stdout too. Same as `shell` but captures stdout. |
| `shell_detached CMD` | (string) | Starts CMD and returns immediately. Use with the `detached` modifier. |
| `BIN tok…` (bare) | (word, word…) → string | The normal way to run a subprocess. A bare declared bin runs BIN directly (no `sh -c`). Each token is one argv slot: bare flags/paths/globs work unquoted (`git log --oneline -10`); quote a token to keep embedded spaces (`git commit -m "fix it"`). No word-split, no glob, no metachar surface. Streams and captures stdout. Gated by `requires` (`bin_not_declared`). |
| `exec BIN tok…` | same | Explicit form of a bare bin call. Needed only when the bin name collides with a built-in op (`exec rm`, `exec mkdir`, `exec chmod`). Captures work bare: `h = git rev-parse HEAD`. See sandboxed-by-design.md §3.2. |
| `exec a && exec b` | chain | `&&` / `\|\|` / `;` join exec clauses by exit status (perch operators, not shell metachars): `&&` on success, `\|\|` on failure, `;` always. Short-circuits; the chain raises if its last run clause fails. |
| `pipe … end` | block → string | Wires stdout→stdin between exec stages with in-process pipes, no shell. `out = pipe … end` captures the final stage. Each stage is a declared-bin exec. |
| `fail MSG` | (string) | Exits non-zero with the message. |
| `exit N` | (int) | Exits with code N. |
| `sleep SECS` | (any) | Sleeps for SECS seconds. Accepts float. |
| `NAME args…` | bare name | Invoke another command (or expand a template) by its name, with no run/call keyword. Bindings persist into the callee. Names are globally unique, so resolution is unambiguous; `exec NAME` forces the subprocess reading. |
| `list_commands` | () | Prints the visible commands in the program. |
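The chain row above describes short-circuit evaluation by exit status. The following Go sketch models that rule only; it is not perch's implementation. The `clause` type and `runChain` function are hypothetical names, and each command is replaced by a stand-in function that returns whether it "exited 0". It also assumes that a skipped clause leaves the previous status unchanged, as in a POSIX shell.

```go
package main

import "fmt"

// clause is a hypothetical model of one exec step in a chain: the operator
// joining it to the previous clause ("" for the first) and a stand-in for
// running the command.
type clause struct {
	op  string      // "", "&&", "||", or ";"
	run func() bool // true means the command exited with status 0
}

// runChain evaluates clauses left to right. It returns whether the last
// clause that actually ran succeeded, and how many clauses ran.
func runChain(clauses []clause) (ok bool, ran int) {
	ok = true
	for i, c := range clauses {
		if i > 0 {
			if c.op == "&&" && !ok {
				continue // previous status was a failure: skip this clause
			}
			if c.op == "||" && ok {
				continue // previous status was a success: skip this clause
			}
			// ";" always runs the next clause.
		}
		ok = c.run()
		ran++
	}
	return ok, ran
}

func main() {
	pass := func() bool { return true }
	bad := func() bool { return false }

	// bad && pass || pass: the middle clause is skipped, the fallback runs.
	ok, ran := runChain([]clause{{"", bad}, {"&&", pass}, {"||", pass}})
	fmt.Println(ok, ran) // prints "true 2"
}
```

Under this model, a chain whose final executed clause fails returns `ok == false`, which corresponds to the case where the table says the chain raises.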
File system

| Op | Signature |
|---|---|
| `mkdir PATH` | (string); creates all parent dirs |
| `cp SRC DST` | (string, string) |
| `mv SRC DST` | (string, string) |
| `rm PATH` | (string); recursive |
| `cd PATH` | (string); changes the bindings cwd; subsequent ops use it |
| `chmod PATH MODE` | (string, string); MODE is octal, e.g. `"0755"` |
| `touch PATH` | (string) |
| `write_file PATH CONTENT` | (string, string) |
| `read_file PATH` | (string) → string |
| `exists PATH` | (string) → bool |
| `is_dir PATH` | (string) → bool |
| `is_file PATH` | (string) → bool |
| `file_size PATH` | (string) → int (bytes) |
Control flow (block ops)
Each block op wraps a body that runs only when the condition holds.

| Op | Body runs when |
|---|---|
| `if os == "darwin" … end` | the OS matches `runtime.GOOS` |
| `if arch == "arm64" … end` | the architecture matches `runtime.GOARCH` |
| `if exists "path" … end` | the path exists on disk |
| `if A == B … end` | A == B (string compare) |
| `if A != B … end` | A != B |
| `if A > B … end` | A > B (numeric) |
| `if A < B … end` | A < B (numeric) |
| `if not X … end` | X is the empty string |
| `if X … end` | X is non-empty |
Strings
(Mostly used via `NAME = op …` capture.)

| Op | Signature |
|---|---|
| `upper STR` | (string) → string |
| `lower STR` | (string) → string |
| `trim STR` | (string) → string (strips surrounding whitespace) |
| `capitalize STR` | (string) → string |
| `length STR` | (string) → int |
| `contains STR SUB` | (string, string) → bool |
| `has_prefix STR PFX` | (string, string) → bool |
| `has_suffix STR SFX` | (string, string) → bool |
| `replace STR "OLD,NEW"` | (string, string) → string; second arg is comma-separated |
| `split STR SEP` | (string, string) → []string |
| `join LIST SEP` | ([]any, string) → string |
| `repeat STR N` | (string, int) → string |
| `format FMT VAL` | (string, any) → string; Go `fmt.Sprintf` semantics |
Line toolbox (pure)
Operate on captured multi-line output (e.g. from `exec` / `pipe`) as lines: the perch-native replacements for the middle stages of a shell pipeline. All are pure and need no capability. The text is the last argument. See sandboxed-by-design.md §3.5.

| Op | Signature |
|---|---|
| `grep PAT TEXT` | (string, string) → string; keep lines matching regex PAT |
| `reject PAT TEXT` | (string, string) → string; keep lines NOT matching PAT |
| `cut N TEXT` | (int, string) → string; Nth whitespace field (1-indexed) of each line |
| `head N TEXT` | (int, string) → string; first N lines |
| `tail N TEXT` | (int, string) → string; last N lines |
| `sort_lines TEXT` | (string) → string; lines sorted lexicographically |
| `uniq_lines TEXT` | (string) → string; collapse adjacent duplicate lines (pair with `sort_lines`) |
| `count_lines TEXT` | (string) → int; number of lines |
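To make the line-oriented semantics concrete, here is a small Go sketch of three of the ops above (`grep`, `cut`, `head`) composed the way a shell pipeline's middle stages would be. It illustrates the documented behaviour and is not the code in `infra/ops/`. The function names are hypothetical, and two details are assumptions: a single trailing newline is ignored, and `cutField` emits an empty string for lines with fewer than N fields.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// splitLines breaks text into lines, ignoring one trailing newline.
func splitLines(text string) []string {
	text = strings.TrimSuffix(text, "\n")
	if text == "" {
		return nil
	}
	return strings.Split(text, "\n")
}

// grepLines keeps the lines that match the regular expression pat.
func grepLines(pat, text string) (string, error) {
	re, err := regexp.Compile(pat)
	if err != nil {
		return "", err
	}
	var kept []string
	for _, line := range splitLines(text) {
		if re.MatchString(line) {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\n"), nil
}

// cutField returns the nth whitespace-separated field (1-indexed) of each
// line. Lines with fewer than n fields yield "" (an assumption).
func cutField(n int, text string) string {
	var out []string
	for _, line := range splitLines(text) {
		fields := strings.Fields(line)
		if n >= 1 && n <= len(fields) {
			out = append(out, fields[n-1])
		} else {
			out = append(out, "")
		}
	}
	return strings.Join(out, "\n")
}

// headLines returns the first n lines of text.
func headLines(n int, text string) string {
	lines := splitLines(text)
	if n < 0 {
		n = 0
	}
	if n < len(lines) {
		lines = lines[:n]
	}
	return strings.Join(lines, "\n")
}

func main() {
	// Stand-in for captured subprocess output.
	out := "a1b2c3 fix parser\nd4e5f6 add docs\n0a9b8c fix tests\n"

	fixes, err := grepLines(`\bfix\b`, out)
	if err != nil {
		panic(err)
	}
	fmt.Println(headLines(1, cutField(1, fixes))) // prints "a1b2c3"
}
```

Each helper takes the text as its last argument, mirroring the argument order in the table, so the calls nest the same way the ops would chain through captures.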
Hashing

| Op | Signature |
|---|---|
| `md5 STR` | (string) → string (hex) |
| `sha1 STR` | (string) → string |
| `sha256 STR` | (string) → string |
| `crc32 STR` | (string) → string |
| `md5_file PATH` | (string) → string |
| `sha1_file PATH` | (string) → string |
| `sha256_file PATH` | (string) → string |
Encoding

| Op | Signature |
|---|---|
| `base64_encode STR` | (string) → string |
| `base64_decode STR` | (string) → string |
| `hex_encode STR` | (string) → string |
| `hex_decode STR` | (string) → string |
| `url_encode STR` | (string) → string |
| `url_decode STR` | (string) → string |
| `json_parse STR` | (string) → any |
| `json_stringify VAL` | (any) → string |
| `json_get DOC PATH` | (string\|any, string) → any; dot-path into a JSON document |
HTTP

| Op | Signature |
|---|---|
| `http_get URL` | (string) → string (response body) |
| `http_post URL BODY` | (string, string) → string |
| `http_put URL BODY` | (string, string) → string |
| `http_delete URL` | (string) → string |
| `download URL DST` | (string, string); saves response body to file |
Security defaults (always-on, no flag required):
Every URL, both the initial request and every redirect destination, is validated:
- No private-IP destinations. Refuses loopback (`127.0.0.0/8`, `::1`), link-local (`169.254.0.0/16`, home of the AWS / GCP / Azure metadata service), RFC 1918 private (`10/8`, `172.16/12`, `192.168/16`), IPv6 ULA (`fc00::/7`), and unspecified addresses. Closes SSRF.
- No `https → http` redirect downgrade.
- Cap of 5 redirect hops.
- DNS-rebinding defense. Multi-A responses fail if any record lands in a private range.
Opt-out flags when you genuinely need a private service or legacy endpoint:
- `--allow-private-ips`: permit private/loopback IPs
- `--allow-scheme-downgrade`: permit https → http redirects
- `--max-redirects N` / `--no-redirects`: change/disable the cap
Strict host allowlist (opt-in, tightest policy):
`--allow-host HOST[,HOST...]` restricts every URL (initial + all redirects) to a list. Patterns: exact (`api.github.com`), single-label wildcard (`*.s3.amazonaws.com`; the `*` matches one label only, so for `*.x.com`, `api.x.com` matches and `a.b.x.com` does not), host:port (`localhost:8080`), IP literal. Multiple flags accumulate. Composes AND-wise with the SSRF guard.

    # Tight HTTP policy for an AI-agent-served .perch
    perch --allow-host api.github.com,*.docker.io,registry.npmjs.org \
          --no-shell --env GITHUB_TOKEN -f ops.perch

See `perch help --allow-host` for the full story.
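The single-label wildcard rule is easy to get subtly wrong, so here is a minimal Go sketch of it. It covers only exact and `*.suffix` patterns compared against a bare hostname. host:port and IP-literal patterns fall through to the exact comparison. Case-insensitive matching is an assumption, and `matchHost` / `hostAllowed` are hypothetical names, not perch functions.

```go
package main

import (
	"fmt"
	"strings"
)

// matchHost reports whether host matches pattern. A leading "*." matches
// exactly one DNS label; anything else must match exactly.
func matchHost(pattern, host string) bool {
	pattern = strings.ToLower(pattern)
	host = strings.ToLower(host)

	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[2:]
		// Split off the first label; the remainder must equal the suffix,
		// so "a.b.x.com" does not match "*.x.com".
		label, rest, found := strings.Cut(host, ".")
		return found && label != "" && rest == suffix
	}
	return pattern == host
}

// hostAllowed reports whether host matches any pattern in the allowlist.
func hostAllowed(patterns []string, host string) bool {
	for _, p := range patterns {
		if matchHost(p, host) {
			return true
		}
	}
	return false
}

func main() {
	allow := []string{"api.github.com", "*.x.com"}
	fmt.Println(hostAllowed(allow, "api.x.com"))      // prints "true"
	fmt.Println(hostAllowed(allow, "a.b.x.com"))      // prints "false"
	fmt.Println(hostAllowed(allow, "x.com"))          // prints "false"
	fmt.Println(hostAllowed(allow, "api.github.com")) // prints "true"
}
```

Matching on whole labels rather than with a plain suffix test also keeps `evilx.com` from slipping past a `*.x.com` pattern.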
Compression / archives

| Op | Signature |
|---|---|
| `gzip SRC DST` | (string, string) |
| `ungzip SRC DST` | (string, string) |
| `tar_create SRC_DIR DST` | (string, string); gzipped tarball |
| `tar_extract SRC DST` | (string, string) |
| `zip_create SRC_DIR DST` | (string, string) |
| `zip_extract SRC DST` | (string, string) |
Time

| Op | Signature |
|---|---|
| `now FORMAT` | (string?) → string. Formats: `rfc3339` (default), `rfc822`, `unix`, `unix_milli`, `date`, `time`, `datetime`, or any Go layout. |
| `unix_to_iso SECS` | (int) → string |
Regex

| Op | Signature |
|---|---|
| `regex_match PATTERN STR` | (string, string) → bool |
| `regex_replace PATTERN STR REPL` | (string, string, string) → string |
| `regex_find_all PATTERN STR` | (string, string) → []string |
Network

| Op | Signature |
|---|---|
| `hostname` | () → string |
| `dns_lookup HOST` | (string) → []string |
| `port_check HOST PORT` | (string, string) → bool |
System

| Op | Signature |
|---|---|
| `get_os` | () → string (darwin/linux/windows) |
| `get_arch` | () → string (amd64/arm64) |
| `get_env NAME` | (string) → string (errors with `env_not_declared` if a `requires` block is present and NAME isn't declared) |
| `set_env NAME VAL` | (string, string); process-lifetime env var |
| `export NAME VAL` | (string, string); alias for `set_env` (familiar shell verb) |
| `unset NAME` | (string); removes an env var (process + binding overlay); alias `unset_env` |
| `unset_env NAME` | (string); same as `unset` |
| `cwd` | () → string |
| `home_dir` | () → string |
| `temp_dir` | () → string |
| `app_data_dir` | () → string (platform-aware) |
| `cache_dir` | () → string |
| `pid` | () → int |
| `hostname` | () → string |
| `user` | () → string |
How to add an op
Two files:
- A Go handler in `infra/ops/<category>.go`.
- An optional capy entry in `infra/capyloader/lib.capy` if you want users to call it as a statement (`my_op "x"`). Capturable ops used only via `let` need no entry: the generic `let_1arg` / `let_2args` patterns match any op kind.
That's it. Tests and a doc-table row in this file are welcome.