
🪟 Web UI: perch --server

Same .perch file, same interpreter, same restrictions, rendered as a friendly localhost web app. Two products in one binary.

Who this is for

perch --server serves two distinct audiences with the same code. The pitch is different for each, so pick yours:

🤖 Audience 1: "I'm using AI agents and want to see what they're doing"

Agents now run operational work on behalf of non-technical teammates ("deploy my app", "restart that pod", "check the build"). The non-dev sees the agent's summary, not the actual ops. When something goes sideways, they have no surface to investigate on.

perch --server is the "shows your work" companion to perch-mcp: the same .perch file produces both the agent's MCP tool surface AND the human's audit/run UI. Open the UI in a browser tab next to the agent, and every op the agent fires through perch_run streams into the Run tab live. Want to know if the API is even up before letting the agent retry? Open the 🧪 Simulate tab and find out in a click. Want to know what a verb actually does before granting it to the agent? Open 🔍 Scan and see every shell call, host, and write root, with risk findings.

The framing: agents are great at deciding what to do; humans want to see what's happening.

๐ŸŽ›๏ธ Audience 2 โ€” "I just want to control my system from a UI"

You operate stuff (Docker containers, Kubernetes clusters, your home server, a small fleet of VMs, a Python project, whatever) and you'd rather click than type the same kubectl ... | jq ... | grep ... for the 400th time. You don't want to:

  • Write a frontend. (Cobra/Click/Typer don't ship one. Building one is a weekend.)
  • Stand up Retool / Backstage / a Next.js dashboard for ten internal verbs.
  • Use the cloud provider's console, which has 200 menu items and doesn't know about your workflow.
  • Run six different self-hosted UIs (Portainer for Docker, Cockpit for the box, …) each with their own auth story.

Declare your verbs in commands.perch. Run perch --server. That's the UI. Add a command restart_pod, refresh the page, the form is there. No app to deploy, no admin panel to maintain, no CSS to write. The file is the dashboard.

For self-hosters: pair with the 22 recipes. Postgres / Redis / mailpit / observe (Prometheus+Grafana+Loki) / aistack (Ollama+ChromaDB+WebUI) / kafka-stack all become one-click verbs in your browser. No docker-compose ps followed by squinting at a 12-line table.

Why both audiences work from the same code

| What you write | Who consumes it | Surface |
| --- | --- | --- |
| command deploy_canary do … end | An agent via MCP | perch_run deploy_canary tool call → JSON in / NDJSON progress out |
| same line | A human in the browser | Form with -region, -bake_minutes, Run button → output panel with streamed ✓/✗ |
| same line | CI | perch deploy_canary -region=us-east-1 |
| same line | A future intern | perch deploy_canary --help |

One file. Four consumers. Zero duplicate schemas. Adding a verb means changing exactly one line.

Why this matters more in 2026 than it did in 2024

The "non-dev runs a runbook" use case used to be niche. With AI agents now executing operational work on behalf of non-technical teammates, it's becoming central, and the visibility gap matters more than the interface gap.

The problem with agent-only workflows:

  • An agent receives "deploy my app" and runs five commands. The non-dev sees one sentence of summary.
  • Something went sideways. Was it the build? The auth token? A network policy? The agent says "I ran into an issue" and the human has no surface to investigate on.
  • The non-dev wants to check something ("before the agent retries, is the staging env even reachable?") but checking means a terminal, which means asking an engineer.
  • The agent could in principle answer all of these. In practice, the loop of "ask, wait for the agent to think, parse its reply" is far slower than a 1-second click.

What perch's web UI adds to that picture:

| Without the UI | With perch --server |
| --- | --- |
| Agent runs perch_run deploy_canary via MCP; non-dev sees the agent's summary only | Non-dev opens http://localhost:8080 alongside the agent, and every op the agent fires streams into the Run tab live |
| "Is the API even up before the agent retries?" → ask the agent → wait | Open the 🧪 Simulate tab → paste a fixture with the API returning 500 → see exactly what would happen |
| "What does this command actually do?" → trust the agent's description | Open 🔍 Scan → see every shell call, every host, every write root the command touches, with risk findings |
| "Can I run this myself instead of asking the agent?" → no | The ▶ Run tab is literally the same verbs the agent has, just clickable |

The framing: agents are great at deciding what to do; humans want to see what's happening. perch's UI is the transparency surface: the same .perch file produces both the agent's MCP tools (via perch-mcp) AND the human's audit/run UI (via --server). One file, two consumers, no duplicate code, no out-of-sync schemas.

A non-dev with the UI open in a tab can: watch agent-initiated runs in real time, run pre-flight simulate / scan before granting the agent a risky verb, take over and run a verb themselves when the agent is stuck, and copy the verb invocation back as a CLI command for handoff to an engineer.



TL;DR

perch -f commands.perch --server --port 8080
# → open http://127.0.0.1:8080

That's it. The UI auto-renders every declared command as a form, exposes the same pre-flight tools the CLI does (--check, --scan, simulate), and streams output live as the interpreter walks ops.

Who this is for: support engineers running runbooks; QA running canned test sequences; product / ops folks who'd rather click than perch -f deploy.perch deploy_canary -region=us-east-1 -bake_minutes=15; new hires on day one before they've set up a terminal.

Who this is NOT for: multi-tenant SaaS hosting. --server is single-tenant + localhost-bound by default; put it behind your existing reverse proxy + SSO for shared access.


Five tabs, one file

The UI is hash-routed (http://host/#run, #simulate, #scan, #check, #about), so you can bookmark the tab you use most.

▶ Run

The default view. Lists every visible command from the loaded .perch file:

  • Live search/filter across command names + descriptions. (Essential when you --include recipes/ and have 22 commands in one file.)
  • Type-aware form inputs: bool args get a checkbox; int / float get number spinners; rest args get a multi-line textarea (one value per line); strings get text inputs.
  • Defaults render as placeholders, not pre-filled values. Submitting an empty field uses the runtime default; this matches the CLI's behavior exactly.
  • Mod badges show which commands are test, detached, or proxy_args.
  • Globals panel (collapsible) at the top: every top-level binding in the program, with type and value.
  • Click Run → output streams live in a dark output panel (separated by out / err / status channels). Hit Clear between runs.
  • Copy as CLI button: generates a shell-escaped perch -f file.perch CMD -arg=val … string mirroring the form so you can paste it back to a terminal for automation.
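
The placeholder-default and Copy as CLI behaviours described above can be sketched in a few lines of Python. This is an illustration only, not perch's actual implementation: the helper names form_to_args and copy_as_cli are hypothetical, the note argument is invented for the example, and the real escaping rules may differ from shlex.quote.

```python
import shlex


def form_to_args(form: dict[str, str]) -> dict[str, str]:
    """Drop blank fields so the runtime default applies (hypothetical helper)."""
    return {name: value for name, value in form.items() if value.strip() != ""}


def copy_as_cli(file: str, command: str, args: dict[str, str]) -> str:
    """Build a shell-escaped `perch -f FILE CMD -arg=val ...` string.

    Assumption: every form field maps to a single `-name=value` token.
    """
    parts = ["perch", "-f", file, command]
    parts += [f"-{name}={value}" for name, value in args.items()]
    return " ".join(shlex.quote(part) for part in parts)


# bake_minutes is left empty, so it is omitted and the default is used.
form = {"region": "us-east-1", "bake_minutes": "", "note": "it's ready"}
cli = copy_as_cli("deploy.perch", "deploy_canary", form_to_args(form))
print(cli)
```

Splitting the printed string with shlex.split gives back the original tokens, which is the property that makes the copied command safe to paste into a POSIX shell.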

🧪 Simulate

The whole perch simulate (v2) surface, in a form:

  • One field per CLI flag: --sim-os / --sim-arch (dropdowns), --sim-env, --sim-fs-read, --sim-fs-write, --sim-have-bin, --sim-allow-host.
  • Checkboxes for --sim-env-only, --sim-no-shell, --sim-no-network, --sim-no-subprocess, --sim-no-write.
  • A JSON fixture textarea: paste a v2 fixture (capabilities + oracles + scenarios) and one Simulate click runs every scenario; each gets its own banner + per-op report (WILL_RUN ✓ / WILL_FAIL ✗ / MIGHT_FAIL ?) in the output panel.
  • Status pill summarises the run: green "all clear" or red "simulated failures present."
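
As a rough sketch of what the form might send, the snippet below assembles a request body with the {command, env, fixture?} fields that the JSON API section lists for /api/simulate. The helper name simulate_body is hypothetical, and both the shape of env and the decision to send the fixture as a parsed object (rather than a raw string) are assumptions.

```python
import json


def simulate_body(command: str, env: dict, fixture_text: str = "") -> str:
    """Assemble a JSON body for POST /api/simulate (hypothetical helper).

    The pasted fixture is parsed first, so malformed JSON fails locally
    with a ValueError instead of being sent to the server.
    """
    body = {"command": command, "env": env}
    if fixture_text.strip():
        # Assumption: the optional fixture travels as a JSON object.
        body["fixture"] = json.loads(fixture_text)
    return json.dumps(body)


# The inner shape of the fixture is illustrative only.
print(simulate_body("deploy_canary", {}, '{"scenarios": []}'))
```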

This is the killer feature for non-devs: "what would happen on the production host if I ran this?", with no terminal, no fixture file on disk, and no --sim-… flag memorisation.

๐Ÿ” Scan

One click → the full perch --scan output:

  • Capability summary (shell? subprocess? network hosts? write roots? env vars?)
  • Per-finding severity ribbon (HIGH / MED / LOW)
  • The recommended hardened invocation (--no-subprocess --allow-bin docker,kubectl …)
  • A status pill: red if any HIGH finding, yellow on MED, green on none.
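
The status-pill rule in the last bullet amounts to picking the worst severity present. A minimal sketch follows, with a hypothetical scan_pill function. The text does not say which colour a LOW-only scan gets, so treating it as green is an assumption.

```python
def scan_pill(severities: list[str]) -> str:
    """Map finding severities to a pill colour (hypothetical helper)."""
    if "HIGH" in severities:
        return "red"
    if "MED" in severities:
        return "yellow"
    # Assumption: LOW-only findings and no findings both render green.
    return "green"


print(scan_pill(["LOW", "MED"]))
```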

Useful before running anything you didn't write yourself, including the 22 recipes.

✓ Check

One click → syntactic validation (the same perch --check that runs in pre-commit). Issue list with per-severity counts. Wire this tab into your CI dashboard via the /api/check JSON endpoint.
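
For the CI wiring, one possible approach is to turn the /api/check response into a process exit code. The sketch below assumes the response carries the {ok, errors, warnings, issues[]} fields listed in the JSON API section. The function name check_exit_code and the fail_on_warnings option are hypothetical, and whether warnings is a count or a list is not specified, so the code only tests its truthiness.

```python
import json


def check_exit_code(response_text: str, fail_on_warnings: bool = False) -> int:
    """Return 0 when the check passed, 1 otherwise (hypothetical CI gate)."""
    result = json.loads(response_text)
    if not result.get("ok", False):
        return 1
    # Works whether `warnings` is a number or a list (assumption either way).
    if fail_on_warnings and result.get("warnings"):
        return 1
    return 0


sample = '{"ok": true, "errors": 0, "warnings": 2, "issues": []}'
print(check_exit_code(sample), check_exit_code(sample, fail_on_warnings=True))
```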

ℹ About

Program metadata (name, version, source file path, command count) + direct links into the docs site for help.


Theming

  • Dark mode toggle in the header (🌓). Auto-respects prefers-color-scheme; choice persists per browser via localStorage.
  • Mobile-friendly responsive layout: works on a tablet or phone for one-handed runbook execution.

JSON API

Every panel is backed by an endpoint you can drive directly, which is handy for embedding perch in another internal tool or a Slack bot:

| Method | Path | Body | Returns |
| --- | --- | --- | --- |
| GET | /api/program | (none) | Program metadata + arg specs + globals + catch |
| POST | /api/exec | {command, args, allow_bin?, allow_host?, env_only?} | NDJSON stream of {kind:out\|err\|status, msg} |
| POST | /api/check | (none) | {ok, errors, warnings, issues[]} |
| POST | /api/scan | (none) | {report, capabilities, findings, recommended} |
| POST | /api/simulate | {command, env, fixture?} | {ok, report} |

All endpoints return JSON (except /api/exec which streams NDJSON for live output). Pair with your existing dashboard, observability tool, or curl | jq workflow.
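
A minimal client sketch for the streaming endpoint, using only the Python standard library. It relies on the {command, args} body and the {kind, msg} event fields from the table above. The helper names, the assumption that args is a name-to-value object, and the Content-Type header are guesses rather than documented behaviour.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def parse_ndjson(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line."""
    for raw in lines:
        line = raw.strip()
        if line:
            yield json.loads(line)


def exec_request(base_url: str, command: str, args: dict) -> urllib.request.Request:
    """Build the POST /api/exec request (args shape is an assumption)."""
    body = json.dumps({"command": command, "args": args}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/exec",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(base_url: str, command: str, args: dict) -> None:
    """Print events as they arrive; needs a running perch --server."""
    with urllib.request.urlopen(exec_request(base_url, command, args)) as resp:
        # An HTTP response object iterates line by line, matching NDJSON.
        for event in parse_ndjson(resp):
            print(f"[{event['kind']}] {event['msg']}")
```

With a server started as in the TL;DR, calling run("http://127.0.0.1:8080", "deploy_canary", {"region": "us-east-1"}) would print each out / err / status event on its own line, assuming the field names above hold.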


What's NOT in the UI (yet)

Honest list; see the parity TODO for status:

  • Per-request --no-shell / --no-network / --no-subprocess / --no-write toggles. Today these have to be set when you launch perch --server; per-request overrides will land in a follow-up.
  • Live span-tree view of running ops (the --report shape rendered in real time). Today the Run panel streams the same NDJSON your CLI would print.
  • perch test panel with per-test pass/fail tree.
  • perch --build panel for non-devs to produce a portable binary.
  • Audit log download button after a run.
  • Run history in localStorage with replay.
  • REPL equivalent (perch --shell in a textarea).
  • --ask interactive prompts (needs WebSocket).
  • Authentication: single-tenant by design; pair with a reverse proxy.

Security posture

The same model perch --server has always had:

  1. localhost-bound by default (--host 127.0.0.1).
  2. Single-tenant: no auth layer, no user model. Put it behind your existing auth boundary (SSO via reverse proxy is the typical shape).
  3. No file upload: the program is loaded from a path at launch time, not from the browser, so there is no "curl POST a .perch file" attack surface.
  4. Capability restrictions inherit from launch: perch --no-shell --no-network --env KUBECONFIG --server produces a UI where shell ops error and HTTP is denied. The grammar is still the security boundary.
  5. Private command filter: commands marked private are hidden from the UI and /api/exec rejects them.

See also