Automate any browser, behind one API

webtasks runs Chrome on your server and turns each automation flow into a typed HTTP endpoint. Write a short .webtask recipe — go here, wait, click, extract — and call it with plain JSON from any language. No headless browser in your app, no Selenium grid to babysit.

1 static binary · REST + live SSE · 38 ready examples · GPL-3.0

Install in 10 seconds · See it in action · Star on GitHub

How webtasks fits together

The problem it solves

Browser automation usually means bolting a headless browser, a matching driver, and brittle scripts into every service that needs it. That's heavy to run, hard to secure, and a nightmare to keep in sync across a fleet.

webtasks flips it around: one automation server, called over HTTP. Selectors, login flows, and Chrome internals live in one place. Everything else just sends JSON and gets JSON back.

| The old way | With webtasks |
| --- | --- |
| A headless browser + driver in every service | One server, called over HTTP from anywhere |
| Selenium grid / chromedriver version matching | A single static binary talks to Chrome via CDP |
| Selectors & logins copy-pasted across repos | Recipes live in one bundle, reused everywhere |
| Rebuild & redeploy to change a flow | Edit a `.webtask` file; it hot-reloads on the next call |
| Brittle, undocumented scripts | A typed HTTP API with input/output schemas |

A task is a recipe

Drop this in tasks/crawl/hackernews-top.webtask and it instantly becomes POST /tasks/crawl/hackernews-top:

task "crawl/hackernews-top"
    pool default
    timeout 20000
    transport rest

    goto "https://news.ycombinator.com"
    wait until "tr.athing" timeout 10000

    extract stories from "tr.athing" repeat
        title text ".titleline > a"
        url   attr href on ".titleline > a"
    end
end

Call it from anything that speaks HTTP:

The same request in curl, Python, and JavaScript, followed by the JSON response:

curl -s -X POST localhost:8765/tasks/crawl/hackernews-top -d '{}'

import requests
r = requests.post("http://localhost:8765/tasks/crawl/hackernews-top", json={})
print(r.json()["data"]["stories"])

const r = await fetch("http://localhost:8765/tasks/crawl/hackernews-top", {
  method: "POST", body: "{}",
});
console.log((await r.json()).data.stories);

{ "ok": true, "data": { "stories": [ { "title": "Show HN: …", "url": "https://…" } ] } }
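For callers that prefer one helper over repeating the request, a minimal Python sketch using only the standard library might look like the following. It assumes the `{"ok": ..., "data": ...}` envelope shown above; the shape of a failed response is not documented on this page, so `unwrap` simply raises with the whole payload. `call_task`, `unwrap`, and `TaskError` are hypothetical names, and sending a `Content-Type: application/json` header is an assumption.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # address used in the examples above


class TaskError(RuntimeError):
    """Raised when a response envelope does not report ok: true."""


def unwrap(envelope):
    """Return the "data" member of an {"ok": ..., "data": ...} envelope."""
    if not envelope.get("ok"):
        # The failure shape is an assumption, so keep the whole payload.
        raise TaskError(json.dumps(envelope))
    return envelope.get("data")


def call_task(name, payload=None, base_url=BASE_URL, timeout=30):
    """POST a JSON payload to /tasks/<name> and return the unwrapped data."""
    body = json.dumps(payload or {}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/tasks/{name}",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed to be accepted
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return unwrap(json.load(response))


# Example (needs a running server):
#   for story in call_task("crawl/hackernews-top")["stories"]:
#       print(story["title"], story["url"])
```

Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a caller may want to catch that alongside `TaskError`.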

What you can build

Scrape & extract

Pull structured JSON from any page or list with CSS-selector field specs. Real sites, real data.

Drive UIs

Fill forms, click, type, scroll infinite feeds, and wait for dynamic SPA state, all with native, trusted input events.

Capture artifacts

Screenshots, full-page PDFs, MHTML archives, and animated GIF / MP4 recordings of a whole flow.

Stream progress

Long jobs emit live status and progress events over Server-Sent Events, which suits progress bars.

Inspect the network

HAR-style request capture, cookie read/write, console logs, and network-idle waits for flaky SPAs.

Stay logged in

Persistent Chrome profiles + declared secrets keep authenticated sessions alive across restarts.
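Consuming a progress stream from Python needs only a small parser for the Server-Sent Events wire format. The sketch below handles the `event:` and `data:` fields of that format. The event names and payloads webtasks actually emits are not listed on this page, so the `progress` event in the sample is hypothetical, as is the `parse_sse` helper itself.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream contents, for illustration only.
sample = [
    "event: progress\n",
    'data: {"step": 1}\n',
    "\n",
    "data: done\n",
    "\n",
]
events = list(parse_sse(sample))
```

Against a live response opened with `urllib.request.urlopen`, the same function could be fed with `parse_sse(line.decode("utf-8") for line in response)`.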

Built for developers

Language-agnostic

It's just HTTP + JSON. Call tasks from Python, JS, Go, shell, or anything else. GET /tasks returns the input/output schema for every endpoint.

Readable recipes

The .webtask language reads like a checklist, not a config file. No indentation traps, no boilerplate.

Instant feedback

Hot-reload re-reads recipes on every request. Edit, re-call, done: no restart, no rebuild.

38 examples to copy

A demo bundle spanning scraping, forms, rendering, recording, and a real-world logged-in scrape.
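Because `GET /tasks` describes every endpoint, a client can discover tasks at runtime. The exact JSON shape of that listing is not shown on this page, so the sketch below only fetches the document and pretty-prints it without assuming any field names. `tasks_request`, `fetch_task_listing`, and `pretty` are hypothetical helper names.

```python
import json
import urllib.request


def tasks_request(base_url="http://localhost:8765"):
    """Build the GET /tasks request without sending it."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/tasks", method="GET")


def fetch_task_listing(base_url="http://localhost:8765", timeout=10):
    """Return the parsed JSON document served at GET /tasks."""
    with urllib.request.urlopen(tasks_request(base_url), timeout=timeout) as response:
        return json.load(response)


def pretty(document):
    """Render any JSON-compatible value with stable, readable formatting."""
    return json.dumps(document, indent=2, sort_keys=True)


# Example (needs a running server):
#   print(pretty(fetch_task_listing()))
```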

Write your first task

Ready for production

One binary, zero runtime deps

A single static binary: no JVM, no chromedriver, no Selenium server. Ship it plus a zipped bundle and run anywhere Chrome is installed.

Isolated window pools

Concurrency is bounded per pool; a window is never shared by two runs at once. Crashed tabs are detected and replaced automatically.

Secrets, never inline

Credentials are declared in the bundle and resolved at startup from env, flags, or a prompt, then surfaced to recipes as {{TOKEN}}, never hard-coded.

Hardened by default

Static file mounts are safe against path traversal; per-call deadlines stop runaway runs; the server binds to localhost unless you opt out.

Observable

GET /health reports live pool occupancy and task counts; SSE streams every step as it happens.

Portable bundles

Config is a directory or .zip, loaded at runtime. The same binary serves any deployment: point it at a different bundle to change behaviour.
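In a deployment script it is often useful to block until the server is reachable before sending work. A minimal sketch, assuming only that `GET /health` answers with HTTP 200 once the server is up; the fields inside the health document are not specified here, so none are read. `probe_health` and `wait_until_healthy` are hypothetical names.

```python
import time
import urllib.error
import urllib.request


def probe_health(base_url="http://localhost:8765", timeout=2):
    """Return True if GET /health answers with HTTP 200 (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # not listening yet, refused, or an error status


def wait_until_healthy(probe=probe_health, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no sleep after the final failed attempt
    return False
```

The `probe` and `sleep` parameters are injectable so the retry loop can be exercised without a running server.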

Deployment guide

How a request flows

flowchart LR
    Client["HTTP client<br/>curl · Python · JS"]
    Server["webtasks server"]
    Pool["window pool<br/>(bounded concurrency)"]
    Chrome["Chrome window<br/>chromedp / CDP"]
    Web["target site"]

    Client -->|"POST /tasks/name + JSON"| Server
    Server -->|"lease"| Pool
    Pool --> Chrome
    Chrome --> Web
    Server -->|"JSON or live SSE"| Client

Each .webtask file is one endpoint. A run leases a Chrome window for its whole duration and releases it at the end — so concurrency, sessions, and crash recovery are all handled for you.
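The lease-and-release lifecycle described here is a bounded resource pool. As a conceptual illustration only (this is not webtasks' implementation, and it omits crash replacement), the pattern can be sketched in Python with a queue. `WindowPool` and the window labels are hypothetical.

```python
import queue
from contextlib import contextmanager


class WindowPool:
    """A fixed set of windows; each one is held by at most one run at a time."""

    def __init__(self, windows):
        self._free = queue.Queue()
        for window in windows:
            self._free.put(window)

    @contextmanager
    def lease(self, timeout=None):
        # Blocks until a window is free, which bounds concurrency to the pool size.
        window = self._free.get(timeout=timeout)
        try:
            yield window
        finally:
            # Runs on success and on error, so a failed run still frees its window.
            self._free.put(window)

    def free_count(self):
        return self._free.qsize()


pool = WindowPool(["window-1", "window-2"])
with pool.lease() as window:
    in_use = pool.free_count()  # one window is leased, one remains free
after = pool.free_count()  # the leased window has been returned
```

Taking the window out of the queue for the whole run is what guarantees that two runs never share it; a third concurrent `lease()` on this two-window pool would wait, or raise `queue.Empty` if a timeout is given.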

How it works in depth

Get started

Install

curl … | sh, start the server, run your first task in under a minute.

Writing tasks

The complete .webtask language reference, with a build-from-scratch walkthrough.

Examples

38 runnable recipes across 11 categories: copy, tweak, ship.

Deployment

Pools, secrets, static mounts, and packaging a bundle for production.

webtasks is free software under the GNU General Public License v3.0. See License.