## Summary

- add repository-wide quality tooling and verification scaffolding, including CI workflows, pnpm workspace setup, ESLint/Prettier/markdown checks, and generated-output verification helpers
- reorganize skill sources and generation flow by introducing canonical `_source` variants, generator/manifests, reusable helper abstractions, and shared web-automation/browser utilities
- clean up and expand documentation so the root README flows into docs and skill docs, with clearer development, reviewer, installer, and workflow guidance

## Notable changes

- docs flow and consistency cleanup across `README.md`, `docs/README.md`, and related docs
- new scripts for `check`, docs verification, generated-file verification, shell portability, and safe directory replacement
- refactors in Atlassian and web-automation skill runtimes to reduce duplication and centralize reusable code
- changelog, development documentation, and CI surface updates

## Test Plan

- [ ] `pnpm run check`
- [ ] review generated/manifests and skill sync outputs
- [ ] smoke-check docs flow from `README.md` to `docs/README.md` to skill docs

## Notes

- this branch currently includes tracked `skills/web-automation/shared/node_modules` content that should be reviewed carefully as potentially noisy/accidental committed artifacts

Co-authored-by: Stefano Fiorini <stefano.fiorini@firsthorizon.com>
Reviewed-on: #1
| name | description |
|---|---|
| web-automation | Browse and scrape web pages using Playwright-compatible CloakBrowser. Use when automating web workflows, extracting rendered page content, handling authenticated sessions, or running multi-step browser flows. |
Web Automation with CloakBrowser (Pi)
Automated web browsing and scraping for pi using the shared runtime bundle in scripts/.
Requirements
- Node.js 20+
- `pnpm`
- Network access to download the CloakBrowser binary on first use
First-Time Setup
Global install:
mkdir -p ~/.pi/agent/skills/web-automation
cp -R skills/web-automation/pi/* ~/.pi/agent/skills/web-automation/
cd ~/.pi/agent/skills/web-automation/scripts
pnpm install
npx cloakbrowser install
pnpm approve-builds
pnpm rebuild better-sqlite3 esbuild
Project-local install:
mkdir -p .pi/skills/web-automation
cp -R skills/web-automation/pi/* .pi/skills/web-automation/
cd .pi/skills/web-automation/scripts
pnpm install
npx cloakbrowser install
pnpm approve-builds
pnpm rebuild better-sqlite3 esbuild
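The global and project-local installs differ only in the target directory, so the steps can be wrapped in one function. The following is a minimal sketch, not a script shipped in this repo: `install_skill`, `run`, and the `DRY_RUN` variable are hypothetical names, and the sketch assumes it is called from the repository root.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: install_skill <target-dir>
# The body runs in a subshell so the cd does not change the caller's directory.
install_skill() (
  set -e
  target="$1"
  run mkdir -p "$target"
  run cp -R skills/web-automation/pi/* "$target/"
  run cd "$target/scripts"
  run pnpm install
  run npx cloakbrowser install
  run pnpm approve-builds
  run pnpm rebuild better-sqlite3 esbuild
)
```

Under these assumptions, `install_skill ~/.pi/agent/skills/web-automation` would perform the global install and `install_skill .pi/skills/web-automation` the project-local one. Setting `DRY_RUN=1` first lists the commands without running them.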
Pi can also load this repo through settings or package installs as documented in docs/PI.md.
If you installed this repo from a local checkout with ./scripts/install-pi-package.sh, the runtime stays in the checkout mirror at pi-package/skills/web-automation/scripts.
Updating CloakBrowser
Run the commands below from the installed scripts/ directory for the pi skill. They work for both global and project-local installs.
pnpm up cloakbrowser playwright-core
npx cloakbrowser install
pnpm approve-builds
pnpm rebuild better-sqlite3 esbuild
Prerequisite Check (MANDATORY)
Before running automation, verify the runtime from the location that matches your install style:
- local checkout package install: `pi-package/skills/web-automation/scripts`
- project-local copied install: `.pi/skills/web-automation/scripts`
- global copied install: `~/.pi/agent/skills/web-automation/scripts`
cd pi-package/skills/web-automation/scripts
node check-install.js
If the check fails, stop and return:
Missing dependency/config: web-automation requires cloakbrowser and playwright-core with CloakBrowser-based scripts. Run setup in this skill, then retry.
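The prerequisite check can be scripted so it finds the runtime regardless of install style. This is a rough sketch under stated assumptions: `find_scripts_dir` and `prereq_check` are hypothetical names, the search order (checkout mirror, then project-local, then global) is an arbitrary choice, and `check-install.js` is assumed to exit non-zero when the check fails.

```shell
# Hypothetical helper: print the first candidate directory that contains
# check-install.js. With no arguments, the three install locations are tried.
find_scripts_dir() {
  if [ "$#" -eq 0 ]; then
    set -- \
      "pi-package/skills/web-automation/scripts" \
      ".pi/skills/web-automation/scripts" \
      "$HOME/.pi/agent/skills/web-automation/scripts"
  fi
  for dir in "$@"; do
    if [ -f "$dir/check-install.js" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: run the install check from the located directory.
# On any failure, print the required message to stderr and return non-zero.
prereq_check() {
  scripts_dir=$(find_scripts_dir "$@") &&
    (cd "$scripts_dir" && node check-install.js) &&
    return 0
  echo "Missing dependency/config: web-automation requires cloakbrowser and playwright-core with CloakBrowser-based scripts. Run setup in this skill, then retry." >&2
  return 1
}
```

Calling `prereq_check` with no arguments checks the default locations; passing explicit directories overrides them.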
If the runtime fails with missing native bindings for better-sqlite3 or esbuild, run these commands from your installed scripts/ directory:
cd pi-package/skills/web-automation/scripts
pnpm approve-builds
pnpm rebuild better-sqlite3 esbuild
When To Use Which Command
- Use `node extract.js "<URL>"` for a one-shot rendered fetch with JSON output.
- Use `npx tsx scrape.ts ...` when you need markdown extraction, Readability cleanup, or selector-based scraping.
- Use `npx tsx browse.ts ...`, `auth.ts`, or `flow.ts` when the task needs login handling, persistent sessions, clicks, typing, screenshots, or multi-step navigation.
- Use `npx tsx scan-local-app.ts` when you need a configurable local-app smoke pass driven by `SCAN_*` and `CLOAKBROWSER_*` environment variables.
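The selection rules above can be written down as a small lookup. This is only an illustrative sketch: `web_automation_command` and its task keywords (`json`, `markdown`, `selector`, `browse`, `auth`, `flow`, `local-scan`) are hypothetical and not part of the skill.

```shell
# Hypothetical helper: map a task keyword to the matching entry point.
# Prints the base command and returns non-zero for an unknown keyword.
web_automation_command() {
  case "$1" in
    json)              echo "node extract.js" ;;           # one-shot rendered fetch, JSON output
    markdown|selector) echo "npx tsx scrape.ts" ;;         # markdown, Readability, selector scraping
    browse)            echo "npx tsx browse.ts" ;;         # interactive navigation, screenshots
    auth)              echo "npx tsx auth.ts" ;;           # login handling, persistent sessions
    flow)              echo "npx tsx flow.ts" ;;           # multi-step flows
    local-scan)        echo "npx tsx scan-local-app.ts" ;; # local-app smoke pass
    *)                 return 1 ;;
  esac
}
```

For example, `web_automation_command markdown` prints `npx tsx scrape.ts`, to which the real flags such as `--url` would still need to be appended.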
Quick Reference
- Install check: `node check-install.js`
- One-shot JSON extract: `node extract.js "https://example.com"`
- Browse page: `npx tsx browse.ts --url "https://example.com"`
- Scrape markdown: `npx tsx scrape.ts --url "https://example.com" --mode main --output page.md`
- Authenticate: `npx tsx auth.ts --url "https://example.com/login"`
- Natural-language flow: `npx tsx flow.ts --instruction 'go to https://example.com then click on "Login" then type "user@example.com" in #email then press enter'`
- Local app smoke scan: `SCAN_BASE_URL=http://localhost:3000 SCAN_ROUTES=/,/dashboard npx tsx scan-local-app.ts`
Local App Smoke Scan
scan-local-app.ts is intentionally generic. Configure it with environment variables instead of editing the file:
- `SCAN_BASE_URL`
- `SCAN_LOGIN_PATH`
- `SCAN_USERNAME`
- `SCAN_PASSWORD`
- `SCAN_USERNAME_SELECTOR`
- `SCAN_PASSWORD_SELECTOR`
- `SCAN_SUBMIT_SELECTOR`
- `SCAN_ROUTES`
- `SCAN_REPORT_PATH`
- `SCAN_HEADLESS`
If SCAN_USERNAME or SCAN_PASSWORD is omitted, the script falls back to CLOAKBROWSER_USERNAME or CLOAKBROWSER_PASSWORD, respectively.
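The credential fallback can be mirrored in a wrapper script with shell parameter expansion. This is a sketch of the rule as described, not the code inside scan-local-app.ts: `scan_username` and `scan_password` are hypothetical helpers, and the sketch assumes an empty value counts as omitted.

```shell
# Hypothetical helpers: print the effective credential.
# SCAN_* takes precedence; CLOAKBROWSER_* is used when SCAN_* is unset or empty.
scan_username() {
  printf '%s\n' "${SCAN_USERNAME:-${CLOAKBROWSER_USERNAME:-}}"
}

scan_password() {
  printf '%s\n' "${SCAN_PASSWORD:-${CLOAKBROWSER_PASSWORD:-}}"
}
```

With only `CLOAKBROWSER_USERNAME` set, `scan_username` prints that value; once `SCAN_USERNAME` is also set, it wins.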
Notes
- Sessions persist in CloakBrowser profile storage.
- Use `--wait` for dynamic pages.
- Use `--mode selector --selector "..."` for targeted extraction.
- `extract.js` keeps a bounded stealth/rendered fetch path without needing a long-lived automation session.
- Package installs use the repo's `pi-package/skills/web-automation/` mirror so the installed skill directory name matches `web-automation`.