docs: add web automation consolidation plans

Stefano Fiorini
2026-03-10 19:24:59 -05:00
parent 6e2fd17734
commit 49d0236a52
2 changed files with 153 additions and 0 deletions

@@ -0,0 +1,21 @@
# Web Automation Consolidation Design
## Goal
Consolidate `playwright-safe` into `web-automation` so the repo exposes a single web skill. Keep the proven one-shot extractor behavior, rename it to `extract.js`, and remove the separate `playwright-safe` skill and docs.
## Architecture
`web-automation` remains the only published skill. It will expose two capability bands under one skill: one-shot extraction via `scripts/extract.js`, and broader stateful automation via the existing `auth.ts`, `browse.ts`, `flow.ts`, and `scrape.ts` commands. The one-shot extractor will keep the current safe Playwright behavior: single URL, JSON output, bounded stealth/anti-bot handling, and no sandbox-disabling Chromium flags.
## Migration
- Copy the working extractor into `skills/web-automation/scripts/extract.js`
- Update `skills/web-automation/SKILL.md` and `docs/web-automation.md` to describe both one-shot extraction and full automation
- Remove `skills/playwright-safe/`
- Remove `docs/playwright-safe.md`
- Remove README/doc index references to `playwright-safe`
## Verification
- `node skills/web-automation/scripts/extract.js` -> JSON error for missing URL
- `node skills/web-automation/scripts/extract.js ftp://example.com` -> JSON error for invalid scheme
- `node skills/web-automation/scripts/extract.js https://example.com` -> valid JSON result with title/status
- Repo text scan confirms no remaining published references directing users to `playwright-safe`
- Commit, push, and clean up the worktree

@@ -0,0 +1,132 @@
# Web Automation Consolidation Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Consolidate the separate `playwright-safe` skill into `web-automation` and publish a single web skill with both one-shot extraction and broader automation.
**Architecture:** Move the proven safe one-shot extractor into `skills/web-automation/scripts/extract.js`, update `web-automation` docs to expose it as the simple path, and remove the separate `playwright-safe` skill and docs. Keep the extractor behavior unchanged except for its new location/name.
**Tech Stack:** Node.js, Playwright, Camoufox skill docs, git
---
### Task 1: Create isolated worktree
**Files:**
- Modify: repo git metadata only
**Step 1: Create worktree**
Run:
```bash
git -C /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills worktree add /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/.worktrees/web-automation-consolidation -b feature/web-automation-consolidation
```
**Step 2: Verify baseline**
Run:
```bash
git -C /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/.worktrees/web-automation-consolidation status --short --branch
```
Expected: clean feature branch
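The "clean feature branch" expectation can also be checked mechanically. The sketch below is illustrative only: `parseShortBranchStatus` is a hypothetical helper, not part of the repo, and it assumes the usual `git status --short --branch` layout of one `## <branch>` header line followed by one line per changed path.

```javascript
// Hypothetical helper (an assumption, not repo code): interpret the text that
// `git status --short --branch` prints.
function parseShortBranchStatus(statusOutput) {
  const lines = statusOutput.split("\n").filter((line) => line !== "");
  const header = lines.find((line) => line.startsWith("## ")) || "";
  // Header looks like "## <branch>" or "## <branch>...<upstream> [ahead N]".
  const branch = header.slice(3).split("...")[0].split(" ")[0];
  // Every non-header line describes a modified, staged, or untracked path.
  const changes = lines.filter((line) => !line.startsWith("## "));
  return { branch, clean: changes.length === 0, changes };
}

// Usage with a sample of what a fresh worktree would be expected to print.
const baseline = parseShortBranchStatus("## feature/web-automation-consolidation\n");
console.log(baseline.branch); // feature/web-automation-consolidation
console.log(baseline.clean);  // true
```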
### Task 2: Move the extractor into web-automation
**Files:**
- Create: `skills/web-automation/scripts/extract.js`
- Read: `skills/playwright-safe/scripts/playwright-safe.js`
**Step 1: Copy the extractor**
- Copy the proven script content into `skills/web-automation/scripts/extract.js`
- Adjust only relative paths/messages if needed
**Step 2: Preserve behavior**
- Keep JSON-only output
- Keep URL validation
- Keep stealth/anti-bot behavior
- Keep sandbox enabled
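One way to guard the "sandbox enabled" requirement is a plain text scan of the copied script. This is a minimal sketch, not the extractor's own code: `findSandboxDisablingFlags` is a hypothetical helper, and the flag list is an assumption covering the two Chromium switches commonly used to turn the sandbox off.

```javascript
// Assumed list of Chromium switches that disable sandboxing.
const SANDBOX_DISABLING_FLAGS = ["--no-sandbox", "--disable-setuid-sandbox"];

// Hypothetical helper: return every sandbox-disabling flag that appears
// anywhere in the given source text.
function findSandboxDisablingFlags(sourceText) {
  return SANDBOX_DISABLING_FLAGS.filter((flag) => sourceText.includes(flag));
}

// Usage: inline samples stand in for the real file contents.
const safeSample = "chromium.launch({ headless: true })";
const unsafeSample = 'chromium.launch({ args: ["--no-sandbox"] })';

console.log(findSandboxDisablingFlags(safeSample));   // []
console.log(findSandboxDisablingFlags(unsafeSample)); // [ '--no-sandbox' ]
```

In practice the text of `skills/web-automation/scripts/extract.js` would be read from disk and passed in; an empty result means none of the listed flags appear.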
### Task 3: Update skill and docs
**Files:**
- Modify: `skills/web-automation/SKILL.md`
- Modify: `docs/web-automation.md`
- Modify: `README.md`
- Modify: `docs/README.md`
- Delete: `skills/playwright-safe/SKILL.md`
- Delete: `skills/playwright-safe/package.json`
- Delete: `skills/playwright-safe/package-lock.json`
- Delete: `skills/playwright-safe/.gitignore`
- Delete: `skills/playwright-safe/scripts/playwright-safe.js`
- Delete: `docs/playwright-safe.md`
**Step 1: Update docs**
- Make `web-automation` the only published web skill
- Document `extract.js` as the one-shot extraction path
- Remove published references to `playwright-safe`
**Step 2: Remove redundant skill**
- Delete the separate `playwright-safe` skill files and doc
### Task 4: Verify behavior
**Files:**
- Test: `skills/web-automation/scripts/extract.js`
**Step 1: Missing URL check**
Run:
```bash
cd /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/.worktrees/web-automation-consolidation && node skills/web-automation/scripts/extract.js
```
Expected: JSON error about missing URL
**Step 2: Invalid scheme check**
Run:
```bash
cd /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/.worktrees/web-automation-consolidation && node skills/web-automation/scripts/extract.js ftp://example.com
```
Expected: JSON error stating that only http/https URLs are allowed
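Steps 1 and 2 exercise the extractor's argument validation. A minimal sketch of what such validation could look like follows. The function name, result shape, and error strings are assumptions for illustration and may differ from the actual script.

```javascript
// Hypothetical validation sketch; names and messages are assumptions.
function validateTargetUrl(rawUrl) {
  if (!rawUrl) {
    return { ok: false, error: "Missing URL" };
  }
  let parsed;
  try {
    parsed = new URL(rawUrl); // WHATWG URL class, a global in Node.js
  } catch {
    return { ok: false, error: "Invalid URL" };
  }
  // Reject anything that is not plain http or https (ftp:, file:, and so on).
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, error: "Only http/https URLs are allowed" };
  }
  return { ok: true, url: parsed.href };
}

// Usage: the three cases mirror the verification steps.
console.log(JSON.stringify(validateTargetUrl(undefined)));
console.log(JSON.stringify(validateTargetUrl("ftp://example.com")));
console.log(JSON.stringify(validateTargetUrl("https://example.com")));
```

Returning a result object instead of throwing keeps the caller free to print exactly one JSON document per run, which matches the JSON-only output requirement.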
**Step 3: Smoke test**
Run:
```bash
cd /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/.worktrees/web-automation-consolidation && node skills/web-automation/scripts/extract.js https://example.com
```
Expected: JSON with title `Example Domain` and status `200`. Separately, confirm that the script's code contains no sandbox-disabling Chromium flags; the smoke-test output alone cannot show this.
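The smoke-test expectation can be turned into a small check on the captured output. This is a sketch under assumptions: `checkSmokeResult` is a hypothetical helper, and the field names `title` and `status` are inferred from the expected result, not read from the script.

```javascript
// Hypothetical check on the extractor's stdout; field names are assumed.
function checkSmokeResult(stdoutText) {
  let result;
  try {
    result = JSON.parse(stdoutText);
  } catch {
    return { ok: false, reason: "output is not valid JSON" };
  }
  if (result === null || typeof result !== "object") {
    return { ok: false, reason: "output is not a JSON object" };
  }
  if (result.title !== "Example Domain") {
    return { ok: false, reason: "unexpected title" };
  }
  if (Number(result.status) !== 200) {
    return { ok: false, reason: "unexpected status" };
  }
  return { ok: true };
}

// Usage with a hand-written sample in place of real extractor output.
const sampleOutput = JSON.stringify({ title: "Example Domain", status: 200 });
console.log(checkSmokeResult(sampleOutput).ok); // true
```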
**Step 4: Reference scan**
Run:
```bash
cd /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/.worktrees/web-automation-consolidation && rg -n "playwright-safe" README.md docs skills
```
Expected: no remaining published references (`rg` prints nothing and exits with status 1), or matches only in intentional historical plan docs
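If the scan does return matches, separating intentional ones from leftovers can be sketched as below. `unexpectedReferences` is a hypothetical helper, and the `docs/plans/` prefix is an assumed location for the historical plan docs, not a path confirmed by this plan. It relies on `rg -n` printing `path:line:text` when several paths are searched.

```javascript
// Hypothetical filter over `rg -n` output lines of the form "path:line:text".
function unexpectedReferences(rgOutput, allowedPrefixes) {
  return rgOutput
    .split("\n")
    .filter((line) => line.trim() !== "")
    .filter((line) => {
      const separator = line.indexOf(":");
      const path = separator === -1 ? line : line.slice(0, separator);
      // Keep only matches that fall outside every allowed location.
      return !allowedPrefixes.some((prefix) => path.startsWith(prefix));
    });
}

// Usage with sample output; "docs/plans/" is an assumed directory name.
const sampleScan = [
  "README.md:12:See the playwright-safe skill",
  "docs/plans/consolidation.md:3:Remove playwright-safe",
].join("\n");

console.log(unexpectedReferences(sampleScan, ["docs/plans/"]).length); // 1
```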
### Task 5: Commit, push, and clean up
**Files:**
- Modify: git history only
**Step 1: Commit**
Run:
```bash
git add -A skills/web-automation skills/playwright-safe docs README.md
git commit -m "refactor: consolidate web scraping into web-automation"
```
**Step 2: Push**
Run:
```bash
git push -u origin feature/web-automation-consolidation
```
**Step 3: Merge and cleanup**
- Fast-forward or merge to `main`
- Push `main`
- Remove the worktree
- Delete the feature branch