stef-openclaw-skills/docs/playwright-safe.md
2026-03-10 19:07:30 -05:00

playwright-safe

Single-entry Playwright scraper for one-shot page extraction with JavaScript rendering and moderate anti-bot handling.

What this skill is for

  • Extracting title, visible text, and metadata from one URL
  • Pages that need client-side rendering
  • Moderate anti-bot shaping without a full browser automation workflow
  • Structured JSON output that agents can consume directly

What this skill is not for

  • Multi-step browser workflows
  • Authenticated login flows
  • Interactive click/type sequences across multiple pages

Use web-automation for those broader browser tasks.

Runtime requirements

  • Node.js 18+
  • Local Playwright install under the skill directory

First-time setup

cd ~/.openclaw/workspace/skills/playwright-safe
npm install
npx playwright install chromium

Entry point

Run from the workspace root (the directory that contains skills/):

node skills/playwright-safe/scripts/playwright-safe.js "<URL>"

Only pass a user-provided http or https URL.
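
A caller can enforce the http/https rule before the script is ever invoked. The following is a minimal sketch of such a check, not code taken from the skill; the helper name `isAllowedUrl` is hypothetical.

```javascript
// Hypothetical pre-flight check; not part of playwright-safe itself.
// Returns true only for well-formed http: or https: URLs.
function isAllowedUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws a TypeError on malformed input
  } catch {
    return false;
  }
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}

console.log(isAllowedUrl("https://example.com/page")); // true
console.log(isAllowedUrl("file:///etc/passwd"));       // false
console.log(isAllowedUrl("not a url"));                // false
```

Checking the parsed `protocol` rather than matching on a string prefix avoids being fooled by inputs with unusual casing or whitespace.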

Options

Options are set as environment variables in front of the command:

WAIT_TIME=5000 node skills/playwright-safe/scripts/playwright-safe.js "<URL>"
SCREENSHOT_PATH=/tmp/page.png node skills/playwright-safe/scripts/playwright-safe.js "<URL>"
SAVE_HTML=true node skills/playwright-safe/scripts/playwright-safe.js "<URL>"
HEADLESS=false node skills/playwright-safe/scripts/playwright-safe.js "<URL>"
USER_AGENT="Mozilla/5.0 ..." node skills/playwright-safe/scripts/playwright-safe.js "<URL>"
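
When the script is called from another Node.js program rather than a shell, the same options can be passed through the child process environment. This is a rough sketch under two assumptions: the process runs from the workspace root, and the script writes its JSON to stdout. The helpers `buildEnv` and `runPlaywrightSafe` are hypothetical names, not part of the skill.

```javascript
const { execFile } = require("node:child_process");

// Path from the "Entry point" section, relative to the workspace root.
const SCRIPT = "skills/playwright-safe/scripts/playwright-safe.js";

// Hypothetical helper: copy the current environment and overlay the
// documented option variables (WAIT_TIME, SAVE_HTML, HEADLESS, ...).
function buildEnv(options = {}) {
  const env = { ...process.env };
  for (const [name, value] of Object.entries(options)) {
    env[name] = String(value); // environment values are always strings
  }
  return env;
}

// Hypothetical wrapper: run the script once and resolve with its parsed JSON.
// Assumes the JSON is written to stdout.
function runPlaywrightSafe(url, options = {}) {
  return new Promise((resolve, reject) => {
    execFile(
      "node",
      [SCRIPT, url],
      // Raise maxBuffer because page text can exceed the 1 MiB default.
      { env: buildEnv(options), maxBuffer: 16 * 1024 * 1024 },
      (err, stdout) => {
        if (err) return reject(err);
        try {
          resolve(JSON.parse(stdout));
        } catch (parseErr) {
          reject(parseErr);
        }
      }
    );
  });
}
```

A call might look like `runPlaywrightSafe("https://example.com", { WAIT_TIME: 5000, SAVE_HTML: true })`. Using `execFile` with an argument array keeps the URL out of a shell command line, so it is never interpreted by a shell.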

Output

The script prints JSON only. It includes:

  • requestedUrl
  • finalUrl
  • title
  • content
  • metaDescription
  • status
  • elapsedSeconds
  • challengeDetected
  • screenshot (optional)
  • htmlFile (optional)
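
An agent consuming this output may want to confirm that the always-present fields exist before using them. The sketch below uses only the field names listed above; the helper `parseResult`, the sample values, and the value types are illustrative assumptions, since the document does not specify them.

```javascript
// Field names come from the list above. screenshot and htmlFile are
// optional, so they are not required here.
const REQUIRED_FIELDS = [
  "requestedUrl", "finalUrl", "title", "content",
  "metaDescription", "status", "elapsedSeconds", "challengeDetected",
];

// Hypothetical helper: parse the script's output and fail loudly
// if any required field is missing.
function parseResult(stdout) {
  const result = JSON.parse(stdout);
  if (result === null || typeof result !== "object") {
    throw new Error("expected a JSON object");
  }
  const missing = REQUIRED_FIELDS.filter((name) => !(name in result));
  if (missing.length > 0) {
    throw new Error(`missing fields: ${missing.join(", ")}`);
  }
  return result;
}

// Made-up sample for illustration only; real values and types may differ.
const sample = JSON.stringify({
  requestedUrl: "https://example.com",
  finalUrl: "https://example.com/",
  title: "Example Domain",
  content: "Example Domain ...",
  metaDescription: "",
  status: 200,
  elapsedSeconds: 1.2,
  challengeDetected: false,
});

const result = parseResult(sample);
console.log(result.title); // Example Domain
```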

Security posture

  • Keeps lightweight stealth and anti-bot shaping
  • Keeps the browser sandbox enabled
  • Does not use --no-sandbox
  • Does not use --disable-setuid-sandbox
  • Avoids site-specific extractors and cross-skill dependencies
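
The sandbox rules above can be expressed as a guard on Chromium launch arguments. This is a sketch of one way to enforce them, not the script's actual implementation. The helper `assertSandboxKept` and the `launchOptions` object are hypothetical, and the way `HEADLESS` is read is an assumption. `chromiumSandbox` is Playwright's launch option for enabling the Chromium sandbox.

```javascript
// Flags the skill states it never passes to Chromium.
const FORBIDDEN_FLAGS = ["--no-sandbox", "--disable-setuid-sandbox"];

// Hypothetical guard: throw if any sandbox-disabling flag is present,
// otherwise return the argument list unchanged.
function assertSandboxKept(args) {
  const found = args.filter((arg) => FORBIDDEN_FLAGS.includes(arg));
  if (found.length > 0) {
    throw new Error(`sandbox-disabling flags are not allowed: ${found.join(", ")}`);
  }
  return args;
}

// Assumed launch options, for illustration only.
const launchOptions = {
  headless: process.env.HEADLESS !== "false", // assumption about how HEADLESS is parsed
  chromiumSandbox: true,
  args: assertSandboxKept([]),
};

// With Playwright installed, these options would be used as:
//   const { chromium } = require("playwright");
//   const browser = await chromium.launch(launchOptions);
```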