feat: add safe Playwright scraper skill
68
skills/playwright-safe/SKILL.md
Normal file
@@ -0,0 +1,68 @@
---
name: playwright-safe
description: Use when a page needs JavaScript rendering or moderate anti-bot handling and the agent should use a single local Playwright scraper instead of generic web fetch tooling.
---

# Playwright Safe

Single-entry Playwright scraper for dynamic or moderately bot-protected pages.

## When To Use

- Page content depends on client-side rendering
- Generic `scrape` or `webfetch` is likely to miss rendered content
- The task needs one direct page extraction with lightweight stealth behavior

## Do Not Use

- For multi-step browser workflows with login/stateful interaction
- For site-specific automation flows
- When the page can be handled by a simpler built-in fetch path

## Setup

```bash
cd ~/.openclaw/workspace/skills/playwright-safe
npm install
npx playwright install chromium
```

## Command

```bash
node scripts/playwright-safe.js "<URL>"
```

Only pass a user-provided `http` or `https` URL.
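
The URL restriction above can be enforced before the command is ever run. The following is a minimal sketch, not the script's actual validation: `assertHttpUrl` is a hypothetical helper name, and it relies only on the standard `URL` class that Node.js exposes globally.

```javascript
// Hypothetical helper (not part of the skill): accept only absolute http(s) URLs.
function assertHttpUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws TypeError on malformed or relative input
  } catch {
    throw new Error(`Not a valid absolute URL: ${input}`);
  }
  // Reject file:, javascript:, data: and every other non-web scheme.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed.href; // normalized form, safe to pass as the single argument
}
```

A caller would run `assertHttpUrl(userInput)` first and pass the returned string to the script as one quoted argument, so that inputs such as `file:///etc/passwd` never reach the browser.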

## Options

```bash
WAIT_TIME=5000 node scripts/playwright-safe.js "<URL>"
SCREENSHOT_PATH=/tmp/page.png node scripts/playwright-safe.js "<URL>"
SAVE_HTML=true node scripts/playwright-safe.js "<URL>"
HEADLESS=false node scripts/playwright-safe.js "<URL>"
USER_AGENT="Mozilla/5.0 ..." node scripts/playwright-safe.js "<URL>"
```
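
All options are passed as environment variables. As an illustration of how a script might read them, here is a hedged sketch: `readOptions` is a hypothetical function, the fallback values are placeholders rather than the script's real defaults, and treating `WAIT_TIME` as milliseconds is an assumption based on the `5000` example.

```javascript
// Hypothetical option parser; defaults are placeholders, not the skill's real values.
function readOptions(env = process.env) {
  const waitTime = Number.parseInt(env.WAIT_TIME ?? "", 10);
  return {
    // Assumed to be milliseconds; falls back to a placeholder when unset or invalid.
    waitTime: Number.isFinite(waitTime) && waitTime >= 0 ? waitTime : 3000,
    screenshotPath: env.SCREENSHOT_PATH || null, // no screenshot unless a path is given
    saveHtml: env.SAVE_HTML === "true",          // opt-in only
    headless: env.HEADLESS !== "false",          // headless unless explicitly disabled
    userAgent: env.USER_AGENT || null,           // null means "use the browser default"
  };
}
```

Reading every variable in one place keeps string-to-type conversion out of the scraping logic and makes the opt-in flags (`SAVE_HTML`, `SCREENSHOT_PATH`) explicit.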

## Output

The script prints JSON only, suitable for direct agent consumption. Fields include:

- `requestedUrl`
- `finalUrl`
- `title`
- `content`
- `metaDescription`
- `status`
- `elapsedSeconds`
- `challengeDetected`
- optional `screenshot`
- optional `htmlFile`
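
Because stdout is JSON only, a caller can parse it directly. The sketch below shows one way to do that; `summarizeResult` is a hypothetical helper, and it assumes `challengeDetected` is a boolean and `content` is a string, which the field list above does not state explicitly.

```javascript
// Hypothetical consumer: parse the script's stdout and decide whether the content is usable.
function summarizeResult(stdout) {
  const result = JSON.parse(stdout); // throws SyntaxError if stdout is not pure JSON
  const redirected = result.finalUrl !== result.requestedUrl;
  // Assumption: challengeDetected is a boolean and content is a string.
  const usable =
    !result.challengeDetected &&
    typeof result.content === "string" &&
    result.content.length > 0;
  return { usable, redirected, title: result.title ?? null };
}
```

An agent could fall back to another fetch path, or report the failure, whenever `usable` is `false`.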

## Safety Notes

- Stealth and anti-bot shaping are retained
- Chromium sandbox remains enabled
- No sandbox-disabling flags are used
- No site-specific extractors or foreign tool dependencies are used
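
The sandbox guarantees above can be checked mechanically against whatever launch options a script builds. This is a sketch under assumptions, not the skill's own code: `assertSandboxed` is a hypothetical guard, and it relies on Playwright's documented `chromiumSandbox` and `args` launch options. Playwright's documentation lists `chromiumSandbox` as off by default, so the guard requires it to be set to `true` explicitly.

```javascript
// Chromium flags that turn the sandbox off; a safe launch config contains neither.
const SANDBOX_DISABLING_FLAGS = ["--no-sandbox", "--disable-setuid-sandbox"];

// Hypothetical guard: throw if the launch options would weaken the Chromium sandbox.
function assertSandboxed(launchOptions) {
  const args = launchOptions.args ?? [];
  const offending = args.filter((arg) => SANDBOX_DISABLING_FLAGS.includes(arg));
  if (offending.length > 0) {
    throw new Error(`Sandbox-disabling flags present: ${offending.join(", ")}`);
  }
  if (launchOptions.chromiumSandbox !== true) {
    throw new Error("chromiumSandbox must be explicitly set to true");
  }
  return launchOptions; // unchanged, so the call can wrap the options inline
}
```

Under these assumptions, a launch call would look like `chromium.launch(assertSandboxed({ headless: true, chromiumSandbox: true }))`.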