Refresh property assessor and web automation docs
@@ -15,6 +15,7 @@ Automated web browsing and scraping using Playwright-compatible CloakBrowser, wi
- Use `node skills/web-automation/scripts/extract.js "<URL>"` for one-shot extraction from a single URL
- Use `npx tsx scrape.ts ...` for markdown scraping modes
- Use `npx tsx browse.ts ...`, `auth.ts`, or `flow.ts` for interactive or authenticated flows
- Use `node skills/web-automation/scripts/zillow-photos.js "<listing-url>"` or `har-photos.js` for real-estate photo extraction before attempting generic gallery automation

## Requirements

@@ -59,6 +60,17 @@ pnpm rebuild better-sqlite3 esbuild

Without this, helper scripts may fail before launch because the native bindings are missing.

## Prerequisite check

Before running automation, verify the local install and CloakBrowser wiring:

```bash
cd ~/.openclaw/workspace/skills/web-automation/scripts
node check-install.js
```

If this fails, stop and fix setup before troubleshooting site automation.
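
The stop-on-failure rule can also be scripted. The following is a minimal sketch, not part of the skill itself. It assumes `check-install.js` signals failure with a non-zero exit code, which this README does not state, and `runCheck` is a hypothetical helper name.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: run a command and report whether it exited cleanly.
// Assumption: check-install.js exits non-zero when the install is broken.
function runCheck(command, args, cwd) {
  const result = spawnSync(command, args, { cwd, stdio: "inherit" });
  // result.error is set when the process could not be spawned at all.
  const ok = !result.error && result.status === 0;
  return { ok, status: result.status };
}

// Example usage, run from the scripts directory:
//   const check = runCheck(process.execPath, ["check-install.js"]);
//   if (!check.ok) process.exit(1); // stop before any site automation
```

Gating on the exit status keeps a broken install from being mistaken for a site-specific failure later on.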

## Exec approvals allowlist

If OpenClaw keeps prompting for approval when running this skill, add a local allowlist for the main agent:
@@ -80,13 +92,24 @@ Notes:
- If `node` lives somewhere else, replace `/opt/homebrew/bin/node` with the output of `which node`.
- If matching is inconsistent, replace `~/.openclaw/...` with the full absolute path for the machine.
- Keep the allowlist scoped to the main agent unless there is a clear reason to widen it.
- Prefer file-based commands like `node check-install.js`, `node zillow-photos.js ...`, and `node har-photos.js ...` over inline `node -e ...`. Inline interpreter eval is more likely to trigger approval friction.
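
For the absolute-path note, a small sketch like the one below can expand `~` for the current machine. `expandHome` is a hypothetical helper, not something shipped with the skill, and the example path is simply the scripts directory used elsewhere in this README.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: expand a leading "~" to the current user's home directory.
function expandHome(p) {
  if (p === "~") return os.homedir();
  if (p.startsWith("~/")) return path.join(os.homedir(), p.slice(2));
  return p; // no leading "~": leave the path unchanged
}

// Print the absolute form of the scripts directory for use in the allowlist.
console.log(expandHome("~/.openclaw/workspace/skills/web-automation/scripts"));
```

Saving this to a file and running it with `node` is in line with the preference for file-based commands over inline eval.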

## Common commands

```bash
# Install / wiring check
cd ~/.openclaw/workspace/skills/web-automation/scripts
node check-install.js

# One-shot JSON extraction
node skills/web-automation/scripts/extract.js "https://example.com"

# Zillow photo extraction
node skills/web-automation/scripts/zillow-photos.js "https://www.zillow.com/homedetails/..."

# HAR photo extraction
node skills/web-automation/scripts/har-photos.js "https://www.har.com/homedetail/..."

# Browse a page with persistent profile
npx tsx browse.ts --url "https://example.com"

@@ -100,6 +123,58 @@ npx tsx auth.ts --url "https://example.com/login"
npx tsx flow.ts --instruction 'go to https://search.fiorinis.com then type "pippo" then press enter then wait 2s'
```

## Real-estate photo extraction

Use the dedicated Zillow and HAR extractors before trying a free-form gallery flow.

### Zillow

```bash
cd ~/.openclaw/workspace/skills/web-automation/scripts
node zillow-photos.js "https://www.zillow.com/homedetails/4141-Whiteley-Dr-Corpus-Christi-TX-78418/2103723704_zpid/"
```

What it does:
- opens the listing page with CloakBrowser
- tries the `See all photos` / `See all X photos` entry point
- if Zillow's click path proves flaky, falls back to the listing's embedded `__NEXT_DATA__` payload
- returns direct `photos.zillowstatic.com` image URLs as JSON
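
The fallback step can be pictured roughly as follows. This is a sketch under assumptions, not the code in `zillow-photos.js`: it assumes the payload sits in a `<script id="__NEXT_DATA__">` tag, as Next.js pages usually embed it, and it assumes nothing about key names inside the payload. `parseNextData` and `collectImageUrls` are hypothetical names.

```javascript
// Assumption: the page embeds its data in
// <script id="__NEXT_DATA__" type="application/json">...</script>.
function parseNextData(html) {
  const match = html.match(/<script[^>]*id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/);
  return match ? JSON.parse(match[1]) : null;
}

// Walk any JSON value and collect unique string values that are URLs on `host`.
// The payload layout is not documented here, so every nested value is visited.
function collectImageUrls(value, host, found = new Set()) {
  if (typeof value === "string") {
    try {
      if (new URL(value).hostname === host) found.add(value);
    } catch {
      // Not a parseable URL; ignore it.
    }
  } else if (Array.isArray(value)) {
    for (const item of value) collectImageUrls(item, host, found);
  } else if (value && typeof value === "object") {
    for (const item of Object.values(value)) collectImageUrls(item, host, found);
  }
  return found;
}

// Example usage with the HTML of a listing page held in `html`:
//   const data = parseNextData(html);
//   const urls = data ? [...collectImageUrls(data, "photos.zillowstatic.com")] : [];
```

Collecting into a `Set` removes duplicates, which matters if the same photo appears more than once in the payload.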

Expected success shape:
- `complete: true`
- `expectedPhotoCount` matches `photoCount`
- `imageUrls` contains the listing photo set
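
A result can be checked against this shape mechanically. The sketch below assumes the extractor emits one JSON object with these names as top-level fields and that `imageUrls` is an array of URL strings; `checkPhotoResult` is a hypothetical helper. Passing the expected host as a parameter lets the same check be used with `pics.harstatic.com` for HAR results.

```javascript
// Assumption: `result` is the parsed JSON object printed by an extractor.
// Returns a list of problems; an empty list means the shape looks right.
function checkPhotoResult(result, expectedHost) {
  const problems = [];
  if (result.complete !== true) problems.push("complete is not true");
  if (result.expectedPhotoCount !== result.photoCount) {
    problems.push(
      `expectedPhotoCount ${result.expectedPhotoCount} != photoCount ${result.photoCount}`
    );
  }
  const urls = Array.isArray(result.imageUrls) ? result.imageUrls : [];
  if (urls.length === 0) problems.push("imageUrls is empty or missing");
  for (const url of urls) {
    let hostname = null;
    try {
      hostname = new URL(url).hostname;
    } catch {
      // Leave hostname as null so the URL is reported below.
    }
    if (hostname !== expectedHost) problems.push(`unexpected image URL: ${url}`);
  }
  return problems;
}

// Example usage:
//   const problems = checkPhotoResult(result, "photos.zillowstatic.com");
//   if (problems.length > 0) console.error(problems.join("\n"));
```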

### HAR

```bash
cd ~/.openclaw/workspace/skills/web-automation/scripts
node har-photos.js "https://www.har.com/homedetail/4141-whiteley-dr-corpus-christi-tx-78418/14069438"
```

What it does:
- opens the HAR listing page
- clicks `Show all photos` / `View all photos`
- extracts the direct `pics.harstatic.com` image URLs from the all-photos page

Expected success shape:
- `complete: true`
- `expectedPhotoCount` matches `photoCount`
- `imageUrls` contains the listing photo set

### Test commands

From `skills/web-automation/scripts`:

```bash
node check-install.js
npm run test:photos
node zillow-photos.js "<zillow-listing-url>"
node har-photos.js "<har-listing-url>"
```

Use the live Zillow and HAR URLs above for a known-good regression check.
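
A regression check along these lines could be automated with a small runner. The sketch below is illustrative only: it assumes each extractor writes a single JSON document to stdout and exits with status 0 on success, neither of which is stated here. `runExtractor` and `summarize` are hypothetical names.

```javascript
const { spawnSync } = require("node:child_process");

// Assumption: the extractor prints exactly one JSON document on stdout.
function runExtractor(command, args) {
  const result = spawnSync(command, args, { encoding: "utf8" });
  if (result.error) throw result.error; // the process could not be started
  if (result.status !== 0) {
    throw new Error(`extractor exited with status ${result.status}: ${result.stderr}`);
  }
  return JSON.parse(result.stdout);
}

// One-line PASS/FAIL summary based on the documented success shape.
function summarize(name, json) {
  const passed = json.complete === true && json.expectedPhotoCount === json.photoCount;
  return `${passed ? "PASS" : "FAIL"} ${name}: ${json.photoCount}/${json.expectedPhotoCount} photos`;
}

// Example usage, run from skills/web-automation/scripts:
//   const json = runExtractor(process.execPath, ["zillow-photos.js", "<zillow-listing-url>"]);
//   console.log(summarize("zillow", json));
```

Because the listings are live pages, a failure here may reflect a site change rather than a bug in the scripts.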

## One-shot extraction (`extract.js`)

Use `extract.js` when the task is just: open one URL, render it, and return structured content.
