Add Zillow and HAR photo extractors

2026-03-27 17:35:46 -05:00
parent e7c56fe760
commit eeea0c8ef1
11 changed files with 873 additions and 8 deletions

@@ -148,8 +148,16 @@ You must use **only `web-automation`** to:
If Zillow exposes a page with a scroller that shows the listing photos, that page counts as the Zillow photo source and should be used directly.
If that scroller page exposes direct image links such as `https://photos.zillowstatic.com/...`, treat those URLs as successful photo access and use that image set for review.
This is preferred over fragile modal next/previous navigation.
If the rendered Zillow listing shell itself already exposes the full direct Zillow image set and the extracted image count matches the announced photo count, that also counts as successful photo access even if the `See all photos` click path is flaky.
For this normal Zillow all-photos workflow, stay inside **`web-automation` only**.
Use the dedicated file-based extractor first:
```bash
cd ~/.openclaw/workspace/skills/web-automation/scripts
node zillow-photos.js "<zillow-listing-url>"
```
Do **not** escalate to coding-agent, ad hoc Python helpers, or extra dependency-heavy tooling just to open `See all photos`, inspect the scroller page, or extract Zillow image URLs.
Only escalate beyond `web-automation` if `web-automation` itself truly cannot access the all-photos/scroller page or the exposed image set.
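The count-match rule above can be sketched as a small check. This is only an illustration: `checkZillowPhotoSet` is a hypothetical helper, not part of `web-automation` or `zillow-photos.js`, and dropping exact duplicate URLs before counting is an assumption rather than something this document specifies.

```javascript
// Hypothetical helper, not part of web-automation or zillow-photos.js.
const ZILLOW_PHOTO_PREFIX = "https://photos.zillowstatic.com/";

function checkZillowPhotoSet(extractedUrls, announcedCount) {
  // Assumption: exact duplicate URLs are dropped before counting.
  // Only direct Zillow image links are kept.
  const direct = [...new Set(extractedUrls)].filter((url) =>
    url.startsWith(ZILLOW_PHOTO_PREFIX)
  );
  return {
    imageUrls: direct,
    extractedCount: direct.length,
    announcedCount,
    // Success only when at least one photo exists and the counts agree.
    complete: direct.length > 0 && direct.length === announcedCount,
  };
}
```

If `complete` is `false`, the rule above does not apply and the all-photos/scroller page is still the source to try.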
@@ -165,6 +173,42 @@ The following do **not** count as a photo-source attempt by themselves:
- reading listing text that mentions photos
- capturing only the listing shell, hero image, or collage preview
### HAR fallback rule
If Zillow photo extraction does not expose usable direct image URLs, try HAR next for the same property.
For HAR listings, use **only `web-automation`** to:
1. open the HAR listing page
2. click `Show all photos` / `View all photos`
3. access the resulting all-photos page or photo view
4. extract the direct image URLs from that page
Use the dedicated HAR extractor first:
```bash
cd ~/.openclaw/workspace/skills/web-automation/scripts
node har-photos.js "<har-listing-url>"
```
If HAR exposes the direct photo URLs from the all-photos page, treat that as successful photo access and use that image set for review.
Do not stop after a failed Zillow attempt if HAR is available and exposes the listing photos more reliably.
When a dedicated extractor returns `imageUrls`, inspect the images in that returned set before making condition claims.
For smaller listings, review the full extracted set when practical; for a 20-30 photo listing, that usually means all photos.
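The Zillow-then-HAR order described above could be sketched as follows. Only the `imageUrls` field name comes from this document. The `choosePhotoSource` function, the `{ source, run }` shape, and the attempt strings are hypothetical. A real `run` would invoke the extractor script and would likely be asynchronous; it is synchronous here to keep the sketch short.

```javascript
// Hypothetical sketch: tries each extractor in order and stops at the first
// one that returns a non-empty `imageUrls` array.
function choosePhotoSource(extractors) {
  const attempts = [];
  for (const { source, run } of extractors) {
    const result = run() || {};
    const urls = Array.isArray(result.imageUrls) ? result.imageUrls : [];
    attempts.push(`${source}: ${urls.length} direct image URLs`);
    if (urls.length > 0) {
      return { source, imageUrls: urls, attempts };
    }
  }
  // No source exposed usable direct image URLs.
  return { source: null, imageUrls: [], attempts };
}
```

Because later extractors run only when earlier ones return nothing usable, HAR is attempted only after Zillow fails, and the `attempts` list records what was tried.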
### Approval-safe command shape
When running `web-automation` from chat-driven property assessment, prefer file-based commands under `~/.openclaw/workspace/skills/web-automation/scripts`.
Good:
- `node check-install.js`
- `node zillow-photos.js "<url>"`
- `node har-photos.js "<url>"`
Avoid approval-sensitive inline interpreter eval where possible:
- `node -e "..."`
- `node --input-type=module -e "..."`
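As an illustration of the file-based shape, a caller could build the command from an allow-list of the scripts named above and pass the URL as a separate argument. `buildScriptCommand` and `ALLOWED_SCRIPTS` are hypothetical names, and reading the home directory from `process.env.HOME` is an assumption.

```javascript
// Hypothetical helper: builds a file-based command spec, never `node -e`.
const ALLOWED_SCRIPTS = ["check-install.js", "zillow-photos.js", "har-photos.js"];

function buildScriptCommand(script, url, homeDir = process.env.HOME) {
  if (!ALLOWED_SCRIPTS.includes(script)) {
    throw new Error(`Not an approved script: ${script}`);
  }
  return {
    command: "node",
    // The URL is its own argument; it is never spliced into inline code.
    args: url === undefined ? [script] : [script, url],
    // Assumption: the home directory is available via HOME.
    cwd: `${homeDir}/.openclaw/workspace/skills/web-automation/scripts`,
  };
}
```

The returned spec can be handed to `execFile(spec.command, spec.args, { cwd: spec.cwd }, callback)` from `node:child_process`, which runs the script without a shell and without inline evaluation.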
The final assessment must explicitly include these lines in the output:
- `Photo source attempts: <action-based summary>`
- `Photo review: completed via <source>`
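A minimal sketch of producing those two lines from recorded attempts is shown below. `formatPhotoReviewLines` is a hypothetical helper, and it covers only the `completed via <source>` form shown here. Joining attempts with `"; "` is an assumption.

```javascript
// Hypothetical helper: renders the two required output lines.
function formatPhotoReviewLines(attempts, reviewedSource) {
  return [
    // Assumption: attempts are short action-based strings joined with "; ".
    `Photo source attempts: ${attempts.join("; ")}`,
    `Photo review: completed via ${reviewedSource}`,
  ];
}
```

For example, `formatPhotoReviewLines(["Zillow all-photos page opened", "HAR not needed"], "Zillow")` yields one line per required field, ready to append to the assessment.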