Add purpose-aware property assessor intake

This commit is contained in:
2026-03-27 23:01:12 -05:00
parent c58a2a43c8
commit 301986fb25
16 changed files with 841 additions and 70 deletions

View File

@@ -34,9 +34,9 @@ The wrapper script uses the skill-local Node dependencies under `node_modules/`.
## Commands ## Commands
```bash ```bash
scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --assessment-purpose "investment property"
scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --recipient-email "buyer@example.com" scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --assessment-purpose "investment property" --recipient-email "buyer@example.com"
scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --recipient-email "buyer@example.com" --output /tmp/property-assessment.pdf scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --assessment-purpose "investment property" --recipient-email "buyer@example.com" --output /tmp/property-assessment.pdf
scripts/property-assessor locate-public-records --address "4141 Whiteley Dr, Corpus Christi, TX 78418" scripts/property-assessor locate-public-records --address "4141 Whiteley Dr, Corpus Christi, TX 78418"
scripts/property-assessor locate-public-records --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --parcel-id "14069438" scripts/property-assessor locate-public-records --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --parcel-id "14069438"
@@ -61,23 +61,26 @@ Default operating sequence:
### `assess` ### `assess`
```bash ```bash
scripts/property-assessor assess --address "<street-address>" scripts/property-assessor assess --address "<street-address>" --assessment-purpose "<purpose>"
scripts/property-assessor assess --address "<street-address>" --recipient-email "<target@example.com>" scripts/property-assessor assess --address "<street-address>" --assessment-purpose "<purpose>" --recipient-email "<target@example.com>"
``` ```
Current behavior: Current behavior:
- starts from the address - starts from the address
- requires an assessment purpose for decision-grade analysis
- automatically runs public-record / appraisal-district lookup - automatically runs public-record / appraisal-district lookup
- automatically tries to discover Zillow and HAR listing URLs from the address when no listing URL is provided
- runs Zillow photo extraction first, then HAR as fallback when available
- returns a structured preliminary report payload - returns a structured preliminary report payload
- asks for recipient email(s) before PDF generation - asks for recipient email(s) before PDF generation
- renders the fixed-template PDF when recipient email(s) are present - renders the fixed-template PDF when recipient email(s) are present
Important limitation: Important limitation:
- this first implementation wires the address-first assessment spine - this implementation now wires the address-first intake, purpose-aware framing, public-record lookup, listing discovery, and photo-source extraction
- it does not yet auto-run listing extraction, photo review, comps, or carry underwriting inside the helper itself - it still does not perform full comp analysis, pricing judgment, or completed carry underwriting inside the helper itself
- those richer assessment steps are still governed by the skill workflow and the `web-automation` extractors - those deeper decision steps are still governed by the skill workflow after the helper assembles the enriched payload
## Source priority ## Source priority
@@ -205,8 +208,10 @@ For chat-driven runs, prefer file-based commands.
Good: Good:
- `scripts/property-assessor assess --address "..."` - `scripts/property-assessor assess --address "..." --assessment-purpose "..."`
- `node check-install.js` - `node check-install.js`
- `node zillow-discover.js "<street-address>"`
- `node har-discover.js "<street-address>"`
- `node zillow-photos.js "<url>"` - `node zillow-photos.js "<url>"`
- `node har-photos.js "<url>"` - `node har-photos.js "<url>"`
- `scripts/property-assessor locate-public-records --address "..."` - `scripts/property-assessor locate-public-records --address "..."`
@@ -304,13 +309,15 @@ npm test
```bash ```bash
cd ~/.openclaw/workspace/skills/property-assessor cd ~/.openclaw/workspace/skills/property-assessor
scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --assessment-purpose "investment property"
``` ```
Expected shape: Expected shape:
- `needsAssessmentPurpose: false`
- `needsRecipientEmails: true` - `needsRecipientEmails: true`
- public-record / CAD jurisdiction included in the returned payload - public-record / CAD jurisdiction included in the returned payload
- `photoReview.imageUrls` populated when Zillow or HAR extraction succeeds
- no PDF generated yet - no PDF generated yet
- explicit message telling the operator to ask for target recipient email(s) - explicit message telling the operator to ask for target recipient email(s)
@@ -332,7 +339,7 @@ Expected shape:
```bash ```bash
cd ~/.openclaw/workspace/skills/property-assessor cd ~/.openclaw/workspace/skills/property-assessor
scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --recipient-email "buyer@example.com" --output /tmp/property-assessment.pdf scripts/property-assessor assess --address "4141 Whiteley Dr, Corpus Christi, TX 78418" --assessment-purpose "investment property" --recipient-email "buyer@example.com" --output /tmp/property-assessment.pdf
``` ```
Expected result: Expected result:
@@ -368,8 +375,10 @@ When testing `property-assessor` itself, confirm the assessment:
- starts from the address when available - starts from the address when available
- uses Zillow first for photo extraction, HAR as fallback - uses Zillow first for photo extraction, HAR as fallback
- frames the analysis around the stated assessment purpose
- uses official public-record jurisdiction links when available - uses official public-record jurisdiction links when available
- does not treat listing geo IDs as assessor keys - does not treat listing geo IDs as assessor keys
- asks for assessment purpose if it was not provided
- asks for recipient email(s) if they were not provided - asks for recipient email(s) if they were not provided
- renders the final report through the fixed PDF template once recipient email(s) are known - renders the final report through the fixed PDF template once recipient email(s) are known

View File

@@ -15,6 +15,7 @@ Automated web browsing and scraping using Playwright-compatible CloakBrowser, wi
- Use `node skills/web-automation/scripts/extract.js "<URL>"` for one-shot extraction from a single URL - Use `node skills/web-automation/scripts/extract.js "<URL>"` for one-shot extraction from a single URL
- Use `npx tsx scrape.ts ...` for markdown scraping modes - Use `npx tsx scrape.ts ...` for markdown scraping modes
- Use `npx tsx browse.ts ...`, `auth.ts`, or `flow.ts` for interactive or authenticated flows - Use `npx tsx browse.ts ...`, `auth.ts`, or `flow.ts` for interactive or authenticated flows
- Use `node skills/web-automation/scripts/zillow-discover.js "<street-address>"` or `har-discover.js` to resolve a real-estate listing URL from an address
- Use `node skills/web-automation/scripts/zillow-photos.js "<listing-url>"` or `har-photos.js` for real-estate photo extraction before attempting generic gallery automation - Use `node skills/web-automation/scripts/zillow-photos.js "<listing-url>"` or `har-photos.js` for real-estate photo extraction before attempting generic gallery automation
## Requirements ## Requirements
@@ -93,6 +94,7 @@ Notes:
- If matching is inconsistent, replace `~/.openclaw/...` with the full absolute path for the machine. - If matching is inconsistent, replace `~/.openclaw/...` with the full absolute path for the machine.
- Keep the allowlist scoped to the main agent unless there is a clear reason to widen it. - Keep the allowlist scoped to the main agent unless there is a clear reason to widen it.
- Prefer file-based commands like `node check-install.js`, `node zillow-photos.js ...`, and `node har-photos.js ...` over inline `node -e ...`. Inline interpreter eval is more likely to trigger approval friction. - Prefer file-based commands like `node check-install.js`, `node zillow-photos.js ...`, and `node har-photos.js ...` over inline `node -e ...`. Inline interpreter eval is more likely to trigger approval friction.
- The same applies to `zillow-discover.js` and `har-discover.js`: keep discovery file-based, not inline.
## Common commands ## Common commands
@@ -104,6 +106,12 @@ node check-install.js
# One-shot JSON extraction # One-shot JSON extraction
node skills/web-automation/scripts/extract.js "https://example.com" node skills/web-automation/scripts/extract.js "https://example.com"
# Zillow listing discovery from address
node skills/web-automation/scripts/zillow-discover.js "4141 Whiteley Dr, Corpus Christi, TX 78418"
# HAR listing discovery from address
node skills/web-automation/scripts/har-discover.js "4141 Whiteley Dr, Corpus Christi, TX 78418"
# Zillow photo extraction # Zillow photo extraction
node skills/web-automation/scripts/zillow-photos.js "https://www.zillow.com/homedetails/..." node skills/web-automation/scripts/zillow-photos.js "https://www.zillow.com/homedetails/..."
@@ -123,9 +131,35 @@ npx tsx auth.ts --url "https://example.com/login"
npx tsx flow.ts --instruction 'go to https://search.fiorinis.com then type "pippo" then press enter then wait 2s' npx tsx flow.ts --instruction 'go to https://search.fiorinis.com then type "pippo" then press enter then wait 2s'
``` ```
## Real-estate photo extraction ## Real-estate listing discovery and photo extraction
Use the dedicated Zillow and HAR extractors before trying a free-form gallery flow. Use the dedicated Zillow and HAR discovery/photo commands before trying a free-form gallery flow.
### Zillow discovery
```bash
cd ~/.openclaw/workspace/skills/web-automation/scripts
node zillow-discover.js "4141 Whiteley Dr, Corpus Christi, TX 78418"
```
What it does:
- opens the Zillow address URL with CloakBrowser
- resolves directly to a property page when Zillow supports the address slug
- otherwise looks for a `homedetails` listing link in the rendered page
- returns the discovered listing URL as JSON
### HAR discovery
```bash
cd ~/.openclaw/workspace/skills/web-automation/scripts
node har-discover.js "4141 Whiteley Dr, Corpus Christi, TX 78418"
```
What it does:
- opens the HAR address search page
- looks for a confident `homedetail` match in rendered results
- returns the discovered listing URL when HAR exposes a strong enough match
- returns `listingUrl: null` when HAR discovery is not confident enough
### Zillow ### Zillow
@@ -169,6 +203,8 @@ From `skills/web-automation/scripts`:
```bash ```bash
node check-install.js node check-install.js
npm run test:photos npm run test:photos
node zillow-discover.js "<street-address>"
node har-discover.js "<street-address>"
node zillow-photos.js "<zillow-listing-url>" node zillow-photos.js "<zillow-listing-url>"
node har-photos.js "<har-listing-url>" node har-photos.js "<har-listing-url>"
``` ```

View File

@@ -12,7 +12,10 @@ Start from the property address when possible. Treat listing URLs as supporting
Accept any of: Accept any of:
- a street address - a street address
- one or more listing URLs - one or more listing URLs
- an address plus user constraints such as investment only, owner-occupant, long-term rental, STR, or distance to a target location - an address plus the reason for the assessment, such as investment property, vacation home, owner-occupant, long-term rental, STR, or housing for a child in college
The assessment purpose is required for a decision-grade result.
If the user does not say why they want the property assessed, stop and ask before finalizing the analysis.
## Core workflow ## Core workflow
@@ -73,14 +76,18 @@ Quick command summary:
cd ~/.openclaw/workspace/skills/property-assessor cd ~/.openclaw/workspace/skills/property-assessor
npm install npm install
scripts/property-assessor assess --address "<street-address>" scripts/property-assessor assess --address "<street-address>"
scripts/property-assessor assess --address "<street-address>" --recipient-email "<target@example.com>" scripts/property-assessor assess --address "<street-address>" --assessment-purpose "<purpose>"
scripts/property-assessor assess --address "<street-address>" --assessment-purpose "<purpose>" --recipient-email "<target@example.com>"
scripts/property-assessor locate-public-records --address "<street-address>" scripts/property-assessor locate-public-records --address "<street-address>"
scripts/property-assessor render-report --input "<report-payload-json>" --output "<output-pdf>" scripts/property-assessor render-report --input "<report-payload-json>" --output "<output-pdf>"
``` ```
`assess` is the address-first entrypoint. It should: `assess` is the address-first entrypoint. It should:
- require the assessment purpose
- resolve official public-record jurisdiction automatically from the address - resolve official public-record jurisdiction automatically from the address
- build the report payload skeleton - try to discover Zillow and HAR listing URLs from the address when no listing URL is provided
- run the approval-safe Zillow/HAR photo extractor chain automatically
- build a purpose-aware report payload
- stop and ask for recipient email(s) before rendering the PDF - stop and ask for recipient email(s) before rendering the PDF
- render the PDF only after recipient email(s) are known - render the PDF only after recipient email(s) are known
@@ -112,6 +119,12 @@ scripts/property-assessor assess --address "<street-address>"
``` ```
This command should automatically include the public-record jurisdiction result in the returned assessment payload. This command should automatically include the public-record jurisdiction result in the returned assessment payload.
When `--assessment-purpose` is present, it should also:
- frame the analysis around that stated purpose
- try Zillow discovery from the address
- try HAR discovery from the address
- run Zillow photo extraction first when available, then HAR as fallback
- include the extracted `imageUrls` in `photoReview` when successful
This command currently: This command currently:
- resolves the address through the official Census geocoder - resolves the address through the official Census geocoder

View File

@@ -2,6 +2,7 @@
"recipientEmails": [ "recipientEmails": [
"buyer@example.com" "buyer@example.com"
], ],
"assessmentPurpose": "investment property",
"reportTitle": "Property Assessment Report", "reportTitle": "Property Assessment Report",
"subtitle": "Sample property assessment payload", "subtitle": "Sample property assessment payload",
"subjectProperty": { "subjectProperty": {

View File

@@ -1,11 +1,14 @@
import os from "node:os"; import os from "node:os";
import path from "node:path"; import path from "node:path";
import { discoverListingSources, type ListingDiscoveryResult } from "./listing-discovery.js";
import { extractPhotoData, type PhotoExtractionResult, type PhotoSource } from "./photo-review.js";
import { resolvePublicRecords, type PublicRecordsResolution } from "./public-records.js"; import { resolvePublicRecords, type PublicRecordsResolution } from "./public-records.js";
import { renderReportPdf, type ReportPayload } from "./report-pdf.js"; import { renderReportPdf, type ReportPayload } from "./report-pdf.js";
export interface AssessPropertyOptions { export interface AssessPropertyOptions {
address: string; address: string;
assessmentPurpose?: string;
recipientEmails?: string[] | string; recipientEmails?: string[] | string;
output?: string; output?: string;
parcelId?: string; parcelId?: string;
@@ -15,16 +18,30 @@ export interface AssessPropertyOptions {
export interface AssessPropertyResult { export interface AssessPropertyResult {
ok: true; ok: true;
needsAssessmentPurpose: boolean;
needsRecipientEmails: boolean; needsRecipientEmails: boolean;
message: string; message: string;
outputPath: string | null; outputPath: string | null;
reportPayload: ReportPayload; reportPayload: ReportPayload | null;
publicRecords: PublicRecordsResolution; publicRecords: PublicRecordsResolution | null;
} }
interface AssessPropertyDeps { interface AssessPropertyDeps {
resolvePublicRecordsFn?: typeof resolvePublicRecords; resolvePublicRecordsFn?: typeof resolvePublicRecords;
renderReportPdfFn?: typeof renderReportPdf; renderReportPdfFn?: typeof renderReportPdf;
discoverListingSourcesFn?: typeof discoverListingSources;
extractPhotoDataFn?: typeof extractPhotoData;
}
interface PurposeGuidance {
label: string;
snapshot: string;
like: string;
caution: string;
comp: string;
carry: string;
diligence: string;
verdict: string;
} }
function asStringArray(value: unknown): string[] { function asStringArray(value: unknown): string[] {
@@ -95,36 +112,206 @@ function buildPublicRecordLinks(
return links; return links;
} }
function normalizePurpose(value: string): string {
return value.trim().replace(/\s+/g, " ");
}
function getPurposeGuidance(purpose: string): PurposeGuidance {
const normalized = purpose.toLowerCase();
if (/(invest|rental|cash[\s-]?flow|income|flip|str|airbnb)/i.test(normalized)) {
return {
label: purpose,
snapshot: `Purpose fit: evaluate this as an income property first, not a generic owner-occupant home.`,
like: "Purpose-aligned upside will depend on durable rent support, manageable make-ready, and exit liquidity.",
caution: "An investment property is not attractive just because the address clears basic public-record checks; rent support and margin still decide the deal.",
comp: "Comp work should focus on resale liquidity and true rent support for an income property, not only nearby asking prices.",
carry: "Carry view should underwrite this as an income property with taxes, insurance, HOA, maintenance, vacancy, and management friction.",
diligence: "Confirm realistic rent, reserve for turns and repairs, and whether the building/submarket has enough liquidity for an investment property exit.",
verdict: `Assessment purpose: ${purpose}. Do not issue a buy/pass conclusion until the property clears rent support, carry, and make-ready standards for an investment property.`
};
}
if (/(vacation|second home|weekend|personal use|beach|getaway)/i.test(normalized)) {
return {
label: purpose,
snapshot: "Purpose fit: evaluate this as a vacation home with personal-use fit and carrying-cost tolerance in mind.",
like: "A vacation-home decision can justify paying for lifestyle fit, but only if ongoing friction is acceptable.",
caution: "Vacation home ownership can hide real recurring cost drag when insurance, HOA, storm exposure, and deferred maintenance are under-modeled.",
comp: "Comp work should focus on lifestyle alternatives, micro-location quality, and whether the premium over substitutes is defensible for a vacation home.",
carry: "Carry view should stress-test second-home costs, especially insurance, HOA, special assessments, and low-utilization months.",
diligence: "Confirm rules, reserves, storm or flood exposure, and whether the property still makes sense if usage ends up lower than expected.",
verdict: `Assessment purpose: ${purpose}. The final call should weigh personal-use fit against ongoing friction, not just headline list price.`
};
}
if (/(daughter|son|college|student|school|campus)/i.test(normalized)) {
return {
label: purpose,
snapshot: "Purpose fit: evaluate this as practical housing for a student, with safety, commute, durability, and oversight burden in mind.",
like: "This purpose is best served by simple logistics, durable finishes, and a layout that is easy to live in and maintain.",
caution: "Student-use housing can become a bad fit when commute friction, fragile finishes, or management burden are underestimated.",
comp: "Comp work should emphasize campus proximity, day-to-day practicality, and whether comparable student-friendly options exist at lower all-in cost.",
carry: "Carry view should account for parent-risk factors, furnishing/setup costs, upkeep burden, and whether renting may still be the lower-risk option.",
diligence: "Confirm commute, safety, parking, roommate practicality if relevant, and whether the property is easy to maintain from a distance.",
verdict: `Assessment purpose: ${purpose}. The final recommendation should prioritize practicality, safety, and management burden over generic resale talking points.`
};
}
return {
label: purpose,
snapshot: `Purpose fit: ${purpose}. The final recommendation should be explicitly tested against that goal.`,
like: "The assessment should stay anchored to the stated purpose rather than defaulting to generic market commentary.",
caution: "Even a clean public-record and photo intake is not enough if the property does not fit the stated purpose.",
comp: "Comp work should compare against alternatives that solve the same purpose, not just nearby listings.",
carry: "Carry view should reflect the stated purpose and the real friction it implies.",
diligence: "Purpose-specific diligence should be listed explicitly before a final buy/pass/offer recommendation.",
verdict: `Assessment purpose: ${purpose}. The final conclusion must be explained in terms of that stated objective.`
};
}
function inferSourceFromUrl(rawUrl: string): PhotoSource | null {
try {
const url = new URL(rawUrl);
const host = url.hostname.toLowerCase();
if (host.includes("zillow.com")) return "zillow";
if (host.includes("har.com")) return "har";
return null;
} catch {
return null;
}
}
async function resolvePhotoReview(
options: AssessPropertyOptions,
discoverListingSourcesFn: typeof discoverListingSources,
extractPhotoDataFn: typeof extractPhotoData
): Promise<{
listingUrls: Array<{ label: string; url: string }>;
photoReview: Record<string, unknown>;
}> {
const attempts: string[] = [];
const listingUrls: Array<{ label: string; url: string }> = [];
const addListingUrl = (label: string, url: string | null | undefined): void => {
if (!url) return;
if (!listingUrls.some((item) => item.url === url)) {
listingUrls.push({ label, url });
}
};
let zillowUrl: string | null = null;
let harUrl: string | null = null;
if (options.listingSourceUrl) {
const explicitSource = inferSourceFromUrl(options.listingSourceUrl);
addListingUrl("Explicit Listing Source", options.listingSourceUrl);
if (explicitSource === "zillow") {
zillowUrl = options.listingSourceUrl;
attempts.push(`Using explicit Zillow listing URL: ${options.listingSourceUrl}`);
} else if (explicitSource === "har") {
harUrl = options.listingSourceUrl;
attempts.push(`Using explicit HAR listing URL: ${options.listingSourceUrl}`);
} else {
attempts.push(
`Explicit listing URL was provided but is not a supported Zillow/HAR photo source: ${options.listingSourceUrl}`
);
}
}
if (!zillowUrl && !harUrl) {
const discovered = await discoverListingSourcesFn(options.address);
attempts.push(...discovered.attempts);
zillowUrl = discovered.zillowUrl;
harUrl = discovered.harUrl;
addListingUrl("Discovered Zillow Listing", zillowUrl);
addListingUrl("Discovered HAR Listing", harUrl);
}
const candidates: Array<{ source: PhotoSource; url: string }> = [];
if (zillowUrl) candidates.push({ source: "zillow", url: zillowUrl });
if (harUrl) candidates.push({ source: "har", url: harUrl });
let extracted: PhotoExtractionResult | null = null;
for (const candidate of candidates) {
try {
const result = await extractPhotoDataFn(candidate.source, candidate.url);
extracted = result;
attempts.push(
`${candidate.source} photo extraction succeeded with ${result.photoCount} photos.`
);
if (result.notes.length) {
attempts.push(...result.notes);
}
break;
} catch (error) {
attempts.push(
`${candidate.source} photo extraction failed: ${error instanceof Error ? error.message : String(error)}`
);
}
}
if (extracted) {
return {
listingUrls,
photoReview: {
status: "completed",
source: extracted.source,
photoCount: extracted.photoCount,
expectedPhotoCount: extracted.expectedPhotoCount ?? null,
imageUrls: extracted.imageUrls,
attempts,
summary:
"Photo URLs were collected successfully. A decision-grade condition read still requires reviewing the extracted image set.",
}
};
}
if (!candidates.length) {
attempts.push(
"No supported Zillow or HAR listing URL was available for photo extraction."
);
}
return {
listingUrls,
photoReview: {
status: "not completed",
source: candidates.length ? "listing source attempted" : "no supported listing source",
attempts,
summary:
"Condition review is incomplete until Zillow or HAR photos are extracted and inspected."
}
};
}
export function buildAssessmentReportPayload( export function buildAssessmentReportPayload(
options: AssessPropertyOptions, options: AssessPropertyOptions,
publicRecords: PublicRecordsResolution publicRecords: PublicRecordsResolution,
listingUrls: Array<{ label: string; url: string }>,
photoReview: Record<string, unknown>
): ReportPayload { ): ReportPayload {
const recipientEmails = asStringArray(options.recipientEmails); const recipientEmails = asStringArray(options.recipientEmails);
const matchedAddress = publicRecords.matchedAddress || options.address; const matchedAddress = publicRecords.matchedAddress || options.address;
const publicRecordLinks = buildPublicRecordLinks(publicRecords); const publicRecordLinks = buildPublicRecordLinks(publicRecords);
const sourceLinks = [...publicRecordLinks]; const sourceLinks = [...publicRecordLinks];
const purpose = normalizePurpose(options.assessmentPurpose || "");
const purposeGuidance = getPurposeGuidance(purpose);
pushLink(sourceLinks, "Listing Source", options.listingSourceUrl); for (const item of listingUrls) {
pushLink(sourceLinks, item.label, item.url);
}
const jurisdiction = const jurisdiction =
publicRecords.county.name && publicRecords.appraisalDistrict publicRecords.county.name && publicRecords.appraisalDistrict
? `${publicRecords.county.name} Appraisal District` ? `${publicRecords.county.name} Appraisal District`
: publicRecords.county.name || publicRecords.state.name || "Official public-record jurisdiction"; : publicRecords.county.name || publicRecords.state.name || "Official public-record jurisdiction";
const photoAttempts = options.listingSourceUrl
? [
`Listing source captured: ${options.listingSourceUrl}`,
"Photo review has not been run yet. Use the approval-safe Zillow/HAR extractor flow before making condition claims."
]
: [
"No listing source URL was provided to the assess helper.",
"Photo review has not been run yet. Use the approval-safe Zillow/HAR extractor flow before making condition claims."
];
return { return {
recipientEmails, recipientEmails,
assessmentPurpose: purposeGuidance.label,
reportTitle: "Property Assessment Report", reportTitle: "Property Assessment Report",
subtitle: "Preliminary address-first intake with public-record enrichment.", subtitle: "Address-first intake with public-record enrichment and approval-safe photo-source orchestration.",
subjectProperty: { subjectProperty: {
address: matchedAddress, address: matchedAddress,
county: publicRecords.county.name || "N/A", county: publicRecords.county.name || "N/A",
@@ -134,38 +321,32 @@ export function buildAssessmentReportPayload(
verdict: { verdict: {
decision: "pending", decision: "pending",
fairValueRange: "Not established", fairValueRange: "Not established",
offerGuidance: offerGuidance: purposeGuidance.verdict
"Preliminary intake only. Official public-record jurisdiction is identified, but listing, photo, comp, and carry analysis still required before a buy/pass/offer conclusion."
}, },
snapshot: [ snapshot: [
`Matched address: ${matchedAddress}`, `Matched address: ${matchedAddress}`,
`Jurisdiction anchor: ${publicRecords.county.name || "Unknown county"}, ${publicRecords.state.code || publicRecords.state.name || "Unknown state"}.`, `Jurisdiction anchor: ${publicRecords.county.name || "Unknown county"}, ${publicRecords.state.code || publicRecords.state.name || "Unknown state"}.`,
publicRecords.geoid ? `Census block GEOID: ${publicRecords.geoid}.` : "Census block GEOID not returned." publicRecords.geoid ? `Census block GEOID: ${publicRecords.geoid}.` : "Census block GEOID not returned.",
purposeGuidance.snapshot
], ],
whatILike: [ whatILike: [
"Address-first flow avoids treating listing-site geo IDs as assessor record keys.", "Address-first flow avoids treating listing-site geo IDs as assessor record keys.",
publicRecords.appraisalDistrict publicRecords.appraisalDistrict
? "Official appraisal-district contact and website were identified from public records." ? "Official appraisal-district contact and website were identified from public records."
: "Official public-record geography was identified." : "Official public-record geography was identified.",
purposeGuidance.like
], ],
whatIDontLike: [ whatIDontLike: [
"This first-pass assess helper does not yet include listing facts, comp analysis, or a completed photo review.", "This assess helper still needs listing facts, comp analysis, and a human/model review of the extracted photo set before any final valuation claim.",
"Do not make valuation or condition claims from this preliminary output alone." purposeGuidance.caution
], ],
compView: [ compView: [purposeGuidance.comp],
"Comp analysis not yet run. Pull same-building or nearby comps before setting fair value." carryView: [purposeGuidance.carry],
risksAndDiligence: [
...publicRecords.lookupRecommendations,
purposeGuidance.diligence
], ],
carryView: [ photoReview,
"Carry-cost underwriting not yet run. Add taxes, HOA, insurance, maintenance, and vacancy assumptions before decisioning."
],
risksAndDiligence: publicRecords.lookupRecommendations,
photoReview: {
status: "not completed",
source: options.listingSourceUrl ? "listing source pending review" : "no listing source provided",
attempts: photoAttempts,
summary:
"Condition review is incomplete until listing photos are actually extracted and inspected."
},
publicRecords: { publicRecords: {
jurisdiction, jurisdiction,
accountNumber: options.parcelId || publicRecords.sourceIdentifierHints.parcelId, accountNumber: options.parcelId || publicRecords.sourceIdentifierHints.parcelId,
@@ -184,8 +365,24 @@ export async function assessProperty(
options: AssessPropertyOptions, options: AssessPropertyOptions,
deps: AssessPropertyDeps = {} deps: AssessPropertyDeps = {}
): Promise<AssessPropertyResult> { ): Promise<AssessPropertyResult> {
const purpose = normalizePurpose(options.assessmentPurpose || "");
if (!purpose) {
return {
ok: true,
needsAssessmentPurpose: true,
needsRecipientEmails: false,
message:
"Missing assessment purpose. Stop and ask the user why they want this property assessed before producing a decision-grade analysis.",
outputPath: null,
reportPayload: null,
publicRecords: null
};
}
const resolvePublicRecordsFn = deps.resolvePublicRecordsFn || resolvePublicRecords; const resolvePublicRecordsFn = deps.resolvePublicRecordsFn || resolvePublicRecords;
const renderReportPdfFn = deps.renderReportPdfFn || renderReportPdf; const renderReportPdfFn = deps.renderReportPdfFn || renderReportPdf;
const discoverListingSourcesFn = deps.discoverListingSourcesFn || discoverListingSources;
const extractPhotoDataFn = deps.extractPhotoDataFn || extractPhotoData;
const publicRecords = await resolvePublicRecordsFn(options.address, { const publicRecords = await resolvePublicRecordsFn(options.address, {
parcelId: options.parcelId, parcelId: options.parcelId,
@@ -193,12 +390,24 @@ export async function assessProperty(
listingSourceUrl: options.listingSourceUrl listingSourceUrl: options.listingSourceUrl
}); });
const reportPayload = buildAssessmentReportPayload(options, publicRecords); const photoResolution = await resolvePhotoReview(
{ ...options, assessmentPurpose: purpose },
discoverListingSourcesFn,
extractPhotoDataFn
);
const reportPayload = buildAssessmentReportPayload(
{ ...options, assessmentPurpose: purpose },
publicRecords,
photoResolution.listingUrls,
photoResolution.photoReview
);
const recipientEmails = asStringArray(options.recipientEmails); const recipientEmails = asStringArray(options.recipientEmails);
if (!recipientEmails.length) { if (!recipientEmails.length) {
return { return {
ok: true, ok: true,
needsAssessmentPurpose: false,
needsRecipientEmails: true, needsRecipientEmails: true,
message: message:
"Missing target email. Stop and ask the user for target email address(es) before generating or sending the property assessment PDF.", "Missing target email. Stop and ask the user for target email address(es) before generating or sending the property assessment PDF.",
@@ -218,6 +427,7 @@ export async function assessProperty(
const renderedPath = await renderReportPdfFn(reportPayload, outputPath); const renderedPath = await renderReportPdfFn(reportPayload, outputPath);
return { return {
ok: true, ok: true,
needsAssessmentPurpose: false,
needsRecipientEmails: false, needsRecipientEmails: false,
message: `Property assessment PDF rendered: ${renderedPath}`, message: `Property assessment PDF rendered: ${renderedPath}`,
outputPath: renderedPath, outputPath: renderedPath,

View File

@@ -9,7 +9,7 @@ import { ReportValidationError, loadReportPayload, renderReportPdf } from "./rep
function usage(): void { function usage(): void {
process.stdout.write(`property-assessor\n process.stdout.write(`property-assessor\n
Commands: Commands:
assess --address "<address>" [--recipient-email "<email>"] [--output "<report.pdf>"] [--parcel-id "<id>"] [--listing-geo-id "<id>"] [--listing-source-url "<url>"] assess --address "<address>" --assessment-purpose "<purpose>" [--recipient-email "<email>"] [--output "<report.pdf>"] [--parcel-id "<id>"] [--listing-geo-id "<id>"] [--listing-source-url "<url>"]
locate-public-records --address "<address>" [--parcel-id "<id>"] [--listing-geo-id "<id>"] [--listing-source-url "<url>"] locate-public-records --address "<address>" [--parcel-id "<id>"] [--listing-geo-id "<id>"] [--listing-source-url "<url>"]
render-report --input "<payload.json>" --output "<report.pdf>" render-report --input "<payload.json>" --output "<report.pdf>"
`); `);
@@ -17,7 +17,16 @@ Commands:
async function main(): Promise<void> { async function main(): Promise<void> {
const argv = minimist(process.argv.slice(2), { const argv = minimist(process.argv.slice(2), {
string: ["address", "parcel-id", "listing-geo-id", "listing-source-url", "input", "output"], string: [
"address",
"assessment-purpose",
"recipient-email",
"parcel-id",
"listing-geo-id",
"listing-source-url",
"input",
"output"
],
alias: { alias: {
h: "help" h: "help"
} }
@@ -35,6 +44,7 @@ async function main(): Promise<void> {
} }
const payload = await assessProperty({ const payload = await assessProperty({
address: argv.address, address: argv.address,
assessmentPurpose: argv["assessment-purpose"],
recipientEmails: argv["recipient-email"], recipientEmails: argv["recipient-email"],
output: argv.output, output: argv.output,
parcelId: argv["parcel-id"], parcelId: argv["parcel-id"],

View File

@@ -0,0 +1,66 @@
import { execFile } from "node:child_process";
import { promisify } from "node:util";
const execFileAsync = promisify(execFile);
export interface ListingDiscoveryResult {
attempts: string[];
zillowUrl: string | null;
harUrl: string | null;
}
function parseJsonOutput(raw: string, context: string): any {
const text = raw.trim();
if (!text) {
throw new Error(`${context} produced no JSON output.`);
}
return JSON.parse(text);
}
async function runDiscoveryScript(
scriptName: string,
address: string
): Promise<{ listingUrl: string | null; attempts: string[] }> {
const scriptPath = `/Users/stefano/.openclaw/workspace/skills/web-automation/scripts/${scriptName}`;
const { stdout } = await execFileAsync(process.execPath, [scriptPath, address], {
timeout: 120000,
maxBuffer: 2 * 1024 * 1024
});
const payload = parseJsonOutput(stdout, scriptName);
return {
listingUrl: typeof payload.listingUrl === "string" && payload.listingUrl.trim() ? payload.listingUrl.trim() : null,
attempts: Array.isArray(payload.attempts) ? payload.attempts.map(String) : []
};
}
export async function discoverListingSources(address: string): Promise<ListingDiscoveryResult> {
const attempts: string[] = [];
let zillowUrl: string | null = null;
let harUrl: string | null = null;
try {
const result = await runDiscoveryScript("zillow-discover.js", address);
zillowUrl = result.listingUrl;
attempts.push(...result.attempts);
} catch (error) {
attempts.push(
`Zillow discovery failed: ${error instanceof Error ? error.message : String(error)}`
);
}
try {
const result = await runDiscoveryScript("har-discover.js", address);
harUrl = result.listingUrl;
attempts.push(...result.attempts);
} catch (error) {
attempts.push(
`HAR discovery failed: ${error instanceof Error ? error.message : String(error)}`
);
}
return {
attempts,
zillowUrl,
harUrl
};
}

View File

@@ -0,0 +1,56 @@
import { execFile } from "node:child_process";
import { promisify } from "node:util";
const execFileAsync = promisify(execFile);
export type PhotoSource = "zillow" | "har";
export interface PhotoExtractionResult {
source: PhotoSource;
requestedUrl: string;
finalUrl?: string;
expectedPhotoCount?: number | null;
complete?: boolean;
photoCount: number;
imageUrls: string[];
notes: string[];
}
export interface PhotoReviewResolution {
review: Record<string, unknown>;
discoveredListingUrls: Array<{ label: string; url: string }>;
}
function parseJsonOutput(raw: string, context: string): any {
const text = raw.trim();
if (!text) {
throw new Error(`${context} produced no JSON output.`);
}
return JSON.parse(text);
}
export async function extractPhotoData(
source: PhotoSource,
url: string
): Promise<PhotoExtractionResult> {
const scriptMap: Record<PhotoSource, string> = {
zillow: "zillow-photos.js",
har: "har-photos.js"
};
const scriptPath = `/Users/stefano/.openclaw/workspace/skills/web-automation/scripts/${scriptMap[source]}`;
const { stdout } = await execFileAsync(process.execPath, [scriptPath, url], {
timeout: 180000,
maxBuffer: 5 * 1024 * 1024
});
const payload = parseJsonOutput(stdout, scriptMap[source]);
return {
source,
requestedUrl: String(payload.requestedUrl || url),
finalUrl: typeof payload.finalUrl === "string" ? payload.finalUrl : undefined,
expectedPhotoCount: typeof payload.expectedPhotoCount === "number" ? payload.expectedPhotoCount : null,
complete: Boolean(payload.complete),
photoCount: Number(payload.photoCount || 0),
imageUrls: Array.isArray(payload.imageUrls) ? payload.imageUrls.map(String) : [],
notes: Array.isArray(payload.notes) ? payload.notes.map(String) : []
};
}

View File

@@ -6,6 +6,7 @@ export class ReportValidationError extends Error {}
export interface ReportPayload { export interface ReportPayload {
recipientEmails?: string[] | string; recipientEmails?: string[] | string;
assessmentPurpose?: string;
reportTitle?: string; reportTitle?: string;
subtitle?: string; subtitle?: string;
generatedAt?: string; generatedAt?: string;
@@ -258,6 +259,7 @@ export async function renderReportPdf(
drawKeyValueTable(doc, [ drawKeyValueTable(doc, [
["Address", String(subject.address || "N/A")], ["Address", String(subject.address || "N/A")],
["Assessment Purpose", String(payload.assessmentPurpose || "N/A")],
["Ask / Last Price", currency(subject.listingPrice)], ["Ask / Last Price", currency(subject.listingPrice)],
["Type", String(subject.propertyType || "N/A")], ["Type", String(subject.propertyType || "N/A")],
["Beds / Baths", `${subject.beds ?? "N/A"} / ${subject.baths ?? "N/A"}`], ["Beds / Baths", `${subject.beds ?? "N/A"} / ${subject.baths ?? "N/A"}`],

View File

@@ -47,32 +47,120 @@ const samplePublicRecords: PublicRecordsResolution = {
} }
}; };
test("assessProperty auto-enriches public-record data from address and asks for recipient email", async () => { test("assessProperty asks for assessment purpose before building a decision-grade assessment", async () => {
const result = await assessProperty( const result = await assessProperty(
{ {
address: "4141 Whiteley Dr, Corpus Christi, TX 78418", address: "4141 Whiteley Dr, Corpus Christi, TX 78418"
listingSourceUrl: "https://www.zillow.com/homedetails/example",
listingGeoId: "233290",
parcelId: "14069438"
},
{
resolvePublicRecordsFn: async () => samplePublicRecords
} }
); );
assert.equal(result.ok, true); assert.equal(result.ok, true);
assert.equal(result.needsAssessmentPurpose, true);
assert.equal(result.needsRecipientEmails, false);
assert.equal(result.outputPath, null);
assert.match(result.message, /assessment purpose/i);
assert.equal(result.reportPayload, null);
});
test("assessProperty auto-discovers listing sources, runs Zillow photos first, and asks for recipient email", async () => {
const result = await assessProperty(
{
address: "4141 Whiteley Dr, Corpus Christi, TX 78418",
assessmentPurpose: "investment property",
listingGeoId: "233290",
parcelId: "14069438"
},
{
resolvePublicRecordsFn: async () => samplePublicRecords,
discoverListingSourcesFn: async () => ({
attempts: [
"Zillow discovery located a property page from the address.",
"HAR discovery located a property page from the address."
],
zillowUrl:
"https://www.zillow.com/homedetails/4141-Whiteley-Dr-Corpus-Christi-TX-78418/2103723704_zpid/",
harUrl:
"https://www.har.com/homedetail/4141-whiteley-dr-corpus-christi-tx-78418/14069438"
}),
extractPhotoDataFn: async (source, url) => ({
source,
requestedUrl: url,
finalUrl: url,
expectedPhotoCount: 29,
complete: true,
photoCount: 29,
imageUrls: ["https://photos.example/1.jpg", "https://photos.example/2.jpg"],
notes: [`${source} extractor succeeded.`]
})
}
);
assert.equal(result.ok, true);
assert.equal(result.needsAssessmentPurpose, false);
assert.equal(result.needsRecipientEmails, true); assert.equal(result.needsRecipientEmails, true);
assert.equal(result.outputPath, null); assert.equal(result.outputPath, null);
assert.match(result.message, /target email/i); assert.match(result.message, /target email/i);
assert.equal(result.reportPayload.subjectProperty?.address, samplePublicRecords.matchedAddress); assert.equal(result.reportPayload?.subjectProperty?.address, samplePublicRecords.matchedAddress);
assert.equal(result.reportPayload.publicRecords?.jurisdiction, "Nueces County Appraisal District"); assert.equal(result.reportPayload?.publicRecords?.jurisdiction, "Nueces County Appraisal District");
assert.equal(result.reportPayload.publicRecords?.accountNumber, "14069438"); assert.equal(result.reportPayload?.publicRecords?.accountNumber, "14069438");
assert.equal(result.reportPayload.photoReview?.status, "not completed"); assert.equal(result.reportPayload?.photoReview?.status, "completed");
assert.equal(result.reportPayload?.photoReview?.source, "zillow");
assert.equal(result.reportPayload?.photoReview?.photoCount, 29);
assert.match( assert.match(
String(result.reportPayload.verdict?.offerGuidance), String(result.reportPayload?.verdict?.offerGuidance),
/listing, photo, comp, and carry analysis still required/i /investment property/i
);
assert.match(
String(result.reportPayload?.carryView?.[0]),
/income property/i
);
assert.deepEqual(result.reportPayload?.recipientEmails, []);
});
test("assessProperty falls back to HAR when Zillow photo extraction fails", async () => {
const result = await assessProperty(
{
address: "4141 Whiteley Dr, Corpus Christi, TX 78418",
assessmentPurpose: "vacation home"
},
{
resolvePublicRecordsFn: async () => samplePublicRecords,
discoverListingSourcesFn: async () => ({
attempts: ["Address-based discovery found Zillow and HAR candidates."],
zillowUrl:
"https://www.zillow.com/homedetails/4141-Whiteley-Dr-Corpus-Christi-TX-78418/2103723704_zpid/",
harUrl:
"https://www.har.com/homedetail/4141-whiteley-dr-corpus-christi-tx-78418/14069438"
}),
extractPhotoDataFn: async (source, url) => {
if (source === "zillow") {
throw new Error(`zillow failed for ${url}`);
}
return {
source,
requestedUrl: url,
finalUrl: url,
expectedPhotoCount: 29,
complete: true,
photoCount: 29,
imageUrls: ["https://photos.har.example/1.jpg"],
notes: ["HAR extractor succeeded after Zillow failed."]
};
}
}
);
assert.equal(result.ok, true);
assert.equal(result.reportPayload?.photoReview?.status, "completed");
assert.equal(result.reportPayload?.photoReview?.source, "har");
assert.match(
String(result.reportPayload?.photoReview?.attempts?.join(" ")),
/zillow/i
);
assert.match(
String(result.reportPayload?.verdict?.offerGuidance),
/vacation home/i
); );
assert.deepEqual(result.reportPayload.recipientEmails, []);
}); });
test("assessProperty renders a PDF when recipient email is present", async () => { test("assessProperty renders a PDF when recipient email is present", async () => {
@@ -80,15 +168,22 @@ test("assessProperty renders a PDF when recipient email is present", async () =>
const result = await assessProperty( const result = await assessProperty(
{ {
address: "4141 Whiteley Dr, Corpus Christi, TX 78418", address: "4141 Whiteley Dr, Corpus Christi, TX 78418",
assessmentPurpose: "rental for my daughter in college",
recipientEmails: ["buyer@example.com"], recipientEmails: ["buyer@example.com"],
output: outputPath output: outputPath
}, },
{ {
resolvePublicRecordsFn: async () => samplePublicRecords resolvePublicRecordsFn: async () => samplePublicRecords,
discoverListingSourcesFn: async () => ({
attempts: ["No listing sources discovered from the address."],
zillowUrl: null,
harUrl: null
})
} }
); );
assert.equal(result.ok, true); assert.equal(result.ok, true);
assert.equal(result.needsAssessmentPurpose, false);
assert.equal(result.needsRecipientEmails, false); assert.equal(result.needsRecipientEmails, false);
assert.equal(result.outputPath, outputPath); assert.equal(result.outputPath, outputPath);
const stat = await fs.promises.stat(outputPath); const stat = await fs.promises.stat(outputPath);

View File

@@ -8,6 +8,7 @@ import { ReportValidationError, renderReportPdf } from "../src/report-pdf.js";
const samplePayload = { const samplePayload = {
recipientEmails: ["buyer@example.com"], recipientEmails: ["buyer@example.com"],
assessmentPurpose: "investment property",
reportTitle: "Property Assessment Report", reportTitle: "Property Assessment Report",
subjectProperty: { subjectProperty: {
address: "4141 Whiteley Dr, Corpus Christi, TX 78418", address: "4141 Whiteley Dr, Corpus Christi, TX 78418",

View File

@@ -1,3 +1,7 @@
node_modules/ node_modules/
*.log *.log
.DS_Store .DS_Store
scripts/page-*.html
scripts/tmp_*.mjs
scripts/expedia_dump.mjs
scripts/pnpm-workspace.yaml

View File

@@ -66,6 +66,8 @@ pnpm rebuild better-sqlite3 esbuild
## Quick Reference ## Quick Reference
- Install check: `node check-install.js` - Install check: `node check-install.js`
- Zillow listing discovery from address: `node scripts/zillow-discover.js "4141 Whiteley Dr, Corpus Christi, TX 78418"`
- HAR listing discovery from address: `node scripts/har-discover.js "4141 Whiteley Dr, Corpus Christi, TX 78418"`
- One-shot JSON extract: `node scripts/extract.js "https://example.com"` - One-shot JSON extract: `node scripts/extract.js "https://example.com"`
- Zillow photo URLs: `node scripts/zillow-photos.js "https://www.zillow.com/homedetails/..."` - Zillow photo URLs: `node scripts/zillow-photos.js "https://www.zillow.com/homedetails/..."`
- HAR photo URLs: `node scripts/har-photos.js "https://www.har.com/homedetail/..."` - HAR photo URLs: `node scripts/har-photos.js "https://www.har.com/homedetail/..."`
@@ -134,10 +136,17 @@ npx tsx flow.ts --instruction 'go to https://search.fiorinis.com then type "pipp
Use the dedicated extractors before trying a free-form gallery flow. Use the dedicated extractors before trying a free-form gallery flow.
- Zillow discovery: `node scripts/zillow-discover.js "<street-address>"`
- HAR discovery: `node scripts/har-discover.js "<street-address>"`
- Zillow: `node scripts/zillow-photos.js "<listing-url>"` - Zillow: `node scripts/zillow-photos.js "<listing-url>"`
- HAR: `node scripts/har-photos.js "<listing-url>"` - HAR: `node scripts/har-photos.js "<listing-url>"`
These scripts are purpose-built for the common `See all photos` / `Show all photos` workflow: The discovery scripts are purpose-built for the common address-to-listing workflow:
- open the site search or address URL
- resolve or identify a matching listing page when possible
- return the direct listing URL as JSON
The photo scripts are purpose-built for the common `See all photos` / `Show all photos` workflow:
- open the listing page - open the listing page
- click the all-photos entry point - click the all-photos entry point
- wait for the resulting photo page or scroller view - wait for the resulting photo page or scroller view

View File

@@ -0,0 +1,139 @@
#!/usr/bin/env node
import {
createPageSession,
dismissCommonOverlays,
fail,
gotoListing,
sleep,
} from "./real-estate-photo-common.js";
function parseAddress(rawAddress) {
const address = String(rawAddress || "").trim();
if (!address) {
fail("Missing address.");
}
return address;
}
function buildSearchUrl(address) {
return `https://www.har.com/search/?q=${encodeURIComponent(address)}`;
}
function buildAddressTokens(address) {
return address
.toLowerCase()
.replace(/[^a-z0-9\s]/g, " ")
.split(/\s+/)
.filter(Boolean)
.filter((token) => !new Set(["tx", "dr", "st", "rd", "ave", "blvd", "ct", "ln", "cir"]).has(token));
}
function normalizeListingUrl(url) {
try {
const parsed = new URL(url);
parsed.search = "";
parsed.hash = "";
return parsed.toString();
} catch {
return null;
}
}
async function collectListingUrl(page) {
return page.evaluate(() => {
const toAbsolute = (href) => {
try {
return new URL(href, location.href).toString();
} catch {
return null;
}
};
const candidates = [];
for (const anchor of document.querySelectorAll('a[href*="/homedetail/"]')) {
const href = anchor.getAttribute("href");
if (!href) continue;
const absolute = toAbsolute(href);
if (!absolute) continue;
const text = (anchor.textContent || "").replace(/\s+/g, " ").trim();
const parentText = (anchor.parentElement?.textContent || "").replace(/\s+/g, " ").trim();
candidates.push({
url: absolute,
text,
parentText,
});
}
const unique = [];
for (const candidate of candidates) {
if (!unique.some((item) => item.url === candidate.url)) unique.push(candidate);
}
return unique;
});
}
async function main() {
const address = parseAddress(process.argv[2]);
const searchUrl = buildSearchUrl(address);
const { context, page } = await createPageSession({ headless: process.env.HEADLESS !== "false" });
try {
const attempts = [`Opened HAR search URL: ${searchUrl}`];
await gotoListing(page, searchUrl, 2500);
await dismissCommonOverlays(page);
await sleep(1500);
let listingUrl = null;
const addressTokens = buildAddressTokens(address);
if (page.url().includes("/homedetail/")) {
listingUrl = normalizeListingUrl(page.url());
attempts.push("HAR search URL resolved directly to a property page.");
} else {
const discovered = await collectListingUrl(page);
const scored = discovered
.map((candidate) => {
const haystack = `${candidate.url} ${candidate.text} ${candidate.parentText}`.toLowerCase();
const score = addressTokens.reduce(
(total, token) => total + (haystack.includes(token) ? 1 : 0),
0
);
return { ...candidate, score };
})
.sort((a, b) => b.score - a.score);
if (scored[0] && scored[0].score >= Math.min(3, addressTokens.length)) {
listingUrl = normalizeListingUrl(scored[0].url);
attempts.push(`HAR search results exposed a matching homedetail link with score ${scored[0].score}.`);
} else {
attempts.push("HAR discovery did not expose a confident homedetail match for this address.");
}
}
process.stdout.write(
`${JSON.stringify(
{
source: "har",
address,
searchUrl,
finalUrl: page.url(),
title: await page.title(),
listingUrl,
attempts,
},
null,
2
)}\n`
);
await context.close();
} catch (error) {
try {
await context.close();
} catch {
// Ignore close errors after the primary failure.
}
fail("HAR discovery failed.", error instanceof Error ? error.message : String(error));
}
}
main();

View File

@@ -6,10 +6,12 @@
"scripts": { "scripts": {
"check-install": "node check-install.js", "check-install": "node check-install.js",
"extract": "node extract.js", "extract": "node extract.js",
"har-discover": "node har-discover.js",
"har-photos": "node har-photos.js", "har-photos": "node har-photos.js",
"browse": "tsx browse.ts", "browse": "tsx browse.ts",
"scrape": "tsx scrape.ts", "scrape": "tsx scrape.ts",
"test:photos": "node --test real-estate-photo-common.test.mjs zillow-photo-data.test.mjs", "test:photos": "node --test real-estate-photo-common.test.mjs zillow-photo-data.test.mjs",
"zillow-discover": "node zillow-discover.js",
"zillow-photos": "node zillow-photos.js", "zillow-photos": "node zillow-photos.js",
"fetch-browser": "npx cloakbrowser install" "fetch-browser": "npx cloakbrowser install"
}, },

View File

@@ -0,0 +1,118 @@
#!/usr/bin/env node
import {
createPageSession,
dismissCommonOverlays,
fail,
gotoListing,
sleep,
} from "./real-estate-photo-common.js";
function parseAddress(rawAddress) {
const address = String(rawAddress || "").trim();
if (!address) {
fail("Missing address.");
}
return address;
}
function buildSearchUrl(address) {
const slug = address
.replace(/,/g, "")
.replace(/#/g, "")
.trim()
.split(/\s+/)
.join("-");
return `https://www.zillow.com/homes/${encodeURIComponent(slug)}_rb/`;
}
function normalizeListingUrl(url) {
try {
const parsed = new URL(url);
parsed.search = "";
parsed.hash = "";
return parsed.toString();
} catch {
return null;
}
}
async function collectListingUrl(page) {
return page.evaluate(() => {
const toAbsolute = (href) => {
try {
return new URL(href, location.href).toString();
} catch {
return null;
}
};
const candidates = [];
for (const anchor of document.querySelectorAll('a[href*="/homedetails/"]')) {
const href = anchor.getAttribute("href");
if (!href) continue;
const absolute = toAbsolute(href);
if (!absolute) continue;
candidates.push(absolute);
}
const unique = [];
for (const candidate of candidates) {
if (!unique.includes(candidate)) unique.push(candidate);
}
return unique[0] || null;
});
}
async function main() {
const address = parseAddress(process.argv[2]);
const searchUrl = buildSearchUrl(address);
const { context, page } = await createPageSession({ headless: process.env.HEADLESS !== "false" });
try {
const attempts = [`Opened Zillow address search URL: ${searchUrl}`];
await gotoListing(page, searchUrl, 2500);
await dismissCommonOverlays(page);
await sleep(1500);
let listingUrl = null;
if (page.url().includes("/homedetails/")) {
listingUrl = normalizeListingUrl(page.url());
attempts.push("Zillow search URL resolved directly to a property page.");
} else {
const discovered = await collectListingUrl(page);
if (discovered) {
listingUrl = normalizeListingUrl(discovered);
attempts.push("Zillow search results exposed a homedetails link.");
} else {
attempts.push("Zillow discovery did not expose a homedetails link for this address.");
}
}
process.stdout.write(
`${JSON.stringify(
{
source: "zillow",
address,
searchUrl,
finalUrl: page.url(),
title: await page.title(),
listingUrl,
attempts,
},
null,
2
)}\n`
);
await context.close();
} catch (error) {
try {
await context.close();
} catch {
// Ignore close errors after the primary failure.
}
fail("Zillow discovery failed.", error instanceof Error ? error.message : String(error));
}
}
main();