Compliance And Failure Modes
This reference is operational guidance, not legal advice. The operator is responsible for making sure a run complies with Amazon terms, robots directives, local law, and account obligations.
Required Guardrails
- Fetch and evaluate https://www.amazon.com/robots.txt before live scraping planned Amazon paths.
- Stop if the effective rules disallow the planned search or detail paths.
- Do not automate sign-in, checkout, cart, wishlist, review submission, customer-review pages, reviewer profiles, or any disallowed path.
- Do not bypass CAPTCHA, bot checks, blocked pages, or access-denied pages.
- Do not print cookies, profile state, session storage, or account/location-specific browser data.
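The robots check in the first two guardrails can be sketched with Python's standard urllib.robotparser. This is a minimal illustration, not a definitive implementation: the function name disallowed_urls, the user agent string, and the sample rules are all assumptions made for the example, and the sample rules are not Amazon's real robots.txt. In a live run the text would come from fetching the robots URL first.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_URL = "https://www.amazon.com/robots.txt"  # fetched before any live run


def disallowed_urls(robots_text: str, user_agent: str, planned_urls: list[str]) -> list[str]:
    """Return the planned URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [url for url in planned_urls if not parser.can_fetch(user_agent, url)]


# Illustrative rules only (assumption): NOT the real contents of ROBOTS_URL.
SAMPLE_RULES = """\
User-agent: *
Disallow: /gp/cart
Disallow: /product-reviews/
"""

planned = [
    "https://www.amazon.com/dp/B000000000",
    "https://www.amazon.com/product-reviews/B000000000",
]

# "example-research-bot" is a hypothetical user agent name.
blocked = disallowed_urls(SAMPLE_RULES, "example-research-bot", planned)
if blocked:
    print("Stop: robots rules disallow", blocked)
```

Parsing from text rather than calling the parser's own network fetch keeps the check easy to test offline and makes the evaluated rules explicit in logs.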
Allowed Scope
Allowed behavior is bounded read-only product research over search result pages and normalized product detail pages:
- /s?k=<query> search results.
- /dp/<ASIN> product details.
- /gp/product/<ASIN> product details.
Review data is limited to visible summary ratings/counts and visible histogram rows on search/detail pages. Do not navigate to /product-reviews, /review, /gp/customer-reviews, or review AJAX endpoints.
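One way to enforce this scope is to classify every URL before it is requested. The sketch below is illustrative: classify_url and its return labels are hypothetical names, the 10-character ASIN pattern is an assumption, and the host is not checked. Review AJAX endpoints are not enumerated here because the text does not name them; anything unrecognized falls through to "out_of_scope".

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: an ASIN is 10 uppercase alphanumeric characters.
DETAIL_PATH = re.compile(r"^/(?:dp|gp/product)/([A-Z0-9]{10})/?$")

# Review paths that must never be navigated to.
FORBIDDEN_PREFIXES = ("/product-reviews", "/review", "/gp/customer-reviews")


def classify_url(url: str) -> str:
    """Return 'search', 'detail', 'forbidden', or 'out_of_scope' for a URL."""
    parts = urlparse(url)
    path = parts.path
    if path.startswith(FORBIDDEN_PREFIXES):
        return "forbidden"
    # Search pages must carry a non-empty k=<query> parameter.
    if path == "/s" and parse_qs(parts.query).get("k"):
        return "search"
    if DETAIL_PATH.match(path):
        return "detail"
    return "out_of_scope"


for candidate in (
    "https://www.amazon.com/s?k=usb+cable",
    "https://www.amazon.com/dp/B000000000",
    "https://www.amazon.com/product-reviews/B000000000",
):
    print(candidate, "->", classify_url(candidate))
```

Only URLs classified as "search" or "detail" would proceed to a fetch; everything else is rejected by default rather than by an explicit deny rule.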
Failure Modes
Return a structured warning and do not claim success when any of these happen:
- CAPTCHA or bot-check page.
- Sign-in wall.
- HTTP 429 or 503 that remains after the bounded retry budget.
- Robots rules disallow a planned path.
- Product markup changes enough that required fields cannot be found.
- Amazon returns localized, personalized, or ZIP/session-dependent delivery text that cannot be verified.
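The bounded retry budget and the structured warning can be sketched as follows. Everything named here is an assumption for illustration: the fetch_with_budget helper, the three-attempt budget, the exponential delay, and the warning codes are not specified by this reference. The fetch argument is any callable returning a (status, body) pair, so the example runs without network access.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {429, 503}


def fetch_with_budget(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    max_attempts: int = 3,      # assumed budget, not taken from the reference
    base_delay: float = 1.0,    # assumed starting delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Fetch with a bounded retry budget and return a structured result."""
    attempts = 0
    status, body = None, ""
    while attempts < max_attempts:
        status, body = fetch(url)
        attempts += 1
        if status not in RETRYABLE_STATUSES:
            break
        if attempts < max_attempts:
            # Exponential backoff between attempts: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** (attempts - 1))
    if status == 200:
        return {"ok": True, "url": url, "body": body, "warnings": []}
    code = "retry_budget_exhausted" if status in RETRYABLE_STATUSES else "unexpected_status"
    warning = {"code": code, "status": status, "attempts": attempts}
    # Never claim success: ok is False and the body is withheld.
    return {"ok": False, "url": url, "body": None, "warnings": [warning]}


def always_throttled(url: str) -> Tuple[int, str]:
    """Stand-in fetcher that simulates persistent throttling."""
    return 503, ""


result = fetch_with_budget(
    always_throttled, "https://www.amazon.com/dp/B000000000", sleep=lambda s: None
)
print(result["ok"], result["warnings"])
```

The other failure modes (CAPTCHA pages, sign-in walls, markup changes) would add their own warning codes to the same result shape; detecting them depends on page content that this sketch does not attempt to model.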
Output Rules
- Unknown fields stay unknown.
- Partial extraction is acceptable only when the response includes warnings and missing-field notes.
- Sponsored products can be returned by default but must be labeled.
- Counts above 30 require operator confirmation or batch splitting.
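The output rules can be sketched as a small record builder plus a batch splitter. The field names, the split_into_batches and build_record helpers, and the reading of "count" as the number of products requested in one run are assumptions made for this example. Unknown values stay None instead of being guessed, and that includes the sponsored flag.

```python
from typing import Any, Dict, List

MAX_COUNT_PER_BATCH = 30

# Hypothetical field set; the real schema is not defined in this reference.
REQUIRED_FIELDS = ("asin", "title", "price", "rating", "sponsored")


def split_into_batches(count: int, limit: int = MAX_COUNT_PER_BATCH) -> List[int]:
    """Split a requested count into batch sizes no larger than the limit."""
    if count < 1:
        raise ValueError("count must be positive")
    full, remainder = divmod(count, limit)
    return [limit] * full + ([remainder] if remainder else [])


def build_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Copy known fields, keep unknown ones as None, and note what is missing."""
    record: Dict[str, Any] = {name: raw.get(name) for name in REQUIRED_FIELDS}
    missing = [name for name, value in record.items() if value is None]
    record["missing_fields"] = missing
    record["warnings"] = [f"missing field: {name}" for name in missing]
    return record


print(split_into_batches(75))
print(build_record({"asin": "B000000000", "title": "Example item", "sponsored": True}))
```

Splitting is shown here as the automatic alternative to operator confirmation; a run could equally stop and ask before exceeding the limit.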