us-cpa
us-cpa is a Python CLI plus OpenClaw skill wrapper for U.S. federal individual tax work.
Standalone package usage
From skills/us-cpa/:
pip install -e .[dev]
us-cpa --help
Without installing, the repo-local wrapper works directly:
skills/us-cpa/scripts/us-cpa --help
OpenClaw installation
To install the skill for OpenClaw itself, copy the repo skill into the workspace skill directory and install its Python dependencies there.
- Sync the repo copy into the workspace:
rsync -a --delete \
~/.openclaw/workspace/projects/stef-openclaw-skills/skills/us-cpa/ \
~/.openclaw/workspace/skills/us-cpa/
- Create a workspace-local virtualenv and install the package:
cd ~/.openclaw/workspace/skills/us-cpa
python3 -m venv .venv
. .venv/bin/activate
pip install -e .[dev]
- Verify the installed workspace wrapper:
~/.openclaw/workspace/skills/us-cpa/scripts/us-cpa --help
The wrapper prefers .venv/bin/python inside the skill directory when present, so OpenClaw can run the workspace copy without relying on global Python packages.
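The interpreter preference described above can be illustrated with a minimal sketch. This is not the wrapper's actual code; the function name pick_interpreter is hypothetical, and falling back to the current interpreter is an assumption about what "when present" implies.

```python
import sys
from pathlib import Path


def pick_interpreter(skill_dir: Path) -> str:
    """Return the skill-local venv interpreter if present, else the current one.

    Hypothetical helper: the real wrapper may resolve this differently.
    """
    venv_python = skill_dir / ".venv" / "bin" / "python"
    if venv_python.is_file():
        return str(venv_python)
    # Assumed fallback: whatever interpreter is already running.
    return sys.executable
```

With this shape, a workspace copy that has its own .venv never touches globally installed packages.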
Current Milestone
The current implementation includes:
- deterministic cache layout under ~/.cache/us-cpa by default
- fetch-year download flow for the bootstrap IRS corpus
- source manifest with URL, hash, authority rank, and local path traceability
- primary-law URL building for IRC and Treasury regulation escalation
- case-folder intake, document registration, and machine-usable fact extraction from JSON, text, and PDF inputs
- question workflow with conversation and memo output
- prepare workflow for the current supported multi-form 1040 package
- review workflow with findings-first output
- fillable-PDF first rendering with overlay fallback
- e-file-ready draft export payload generation
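As an illustration of the source manifest item in the list above, the following sketch shows one way an entry with URL, hash, authority rank, and local path could be represented. The SourceEntry class, its field names, and the choice of SHA-256 are all assumptions; the actual manifest schema is not documented here.

```python
import hashlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SourceEntry:
    """Hypothetical manifest record; field names are illustrative only."""
    url: str
    sha256: str
    authority_rank: int
    local_path: str


def make_entry(url: str, content: bytes, authority_rank: int, local_path: str) -> SourceEntry:
    # Hash the downloaded bytes so the cached file can be traced and re-verified.
    # SHA-256 is an assumption; the README only says "hash".
    digest = hashlib.sha256(content).hexdigest()
    return SourceEntry(url=url, sha256=digest, authority_rank=authority_rank, local_path=local_path)


entry = make_entry("https://example.invalid/f1040.pdf", b"abc", 1, "2025/forms/f1040.pdf")
record = asdict(entry)  # plain dict, ready for JSON serialization
```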
CLI Surface
skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025
skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025 --style memo --format markdown
skills/us-cpa/scripts/us-cpa prepare --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa review --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
skills/us-cpa/scripts/us-cpa extract-docs --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe --create-case --case-label "Jane Doe" --facts-json ./facts.json
skills/us-cpa/scripts/us-cpa render-forms --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa export-efile-ready --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
Tax-Year Cache
Default cache root:
~/.cache/us-cpa
Override for isolated runs:
US_CPA_CACHE_DIR=/tmp/us-cpa-cache skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
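A minimal sketch of how the override could be resolved in Python, assuming the environment variable simply replaces the default root. The function name resolve_cache_root is hypothetical and not taken from the package.

```python
import os
from pathlib import Path


def resolve_cache_root(env=None) -> Path:
    """Return the cache root, honoring US_CPA_CACHE_DIR when it is set."""
    env = os.environ if env is None else env
    override = env.get("US_CPA_CACHE_DIR")
    if override:
        return Path(override).expanduser()
    # Default documented above: ~/.cache/us-cpa
    return Path.home() / ".cache" / "us-cpa"
```

Passing the environment as a parameter keeps the lookup easy to test without mutating os.environ.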
Current fetch-year bootstrap corpus for tax year 2025 is verified against live IRS irs-prior PDFs for:
- Form 1040
- Schedules 1, 2, 3, A, B, C, D, E, SE, and 8812
- Forms 8949, 4562, 4797, 6251, 8606, 8863, 8889, 8959, 8960, 8995, 8995-A, 5329, 5695, and 1116
- General Form 1040 instructions and selected schedule/form instructions
Current bundled tax-year computation data:
- 2024
- 2025
Other years fetch/source correctly, but deterministic return calculations currently stop with an explicit unsupported-year error until rate tables are added.
Adding a new supported year is a deliberate data-table change in tax_years.py, not an automatic runtime discovery step. That is intentional for tax-engine correctness.
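The explicit-table approach might look roughly like the sketch below. The names SUPPORTED_TAX_YEARS, UnsupportedTaxYearError, and require_supported_year are hypothetical; the real contents of tax_years.py are not shown in this README.

```python
# Assumption: the supported years mirror the bundled computation data listed above.
SUPPORTED_TAX_YEARS = frozenset({2024, 2025})


class UnsupportedTaxYearError(ValueError):
    """Raised when no bundled rate tables exist for the requested year."""


def require_supported_year(tax_year: int) -> int:
    # Fail loudly instead of guessing rates for a year without data tables.
    if tax_year not in SUPPORTED_TAX_YEARS:
        supported = ", ".join(str(y) for y in sorted(SUPPORTED_TAX_YEARS))
        raise UnsupportedTaxYearError(
            f"tax year {tax_year} has no bundled computation data (supported: {supported})"
        )
    return tax_year
```

Keeping the check separate from fetching means a year can be downloaded and inspected before its rate tables exist.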
Interaction Model
- question
  - stateless by default
  - optional case context
- prepare
  - requires a case directory
  - if none exists, OpenClaw should ask whether to create one and where
- review
  - requires a case directory
  - can operate on an existing or newly-created review case
Planned Case Layout
<case-dir>/
  input/
  extracted/
  return/
  output/
  reports/
  issues/
  sources/
Current implementation writes:
- case-manifest.json
- extracted/facts.json
- issues/open-issues.json
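A small sketch of creating the planned layout, assuming every listed directory sits directly under the case directory. The helper create_case_layout is hypothetical and only illustrates the structure.

```python
from pathlib import Path

# Subdirectories from the planned case layout above.
CASE_SUBDIRS = ("input", "extracted", "return", "output", "reports", "issues", "sources")


def create_case_layout(case_dir: Path) -> list:
    """Create the case directory and its subdirectories; safe to call repeatedly."""
    created = []
    for name in CASE_SUBDIRS:
        path = case_dir / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```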
Intake Flow
Current extract-docs supports:
- --create-case
- --case-label
- --facts-json <path>
- repeated --input-file <path>
Behavior:
- creates the full case directory layout when --create-case is used
- copies input documents into input/
- stores normalized facts with source metadata in extracted/facts.json
- extracts machine-usable facts from JSON/text/PDF documents where supported
- appends document registry entries to case-manifest.json
- stops with a structured issue and non-zero exit if a new fact conflicts with an existing stored fact
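The conflict rule in the behavior list above can be sketched as follows. This is an illustration under assumptions: facts are treated as a flat mapping, and the names merge_facts and exit_code_for, the issue fields, and exit code 1 are hypothetical (the README only promises a structured issue and a non-zero exit).

```python
def merge_facts(stored: dict, incoming: dict):
    """Merge incoming facts into stored ones, collecting conflicts instead of overwriting."""
    merged = dict(stored)
    conflicts = []
    for key, value in incoming.items():
        if key in stored and stored[key] != value:
            # Keep the stored value and report the disagreement for user resolution.
            conflicts.append({"field": key, "stored": stored[key], "incoming": value})
        else:
            merged[key] = value
    return merged, conflicts


def exit_code_for(conflicts: list) -> int:
    # Assumption: 1 stands in for "non-zero"; the actual code is not documented.
    return 1 if conflicts else 0


merged, conflicts = merge_facts({"wages": 50000}, {"wages": 52000, "taxableInterest": 120})
```

In this example the wages disagreement is reported and the stored value is left untouched, while the new taxableInterest fact is accepted.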
Output Contract
- JSON by default
- markdown available with --format markdown
- question supports:
  - --style conversation
  - --style memo
- question emits answered analysis output
- prepare emits a prepared return package summary
- export-efile-ready emits a draft e-file-ready payload
- review emits a findings-first review result
- fetch-year emits a downloaded manifest location and source count
Question Engine
Current question implementation:
- loads the cached tax-year corpus
- searches a small IRS-first topical rule set
- returns one canonical analysis object
- renders that analysis as:
  - conversational output
  - memo output
- marks questions outside the current topical rule set as requiring primary-law escalation
Current implemented topics:
- standard deduction
- Schedule C / sole proprietorship reporting trigger
- Schedule D / capital gains reporting trigger
- Schedule E / rental income reporting trigger
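To make the topical matching and escalation behavior concrete, here is a simplified sketch. The keyword table, the function classify_question, and the result keys are hypothetical; the real rule set likely uses richer matching than substring checks.

```python
# Hypothetical keyword table covering the implemented topics listed above.
TOPIC_KEYWORDS = {
    "standard deduction": ("standard deduction",),
    "schedule c": ("schedule c", "sole proprietor"),
    "schedule d": ("schedule d", "capital gain"),
    "schedule e": ("schedule e", "rental income"),
}


def classify_question(question: str) -> dict:
    """Map a question to a known topic, or mark it for primary-law escalation."""
    text = question.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return {"topic": topic, "requiresPrimaryLawEscalation": False}
    # Outside the topical rule set: defer to IRC / Treasury regulation research.
    return {"topic": None, "requiresPrimaryLawEscalation": True}
```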
Form Rendering
Current rendering path:
- official IRS PDFs from the cached tax-year corpus
- deterministic field-fill when usable AcroForm fields are present
- overlay rendering onto those official PDFs using reportlab + pypdf as fallback
- artifact manifest written to output/artifacts.json
Current rendered form support:
- field-fill support for known mapped fillable forms
- overlay generation for the current required-form set resolved by the return model
Current review rule:
- field-filled artifacts are not automatically flagged for review
- overlay-rendered artifacts are marked reviewRequired: true
Overlay coordinates are currently a fallback heuristic and are not treated as line-perfect authoritative field maps. Overlay output must be visually reviewed before any filing/export handoff.
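The field-fill-first rule and its review flag can be summarized in a short sketch. Only reviewRequired comes from this README; the function plan_artifact and the form and renderMode keys are assumptions about how an artifact record might look.

```python
def plan_artifact(form_id: str, has_usable_fields: bool) -> dict:
    """Choose a render mode for one form and set the review flag accordingly."""
    if has_usable_fields:
        # Deterministic AcroForm fill: not automatically flagged for review.
        return {"form": form_id, "renderMode": "field-fill", "reviewRequired": False}
    # Overlay coordinates are heuristic, so a human must inspect the result.
    return {"form": form_id, "renderMode": "overlay", "reviewRequired": True}


artifacts = [plan_artifact("f1040", True), plan_artifact("f8949", False)]
needs_review = [a["form"] for a in artifacts if a["reviewRequired"]]
```

The form identifiers and which forms have usable fields are made up for the example.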
Preparation Workflow
Current prepare implementation:
- loads case facts from extracted/facts.json
- normalizes them into the current supported federal return model
- preserves source provenance for normalized values
- computes the current supported 1040 package
- resolves required forms across the current supported subset
- writes:
  - return/normalized-return.json
  - output/artifacts.json
  - reports/prepare-summary.json
Current supported calculation inputs:
- filingStatus
- spouse.fullName
- dependents
- wages
- taxableInterest
- businessIncome
- capitalGainLoss
- rentalIncome
- federalWithholding
- itemizedDeductions
- hsaContribution
- educationCredit
- foreignTaxCredit
- qualifiedBusinessIncome
- traditionalIraBasis
- additionalMedicareTax
- netInvestmentIncomeTax
- alternativeMinimumTax
- additionalTaxPenalty
- energyCredit
- depreciationExpense
- section1231GainLoss
E-file-ready Export
export-efile-ready writes:
output/efile-ready.json
Current export behavior:
- draft-only
- includes required forms
- includes refund or balance due summary
- includes attachment manifest
- includes unresolved issues
Review Workflow
Current review implementation:
- recomputes the return from current case facts
- compares stored normalized return values to recomputed values
- flags source-fact mismatches for key income fields
- flags likely omitted income when document-extracted facts support an amount the stored return omits
- checks whether required rendered artifacts are present
- flags high-complexity forms for specialist follow-up
- flags overlay-rendered artifacts as requiring human review
- sorts findings by severity
Current render modes:
- --style conversation
- --style memo
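Two of the review steps, comparing stored values to recomputed ones and sorting findings by severity, might be sketched like this. The severity labels, the finding shape, and the helper names are hypothetical, and every mismatch is treated as high severity purely for illustration.

```python
# Hypothetical severity scale; lower number sorts first.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def compare_returns(stored: dict, recomputed: dict) -> list:
    """Emit one finding per field whose stored value differs from the recomputed one."""
    findings = []
    for field in sorted(set(stored) | set(recomputed)):
        if stored.get(field) != recomputed.get(field):
            findings.append({
                "field": field,
                "stored": stored.get(field),
                "recomputed": recomputed.get(field),
                "severity": "high",  # assumption: mismatches are always high
            })
    return findings


def sort_findings(findings: list) -> list:
    # Unknown severities sort last rather than raising.
    return sorted(findings, key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)))


mismatches = compare_returns({"wages": 50000, "taxableInterest": 0},
                             {"wages": 50000, "taxableInterest": 120})
```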
Scope Rules
- U.S. federal individual returns only in v1
- official IRS artifacts are the target output for compiled forms
- conflicting facts must stop the workflow for user resolution
Authority Ranking
Current authority classes are ranked to preserve source hierarchy:
- IRS forms
- IRS instructions
- IRS publications
- IRS FAQs
- Internal Revenue Code
- Treasury regulations
- other primary authority
Later research and review flows should consume this ranking rather than inventing their own.
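One way for later flows to consume a shared ranking is an integer enum, sketched below. The class name AuthorityRank, its member names, and the numeric values are assumptions; only the ordering is taken from the list above.

```python
from enum import IntEnum


class AuthorityRank(IntEnum):
    """Hypothetical encoding of the authority list above; lower value ranks first."""
    IRS_FORM = 1
    IRS_INSTRUCTIONS = 2
    IRS_PUBLICATION = 3
    IRS_FAQ = 4
    INTERNAL_REVENUE_CODE = 5
    TREASURY_REGULATION = 6
    OTHER_PRIMARY_AUTHORITY = 7


def order_sources(sources: list) -> list:
    """Sort source records by their shared authority rank."""
    return sorted(sources, key=lambda s: s["rank"])


ranked = order_sources([
    {"title": "Treasury regulation", "rank": AuthorityRank.TREASURY_REGULATION},
    {"title": "Form 1040", "rank": AuthorityRank.IRS_FORM},
    {"title": "IRS publication", "rank": AuthorityRank.IRS_PUBLICATION},
])
```

Because every flow sorts on the same enum, no workflow needs to define its own ordering.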