# us-cpa

`us-cpa` is a Python CLI plus OpenClaw skill wrapper for U.S. federal individual tax work.

## Standalone package usage

From `skills/us-cpa/`:

```bash
# Quote the extras so shells such as zsh do not treat [dev] as a glob pattern.
pip install -e '.[dev]'
us-cpa --help
```

Without installing, the repo-local wrapper works directly:

```bash
skills/us-cpa/scripts/us-cpa --help
```

## OpenClaw installation

To install the skill for OpenClaw itself, copy the repo skill into the workspace skill directory and install its Python dependencies there.

1. Sync the repo copy into the workspace:

```bash
rsync -a --delete \
  /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/skills/us-cpa/ \
  /Users/stefano/.openclaw/workspace/skills/us-cpa/
```

2. Create a workspace-local virtualenv and install the package:

```bash
cd /Users/stefano/.openclaw/workspace/skills/us-cpa
python3 -m venv .venv
. .venv/bin/activate
# Quote the extras so shells such as zsh do not treat [dev] as a glob pattern.
pip install -e '.[dev]'
```

3. Verify the installed workspace wrapper:

```bash
~/.openclaw/workspace/skills/us-cpa/scripts/us-cpa --help
```

The wrapper prefers `.venv/bin/python` inside the skill directory when present, so OpenClaw can run the workspace copy without relying on global Python packages.

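The interpreter preference described above could look roughly like the following. This is a minimal sketch, not the wrapper's actual code: the function name `pick_interpreter` and the fallback to the currently running interpreter are assumptions made for illustration.

```python
import sys
from pathlib import Path


def pick_interpreter(skill_dir: Path) -> Path:
    """Return the skill-local venv interpreter if present, else the running one.

    Hypothetical helper: the real wrapper may resolve its fallback differently.
    """
    venv_python = skill_dir / ".venv" / "bin" / "python"
    if venv_python.is_file():
        return venv_python
    # Assumed fallback: whatever interpreter is executing this script.
    return Path(sys.executable)
```

Checking for the file rather than the `.venv` directory avoids picking up a half-created virtualenv.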
## Current Milestone

The current implementation includes:

- deterministic cache layout under `~/.cache/us-cpa` by default
- `fetch-year` download flow for the bootstrap IRS corpus
- source manifest with URL, hash, authority rank, and local path traceability
- primary-law URL building for IRC and Treasury regulation escalation
- case-folder intake, document registration, and machine-usable fact extraction from JSON, text, and PDF inputs
- question workflow with conversation and memo output
- prepare workflow for the current supported multi-form 1040 package
- review workflow with findings-first output
- fillable-PDF-first rendering with overlay fallback
- e-file-ready draft export payload generation

## CLI Surface

```bash
skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025
skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025 --style memo --format markdown
skills/us-cpa/scripts/us-cpa prepare --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa review --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
skills/us-cpa/scripts/us-cpa extract-docs --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe --create-case --case-label "Jane Doe" --facts-json ./facts.json
skills/us-cpa/scripts/us-cpa render-forms --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa export-efile-ready --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
```

## Tax-Year Cache

Default cache root:

```text
~/.cache/us-cpa
```

Override for isolated runs:

```bash
US_CPA_CACHE_DIR=/tmp/us-cpa-cache skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
```

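For orientation, cache-root resolution with the `US_CPA_CACHE_DIR` override might be written as below. This is a sketch under assumptions: `resolve_cache_root` and `DEFAULT_CACHE_ROOT` are illustrative names, and treating an empty variable as unset is a guess rather than documented behavior.

```python
import os
from pathlib import Path

# Default taken from this README; the constant name is hypothetical.
DEFAULT_CACHE_ROOT = Path("~/.cache/us-cpa")


def resolve_cache_root(env=None) -> Path:
    """Return the cache root, honoring US_CPA_CACHE_DIR when it is set."""
    env = os.environ if env is None else env
    override = env.get("US_CPA_CACHE_DIR")
    if override:  # assumption: an empty value falls back to the default
        return Path(override).expanduser()
    return DEFAULT_CACHE_ROOT.expanduser()
```
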
The current `fetch-year` bootstrap corpus for tax year `2025` is verified against live IRS `irs-prior` PDFs for:

- Form 1040
- Schedules 1, 2, 3, A, B, C, D, E, SE, and 8812
- Forms 8949, 4562, 4797, 6251, 8606, 8863, 8889, 8959, 8960, 8995, 8995-A, 5329, 5695, and 1116
- General Form 1040 instructions and selected schedule/form instructions

Current bundled tax-year computation data:

- 2024
- 2025

Other years fetch and source correctly, but deterministic return calculations stop with an explicit unsupported-year error until rate tables are added for them.

Adding a new supported year is a deliberate data-table change in `tax_years.py`, not an automatic runtime discovery step. That is intentional for tax-engine correctness.

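The explicit unsupported-year stop can be pictured with a small guard like the one below. It is only a sketch: the names `SUPPORTED_TAX_YEARS`, `UnsupportedTaxYearError`, and `require_supported_year` are hypothetical and are not claimed to match what `tax_years.py` actually defines.

```python
# Years with bundled computation data, per this README. Names are illustrative.
SUPPORTED_TAX_YEARS = frozenset({2024, 2025})


class UnsupportedTaxYearError(ValueError):
    """Raised when no rate tables are bundled for the requested year."""


def require_supported_year(tax_year: int) -> int:
    """Return the year unchanged, or fail loudly instead of guessing rates."""
    if tax_year not in SUPPORTED_TAX_YEARS:
        supported = ", ".join(str(y) for y in sorted(SUPPORTED_TAX_YEARS))
        raise UnsupportedTaxYearError(
            f"tax year {tax_year} has no bundled computation data "
            f"(supported: {supported})"
        )
    return tax_year
```

Keeping the supported set as a literal means a new year only becomes computable when someone edits the table on purpose.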
## Interaction Model

- `question`
  - stateless by default
  - optional case context
- `prepare`
  - requires a case directory
  - if none exists, OpenClaw should ask whether to create one and where
- `review`
  - requires a case directory
  - can operate on an existing or newly-created review case

## Planned Case Layout

```text
<case-dir>/
  input/
  extracted/
  return/
  output/
  reports/
  issues/
  sources/
```

Current implementation writes:

- `case-manifest.json`
- `extracted/facts.json`
- `issues/open-issues.json`

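Creating the directory layout above is straightforward with `pathlib`. The sketch below assumes the seven subdirectories shown in the layout; `create_case_layout` and `CASE_SUBDIRS` are illustrative names, not the package's API.

```python
from pathlib import Path

# Subdirectories copied from the planned layout in this README.
CASE_SUBDIRS = ("input", "extracted", "return", "output", "reports", "issues", "sources")


def create_case_layout(case_dir: Path) -> list[Path]:
    """Create every case subdirectory (idempotently) and return the paths."""
    created = []
    for name in CASE_SUBDIRS:
        sub = case_dir / name
        sub.mkdir(parents=True, exist_ok=True)
        created.append(sub)
    return created
```

Using `exist_ok=True` makes the call safe to repeat on a case directory that already exists.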
## Intake Flow

Current `extract-docs` supports:

- `--create-case`
- `--case-label`
- `--facts-json <path>`
- repeated `--input-file <path>`

Behavior:

- creates the full case directory layout when `--create-case` is used
- copies input documents into `input/`
- stores normalized facts with source metadata in `extracted/facts.json`
- extracts machine-usable facts from JSON/text/PDF documents where supported
- appends document registry entries to `case-manifest.json`
- stops with a structured issue and non-zero exit if a new fact conflicts with an existing stored fact

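The stop-on-conflict rule in the last bullet can be illustrated with a simplified merge. This sketch treats facts as a flat mapping and ignores source metadata; `merge_facts`, `FactConflictError`, and the issue fields are hypothetical, and the real issue schema may differ.

```python
class FactConflictError(Exception):
    """Carries a structured issue describing the conflicting fact."""

    def __init__(self, issue: dict):
        super().__init__(f"conflicting fact: {issue['field']}")
        self.issue = issue


def merge_facts(stored: dict, incoming: dict) -> dict:
    """Merge incoming facts, refusing to overwrite a different stored value."""
    merged = dict(stored)
    for field, value in incoming.items():
        if field in stored and stored[field] != value:
            # Assumed issue shape; the CLI would report this and exit non-zero.
            raise FactConflictError(
                {"type": "fact_conflict", "field": field,
                 "existing": stored[field], "incoming": value}
            )
        merged[field] = value
    return merged
```

Re-submitting an identical value is accepted in this sketch; only a differing value halts the merge.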
## Output Contract

- JSON by default
- markdown available with `--format markdown`
- `question` supports:
  - `--style conversation`
  - `--style memo`
- `question` emits answered analysis output
- `prepare` emits a prepared return package summary
- `export-efile-ready` emits a draft e-file-ready payload
- `review` emits a findings-first review result
- `fetch-year` emits a downloaded manifest location and source count

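As a rough picture of how the `question` flags fit together, here is an `argparse` sketch. The flag names and the JSON default come from this README; the function name, the `conversation` default for `--style`, and marking `--question` and `--tax-year` as required are assumptions.

```python
import argparse


def build_question_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented `question` flags."""
    parser = argparse.ArgumentParser(prog="us-cpa question")
    parser.add_argument("--question", required=True)
    parser.add_argument("--tax-year", type=int, required=True)
    # Assumed default; the README only lists the two supported styles.
    parser.add_argument("--style", choices=("conversation", "memo"), default="conversation")
    # JSON is the documented default output format.
    parser.add_argument("--format", choices=("json", "markdown"), default="json")
    return parser
```

Restricting both options with `choices` makes an unsupported style or format fail at parse time rather than during rendering.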
## Question Engine

Current `question` implementation:

- loads the cached tax-year corpus
- searches a small IRS-first topical rule set
- returns one canonical analysis object
- renders that analysis as:
  - conversational output
  - memo output
- marks questions outside the current topical rule set as requiring primary-law escalation

Current implemented topics:

- standard deduction
- Schedule C / sole proprietorship reporting trigger
- Schedule D / capital gains reporting trigger
- Schedule E / rental income reporting trigger

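The match-or-escalate behavior could be approximated with keyword lookup, as sketched here. The matching strategy, the keyword lists, and the result keys (`topic`, `requiresPrimaryLawEscalation`) are assumptions for illustration; the actual rule set may match questions quite differently.

```python
# Hypothetical keyword table covering the four topics listed in this README.
TOPIC_KEYWORDS = {
    "standard_deduction": ("standard deduction",),
    "schedule_c": ("schedule c", "sole proprietor"),
    "schedule_d": ("schedule d", "capital gain"),
    "schedule_e": ("schedule e", "rental"),
}


def classify_question(question: str) -> dict:
    """Return the first matching topic, or flag the question for escalation."""
    text = question.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return {"topic": topic, "requiresPrimaryLawEscalation": False}
    # Nothing in the topical rule set applies: defer to primary law.
    return {"topic": None, "requiresPrimaryLawEscalation": True}
```

Substring matching is deliberately crude here; it only shows where the escalation flag gets set.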
## Form Rendering

Current rendering path:

- official IRS PDFs from the cached tax-year corpus
- deterministic field-fill when usable AcroForm fields are present
- overlay rendering onto those official PDFs using `reportlab` + `pypdf` as fallback
- artifact manifest written to `output/artifacts.json`

Current rendered form support:

- field-fill support for known mapped fillable forms
- overlay generation for the current required-form set resolved by the return model

Current review rule:

- field-filled artifacts are not automatically flagged for review
- overlay-rendered artifacts are marked `reviewRequired: true`

Overlay coordinates are currently a fallback heuristic and are not treated as line-perfect authoritative field maps. Overlay output must be visually reviewed before any filing/export handoff.

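The review rule above reduces to a single decision per artifact. In this sketch only `reviewRequired` is a field name taken from the README; `build_artifact`, `form`, `renderMode`, and the mode strings are hypothetical.

```python
def build_artifact(form: str, has_usable_fields: bool) -> dict:
    """Choose the render mode and derive the review flag from it."""
    # Field-fill is preferred; overlay is the fallback when no usable
    # AcroForm fields are available.
    mode = "field_fill" if has_usable_fields else "overlay"
    return {
        "form": form,
        "renderMode": mode,
        # Overlay placement is heuristic, so it always needs a human look.
        "reviewRequired": mode == "overlay",
    }
```

Deriving the flag from the mode, rather than setting it separately, keeps the two from drifting apart.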
## Preparation Workflow

Current `prepare` implementation:

- loads case facts from `extracted/facts.json`
- normalizes them into the current supported federal return model
- preserves source provenance for normalized values
- computes the current supported 1040 package
- resolves required forms across the current supported subset
- writes:
  - `return/normalized-return.json`
  - `output/artifacts.json`
  - `reports/prepare-summary.json`

Current supported calculation inputs:

- `filingStatus`
- `spouse.fullName`
- `dependents`
- `wages`
- `taxableInterest`
- `businessIncome`
- `capitalGainLoss`
- `rentalIncome`
- `federalWithholding`
- `itemizedDeductions`
- `hsaContribution`
- `educationCredit`
- `foreignTaxCredit`
- `qualifiedBusinessIncome`
- `traditionalIraBasis`
- `additionalMedicareTax`
- `netInvestmentIncomeTax`
- `alternativeMinimumTax`
- `additionalTaxPenalty`
- `energyCredit`
- `depreciationExpense`
- `section1231GainLoss`

## E-file-ready Export

`export-efile-ready` writes:

- `output/efile-ready.json`

Current export behavior:

- draft-only
- includes required forms
- includes refund or balance due summary
- includes attachment manifest
- includes unresolved issues

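To make the export contents concrete, the sketch below assembles a draft payload with the four documented parts. Every key name and the function `build_efile_draft` are hypothetical; the real `output/efile-ready.json` schema is not shown in this README. The refund or balance due follows from total payments minus total tax.

```python
from decimal import Decimal


def build_efile_draft(required_forms, total_tax, total_payments,
                      attachments, open_issues) -> dict:
    """Assemble an illustrative draft payload (hypothetical key names)."""
    difference = total_payments - total_tax
    zero = Decimal("0")
    return {
        "draft": True,  # the export is draft-only
        "requiredForms": list(required_forms),
        "summary": {
            # Overpayment becomes a refund; a shortfall becomes a balance due.
            "refund": max(difference, zero),
            "balanceDue": max(-difference, zero),
        },
        "attachments": list(attachments),
        "unresolvedIssues": list(open_issues),
    }
```

`Decimal` is used so that currency arithmetic does not pick up binary floating-point error.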
## Review Workflow

Current `review` implementation:

- recomputes the return from current case facts
- compares stored normalized return values to recomputed values
- flags source-fact mismatches for key income fields
- flags likely omitted income when document-extracted facts support an amount the stored return omits
- checks whether required rendered artifacts are present
- flags high-complexity forms for specialist follow-up
- flags overlay-rendered artifacts as requiring human review
- sorts findings by severity

Current render modes:

- `--style conversation`
- `--style memo`

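The compare-then-sort core of the review could be sketched as follows. The severity labels, the finding fields, and both function names are assumptions; the actual review rules assign severities per finding type rather than the single level used here.

```python
# Hypothetical severity scale; lower rank sorts first.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def compare_returns(stored: dict, recomputed: dict) -> list[dict]:
    """Emit one finding per field whose stored value differs from the recomputed one."""
    findings = []
    for field, expected in recomputed.items():
        actual = stored.get(field)
        if actual != expected:
            findings.append(
                {"field": field, "stored": actual,
                 "recomputed": expected, "severity": "high"}
            )
    return findings


def sort_findings(findings: list[dict]) -> list[dict]:
    """Order findings most severe first; unknown severities sort last."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)),
    )
```

Because `sorted` is stable, findings of equal severity keep the order in which they were detected.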
## Scope Rules

- U.S. federal individual returns only in v1
- official IRS artifacts are the target output for compiled forms
- conflicting facts must stop the workflow for user resolution

## Authority Ranking

Current authority classes are ranked to preserve source hierarchy:

- IRS forms
- IRS instructions
- IRS publications
- IRS FAQs
- Internal Revenue Code
- Treasury regulations
- other primary authority

Later research and review flows should consume this ranking rather than inventing their own.

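One way for later flows to share a single ranking is an ordered enum, sketched below. The member names, the numeric values, and the `authority` key on each source are assumptions; the values simply mirror the order of the list above, and whether a lower number means "consult first" or "more authoritative" is not stated in this README.

```python
from enum import IntEnum


class AuthorityRank(IntEnum):
    """Hypothetical ranks following the listing order in this README."""

    IRS_FORM = 1
    IRS_INSTRUCTIONS = 2
    IRS_PUBLICATION = 3
    IRS_FAQ = 4
    INTERNAL_REVENUE_CODE = 5
    TREASURY_REGULATION = 6
    OTHER_PRIMARY_AUTHORITY = 7


def order_by_authority(sources: list[dict]) -> list[dict]:
    """Sort sources by rank; each source names its class in `authority`."""
    return sorted(sources, key=lambda s: AuthorityRank[s["authority"]])
```

Looking classes up by name (`AuthorityRank[...]`) makes an unknown authority class raise a `KeyError` instead of being silently ranked.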