stef-openclaw-skills/docs/us-cpa.md
2026-03-15 03:31:52 -05:00

us-cpa

us-cpa is a Python CLI plus OpenClaw skill wrapper for U.S. federal individual tax work.

Standalone package usage

From skills/us-cpa/:

pip install -e .[dev]
us-cpa --help

Without installing, the repo-local wrapper works directly:

skills/us-cpa/scripts/us-cpa --help

OpenClaw installation

To install the skill for OpenClaw itself, copy the repo skill into the workspace skill directory and install its Python dependencies there.

  1. Sync the repo copy into the workspace:
rsync -a --delete \
  /Users/stefano/.openclaw/workspace/projects/stef-openclaw-skills/skills/us-cpa/ \
  /Users/stefano/.openclaw/workspace/skills/us-cpa/
  2. Create a workspace-local virtualenv and install the package:
cd /Users/stefano/.openclaw/workspace/skills/us-cpa
python3 -m venv .venv
. .venv/bin/activate
pip install -e .[dev]
  3. Verify the installed workspace wrapper:
~/.openclaw/workspace/skills/us-cpa/scripts/us-cpa --help

The wrapper prefers .venv/bin/python inside the skill directory when present, so OpenClaw can run the workspace copy without relying on global Python packages.
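
The interpreter preference described above can be sketched in Python. This is an illustration only: the real wrapper may well be a shell script, and the function name `pick_interpreter` is hypothetical.

```python
import sys
from pathlib import Path


def pick_interpreter(skill_dir: Path) -> str:
    """Prefer the skill-local virtualenv interpreter, else the current one.

    Hypothetical helper; it mirrors the documented preference for
    .venv/bin/python inside the skill directory.
    """
    venv_python = skill_dir / ".venv" / "bin" / "python"
    if venv_python.is_file():
        return str(venv_python)
    # No workspace virtualenv found: fall back to whatever is running us.
    return sys.executable
```

Checking for the file rather than for the `.venv` directory avoids picking up a half-created virtualenv that has no interpreter yet.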

Current Milestone

Current implementation now includes:

  • deterministic cache layout under ~/.cache/us-cpa by default
  • fetch-year download flow for the bootstrap IRS corpus
  • source manifest with URL, hash, authority rank, and local path traceability
  • primary-law URL building for IRC and Treasury regulation escalation
  • case-folder intake, document registration, and machine-usable fact extraction from JSON, text, and PDF inputs
  • question workflow with conversation and memo output
  • prepare workflow for the current supported multi-form 1040 package
  • review workflow with findings-first output
  • fillable-PDF-first rendering with overlay fallback
  • e-file-ready draft export payload generation
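
As a rough illustration of the source manifest item above, one entry could carry the four traceability fields like this. The class name, field names, and the example URL are assumptions made for the sketch, not the manifest schema used by us-cpa.

```python
import hashlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SourceEntry:
    """One manifest row (hypothetical field names)."""
    url: str
    sha256: str
    authority_rank: int
    local_path: str


def make_entry(url: str, content: bytes, authority_rank: int, local_path: str) -> SourceEntry:
    # Hash the downloaded bytes so later runs can detect a changed source.
    digest = hashlib.sha256(content).hexdigest()
    return SourceEntry(url, digest, authority_rank, local_path)


# Placeholder URL and bytes, used only to show the shape of an entry.
entry = make_entry("https://example.invalid/f1040.pdf", b"%PDF-sample", 1, "2025/f1040.pdf")
manifest_row = asdict(entry)
```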

CLI Surface

skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025
skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025 --style memo --format markdown
skills/us-cpa/scripts/us-cpa prepare --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa review --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
skills/us-cpa/scripts/us-cpa extract-docs --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe --create-case --case-label "Jane Doe" --facts-json ./facts.json
skills/us-cpa/scripts/us-cpa render-forms --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa export-efile-ready --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe

Tax-Year Cache

Default cache root:

~/.cache/us-cpa

Override for isolated runs:

US_CPA_CACHE_DIR=/tmp/us-cpa-cache skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
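
A minimal sketch of how the cache root might be resolved, assuming the override is read from the `US_CPA_CACHE_DIR` environment variable shown above. The function names and the per-year subdirectory are assumptions, not the package's actual layout.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def cache_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the cache root, honoring US_CPA_CACHE_DIR when it is set."""
    env = os.environ if env is None else env
    override = env.get("US_CPA_CACHE_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".cache" / "us-cpa"


def year_dir(tax_year: int, env: Optional[Mapping[str, str]] = None) -> Path:
    # Assumption: each tax year gets its own subdirectory under the root.
    return cache_root(env) / str(tax_year)
```

Passing the environment as a parameter keeps the lookup easy to test without mutating `os.environ`.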

Current fetch-year bootstrap corpus for tax year 2025 is verified against live IRS irs-prior PDFs for:

  • Form 1040
  • Schedules 1, 2, 3, A, B, C, D, E, SE, and 8812
  • Forms 8949, 4562, 4797, 6251, 8606, 8863, 8889, 8959, 8960, 8995, 8995-A, 5329, 5695, and 1116
  • General Form 1040 instructions and selected schedule/form instructions

Current bundled tax-year computation data:

  • 2024
  • 2025

Sources for other years can still be fetched, but deterministic return calculations stop with an explicit unsupported-year error until rate tables for that year are added.

Adding a new supported year is a deliberate data-table change in tax_years.py, not an automatic runtime discovery step. That is intentional for tax-engine correctness.
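
The explicit unsupported-year behavior could look roughly like the sketch below. The table contents are placeholders (no real rates are shown), and `TAX_YEAR_TABLES`, `UnsupportedTaxYearError`, and `tables_for` are hypothetical names rather than the contents of tax_years.py.

```python
class UnsupportedTaxYearError(ValueError):
    """Raised when no computation tables are bundled for a tax year."""


# Placeholder entries: a real table would hold brackets, deductions, etc.
TAX_YEAR_TABLES = {
    2024: {"tables": "bundled"},
    2025: {"tables": "bundled"},
}


def tables_for(tax_year: int) -> dict:
    """Look up bundled tables, failing loudly instead of guessing."""
    try:
        return TAX_YEAR_TABLES[tax_year]
    except KeyError:
        supported = ", ".join(str(y) for y in sorted(TAX_YEAR_TABLES))
        raise UnsupportedTaxYearError(
            f"tax year {tax_year} is not supported; supported years: {supported}"
        ) from None
```

Supporting another year then means adding a reviewed entry to the table; nothing is inferred at runtime.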

Interaction Model

  • question
    • stateless by default
    • optional case context
  • prepare
    • requires a case directory
    • if none exists, OpenClaw should ask whether to create one and where
  • review
    • requires a case directory
    • can operate on an existing or newly-created review case

Planned Case Layout

<case-dir>/
  input/
  extracted/
  return/
  output/
  reports/
  issues/
  sources/

Current implementation writes:

  • case-manifest.json
  • extracted/facts.json
  • issues/open-issues.json
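
A small sketch of creating this layout, assuming the directory names listed above. The JSON bodies written here are placeholders; the real schemas of the three files are not documented in this section.

```python
import json
from pathlib import Path

CASE_SUBDIRS = ("input", "extracted", "return", "output", "reports", "issues", "sources")


def create_case(case_dir: Path, label: str) -> Path:
    """Create the case layout and seed the three files; return the manifest path."""
    for name in CASE_SUBDIRS:
        (case_dir / name).mkdir(parents=True, exist_ok=True)
    # Hypothetical initial contents for each file.
    seeds = {
        case_dir / "case-manifest.json": {"caseLabel": label, "documents": []},
        case_dir / "extracted" / "facts.json": {},
        case_dir / "issues" / "open-issues.json": [],
    }
    for path, initial in seeds.items():
        if not path.exists():  # never clobber an existing case
            path.write_text(json.dumps(initial, indent=2))
    return case_dir / "case-manifest.json"
```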

Intake Flow

Current extract-docs supports:

  • --create-case
  • --case-label
  • --facts-json <path>
  • repeated --input-file <path>

Behavior:

  • creates the full case directory layout when --create-case is used
  • copies input documents into input/
  • stores normalized facts with source metadata in extracted/facts.json
  • extracts machine-usable facts from JSON/text/PDF documents where supported
  • appends document registry entries to case-manifest.json
  • stops with a structured issue and non-zero exit if a new fact conflicts with an existing stored fact
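
The conflict rule in the last bullet can be sketched as follows. The fact-store shape (`value` plus a list of `sources`) and the issue fields are assumptions for illustration; a CLI would turn the raised error into the structured issue and a non-zero exit.

```python
class FactConflictError(Exception):
    """Carries a structured issue describing the conflicting fact."""

    def __init__(self, issue: dict):
        super().__init__(issue["message"])
        self.issue = issue


def merge_fact(facts: dict, name: str, value, source: str) -> None:
    """Store a fact with its source, refusing to overwrite a different value."""
    existing = facts.get(name)
    if existing is None:
        facts[name] = {"value": value, "sources": [source]}
        return
    if existing["value"] != value:
        raise FactConflictError({
            "type": "fact-conflict",
            "fact": name,
            "storedValue": existing["value"],
            "newValue": value,
            "newSource": source,
            "message": f"conflicting values for {name}",
        })
    # Same value from another document: just record the extra provenance.
    if source not in existing["sources"]:
        existing["sources"].append(source)
```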

Output Contract

  • JSON by default
  • markdown available with --format markdown
  • question supports:
    • --style conversation
    • --style memo
  • question emits the answered analysis
  • prepare emits a prepared return package summary
  • export-efile-ready emits a draft e-file-ready payload
  • review emits a findings-first review result
  • fetch-year emits a downloaded manifest location and source count
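
One way to picture the JSON-by-default contract is a single result object with two renderers. The `render` function and the markdown shape below are illustrative assumptions, not the CLI's actual output format.

```python
import json


def render(result: dict, fmt: str = "json") -> str:
    """Render one result dict as JSON (default) or as simple markdown."""
    if fmt == "json":
        return json.dumps(result, indent=2, sort_keys=True)
    if fmt == "markdown":
        lines = [f"# {result.get('title', 'Result')}"]
        lines += [f"- **{key}**: {value}" for key, value in result.items() if key != "title"]
        return "\n".join(lines)
    raise ValueError(f"unsupported format: {fmt}")
```

Keeping the data in one dict and formatting only at the edge means both formats always describe the same result.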

Question Engine

Current question implementation:

  • loads the cached tax-year corpus
  • searches a small IRS-first topical rule set
  • returns one canonical analysis object
  • renders that analysis as:
    • conversational output
    • memo output
  • marks questions outside the current topical rule set as requiring primary-law escalation

Current implemented topics:

  • standard deduction
  • Schedule C / sole proprietorship reporting trigger
  • Schedule D / capital gains reporting trigger
  • Schedule E / rental income reporting trigger
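
A toy version of the topical lookup with escalation is sketched below. The keyword lists, the `analyze` function, and the result keys are hypothetical; the real rule set presumably matches against the cached corpus rather than bare keywords.

```python
# Hypothetical keyword triggers for the four implemented topics.
TOPIC_KEYWORDS = {
    "standard_deduction": ("standard deduction",),
    "schedule_c": ("schedule c", "sole proprietor"),
    "schedule_d": ("schedule d", "capital gain"),
    "schedule_e": ("schedule e", "rental income"),
}


def analyze(question: str, tax_year: int) -> dict:
    """Return one canonical analysis object for a question."""
    text = question.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return {"taxYear": tax_year, "topic": topic,
                    "requiresPrimaryLawEscalation": False}
    # Nothing in the topical rule set matched: flag for escalation.
    return {"taxYear": tax_year, "topic": None,
            "requiresPrimaryLawEscalation": True}
```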

Form Rendering

Current rendering path:

  • official IRS PDFs from the cached tax-year corpus
  • deterministic field-fill when usable AcroForm fields are present
  • overlay rendering onto those official PDFs using reportlab + pypdf as fallback
  • artifact manifest written to output/artifacts.json

Current rendered form support:

  • field-fill support for known mapped fillable forms
  • overlay generation for the current required-form set resolved by the return model

Current review rule:

  • field-filled artifacts are not automatically flagged for review
  • overlay-rendered artifacts are marked reviewRequired: true

Overlay coordinates are currently a fallback heuristic and are not treated as line-perfect authoritative field maps. Overlay output must be visually reviewed before any filing/export handoff.
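
The field-fill-first decision and the review flag can be sketched without any PDF library. Here `acroform_fields` stands in for whatever field names the PDF reader reports, and `field_map`, `renderMode`, and `plan_artifact` are hypothetical names; only `reviewRequired` comes from the text above.

```python
def plan_artifact(form_id: str, acroform_fields: set, field_map: dict) -> dict:
    """Choose field-fill when every mapped field exists, else overlay.

    field_map maps return-model keys to PDF field names (hypothetical).
    """
    fillable = bool(field_map) and all(
        pdf_field in acroform_fields for pdf_field in field_map.values()
    )
    if fillable:
        return {"form": form_id, "renderMode": "field-fill", "reviewRequired": False}
    # Overlay coordinates are heuristic, so the artifact needs human review.
    return {"form": form_id, "renderMode": "overlay", "reviewRequired": True}
```

Requiring every mapped field to be present is a conservative choice for this sketch: a partially fillable form falls back to overlay rather than producing a half-filled PDF.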

Preparation Workflow

Current prepare implementation:

  • loads case facts from extracted/facts.json
  • normalizes them into the current supported federal return model
  • preserves source provenance for normalized values
  • computes the current supported 1040 package
  • resolves required forms across the current supported subset
  • writes:
    • return/normalized-return.json
    • output/artifacts.json
    • reports/prepare-summary.json

Current supported calculation inputs:

  • filingStatus
  • spouse.fullName
  • dependents
  • wages
  • taxableInterest
  • businessIncome
  • capitalGainLoss
  • rentalIncome
  • federalWithholding
  • itemizedDeductions
  • hsaContribution
  • educationCredit
  • foreignTaxCredit
  • qualifiedBusinessIncome
  • traditionalIraBasis
  • additionalMedicareTax
  • netInvestmentIncomeTax
  • alternativeMinimumTax
  • additionalTaxPenalty
  • energyCredit
  • depreciationExpense
  • section1231GainLoss
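
As a sketch of how normalization might restrict facts to the supported inputs while keeping provenance, assuming facts are stored as a flat mapping of name to `{"value", "source"}` (an assumption; the real facts.json shape is not shown here):

```python
SUPPORTED_INPUTS = frozenset({
    "filingStatus", "spouse.fullName", "dependents", "wages", "taxableInterest",
    "businessIncome", "capitalGainLoss", "rentalIncome", "federalWithholding",
    "itemizedDeductions", "hsaContribution", "educationCredit", "foreignTaxCredit",
    "qualifiedBusinessIncome", "traditionalIraBasis", "additionalMedicareTax",
    "netInvestmentIncomeTax", "alternativeMinimumTax", "additionalTaxPenalty",
    "energyCredit", "depreciationExpense", "section1231GainLoss",
})


def normalize(facts: dict) -> tuple:
    """Split facts into supported inputs (with provenance) and unsupported names."""
    normalized = {
        name: {"value": fact["value"], "source": fact["source"]}
        for name, fact in facts.items()
        if name in SUPPORTED_INPUTS
    }
    unsupported = sorted(name for name in facts if name not in SUPPORTED_INPUTS)
    return normalized, unsupported
```

Returning the unsupported names instead of silently dropping them lets a caller surface them as issues.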

E-file-ready Export

export-efile-ready writes:

  • output/efile-ready.json

Current export behavior:

  • draft-only
  • includes required forms
  • includes refund or balance due summary
  • includes attachment manifest
  • includes unresolved issues
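
The draft payload could be assembled along these lines. All key names and the function signature are assumptions for illustration; the dollar amounts in the usage example are made up.

```python
def build_efile_payload(required_forms, total_tax, total_payments,
                        attachments, open_issues) -> dict:
    """Assemble a draft-only export payload (hypothetical schema)."""
    net = total_payments - total_tax
    return {
        "status": "draft",
        "requiredForms": list(required_forms),
        "summary": {
            "refund": max(net, 0),        # payments exceed tax
            "balanceDue": max(-net, 0),   # tax exceeds payments
        },
        "attachments": list(attachments),
        "unresolvedIssues": list(open_issues),
    }


payload = build_efile_payload(["1040", "Schedule B"], 5000, 6200, [], [])
```

Carrying unresolved issues inside the payload keeps a draft from being mistaken for a clean, file-ready return.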

Review Workflow

Current review implementation:

  • recomputes the return from current case facts
  • compares stored normalized return values to recomputed values
  • flags source-fact mismatches for key income fields
  • flags likely omitted income when document-extracted facts support an amount the stored return omits
  • checks whether required rendered artifacts are present
  • flags high-complexity forms for specialist follow-up
  • flags overlay-rendered artifacts as requiring human review
  • sorts findings by severity
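
Two of these steps, comparing stored against recomputed values and sorting by severity, can be sketched as below. The severity labels, finding fields, and function names are hypothetical.

```python
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def compare_returns(stored: dict, recomputed: dict) -> list:
    """Emit one finding per field whose stored value differs from the recomputed one."""
    findings = []
    for field in sorted(set(stored) | set(recomputed)):
        if stored.get(field) != recomputed.get(field):
            findings.append({
                "severity": "high",
                "field": field,
                "stored": stored.get(field),
                "recomputed": recomputed.get(field),
            })
    return findings


def sort_findings(findings: list) -> list:
    # sorted() is stable, so findings of equal severity keep their order.
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])
```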

Current render modes:

  • --style conversation
  • --style memo

Scope Rules

  • U.S. federal individual returns only in v1
  • official IRS artifacts are the target output for compiled forms
  • conflicting facts must stop the workflow for user resolution

Authority Ranking

Current authority classes are ranked to preserve source hierarchy:

  • IRS forms
  • IRS instructions
  • IRS publications
  • IRS FAQs
  • Internal Revenue Code
  • Treasury regulations
  • other primary authority

Later research and review flows should consume this ranking rather than inventing their own.
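
A shared ranking that later flows could import might look like this sketch. The numeric values are an assumption (1 for the first class listed above, increasing down the list), as are the enum and function names.

```python
from enum import IntEnum


class AuthorityRank(IntEnum):
    """Authority classes in the order listed above (values are assumed)."""
    IRS_FORM = 1
    IRS_INSTRUCTIONS = 2
    IRS_PUBLICATION = 3
    IRS_FAQ = 4
    INTERNAL_REVENUE_CODE = 5
    TREASURY_REGULATION = 6
    OTHER_PRIMARY_AUTHORITY = 7


def order_sources(sources: list) -> list:
    """Sort source records by their authority class, first-listed class first."""
    return sorted(sources, key=lambda source: source["authority"])
```

Because `IntEnum` members compare as integers, the same ranking can be stored in a manifest as a plain number and still sort consistently.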