stef-openclaw-skills/docs/us-cpa.md
2026-03-15 03:11:24 -05:00


us-cpa

us-cpa is a Python CLI plus OpenClaw skill wrapper for U.S. federal individual tax work.

Standalone package usage

From skills/us-cpa/:

pip install -e ".[dev]"
us-cpa --help

Without installing, the repo-local wrapper works directly:

skills/us-cpa/scripts/us-cpa --help

Current Milestone

The current implementation includes:

  • deterministic cache layout under ~/.cache/us-cpa by default
  • fetch-year download flow for the bootstrap IRS corpus
  • source manifest with URL, hash, authority rank, and local path traceability
  • primary-law URL building for IRC and Treasury regulation escalation
  • case-folder intake, document registration, and machine-usable fact extraction from JSON, text, and PDF inputs
  • question workflow with conversation and memo output
  • prepare workflow for the current supported multi-form 1040 package
  • review workflow with findings-first output
  • fillable-PDF-first rendering with overlay fallback
  • e-file-ready draft export payload generation
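The source manifest item above could be modeled roughly as follows. This is a minimal sketch, not the project's actual schema: the function name `manifest_entry`, the key names, and the choice of SHA-256 as the hash are all assumptions made for illustration.

```python
import hashlib


def manifest_entry(url, content, authority_rank, local_path):
    """Build one traceable manifest record for a downloaded source.

    Hypothetical helper: key names and the SHA-256 choice are assumptions.
    """
    return {
        "url": url,
        "sha256": hashlib.sha256(content).hexdigest(),
        "authorityRank": authority_rank,
        "localPath": str(local_path),
    }


entry = manifest_entry(
    "https://example.invalid/f1040.pdf",  # placeholder URL, not a real source
    b"%PDF-1.7 placeholder bytes",
    1,
    "sources/f1040.pdf",
)
```

Storing the hash next to the URL and local path lets a later run detect that a cached file no longer matches what was originally downloaded.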

CLI Surface

skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025
skills/us-cpa/scripts/us-cpa question --question "What is the standard deduction?" --tax-year 2025 --style memo --format markdown
skills/us-cpa/scripts/us-cpa prepare --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa review --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
skills/us-cpa/scripts/us-cpa extract-docs --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe --create-case --case-label "Jane Doe" --facts-json ./facts.json
skills/us-cpa/scripts/us-cpa render-forms --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe
skills/us-cpa/scripts/us-cpa export-efile-ready --tax-year 2025 --case-dir ~/tax-cases/2025-jane-doe

Tax-Year Cache

Default cache root:

~/.cache/us-cpa

Override for isolated runs:

US_CPA_CACHE_DIR=/tmp/us-cpa-cache skills/us-cpa/scripts/us-cpa fetch-year --tax-year 2025
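The override above amounts to an environment lookup with a fallback to the default root. A minimal sketch, assuming a hypothetical helper named `resolve_cache_root` (the real module layout is not shown in this document):

```python
import os
from pathlib import Path

DEFAULT_CACHE_ROOT = Path.home() / ".cache" / "us-cpa"


def resolve_cache_root(env=None):
    """Return the cache root, preferring US_CPA_CACHE_DIR when it is set.

    Hypothetical helper; `env` is injectable so the lookup can be tested
    without touching the real process environment.
    """
    env = os.environ if env is None else env
    override = env.get("US_CPA_CACHE_DIR")
    if override:
        return Path(override).expanduser()
    return DEFAULT_CACHE_ROOT
```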

The current fetch-year bootstrap corpus for tax year 2025 is verified against live IRS irs-prior PDFs for:

  • Form 1040
  • Schedules 1, 2, 3, A, B, C, D, E, SE, and 8812
  • Forms 8949, 4562, 4797, 6251, 8606, 8863, 8889, 8959, 8960, 8995, 8995-A, 5329, 5695, and 1116
  • General Form 1040 instructions and selected schedule/form instructions

Current bundled tax-year computation data:

  • 2024
  • 2025

Other years can still be fetched and sourced, but deterministic return calculations stop with an explicit unsupported-year error until rate tables for that year are added.

Adding a new supported year is a deliberate data-table change in tax_years.py, not an automatic runtime discovery step. That is intentional for tax-engine correctness.
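The explicit unsupported-year stop can be pictured as a guard in front of any calculation. The sketch below assumes hypothetical names (`SUPPORTED_TAX_YEARS`, `UnsupportedTaxYearError`, `require_supported_year`); only the two supported years come from this document, and the actual contents of tax_years.py are not shown here.

```python
# Years with bundled computation data, per the list above.
SUPPORTED_TAX_YEARS = frozenset({2024, 2025})


class UnsupportedTaxYearError(ValueError):
    """Raised when no rate tables are bundled for the requested year."""


def require_supported_year(tax_year):
    """Return tax_year unchanged, or fail loudly instead of guessing rates."""
    if tax_year not in SUPPORTED_TAX_YEARS:
        supported = ", ".join(str(year) for year in sorted(SUPPORTED_TAX_YEARS))
        raise UnsupportedTaxYearError(
            f"tax year {tax_year} has no bundled computation data "
            f"(supported: {supported})"
        )
    return tax_year
```

Keeping the supported set as a static table means a new year only becomes computable after someone deliberately adds and reviews its data.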

Interaction Model

  • question
    • stateless by default
    • optional case context
  • prepare
    • requires a case directory
    • if none exists, OpenClaw should ask whether to create one and where
  • review
    • requires a case directory
    • can operate on an existing or newly-created review case

Planned Case Layout

<case-dir>/
  input/
  extracted/
  return/
  output/
  reports/
  issues/
  sources/
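Creating this layout is a small, idempotent directory operation. A minimal sketch, assuming a hypothetical helper named `create_case_layout`; the subdirectory names are taken from the layout above.

```python
from pathlib import Path

# Subdirectories from the planned case layout.
CASE_SUBDIRS = ("input", "extracted", "return", "output", "reports", "issues", "sources")


def create_case_layout(case_dir):
    """Create the case directory and its subdirectories if they are missing."""
    root = Path(case_dir).expanduser()
    for name in CASE_SUBDIRS:
        # exist_ok keeps the call safe to repeat on an existing case.
        (root / name).mkdir(parents=True, exist_ok=True)
    return root
```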

Current implementation writes:

  • case-manifest.json
  • extracted/facts.json
  • issues/open-issues.json

Intake Flow

Current extract-docs supports:

  • --create-case
  • --case-label
  • --facts-json <path>
  • repeated --input-file <path>

Behavior:

  • creates the full case directory layout when --create-case is used
  • copies input documents into input/
  • stores normalized facts with source metadata in extracted/facts.json
  • extracts machine-usable facts from JSON/text/PDF documents where supported
  • appends document registry entries to case-manifest.json
  • stops with a structured issue and non-zero exit if a new fact conflicts with an existing stored fact
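The conflict rule in the last item can be sketched as a merge that refuses to overwrite a stored fact with a different value. Everything here is illustrative: `FactConflict`, `merge_facts`, and the issue keys are hypothetical, and the real shape of extracted/facts.json is not specified in this document.

```python
class FactConflict(Exception):
    """Carries a structured issue describing the conflicting fact."""

    def __init__(self, issue):
        super().__init__(issue["message"])
        self.issue = issue


def merge_facts(stored, incoming):
    """Merge incoming facts into stored facts, stopping on any conflict.

    Both arguments map a fact name to {"value": ..., "source": ...}
    (an assumed shape). The stored mapping is not mutated.
    """
    merged = dict(stored)
    for name, fact in incoming.items():
        existing = merged.get(name)
        if existing is not None and existing["value"] != fact["value"]:
            raise FactConflict({
                "type": "fact-conflict",
                "fact": name,
                "storedValue": existing["value"],
                "incomingValue": fact["value"],
                "message": f"conflicting values for {name!r}",
            })
        merged.setdefault(name, fact)
    return merged
```

A CLI entry point built on this would presumably catch `FactConflict`, record `exc.issue` in issues/open-issues.json, and exit with a non-zero status.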

Output Contract

  • JSON by default
  • markdown available with --format markdown
  • question supports:
    • --style conversation
    • --style memo
  • question emits the answered analysis
  • prepare emits a prepared return package summary
  • export-efile-ready emits a draft e-file-ready payload
  • review emits a findings-first review result
  • fetch-year emits the location of the downloaded manifest and the source count

Question Engine

Current question implementation:

  • loads the cached tax-year corpus
  • searches a small IRS-first topical rule set
  • returns one canonical analysis object
  • renders that analysis as:
    • conversational output
    • memo output
  • marks questions outside the current topical rule set as requiring primary-law escalation

Current implemented topics:

  • standard deduction
  • Schedule C / sole proprietorship reporting trigger
  • Schedule D / capital gains reporting trigger
  • Schedule E / rental income reporting trigger
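The split between implemented topics and escalation can be illustrated with a simple keyword lookup. This is a sketch only: the real matching logic is not described in this document, and the topic ids, keyword lists, and result keys below are hypothetical.

```python
# Hypothetical keyword table covering the four implemented topics.
TOPIC_KEYWORDS = {
    "standard-deduction": ("standard deduction",),
    "schedule-c": ("schedule c", "sole propriet"),
    "schedule-d": ("schedule d", "capital gain"),
    "schedule-e": ("schedule e", "rental income"),
}


def classify_question(question):
    """Map a question to a known topic, or mark it for primary-law escalation."""
    text = question.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return {"topic": topic, "requiresPrimaryLawEscalation": False}
    return {"topic": None, "requiresPrimaryLawEscalation": True}
```

Substring matching is deliberately crude; the point is that anything outside the table gets an explicit escalation flag rather than a guessed answer.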

Form Rendering

Current rendering path:

  • official IRS PDFs from the cached tax-year corpus
  • deterministic field-fill when usable AcroForm fields are present
  • overlay rendering onto those official PDFs using reportlab + pypdf as fallback
  • artifact manifest written to output/artifacts.json

Current rendered form support:

  • field-fill support for known mapped fillable forms
  • overlay generation for the current required-form set resolved by the return model

Current review rule:

  • field-filled artifacts are not automatically flagged for review
  • overlay-rendered artifacts are marked reviewRequired: true

Overlay coordinates are currently a fallback heuristic and are not treated as line-perfect authoritative field maps. Overlay output must be visually reviewed before any filing/export handoff.
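The choice between field-fill and overlay, together with the review flag, can be sketched as a pure decision function. The names `choose_render_mode`, `renderMode`, and the example field names are hypothetical; only the `reviewRequired: true` marking for overlay output comes from the rules above. Reading the actual AcroForm field names from a PDF (for example with pypdf) is left out of the sketch.

```python
def choose_render_mode(pdf_field_names, field_map):
    """Pick field-fill when every mapped AcroForm field exists, else overlay.

    pdf_field_names: field names found in the official PDF.
    field_map: return-model key -> AcroForm field name (assumed shape).
    """
    mapped = set(field_map.values())
    if mapped and mapped <= set(pdf_field_names):
        return {"renderMode": "field-fill", "reviewRequired": False}
    # Overlay coordinates are heuristic, so the artifact needs human review.
    return {"renderMode": "overlay", "reviewRequired": True}
```

For example, `choose_render_mode(["wages_field"], {"wages": "wages_field"})` selects field-fill, while an empty or incomplete mapping falls through to overlay.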

Preparation Workflow

Current prepare implementation:

  • loads case facts from extracted/facts.json
  • normalizes them into the current supported federal return model
  • preserves source provenance for normalized values
  • computes the current supported 1040 package
  • resolves required forms across the current supported subset
  • writes:
    • return/normalized-return.json
    • output/artifacts.json
    • reports/prepare-summary.json

Current supported calculation inputs:

  • filingStatus
  • spouse.fullName
  • dependents
  • wages
  • taxableInterest
  • businessIncome
  • capitalGainLoss
  • rentalIncome
  • federalWithholding
  • itemizedDeductions
  • hsaContribution
  • educationCredit
  • foreignTaxCredit
  • qualifiedBusinessIncome
  • traditionalIraBasis
  • additionalMedicareTax
  • netInvestmentIncomeTax
  • alternativeMinimumTax
  • additionalTaxPenalty
  • energyCredit
  • depreciationExpense
  • section1231GainLoss

E-file-ready Export

export-efile-ready writes:

  • output/efile-ready.json

Current export behavior:

  • draft-only
  • includes required forms
  • includes refund or balance due summary
  • includes attachment manifest
  • includes unresolved issues
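A draft payload with these parts might be assembled as below. This is a sketch under assumptions: the function name and every key name are hypothetical, and the real schema of output/efile-ready.json is not given here. The refund or balance due figure is simply total payments minus total tax.

```python
def build_efile_ready_payload(required_forms, total_tax, total_payments,
                              attachments, open_issues):
    """Assemble a draft-only export payload (hypothetical schema)."""
    difference = total_payments - total_tax
    return {
        "status": "draft",
        "requiredForms": list(required_forms),
        "summary": {
            # Exactly one of these is non-zero unless the return nets to zero.
            "refund": max(difference, 0),
            "balanceDue": max(-difference, 0),
        },
        "attachments": list(attachments),
        "unresolvedIssues": list(open_issues),
    }


payload = build_efile_ready_payload(
    required_forms=["1040", "Schedule C"],
    total_tax=5000,
    total_payments=6200,
    attachments=["output/artifacts.json"],
    open_issues=[],
)
```

Carrying unresolved issues inside the payload keeps the draft honest about what still blocks a real filing.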

Review Workflow

Current review implementation:

  • recomputes the return from current case facts
  • compares stored normalized return values to recomputed values
  • flags source-fact mismatches for key income fields
  • flags likely omitted income when document-extracted facts support an amount the stored return omits
  • checks whether required rendered artifacts are present
  • flags high-complexity forms for specialist follow-up
  • flags overlay-rendered artifacts as requiring human review
  • sorts findings by severity
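Two of the steps above, comparing stored values against recomputed ones and ordering findings by severity, can be sketched as follows. The severity labels, finding keys, and function names are hypothetical; the document does not specify them.

```python
# Assumed severity scale; lower number sorts first.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def compare_returns(stored, recomputed):
    """Emit one finding per field whose stored value differs from the recomputed one."""
    findings = []
    for key in sorted(set(stored) | set(recomputed)):
        if stored.get(key) != recomputed.get(key):
            findings.append({
                "field": key,
                "stored": stored.get(key),
                "recomputed": recomputed.get(key),
                "severity": "high",
            })
    return findings


def sort_findings(findings):
    """Order findings most severe first; unknown severities sort last."""
    return sorted(
        findings,
        key=lambda finding: SEVERITY_ORDER.get(finding["severity"], len(SEVERITY_ORDER)),
    )
```

Because `sorted` is stable, findings with the same severity keep their original relative order.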

Current render modes:

  • --style conversation
  • --style memo

Scope Rules

  • U.S. federal individual returns only in v1
  • official IRS artifacts are the target output for compiled forms
  • conflicting facts must stop the workflow for user resolution

Authority Ranking

Current authority classes are ranked to preserve source hierarchy:

  • IRS forms
  • IRS instructions
  • IRS publications
  • IRS FAQs
  • Internal Revenue Code
  • Treasury regulations
  • other primary authority

Later research and review flows should consume this ranking rather than inventing their own.
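One way for later flows to share a single ranking is an ordered enumeration. The sketch below assumes the numeric ranks follow the order of the list above; the class name, member names, numeric values, and the `authorityRank` key are all hypothetical.

```python
from enum import IntEnum


class AuthorityRank(IntEnum):
    """Authority classes, numbered in the order they are listed above (assumed)."""
    IRS_FORM = 1
    IRS_INSTRUCTIONS = 2
    IRS_PUBLICATION = 3
    IRS_FAQ = 4
    INTERNAL_REVENUE_CODE = 5
    TREASURY_REGULATION = 6
    OTHER_PRIMARY_AUTHORITY = 7


def order_sources(sources):
    """Sort source records by their shared authority rank."""
    return sorted(sources, key=lambda source: source["authorityRank"])
```

Importing one enumeration everywhere avoids each workflow defining its own, possibly inconsistent, ordering.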