asset-history — Design Spec
Section titled “asset-history — Design Spec”Date: 2026-03-23
Status: Approved
Project location: ~/projects/asset-history/ (standalone — not in monorepo)
Purpose
Section titled “Purpose”asset-history is a standalone Python project that fetches, stores, maintains, and reconciles historical end-of-day (EOD) financial asset data — prices and dividends — for equity indices and ETFs. It is generic financial infrastructure, not SD App-specific. Multiple future projects will consume it.
Primary consumers (initially):
- SD App historical analysis tool (
packages/sd-math/tools/generate_historical.py) — feeds pre-calculated JSON snapshots for the “If I had used $MART DEBT” renderer - Any future FSS financial application requiring historical return data
Non-Goals
Section titled “Non-Goals”- No math or analysis —
asset-historyis a data layer only - No browser/PWA component — all consumers query via Python import or CLI export
- No real-time or intraday data — EOD only
- No user authentication or multi-user support
Project Structure
Section titled “Project Structure”~/projects/asset-history/ src/asset_history/ sources/ base.py # Abstract base class all source adapters implement yahoo.py # yfinance — primary source for all tickers bank_of_canada.py # CA price validation (valet API, government-independent) tiingo.py # Dividend cross-reference + US validation (free tier) fred.py # US price validation (FRED API, government-independent) db.py # SQLite schema, CRUD, migrations reconcile.py # Conflict detection, flagging, report generation notify.py # Thin wrapper around notify_manager utility cli.py # CLI entrypoints via Typer tests/ fixtures/ xic_sample.json # Synthetic price+dividend data for unit tests (5 years) spy_sample.json # Same for US test_db.py test_reconcile.py test_sources_mock.py test_cli.py test_sources_live.py # @pytest.mark.slow — requires network + API keys data/ asset_history.db # gitignored — generated artifact conflicts.log # gitignored — generated artifact pyproject.toml .env .env.example .gitignore README.mdPython Environment
Section titled “Python Environment”pyproject.toml dependencies:
[project]requires-python = ">=3.11"
dependencies = [ "yfinance>=0.2", "requests>=2.31", "typer>=0.12", "python-dotenv>=1.0", "pyyaml>=6.0", # required by notify_manager]
[project.optional-dependencies]dev = [ "pytest>=8.0", "pytest-mock>=3.12",]
[project.scripts]asset-history = "asset_history.cli:app"Virtual environment: python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"
Database Schema
Section titled “Database Schema”tickers — registered assets
Section titled “tickers — registered assets”CREATE TABLE tickers ( ticker TEXT PRIMARY KEY, name TEXT, country TEXT, -- 'ca' | 'us' asset_type TEXT, -- 'equity-index' | 'etf' | ... sources TEXT, -- JSON array: ["yahoo","bank_of_canada"] added_at TIMESTAMP NOT NULL);price_sources — raw source data (append-only audit trail)
Section titled “price_sources — raw source data (append-only audit trail)”CREATE TABLE price_sources ( ticker TEXT NOT NULL, date DATE NOT NULL, source TEXT NOT NULL, -- 'yahoo' | 'bank_of_canada' | 'tiingo' | 'fred' close REAL, adj_close REAL, -- split/dividend-adjusted closing price dividend_amount REAL DEFAULT 0.0, fetched_at TIMESTAMP NOT NULL, PRIMARY KEY (ticker, date, source));prices — reconciled SSoT (what all consumers query)
Section titled “prices — reconciled SSoT (what all consumers query)”CREATE TABLE prices ( ticker TEXT NOT NULL, date DATE NOT NULL, close REAL NOT NULL, adj_close REAL NOT NULL, dividend_amount REAL NOT NULL DEFAULT 0.0, primary_source TEXT NOT NULL, confidence TEXT NOT NULL CHECK (confidence IN ('verified', 'single-source', 'conflict')), PRIMARY KEY (ticker, date));Key decisions:
adj_closeis the series used for total-return analysis (accounts for splits + dividend reinvestment).dividend_amountis the raw distribution — needed separately for tax calculations (dividend income vs. capital gains).price_sourcesis append-only — never deleted, full audit trail.confidence = 'conflict'rows are written topricesbut flagged; consumers filter or warn on them.- Both
data/asset_history.dbanddata/conflicts.logare gitignored.
Data Sources & Independence
Section titled “Data Sources & Independence”| Source | Data lineage | CA data | US data | Dividends | Date range |
|---|---|---|---|---|---|
Yahoo Finance (yfinance) | Thomson Reuters feed | XIC.TO ✓ | SPY ✓ | ✓ | XIC ~2001, SPY ~1993 |
| Bank of Canada (valet API) | Government — fully independent | TSX Composite ✓ | ✗ | ✗ | 1977-present |
| Tiingo (free tier) | Independent feed agreements | Limited | SPY ✓ | ✓ (excellent) | SPY ~1993 |
| FRED (St. Louis Fed API) | US Government — fully independent | ✗ | S&P 500 ✓ | ✗ | 2012-present (daily) |
Source pairings:
XIC.TO: Yahoo Finance (primary) + Bank of Canada (independent price validation)SPY: Yahoo Finance (primary) + Tiingo (dividend cross-reference) + FRED (price validation)
Excluded: Stooq — data lineage opaque, may proxy Yahoo Finance for some tickers.
Specific API series
Section titled “Specific API series”Bank of Canada — TSX Composite:
- Base URL:
https://www.bankofcanada.ca/valet/ - Pre-condition for Chunk 2 (implementer action): Before writing
bank_of_canada.py, run:curl "https://www.bankofcanada.ca/valet/lists/series/json" | python3 -m json.tool | grep -i -A2 "tsx\|equity\|stock"Identify the daily TSX Composite series ID (e.g.FXVECTOR-style code), hardcode it in the adapter, and document it in a comment. - Observations endpoint:
GET /valet/observations/{seriesName}/json?start_date=YYYY-MM-DD - Returns daily index level (not ETF price) — used for return-rate cross-check, not raw price matching
- If BoC does not offer a suitable daily equity series, replace with Stooq
^GSPTSEafter confirming Stooq’s data lineage is independent of Yahoo Finance for this specific index
FRED — S&P 500:
- Series ID:
SP500(daily close, 2012-present) - API endpoint:
https://api.stlouisfed.org/fred/series/observations?series_id=SP500&api_key={key}&file_type=json - Free API key: register at
https://fred.stlouisfed.org/docs/api/api_key.html - Returns daily index level — used for return-rate cross-check against SPY adj_close
Tiingo — SPY:
- API endpoint:
https://api.tiingo.com/tiingo/daily/{ticker}/prices?startDate=YYYY-MM-DD&token={key} - Free tier: 500 requests/day, full SPY history to 1993 available
- Includes
adjCloseanddivCashfields — used for dividend cross-reference - Rate limit handling: on HTTP 429, wait 60 seconds and retry once; if still failing, log and skip (do not crash)
- API key: free registration at
https://www.tiingo.com/
Source Adapter Interface (base.py)
Section titled “Source Adapter Interface (base.py)”All source adapters implement this interface:
from abc import ABC, abstractmethodfrom dataclasses import dataclassfrom datetime import date
@dataclassclass PriceRecord: ticker: str date: date close: float | None adj_close: float | None dividend_amount: float
class BaseSource(ABC): source_name: str # class attribute — 'yahoo' | 'bank_of_canada' | 'tiingo' | 'fred'
@abstractmethod def fetch_full(self, ticker: str) -> list[PriceRecord]: """Fetch all available history for ticker.""" ...
@abstractmethod def fetch_since(self, ticker: str, since: date) -> list[PriceRecord]: """Fetch records from `since` date to today (inclusive). Used for incremental updates.""" ...Note: close and adj_close may be None for government index sources (BoC, FRED) that provide index levels rather than ETF prices. Reconciliation handles None fields by skipping raw price comparison and using only return-rate comparison for those sources.
Ticker-to-series mapping
Section titled “Ticker-to-series mapping”Government sources serve index series, not ETF tickers. Each adapter hardcodes a mapping:
TICKER_TO_BOC_SERIES: dict[str, str] = { "XIC.TO": "<series_id>", # filled in during Chunk 2 pre-condition step above}
# fred.pyTICKER_TO_FRED_SERIES: dict[str, str] = { "SPY": "SP500", "IVV": "SP500",}If a ticker is not in the mapping, fetch_full() and fetch_since() raise ValueError(f"No {source_name} series mapping for ticker {ticker}"). Add new mappings as new tickers are registered.
Conflict Detection & Reconciliation
Section titled “Conflict Detection & Reconciliation”asset-history reconcile <ticker>
Section titled “asset-history reconcile <ticker>”Reads all rows from price_sources for the ticker. For each date where multiple sources have data:
Price check (adj_close available from both sources):
- Compute 1-day return for each source:
r = adj_close[t] / adj_close[t-1] - 1 - Skip the first row in each series (no prior day available — silently omit, no error)
- Flag if
|r_source_a - r_source_b| > 0.005(0.5%) on any single day
Return-rate check (when one source provides index level, not ETF price):
- Compute 1-day return from each series independently
- Flag if single-day returns diverge by > 0.5%
- Additionally compute rolling 30-day cumulative return:
prod(1+r) - 1for each 30-day window - Flag if cumulative returns diverge by > 1.0%
- Window slides one day at a time; comparison uses the same calendar dates in both series
Dividend check:
- Flag any date where one source records
dividend_amount > 0and another records0 - Flag if amounts differ by > 1% (rounding tolerance)
On conflict detected:
- Write
confidence = 'conflict'topricesrow (primary_source = the primary source for that ticker) - Append to
data/conflicts.log: timestamp, ticker, date, conflict type, values from each source - Call
notify.notify_conflicts(ticker, conflict_descriptions)
On no conflict:
- Write
confidence = 'verified'(multiple sources agree) or'single-source'(only one source has data for that date)
asset-history resolve <ticker> <date> --accept <source>
Section titled “asset-history resolve <ticker> <date> --accept <source>”- Look up the accepted source’s row in
price_sourcesfor (ticker, date) - Update
pricesrow: set values from accepted source, setconfidence = 'verified', setprimary_source = accepted_source - Append resolution record to
conflicts.log: timestamp, ticker, date, accepted source, resolved by human
Incremental Update Logic (asset-history update)
Section titled “Incremental Update Logic (asset-history update)”For each registered ticker, for each of its configured sources:
- Query
MAX(date)inprice_sourcesWHERE ticker = t AND source = s - If no rows exist: run
fetch_full(ticker)(first-time backfill) - If rows exist: run
fetch_since(ticker, max_date + 1 day) - If gap > 5 business days between max_date and today: log a WARNING (do not fail — gaps can occur on holidays or data outages)
- Insert new rows into
price_sources - Run
reconcile(ticker)automatically for any dates with new data - Print summary:
XIC.TO: +N new rows, K conflicts(or0 conflicts)
Export Format (asset-history export)
Section titled “Export Format (asset-history export)”asset-history export XIC.TO --format json --since 2015-01-01 --out xic-history.jsonOutput JSON schema:
{ "ticker": "XIC.TO", "country": "ca", "exported_at": "2026-03-23", "since": "2015-01-01", "rows": [ { "date": "2015-01-02", "adj_close": 22.14, "dividend_amount": 0.0, "confidence": "verified" } ]}- Only rows with
confidence != 'conflict'are included by default --include-conflictsflag includes conflict rows; they appear in the same structure with"confidence": "conflict"— no additional field needed, theconfidencevalue is the marker- Date format: ISO 8601 (
YYYY-MM-DD) - Rows sorted ascending by date
Notification Integration
Section titled “Notification Integration”notify.py wraps the existing notify_manager utility:
import osimport sys
def _get_send_alert(): """Import notify_manager.send_alert via sys.path. Returns None if unavailable.""" nm_path = os.getenv("NOTIFY_MANAGER_PATH", "/mnt/d/FSS/Software/Utils/PythonUtils") sys.path.insert(0, nm_path) try: from notify_manager.notify_manager import send_alert return send_alert except ImportError: return None
def notify_conflicts(ticker: str, conflicts: list[str]) -> None: send_alert = _get_send_alert() if send_alert: send_alert(tool_name="asset-history", errors=conflicts, level="WARNING") else: # Graceful fallback — always prints, even without notify_manager print(f"[asset-history] WARNING: {len(conflicts)} conflict(s) for {ticker}") for c in conflicts: print(f" {c}").env configuration:
NOTIFY_MANAGER_PATH=/mnt/d/FSS/Software/Utils/PythonUtilsEMAIL_PASSWORD=<same value as notify_manager's .env>TIINGO_API_KEY=<from tiingo.com free registration>FRED_API_KEY=<from fred.stlouisfed.org free registration>EMAIL_PASSWORD is shared — same credential notify_manager already uses. python-dotenv loads it automatically when asset-history runs.
CLI Commands (full reference)
Section titled “CLI Commands (full reference)”# Register a ticker for trackingasset-history add XIC.TO --name "iShares TSX Composite" --country ca --sources yahoo,bank_of_canada
# Initial full backfill from a specific source# Writes all fetched rows into price_sources (UPSERT — re-running updates fetched_at)# Does NOT auto-reconcile; run `reconcile` separately after all sources are fetchedasset-history fetch XIC.TO --source yahooasset-history fetch XIC.TO --source bank_of_canada
# Reconcile sources → write to prices table, flag conflictsasset-history reconcile XIC.TO
# Weekly incremental update (all tickers, all sources, auto-reconcile)asset-history update
# Show DB health: coverage dates, conflict count, last fetch per ticker/sourceasset-history status
# List all unresolved conflictsasset-history conflicts
# Resolve a conflict: accept one source's value as canonicalasset-history resolve XIC.TO 2019-03-15 --accept yahoo
# Export clean data for downstream consumersasset-history export XIC.TO --format json --since 2015-01-01 --out xic-history.jsonasset-history export XIC.TO --format json --since 2015-01-01 --include-conflicts --out xic-full.jsonsd-app Integration
Section titled “sd-app Integration”asset-history has no dependency on sd-app. The data flows one-way:
asset-history DB ↓ (export or direct import)packages/sd-math/tools/generate_historical.py ← Phase 3, separate task ↓apps/sd-app/static/snapshots/historical-ca.json ↓SvelteKit renderer /ca/en/historical/The existing sd_math.calculate() accepts returns: list[float] — feeding it a historical annual return series requires no changes to sd-math core. ReturnProfile and TaxProfile already handle tax-correct analysis.
Testing
Section titled “Testing”pytest # unit tests only (no network required)pytest -m "not slow" # same (explicit)pytest -m slow # integration tests (network + API keys required)| Test file | What it covers |
|---|---|
test_db.py | Schema creation, CRUD, constraint enforcement, confidence CHECK constraint |
test_reconcile.py | Conflict detection with synthetic xic_sample.json / spy_sample.json data |
test_sources_mock.py | Each adapter’s fetch_full() and fetch_since() — mocked HTTP, verifies parsing |
test_cli.py | All CLI commands against a temp SQLite DB (no network) |
test_sources_live.py | Fetches 30 days of real data per source; verifies schema compliance only, not specific values |
Phased Delivery (Cursor Task Chunks)
Section titled “Phased Delivery (Cursor Task Chunks)”Chunk 1 — Scaffold + Schema (no network required)
Section titled “Chunk 1 — Scaffold + Schema (no network required)”Deliverables:
pyproject.tomlwith dependencies listed abovesrc/asset_history/__init__.py,db.py— full schema creation,create_tables(), CRUD helpers for all three tablestests/test_db.py— tests for: table creation, insert/querytickers, insert/queryprice_sources, insert/queryprices,confidenceCHECK constraint enforcementtickerstable pre-populated with XIC.TO and SPY via adb.seed_tickers()helper:XIC.TO: country=ca, asset_type=etf, sources=["yahoo","bank_of_canada"]SPY: country=us, asset_type=etf, sources=["yahoo","tiingo","fred"]
.gitignore—data/directory excluded,.envexcluded.env.example— all four env vars documented
Gate: pytest tests/test_db.py passes with zero network access.
Chunk 2 — Source Adapters + Reconcile + CLI + Notify
Section titled “Chunk 2 — Source Adapters + Reconcile + CLI + Notify”Deliverables:
sources/base.py—PriceRecorddataclass +BaseSourceABC as specified abovesources/yahoo.py— implementsfetch_full()andfetch_since()viayfinancesources/bank_of_canada.py— implements return-rate validation;adj_close = None,close = index_levelsources/tiingo.py— implements dividend cross-reference; handles 429 retrysources/fred.py— implements return-rate validation;adj_close = None,close = index_levelreconcile.py— full conflict detection algorithm as specified; writes toprices, appends todata/conflicts.lognotify.py—notify_managerwrapper with graceful fallbackcli.py— all commands wired via Typer:add,fetch,reconcile,update,status,conflicts,resolve,exporttests/test_sources_mock.py— mock HTTP, verifyPriceRecordparsing for each adaptertests/test_reconcile.py— synthetic conflict scenarios using fixture datatests/test_cli.py— all commands against temp DBtests/test_sources_live.py— marked@pytest.mark.slow
Gate: asset-history fetch XIC.TO --source yahoo completes end-to-end; asset-history status shows correct coverage; asset-history update runs incrementally; pytest -m "not slow" passes.
Key Decisions
Section titled “Key Decisions”| Decision | Choice | Rationale |
|---|---|---|
| Project location | ~/projects/asset-history/ (standalone) | Generic infrastructure — not SD App-specific; multiple future consumers |
| Storage | SQLite | Scales to dozens of tickers; richer querying than JSON; built-in Python |
| DB gitignore | Yes — data/ excluded | Binary diffs meaningless; JSON exports committed per consuming project |
| Source strategy | Multi-source, independent lineage required | Government sources (BoC, FRED) for CA/US validation; Stooq excluded (opaque lineage) |
| Conflict resolution | Manual — flag + notify, human resolves | SSoT discipline; no silent auto-resolution |
| Notification | Delegates to existing notify_manager utility | Avoids duplicating notification infrastructure |
| Return type stored | adj_close + explicit dividend_amount | adj_close for total return; dividend_amount for tax analysis |
| Update cadence | Weekly incremental | Daily not critical; weekly sufficient for snapshot use case |
| CLI framework | Typer | Clean ergonomics, type-safe, auto-generates --help |