Skip to content

Date: 2026-03-23 Status: Approved Project location: ~/projects/asset-history/ (standalone — not in monorepo)


asset-history is a standalone Python project that fetches, stores, maintains, and reconciles historical end-of-day (EOD) financial asset data — prices and dividends — for equity indices and ETFs. It is generic financial infrastructure, not SD App-specific. Multiple future projects will consume it.

Primary consumers (initially):

  • SD App historical analysis tool (packages/sd-math/tools/generate_historical.py) — feeds pre-calculated JSON snapshots for the “If I had used $MART DEBT” renderer
  • Any future FSS financial application requiring historical return data

  • No math or analysis — asset-history is a data layer only
  • No browser/PWA component — all consumers query via Python import or CLI export
  • No real-time or intraday data — EOD only
  • No user authentication or multi-user support

~/projects/asset-history/
src/asset_history/
sources/
base.py # Abstract base class all source adapters implement
yahoo.py # yfinance — primary source for all tickers
bank_of_canada.py # CA price validation (valet API, government-independent)
tiingo.py # Dividend cross-reference + US validation (free tier)
fred.py # US price validation (FRED API, government-independent)
db.py # SQLite schema, CRUD, migrations
reconcile.py # Conflict detection, flagging, report generation
notify.py # Thin wrapper around notify_manager utility
cli.py # CLI entrypoints via Typer
tests/
fixtures/
xic_sample.json # Synthetic price+dividend data for unit tests (5 years)
spy_sample.json # Same for US
test_db.py
test_reconcile.py
test_sources_mock.py
test_cli.py
test_sources_live.py # @pytest.mark.slow — requires network + API keys
data/
asset_history.db # gitignored — generated artifact
conflicts.log # gitignored — generated artifact
pyproject.toml
.env
.env.example
.gitignore
README.md

pyproject.toml dependencies:

[project]
requires-python = ">=3.11"
dependencies = [
"yfinance>=0.2",
"requests>=2.31",
"typer>=0.12",
"python-dotenv>=1.0",
"pyyaml>=6.0", # required by notify_manager
]
[project.optional-dependencies]
dev = [
"pytest>=8.0",
"pytest-mock>=3.12",
]
[project.scripts]
asset-history = "asset_history.cli:app"

Virtual environment: python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"


CREATE TABLE tickers (
ticker TEXT PRIMARY KEY,
name TEXT,
country TEXT, -- 'ca' | 'us'
asset_type TEXT, -- 'equity-index' | 'etf' | ...
sources TEXT, -- JSON array: ["yahoo","bank_of_canada"]
added_at TIMESTAMP NOT NULL
);

price_sources — raw source data (append-only audit trail)

Section titled “price_sources — raw source data (append-only audit trail)”
CREATE TABLE price_sources (
ticker TEXT NOT NULL,
date DATE NOT NULL,
source TEXT NOT NULL, -- 'yahoo' | 'bank_of_canada' | 'tiingo' | 'fred'
close REAL,
adj_close REAL, -- split/dividend-adjusted closing price
dividend_amount REAL DEFAULT 0.0,
fetched_at TIMESTAMP NOT NULL,
PRIMARY KEY (ticker, date, source)
);

prices — reconciled SSoT (what all consumers query)

Section titled “prices — reconciled SSoT (what all consumers query)”
CREATE TABLE prices (
ticker TEXT NOT NULL,
date DATE NOT NULL,
close REAL NOT NULL,
adj_close REAL NOT NULL,
dividend_amount REAL NOT NULL DEFAULT 0.0,
primary_source TEXT NOT NULL,
confidence TEXT NOT NULL
CHECK (confidence IN ('verified', 'single-source', 'conflict')),
PRIMARY KEY (ticker, date)
);

Key decisions:

  • adj_close is the series used for total-return analysis (accounts for splits + dividend reinvestment). dividend_amount is the raw distribution — needed separately for tax calculations (dividend income vs. capital gains).
  • price_sources is append-only — never deleted, full audit trail.
  • confidence = 'conflict' rows are written to prices but flagged; consumers filter or warn on them.
  • Both data/asset_history.db and data/conflicts.log are gitignored.

SourceData lineageCA dataUS dataDividendsDate range
Yahoo Finance (yfinance)Thomson Reuters feedXIC.TO ✓SPY ✓XIC ~2001, SPY ~1993
Bank of Canada (valet API)Government — fully independentTSX Composite ✓1977-present
Tiingo (free tier)Independent feed agreementsLimitedSPY ✓✓ (excellent)SPY ~1993
FRED (St. Louis Fed API)US Government — fully independentS&P 500 ✓2012-present (daily)

Source pairings:

  • XIC.TO: Yahoo Finance (primary) + Bank of Canada (independent price validation)
  • SPY: Yahoo Finance (primary) + Tiingo (dividend cross-reference) + FRED (price validation)

Excluded: Stooq — data lineage opaque, may proxy Yahoo Finance for some tickers.

Bank of Canada — TSX Composite:

  • Base URL: https://www.bankofcanada.ca/valet/
  • Pre-condition for Chunk 2 (implementer action): Before writing bank_of_canada.py, run: curl "https://www.bankofcanada.ca/valet/lists/series/json" | python3 -m json.tool | grep -i -A2 "tsx\|equity\|stock" Identify the daily TSX Composite series ID (e.g. FXVECTOR-style code), hardcode it in the adapter, and document it in a comment.
  • Observations endpoint: GET /valet/observations/{seriesName}/json?start_date=YYYY-MM-DD
  • Returns daily index level (not ETF price) — used for return-rate cross-check, not raw price matching
  • If BoC does not offer a suitable daily equity series, replace with Stooq ^GSPTSE after confirming Stooq’s data lineage is independent of Yahoo Finance for this specific index

FRED — S&P 500:

  • Series ID: SP500 (daily close, 2012-present)
  • API endpoint: https://api.stlouisfed.org/fred/series/observations?series_id=SP500&api_key={key}&file_type=json
  • Free API key: register at https://fred.stlouisfed.org/docs/api/api_key.html
  • Returns daily index level — used for return-rate cross-check against SPY adj_close

Tiingo — SPY:

  • API endpoint: https://api.tiingo.com/tiingo/daily/{ticker}/prices?startDate=YYYY-MM-DD&token={key}
  • Free tier: 500 requests/day, full SPY history to 1993 available
  • Includes adjClose and divCash fields — used for dividend cross-reference
  • Rate limit handling: on HTTP 429, wait 60 seconds and retry once; if still failing, log and skip (do not crash)
  • API key: free registration at https://www.tiingo.com/

All source adapters implement this interface:

from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import date
@dataclass
class PriceRecord:
ticker: str
date: date
close: float | None
adj_close: float | None
dividend_amount: float
class BaseSource(ABC):
source_name: str # class attribute — 'yahoo' | 'bank_of_canada' | 'tiingo' | 'fred'
@abstractmethod
def fetch_full(self, ticker: str) -> list[PriceRecord]:
"""Fetch all available history for ticker."""
...
@abstractmethod
def fetch_since(self, ticker: str, since: date) -> list[PriceRecord]:
"""Fetch records from `since` date to today (inclusive). Used for incremental updates."""
...

Note: close and adj_close may be None for government index sources (BoC, FRED) that provide index levels rather than ETF prices. Reconciliation handles None fields by skipping raw price comparison and using only return-rate comparison for those sources.

Government sources serve index series, not ETF tickers. Each adapter hardcodes a mapping:

bank_of_canada.py
TICKER_TO_BOC_SERIES: dict[str, str] = {
"XIC.TO": "<series_id>", # filled in during Chunk 2 pre-condition step above
}
# fred.py
TICKER_TO_FRED_SERIES: dict[str, str] = {
"SPY": "SP500",
"IVV": "SP500",
}

If a ticker is not in the mapping, fetch_full() and fetch_since() raise ValueError(f"No {source_name} series mapping for ticker {ticker}"). Add new mappings as new tickers are registered.


Reads all rows from price_sources for the ticker. For each date where multiple sources have data:

Price check (adj_close available from both sources):

  • Compute 1-day return for each source: r = adj_close[t] / adj_close[t-1] - 1
  • Skip the first row in each series (no prior day available — silently omit, no error)
  • Flag if |r_source_a - r_source_b| > 0.005 (0.5%) on any single day

Return-rate check (when one source provides index level, not ETF price):

  • Compute 1-day return from each series independently
  • Flag if single-day returns diverge by > 0.5%
  • Additionally compute rolling 30-day cumulative return: prod(1+r) - 1 for each 30-day window
  • Flag if cumulative returns diverge by > 1.0%
  • Window slides one day at a time; comparison uses the same calendar dates in both series

Dividend check:

  • Flag any date where one source records dividend_amount > 0 and another records 0
  • Flag if amounts differ by > 1% (rounding tolerance)

On conflict detected:

  1. Write confidence = 'conflict' to prices row (primary_source = the primary source for that ticker)
  2. Append to data/conflicts.log: timestamp, ticker, date, conflict type, values from each source
  3. Call notify.notify_conflicts(ticker, conflict_descriptions)

On no conflict:

  • Write confidence = 'verified' (multiple sources agree) or 'single-source' (only one source has data for that date)

asset-history resolve <ticker> <date> --accept <source>

Section titled “asset-history resolve <ticker> <date> --accept <source>”
  1. Look up the accepted source’s row in price_sources for (ticker, date)
  2. Update prices row: set values from accepted source, set confidence = 'verified', set primary_source = accepted_source
  3. Append resolution record to conflicts.log: timestamp, ticker, date, accepted source, resolved by human

Incremental Update Logic (asset-history update)

Section titled “Incremental Update Logic (asset-history update)”

For each registered ticker, for each of its configured sources:

  1. Query MAX(date) in price_sources WHERE ticker = t AND source = s
  2. If no rows exist: run fetch_full(ticker) (first-time backfill)
  3. If rows exist: run fetch_since(ticker, max_date + 1 day)
  4. If gap > 5 business days between max_date and today: log a WARNING (do not fail — gaps can occur on holidays or data outages)
  5. Insert new rows into price_sources
  6. Run reconcile(ticker) automatically for any dates with new data
  7. Print summary: XIC.TO: +N new rows, K conflicts (or 0 conflicts)

Terminal window
asset-history export XIC.TO --format json --since 2015-01-01 --out xic-history.json

Output JSON schema:

{
"ticker": "XIC.TO",
"country": "ca",
"exported_at": "2026-03-23",
"since": "2015-01-01",
"rows": [
{
"date": "2015-01-02",
"adj_close": 22.14,
"dividend_amount": 0.0,
"confidence": "verified"
}
]
}
  • Only rows with confidence != 'conflict' are included by default
  • --include-conflicts flag includes conflict rows; they appear in the same structure with "confidence": "conflict" — no additional field needed, the confidence value is the marker
  • Date format: ISO 8601 (YYYY-MM-DD)
  • Rows sorted ascending by date

notify.py wraps the existing notify_manager utility:

src/asset_history/notify.py
import os
import sys
def _get_send_alert():
"""Import notify_manager.send_alert via sys.path. Returns None if unavailable."""
nm_path = os.getenv("NOTIFY_MANAGER_PATH", "/mnt/d/FSS/Software/Utils/PythonUtils")
sys.path.insert(0, nm_path)
try:
from notify_manager.notify_manager import send_alert
return send_alert
except ImportError:
return None
def notify_conflicts(ticker: str, conflicts: list[str]) -> None:
send_alert = _get_send_alert()
if send_alert:
send_alert(tool_name="asset-history", errors=conflicts, level="WARNING")
else:
# Graceful fallback — always prints, even without notify_manager
print(f"[asset-history] WARNING: {len(conflicts)} conflict(s) for {ticker}")
for c in conflicts:
print(f" {c}")

.env configuration:

NOTIFY_MANAGER_PATH=/mnt/d/FSS/Software/Utils/PythonUtils
EMAIL_PASSWORD=<same value as notify_manager's .env>
TIINGO_API_KEY=<from tiingo.com free registration>
FRED_API_KEY=<from fred.stlouisfed.org free registration>

EMAIL_PASSWORD is shared — same credential notify_manager already uses. python-dotenv loads it automatically when asset-history runs.


Terminal window
# Register a ticker for tracking
asset-history add XIC.TO --name "iShares TSX Composite" --country ca --sources yahoo,bank_of_canada
# Initial full backfill from a specific source
# Writes all fetched rows into price_sources (UPSERT — re-running updates fetched_at)
# Does NOT auto-reconcile; run `reconcile` separately after all sources are fetched
asset-history fetch XIC.TO --source yahoo
asset-history fetch XIC.TO --source bank_of_canada
# Reconcile sources → write to prices table, flag conflicts
asset-history reconcile XIC.TO
# Weekly incremental update (all tickers, all sources, auto-reconcile)
asset-history update
# Show DB health: coverage dates, conflict count, last fetch per ticker/source
asset-history status
# List all unresolved conflicts
asset-history conflicts
# Resolve a conflict: accept one source's value as canonical
asset-history resolve XIC.TO 2019-03-15 --accept yahoo
# Export clean data for downstream consumers
asset-history export XIC.TO --format json --since 2015-01-01 --out xic-history.json
asset-history export XIC.TO --format json --since 2015-01-01 --include-conflicts --out xic-full.json

asset-history has no dependency on sd-app. The data flows one-way:

asset-history DB
↓ (export or direct import)
packages/sd-math/tools/generate_historical.py ← Phase 3, separate task
apps/sd-app/static/snapshots/historical-ca.json
SvelteKit renderer /ca/en/historical/

The existing sd_math.calculate() accepts returns: list[float] — feeding it a historical annual return series requires no changes to sd-math core. ReturnProfile and TaxProfile already handle tax-correct analysis.


Terminal window
pytest # unit tests only (no network required)
pytest -m "not slow" # same (explicit)
pytest -m slow # integration tests (network + API keys required)
Test fileWhat it covers
test_db.pySchema creation, CRUD, constraint enforcement, confidence CHECK constraint
test_reconcile.pyConflict detection with synthetic xic_sample.json / spy_sample.json data
test_sources_mock.pyEach adapter’s fetch_full() and fetch_since() — mocked HTTP, verifies parsing
test_cli.pyAll CLI commands against a temp SQLite DB (no network)
test_sources_live.pyFetches 30 days of real data per source; verifies schema compliance only, not specific values

Chunk 1 — Scaffold + Schema (no network required)

Section titled “Chunk 1 — Scaffold + Schema (no network required)”

Deliverables:

  • pyproject.toml with dependencies listed above
  • src/asset_history/__init__.py, db.py — full schema creation, create_tables(), CRUD helpers for all three tables
  • tests/test_db.py — tests for: table creation, insert/query tickers, insert/query price_sources, insert/query prices, confidence CHECK constraint enforcement
  • tickers table pre-populated with XIC.TO and SPY via a db.seed_tickers() helper:
    • XIC.TO: country=ca, asset_type=etf, sources=["yahoo","bank_of_canada"]
    • SPY: country=us, asset_type=etf, sources=["yahoo","tiingo","fred"]
  • .gitignoredata/ directory excluded, .env excluded
  • .env.example — all four env vars documented

Gate: pytest tests/test_db.py passes with zero network access.

Chunk 2 — Source Adapters + Reconcile + CLI + Notify

Section titled “Chunk 2 — Source Adapters + Reconcile + CLI + Notify”

Deliverables:

  • sources/base.pyPriceRecord dataclass + BaseSource ABC as specified above
  • sources/yahoo.py — implements fetch_full() and fetch_since() via yfinance
  • sources/bank_of_canada.py — implements return-rate validation; adj_close = None, close = index_level
  • sources/tiingo.py — implements dividend cross-reference; handles 429 retry
  • sources/fred.py — implements return-rate validation; adj_close = None, close = index_level
  • reconcile.py — full conflict detection algorithm as specified; writes to prices, appends to data/conflicts.log
  • notify.pynotify_manager wrapper with graceful fallback
  • cli.py — all commands wired via Typer: add, fetch, reconcile, update, status, conflicts, resolve, export
  • tests/test_sources_mock.py — mock HTTP, verify PriceRecord parsing for each adapter
  • tests/test_reconcile.py — synthetic conflict scenarios using fixture data
  • tests/test_cli.py — all commands against temp DB
  • tests/test_sources_live.py — marked @pytest.mark.slow

Gate: asset-history fetch XIC.TO --source yahoo completes end-to-end; asset-history status shows correct coverage; asset-history update runs incrementally; pytest -m "not slow" passes.


DecisionChoiceRationale
Project location~/projects/asset-history/ (standalone)Generic infrastructure — not SD App-specific; multiple future consumers
StorageSQLiteScales to dozens of tickers; richer querying than JSON; built-in Python
DB gitignoreYes — data/ excludedBinary diffs meaningless; JSON exports committed per consuming project
Source strategyMulti-source, independent lineage requiredGovernment sources (BoC, FRED) for CA/US validation; Stooq excluded (opaque lineage)
Conflict resolutionManual — flag + notify, human resolvesSSoT discipline; no silent auto-resolution
NotificationDelegates to existing notify_manager utilityAvoids duplicating notification infrastructure
Return type storedadj_close + explicit dividend_amountadj_close for total return; dividend_amount for tax analysis
Update cadenceWeekly incrementalDaily not critical; weekly sufficient for snapshot use case
CLI frameworkTyperClean ergonomics, type-safe, auto-generates --help