Cron Job Monitor
Section titled “Cron Job Monitor”Decision
Section titled “Decision”Built Option B (custom Python utility) over Option A (healthchecks.io SaaS). Rationale: WSL always runs, external dependency concern, notify_manager already in stack, richer diagnostics (exit code + staleness vs. just “missed ping”).
What Was Built
Section titled “What Was Built”/home/ta/utils/system/job_monitor/ — standalone uv Python package, git repo initialized.
| File | Purpose |
|---|---|
heartbeat.sh | Called by each cron job: ; heartbeat.sh <name> $? |
src/job_monitor/main.py | Checks staleness + exit codes, fires CRITICAL alert, writes status.json |
config.yaml | 7 jobs with max_age_hours thresholds |
README.md | Setup + “adding a new job” guide |
Installed as CLI tool: job-monitor in PATH.
Jobs Monitored
Section titled “Jobs Monitored”| Job | Max Age |
|---|---|
| virus_scan | 192h (8 days) |
| create_system_image | 840h (35 days) |
| my_backup_full_maintenance | 840h (35 days) |
| send_status_report | 192h (8 days) |
| asset_history_update | 192h (8 days) |
| mbr_health_check | 26h |
| mbr_daily_run | 26h |
Crontab Changes
Section titled “Crontab Changes”All 7 jobs got ; heartbeat.sh <name> $? appended. New daily monitor entry:
0 6 * * * /home/ta/.local/bin/job-monitor >> /home/ta/utils/system/job_monitor/logs/job_monitor.log 2>&1Dashboard
Section titled “Dashboard”Every run writes status.json (overall: ok/degraded, per-job status/exit_code/issue). Follow-on task drafted to add Cron Health widget to MBR ops dashboard: mbr-ops-dashboard-cron-health.md.
Cold-Start Note
Section titled “Cold-Start Note”First 6 AM run alerts all 7 jobs “never ran” — expected, not a failure. Heartbeats populate as jobs run on their normal schedules.
Commits
Section titled “Commits”job_monitorrepo:00d5ed2— initial utility- KB Business:
3da44d9— task files