Hardening round 2: healthcheck, audit anchor, return_4h, exec config, signals

Six MEDIUM-priority changes to the system. 323 tests pass, mypy strict
is clean, ruff is clean.

1. Docker HEALTHCHECK + cerbero-bite healthcheck:
   - new subcommand that exits 0 if kill_switch=0 and last_health_check
     is within --max-staleness-s (default 600s), and exits 1 otherwise;
   - HEALTHCHECK directive in the Dockerfile (60s interval, 5s timeout,
     120s start_period, 3 retries);
   - healthcheck definition in docker-compose.yml.
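
The exit-code rule above can be sketched as a pure function. This is only
an illustration under assumptions: evaluate_health and its parameters are
hypothetical names, and the real subcommand reads its inputs from the
SQLite state rather than taking them as arguments. The message fragments
mirror the substrings asserted in the tests of this commit.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def evaluate_health(
    *,
    kill_switch: bool,
    last_health_check: datetime | None,
    now: datetime,
    max_staleness_s: float = 600.0,
) -> tuple[int, str]:
    """Return (exit_code, message); hypothetical helper, not the real CLI."""
    if kill_switch:
        return 1, "unhealthy: kill switch armed"
    if last_health_check is None:
        # Assumption: a missing timestamp is treated like a missing database.
        return 1, "unhealthy: no health check recorded"
    age_s = (now - last_health_check).total_seconds()
    if age_s > max_staleness_s:
        return 1, f"unhealthy: last health check is stale ({age_s:.0f}s old)"
    return 0, "healthy"
```

Keeping the decision separate from the database read makes the staleness
boundary easy to test without a fixture.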

2. Audit hash chain anti-truncation:
   - migration 0002: new column system_state.last_audit_hash;
   - AuditLog accepts an on_append callback, which dependencies.py
     wires to repository.set_last_audit_hash;
   - Orchestrator.boot verifies that the tail of the audit file matches
     the persisted anchor; a mismatch triggers a CRITICAL kill switch.
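
Why an anchor is needed: a hash chain alone detects edits in the middle of
the log, but a log truncated at the tail is still a valid chain. A minimal
sketch of the idea follows, assuming SHA-256 and an in-memory log;
AnchoredLog, chain_hash, tail_matches_anchor and GENESIS are hypothetical
names, not the project's actual API.

```python
from __future__ import annotations

import hashlib
from typing import Callable, Optional

GENESIS = "0" * 64  # assumed hash that precedes the first entry


def chain_hash(prev_hash: str, payload: str) -> str:
    """Hash of an entry, bound to the hash of the entry before it."""
    return hashlib.sha256(f"{prev_hash}|{payload}".encode("utf-8")).hexdigest()


class AnchoredLog:
    """In-memory stand-in for an append-only audit log."""

    def __init__(self, on_append: Callable[[str], None]) -> None:
        self.entries: list[tuple[str, str]] = []  # (payload, hash)
        self._on_append = on_append

    def append(self, payload: str) -> str:
        prev = self.entries[-1][1] if self.entries else GENESIS
        digest = chain_hash(prev, payload)
        self.entries.append((payload, digest))
        # Persist the new tail hash outside the log (the "anchor").
        self._on_append(digest)
        return digest


def tail_matches_anchor(
    entries: list[tuple[str, str]], anchor: Optional[str]
) -> bool:
    """Boot-time check: the last hash in the log must equal the anchor."""
    tail = entries[-1][1] if entries else None
    return tail == anchor
```

Dropping the last entry leaves a chain that still verifies link by link,
yet its tail no longer equals the stored anchor, which is the condition
the boot check looks for.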

3. return_4h bootstrap da deribit get_historical:
   - when dvol_history is empty, _fetch_return_4h falls back to
     deribit.historical_close (the 1h candle from 4h ago);
   - LOW alert if the fallback fails too.
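
The fallback order can be sketched as below. Everything here is an
assumption made for illustration: the function name, the callables standing
in for the Deribit client and the alerting layer, and the use of a simple
return (now / reference - 1). The commit does not show the real formula.

```python
from __future__ import annotations

from typing import Callable, Optional


def return_4h_with_fallback(
    stored_reference: Optional[float],
    price_now: float,
    fetch_close_4h_ago: Callable[[], Optional[float]],
    alert_low: Callable[[str], None],
) -> Optional[float]:
    """Prefer the locally stored reference; otherwise ask the exchange."""
    reference = stored_reference
    if reference is None:
        try:
            reference = fetch_close_4h_ago()  # hypothetical remote call
        except Exception as exc:
            alert_low(f"return_4h fallback failed: {exc}")
            return None
        if reference is None:
            alert_low("return_4h fallback returned no candle")
            return None
    # Assumed definition: simple return over the 4h window.
    return price_now / reference - 1.0
```

Returning None rather than 0.0 when both sources fail keeps a missing
value distinguishable from a flat market.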

4. execution.environment + execution.eur_to_usd in strategy.yaml:
   - ExecutionConfig promoted to a typed schema with the two fields
     consumed at boot;
   - CLI start prefers the values from config; CLI flags override them
     only when they differ from the defaults.
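
The precedence rule reduces to one comparison. A minimal sketch, assuming a
hypothetical resolve_setting helper; the sample values in the usage note
are invented and are not the project's real defaults.

```python
from typing import TypeVar

T = TypeVar("T")


def resolve_setting(*, cli_value: T, cli_default: T, config_value: T) -> T:
    """Config wins unless the CLI flag was moved away from its default."""
    if cli_value != cli_default:
        return cli_value
    return config_value
```

For example, resolve_setting(cli_value="testnet", cli_default="testnet",
config_value="mainnet") yields the config value. One consequence of this
rule is that a flag passed explicitly with its default value cannot be
told apart from an omitted flag, so it does not override the config.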

5. Cycle correlation ID:
   - structlog.contextvars.bind_contextvars in run_entry/run_monitor/
     run_health propagates cycle_id and cycle into the structured logs.

6. SIGTERM/SIGINT clean shutdown:
   - run_forever installs loop.add_signal_handler for SIGTERM and
     SIGINT; the signal sets an asyncio.Event that ends the main
     block, then scheduler.shutdown and ctx.aclose finalize.
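
The shutdown pattern described above can be sketched as follows.
run_until_signalled and on_shutdown are hypothetical names; on_shutdown
stands in for the scheduler.shutdown and ctx.aclose steps. Note that
loop.add_signal_handler is available only on Unix and only from the main
thread.

```python
import asyncio
import signal
from typing import Awaitable, Callable

_SIGNALS = (signal.SIGTERM, signal.SIGINT)


async def run_until_signalled(on_shutdown: Callable[[], Awaitable[None]]) -> None:
    """Block until SIGTERM or SIGINT arrives, then run the cleanup once."""
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    for sig in _SIGNALS:
        # The handler only sets the event; the real work happens below.
        loop.add_signal_handler(sig, stop.set)
    try:
        await stop.wait()
    finally:
        for sig in _SIGNALS:
            loop.remove_signal_handler(sig)
        await on_shutdown()
```

Keeping the handler down to stop.set means the cleanup runs in ordinary
coroutine context, where awaiting is allowed.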

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:37:39 +02:00
parent 411b747e93
commit b5b96f959c
15 changed files with 477 additions and 24 deletions
@@ -0,0 +1,66 @@
"""Tests for the ``cerbero-bite healthcheck`` subcommand."""

from __future__ import annotations

from datetime import UTC, datetime, timedelta
from pathlib import Path

from click.testing import CliRunner

from cerbero_bite.cli import main as cli_main
from cerbero_bite.state import Repository, connect, run_migrations, transaction


def _seed_state(db: Path, *, last_check: datetime, kill_switch: bool = False) -> None:
    conn = connect(db)
    try:
        run_migrations(conn)
        repo = Repository()
        with transaction(conn):
            repo.init_system_state(
                conn, config_version="1.0.0", now=last_check
            )
            if kill_switch:
                repo.set_kill_switch(
                    conn, armed=True, reason="test", now=last_check
                )
            else:
                repo.touch_health_check(conn, now=last_check)
    finally:
        conn.close()


def test_healthcheck_exits_one_when_db_missing(tmp_path: Path) -> None:
    result = CliRunner().invoke(
        cli_main,
        ["healthcheck", "--db", str(tmp_path / "absent.sqlite")],
    )
    assert result.exit_code == 1
    assert "unhealthy" in result.output


def test_healthcheck_exits_one_when_kill_switch_armed(tmp_path: Path) -> None:
    db = tmp_path / "state.sqlite"
    _seed_state(db, last_check=datetime.now(UTC), kill_switch=True)
    result = CliRunner().invoke(cli_main, ["healthcheck", "--db", str(db)])
    assert result.exit_code == 1
    assert "kill switch" in result.output


def test_healthcheck_exits_one_when_last_check_stale(tmp_path: Path) -> None:
    db = tmp_path / "state.sqlite"
    _seed_state(db, last_check=datetime.now(UTC) - timedelta(hours=1))
    result = CliRunner().invoke(
        cli_main,
        ["healthcheck", "--db", str(db), "--max-staleness-s", "60"],
    )
    assert result.exit_code == 1
    assert "stale" in result.output


def test_healthcheck_exits_zero_on_recent_check(tmp_path: Path) -> None:
    db = tmp_path / "state.sqlite"
    _seed_state(db, last_check=datetime.now(UTC))
    result = CliRunner().invoke(cli_main, ["healthcheck", "--db", str(db)])
    assert result.exit_code == 0
    assert "healthy" in result.output