commit all of this

feat(cli): add clean analysis export with markdown/json output
docs: require tea for agent gitea workflows
2026-05-14 20:00:38 +02:00 · 2026-05-11 21:54:21 +02:00 · 2026-05-11 21:25:42 +02:00 · 2026-05-11 21:09:47 +02:00 · 2026-05-11 21:07:39 +02:00
18 changed files with 2943 additions and 95 deletions
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -93,5 +93,8 @@ jobs:
      - name: Type-check
        run: .venv/bin/python -m mypy src
      - name: Validate man-page sync with CLI
        run: .venv/bin/python -m pytest tests/test_cli.py::test_man_page_covers_cli_long_options -v
      - name: Test
        run: .venv/bin/python -m pytest
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,119 @@
 # AGENTS Guide
 This file documents repository-specific operational guidance for coding agents.
 ## Scope
 - Repository: `zphinx/tai`
 - Git host: `https://git.archflux.net`
 - Preferred release CLI: `tea`
 ## Tea CLI: Required Practices
 ### 0. Agent policy: use tea for Gitea operations
 - Agents should use `tea` as the default interface for Gitea interactions in this repository.
 - Prefer `tea` for release management and remote state checks instead of ad-hoc API calls.
 - Keep using `git` for source control operations (commit/merge/tag/push), and use `tea` for release objects and Gitea-facing workflows.
 ### 1. Verify command syntax for installed tea version
 The installed version may not match upstream examples. Always check help first:
 ```fish
 tea --version
 tea logins --help
 tea releases --help
 ```
 Important version-specific note observed in this repo:
 - `tea logins add` is valid (not `tea login add`)
 - `--default` is not supported on `add` in this version
 - Set default with `tea logins default <name>`
 ### 2. Login setup flow
 Use environment token (do not hardcode tokens in commands/files):
 ```fish
 set -x GITEA_TOKEN "<token>"
 tea logins add --name archflux --url https://git.archflux.net --token "$GITEA_TOKEN"
 tea logins default archflux
 tea logins ls
 ```
 ### 3. Token safety
 - Never print raw tokens in logs, commit messages, or docs.
 - If a token is exposed, revoke it immediately in Gitea and create a replacement.
 ### 4. Release workflow expectations in this repo
 Tag workflow file: `.gitea/workflows/tag.yml`
 Observed behavior:
 - Workflow trigger pattern is numeric tags: `[0-9]*`
 - Tag `0.x.y` triggers build workflow
 - Tag `v0.x.y` can still be published as alias, but does not match that workflow trigger
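The trigger behavior above can be checked with a small sketch. It assumes the workflow's tag filter behaves like a shell-style glob, which Python's `fnmatch` only approximates; the function name `triggers_build` is hypothetical and not part of the repository.

```python
from fnmatch import fnmatchcase

# Trigger pattern quoted in the observed-behavior notes above.
TAG_PATTERN = "[0-9]*"


def triggers_build(tag: str) -> bool:
    """Approximate the workflow tag filter with a case-sensitive glob match.

    Assumption: the real filter is glob-like; this is not the Gitea matcher.
    """
    return fnmatchcase(tag, TAG_PATTERN)


if __name__ == "__main__":
    for tag in ("0.6.0", "v0.6.0"):
        print(tag, triggers_build(tag))
```

Under that assumption, a tag has to start with a digit to match, which is why the `v`-prefixed alias is pushed in addition to the numeric tag rather than instead of it.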
 Recommended tagging pattern for releases:
 ```fish
 git tag -a 0.6.0 -m "0.6.0"
 git tag -a v0.6.0 -m "v0.6.0"
 git push origin 0.6.0
 git push origin v0.6.0
 ```
 ### 5. Create release object with tea
 ```fish
 tea releases create --repo zphinx/tai --tag 0.6.0 --title "v0.6.0" --note "<release notes>"
 tea releases list --repo zphinx/tai
 ```
 ### 6. Agent release checklist
 Before release:
 1. Confirm branch state is clean.
 2. Run tests/lint (`pytest`, `ruff`).
 3. Merge to `main`.
 4. Create/push tags (`0.x.y` + optional `v0.x.y`).
 5. Create release entry with `tea`.
 6. Verify in `tea releases list`.
 ## Non-Goals for Agents
 - Do not force Kubernetes deployment guidance for current architecture.
 - Treat Docker as one-shot execution model with mounted persistent volumes for runbooks/sessions/logs.
 ## CI Pipeline: Man-Page Validation
 The man-page drift detection is **automatically integrated** into `.gitea/workflows/ci.yml`:
 - **Test name:** `test_man_page_covers_cli_long_options()` in `tests/test_cli.py`
 - **Trigger:** Runs on every push and pull request, first in the dedicated `Validate man-page sync with CLI` step and again as part of the full `Test` step
 - **Behavior:** Extracts all long options from `tai --help` and verifies each is documented in `docs/tai.1`
 - **Failure mode:** CI fails if any long option in CLI is missing from man page; prevents merge of undocumented options
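A minimal sketch of the kind of check described above, for readers who want to reproduce the idea outside CI. It is not the code in `tests/test_cli.py`; the regular expression, the helper names, and the sample strings are all assumptions made for illustration.

```python
import re

# Assumed shape of a long option: two hyphens, then lowercase words joined by hyphens.
LONG_OPTION = re.compile(r"--[a-z0-9][a-z0-9-]*")


def long_options(text: str) -> set[str]:
    """Collect every token in the text that looks like a long option."""
    return set(LONG_OPTION.findall(text))


def missing_from_man_page(help_text: str, man_text: str) -> list[str]:
    """Return long options present in the help text but absent from the man page."""
    # roff sources sometimes escape hyphens as "\-"; normalize before matching.
    documented = long_options(man_text.replace("\\-", "-"))
    return sorted(long_options(help_text) - documented)


if __name__ == "__main__":
    # Hypothetical sample inputs, not real `tai --help` output.
    help_text = "Options:\n  --host TEXT\n  --output-file PATH\n"
    man_text = '.TP\n.BI --host " HOST"\nTarget host to troubleshoot.\n'
    print("Missing options in docs/tai.1:", missing_from_man_page(help_text, man_text))
```

A set difference keeps the check independent of option order, so reordering sections in `docs/tai.1` would not cause a failure in a check built this way.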
 ### How to Fix Failed Man-Page Validation
 1. **Identify missing options:** CI output shows "Missing options in docs/tai.1: ..."
 2. **Update docs/tai.1:** Add the missing option to the appropriate section (command or global options)
 3. **Re-run tests locally:** `python -m pytest tests/test_cli.py::test_man_page_covers_cli_long_options -v`
 4. **Push to trigger CI:** Once local test passes, push the update to trigger CI validation
 ### Man-Page Maintenance Workflow
 - **When adding CLI options:** Add to `src/tai/cli.py` and immediately update `docs/tai.1` in the same commit
 - **When removing CLI options:** Remove from `src/tai/cli.py` and update `docs/tai.1` in the same commit
 - **When renaming CLI options:** Update both CLI code and `docs/tai.1` in one commit
 - **When changing option behavior/defaults:** Update the option description in `docs/tai.1` to reflect new behavior
 ## Documentation Maintenance
 - When adding or changing CLI commands/options, update `docs/tai.1` in the same change.
 - Keep `README.md` and `docs/tai.1` aligned for user-facing flags and examples.
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,20 +10,39 @@ ______________________________________________________________________
 ### Added
- Tier 3 core session memory implementation:
+- Unified persistent run history in SQLite:
-  - new `src/tai/session_store.py` persistent ChromaDB store
+  - new `src/tai/history_store.py` for host-scoped JSON run records
-  - `--session-memory` option on `tai run`
+  - `--history-db` and `--history/--no-history` options on `tai run`
-  - prior-session retrieval injected into analysis/follow-up prompts
+  - prior host history auto-loaded for analysis/follow-up prompts
-  - final response indexing at session end
+  - analyzed runs auto-indexed into history DB
 - External database targets for history and runbook options:
  - `--history-db` now supports SQLite path/URL and PostgreSQL DSN
  - `--runbooks`/`--store` now support remote ChromaDB URLs
 - External DB authentication options:
  - history DB: `--history-db-user`, `--history-db-password`, `--env-file`
  - runbook store: `--runbooks-user`/`--runbooks-password` and `--store-user`/`--store-password`
  - dotenv credential keys: `TAI_HISTORY_DB_USER`, `TAI_HISTORY_DB_PASSWORD`, `TAI_RUNBOOK_STORE_USER`, `TAI_RUNBOOK_STORE_PASSWORD`
 - Remote runbook/playbook source ingestion:
  - `tai runbooks sync --path` now supports `ssh://` directories
  - `tai runbooks sync --path` now supports HTTP/HTTPS webroots with Markdown links
  - `tai runbooks add` now supports `ssh://` and HTTP/HTTPS Markdown files
 - `--output-file` option on `tai run` to persist final AI analysis output as Markdown
 - `--output-format markdown|json` for `--output-file` exports
 - JSON export schema now includes host-specific run metadata (`generated_at`, collection stats, token usage)
 - New SQLite run history database (`--history-db`) now stores per-run JSON payloads and auto-loads prior host history for analysis context
 - Planner enhancements for broader service detection:
  - generic service candidate extraction from free text
  - package presence probes in plans (`rpm -q` and `dpkg-query -W`)
 - SSH read-only allowlist expanded to permit package presence commands (`rpm`, `dpkg-query`)
- Session memory tests in `tests/test_session_store.py`
+- History DB tests in `tests/test_history_store.py`
 - CLI test coverage for analysis output file writing (`tests/test_cli.py`)
 - CLI test coverage for JSON export and ANSI stripping in written output (`tests/test_cli.py`)
 ### Changed
- Documentation alignment updates in README and ROADMAP to reflect implemented session memory and package-presence capabilities.
+- History reads/writes are now unified on SQLite DB in CLI workflows (`history`, interactive `/history`, analysis context injection).
 - Documentation alignment updates in README and ROADMAP to reflect implemented history DB and package-presence capabilities.
 - Package version metadata alignment: `src/tai/__init__.py` now matches project version `0.4.0`.
 ______________________________________________________________________
--- a/README.md
+++ b/README.md
@@ -34,6 +34,8 @@ The tool may suggest remediation commands in output, but does not execute them.
 - Live probe mode (`uname -a`)
 - Diagnostics collection mode
 - AI analysis mode
 - Optional analysis export via `--output-file <path>` (`--output-format markdown|json`)
 - Automatic host history persistence/read via database (`--history-db`, `--history/--no-history`)
 - Interactive loop with `/collect`, `/analyze`, `/help`, `/quit`
 ### AI and Prompting
@@ -164,6 +166,125 @@ tai run "docker daemon keeps failing" \
  --runbooks ~/.tai/runbooks
 ```
 ### Write Analysis to File
 ```bash
 tai run "sshd authentication failed" \
  --host bastion01 \
  --collect --analyze \
  --output-file ./reports/sshd-analysis.md
 ```
 JSON export:
 ```bash
 tai run "sshd authentication failed" \
  --host bastion01 \
  --collect --analyze \
  --output-file ./reports/sshd-analysis.json \
  --output-format json
 ```
 JSON export includes host-specific run metadata:
 - `schema` and `generated_at`
 - `issue`, `host`, `model`
 - `collection` summary (`total`, `failed`, `succeeded`)
 - `token_usage` (`prompt_tokens`, `completion_tokens`, `total_tokens`) when available from backend
 - `analysis` text
 By default, each analyzed run is also written to the history database and prior
 sessions for the same host are read and injected as historical context.
 Database targets supported by `--history-db`:
 - SQLite file path (for example `~/.tai/history.db`)
 - SQLite URL (for example `sqlite:////tmp/tai-history.db`)
 - PostgreSQL DSN (for example `postgresql://user:pass@dbhost:5432/tai`)
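How a single `--history-db` value might be sorted into those three target kinds can be sketched as below. The branching follows what is visible in `src/tai/history_store.py` in this change, but `classify_history_target` is a hypothetical standalone helper, and it skips the directory creation and path resolution the store performs.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_history_target(target: str) -> tuple[str, str]:
    """Return (backend, location) for a history DB target string."""
    parsed = urlparse(target)
    if parsed.scheme in {"postgres", "postgresql"}:
        # PostgreSQL DSNs are passed through unchanged.
        return "postgres", target
    if parsed.scheme == "sqlite":
        path = parsed.path
        # "sqlite:////tmp/x.db" parses to path "//tmp/x.db"; drop one slash.
        if path.startswith("//"):
            path = path[1:]
        return "sqlite", path
    # Anything without a recognized scheme is treated as a local SQLite file path.
    return "sqlite", str(Path(target).expanduser())


if __name__ == "__main__":
    for value in (
        "~/.tai/history.db",
        "sqlite:////tmp/tai-history.db",
        "postgresql://user:pass@dbhost:5432/tai",
    ):
        print(value, "->", classify_history_target(value))
```

Note that a Windows-style path such as `C:\data\history.db` would be parsed with scheme `c`; this sketch, like the branch it mirrors, falls through to the plain-path case for it.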
 Example using remote PostgreSQL history database:
 ```bash
 tai run "sshd authentication failed" \
  --host bastion01 \
  --collect --analyze \
  --history-db postgresql://tai_user:secret@db.internal:5432/tai
 ```
 Credential options for external history DB:
 - `--history-db-user <user>`
 - `--history-db-password <password>`
 - `--env-file <path>` (loads dotenv values)
 Dotenv keys for history DB credentials:
 - `TAI_HISTORY_DB_USER`
 - `TAI_HISTORY_DB_PASSWORD`
 Runbook store targets supported by `--runbooks` and `tai runbooks --store`:
 - Local embedded ChromaDB path (default)
 - Remote ChromaDB URL (for example `http://chroma.internal:8000`)
 Example using remote ChromaDB runbook store at analysis time:
 ```bash
 tai run "nginx failing after reboot" \
  --host web01 \
  --collect --analyze \
  --runbooks http://chroma.internal:8000
 ```
 Credential options for remote runbook store:
 - `--runbooks-user <user>` / `--runbooks-password <password>` on `tai run`
 - `--store-user <user>` / `--store-password <password>` on `tai runbooks ...`
 - `--env-file <path>` (loads dotenv values)
 Dotenv keys for runbook store credentials:
 - `TAI_RUNBOOK_STORE_USER`
 - `TAI_RUNBOOK_STORE_PASSWORD`
 Remote runbook (playbook) sources supported by `tai runbooks sync --path`:
 - Local directory path (for example `./runbooks`)
 - SSH directory URI (for example `ssh://ops@ssh.archflux.net/opt/tai/runbooks`)
 - HTTP/HTTPS webroot URL that exposes `.md` links (for example `https://kb.example/runbooks/`)
 Webroot hardening rules:
 - Only `.md` links are considered for download.
 - Downloaded payload must look like real Markdown (HTML wrappers are ignored).
 - Non-markdown payloads are discarded.
 - Downloaded content is never executed. It is stored as plain text and only parsed for AI retrieval context.
 Single runbook (playbook) sources supported by `tai runbooks add`:
 - Local file path
 - SSH file URI (for example `ssh://ops@ssh.archflux.net/opt/tai/runbooks/nginx.md`)
 - HTTP/HTTPS URL to a Markdown file
 For HTTP/HTTPS single-file add, the source URL must end in `.md` and resolve to Markdown content.
 Examples:
 ```bash
 # Sync from SSH-hosted runbooks directory into remote ChromaDB
 tai runbooks sync \
  --path ssh://ops@ssh.archflux.net/opt/tai/runbooks \
  --store http://chroma.internal:8000
 # Sync from HTTPS webroot listing Markdown runbooks
 tai runbooks sync \
  --path https://kb.example/runbooks/ \
  --store ~/.tai/runbooks
 # Add one runbook directly from HTTPS
 tai runbooks add https://kb.example/runbooks/nginx.md --store ~/.tai/runbooks
 ```
 ## Runbook Workflow
 1. Write Markdown runbooks in `runbooks/` with frontmatter keys: `service`, `symptoms`, `tags`.
@@ -189,10 +310,20 @@ Focused suites:
 pytest tests/test_plan.py tests/test_ai.py tests/test_cli.py
 ```
 ## Man Page
 A manual page is available at `docs/tai.1`.
 Render it locally:
 ```bash
 man ./docs/tai.1
 ```
 ## Known Limits
 - Deep service-specific probes (known binary/config/package aliases) are richer for recognized services than generic service names.
- Session memory is available via `--session-memory`, but dedicated history UX commands (`tai history`, `/history`) are not implemented yet.
+- Clipboard export is intentionally not implemented.
 ## Changelog and Roadmap
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -101,7 +101,7 @@ Polish the interface for real-world use.
 - [x] Design CLI interface with run command, interactive prompts, and runbook subcommands
 - [x] Implement structured output sections (Root Cause, Evidence, Recommended Actions)
 - [x] Add RAG debug mode (`--rag-debug`) showing retrieval scores
- [ ] Support output to file or clipboard
+- [x] Support output to file (`--output-file`)
 - [x] Provide comprehensive `--help` command documentation via Typer options
 ______________________________________________________________________
@@ -321,3 +321,194 @@ ______________________________________________________________________
 | 2026-05-04 | RAG vector store (Tier 2/3) | `chromadb` embedded mode (default) or `qdrant` self-hosted |
 | 2026-05-04 | RAG chunking unit | Command-boundary splitting — each collected command = one or more chunks |
 | 2026-05-04 | Runbook format | Markdown with YAML frontmatter, version-controlled in `runbooks/` directory |
 ______________________________________________________________________
 ## End-State UX Goal
 After the current CLI and memory roadmap phases are stable, the long-term UX goal is a full-screen terminal TUI with an ncurses-style workflow.
 ### Target End-State
 - Split-pane troubleshooting workspace (diagnostics, AI output, and command/input area)
 - Live command/probe status with clear success/failure indicators
 - In-session history browser for prior questions, retrieved evidence, and related past sessions
 - Keyboard-first navigation for operators running in SSH-only environments
 ### Delivery Approach
 - Keep shipping incremental CLI features first (current roadmap order remains unchanged)
 - Promote stable workflows into TUI panels once behavior is proven in CLI mode
 - Treat the TUI as a final UX consolidation milestone, not a blocker for core troubleshooting capabilities
 ______________________________________________________________________
 ## Container Distribution Goal (Docker)
 After core CLI/TUI workflows stabilize, provide an official Docker image as an additional distribution target.
 ### Container Execution Model (Decision)
 - Docker is a one-shot invocation target, not a daemon/service mode
 - Each run executes a single `tai` command and exits
 - State is persisted only through mounted host volumes
 ### Why Docker Is Valuable Here
 - Reproducible runtime: pin Python and dependency versions to remove host-level drift
 - Faster operator onboarding: run with one command instead of local Python setup
 - Cleaner CI/CD release path: publish versioned images aligned with git tags
 - Safer local footprint: isolate dependencies from the host OS package manager
 ### Subgoals
 1. Base image and runtime hardening
 - Multi-stage Dockerfile with slim runtime image
 - Non-root runtime user and minimal filesystem permissions
 - Healthcheck for CLI startup and version command
 2. Runtime integration for SSH workflows
 - Documented mounts for `~/.ssh` (read-only where possible) and known-hosts handling
 - Pass-through for SSH config when needed (`--ignore-ssh-config` behavior documented)
 - Clear guidance for jump-host and bastion scenarios from inside the container
 - Documented one-shot run examples for `tai run` and `tai history`
 3. Persistent data strategy
 - Required volume mount guidance for runbook store (`~/.tai/runbooks`)
 - Required volume mount guidance for session memory/history (`~/.tai/sessions`)
 - Optional bind mount for JSONL logs and report export artifacts
 - Clear defaults for container paths and equivalent host path mappings
 4. Release and quality gates
 - Build and publish image on tagged releases
 - Smoke tests in CI: probe mode, collect mode, and history command against mocked endpoints
 - Version labeling (image tags and OCI metadata) tied to changelog/release tags
 ### Data Retention and Lifecycle Policy
 Retention behavior must be explicit and configurable at runtime. Defaults should be conservative and documented.
 1. Retention classes
 - Session memory store (`~/.tai/sessions`): keep semantically indexed summaries for troubleshooting continuity
 - Runbook store (`~/.tai/runbooks`): retain until explicitly replaced or pruned by sync policy
 - JSONL logs and exported reports: operator-controlled retention with optional TTL cleanup
 2. Retention controls
 - Add CLI controls for age-based pruning (for example `--retain-days` on cleanup commands)
 - Add host-scoped cleanup (delete history for one host) and full-store cleanup (all hosts)
 - Add dry-run cleanup mode to show what would be deleted before applying changes
 3. No-persist mode
 - Add a documented ephemeral mode where no session memory or logs are written
 - Ensure one-shot diagnostics can run in read-only operational contexts
 ### Configuration and State Persistence Model
 Configuration and retained state should be predictable across container upgrades and host environments.
 1. Mount and path contract
 - Define canonical container paths for `~/.tai/runbooks`, `~/.tai/sessions`, and optional log/export paths
 - Document required versus optional mounts and expected permissions for each
 - Document UID/GID mapping guidance to prevent host volume ownership issues
 2. Schema and compatibility
 - Introduce explicit storage schema version metadata for persistent stores
 - Define upgrade behavior for older stores (migrate, re-index, or fail with clear guidance)
 - Add compatibility notes for image upgrades and rollback expectations
 3. Backup and recovery
 - Provide export/import workflows for session memory and runbook indexes
 - Document minimal backup set and restore order for disaster recovery
 ### Security and Privacy for Retained Data
 Persisted troubleshooting evidence can include sensitive operational data and must be handled accordingly.
 1. Data minimization
 - Add optional redaction hooks for common sensitive patterns before persistence
 - Keep prompt-only transient data separate from persisted summary/index content
 2. Runtime hardening
 - Target non-root container execution with read-only root filesystem by default
 - Require explicit writable mounts only for retained data locations
 3. Auditable behavior
 - Log retention-affecting operations (cleanup, purge, export/import) with timestamps and scope
 - Define stable exit codes for cleanup and retention workflows to support automation
 ### Kubernetes Position
 Kubernetes is out of scope for this delivery plan.
 - `tai` is currently an operator-invoked troubleshooting client, not a long-running service
 - AI inference is external to `tai` (OpenAI-compatible endpoint), reducing the need for in-cluster model orchestration
 - SSH key/config handling and per-operator context are simpler with local or single-container execution
 Kubernetes can be revisited only if `tai` evolves into a centralized multi-user service with queueing, RBAC, and shared tenancy requirements.
 ______________________________________________________________________
 ## Final Long-Term Goal: Full Rust Migration
 This is a final-stage roadmap goal and remains explicitly out of near-term scope.
 It should begin only after the Python implementation, TUI direction, Docker one-shot model,
 and retention/persistence policies are stable and proven in production usage.
 ### Why This Is the Final Goal
 - Improve execution latency and startup speed for both native runs and container one-shot invocations
 - Produce a single, portable native binary with minimal runtime dependency footprint
 - Strengthen reliability and memory safety under heavy log parsing and concurrent workflows
 - Simplify long-term packaging and distribution across Linux targets
 ### Migration Objectives
 1. Preserve feature parity first
 - Match existing CLI behavior, interactive workflows, RAG integration, runbook management, and history/session-memory features
 - Keep command semantics and safety boundaries equivalent during transition
 2. Target both distribution modes
 - Native Rust binary for direct operator use
 - Docker image built around the Rust binary for one-shot execution with mounted persistent volumes
 3. Keep compatibility guardrails
 - Define persistent data format compatibility or migration tooling for runbook/session stores
 - Preserve operator-visible flags where practical to reduce migration friction
 ### Suggested Delivery Phases
 1. Build baseline Rust CLI scaffold with feature-flagged parity checkpoints
 1. Port SSH execution and read-only policy enforcement modules
 1. Port planner, collectors, prompt composition, and AI client adapters
 1. Port session memory/history and runbook workflows with migration tests
 1. Port interactive UX/TUI layer and deprecate Python runtime path
 ### Rust Toolchain End-State
 - Standardize on Cargo-based build/test/lint pipeline (`cargo fmt`, `cargo clippy`, `cargo test`)
 - Add release profile optimization and reproducible build settings
 - Publish signed native artifacts and Docker images derived from Rust release binaries
 ### Decision Gate Before Starting
 Begin Rust migration only when:
 - Python roadmap milestones are complete and stable
 - Container distribution and retention policy workflows are operationally validated
 - A parity test matrix exists to prove behavior equivalence during migration
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -53,14 +53,27 @@ This document describes tai's current runtime architecture, module responsibilit
 ## Data Stores
- Runbook store (Tier 2): local ChromaDB path, default `~/.tai/runbooks`
+- Runbook store (Tier 2): local ChromaDB path or remote ChromaDB HTTP endpoint (`--runbooks`, `runbooks --store`)
 - Run history store (Tier 3): SQLite file/URL or PostgreSQL DSN (`--history-db`)
 - Session logs: optional JSONL file configured by `--log-file`
 External DB auth can be provided by CLI options or dotenv file (`--env-file`) and is resolved without executing downloaded runbook content.
 ## Runbook Source Ingestion
 `tai runbooks sync --path` and `tai runbooks add` support runbook/playbook source retrieval from:
 - local filesystem paths
 - SSH URIs (`ssh://...`) via read-only remote fetch (`find`, `cat`)
 - HTTP/HTTPS URLs (single `.md` file or webroot index with `.md` links)
 Remote source content is materialized into temporary local files, embedded, and then indexed into the target ChromaDB store.
 ## Retrieval Layers
 - Tier 1 (implemented): in-memory semantic retrieval over diagnostic chunks
 - Tier 2 (implemented): persistent semantic retrieval over runbook corpus
- Tier 3 (pending): persistent retrieval over prior sessions
+- Tier 3 (implemented core): persistent retrieval over prior sessions (dedicated UX commands pending)
 ## Safety Boundaries
--- a/docs/tai.1
+++ b/docs/tai.1
@@ -0,0 +1,284 @@
 .TH TAI 1 "2026-05-11" "tai 0.4.0" "User Commands"
 .SH NAME
 tai \- read-only Linux troubleshooting assistant with SSH diagnostics and AI analysis
 .SH SYNOPSIS
 .B tai
 .RI [ GLOBAL_OPTIONS ]
 .B run
 .I ISSUE
 .RI [ RUN_OPTIONS ]
 .PP
 .B tai
 .B history
 .RI [ HISTORY_OPTIONS ]
 .PP
 .B tai
 .B runbooks
 .B sync
 .RI [ SYNC_OPTIONS ]
 .PP
 .B tai
 .B runbooks
 .B list
 .RI [ LIST_OPTIONS ]
 .PP
 .B tai
 .B runbooks
 .B add
 .I FILE
 .RI [ ADD_OPTIONS ]
 .SH DESCRIPTION
 .B tai
 connects to Linux hosts over SSH, collects read-only diagnostics, and can ask an OpenAI-compatible model for grounded analysis.
 .PP
 Remote runbook (playbook) sources can be local paths, SSH URIs, or HTTP/HTTPS webroots.
 Downloaded runbook content is never executed. It is stored as text and parsed for retrieval context only.
 .SH COMMANDS
 .SS run
 Main troubleshooting entrypoint.
 .TP
 .BI --host " HOST"
 Target host to troubleshoot.
 .TP
 .BI --port " PORT"
 SSH port (default: 22).
 .TP
 .BI --path " PATH"
 Target path to inspect. Repeatable.
 .TP
 .BI --identity-file " FILE"
 SSH private key path.
 .TP
 .BI --jump-host " HOST"
 SSH bastion/jump host.
 .TP
 .B --ignore-ssh-config
 Ignore ~/.ssh/config and rely only on CLI options.
 .TP
 .B --probe / --no-probe
 Enable or disable connectivity probe.
 .TP
 .B --collect / --no-collect
 Collect baseline diagnostics.
 .TP
 .B --analyze / --no-analyze
 Send diagnostics to AI for analysis.
 .TP
 .B --interactive / --no-interactive
 Interactive follow-up mode.
 .TP
 .BI --ai-host " URL"
 OpenAI-compatible AI backend URL.
 .TP
 .BI --model " NAME"
 Model name for analysis.
 .TP
 .BI --ai-key " KEY"
 API key for AI backend.
 .TP
 .BI --ai-timeout-seconds " SECONDS"
 Timeout for AI requests.
 .TP
 .BI --ai-max-tokens " TOKENS"
 Max completion tokens.
 .TP
 .BI --embed-model " NAME"
 Embedding model for RAG.
 .TP
 .B --no-rag
 Disable RAG retrieval.
 .TP
 .B --rag-debug / --no-rag-debug
 Print retrieval debug output.
 .TP
 .BI --runbooks " STORE"
 Runbook store path or remote Chroma URL.
 .TP
 .BI --runbooks-user " USER"
 Runbook store login/user for remote Chroma URLs.
 .TP
 .BI --runbooks-password " PASSWORD"
 Runbook store password for remote Chroma URLs.
 .TP
 .BI --history-db " TARGET"
 History DB target: SQLite path/URL or PostgreSQL DSN.
 .TP
 .BI --history-db-user " USER"
 History DB login/user for external database URLs.
 .TP
 .BI --history-db-password " PASSWORD"
 History DB password for external database URLs.
 .TP
 .B --history / --no-history
 Enable or disable history DB reads/writes.
 .TP
 .BI --output-file " FILE"
 Write analysis to file.
 .TP
 .BI --output-format " FORMAT"
 Output format: markdown or json.
 .TP
 .BI --log-file " FILE"
 Optional JSONL event log path.
 .TP
 .BI --env-file " FILE"
 Optional dotenv file used to resolve DB credentials.
 .SS history
 Search/list indexed troubleshooting history.
 .TP
 .BI --query " TEXT"
 Optional keyword search in issue/summary.
 .TP
 .BI --host " HOST"
 Filter by host.
 .TP
 .BI --limit " N"
 Maximum sessions to show.
 .TP
 .BI --export " FILE"
 Export results as Markdown.
 .TP
 .BI --history-db " TARGET"
 History DB target: SQLite path/URL or PostgreSQL DSN.
 .TP
 .BI --history-db-user " USER"
 History DB login/user for external database URLs.
 .TP
 .BI --history-db-password " PASSWORD"
 History DB password for external database URLs.
 .TP
 .BI --env-file " FILE"
 Optional dotenv file used to resolve DB credentials.
 .SS runbooks sync
 Index all runbooks from source path.
 .TP
 .BI --path " SOURCE"
 Runbook source: local directory, ssh://host/path, or http(s) webroot URL.
 .TP
 .BI --store " TARGET"
 ChromaDB store path or remote URL.
 .TP
 .BI --store-user " USER"
 Runbook store login/user for remote Chroma URLs.
 .TP
 .BI --store-password " PASSWORD"
 Runbook store password for remote Chroma URLs.
 .TP
 .BI --ai-host " URL"
 OpenAI-compatible AI backend URL.
 .TP
 .BI --embed-model " NAME"
 Embedding model name.
 .TP
 .BI --ai-key " KEY"
 API key for AI backend.
 .TP
 .BI --identity-file " FILE"
 SSH private key for ssh:// source.
 .TP
 .BI --jump-host " HOST"
 SSH bastion for ssh:// source.
 .TP
 .B --ignore-ssh-config
 Ignore ~/.ssh/config for ssh:// source.
 .TP
 .BI --env-file " FILE"
 Optional dotenv file used to resolve DB credentials.
 .SS runbooks list
 List indexed runbooks.
 .TP
 .BI --store " TARGET"
 ChromaDB store path or remote URL.
 .TP
 .BI --store-user " USER"
 Runbook store login/user for remote Chroma URLs.
 .TP
 .BI --store-password " PASSWORD"
 Runbook store password for remote Chroma URLs.
 .TP
 .BI --env-file " FILE"
 Optional dotenv file used to resolve DB credentials.
 .SS runbooks add
 Index one runbook file.
 .TP
 .BI FILE
 Runbook source: local file, ssh://host/path/file.md, or HTTP/HTTPS URL ending in .md.
 .TP
 .BI --store " TARGET"
 ChromaDB store path or remote URL.
 .TP
 .BI --store-user " USER"
 Runbook store login/user for remote Chroma URLs.
 .TP
 .BI --store-password " PASSWORD"
 Runbook store password for remote Chroma URLs.
 .TP
 .BI --ai-host " URL"
 OpenAI-compatible AI backend URL.
 .TP
 .BI --embed-model " NAME"
 Embedding model name.
 .TP
 .BI --ai-key " KEY"
 API key for AI backend.
 .TP
 .BI --identity-file " FILE"
 SSH private key for ssh:// source.
 .TP
 .BI --jump-host " HOST"
 SSH bastion for ssh:// source.
 .TP
 .B --ignore-ssh-config
 Ignore ~/.ssh/config for ssh:// source.
 .TP
 .BI --env-file " FILE"
 Optional dotenv file used to resolve DB credentials.
 .SH ENVIRONMENT
 The following variables are recognized for DB credentials:
 .TP
 .B TAI_HISTORY_DB_USER
 History DB user when --history-db points to external database.
 .TP
 .B TAI_HISTORY_DB_PASSWORD
 History DB password when --history-db points to external database.
 .TP
 .B TAI_RUNBOOK_STORE_USER
 Runbook store user for remote ChromaDB.
 .TP
 .B TAI_RUNBOOK_STORE_PASSWORD
 Runbook store password for remote ChromaDB.
 .SH SECURITY NOTES
 .TP
 \(bu
 SSH diagnostics are validated against read-only command policy.
 .TP
 \(bu
 Web/SSH runbook content is never executed.
 .TP
 \(bu
 Webroot ingestion accepts only Markdown-like payloads and skips HTML/non-markdown wrappers.
 .SH FILES
 .TP
 .I ~/.tai/history.db
 Default local history database.
 .TP
 .I ~/.tai/runbooks
 Default local runbook store path.
 .SH EXAMPLES
 .PP
 Analyze with PostgreSQL history DB credentials loaded from .env:
 .PP
 .nf
 $ tai run "sshd auth failed" --host bastion01 --collect --analyze \
    --history-db postgresql://db.internal:5432/tai --env-file ./.env
 .fi
 .PP
 Sync runbooks from HTTPS webroot to remote ChromaDB:
 .PP
 .nf
 $ tai runbooks sync --path https://kb.example/runbooks/ \
    --store https://chroma.internal:8443 --env-file ./.env
 .fi
 .SH SEE ALSO
 .BR README.md ,
 .BR docs/ARCHITECTURE.md
--- a/src/tai/__init__.py
+++ b/src/tai/__init__.py
@@ -2,4 +2,4 @@
 __all__ = ["__version__"]
-__version__ = "0.1.0"
+__version__ = "0.4.0"
--- a/src/tai/chroma_telemetry.py
+++ b/src/tai/chroma_telemetry.py
@@ -7,16 +7,15 @@ disabled, so tai wires ChromaDB to this no-op client instead.
 from __future__ import annotations
 from typing import override
 from chromadb.config import System
 from chromadb.telemetry.product import ProductTelemetryClient, ProductTelemetryEvent
 from typing_extensions import override
 class NoOpProductTelemetryClient(ProductTelemetryClient):
    """Telemetry client that intentionally drops all events."""
-    def __init__(self, system: System):
+    def __init__(self, system: System) -> None:
        super().__init__(system)
    @override
--- a/src/tai/cli.py
+++ b/src/tai/cli.py
--- a/src/tai/history_store.py
+++ b/src/tai/history_store.py
@@ -0,0 +1,372 @@
 """Persistent run history store backed by SQLite or PostgreSQL.
 Stores full per-run JSON payloads and allows retrieving host-specific history
 to ground future analyses.
 """
 from __future__ import annotations
 import json
 import sqlite3
 from contextlib import contextmanager
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Any
 from urllib.parse import urlparse
 from tai.session_store import PastSession
 DEFAULT_HISTORY_DB_PATH = "~/.tai/history.db"
 @dataclass(slots=True)
 class HistoryRecord:
    """Full history record persisted for one run."""
    generated_at: str
    issue: str
    host: str
    model: str
    collection_total: int | None
    collection_failed: int | None
    collection_succeeded: int | None
    prompt_tokens: int | None
    completion_tokens: int | None
    total_tokens: int | None
    analysis: str
 class RunHistoryStore:
    """History store for host-scoped run payloads.
    Supported backends:
    - SQLite local path (default, e.g. ``~/.tai/history.db``)
    - SQLite URL (e.g. ``sqlite:////tmp/history.db``)
    - PostgreSQL DSN (e.g. ``postgresql://user:pass@host:5432/dbname``)
    """
    def __init__(self, db_path: str | Path = DEFAULT_HISTORY_DB_PATH) -> None:
        self._backend = "sqlite"
        self._postgres_dsn: str | None = None
        self._path: Path | None = None
        raw = str(db_path)
        parsed = urlparse(raw)
        if parsed.scheme in {"postgres", "postgresql"}:
            self._backend = "postgres"
            self._postgres_dsn = raw
        elif parsed.scheme == "sqlite":
            sqlite_path = parsed.path or ""
            if sqlite_path.startswith("//"):
                sqlite_path = sqlite_path[1:]
            self._path = Path(sqlite_path).expanduser().resolve()
            self._path.parent.mkdir(parents=True, exist_ok=True)
        else:
            self._path = Path(raw).expanduser().resolve()
            self._path.parent.mkdir(parents=True, exist_ok=True)
        self._init_schema()
    def _connect(self) -> sqlite3.Connection:
        if self._path is None:
            raise RuntimeError("SQLite path is not configured for this history backend")
        conn = sqlite3.connect(str(self._path))
        conn.row_factory = sqlite3.Row
        return conn
    @contextmanager
    def _connect_postgres(self) -> Any:
        if self._postgres_dsn is None:
            raise RuntimeError("PostgreSQL DSN is not configured for this history backend")
        try:
            import psycopg  # type: ignore[import-not-found]
            from psycopg.rows import dict_row  # type: ignore[import-not-found]
        except Exception as exc:  # noqa: BLE001
            raise RuntimeError(
                "PostgreSQL history backend requires psycopg. "
                "Install with: pip install psycopg[binary]"
            ) from exc
        conn = psycopg.connect(self._postgres_dsn, row_factory=dict_row)
        try:
            yield conn
        finally:
            conn.close()
    def _init_schema(self) -> None:
        if self._backend == "postgres":
            with self._connect_postgres() as conn:
                with conn.cursor() as cur:
                    cur.execute(
                        """
                        CREATE TABLE IF NOT EXISTS run_history (
                            id BIGSERIAL PRIMARY KEY,
                            generated_at TEXT NOT NULL,
                            host TEXT NOT NULL,
                            issue TEXT NOT NULL,
                            model TEXT NOT NULL,
                            analysis TEXT NOT NULL,
                            payload_json TEXT NOT NULL
                        )
                        """
                    )
                    cur.execute(
                        """
                        CREATE INDEX IF NOT EXISTS idx_run_history_host_ts
                        ON run_history(host, generated_at DESC)
                        """
                    )
                conn.commit()
            return
        with self._connect() as conn:
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS run_history (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    generated_at TEXT NOT NULL,
                    host TEXT NOT NULL,
                    issue TEXT NOT NULL,
                    model TEXT NOT NULL,
                    analysis TEXT NOT NULL,
                    payload_json TEXT NOT NULL
                )
                """
            )
            conn.execute(
                """
                CREATE INDEX IF NOT EXISTS idx_run_history_host_ts
                ON run_history(host, generated_at DESC)
                """
            )
    def count(self, *, host: str | None = None) -> int:
        if self._backend == "postgres":
            with self._connect_postgres() as conn:
                with conn.cursor() as cur:
                    if host is None:
                        cur.execute("SELECT COUNT(*) AS c FROM run_history")
                    else:
                        cur.execute(
                            "SELECT COUNT(*) AS c FROM run_history WHERE lower(host)=lower(%s)",
                            (host,),
                        )
                    row = cur.fetchone()
            if not row:
                return 0
            return int(row["c"])
        with self._connect() as conn:
            if host is None:
                row = conn.execute("SELECT COUNT(*) AS c FROM run_history").fetchone()
            else:
                row = conn.execute(
                    "SELECT COUNT(*) AS c FROM run_history WHERE lower(host)=lower(?)",
                    (host,),
                ).fetchone()
        count_value = row["c"] if row else 0
        return int(count_value) if isinstance(count_value, (int, float)) else 0
    def add_payload(self, payload: dict[str, object]) -> int:
        generated_at = str(payload.get("generated_at", ""))
        host = str(payload.get("host", ""))
        issue = str(payload.get("issue", ""))
        model = str(payload.get("model", ""))
        analysis = str(payload.get("analysis", ""))
        payload_json = json.dumps(payload, ensure_ascii=True)
        if self._backend == "postgres":
            with self._connect_postgres() as conn:
                with conn.cursor() as cur:
                    cur.execute(
                        """
                        INSERT INTO run_history(
                            generated_at, host, issue, model, analysis,
                            payload_json
                        )
                        VALUES (%s, %s, %s, %s, %s, %s)
                        RETURNING id
                        """,
                        (generated_at, host, issue, model, analysis, payload_json),
                    )
                    row = cur.fetchone()
                conn.commit()
            row_id = row["id"] if row else None
            return int(row_id) if row_id is not None else 0
        with self._connect() as conn:
            cursor = conn.execute(
                """
                INSERT INTO run_history(generated_at, host, issue, model, analysis, payload_json)
                VALUES (?, ?, ?, ?, ?, ?)
                """,
                (generated_at, host, issue, model, analysis, payload_json),
            )
            return cursor.lastrowid if cursor.lastrowid is not None else 0
    def list_host_sessions(self, host: str, *, limit: int = 5) -> list[PastSession]:
        return self.list_recent(host=host, limit=limit)
    def list_recent(self, *, host: str | None = None, limit: int = 20) -> list[PastSession]:
        """Return recent records, optionally filtered by host."""
        if limit < 1:
            return []
        if self._backend == "postgres":
            with self._connect_postgres() as conn:
                with conn.cursor() as cur:
                    if host:
                        cur.execute(
                            """
                            SELECT id, host, issue, analysis
                            FROM run_history
                            WHERE lower(host)=lower(%s)
                            ORDER BY generated_at DESC
                            LIMIT %s
                            """,
                            (host, limit),
                        )
                    else:
                        cur.execute(
                            """
                            SELECT id, host, issue, analysis
                            FROM run_history
                            ORDER BY generated_at DESC
                            LIMIT %s
                            """,
                            (limit,),
                        )
                    rows = cur.fetchall()
            return [
                PastSession(
                    session_id=f"db-{row['id']}",
                    host=str(row["host"]),
                    issue=str(row["issue"]),
                    summary=str(row["analysis"]),
                )
                for row in rows
            ]
        with self._connect() as conn:
            if host:
                rows = conn.execute(
                    """
                    SELECT id, host, issue, analysis
                    FROM run_history
                    WHERE lower(host)=lower(?)
                    ORDER BY generated_at DESC
                    LIMIT ?
                    """,
                    (host, limit),
                ).fetchall()
            else:
                rows = conn.execute(
                    """
                    SELECT id, host, issue, analysis
                    FROM run_history
                    ORDER BY generated_at DESC
                    LIMIT ?
                    """,
                    (limit,),
                ).fetchall()
        return [
            PastSession(
                session_id=f"db-{row['id']}",
                host=str(row["host"]),
                issue=str(row["issue"]),
                summary=str(row["analysis"]),
            )
            for row in rows
        ]
    def search_keyword(
        self,
        keyword: str,
        *,
        host: str | None = None,
        limit: int = 20,
    ) -> list[PastSession]:
        """Search issue/analysis text for *keyword*, optionally scoped by host."""
        term = keyword.strip()
        if not term:
            return self.list_recent(host=host, limit=limit)
        if limit < 1:
            return []
        like_pattern = f"%{term}%"
        if self._backend == "postgres":
            with self._connect_postgres() as conn:
                with conn.cursor() as cur:
                    if host:
                        cur.execute(
                            """
                            SELECT id, host, issue, analysis
                            FROM run_history
                            WHERE lower(host)=lower(%s)
                              AND (issue ILIKE %s OR analysis ILIKE %s)
                            ORDER BY generated_at DESC
                            LIMIT %s
                            """,
                            (host, like_pattern, like_pattern, limit),
                        )
                    else:
                        cur.execute(
                            """
                            SELECT id, host, issue, analysis
                            FROM run_history
                            WHERE issue ILIKE %s OR analysis ILIKE %s
                            ORDER BY generated_at DESC
                            LIMIT %s
                            """,
                            (like_pattern, like_pattern, limit),
                        )
                    rows = cur.fetchall()
            return [
                PastSession(
                    session_id=f"db-{row['id']}",
                    host=str(row["host"]),
                    issue=str(row["issue"]),
                    summary=str(row["analysis"]),
                )
                for row in rows
            ]
        with self._connect() as conn:
            if host:
                rows = conn.execute(
                    """
                    SELECT id, host, issue, analysis
                    FROM run_history
                    WHERE lower(host)=lower(?)
                      AND (issue LIKE ? COLLATE NOCASE OR analysis LIKE ? COLLATE NOCASE)
                    ORDER BY generated_at DESC
                    LIMIT ?
                    """,
                    (host, like_pattern, like_pattern, limit),
                ).fetchall()
            else:
                rows = conn.execute(
                    """
                    SELECT id, host, issue, analysis
                    FROM run_history
                    WHERE issue LIKE ? COLLATE NOCASE OR analysis LIKE ? COLLATE NOCASE
                    ORDER BY generated_at DESC
                    LIMIT ?
                    """,
                    (like_pattern, like_pattern, limit),
                ).fetchall()
        return [
            PastSession(
                session_id=f"db-{row['id']}",
                host=str(row["host"]),
                issue=str(row["issue"]),
                summary=str(row["analysis"]),
            )
            for row in rows
        ]
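Review note: the SQLite branch of `search_keyword` can be exercised without a history file. The following is a minimal sketch, assuming an in-memory database; `search_keyword` here is a free function written for illustration (not the method above), and `demo` and its sample rows are made up. It reuses the table definition and the `LIKE ... COLLATE NOCASE` predicate from the diff.

```python
import sqlite3
from contextlib import closing

# Same columns as the run_history table created in _init_schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS run_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    generated_at TEXT NOT NULL,
    host TEXT NOT NULL,
    issue TEXT NOT NULL,
    model TEXT NOT NULL,
    analysis TEXT NOT NULL,
    payload_json TEXT NOT NULL
)
"""


def search_keyword(conn, keyword, *, host=None, limit=20):
    """Return (id, host, issue) rows whose issue or analysis contains keyword."""
    pattern = f"%{keyword.strip()}%"
    sql = "SELECT id, host, issue FROM run_history WHERE "
    params = []
    if host:
        # Case-insensitive host match, as in the diff.
        sql += "lower(host)=lower(?) AND "
        params.append(host)
    sql += (
        "(issue LIKE ? COLLATE NOCASE OR analysis LIKE ? COLLATE NOCASE) "
        "ORDER BY generated_at DESC LIMIT ?"
    )
    params += [pattern, pattern, limit]
    return [tuple(row) for row in conn.execute(sql, params).fetchall()]


def demo():
    """Insert three illustrative rows and search one host for a keyword."""
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute(SCHEMA)
        conn.executemany(
            "INSERT INTO run_history("
            "generated_at, host, issue, model, analysis, payload_json) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            [
                ("2026-05-07T12:00:00Z", "web01", "nginx down", "m", "Root cause: bad config", "{}"),
                ("2026-05-08T09:30:00Z", "WEB01", "disk full", "m", "Nginx logs filled /var", "{}"),
                ("2026-05-09T08:00:00Z", "db01", "slow queries", "m", "Missing index", "{}"),
            ],
        )
        conn.commit()
        return search_keyword(conn, "NGINX", host="web01")
```

The sketch wraps the connection in `contextlib.closing` because `sqlite3.Connection` used as a context manager only commits or rolls back the transaction and leaves the connection open. The keyword is passed to `LIKE` unescaped, so `%` and `_` in user input act as wildcards.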
--- a/src/tai/runbook_store.py
+++ b/src/tai/runbook_store.py
@@ -14,10 +14,12 @@ Typical flow
 from __future__ import annotations
 import base64
 import re
 from dataclasses import dataclass, field
 from pathlib import Path
 from typing import TYPE_CHECKING, Any
 from urllib.parse import urlparse
 if TYPE_CHECKING:
    from tai.ai_client import AIClient
@@ -100,11 +102,19 @@ class RunbookStore:
        Defaults to ``~/.tai/runbooks``.
    """
-    def __init__(self, store_path: str | Path = DEFAULT_STORE_PATH) -> None:
+    def __init__(
        self,
        store_path: str | Path = DEFAULT_STORE_PATH,
        *,
        username: str | None = None,
        password: str | None = None,
    ) -> None:
        import chromadb  # optional dep — imported lazily
-        path = Path(store_path).expanduser().resolve()
+        raw_store = str(store_path)
-        path.mkdir(parents=True, exist_ok=True)
+        parsed = urlparse(raw_store)
        is_remote = parsed.scheme in {"http", "https"}
        settings = None
        try:
            from chromadb.config import Settings
@@ -119,6 +129,24 @@ class RunbookStore:
            # does not expose the real config module.
            settings = None
        if is_remote:
            host = parsed.hostname or "localhost"
            port = parsed.port or (443 if parsed.scheme == "https" else 80)
            ssl = parsed.scheme == "https"
            auth_user = username if username is not None else parsed.username
            auth_pass = password if password is not None else parsed.password
            headers = None
            if auth_user is not None and auth_pass is not None:
                token = base64.b64encode(f"{auth_user}:{auth_pass}".encode()).decode("ascii")
                headers = {"Authorization": f"Basic {token}"}
            if headers is None:
                self._client = chromadb.HttpClient(host=host, port=port, ssl=ssl)
            else:
                self._client = chromadb.HttpClient(host=host, port=port, ssl=ssl, headers=headers)
        else:
            path = Path(store_path).expanduser().resolve()
            path.mkdir(parents=True, exist_ok=True)
            if settings is None:
                self._client = chromadb.PersistentClient(path=str(path))
            else:
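Review note: the remote-store branch above boils down to URL parsing plus an optional Basic auth header. A minimal sketch of that step in isolation, assuming a hypothetical helper named `parse_store_target` (not part of the diff) that returns plain settings instead of constructing a ChromaDB client:

```python
import base64
from urllib.parse import urlparse


def parse_store_target(store, *, username=None, password=None):
    """Return HTTP client settings for a remote store URL, or None for a local path."""
    parsed = urlparse(str(store))
    if parsed.scheme not in {"http", "https"}:
        return None  # anything else is treated as a local directory
    ssl = parsed.scheme == "https"
    # Explicit arguments take precedence over credentials embedded in the URL.
    user = username if username is not None else parsed.username
    secret = password if password is not None else parsed.password
    headers = None
    if user is not None and secret is not None:
        token = base64.b64encode(f"{user}:{secret}".encode()).decode("ascii")
        headers = {"Authorization": f"Basic {token}"}
    return {
        "host": parsed.hostname or "localhost",
        "port": parsed.port or (443 if ssl else 80),
        "ssl": ssl,
        "headers": headers,
    }
```

Calling `parse_store_target("~/.tai/runbooks")` returns `None`, which corresponds to the `PersistentClient` branch in the diff. An HTTPS URL with embedded credentials yields `ssl=True` and an `Authorization` header.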
--- a/src/tai/session_store.py
+++ b/src/tai/session_store.py
@@ -98,6 +98,60 @@ class SessionStore:
                )
        return sessions
    def list_recent(self, *, host: str | None = None, limit: int = 20) -> list[PastSession]:
        """Return recent indexed sessions, optionally filtered by host."""
        if limit < 1:
            raise ValueError("limit must be >= 1")
        count = self._collection.count()
        if count == 0:
            return []
        results = self._collection.get(
            include=["documents", "metadatas"],
            limit=min(limit, count),
        )
        ids = results.get("ids") or []
        docs = results.get("documents") or []
        metas = results.get("metadatas") or []
        sessions: list[PastSession] = []
        for sid, doc, meta in zip(ids, docs, metas, strict=False):
            sessions.append(
                PastSession(
                    session_id=str(sid),
                    host=str(meta.get("host", "")),
                    issue=str(meta.get("issue", "")),
                    summary=str(doc),
                )
            )
        if host:
            host_norm = host.strip().lower()
            sessions = [s for s in sessions if s.host.lower() == host_norm]
        sessions.sort(key=lambda s: s.session_id, reverse=True)
        return sessions[:limit]
    def search_keyword(
        self,
        keyword: str,
        *,
        host: str | None = None,
        limit: int = 20,
    ) -> list[PastSession]:
        """Return recent sessions matching a keyword in issue or summary text."""
        term = keyword.strip().lower()
        if not term:
            return self.list_recent(host=host, limit=limit)
        all_recent = self.list_recent(host=host, limit=max(limit, self.count()))
        filtered = [
            sess
            for sess in all_recent
            if term in sess.issue.lower() or term in sess.summary.lower()
        ]
        return filtered[:limit]
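Review note: in `list_recent` above, `self._collection.get` is capped at `limit` before the host filter runs, so a host-scoped call can return fewer sessions than the store holds for that host. A minimal sketch of the filter-then-truncate ordering on plain in-memory records follows. `Session` is a hypothetical stand-in for `PastSession`, the sample data is made up, and no ChromaDB call is involved. Sorting by `session_id` assumes timestamp-shaped IDs like those used in the tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical stand-in for PastSession."""

    session_id: str
    host: str
    issue: str
    summary: str


def list_recent(sessions, *, host=None, limit=20):
    """Filter by host first, then sort newest-first and truncate."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    if host:
        host_norm = host.strip().lower()
        sessions = [s for s in sessions if s.host.lower() == host_norm]
    return sorted(sessions, key=lambda s: s.session_id, reverse=True)[:limit]


def search_keyword(sessions, keyword, *, host=None, limit=20):
    """Case-insensitive substring match on issue or summary text."""
    term = keyword.strip().lower()
    if not term:
        return list_recent(sessions, host=host, limit=limit)
    # Scan the whole pool so matches are not lost to an early truncation.
    pool = list_recent(sessions, host=host, limit=max(limit, len(sessions)))
    hits = [s for s in pool if term in s.issue.lower() or term in s.summary.lower()]
    return hits[:limit]


SESSIONS = [
    Session("20260507T120000Z", "web01", "nginx down", "Root cause: bad config"),
    Session("20260508T093000Z", "db01", "slow queries", "Missing index"),
    Session("20260509T080000Z", "Web01", "disk full", "Nginx logs filled /var"),
]
```

With this ordering, `list_recent(SESSIONS, host="web01", limit=1)` returns the newest `web01` session even though a `db01` record sits between the two `web01` records.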
 def _build_embed_text(*, host: str, issue: str, summary: str) -> str:
    """Build embedding text with host/issue context and summary excerpt."""
--- a/tai-live-ai-check.md
+++ b/tai-live-ai-check.md
@@ -0,0 +1,3 @@
 # Live AI check
 This verifies real embedding calls.
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -1,9 +1,22 @@
 import json
 import os
 import re
 from pathlib import Path
 from types import SimpleNamespace
 from unittest.mock import AsyncMock, MagicMock
 import pytest
 from typer.testing import CliRunner
-from tai.cli import app
+from tai.cli import (
    _download_markdown_url,
    _inject_url_credentials,
    _load_env_file,
    _materialize_runbook_add_path,
    _materialize_runbooks_sync_path,
    _resolve_secret,
    app,
 )
 from tai.collectors import CollectedItem, CollectionReport
 from tai.rag_retriever import Chunk, EmbeddedChunk
 from tai.ssh_client import SSHCommandResult
@@ -338,3 +351,509 @@ def test_interactive_rag_debug_prints_retrieval_scores(monkeypatch) -> None:  #
    assert result.exit_code == 0
    assert "RAG retrieve:" in result.stdout
 def test_history_command_lists_sessions(monkeypatch) -> None:  # type: ignore[no-untyped-def]
    class FakeStore:
        def __init__(self, _path: str, **_kwargs) -> None:
            pass
        def list_recent(self, *, host: str | None = None, limit: int = 20):
            del limit
            if host == "web01":
                return [
                    SimpleNamespace(
                        session_id="20260507T120000Z",
                        host="web01",
                        issue="nginx down",
                        summary="Root cause: bad config",
                    )
                ]
            return []
    monkeypatch.setattr("tai.cli.RunHistoryStore", FakeStore)
    runner = CliRunner()
    result = runner.invoke(
        app,
        ["history", "--history-db", "~/.tai/history.db", "--host", "web01"],
    )
    assert result.exit_code == 0
    assert "session(s)" in result.stdout
    assert "20260507T120000Z" in result.stdout
 def test_history_command_exports_markdown(monkeypatch, tmp_path: Path) -> None:  # type: ignore[no-untyped-def]
    class FakeStore:
        def __init__(self, _path: str, **_kwargs) -> None:
            pass
        def list_recent(self, *, host: str | None = None, limit: int = 20):
            del host, limit
            return [
                SimpleNamespace(
                    session_id="20260507T120000Z",
                    host="web01",
                    issue="nginx down",
                    summary="Root cause: bad config",
                )
            ]
    monkeypatch.setattr("tai.cli.RunHistoryStore", FakeStore)
    export_path = tmp_path / "history.md"
    runner = CliRunner()
    result = runner.invoke(
        app,
        ["history", "--history-db", "~/.tai/history.db", "--export", str(export_path)],
    )
    assert result.exit_code == 0
    assert "Exported" in result.stdout
    text = export_path.read_text(encoding="utf-8")
    assert "# tai session history" in text
    assert "nginx down" in text
 def test_interactive_history_without_store_shows_hint(monkeypatch) -> None:  # type: ignore[no-untyped-def]
    _mock_session(monkeypatch)
    async def fake_collect_from_plan(_session, _plan) -> CollectionReport:  # type: ignore[no-untyped-def]
        return CollectionReport(
            host="ssh.archflux.net",
            items=[
                CollectedItem(
                    name="kernel",
                    result=SSHCommandResult(
                        command="uname -a",
                        exit_code=0,
                        stdout="Linux test",
                        stderr="",
                    ),
                ),
            ],
        )
    commands = iter(["/history", "/quit"])
    monkeypatch.setattr("tai.cli.collect_from_plan", fake_collect_from_plan)
    monkeypatch.setattr("tai.cli.console.input", lambda _prompt: next(commands))
    monkeypatch.setattr("tai.cli._stdin_is_tty", lambda: True)
    runner = CliRunner()
    result = runner.invoke(
        app,
        [
            "run", "apache failed",
            "--host",
            "ssh.archflux.net",
            "--port",
            "5566",
            "--no-probe",
            "--interactive",
            "--no-history",
        ],
    )
    assert result.exit_code == 0
    assert "History DB is disabled" in result.stdout
 def test_run_analyze_writes_output_file(monkeypatch, tmp_path: Path) -> None:  # type: ignore[no-untyped-def]
    _mock_session(monkeypatch)
    async def fake_collect_from_plan(_session, _plan) -> CollectionReport:  # type: ignore[no-untyped-def]
        return CollectionReport(
            host="ssh.archflux.net",
            items=[
                CollectedItem(
                    name="kernel",
                    result=SSHCommandResult(
                        command="uname -a",
                        exit_code=0,
                        stdout="Linux test",
                        stderr="",
                    ),
                ),
            ],
        )
    monkeypatch.setattr("tai.cli.collect_from_plan", fake_collect_from_plan)
    response = SimpleNamespace(content="Root Cause\n\nEvidence\n\nRecommended Actions")
    monkeypatch.setattr(
        "tai.cli.AIClient.complete",
        lambda *_args, **_kwargs: response,
    )
    output_path = tmp_path / "analysis.md"
    runner = CliRunner()
    result = runner.invoke(
        app,
        [
            "run", "apache failed",
            "--host",
            "ssh.archflux.net",
            "--port",
            "5566",
            "--no-probe",
            "--analyze",
            "--output-file",
            str(output_path),
        ],
    )
    assert result.exit_code == 0
    assert "Wrote analysis output" in result.stdout
    assert output_path.exists()
    assert "Root Cause" in output_path.read_text(encoding="utf-8")
 def test_run_analyze_writes_json_output_and_strips_ansi(monkeypatch, tmp_path: Path) -> None:  # type: ignore[no-untyped-def]
    _mock_session(monkeypatch)
    async def fake_collect_from_plan(_session, _plan) -> CollectionReport:  # type: ignore[no-untyped-def]
        return CollectionReport(
            host="ssh.archflux.net",
            items=[
                CollectedItem(
                    name="kernel",
                    result=SSHCommandResult(
                        command="uname -a",
                        exit_code=0,
                        stdout="Linux test",
                        stderr="",
                    ),
                ),
            ],
        )
    monkeypatch.setattr("tai.cli.collect_from_plan", fake_collect_from_plan)
    monkeypatch.setattr(
        "tai.cli.AIClient.complete",
        lambda *_args, **_kwargs: SimpleNamespace(
            content="\x1b[31mRoot Cause\x1b[0m\n\nEvidence\n\nRecommended Actions"
        ),
    )
    output_path = tmp_path / "analysis.json"
    runner = CliRunner()
    result = runner.invoke(
        app,
        [
            "run", "apache failed",
            "--host",
            "ssh.archflux.net",
            "--port",
            "5566",
            "--no-probe",
            "--analyze",
            "--output-file",
            str(output_path),
            "--output-format",
            "json",
        ],
    )
    assert result.exit_code == 0
    payload = json.loads(output_path.read_text(encoding="utf-8"))
    assert payload["schema"] == "tai.analysis.v1"
    assert "generated_at" in payload
    assert payload["issue"] == "apache failed"
    assert payload["host"] == "ssh.archflux.net"
    assert payload["collection"] == {"total": 1, "failed": 0, "succeeded": 1}
    assert payload["token_usage"] == {
        "prompt_tokens": None,
        "completion_tokens": None,
        "total_tokens": None,
    }
    assert "Root Cause" in payload["analysis"]
    assert "\u001b" not in payload["analysis"]
 def test_run_analyze_writes_history_db_record(monkeypatch, tmp_path: Path) -> None:  # type: ignore[no-untyped-def]
    _mock_session(monkeypatch)
    async def fake_collect_from_plan(_session, _plan) -> CollectionReport:  # type: ignore[no-untyped-def]
        return CollectionReport(
            host="ssh.archflux.net",
            items=[
                CollectedItem(
                    name="kernel",
                    result=SSHCommandResult(
                        command="uname -a",
                        exit_code=0,
                        stdout="Linux test",
                        stderr="",
                    ),
                ),
            ],
        )
    monkeypatch.setattr("tai.cli.collect_from_plan", fake_collect_from_plan)
    response = SimpleNamespace(content="Root Cause\n\nEvidence\n\nRecommended Actions")
    monkeypatch.setattr(
        "tai.cli.AIClient.complete",
        lambda *_args, **_kwargs: response,
    )
    history_db = tmp_path / "history.db"
    runner = CliRunner()
    result = runner.invoke(
        app,
        [
            "run", "apache failed",
            "--host",
            "ssh.archflux.net",
            "--port",
            "5566",
            "--no-probe",
            "--analyze",
            "--history-db",
            str(history_db),
        ],
    )
    assert result.exit_code == 0
    import sqlite3
    with sqlite3.connect(str(history_db)) as conn:
        row = conn.execute(
            "SELECT host, issue, payload_json FROM run_history ORDER BY id DESC LIMIT 1"
        ).fetchone()
    assert row is not None
    assert row[0] == "ssh.archflux.net"
    assert row[1] == "apache failed"
    payload = json.loads(row[2])
    assert payload["schema"] == "tai.analysis.v1"
    assert payload["host"] == "ssh.archflux.net"
 def test_materialize_runbooks_sync_path_http_webroot(monkeypatch, tmp_path: Path) -> None:  # type: ignore[no-untyped-def]
    html = '<html><body><a href="nginx.md">nginx</a><a href="ssh.md">ssh</a></body></html>'
    def fake_download(url: str) -> str:
        if url == "https://kb.example/runbooks/":
            return html
        if url.endswith("nginx.md"):
            return "---\nservice: nginx\n---\nbody"
        if url.endswith("ssh.md"):
            return "---\nservice: ssh\n---\nbody"
        raise AssertionError(url)
    monkeypatch.setattr("tai.cli._download_text_url", fake_download)
    source_dir, label, temp_dir = _materialize_runbooks_sync_path(
        "https://kb.example/runbooks/",
        identity_file=None,
        jump_host=None,
        ignore_ssh_config=False,
    )
    assert label == "https://kb.example/runbooks/"
    assert temp_dir is not None
    assert (source_dir / "nginx.md").is_file()
    assert (source_dir / "ssh.md").is_file()
 def test_materialize_runbook_add_path_http_url(monkeypatch) -> None:  # type: ignore[no-untyped-def]
    monkeypatch.setattr(
        "tai.cli._download_markdown_url",
        lambda _url: "---\nservice: nginx\n---\nbody",
    )
    source_file, label, temp_dir = _materialize_runbook_add_path(
        "https://kb.example/runbooks/nginx.md",
        identity_file=None,
        jump_host=None,
        ignore_ssh_config=False,
    )
    assert label == "https://kb.example/runbooks/nginx.md"
    assert temp_dir is not None
    assert source_file.name == "nginx.md"
    assert source_file.read_text(encoding="utf-8").startswith("---")
 def test_download_markdown_url_rejects_html(monkeypatch) -> None:  # type: ignore[no-untyped-def]
    monkeypatch.setattr(
        "tai.cli._download_text_url",
        lambda _url: "<!DOCTYPE html><html><body>not markdown</body></html>",
    )
    with pytest.raises(ValueError, match="does not appear to be a Markdown payload"):
        _download_markdown_url("https://kb.example/runbooks/nginx.md")
 def test_materialize_runbooks_sync_path_http_skips_html_wrappers(monkeypatch) -> None:  # type: ignore[no-untyped-def]
    html = '<html><body><a href="nginx.md">nginx</a><a href="ssh.md">ssh</a></body></html>'
    def fake_download(url: str) -> str:
        if url == "https://kb.example/runbooks/":
            return html
        if url.endswith("nginx.md"):
            return "---\nservice: nginx\n---\nbody"
        if url.endswith("ssh.md"):
            return "<!DOCTYPE html><html><body>wrapper</body></html>"
        raise AssertionError(url)
    monkeypatch.setattr("tai.cli._download_text_url", fake_download)
    source_dir, _label, temp_dir = _materialize_runbooks_sync_path(
        "https://kb.example/runbooks/",
        identity_file=None,
        jump_host=None,
        ignore_ssh_config=False,
    )
    assert temp_dir is not None
    assert (source_dir / "nginx.md").is_file()
    assert not (source_dir / "ssh.md").exists()
 def test_materialize_runbook_add_path_http_requires_md_suffix() -> None:
    with pytest.raises(ValueError, match="must point to a .md file"):
        _materialize_runbook_add_path(
            "https://kb.example/runbooks/",
            identity_file=None,
            jump_host=None,
            ignore_ssh_config=False,
        )
 def test_runbooks_sync_accepts_ssh_source(monkeypatch, tmp_path: Path) -> None:  # type: ignore[no-untyped-def]
    runbooks_dir = tmp_path / "remote-runbooks"
    runbooks_dir.mkdir(parents=True)
    (runbooks_dir / "nginx.md").write_text("---\nservice: nginx\n---\nbody", encoding="utf-8")
    monkeypatch.setattr(
        "tai.cli._materialize_runbooks_sync_path",
        lambda *_args, **_kwargs: (runbooks_dir, "ssh://ops@host/runbooks", None),
    )
    class FakeStore:
        def __init__(self, _path: str, **_kwargs) -> None:
            pass
        def sync(self, _dir: Path, _ai):
            return 1
    monkeypatch.setattr("tai.cli.RunbookStore", FakeStore)
    monkeypatch.setattr("tai.cli.AIClient", lambda *_a, **_k: object())
    runner = CliRunner()
    result = runner.invoke(
        app,
        [
            "runbooks",
            "sync",
            "--path",
            "ssh://ops@host/runbooks",
            "--store",
            "~/.tai/runbooks",
        ],
    )
    assert result.exit_code == 0
    assert "Synced 1 runbook(s)" in result.stdout
    assert "ssh://ops@host/runbooks" in result.stdout
 def test_runbooks_add_accepts_https_source(monkeypatch) -> None:  # type: ignore[no-untyped-def]
    import tempfile
    fd, temp_name = tempfile.mkstemp(prefix="tai-runbook-test-", suffix=".md")
    os.close(fd)
    Path(temp_name).write_text("---\nservice: nginx\n---\nbody", encoding="utf-8")
    monkeypatch.setattr(
        "tai.cli._materialize_runbook_add_path",
        lambda *_args, **_kwargs: (Path(temp_name), "https://kb.example/nginx.md", None),
    )
    class FakeStore:
        def __init__(self, _path: str, **_kwargs) -> None:
            pass
        def sync_single(self, _path: Path, _ai):
            return None
    monkeypatch.setattr("tai.cli.RunbookStore", FakeStore)
    monkeypatch.setattr("tai.cli.AIClient", lambda *_a, **_k: object())
    runner = CliRunner()
    result = runner.invoke(
        app,
        [
            "runbooks",
            "add",
            "https://kb.example/nginx.md",
            "--store",
            "~/.tai/runbooks",
        ],
    )
    assert result.exit_code == 0
    assert "Indexed" in result.stdout
    assert "https://kb.example/nginx.md" in result.stdout
    Path(temp_name).unlink(missing_ok=True)
 def test_inject_url_credentials_postgres() -> None:
    target = "postgresql://db.example.com:5432/tai"
    rendered = _inject_url_credentials(
        target,
        user="tai_user",
        password="secret",
        schemes={"postgresql", "postgres"},
    )
    assert rendered.startswith("postgresql://tai_user:secret@db.example.com:5432/tai")
 def test_inject_url_credentials_ignores_non_matching_scheme() -> None:
    target = "~/.tai/history.db"
    rendered = _inject_url_credentials(
        target,
        user="tai_user",
        password="secret",
        schemes={"postgresql", "postgres"},
    )
    assert rendered == target
 def test_load_env_file_and_resolve_secret(tmp_path: Path, monkeypatch) -> None:  # type: ignore[no-untyped-def]
    env_file = tmp_path / ".env"
    env_file.write_text(
        "TAI_HISTORY_DB_USER=from_file\n"
        "TAI_HISTORY_DB_PASSWORD=from_file_pw\n",
        encoding="utf-8",
    )
    values = _load_env_file(str(env_file))
    assert values["TAI_HISTORY_DB_USER"] == "from_file"
    assert values["TAI_HISTORY_DB_PASSWORD"] == "from_file_pw"
    monkeypatch.setenv("TAI_HISTORY_DB_USER", "from_env")
    assert _resolve_secret(None, "TAI_HISTORY_DB_USER", values) == "from_file"
    assert _resolve_secret("from_cli", "TAI_HISTORY_DB_USER", values) == "from_cli"
 def test_man_page_covers_cli_long_options() -> None:
    runner = CliRunner()
    help_invocations = [
        ["run", "--help"],
        ["history", "--help"],
        ["runbooks", "sync", "--help"],
        ["runbooks", "list", "--help"],
        ["runbooks", "add", "--help"],
    ]
    documented = Path("docs/tai.1").read_text(encoding="utf-8")
    discovered: set[str] = set()
    for args in help_invocations:
        result = runner.invoke(app, args)
        assert result.exit_code == 0, f"help command failed for: {' '.join(args)}"
        discovered.update(re.findall(r"--[a-z0-9][a-z0-9-]*", result.stdout))
    discovered.discard("--help")
    missing = sorted(option for option in discovered if option not in documented)
    assert missing == [], f"Missing options in docs/tai.1: {', '.join(missing)}"
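Review note: the two `_inject_url_credentials` tests above pin down the expected behaviour. Credentials are spliced into URLs whose scheme is in `schemes`, and any other target, such as a plain file path, is returned unchanged. The sketch below is one way to satisfy those assertions. It is a hypothetical stand-in and not the helper from `tai.cli`: the name `inject_url_credentials_sketch`, the percent-encoding of the user info, and the choice to leave already-embedded credentials alone are all assumptions.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def inject_url_credentials_sketch(target, *, user, password, schemes):
    """Hypothetical stand-in; not the project's `_inject_url_credentials`."""
    parts = urlsplit(target)
    # Plain paths ("~/.tai/history.db") and other schemes pass through untouched.
    if parts.scheme not in schemes or not parts.netloc:
        return target
    # Assumption: never overwrite credentials already present in the URL.
    if "@" in parts.netloc or not user:
        return target
    # Percent-encode so characters like "@" or "/" cannot break the URL.
    userinfo = quote(user, safe="")
    if password:
        userinfo += ":" + quote(password, safe="")
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Called with the inputs from the tests, the PostgreSQL URL gains `tai_user:secret@` in front of the host, while `~/.tai/history.db` comes back as-is.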
--- a/tests/test_history_store.py
+++ b/tests/test_history_store.py
@@ -0,0 +1,115 @@
 """Tests for SQLite-backed run history storage."""
 from __future__ import annotations
 from pathlib import Path
 from tai.history_store import RunHistoryStore
 def test_history_store_add_and_count(tmp_path) -> None:  # type: ignore[no-untyped-def]
    store = RunHistoryStore(tmp_path / "history.db")
    assert store.count() == 0
    payload = {
        "schema": "tai.analysis.v1",
        "generated_at": "2026-05-11T12:00:00+00:00",
        "issue": "sshd failed",
        "host": "ssh.archflux.net",
        "model": "gemma3:4b",
        "collection": {"total": 5, "failed": 1, "succeeded": 4},
        "token_usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30},
        "analysis": "Root Cause...",
    }
    store.add_payload(payload)
    assert store.count() == 1
    assert store.count(host="ssh.archflux.net") == 1
    assert store.count(host="other") == 0
 def test_history_store_list_host_sessions(tmp_path) -> None:  # type: ignore[no-untyped-def]
    store = RunHistoryStore(tmp_path / "history.db")
    store.add_payload(
        {
            "schema": "tai.analysis.v1",
            "generated_at": "2026-05-11T12:00:00+00:00",
            "issue": "issue one",
            "host": "ssh.archflux.net",
            "model": "gemma3:4b",
            "collection": {"total": 1, "failed": 0, "succeeded": 1},
            "token_usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3},
            "analysis": "first",
        }
    )
    store.add_payload(
        {
            "schema": "tai.analysis.v1",
            "generated_at": "2026-05-11T12:05:00+00:00",
            "issue": "issue two",
            "host": "ssh.archflux.net",
            "model": "gemma3:4b",
            "collection": {"total": 1, "failed": 0, "succeeded": 1},
            "token_usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3},
            "analysis": "second",
        }
    )
    sessions = store.list_host_sessions("ssh.archflux.net", limit=2)
    assert len(sessions) == 2
    assert sessions[0].issue == "issue two"
    assert sessions[1].issue == "issue one"
 def test_history_store_list_recent_and_search_keyword(tmp_path) -> None:  # type: ignore[no-untyped-def]
    store = RunHistoryStore(tmp_path / "history.db")
    store.add_payload(
        {
            "schema": "tai.analysis.v1",
            "generated_at": "2026-05-11T13:00:00+00:00",
            "issue": "nginx failed",
            "host": "web01",
            "model": "gemma3:4b",
            "collection": {"total": 1, "failed": 0, "succeeded": 1},
            "token_usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3},
            "analysis": "nginx config typo",
        }
    )
    store.add_payload(
        {
            "schema": "tai.analysis.v1",
            "generated_at": "2026-05-11T13:10:00+00:00",
            "issue": "sshd failed",
            "host": "ssh.archflux.net",
            "model": "gemma3:4b",
            "collection": {"total": 1, "failed": 0, "succeeded": 1},
            "token_usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3},
            "analysis": "sshd key mismatch",
        }
    )
    recent = store.list_recent(limit=2)
    assert len(recent) == 2
    assert recent[0].issue == "sshd failed"
    matches = store.search_keyword("key", host="ssh.archflux.net", limit=5)
    assert len(matches) == 1
    assert matches[0].host == "ssh.archflux.net"
 def test_history_store_accepts_sqlite_url(tmp_path: Path) -> None:
    db_file = tmp_path / "history-url.db"
    store = RunHistoryStore(f"sqlite:///{db_file}")
    store.add_payload(
        {
            "schema": "tai.analysis.v1",
            "generated_at": "2026-05-11T13:20:00+00:00",
            "issue": "test",
            "host": "host1",
            "model": "gemma3:4b",
            "collection": {"total": 1, "failed": 0, "succeeded": 1},
            "token_usage": {"prompt_tokens": 1, "completion_tokens": 1, "total_tokens": 2},
            "analysis": "ok",
        }
    )
    assert store.count(host="host1") == 1
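Review note: the history-store tests above rely on three behaviours: payloads are persisted, `count()` can be filtered by host, and `list_host_sessions()` returns the newest run first. The sketch below shows one minimal way to get those semantics with the standard-library `sqlite3` module. It is not the implementation in `tai.history_store`: the class name `MiniHistoryStore`, the table layout, and the `SessionRow` type are hypothetical. Ordering assumes that `generated_at` values are ISO 8601 strings with a common UTC offset, so they sort correctly as text.

```python
import json
import sqlite3
from dataclasses import dataclass


@dataclass
class SessionRow:
    generated_at: str
    host: str
    issue: str


class MiniHistoryStore:
    """Hypothetical stand-in; not the project's RunHistoryStore."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS runs ("
            "id INTEGER PRIMARY KEY, generated_at TEXT, host TEXT, "
            "issue TEXT, payload TEXT)"
        )

    def add_payload(self, payload):
        # The connection context manager commits on success, rolls back on error.
        with self._db:
            self._db.execute(
                "INSERT INTO runs (generated_at, host, issue, payload) "
                "VALUES (?, ?, ?, ?)",
                (payload["generated_at"], payload["host"], payload["issue"],
                 json.dumps(payload)),
            )

    def count(self, host=None):
        if host is None:
            row = self._db.execute("SELECT COUNT(*) FROM runs").fetchone()
        else:
            row = self._db.execute(
                "SELECT COUNT(*) FROM runs WHERE host = ?", (host,)
            ).fetchone()
        return row[0]

    def list_host_sessions(self, host, limit=10):
        # Newest first; the row id breaks ties between equal timestamps.
        rows = self._db.execute(
            "SELECT generated_at, host, issue FROM runs WHERE host = ? "
            "ORDER BY generated_at DESC, id DESC LIMIT ?",
            (host, limit),
        ).fetchall()
        return [SessionRow(*row) for row in rows]
```

Keeping the full payload as JSON next to a few indexed columns is a design assumption here. It keeps the filterable fields cheap to query without fixing a schema for the rest of the payload.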
--- a/tests/test_runbook_store.py
+++ b/tests/test_runbook_store.py
@@ -100,6 +100,7 @@ def _make_chromadb_mock() -> MagicMock:
    client.get_or_create_collection.return_value = collection
    chroma_mod = MagicMock()
    chroma_mod.PersistentClient.return_value = client
    chroma_mod.HttpClient.return_value = client
    return chroma_mod
@@ -251,3 +252,28 @@ def test_runbook_store_sync_single_missing_file_raises(tmp_path: Path) -> None:
        store = RunbookStore(tmp_path / "store")
        with pytest.raises(FileNotFoundError):
            store.sync_single(tmp_path / "missing.md", ai)
 def test_runbook_store_remote_url_uses_http_client() -> None:
    chroma_mock = _make_chromadb_mock()
    with patch.dict("sys.modules", {"chromadb": chroma_mock}):
        store = RunbookStore("https://chroma.example.com:8443")
        assert store.count() == 0
    chroma_mock.HttpClient.assert_called_once_with(host="chroma.example.com", port=8443, ssl=True)
 def test_runbook_store_remote_url_uses_http_client_with_basic_auth() -> None:
    chroma_mock = _make_chromadb_mock()
    with patch.dict("sys.modules", {"chromadb": chroma_mock}):
        store = RunbookStore("https://chroma.example.com:8443", username="tai", password="secret")
        assert store.count() == 0
    args = chroma_mock.HttpClient.call_args.kwargs
    assert args["host"] == "chroma.example.com"
    assert args["port"] == 8443
    assert args["ssl"] is True
    assert "headers" in args
    assert str(args["headers"].get("Authorization", "")).startswith("Basic ")
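Review note: the two remote-URL tests above expect `https://chroma.example.com:8443` to become `host`, `port` and `ssl` keyword arguments, plus an `Authorization: Basic …` header when a username and password are supplied. The sketch below shows how such a mapping could be derived. The function name `http_client_kwargs` and the default ports are assumptions; the real logic lives in `RunbookStore` and may differ. The sketch only builds the keyword dictionary and does not import or call `chromadb`.

```python
import base64
from urllib.parse import urlsplit


def http_client_kwargs(url, username=None, password=None):
    """Hypothetical helper: derive HTTP client kwargs from a store URL."""
    parts = urlsplit(url)
    use_ssl = parts.scheme == "https"
    kwargs = {
        "host": parts.hostname,
        # Assumption: fall back to the scheme's usual port when none is given.
        "port": parts.port or (443 if use_ssl else 80),
        "ssl": use_ssl,
    }
    if username is not None and password is not None:
        # HTTP Basic auth: base64 of "username:password".
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        kwargs["headers"] = {"Authorization": f"Basic {token}"}
    return kwargs
```

Without credentials the result contains no `headers` key, which matches the exact-argument assertion in the first test.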
--- a/tests/test_session_store.py
+++ b/tests/test_session_store.py
@@ -77,3 +77,53 @@ def test_query_returns_past_sessions(tmp_path: Path) -> None:
    assert isinstance(results[0], PastSession)
    assert results[0].host == "web01"
    assert "package missing" in results[0].summary
 def test_list_recent_returns_sessions_sorted_desc(tmp_path: Path) -> None:
    chroma_mock = _make_chromadb_mock()
    collection = chroma_mock.PersistentClient.return_value.get_or_create_collection.return_value
    collection.count.return_value = 3
    collection.get.return_value = {
        "ids": ["20260506T120000Z", "20260507T120000Z", "20260505T120000Z"],
        "documents": ["older", "newer", "oldest"],
        "metadatas": [
            {"host": "web01", "issue": "i1"},
            {"host": "web01", "issue": "i2"},
            {"host": "db01", "issue": "i3"},
        ],
    }
    with patch.dict("sys.modules", {"chromadb": chroma_mock}):
        store = SessionStore(tmp_path / "store")
        results = store.list_recent(limit=2)
    assert len(results) == 2
    assert results[0].session_id == "20260507T120000Z"
    assert results[1].session_id == "20260506T120000Z"
 def test_search_keyword_filters_by_term_and_host(tmp_path: Path) -> None:
    chroma_mock = _make_chromadb_mock()
    collection = chroma_mock.PersistentClient.return_value.get_or_create_collection.return_value
    collection.count.return_value = 3
    collection.get.return_value = {
        "ids": ["20260505T120000Z", "20260506T120000Z", "20260507T120000Z"],
        "documents": [
            "Root cause: nginx config typo",
            "Root cause: package missing",
            "Root cause: nginx port conflict",
        ],
        "metadatas": [
            {"host": "web01", "issue": "nginx fails"},
            {"host": "web01", "issue": "sssd fails"},
            {"host": "db01", "issue": "nginx start failed"},
        ],
    }
    with patch.dict("sys.modules", {"chromadb": chroma_mock}):
        store = SessionStore(tmp_path / "store")
        results = store.search_keyword("nginx", host="web01", limit=5)
    assert len(results) == 1
    assert results[0].host == "web01"
    assert "nginx" in results[0].issue.lower()
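Review note: both session-store tests above feed the store a `collection.get()` result in arbitrary order and expect newest-first output, with optional filtering by keyword and host. The sketch below illustrates that logic over a plain dictionary of the same shape. The function names are hypothetical, and sorting by session ID assumes the IDs are fixed-width UTC timestamps such as `20260507T120000Z`, which order correctly as strings. Whether the real `SessionStore` matches against the issue text, the document, or both is not visible in the tests; this sketch checks both.

```python
def recent_sessions(records, limit):
    """Return (session_id, document, metadata) tuples, newest first."""
    rows = zip(records["ids"], records["documents"], records["metadatas"])
    # Fixed-width timestamp IDs sort chronologically as plain strings.
    newest_first = sorted(rows, key=lambda row: row[0], reverse=True)
    return newest_first[:limit]


def keyword_matches(records, term, host=None, limit=5):
    """Case-insensitive substring search, optionally restricted to one host."""
    needle = term.lower()
    hits = []
    for session_id, document, meta in recent_sessions(records, len(records["ids"])):
        if host is not None and meta.get("host") != host:
            continue
        # Assumption: search both the issue text and the stored document.
        haystack = f"{meta.get('issue', '')}\n{document}".lower()
        if needle in haystack:
            hits.append((session_id, document, meta))
    return hits[:limit]
```

Filtering before applying `limit` matters: truncating first could drop older matches that belong in the result.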
Author	SHA1	Message	Date
zphinx	3be14f8f6f	commit all of this (CI / test (push): successful in 27s)	2026-05-14 20:00:38 +02:00
zphinx	2d8a5a66ca	feat(cli): add clean analysis export with markdown/json output	2026-05-11 21:54:21 +02:00
zphinx	92ce7da28f	docs: require tea for agent gitea workflows (CI / test (push): failing after 15s)	2026-05-11 21:25:42 +02:00
zphinx	f54af5761b	merge: history UX and retention roadmap (Tag Build / build (push): successful in 9m32s; CI / test (push): failing after 16s)	2026-05-11 21:09:47 +02:00
zphinx	7749a02706	feat: add history UX and expand retention-focused roadmap (CI / test (push): failing after 15s)	2026-05-11 21:07:39 +02:00
@@ -2,4 +2,4 @@
 __all__ = ["__version__"]
-__version__ = "0.1.0"
+__version__ = "0.4.0"
@@ -0,0 +1,3 @@
+# Live AI check
+
+This verifies real embedding calls.