diff --git a/CHANGELOG.md b/CHANGELOG.md
index bbb0180..21217bc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -22,20 +22,20 @@ ______________________________________________________________________
 - Runbook knowledge store module `src/tai/runbook_store.py` (persistent ChromaDB-backed index and query)
 - Chroma telemetry no-op client `src/tai/chroma_telemetry.py` to suppress noisy local telemetry errors
 - `tai runbooks` command group with:
- - `sync` for indexing all Markdown runbooks
- - `list` for listing indexed metadata
- - `add` for indexing a single runbook file
+ - `sync` for indexing all Markdown runbooks
+ - `list` for listing indexed metadata
+ - `add` for indexing a single runbook file
 - `--runbooks` option on `tai run` to enable Tier 2 runbook retrieval
 - Initial analysis RAG path using retrieved diagnostic chunks (`build_analysis_message_with_chunks`)
 - Follow-up RAG path updates with tighter `top_k` and runbook context injection
 - AI runtime controls:
- - `--ai-timeout-seconds`
- - `--ai-max-tokens`
+ - `--ai-timeout-seconds`
+ - `--ai-max-tokens`
 - Non-streaming AI completion path for improved local backend reliability
 - Service/subsystem presence probes in collection plans:
- - unit-file checks
- - expected binary path checks
- - status/journal/config probes for recognized services including `sssd`
+ - unit-file checks
+ - expected binary path checks
+ - status/journal/config probes for recognized services including `sssd`
 - Prompt instruction for "component absent or not installed" interpretation when presence signals are missing
 - Runbook store unit tests in `tests/test_runbook_store.py`
 - CLI tests updated for `tai run` subcommand and non-streaming completion mocks
diff --git a/ROADMAP.md b/ROADMAP.md
index 6a1e8ef..584c9ad 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -130,7 +130,7 @@ model weights alone. Three tiers of increasing capability, each buildable indepe
 - Build compounding institutional memory from past troubleshooting sessions
 - Keep all data local — no embeddings or session content leaves the network

----
+______________________________________________________________________

 ### Technology Decisions Required

@@ -145,7 +145,7 @@ model weights alone. Three tiers of increasing capability, each buildable indepe
 | Runbook format | Markdown, YAML, JSON | Markdown (human-editable, version-controllable) | ✅ Implemented |
 | Session index storage | Local `~/.tai/`, configurable path | `~/.tai/sessions/` with ChromaDB collection | ⬜ Pending |

----
+______________________________________________________________________

 ### Tier 1 — Diagnostic Chunk Retrieval (in-memory, per-session)

@@ -155,31 +155,36 @@ Status: ✅ Implemented
 On busy hosts this floods the context window with irrelevant output, degrading quality.

 **Approach:**
+
 - After collection, split each command's output into overlapping token chunks (e.g. 512 tokens, 64 overlap)
 - Embed all chunks using `nomic-embed-text` via Ollama embeddings API
 - On each question (initial + follow-up), embed the question and retrieve top-k chunks by cosine similarity
 - Inject only retrieved chunks into the prompt, not the full dump

 **New module:** `src/tai/rag_retriever.py`
+
 - `chunk_report(report) -> list[Chunk]`
 - `embed_chunks(chunks) -> list[EmbeddedChunk]`
 - `retrieve(question, embedded_chunks, top_k) -> list[Chunk]`

 **Changes to existing code:**
+
 - `prompt_builder.py`: accept `retrieved_chunks` instead of full `CollectionReport` for RAG-mode prompts
 - `cli.py`: embed report after collection, pass retriever to `_run_analysis` and `_run_followup_analysis`
 - `ai_client.py`: add `embed(text) -> list[float]` method using Ollama `/api/embeddings`

 **Companion features buildable at same time:**
+
 - `--no-rag` flag to bypass retrieval and use full dump (backwards compat)
 - Token budget display: show user how many tokens are being sent vs. saved
 - Per-chunk source attribution in AI response (which command produced the evidence)

 **Tests:**
+
 - `tests/test_rag_retriever.py`: chunk splitting, cosine similarity ranking, top-k retrieval
 - `tests/test_ai.py`: add `test_embed_returns_float_list()`

----
+______________________________________________________________________

 ### Tier 2 — Runbook Knowledge Base (persistent, ChromaDB)

@@ -189,33 +194,38 @@ Status: ✅ Implemented
 specific environments, distros, or internal conventions.

 **Approach:**
+
 - Maintain a version-controlled corpus of Markdown runbooks in `runbooks/` directory
 - On first run (or `tai runbooks --sync`), embed all runbooks and persist to ChromaDB collection
 - On each analysis, retrieve top-3 relevant runbook chunks alongside diagnostic chunks
 - Inject as a separate `## Runbook Context` section in the prompt

 **New module:** `src/tai/runbook_store.py`
+
 - `RunbookStore`: wraps ChromaDB collection
 - `sync(runbooks_dir) -> int` — embed and upsert all runbooks
 - `query(question, top_k) -> list[RunbookChunk]`

 **New directory:** `runbooks/`
+
 - `ssh.md`, `nginx.md`, `postgres.md`, `disk.md`, `kernel.md`, etc.
 - Each runbook: YAML frontmatter (`service`, `symptoms`, `tags`) + Markdown body

 **New CLI command:** `tai runbooks --sync [--path ./runbooks]`

 **Changes to existing code:**
+
 - `prompt_builder.py`: add `build_message_with_runbooks(retrieved_chunks, runbook_chunks)`
 - `cli.py`: optionally load `RunbookStore`, query it per analysis turn

 **Companion features buildable at same time:**
+
 - `tai runbooks --list` — show indexed runbooks and last sync time
 - `tai runbooks --add ` — index a single runbook
 - `/runbooks` slash command in interactive mode — show which runbooks were retrieved
 - Runbook citation in AI output: "Based on runbook: `ssh.md#AuthenticationFailures`"

----
+______________________________________________________________________

 ### Tier 3 — Session Memory Index (institutional learning)

@@ -225,27 +235,31 @@ Status: ⬜ Pending
 same issue type get no benefit from past work.

 **Approach:**
+
 - On session end, embed the session summary (issue + root cause + actions) and upsert into a persistent ChromaDB collection (`~/.tai/sessions/`)
 - On session start, query for similar past sessions by issue text + hostname
 - Inject top-2 past sessions as `## Prior Sessions` context
 - Optionally: `/history` command in interactive mode to surface past sessions explicitly

 **New module:** `src/tai/session_store.py`
+
 - `SessionStore`: wraps ChromaDB collection at `~/.tai/sessions/`
 - `index_session(session_log_path)` — embed and store completed session
 - `query_similar(issue, host, top_k) -> list[PastSession]`

 **Changes to existing code:**
+
 - `session_log.py`: add `summarise() -> str` method (issue + final AI response)
 - `cli.py`: query `SessionStore` at session start, index at session end

 **Companion features buildable at same time:**
+
 - `tai history` CLI subcommand — search past sessions by keyword
 - `tai history --host ` — all sessions for a host
 - `tai history --export ` — export session summaries as Markdown report
 - Auto-suggest: "Similar issue found from 2 weeks ago — load context? [y/N]"

----
+______________________________________________________________________

 ### Implementation Order

@@ -258,6 +272,7 @@ Tier 3 (session memory) ← Builds on Tier 2 infrastructure. Minimal extr
 ```

 **Estimated effort:**
+
 - Tier 1: 2–3 days (new module + prompt builder changes + tests)
 - Tier 2: 3–4 days (ChromaDB + runbook authoring + CLI command + tests)
 - Tier 3: 1–2 days (reuses Tier 2 infrastructure)
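
Reviewer note (not part of the patch): the Tier 1 approach in the ROADMAP context above (overlapping chunks, then top-k retrieval by cosine similarity) can be sketched as below. This is a minimal illustration under stated assumptions, not the code in `src/tai/rag_retriever.py`. The names `chunk_tokens`, `toy_embed`, `cosine_similarity`, and `top_k_chunks` are hypothetical, tokens are approximated by whitespace splitting, and a bag-of-words counter stands in for the `nomic-embed-text` embeddings so the example runs without Ollama.

```python
import math
from collections import Counter


def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into windows of `size` tokens sharing `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


def toy_embed(text):
    """Hypothetical stand-in embedding: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(question, chunks, top_k=3):
    """Return the `top_k` chunks most similar to the question."""
    query = toy_embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query, toy_embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]
```

With a real embedding model, `toy_embed` would be replaced by a call that returns a dense float vector, and `cosine_similarity` would operate on lists of floats instead of counters; the chunking and ranking logic would stay the same.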