lint

2026-05-06 05:03:51 +02:00
parent d5e1822644
commit bbc75b1559
2 changed files with 28 additions and 13 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -130,7 +130,7 @@ model weights alone. Three tiers of increasing capability, each buildable indepe
 - Build compounding institutional memory from past troubleshooting sessions
 - Keep all data local — no embeddings or session content leaves the network

---
+______________________________________________________________________

 ### Technology Decisions Required

@@ -145,7 +145,7 @@ model weights alone. Three tiers of increasing capability, each buildable indepe
 | Runbook format | Markdown, YAML, JSON | Markdown (human-editable, version-controllable) | ✅ Implemented |
 | Session index storage | Local `~/.tai/`, configurable path | `~/.tai/sessions/` with ChromaDB collection | ⬜ Pending |

---
+______________________________________________________________________

 ### Tier 1 — Diagnostic Chunk Retrieval (in-memory, per-session)

@@ -155,31 +155,36 @@ Status: ✅ Implemented
 On busy hosts this floods the context window with irrelevant output, degrading quality.

 **Approach:**
+
 - After collection, split each command's output into overlapping token chunks (e.g. 512 tokens, 64 overlap)
 - Embed all chunks using `nomic-embed-text` via Ollama embeddings API
 - On each question (initial + follow-up), embed the question and retrieve top-k chunks by cosine similarity
 - Inject only retrieved chunks into the prompt, not the full dump
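
The chunk-and-retrieve steps above can be sketched in plain Python. This is a minimal illustration, not the contents of `rag_retriever.py`: the `Chunk` fields, the `chunk_tokens` helper, and the `(chunk, vector)` pair layout are assumptions, and the vectors are expected to come from the embedding model.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str        # command that produced the output (hypothetical field)
    tokens: list[str]  # the tokens covered by this window


def chunk_tokens(source, tokens, size=512, overlap=64):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(Chunk(source, tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the output
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(question_vec, embedded, top_k=3):
    """`embedded` is a list of (Chunk, vector) pairs; return the top_k chunks."""
    ranked = sorted(embedded, key=lambda pair: cosine(question_vec, pair[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]
```

With the example sizes (512 tokens, 64 overlap) each window starts 448 tokens after the previous one, so a 1000-token output yields three chunks.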

 **New module:** `src/tai/rag_retriever.py`
+
 - `chunk_report(report) -> list[Chunk]`
 - `embed_chunks(chunks) -> list[EmbeddedChunk]`
 - `retrieve(question, embedded_chunks, top_k) -> list[Chunk]`

 **Changes to existing code:**
+
 - `prompt_builder.py`: accept `retrieved_chunks` instead of full `CollectionReport` for RAG-mode prompts
 - `cli.py`: embed report after collection, pass retriever to `_run_analysis` and `_run_followup_analysis`
 - `ai_client.py`: add `embed(text) -> list[float]` method using Ollama `/api/embeddings`
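
A possible shape for the `embed` method, using only the standard library. It assumes Ollama is listening on its default local address and that `/api/embeddings` accepts a JSON body with `model` and `prompt` and answers with an `embedding` array; the helper names `build_embed_request` and `parse_embedding` are hypothetical, and the request shape should be checked against the installed Ollama version.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed: Ollama's default local address


def build_embed_request(text, model="nomic-embed-text", base_url=OLLAMA_URL):
    """Build the POST request for the embeddings endpoint (assumed payload shape)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding(body):
    """Extract the vector from a response body and coerce every entry to float."""
    vector = json.loads(body)["embedding"]
    return [float(x) for x in vector]


def embed(text, model="nomic-embed-text", base_url=OLLAMA_URL, timeout=30):
    """Send one text to the local Ollama server and return its embedding."""
    request = build_embed_request(text, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embedding(response.read())
```

Splitting request building and response parsing from the network call keeps `test_embed_returns_float_list()` runnable without a live Ollama server.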

 **Companion features buildable at same time:**
+
 - `--no-rag` flag to bypass retrieval and use full dump (backwards compat)
 - Token budget display: show user how many tokens are being sent vs. saved
 - Per-chunk source attribution in AI response (which command produced the evidence)

 **Tests:**
+
 - `tests/test_rag_retriever.py`: chunk splitting, cosine similarity ranking, top-k retrieval
 - `tests/test_ai.py`: add `test_embed_returns_float_list()`

---
+______________________________________________________________________

 ### Tier 2 — Runbook Knowledge Base (persistent, ChromaDB)

@@ -189,33 +194,38 @@ Status: ✅ Implemented
 specific environments, distros, or internal conventions.

 **Approach:**
+
 - Maintain a version-controlled corpus of Markdown runbooks in `runbooks/` directory
 - On first run (or `tai runbooks --sync`), embed all runbooks and persist to ChromaDB collection
 - On each analysis, retrieve top-3 relevant runbook chunks alongside diagnostic chunks
 - Inject as a separate `## Runbook Context` section in the prompt

 **New module:** `src/tai/runbook_store.py`
+
 - `RunbookStore`: wraps ChromaDB collection
 - `sync(runbooks_dir) -> int` — embed and upsert all runbooks
 - `query(question, top_k) -> list[RunbookChunk]`

 **New directory:** `runbooks/`
+
 - `ssh.md`, `nginx.md`, `postgres.md`, `disk.md`, `kernel.md`, etc.
 - Each runbook: YAML frontmatter (`service`, `symptoms`, `tags`) + Markdown body
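
To illustrate the runbook layout, the sketch below separates the frontmatter from the Markdown body. It is deliberately not a YAML parser: it only understands flat `key: value` and `key: [a, b]` lines, which is an assumption about how the `service`, `symptoms`, and `tags` fields would be written. A real implementation would more likely use a YAML library. The function name `split_frontmatter` is hypothetical.

```python
def split_frontmatter(text):
    """Return (metadata, body) for a runbook.

    Handles only flat `key: value` and `key: [a, b]` lines between two
    `---` markers. Anything else in the frontmatter is ignored.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter at all
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # opening marker without a closing one
    meta = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list: split on commas and drop empty entries
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body
```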

 **New CLI command:** `tai runbooks --sync [--path ./runbooks]`

 **Changes to existing code:**
+
 - `prompt_builder.py`: add `build_message_with_runbooks(retrieved_chunks, runbook_chunks)`
 - `cli.py`: optionally load `RunbookStore`, query it per analysis turn
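
One way `build_message_with_runbooks` could assemble the prompt is sketched below. The `(source, text)` pair shape for both chunk lists and the `## Diagnostic Evidence` heading are assumptions; only the `## Runbook Context` heading comes from the plan above.

```python
def build_message_with_runbooks(retrieved_chunks, runbook_chunks):
    """Render diagnostic chunks, then runbook chunks under their own heading.

    Both arguments are lists of (source, text) pairs (assumed shape).
    The runbook section is omitted entirely when no runbook matched.
    """
    parts = ["## Diagnostic Evidence"]
    for source, text in retrieved_chunks:
        parts.append(f"### {source}\n{text}")
    if runbook_chunks:
        parts.append("## Runbook Context")
        for source, text in runbook_chunks:
            parts.append(f"### {source}\n{text}")
    return "\n\n".join(parts)
```

Keeping the runbook material in a separate section lets the model, and the `/runbooks` slash command, tell collected evidence apart from reference guidance.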

 **Companion features buildable at same time:**
+
 - `tai runbooks --list` — show indexed runbooks and last sync time
 - `tai runbooks --add <file>` — index a single runbook
 - `/runbooks` slash command in interactive mode — show which runbooks were retrieved
 - Runbook citation in AI output: "Based on runbook: `ssh.md#AuthenticationFailures`"

---
+______________________________________________________________________

 ### Tier 3 — Session Memory Index (institutional learning)

@@ -225,27 +235,31 @@ Status: ⬜ Pending
 same issue type get no benefit from past work.

 **Approach:**
+
 - On session end, embed the session summary (issue + root cause + actions) and upsert into a persistent ChromaDB collection (`~/.tai/sessions/`)
 - On session start, query for similar past sessions by issue text + hostname
 - Inject top-2 past sessions as `## Prior Sessions` context
 - Optionally: `/history` command in interactive mode to surface past sessions explicitly

 **New module:** `src/tai/session_store.py`
+
 - `SessionStore`: wraps ChromaDB collection at `~/.tai/sessions/`
 - `index_session(session_log_path)` — embed and store completed session
 - `query_similar(issue, host, top_k) -> list[PastSession]`

 **Changes to existing code:**
+
 - `session_log.py`: add `summarise() -> str` method (issue + final AI response)
 - `cli.py`: query `SessionStore` at session start, index at session end
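
A rough sketch of the `summarise()` method and of rendering retrieved summaries as a `## Prior Sessions` block. The `SessionLog` fields and the `format_prior_sessions` helper are hypothetical; the real `session_log.py` may store its data differently.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLog:
    issue: str
    host: str
    ai_responses: list[str] = field(default_factory=list)  # hypothetical field

    def summarise(self) -> str:
        """Issue plus the final AI response, as plain text ready for embedding."""
        if self.ai_responses:
            final = self.ai_responses[-1].strip()
        else:
            final = "(no AI response recorded)"
        return f"Host: {self.host}\nIssue: {self.issue.strip()}\nOutcome: {final}"


def format_prior_sessions(summaries, limit=2):
    """Render at most `limit` summaries, assumed to be sorted best match first."""
    if not summaries:
        return ""
    blocks = "\n\n".join(summaries[:limit])
    return f"## Prior Sessions\n\n{blocks}"
```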

 **Companion features buildable at same time:**
+
 - `tai history` CLI subcommand — search past sessions by keyword
 - `tai history --host <hostname>` — all sessions for a host
 - `tai history --export <file>` — export session summaries as Markdown report
 - Auto-suggest: "Similar issue found from 2 weeks ago — load context? [y/N]"

---
+______________________________________________________________________

 ### Implementation Order

@@ -258,6 +272,7 @@ Tier 3 (session memory)        ← Builds on Tier 2 infrastructure. Minimal extr
 ```

 **Estimated effort:**
+
 - Tier 1: 2–3 days (new module + prompt builder changes + tests)
 - Tier 2: 3–4 days (ChromaDB + runbook authoring + CLI command + tests)
 - Tier 3: 1–2 days (reuses Tier 2 infrastructure)