7 Commits

Author SHA1 Message Date
e49670a664 docs(roadmap): add Phase 6 RAG & Knowledge Layer plan
Some checks failed
CI / test (push) Failing after 15s
- Three-tier RAG architecture: diagnostic chunks, runbook KB, session memory
- Technology decisions table with options and recommendations
- Per-tier: approach, new modules, changes to existing code, companion features
- Implementation order and effort estimates
- New dependencies and optional pyproject.toml group
- Decisions log entries for RAG choices pending confirmation
2026-05-04 18:23:33 +02:00
4870bd3bfe ci: rename release.yml to tag.yml, fix trigger to match non-v tags
All checks were successful
CI / test (push) Successful in 20s
Tag Build / build (push) Successful in 8m33s
- Trigger was 'v*' but tags are bare semver (0.3.0) — fix to '[0-9]*'
- Rename to tag.yml to reflect tag-driven build purpose
- Add zip to apt dependencies (required for release zip step)
2026-05-04 06:48:34 +02:00
5798d87993 Merge branch 'feature/interactive-ux-improvements'
All checks were successful
CI / test (push) Successful in 20s
2026-05-04 06:43:33 +02:00
2c738579bd feat(ux): improve interactive mode readability and input visibility
All checks were successful
CI / test (push) Successful in 19s
- Replace plain 'tai>' prompt with styled console.input() bold cyan prompt
- Wrap interactive mode entry in a Rich Panel with border
- Frame each AI response with Rule dividers (──── AI Response ────)
- Style guardrail warnings with ⚠ prefix and bold yellow
- Improve /help output with formatted Panel showing all commands
- Style collection report: ✓/✗ per item with color, truncation in dim
- Style probe output: ✓/✗ with green/red, host info in dim
- Add Rule header divider on session start
2026-05-04 06:37:50 +02:00
27feeed8bf feat: add combined release zip with binary and deb package
All checks were successful
CI / test (push) Successful in 20s
2026-05-04 06:24:19 +02:00
96178c1438 chore: remove logs from tracking, add requirements.txt, improve .gitignore
All checks were successful
CI / test (push) Successful in 20s
2026-05-04 06:21:40 +02:00
021e95b04f test
All checks were successful
CI / test (push) Successful in 19s
2026-05-04 06:16:30 +02:00
6 changed files with 274 additions and 32 deletions


@@ -1,9 +1,9 @@
-name: Release
+name: Tag Build
 on:
   push:
     tags:
-      - "v*"
+      - "[0-9]*"
 jobs:
   build:
@@ -61,8 +61,8 @@ jobs:
         run: |
           if command -v apt-get >/dev/null 2>&1; then
             apt-get update
-            apt-get install -y python3.12 python3.12-venv python3-pip patchelf ccache || \
-              apt-get install -y python3 python3-pip python3-venv patchelf ccache
+            apt-get install -y python3.12 python3.12-venv python3-pip patchelf ccache zip || \
+              apt-get install -y python3 python3-pip python3-venv patchelf ccache zip
           elif command -v dnf >/dev/null 2>&1; then
             dnf install -y python3 python3-pip python3-devel patchelf ccache
           elif command -v yum >/dev/null 2>&1; then
@@ -131,6 +131,16 @@ jobs:
           dpkg-deb --build "${deb_dir}" "${out_dir}/${pkg_name}_${deb_version}_${arch}.deb"
+      - name: Create release zip with binary and deb
+        run: |
+          cd dist
+          deb_version="${{ steps.version.outputs.deb_version }}"
+          zip_name="tai-${deb_version}-linux-amd64.zip"
+          zip "${zip_name}" \
+            tai \
+            "tai_${deb_version}_amd64.deb"
+          cd ..
       - name: Upload binary artifact
         uses: actions/upload-artifact@v3
         with:
@@ -146,3 +156,11 @@ jobs:
           path: dist/tai_${{ steps.version.outputs.deb_version }}_amd64.deb
           if-no-files-found: error
           retention-days: 90
+      - name: Upload combined release zip
+        uses: actions/upload-artifact@v3
+        with:
+          name: tai-release-${{ steps.version.outputs.tag }}
+          path: dist/tai-${{ steps.version.outputs.deb_version }}-linux-amd64.zip
+          if-no-files-found: error
+          retention-days: 90

.gitignore (3 additions)

@@ -24,3 +24,6 @@ htmlcov/
 # IDE
 .vscode/
+# Logs and session files
+logs/


@@ -117,6 +117,170 @@ Prepare for broader use.
______________________________________________________________________
## Phase 6 — RAG & Knowledge Layer
Introduce Retrieval-Augmented Generation to ground AI responses in evidence rather than
model weights alone. Three tiers of increasing capability, each buildable independently.
### Goals
- Eliminate prompt flooding on hosts with large log output
- Ground recommendations in version-controlled runbooks, not model improvisation
- Build compounding institutional memory from past troubleshooting sessions
- Keep all data local — no embeddings or session content leaves the network
---
### Technology Decisions Required
| Decision | Options | Recommendation | Status |
|---|---|---|---|
| Embedding model | `nomic-embed-text`, `mxbai-embed-large`, `all-minilm` | `nomic-embed-text` via Ollama (local, 274MB, strong perf) | ⬜ Pending |
| Vector store — Tier 1 | In-memory numpy cosine, `faiss-cpu` | numpy (zero deps) for session scope | ⬜ Pending |
| Vector store — Tier 2/3 | `chromadb`, `qdrant`, `weaviate`, `pgvector` | `chromadb` (embedded mode, no server needed) or `qdrant` (self-hosted, REST API, production-grade) | ⬜ Pending |
| Chunking strategy | Fixed token, sentence-aware, command-boundary | Command-boundary splitting (natural unit for diagnostics) | ⬜ Pending |
| Hybrid retrieval | Semantic only, BM25 only, hybrid | Hybrid (BM25 keyword + cosine semantic) for best recall | ⬜ Pending |
| Reranking | None, cross-encoder (`ms-marco-MiniLM`), LLM-as-judge | Cross-encoder rerank pass before prompt injection | ⬜ Pending |
| Runbook format | Markdown, YAML, JSON | Markdown (human-editable, version-controllable) | ⬜ Pending |
| Session index storage | Local `~/.tai/`, configurable path | `~/.tai/sessions/` with ChromaDB collection | ⬜ Pending |
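
The hybrid retrieval row leaves open how the BM25 and cosine rankings would be merged. One common option is reciprocal rank fusion, sketched below; the roadmap does not name a fusion method, so the function name, the constant `k = 60`, and the chunk IDs are all assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one ranking.

    Each list contributes 1 / (k + rank) per chunk, so chunks that rank
    well in both the keyword and the semantic list rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical rankings from a BM25 pass and a cosine-similarity pass.
bm25_ranking = ["c1", "c2", "c3"]
semantic_ranking = ["c3", "c1", "c4"]
fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

Rank-based fusion avoids having to normalise BM25 scores against cosine scores, which live on different scales.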
---
### Tier 1 — Diagnostic Chunk Retrieval (in-memory, per-session)
**Problem:** Current flow injects all collected output into the prompt as one block.
On busy hosts this floods the context window with irrelevant output, degrading quality.
**Approach:**
- After collection, split each command's output into overlapping token chunks (e.g. 512 tokens, 64 overlap)
- Embed all chunks using `nomic-embed-text` via Ollama embeddings API
- On each question (initial + follow-up), embed the question and retrieve top-k chunks by cosine similarity
- Inject only retrieved chunks into the prompt, not the full dump
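
A minimal sketch of the overlapping-chunk step in the approach above. It treats whitespace-separated words as a stand-in for model tokens, since the roadmap does not name a tokenizer; `chunk_tokens` and its parameters are hypothetical names, not existing code.

```python
def chunk_tokens(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping windows of whitespace-delimited tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    tokens = text.split()
    step = size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# Small numbers make the overlap easy to see.
sample = " ".join(f"t{i}" for i in range(10))
windows = chunk_tokens(sample, size=4, overlap=1)
```

With command-boundary splitting, a function like this would be applied to each command's output separately, so no chunk spans two commands.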
**New module:** `src/tai/rag_retriever.py`
- `chunk_report(report) -> list[Chunk]`
- `embed_chunks(chunks) -> list[EmbeddedChunk]`
- `retrieve(question, embedded_chunks, top_k) -> list[Chunk]`
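
The retrieval function listed above could look roughly like the sketch below. It uses pure-Python cosine similarity so it runs without numpy, and it takes an already-embedded question vector rather than the question text, which is a simplification of the planned signature. The field names on `Chunk` and `EmbeddedChunk` are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    source: str  # command that produced the text (assumed field)
    text: str


@dataclass(frozen=True)
class EmbeddedChunk:
    chunk: Chunk
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    question_vector: list[float],
    embedded_chunks: list[EmbeddedChunk],
    top_k: int = 5,
) -> list[Chunk]:
    """Return the top_k chunks most similar to the question vector."""
    ranked = sorted(
        embedded_chunks,
        key=lambda ec: cosine(question_vector, ec.vector),
        reverse=True,
    )
    return [ec.chunk for ec in ranked[:top_k]]


# Toy two-dimensional vectors stand in for real embeddings.
store = [
    EmbeddedChunk(Chunk("journalctl", "sshd auth failure"), [1.0, 0.0]),
    EmbeddedChunk(Chunk("df -h", "/ is 98% full"), [0.0, 1.0]),
    EmbeddedChunk(Chunk("dmesg", "mixed kernel output"), [1.0, 1.0]),
]
top = retrieve([1.0, 0.1], store, top_k=2)
```

Keeping `source` on each chunk is what would make the per-chunk source attribution listed under companion features possible.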
**Changes to existing code:**
- `prompt_builder.py`: accept `retrieved_chunks` instead of full `CollectionReport` for RAG-mode prompts
- `cli.py`: embed report after collection, pass retriever to `_run_analysis` and `_run_followup_analysis`
- `ai_client.py`: add `embed(text) -> list[float]` method using Ollama `/api/embeddings`
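
A sketch of what the `embed` method might do, using only the standard library. Ollama's `/api/embeddings` endpoint accepts a JSON body with `model` and `prompt` and responds with an `embedding` array. The host default, the helper names, and the use of `urllib` instead of the project's existing HTTP client are assumptions.

```python
import json
import urllib.request


def build_embed_request(
    text: str,
    model: str = "nomic-embed-text",
    host: str = "http://localhost:11434",  # assumed default Ollama address
) -> urllib.request.Request:
    """Build the POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding(body: bytes) -> list[float]:
    """Extract the embedding vector from the JSON response body."""
    data = json.loads(body)
    return [float(x) for x in data["embedding"]]


def embed(text: str, **kwargs: str) -> list[float]:
    """Send the request and return the vector (needs a running Ollama)."""
    request = build_embed_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_embedding(response.read())
```

Splitting request building from response parsing keeps the planned `test_embed_returns_float_list()` testable without a live server.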
**Companion features buildable at same time:**
- `--no-rag` flag to bypass retrieval and use full dump (backwards compat)
- Token budget display: show user how many tokens are being sent vs. saved
- Per-chunk source attribution in AI response (which command produced the evidence)
**Tests:**
- `tests/test_rag_retriever.py`: chunk splitting, cosine similarity ranking, top-k retrieval
- `tests/test_ai.py`: add `test_embed_returns_float_list()`
---
### Tier 2 — Runbook Knowledge Base (persistent, ChromaDB)
**Problem:** AI improvises remediation steps from training data, which may be wrong for
specific environments, distros, or internal conventions.
**Approach:**
- Maintain a version-controlled corpus of Markdown runbooks in `runbooks/` directory
- On first run (or `tai runbooks --sync`), embed all runbooks and persist to ChromaDB collection
- On each analysis, retrieve top-3 relevant runbook chunks alongside diagnostic chunks
- Inject as a separate `## Runbook Context` section in the prompt
**New module:** `src/tai/runbook_store.py`
- `RunbookStore`: wraps ChromaDB collection
- `sync(runbooks_dir) -> int` — embed and upsert all runbooks
- `query(question, top_k) -> list[RunbookChunk]`
**New directory:** `runbooks/`
- `ssh.md`, `nginx.md`, `postgres.md`, `disk.md`, `kernel.md`, etc.
- Each runbook: YAML frontmatter (`service`, `symptoms`, `tags`) + Markdown body
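
To illustrate the runbook layout, the sketch below separates the frontmatter from the Markdown body. It only handles flat `key: value` lines and keeps values as plain strings; a real implementation would more likely use a YAML parser. The function name and the sample runbook text are hypothetical.

```python
def split_frontmatter(doc: str) -> tuple[dict[str, str], str]:
    """Split a runbook into (frontmatter fields, Markdown body)."""
    lines = doc.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, doc  # no frontmatter block at all
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return {}, doc  # opening delimiter without a closing one
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


# Hypothetical runbook content using the three fields named above.
example = (
    "---\n"
    "service: ssh\n"
    "symptoms: auth failures\n"
    "tags: ssh, pam\n"
    "---\n"
    "\n"
    "# SSH\n"
    "Check sshd logs.\n"
)
meta, body = split_frontmatter(example)
```

The parsed fields could be stored as metadata next to each embedded chunk so that queries can filter by service.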
**New CLI command:** `tai runbooks --sync [--path ./runbooks]`
**Changes to existing code:**
- `prompt_builder.py`: add `build_message_with_runbooks(retrieved_chunks, runbook_chunks)`
- `cli.py`: optionally load `RunbookStore`, query it per analysis turn
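
A rough sketch of how `build_message_with_runbooks` might assemble the prompt. The `## Runbook Context` heading comes from the approach above; the `## Diagnostic Evidence` heading and the use of plain strings for both chunk lists are assumptions.

```python
def build_message_with_runbooks(
    retrieved_chunks: list[str],
    runbook_chunks: list[str],
) -> str:
    """Join diagnostic chunks and runbook chunks into separate sections."""
    parts = ["## Diagnostic Evidence", *retrieved_chunks]
    if runbook_chunks:
        # Omit the section entirely when no runbook matched.
        parts += ["## Runbook Context", *runbook_chunks]
    return "\n\n".join(parts)


message = build_message_with_runbooks(
    ["df -h: / is 98% full"],
    ["disk.md: rotate logs before resizing volumes"],
)
```

Keeping the two sources in separate sections lets the model, and anyone reading the session log, tell collected evidence apart from runbook guidance.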
**Companion features buildable at same time:**
- `tai runbooks --list` — show indexed runbooks and last sync time
- `tai runbooks --add <file>` — index a single runbook
- `/runbooks` slash command in interactive mode — show which runbooks were retrieved
- Runbook citation in AI output: "Based on runbook: `ssh.md#AuthenticationFailures`"
---
### Tier 3 — Session Memory Index (institutional learning)
**Problem:** Every session starts from zero. Repeat incidents on the same host or
same issue type get no benefit from past work.
**Approach:**
- On session end, embed the session summary (issue + root cause + actions) and upsert into a persistent ChromaDB collection (`~/.tai/sessions/`)
- On session start, query for similar past sessions by issue text + hostname
- Inject top-2 past sessions as `## Prior Sessions` context
- Optionally: `/history` command in interactive mode to surface past sessions explicitly
**New module:** `src/tai/session_store.py`
- `SessionStore`: wraps ChromaDB collection at `~/.tai/sessions/`
- `index_session(session_log_path)` — embed and store completed session
- `query_similar(issue, host, top_k) -> list[PastSession]`
**Changes to existing code:**
- `session_log.py`: add `summarise() -> str` method (issue + final AI response)
- `cli.py`: query `SessionStore` at session start, index at session end
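
The `summarise()` method could be as small as the sketch below, which combines the issue with the final AI response. The `SessionLog` class, its fields, and the `max_chars` cap are hypothetical; the real logger in `session_log.py` may store events differently.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLog:
    """Hypothetical stand-in for the session logger's in-memory state."""

    issue: str
    host: str
    ai_responses: list[str] = field(default_factory=list)

    def summarise(self, max_chars: int = 2000) -> str:
        """Return issue + host + final AI response as one embeddable string."""
        if self.ai_responses:
            final = self.ai_responses[-1].strip()
        else:
            final = "(no AI response recorded)"
        summary = f"Issue: {self.issue}\nHost: {self.host}\nFinal response: {final}"
        return summary[:max_chars]  # keep the text within an embedding-friendly size


log = SessionLog(issue="ssh login fails", host="web01")
log.ai_responses.append("Check /var/log/auth.log.")
log.ai_responses.append("Root cause: expired key. Rotate it. ")
summary = log.summarise()
```

Including the host in the summary text is one way to let a similarity query pick up repeat incidents on the same machine.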
**Companion features buildable at same time:**
- `tai history` CLI subcommand — search past sessions by keyword
- `tai history --host <hostname>` — all sessions for a host
- `tai history --export <file>` — export session summaries as Markdown report
- Auto-suggest: "Similar issue found from 2 weeks ago — load context? [y/N]"
---
### Implementation Order
```
Tier 1 (diagnostic chunks) ← Start here. Zero new infra. Immediate prompt quality gain.
Tier 2 (runbook KB) ← After Tier 1. Requires ChromaDB dep + runbook authoring.
Tier 3 (session memory) ← Builds on Tier 2 infrastructure. Minimal extra work.
```
**Estimated effort:**
- Tier 1: 2-3 days (new module + prompt builder changes + tests)
- Tier 2: 3-4 days (ChromaDB + runbook authoring + CLI command + tests)
- Tier 3: 1-2 days (reuses Tier 2 infrastructure)
### New Dependencies
```
# Tier 1 (zero new runtime deps — uses Ollama HTTP API already in use)
# No additions needed
# Tier 2 + 3
chromadb>=0.5,<1.0 # embedded vector store, no separate server
# OR
qdrant-client>=1.9,<2.0 # if self-hosted Qdrant preferred
sentence-transformers>=3.0 # optional: cross-encoder reranking
```
### New pyproject.toml optional group
```toml
[project.optional-dependencies]
rag = [
"chromadb>=0.5,<1.0",
"sentence-transformers>=3.0,<4.0",
]
```
______________________________________________________________________
## Decisions Log
| Date | Decision | Outcome |
@@ -128,3 +292,8 @@ ______________________________________________________________________
| 2026-05-04 | Bastion host support | `--jump-host` flag via SSH native ProxyJump |
| 2026-05-04 | SSH config behavior | Use `~/.ssh/config` by default; allow override via `--ignore-ssh-config` |
| 2026-05-04 | CLI vs interactive mode | Interactive: REPL for v0.1, `textual` TUI for v0.2+ |
| 2026-05-04 | RAG embedding model | `nomic-embed-text` via Ollama (local, air-gapped safe) — ⬜ pending confirmation |
| 2026-05-04 | RAG vector store (Tier 1) | In-memory numpy cosine similarity — zero deps, session-scoped |
| 2026-05-04 | RAG vector store (Tier 2/3) | `chromadb` embedded mode (default) or `qdrant` self-hosted — ⬜ pending confirmation |
| 2026-05-04 | RAG chunking unit | Command-boundary splitting — each collected command = one or more chunks |
| 2026-05-04 | Runbook format | Markdown with YAML frontmatter, version-controlled in `runbooks/` directory |

requirements.txt (new file, 15 additions)

@@ -0,0 +1,15 @@
# Core dependencies
typer>=0.12,<1.0
rich>=13.7,<14.0
asyncssh>=2.14,<3.0
openai>=1.30,<2.0
# Development dependencies
pytest>=8.2,<9.0
ruff>=0.5,<1.0
mypy>=1.10,<2.0
mdformat>=0.7,<1.0
yamllint>=1.35,<2.0
# Build dependencies
nuitka>=2.4,<3.0


@@ -8,6 +8,9 @@ from typing import Annotated
 import typer
 from rich.console import Console
 from rich.markdown import Markdown
+from rich.panel import Panel
+from rich.rule import Rule
+from rich.text import Text
 from tai.ai_client import DEFAULT_AI_HOST, DEFAULT_MODEL, AIClient, AIConfig
 from tai.ai_guardrails import validate_ai_response
@@ -119,11 +122,12 @@ def run(
     )
     summary = SSHClient(config).summary()
-    console.print("[bold green]tai[/bold green]")
-    console.print(f"Issue: {req.issue}")
-    console.print(f"SSH: {summary}")
+    console.print(Rule("[bold green]tai[/bold green]", style="green"))
+    console.print(f" [bold]Issue:[/bold] {req.issue}")
+    console.print(f" [bold]SSH:[/bold] {summary}")
     if req.target_paths:
-        console.print(f"Paths: {', '.join(str(p) for p in req.target_paths)}")
+        console.print(f" [bold]Paths:[/bold] {', '.join(str(p) for p in req.target_paths)}")
+    console.print()
     if not (probe or collect or analyze or interactive):
         return  # nothing SSH-related requested
@@ -227,15 +231,20 @@ async def _interactive_loop(
 ) -> None:
     """Run a follow-up loop for collecting and conversational analysis."""
     console.print(
-        "[cyan]Interactive mode:[/cyan] "
-        "ask questions directly, or use /collect, /analyze, /help, /quit"
+        Panel(
+            "Ask questions directly, or use [bold]/collect[/bold], "
+            "[bold]/analyze[/bold], [bold]/help[/bold], [bold]/quit[/bold]",
+            title="[bold cyan]Interactive Mode[/bold cyan]",
+            border_style="cyan",
+            padding=(0, 1),
+        )
     )
     prior_questions: list[str] = []
     while True:
         try:
-            command = input("tai> ").strip()
+            command = console.input("\n[bold cyan]tai[/bold cyan][dim] >[/dim] ").strip()
         except (EOFError, KeyboardInterrupt):
             console.print("\n[yellow]Exiting interactive mode.[/yellow]")
             if logger is not None:
@@ -252,8 +261,18 @@ async def _interactive_loop(
             return
         if command == "/help":
-            console.print("Commands: /collect, /analyze, /help, /quit")
-            console.print("Tip: any non-slash text is treated as a follow-up AI question.")
+            console.print(
+                Panel(
+                    "[bold]/collect[/bold] — re-run diagnostics\n"
+                    "[bold]/analyze[/bold] — re-analyze current diagnostics\n"
+                    "[bold]/help[/bold] — show this message\n"
+                    "[bold]/quit[/bold] — end session\n"
+                    "[dim]Anything else is sent directly to the AI as a question.[/dim]",
+                    title="[bold]Commands[/bold]",
+                    border_style="dim",
+                    padding=(0, 1),
+                )
+            )
             continue
         if command == "/collect":
@@ -319,26 +338,32 @@ async def _interactive_loop(
 def _handle_probe_result(result: SSHCommandResult) -> None:
     """Handle and render probe output for success or failure."""
-    console.print("[cyan]Running SSH probe:[/cyan] uname -a")
+    console.print("[dim]▶ SSH probe:[/dim] uname -a")
     if result.exit_code != 0:
         details = result.stderr or result.stdout or "no error output from ssh"
-        console.print(f"[red]Probe failed (exit {result.exit_code}):[/red] {details}")
+        console.print(f"[bold red]Probe failed[/bold red] (exit {result.exit_code}): {details}")
         raise typer.Exit(code=1)
     output = result.stdout or "(no output)"
     console.print("[bold green]Probe succeeded.[/bold green]")
-    console.print(f"Remote: {output}")
+    console.print(f" [dim]{output}[/dim]")
 def _handle_collection_report(report: CollectionReport) -> None:
     """Render collected command status and truncation hints."""
-    console.print(
-        f"[bold]Collection complete:[/bold] {report.total} commands, {report.failed} failed"
+    failed_label = (
+        f"[red]{report.failed} failed[/red]" if report.failed else "[green]0 failed[/green]"
     )
+    console.print(f"[bold]Collection complete:[/bold] {report.total} commands, {failed_label}")
     for item in report.items:
-        status = "ok" if item.result.exit_code == 0 else f"exit {item.result.exit_code}"
         truncated = item.result.stdout_truncated or item.result.stderr_truncated
-        trunc = " (truncated)" if truncated else ""
-        console.print(f"- {item.name}: {status}{trunc}")
+        trunc_label = " [dim](truncated)[/dim]" if truncated else ""
+        if item.result.exit_code == 0:
+            console.print(f" [green]✓[/green] [dim]{item.name}[/dim]{trunc_label}")
+        else:
+            console.print(
+                f" [red]✗[/red] {item.name} "
+                f"[red](exit {item.result.exit_code})[/red]{trunc_label}"
+            )
 def _run_analysis(
@@ -349,7 +374,9 @@ def _run_analysis(
     logger: SessionLogger | None,
 ) -> None:
     """Send collected data to the AI and stream the analysis to stdout."""
-    console.print("[cyan]Analyzing...[/cyan]\n")
+    console.print()
+    console.print(Rule("[bold cyan]Analysis[/bold cyan]", style="cyan"))
+    console.print()
     ai = AIClient(ai_config)
     system_prompt = build_system_prompt()
     user_message = build_user_message(issue, report)
@@ -362,7 +389,10 @@ def _run_analysis(
     warnings = validate_ai_response(response)
     for item in warnings:
-        console.print(f"[yellow]Guardrail warning:[/yellow] {item}")
+        warn_text = Text()
+        warn_text.append("⚠ Guardrail: ", style="bold yellow")
+        warn_text.append(item, style="yellow")
+        console.print(warn_text)
     if logger is not None:
         logger.log_event(
@@ -390,7 +420,9 @@ def _run_followup_analysis(
     logger: SessionLogger | None,
 ) -> str:
     """Run grounded follow-up analysis re-anchored to current diagnostics."""
-    console.print("[cyan]Analyzing...[/cyan]\n")
+    console.print()
+    console.print(Rule("[bold cyan]AI Response[/bold cyan]", style="cyan"))
+    console.print()
     ai = AIClient(ai_config)
     system_prompt = build_system_prompt()
     user_message = build_followup_message(issue, report, question, prior_questions)
@@ -401,10 +433,14 @@ def _run_followup_analysis(
         chunks.append(chunk)
     response = "".join(chunks)
     console.print(Markdown(response))
+    console.print(Rule(style="dim"))
     warnings = validate_ai_response(response)
     for item in warnings:
-        console.print(f"[yellow]Guardrail warning:[/yellow] {item}")
+        warn_text = Text()
+        warn_text.append("⚠ Guardrail: ", style="bold yellow")
+        warn_text.append(item, style="yellow")
+        console.print(warn_text)
     if logger is not None:
         logger.log_event(


@@ -137,8 +137,9 @@ def test_collect_success_prints_summary(monkeypatch) -> None: # type: ignore[no
     assert result.exit_code == 0
     assert "Collection complete" in result.stdout
-    assert "kernel: ok" in result.stdout
-    assert "journal: ok (truncated)" in result.stdout
+    assert "kernel" in result.stdout
+    assert "journal" in result.stdout
+    assert "truncated" in result.stdout
 def test_interactive_collect_then_quit(monkeypatch) -> None: # type: ignore[no-untyped-def]
@@ -163,7 +164,7 @@ def test_interactive_collect_then_quit(monkeypatch) -> None: # type: ignore[no-
     commands = iter(["/collect", "/quit"])
     monkeypatch.setattr("tai.cli.collect_from_plan", fake_collect_from_plan)
-    monkeypatch.setattr("builtins.input", lambda _prompt: next(commands))
+    monkeypatch.setattr("tai.cli.console.input", lambda _prompt: next(commands))
     runner = CliRunner()
     result = runner.invoke(
@@ -180,7 +181,7 @@ def test_interactive_collect_then_quit(monkeypatch) -> None: # type: ignore[no-
     )
     assert result.exit_code == 0
-    assert "Interactive mode" in result.stdout
+    assert "ask questions directly" in result.stdout.lower()
     assert "Collection complete" in result.stdout
     assert "Bye." in result.stdout
@@ -210,7 +211,7 @@ def test_interactive_unknown_command_prints_hint(monkeypatch) -> None: # type:
         "tai.cli.AIClient.stream",
         lambda *_args, **_kwargs: iter(["Check logs."]),
     )
-    monkeypatch.setattr("builtins.input", lambda _prompt: next(commands))
+    monkeypatch.setattr("tai.cli.console.input", lambda _prompt: next(commands))
     runner = CliRunner()
     result = runner.invoke(
@@ -227,5 +228,5 @@ def test_interactive_unknown_command_prints_hint(monkeypatch) -> None: # type:
     )
     assert result.exit_code == 0
-    assert "Analyzing..." in result.stdout
+    assert "AI Response" in result.stdout
     assert "Check logs." in result.stdout