tai/docs/ARCHITECTURE.md
zphinx 57f4c0efaa
feat: complete RAG runbook workflow and release docs
2026-05-06 04:48:41 +02:00

Architecture

This document describes tai's current runtime architecture, module responsibilities, and data flow.

High-Level Flow

  1. User runs tai run with issue text and target host settings.
  2. CLI validates input and opens a shared SSH session.
  3. Probe and collection run against a read-only command plan.
  4. Collection output is converted into diagnostic chunks.
  5. Optional RAG retrieval selects top-k chunks per question.
  6. Optional runbook retrieval selects top-k runbook chunks from ChromaDB.
  7. Prompt builder composes the system and user messages.
  8. AI completion returns analysis.
  9. Guardrails validate response quality signals.
  10. Optional session logger writes JSONL events.
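
Step 4 above (turning collection output into diagnostic chunks) could be sketched as follows. This is a minimal illustration, not tai's actual chunker: the `Chunk` dataclass, the `chunk_output` function, and the fixed line-count window are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One retrievable slice of a command's output (hypothetical shape)."""
    command: str
    index: int
    text: str


def chunk_output(command: str, output: str, max_lines: int = 20) -> list[Chunk]:
    """Split command output into chunks of at most max_lines lines each."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    lines = output.splitlines()
    chunks: list[Chunk] = []
    for start in range(0, len(lines), max_lines):
        body = "\n".join(lines[start:start + max_lines])
        # Prefix each chunk with its command so it stays interpretable on its own.
        chunks.append(Chunk(command, len(chunks), f"$ {command}\n{body}"))
    return chunks
```

Carrying the originating command inside each chunk is one way to keep a retrieved chunk meaningful once it is separated from its neighbours in the prompt.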

Module Layout

  • src/tai/cli.py
    • Command definitions (run, runbooks sync/list/add)
    • Orchestration across SSH, collection, RAG, prompts, AI, and logging
  • src/tai/input_parser.py
    • User input validation and request normalization
  • src/tai/models.py
    • Core dataclasses (TroubleshootRequest)
  • src/tai/ssh_client.py
    • SSH invocation
    • Read-only command policy validation
    • Probe and command execution helpers
  • src/tai/plan.py
    • Issue keyword/service extraction
    • Command plan generation
    • Service/subsystem presence probes (unit files, binaries)
  • src/tai/collectors.py
    • Executes command plans and builds CollectionReport
  • src/tai/rag_retriever.py
    • Command-output chunking
    • Embedding wrapper structures
    • Similarity retrieval and scoring
  • src/tai/runbook_store.py
    • Persistent ChromaDB runbook indexing and querying
  • src/tai/chroma_telemetry.py
    • No-op telemetry adapter for Chroma local usage
  • src/tai/prompt_builder.py
    • Prompt assembly for full-context and retrieved-context paths
  • src/tai/ai_client.py
    • OpenAI-compatible completions and embeddings client
  • src/tai/ai_guardrails.py
    • Lightweight response guardrails and warnings
  • src/tai/session_log.py
    • Optional JSONL event logging

Data Stores

  • Runbook store (Tier 2): local ChromaDB path, default ~/.tai/runbooks
  • Session logs: optional JSONL file configured by --log-file
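
An optional JSONL session log, as described above, might look like the sketch below. The `SessionLogger` class, its `write` method, and the `ts`/`event` field names are hypothetical; the only things taken from this document are that logging is optional and that events are written as JSONL.

```python
import json
import time
from pathlib import Path
from typing import Any, Optional


class SessionLogger:
    """Appends one JSON object per line; does nothing when no path is given."""

    def __init__(self, path: Optional[str] = None) -> None:
        self._path = Path(path).expanduser() if path else None

    def write(self, event: str, **fields: Any) -> None:
        if self._path is None:
            return  # logging was not requested
        record = {"ts": time.time(), "event": event, **fields}
        self._path.parent.mkdir(parents=True, exist_ok=True)
        with self._path.open("a", encoding="utf-8") as fh:
            # default=str keeps unusual values (paths, datetimes) serializable.
            fh.write(json.dumps(record, default=str) + "\n")
```

Treating a missing path as a no-op keeps call sites free of "is logging enabled" checks.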

Retrieval Layers

  • Tier 1 (implemented): in-memory semantic retrieval over diagnostic chunks
  • Tier 2 (implemented): persistent semantic retrieval over runbook corpus
  • Tier 3 (pending): persistent retrieval over prior sessions
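
Tier 1 retrieval can be illustrated with a small in-memory top-k search. The sketch assumes cosine similarity and pre-computed embedding vectors; this document does not state which similarity measure tai uses, and the `cosine` and `top_k` names are made up for the example.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    chunks: Sequence[tuple[str, Sequence[float]]],
    k: int = 3,
) -> list[tuple[float, str]]:
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine(query, vec), text) for text, vec in chunks]
    # Sort on the score only, highest first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

In a real run the vectors would come from an embeddings call rather than being supplied by hand. A linear scan like this is usually adequate for the number of chunks produced by a single diagnostic session.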

Safety Boundaries

Read-only policy is enforced before each remote command execution.

  • Allowed command families are explicitly enumerated.
  • Shell composition operators are blocked.
  • Commands that fail to execute are recorded and surfaced to the model as non-evidence.
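
The first two rules above can be sketched as a pre-execution check. The allowlist contents, the blocked-token list, and the names `validate_read_only` and `PolicyError` are illustrative assumptions; tai's actual enumerated command families are not listed in this document.

```python
import shlex

# Hypothetical allowlist; stands in for the real enumerated command families.
ALLOWED_COMMANDS = frozenset({"cat", "df", "free", "journalctl", "ls", "ps", "uptime"})
# Substrings that would let one command chain into, redirect to, or embed another.
BLOCKED_TOKENS = (";", "&", "|", ">", "<", "`", "$(", "\n")


class PolicyError(ValueError):
    """Raised when a command violates the read-only policy."""


def validate_read_only(command: str) -> list[str]:
    """Return the parsed argv if the command passes the policy, else raise."""
    for token in BLOCKED_TOKENS:
        if token in command:
            raise PolicyError(f"blocked shell operator {token!r} in: {command!r}")
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        raise PolicyError(f"unparseable command: {command!r}") from exc
    if not argv:
        raise PolicyError("empty command")
    if argv[0] not in ALLOWED_COMMANDS:
        raise PolicyError(f"command {argv[0]!r} is not on the allowlist")
    return argv
```

Rejecting operators on the raw string is deliberately conservative: it also refuses harmless quoted arguments that contain those characters. Checking only the first word is coarse too, so a stricter policy would likely inspect subcommands and flags as well.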

Failure and Fallback Behavior

  • If RAG indexing fails, analysis falls back to full-context prompts.
  • If the runbook store is unavailable, analysis proceeds without runbook context.
  • If the AI call fails, the CLI displays an error and exits with a non-zero status.
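
The three rules above could be expressed roughly as below. Every function name and the callable-based structure are assumptions for illustration; the sketch also catches broad `Exception`, which real code would probably narrow to specific error types.

```python
import sys
from typing import Callable, Iterable, Sequence


def select_context(
    chunks: Sequence[str],
    question: str,
    retrieve: Callable[[Sequence[str], str], Iterable[str]],
) -> tuple[list[str], str]:
    """Use retrieved chunks when retrieval works, otherwise all chunks."""
    try:
        return list(retrieve(chunks, question)), "retrieved"
    except Exception:
        return list(chunks), "full"  # fall back to a full-context prompt


def optional_runbooks(question: str, query: Callable[[str], Iterable[str]]) -> list[str]:
    """Return runbook excerpts, or an empty list if the store is unavailable."""
    try:
        return list(query(question))
    except Exception:
        return []  # proceed without runbook context


def run_analysis(complete: Callable[[str], str], prompt: str) -> int:
    """Print the analysis and return a process exit status."""
    try:
        print(complete(prompt))
        return 0
    except Exception as exc:
        print(f"error: AI request failed: {exc}", file=sys.stderr)
        return 1
```

The asymmetry is intentional in this sketch: retrieval and runbooks only enrich the prompt, so their failures degrade gracefully, while a failed completion leaves nothing to show and is reported through the exit status.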

Test Coverage Highlights

  • Planner behavior and service detection
  • Prompt formatting and guardrail-sensitive messaging
  • CLI command behavior and interactive loop controls
  • Runbook store parsing/index/query behavior (with mocked Chroma)
  • SSH policy validation and command execution contract