3.2 KiB
3.2 KiB
Architecture
This document describes tai's current runtime architecture, module responsibilities, and data flow.
High-Level Flow
- User runs
tai runwith issue text and target host settings. - CLI validates input and opens a shared SSH session.
- Probe and collection run against a read-only command plan.
- Collection output is converted into diagnostic chunks.
- Optional RAG retrieval selects top-k chunks per question.
- Optional runbook retrieval selects top-k runbook chunks from ChromaDB.
- Prompt builder composes system + user message.
- AI completion returns analysis.
- Guardrails validate response quality signals.
- Optional session logger writes JSONL events.
Module Layout
src/tai/cli.py- Command definitions (
run,runbooks sync/list/add) - Orchestration across SSH, collection, RAG, prompts, AI, and logging
- Command definitions (
src/tai/input_parser.py- User input validation and request normalization
src/tai/models.py- Core dataclasses (
TroubleshootRequest)
- Core dataclasses (
src/tai/ssh_client.py- SSH invocation
- Read-only command policy validation
- Probe and command execution helpers
src/tai/plan.py- Issue keyword/service extraction
- Command plan generation
- Service/subsystem presence probes (unit files, binaries)
src/tai/collectors.py- Executes command plans and builds
CollectionReport
- Executes command plans and builds
src/tai/rag_retriever.py- Command-output chunking
- Embedding wrapper structures
- Similarity retrieval and scoring
src/tai/runbook_store.py- Persistent ChromaDB runbook indexing and querying
src/tai/chroma_telemetry.py- No-op telemetry adapter for Chroma local usage
src/tai/prompt_builder.py- Prompt assembly for full-context and retrieved-context paths
src/tai/ai_client.py- OpenAI-compatible completions and embeddings client
src/tai/ai_guardrails.py- Lightweight response guardrails and warnings
src/tai/session_log.py- Optional JSONL event logging
Data Stores
- Runbook store (Tier 2): local ChromaDB path, default
~/.tai/runbooks - Session logs: optional JSONL file configured by
--log-file
Retrieval Layers
- Tier 1 (implemented): in-memory semantic retrieval over diagnostic chunks
- Tier 2 (implemented): persistent semantic retrieval over runbook corpus
- Tier 3 (implemented core): persistent retrieval over prior sessions (dedicated UX commands pending)
Safety Boundaries
Read-only policy is enforced before each remote command execution.
- Allowed command families are explicitly enumerated.
- Shell composition operators are blocked.
- Commands that fail execution are recorded and surfaced to the model as non-evidence.
Failure and Fallback Behavior
- If RAG indexing fails, analysis falls back to full-context prompts.
- If runbook store is unavailable, analysis proceeds without runbook context.
- If AI call fails, CLI exits with non-zero status and displays an error.
Test Coverage Highlights
- Planner behavior and service detection
- Prompt formatting and guardrail-sensitive messaging
- CLI command behavior and interactive loop controls
- Runbook store parsing/index/query behavior (with mocked Chroma)
- SSH policy validation and command execution contract