# Roadmap

This document outlines the major decisions, milestones, and development phases required to bring `tai` from concept to a working tool.

______________________________________________________________________

## Phase 0 — Decisions & Prerequisites

These must be resolved before meaningful development can begin.

### Language Selection

- [x] **Decision: Python**
  - Key factors: native vLLM integration, mature SSH libraries (`paramiko` / `asyncssh`), strong text/log parsing, rapid development
  - Single binary distribution will be achieved via **Nuitka** (preferred for true compilation) or **PyInstaller** as a fallback
- [ ] Evaluate Nuitka vs PyInstaller for binary output quality and CI reproducibility
- [ ] Add binary build step to CI pipeline

### AI Backend & Model

- [x] OpenAI-compatible backend client implemented (`AIClient`)
- [x] Default local backend profile wired for Ollama (`http://localhost:11434/v1`)
- [x] Default model profile set to `gemma3:4b` (override via `--model`)
- [ ] Define minimum hardware requirements for running the model locally
- [x] AI backend is user-supplied/self-hosted

### SSH Strategy

- [x] **Decision: keypair authentication only** — no password auth; eliminates credential storage risk
  - Default key resolution: `~/.ssh/id_ed25519`, `~/.ssh/id_rsa` (in order of preference)
  - CLI override via `--identity-file <path>`
  - No SSH agent forwarding needed — a shared key is distributed to all managed hosts via Puppet
- [x] **Known hosts: auto-accept new hosts; reject on key mismatch** — a changed host key triggers a hard stop with a MITM warning; unknown/new hosts are accepted silently on first connect
- [x] **Bastion/jump host: `--jump-host <host>` flag** — delegates to SSH's native ProxyJump functionality
- [x] **SSH config behavior: respect existing `~/.ssh/config` by default; allow CLI override**
  - Default: follow host settings from `~/.ssh/config` (for `User`, `Port`, `ProxyJump`, etc.)
  - Override switch: `--ignore-ssh-config` to bypass local SSH config when required

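The SSH decisions above map fairly directly onto OpenSSH client options. The following is a minimal sketch, assuming `tai` shells out to the `ssh` binary; the function names `resolve_identity_file` and `build_ssh_argv` are hypothetical and not taken from the codebase.

```python
from __future__ import annotations

from pathlib import Path

# Order of preference stated in the roadmap.
DEFAULT_KEYS = ("id_ed25519", "id_rsa")


def resolve_identity_file(override: str | None, ssh_dir: Path) -> Path | None:
    """Return the --identity-file override, else the first default key that exists."""
    if override:
        return Path(override)
    for name in DEFAULT_KEYS:
        candidate = ssh_dir / name
        if candidate.is_file():
            return candidate
    return None


def build_ssh_argv(
    host: str,
    command: str,
    *,
    identity: Path | None = None,
    jump_host: str | None = None,
    ignore_ssh_config: bool = False,
) -> list[str]:
    """Build an ssh argv (hypothetical helper) reflecting the decisions above."""
    argv = [
        "ssh",
        "-o", "BatchMode=yes",  # never fall back to interactive password prompts
        "-o", "StrictHostKeyChecking=accept-new",  # accept new hosts, refuse changed keys
    ]
    if ignore_ssh_config:
        argv += ["-F", "none"]  # OpenSSH: read no configuration files
    if identity is not None:
        argv += ["-i", str(identity)]
    if jump_host:
        argv += ["-J", jump_host]  # native ProxyJump
    argv += [host, command]
    return argv
```

For example, `build_ssh_argv("web01", "df -h", jump_host="bastion")` yields an argv that can be passed to `subprocess.run` without a shell.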
### Scope & Constraints

- [ ] Define the supported scope of issues (services, network, disk, kernel, etc.)
- [x] Read-only guarantee implemented with command allowlist + blocked shell operator policy
- [x] **Decision: interactive REPL mode for v0.1, full TUI for v0.2+**
  - v0.1: chat-loop REPL launched from CLI; human can follow up, correct, and redirect the agent
  - v0.2+: `textual`-based TUI with split panes (collected data | AI output | input bar)
  - Built-in slash commands: `/collect`, `/show logs`, `/clear`, `/host <hostname>`, `/help`, `/quit`

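The read-only guarantee listed above combines two checks: an allowlist of binaries and a ban on shell operators. A rough sketch of how such a policy could look follows; the allowlist contents, the `systemctl` subcommand table, and the name `is_command_allowed` are illustrative assumptions, not the project's actual policy.

```python
import shlex

# Hypothetical allowlist; the real policy may differ.
ALLOWED_COMMANDS = {"journalctl", "systemctl", "cat", "df", "du", "dmesg", "ip", "ss"}

# Some allowed binaries can also mutate state, so restrict their subcommands.
READ_ONLY_SUBCOMMANDS = {"systemctl": {"status", "is-active", "show", "list-units"}}

# Shell operators that could chain, redirect, or substitute commands.
BLOCKED_TOKENS = (";", "&", "|", ">", "<", "`", "$(", "\n")


def is_command_allowed(command: str) -> bool:
    """Return True only for allowlisted commands free of shell operators."""
    if any(token in command for token in BLOCKED_TOKENS):
        return False
    try:
        parts = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return False
    if not parts or parts[0] not in ALLOWED_COMMANDS:
        return False
    allowed_subs = READ_ONLY_SUBCOMMANDS.get(parts[0])
    if allowed_subs is not None:
        return len(parts) > 1 and parts[1] in allowed_subs
    return True
```

Rejecting by default keeps the policy conservative: anything the allowlist does not name is refused rather than inspected.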
______________________________________________________________________

## Phase 1 — Project Foundation

Basic project scaffolding and connectivity.

- [x] Finalise repository structure and language toolchain
- [x] Set up CI pipeline (linting, tests)
- [x] Implement SSH connection module
- [x] Define SSH config model and probe interface scaffold
- [x] Connect to remote host
- [x] Execute read-only commands (e.g. `journalctl`, `systemctl status`, `cat`)
- [x] Stream or collect command output safely (byte-limited output with truncation marker)
- [x] Implement basic input parsing (ticket text, hostname, target directories)
- [x] Write unit tests for SSH and input modules
- [x] Input parser and CLI tests added
- [x] SSH module tests added for command policy and SSH argv behavior

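The byte-limited collection mentioned above can be illustrated with a small helper. This is a sketch under assumptions: the marker text and the name `collect_limited` are invented for illustration, and the real module may stream differently.

```python
from typing import Iterable

# Hypothetical marker text appended when output is cut off.
TRUNCATION_MARKER = "\n[output truncated]\n"


def collect_limited(chunks: Iterable[bytes], max_bytes: int) -> str:
    """Collect streamed output up to max_bytes, then append a truncation marker."""
    buf = bytearray()
    truncated = False
    for chunk in chunks:
        remaining = max_bytes - len(buf)
        if len(chunk) > remaining:
            buf += chunk[:remaining]  # keep only what still fits
            truncated = True
            break
        buf += chunk
    # A cut may land mid-character, so decode leniently.
    text = bytes(buf).decode("utf-8", errors="replace")
    return text + TRUNCATION_MARKER if truncated else text
```

Stopping at the first oversized chunk means a runaway `journalctl` cannot exhaust memory, and the marker tells the model that evidence is incomplete.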
______________________________________________________________________

## Phase 2 — Data Collection Layer

Define what information the agent gathers and how.

- [x] Identify a baseline canonical set of data sources per issue type:
  - Service failures: `journalctl`, `systemctl`, service config files
  - Network issues: `ip`, `ss`, `netstat`, firewall rules
  - Disk issues: `df`, `du`, `dmesg`, `smartctl`
  - General: `/var/log/syslog`, `/var/log/messages`, `dmesg`
- [x] Implement collectors and plan builder for baseline issue categories
- [x] Implement directory traversal for user-specified paths (read-only)
- [ ] Add support for per-distro variations (Ubuntu vs RHEL path differences, etc.)
- [x] Write tests with mocked SSH output

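One way to picture the plan builder is as a lookup from issue category to an ordered list of probes. The sketch below is illustrative only: the `Probe` type, the category keys, and the exact command lines and flags are assumptions layered on the data sources listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    category: str
    command: str


# Hypothetical baseline; exact commands and flags are illustrative.
BASELINE_SOURCES = {
    "service": ["systemctl --failed --no-pager", "journalctl -p err -n 200 --no-pager"],
    "network": ["ip addr show", "ss -tulpn"],
    "disk": ["df -h", "dmesg --level=err,warn"],
    "general": ["tail -n 200 /var/log/syslog", "dmesg"],
}


def build_plan(categories: list[str]) -> list[Probe]:
    """Expand categories into probes, always adding 'general', without duplicates."""
    plan: list[Probe] = []
    seen: set[str] = set()
    for category in [*categories, "general"]:
        for command in BASELINE_SOURCES.get(category, []):
            if command not in seen:
                seen.add(command)
                plan.append(Probe(category, command))
    return plan
```

Per-distro variation (still open above) could later be handled by swapping the table per detected distribution rather than changing the builder.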
______________________________________________________________________

## Phase 3 — AI Integration

Wire collected data into the local AI model.

- [x] Implement OpenAI-compatible AI client module
- [x] Design prompt templates for initial and follow-up analysis
- [x] Implement response guardrail checks and structured response headings
- [x] Tune context usage with RAG retrieval and chunk/runbook truncation budgets
- [x] Implement reliable non-streaming completion path for local backends
- [ ] Continue output quality tuning and grounding evaluation on real hosts

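A guardrail check on structured headings can be as simple as verifying that the expected sections are present. The sketch below assumes the model answers in Markdown and uses the section names listed under Phase 4 (Root Cause, Evidence, Recommended Actions); the function name `missing_headings` is hypothetical.

```python
import re

REQUIRED_HEADINGS = ("Root Cause", "Evidence", "Recommended Actions")

# Matches Markdown ATX headings such as "## Root Cause".
_HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_headings(response: str) -> list[str]:
    """Return the required headings that the response does not contain."""
    found = {match.group(1).lower() for match in _HEADING_RE.finditer(response)}
    return [heading for heading in REQUIRED_HEADINGS if heading.lower() not in found]
```

A caller might retry the completion or show a warning when the returned list is non-empty.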
______________________________________________________________________

## Phase 4 — CLI & User Experience

Polish the interface for real-world use.

- [x] Design CLI interface with run command, interactive prompts, and runbook subcommands
- [x] Implement structured output sections (Root Cause, Evidence, Recommended Actions)
- [x] Add RAG debug mode (`--rag-debug`) showing retrieval scores
- [x] Support output to file (`--output-file`)
- [x] Provide comprehensive `--help` command documentation via Typer options

______________________________________________________________________

## Phase 5 — Hardening & Distribution

Prepare for broader use.

- [ ] Security review of SSH handling and credential storage
- [ ] Ensure no data is written to the remote system under any path
- [ ] Package for distribution (binary release, container image, or distro packages)
- [ ] Write installation and quickstart documentation
- [ ] End-to-end integration tests against a test VM

______________________________________________________________________

## Phase 6 — RAG & Knowledge Layer

Introduce Retrieval-Augmented Generation to ground AI responses in evidence rather than
model weights alone. Three tiers of increasing capability, each buildable independently.

### Goals

- Eliminate prompt flooding on hosts with large log output
- Ground recommendations in version-controlled runbooks, not model improvisation
- Build compounding institutional memory from past troubleshooting sessions
- Keep all data local — no embeddings or session content leaves the network

______________________________________________________________________

### Technology Decisions Required

| Decision | Options | Recommendation | Status |
|---|---|---|---|
| Embedding model | `nomic-embed-text`, `mxbai-embed-large`, `all-minilm` | `nomic-embed-text` via Ollama (local, 274MB, strong perf) | ✅ Implemented |
| Vector store — Tier 1 | In-memory numpy cosine, `faiss-cpu` | numpy (zero deps) for session scope | ✅ Implemented |
| Vector store — Tier 2/3 | `chromadb`, `qdrant`, `weaviate`, `pgvector` | `chromadb` embedded mode | ✅ Tier 2 Implemented |
| Chunking strategy | Fixed token, sentence-aware, command-boundary | Command-boundary splitting (natural unit for diagnostics) | ✅ Implemented |
| Hybrid retrieval | Semantic only, BM25 only, hybrid | Hybrid (BM25 keyword + cosine semantic) for best recall | ⬜ Pending |
| Reranking | None, cross-encoder (`ms-marco-MiniLM`), LLM-as-judge | Cross-encoder rerank pass before prompt injection | ⬜ Pending |
| Runbook format | Markdown, YAML, JSON | Markdown (human-editable, version-controllable) | ✅ Implemented |
| Session index storage | Local `~/.tai/`, configurable path | `~/.tai/sessions/` with ChromaDB collection | ✅ Implemented (core) |

______________________________________________________________________

### Tier 1 — Diagnostic Chunk Retrieval (in-memory, per-session)

Status: ✅ Implemented

**Problem:** Current flow injects all collected output into the prompt as one block.
On busy hosts this floods the context window with irrelevant output, degrading quality.

**Approach:**

- After collection, split each command's output into overlapping token chunks (e.g. 512 tokens, 64 overlap)
- Embed all chunks using `nomic-embed-text` via Ollama embeddings API
- On each question (initial + follow-up), embed the question and retrieve top-k chunks by cosine similarity
- Inject only retrieved chunks into the prompt, not the full dump

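The chunk-and-retrieve steps above can be sketched in a few lines. This version uses only the standard library so it runs anywhere; the roadmap's choice of numpy would vectorize the same cosine computation. The names `chunk_tokens`, `cosine`, and `top_k` are hypothetical, and a token here is whatever unit the caller passes in (for example, whitespace-split words).

```python
import math
from typing import Sequence


def chunk_tokens(tokens: Sequence, size: int = 512, overlap: int = 64) -> list:
    """Split a token sequence into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: Sequence[float], embedded: list, k: int) -> list:
    """Return the texts of the k (text, vector) pairs most similar to query_vec."""
    ranked = sorted(embedded, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

Chunking each command's output separately keeps every chunk attributable to a single command, which is what per-chunk source attribution relies on.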
**New module:** `src/tai/rag_retriever.py`

- `chunk_report(report) -> list[Chunk]`
- `embed_chunks(chunks) -> list[EmbeddedChunk]`
- `retrieve(question, embedded_chunks, top_k) -> list[Chunk]`

**Changes to existing code:**

- `prompt_builder.py`: accept `retrieved_chunks` instead of full `CollectionReport` for RAG-mode prompts
- `cli.py`: embed report after collection, pass retriever to `_run_analysis` and `_run_followup_analysis`
- `ai_client.py`: add `embed(text) -> list[float]` method using Ollama `/api/embeddings`

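For reference, an `embed` call against Ollama's `/api/embeddings` endpoint might look like the sketch below. It assumes the native Ollama API is served at the root of the same host and port as the default profile (without the `/v1` suffix); the helper `build_embed_request` and the standalone `embed` function are illustrative and not the actual `AIClient` method.

```python
import json
import urllib.request

# Assumption: native Ollama API on the default local port.
OLLAMA_URL = "http://localhost:11434"


def build_embed_request(
    text: str, model: str = "nomic-embed-text", base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build the POST request; /api/embeddings takes a model and a prompt."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, timeout: float = 30.0) -> list:
    """Send the request and return the embedding as a list of floats."""
    with urllib.request.urlopen(build_embed_request(text), timeout=timeout) as resp:
        body = json.load(resp)
    return [float(x) for x in body["embedding"]]
```

Separating request construction from the network call makes the payload testable without a running Ollama instance.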
**Companion features buildable at same time:**

- `--no-rag` flag to bypass retrieval and use full dump (backwards compat)
- Token budget display: show user how many tokens are being sent vs. saved
- Per-chunk source attribution in AI response (which command produced the evidence)

**Tests:**

- `tests/test_rag_retriever.py`: chunk splitting, cosine similarity ranking, top-k retrieval
- `tests/test_ai.py`: add `test_embed_returns_float_list()`

______________________________________________________________________

### Tier 2 — Runbook Knowledge Base (persistent, ChromaDB)

Status: ✅ Implemented

**Problem:** AI improvises remediation steps from training data, which may be wrong for
specific environments, distros, or internal conventions.

**Approach:**

- Maintain a version-controlled corpus of Markdown runbooks in `runbooks/` directory
- On first run (or `tai runbooks --sync`), embed all runbooks and persist to ChromaDB collection
- On each analysis, retrieve top-3 relevant runbook chunks alongside diagnostic chunks
- Inject as a separate `## Runbook Context` section in the prompt

**New module:** `src/tai/runbook_store.py`

- `RunbookStore`: wraps ChromaDB collection
- `sync(runbooks_dir) -> int` — embed and upsert all runbooks
- `query(question, top_k) -> list[RunbookChunk]`

**New directory:** `runbooks/`

- `ssh.md`, `nginx.md`, `postgres.md`, `disk.md`, `kernel.md`, etc.
- Each runbook: YAML frontmatter (`service`, `symptoms`, `tags`) + Markdown body

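To make the runbook layout concrete, here is a sketch of splitting a runbook into frontmatter and body. The sample runbook content is invented for illustration, and the parser handles only flat `key: value` lines; a real implementation would more likely use a YAML library. The name `split_frontmatter` is hypothetical.

```python
from __future__ import annotations

# Invented sample showing the frontmatter keys named above.
SAMPLE_RUNBOOK = """---
service: sshd
symptoms: connection refused, auth failures
tags: ssh, auth
---
# SSH

## Authentication Failures
Check `journalctl -u sshd` for rejected keys.
"""


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body); metadata is empty if no frontmatter block exists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing '---' delimiter after the opening one.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body
```

The metadata could then be stored alongside each embedded chunk so that queries can filter or cite by `service` and `tags`.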
**New CLI command:** `tai runbooks --sync [--path ./runbooks]`

**Changes to existing code:**

- `prompt_builder.py`: add `build_message_with_runbooks(retrieved_chunks, runbook_chunks)`
- `cli.py`: optionally load `RunbookStore`, query it per analysis turn

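A possible shape for `build_message_with_runbooks` is sketched below. It assumes both arguments are lists of plain strings, and the `## Diagnostic Evidence` heading and the `runbook_budget_chars` parameter are illustrative additions; only the `## Runbook Context` section name and the top-3 limit come from the plan above.

```python
def build_message_with_runbooks(
    retrieved_chunks: list,
    runbook_chunks: list,
    *,
    runbook_budget_chars: int = 2000,  # hypothetical truncation budget
) -> str:
    """Compose a prompt body with diagnostics first, then runbook context."""
    sections = ["## Diagnostic Evidence", *retrieved_chunks]
    if runbook_chunks:
        sections.append("## Runbook Context")
        used = 0
        for chunk in runbook_chunks[:3]:  # top-3 runbook chunks
            remaining = runbook_budget_chars - used
            if remaining <= 0:
                break
            piece = chunk[:remaining]  # truncate to the remaining budget
            sections.append(piece)
            used += len(piece)
    return "\n\n".join(sections)
```

Keeping runbook text in its own section lets the model distinguish observed evidence from documented procedure.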
**Companion features buildable at same time:**

- `tai runbooks --list` — show indexed runbooks and last sync time
- `tai runbooks --add <file>` — index a single runbook
- `/runbooks` slash command in interactive mode — show which runbooks were retrieved
- Runbook citation in AI output: "Based on runbook: `ssh.md#AuthenticationFailures`"

______________________________________________________________________

### Tier 3 — Session Memory Index (institutional learning)

Status: ✅ Implemented (core retrieval/indexing) / ⬜ UX commands pending

**Problem:** Every session starts from zero. Repeat incidents on the same host or
same issue type get no benefit from past work.

**Implemented now:**

- On session end, embed the session summary (issue + root cause + actions) and upsert into a persistent ChromaDB collection (`~/.tai/sessions/`)
- On session start, query for similar past sessions by issue text + hostname
- Inject top-2 past sessions as `## Prior Sessions` context

**Pending UX layer:**

- `/history` command in interactive mode to surface past sessions explicitly

**New module:** `src/tai/session_store.py`

- `SessionStore`: wraps ChromaDB collection at `~/.tai/sessions/`
- `index_session(host, issue, summary, ai)` — embed and store completed session
- `query(question, host, ai, top_k) -> list[PastSession]`

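The `SessionStore` interface above can be exercised with an in-memory stand-in, which is also handy for tests. The sketch below is not the ChromaDB-backed implementation: the class name `InMemorySessionStore`, the `score` field, and the small same-host boost are assumptions, and `ai` is any object exposing `embed(text)`.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class PastSession:
    host: str
    issue: str
    summary: str
    score: float = 0.0


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemorySessionStore:
    """Stand-in with the same method shapes as the planned SessionStore."""

    def __init__(self) -> None:
        self._rows: list[tuple[PastSession, list[float]]] = []

    def index_session(self, host: str, issue: str, summary: str, ai) -> None:
        """Embed issue plus summary and keep it for later queries."""
        vector = ai.embed(f"{issue}\n{summary}")
        self._rows.append((PastSession(host, issue, summary), vector))

    def query(self, question: str, host: str, ai, top_k: int = 2) -> list[PastSession]:
        """Rank stored sessions by similarity, slightly preferring the same host."""
        query_vec = ai.embed(question)
        scored = []
        for session, vector in self._rows:
            score = _cosine(query_vec, vector)
            if session.host == host:
                score += 0.1  # hypothetical same-host boost
            scored.append(PastSession(session.host, session.issue, session.summary, score))
        scored.sort(key=lambda s: s.score, reverse=True)
        return scored[:top_k]
```

The top results would then be rendered under the `## Prior Sessions` heading described above.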
**Changes to existing code:**

- `cli.py`: query `SessionStore` during analysis turns and index final responses at session end

**Companion features buildable at same time:**

- `tai history` CLI subcommand — search past sessions by keyword
- `tai history --host <hostname>` — all sessions for a host
- `tai history --export <file>` — export session summaries as Markdown report
- Auto-suggest: "Similar issue found from 2 weeks ago — load context? [y/N]"

______________________________________________________________________

### Implementation Order

```
Tier 1 (diagnostic chunks)   ← Start here. Zero new infra. Immediate prompt quality gain.
        ↓
Tier 2 (runbook KB)          ← After Tier 1. Requires ChromaDB dep + runbook authoring.
        ↓
Tier 3 (session memory)      ← Builds on Tier 2 infrastructure. Minimal extra work.
```

**Estimated effort:**

- Tier 1: 2–3 days (new module + prompt builder changes + tests)
- Tier 2: 3–4 days (ChromaDB + runbook authoring + CLI command + tests)
- Tier 3: 1–2 days (reuses Tier 2 infrastructure)

### New Dependencies

```
# Tier 1 (zero new runtime deps — uses Ollama HTTP API already in use)
# No additions needed

# Tier 2 + 3
chromadb>=0.5,<1.0          # embedded vector store, no separate server
# OR
qdrant-client>=1.9,<2.0     # if self-hosted Qdrant preferred

sentence-transformers>=3.0  # optional: cross-encoder reranking
```

### New pyproject.toml optional group

```toml
[project.optional-dependencies]
rag = [
    "chromadb>=0.5,<1.0",
    "sentence-transformers>=3.0,<4.0",
]
```

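Because the `rag` group is optional, the CLI would need to detect whether it is installed before enabling Tier 2 and Tier 3 features. A minimal sketch, with the hypothetical helper name `missing_rag_modules`:

```python
import importlib.util

# Import names for the packages in the optional `rag` group.
RAG_MODULES = ("chromadb", "sentence_transformers")


def missing_rag_modules() -> list:
    """Return the RAG modules that cannot be imported in this environment."""
    return [name for name in RAG_MODULES if importlib.util.find_spec(name) is None]
```

An empty result means the extras are available; otherwise the CLI could fall back to Tier 1 only and suggest installing the `rag` extra.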
______________________________________________________________________

## Decisions Log

| Date | Decision | Outcome |
|------|----------|---------|
| 2026-05-04 | Implementation language | Python — with single distributable binary via Nuitka |
| 2026-05-04 | AI backend API | OpenAI-compatible API endpoint (local Ollama by default) |
| 2026-05-04 | Default model | `gemma3:4b` |
| 2026-05-04 | SSH auth methods | Keypair only (ed25519/RSA); auto-accept new hosts; reject on key change (MITM) |
| 2026-05-04 | Bastion host support | `--jump-host` flag via SSH native ProxyJump |
| 2026-05-04 | SSH config behavior | Use `~/.ssh/config` by default; allow override via `--ignore-ssh-config` |
| 2026-05-04 | CLI vs interactive mode | Interactive: REPL for v0.1, `textual` TUI for v0.2+ |
| 2026-05-04 | RAG embedding model | `nomic-embed-text` via Ollama (local, air-gapped safe) |
| 2026-05-04 | RAG vector store (Tier 1) | In-memory numpy cosine similarity — zero deps, session-scoped |
| 2026-05-04 | RAG vector store (Tier 2/3) | `chromadb` embedded mode (default) or `qdrant` self-hosted |
| 2026-05-04 | RAG chunking unit | Command-boundary splitting — each collected command = one or more chunks |
| 2026-05-04 | Runbook format | Markdown with YAML frontmatter, version-controlled in `runbooks/` directory |

______________________________________________________________________

## End-State UX Goal

After the current CLI and memory roadmap phases are stable, the long-term UX goal is a full-screen terminal TUI with an ncurses-style workflow.

### Target End-State

- Split-pane troubleshooting workspace (diagnostics, AI output, and command/input area)
- Live command/probe status with clear success/failure indicators
- In-session history browser for prior questions, retrieved evidence, and related past sessions
- Keyboard-first navigation for operators running in SSH-only environments

### Delivery Approach

- Keep shipping incremental CLI features first (current roadmap order remains unchanged)
- Promote stable workflows into TUI panels once behavior is proven in CLI mode
- Treat the TUI as a final UX consolidation milestone, not a blocker for core troubleshooting capabilities

______________________________________________________________________

## Container Distribution Goal (Docker)

After core CLI/TUI workflows stabilize, provide an official Docker image as an additional distribution target.

### Container Execution Model (Decision)

- Docker is a one-shot invocation target, not a daemon/service mode
- Each run executes a single `tai` command and exits
- State is persisted only through mounted host volumes

### Why Docker Is Valuable Here

- Reproducible runtime: pin Python and dependency versions to remove host-level drift
- Faster operator onboarding: run with one command instead of local Python setup
- Cleaner CI/CD release path: publish versioned images aligned with git tags
- Safer local footprint: isolate dependencies from the host OS package manager

### Subgoals

1. Base image and runtime hardening

   - Multi-stage Dockerfile with slim runtime image
   - Non-root runtime user and minimal filesystem permissions
   - Healthcheck for CLI startup and version command

2. Runtime integration for SSH workflows

   - Documented mounts for `~/.ssh` (read-only where possible) and known-hosts handling
   - Pass-through for SSH config when needed (`--ignore-ssh-config` behavior documented)
   - Clear guidance for jump-host and bastion scenarios from inside the container
   - Documented one-shot run examples for `tai run` and `tai history`

3. Persistent data strategy

   - Required volume mount guidance for runbook store (`~/.tai/runbooks`)
   - Required volume mount guidance for session memory/history (`~/.tai/sessions`)
   - Optional bind mount for JSONL logs and report export artifacts
   - Clear defaults for container paths and equivalent host path mappings

4. Release and quality gates

   - Build and publish image on tagged releases
   - Smoke tests in CI: probe mode, collect mode, and history command against mocked endpoints
   - Version labeling (image tags and OCI metadata) tied to changelog/release tags

### Data Retention and Lifecycle Policy

Retention behavior must be explicit and configurable at runtime. Defaults should be conservative and documented.

1. Retention classes

   - Session memory store (`~/.tai/sessions`): keep semantically indexed summaries for troubleshooting continuity
   - Runbook store (`~/.tai/runbooks`): retain until explicitly replaced or pruned by sync policy
   - JSONL logs and exported reports: operator-controlled retention with optional TTL cleanup

2. Retention controls

   - Add CLI controls for age-based pruning (for example `--retain-days` on cleanup commands)
   - Add host-scoped cleanup (delete history for one host) and full-store cleanup (all hosts)
   - Add dry-run cleanup mode to show what would be deleted before applying changes

3. No-persist mode

   - Add a documented ephemeral mode where no session memory or logs are written
   - Ensure one-shot diagnostics can run in read-only operational contexts

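Age-based pruning with a dry-run mode, as described under retention controls, could be sketched as follows for file-based artifacts such as JSONL logs and exported reports. The function name `prune_old_files` and its parameters are hypothetical, and the ChromaDB-backed stores would need their own deletion path.

```python
from __future__ import annotations

import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def prune_old_files(
    root: Path,
    retain_days: int,
    *,
    dry_run: bool = True,
    now: float | None = None,
) -> list[Path]:
    """Return files older than retain_days; delete them unless dry_run is set."""
    reference = now if now is not None else time.time()
    cutoff = reference - retain_days * SECONDS_PER_DAY
    expired = sorted(
        path for path in root.rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

Defaulting `dry_run` to `True` matches the conservative-defaults principle: nothing is deleted unless the operator asks for it explicitly.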
### Configuration and State Persistence Model

Configuration and retained state should be predictable across container upgrades and host environments.

1. Mount and path contract

   - Define canonical container paths for `~/.tai/runbooks`, `~/.tai/sessions`, and optional log/export paths
   - Document required versus optional mounts and expected permissions for each
   - Document UID/GID mapping guidance to prevent host volume ownership issues

2. Schema and compatibility

   - Introduce explicit storage schema version metadata for persistent stores
   - Define upgrade behavior for older stores (migrate, re-index, or fail with clear guidance)
   - Add compatibility notes for image upgrades and rollback expectations

3. Backup and recovery

   - Provide export/import workflows for session memory and runbook indexes
   - Document minimal backup set and restore order for disaster recovery

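The schema-version idea above could be realized with a small metadata file next to each store. The sketch below is one possible policy, not a decided one: the `schema.json` filename, the `schema_version` key, the version number, and the mapping of cases to actions are all assumptions.

```python
import json
from enum import Enum
from pathlib import Path

CURRENT_SCHEMA_VERSION = 2  # hypothetical


class UpgradeAction(Enum):
    NONE = "none"
    MIGRATE = "migrate"
    REINDEX = "reindex"
    FAIL = "fail"


def plan_upgrade(store_dir: Path) -> UpgradeAction:
    """Decide how to treat an existing store based on its schema metadata."""
    meta_path = store_dir / "schema.json"
    if not meta_path.exists():
        return UpgradeAction.REINDEX  # unversioned store: rebuild from sources
    try:
        version = json.loads(meta_path.read_text()).get("schema_version")
    except (json.JSONDecodeError, AttributeError):
        return UpgradeAction.FAIL  # unreadable or malformed metadata
    if version == CURRENT_SCHEMA_VERSION:
        return UpgradeAction.NONE
    if isinstance(version, int) and version < CURRENT_SCHEMA_VERSION:
        return UpgradeAction.MIGRATE
    return UpgradeAction.FAIL  # newer than this build, or invalid
```

Failing on a newer version protects against a rolled-back image silently corrupting a store written by a later release.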
### Security and Privacy for Retained Data

Persisted troubleshooting evidence can include sensitive operational data and must be handled accordingly.

1. Data minimization

   - Add optional redaction hooks for common sensitive patterns before persistence
   - Keep prompt-only transient data separate from persisted summary/index content

2. Runtime hardening

   - Target non-root container execution with read-only root filesystem by default
   - Require explicit writable mounts only for retained data locations

3. Auditable behavior

   - Log retention-affecting operations (cleanup, purge, export/import) with timestamps and scope
   - Define stable exit codes for cleanup and retention workflows to support automation

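A redaction hook of the kind mentioned under data minimization might be a list of regular expressions applied before anything is persisted. The patterns and the name `redact` below are illustrative assumptions and far from exhaustive.

```python
import re

# Hypothetical patterns; a real deployment would tune and extend these.
REDACTION_PATTERNS = [
    # key=value or key: value secrets such as "password=hunter2"
    (
        re.compile(r"(?i)\b(password|passwd|token|secret|api[_-]?key)\s*[=:]\s*\S+"),
        r"\1=[REDACTED]",
    ),
    # PEM-style private key blocks
    (
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
        "[REDACTED PRIVATE KEY]",
    ),
]


def redact(text: str) -> str:
    """Apply every redaction pattern in order and return the scrubbed text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Running the hook only on the persistence path keeps the live prompt intact while limiting what lands on disk.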
### Kubernetes Position

Kubernetes is out of scope for this delivery plan.

- `tai` is currently an operator-invoked troubleshooting client, not a long-running service
- AI inference is external to `tai` (OpenAI-compatible endpoint), reducing the need for in-cluster model orchestration
- SSH key/config handling and per-operator context are simpler with local or single-container execution

Kubernetes can be revisited only if `tai` evolves into a centralized multi-user service with queueing, RBAC, and shared tenancy requirements.

______________________________________________________________________

## Final Long-Term Goal: Full Rust Migration

This is a final-stage roadmap goal and remains explicitly out of near-term scope.
It should begin only after the Python implementation, TUI direction, Docker one-shot model,
and retention/persistence policies are stable and proven in production usage.

### Why This Is the Final Goal

- Improve execution latency and startup speed for both native runs and container one-shot invocations
- Produce a single, portable native binary with minimal runtime dependency footprint
- Strengthen reliability and memory safety under heavy log parsing and concurrent workflows
- Simplify long-term packaging and distribution across Linux targets

### Migration Objectives

1. Preserve feature parity first

   - Match existing CLI behavior, interactive workflows, RAG integration, runbook management, and history/session-memory features
   - Keep command semantics and safety boundaries equivalent during transition

2. Target both distribution modes

   - Native Rust binary for direct operator use
   - Docker image built around the Rust binary for one-shot execution with mounted persistent volumes

3. Keep compatibility guardrails

   - Define persistent data format compatibility or migration tooling for runbook/session stores
   - Preserve operator-visible flags where practical to reduce migration friction

### Suggested Delivery Phases

1. Build baseline Rust CLI scaffold with feature-flagged parity checkpoints
2. Port SSH execution and read-only policy enforcement modules
3. Port planner, collectors, prompt composition, and AI client adapters
4. Port session memory/history and runbook workflows with migration tests
5. Port interactive UX/TUI layer and deprecate Python runtime path

### Rust Toolchain End-State

- Standardize on Cargo-based build/test/lint pipeline (`cargo fmt`, `cargo clippy`, `cargo test`)
- Add release profile optimization and reproducible build settings
- Publish signed native artifacts and Docker images derived from Rust release binaries

### Decision Gate Before Starting

Begin Rust migration only when:

- Python roadmap milestones are complete and stable
- Container distribution and retention policy workflows are operationally validated
- A parity test matrix exists to prove behavior equivalence during migration