zphinx/tai

feat(rag): implement Tier 1 in-memory RAG for interactive follow-ups

- Add embed() to AIClient using Ollama nomic-embed-text via /v1/embeddings
- Add DEFAULT_EMBED_MODEL and embed_model field to AIConfig
- New rag_retriever.py: chunk_report(), EmbeddedChunk, retrieve() (pure-Python cosine)
- prompt_builder: add build_message_with_chunks() for RAG-aware follow-up prompts
- cli: add --no-rag flag, embed report chunks after collection, retrieve top-5 per question
- Graceful fallback to full-context if embedding model unavailable
- 16 new tests in test_rag_retriever.py (67 total, all passing)
- Add chromadb>=0.5 as optional [rag] dep in pyproject.toml
- README: add step 3 (pull nomic-embed-text), update Suggested Tooling table

2026-05-04 18:36:12 +02:00

.gitea/workflows

ci: rename release.yml to tag.yml, fix trigger to match non-v tags

2026-05-04 06:48:34 +02:00

src/tai

feat(rag): implement Tier 1 in-memory RAG for interactive follow-ups

2026-05-04 18:36:12 +02:00

tests

feat(rag): implement Tier 1 in-memory RAG for interactive follow-ups

2026-05-04 18:36:12 +02:00

.gitignore

chore: remove logs from tracking, add requirements.txt, improve .gitignore

2026-05-04 06:21:40 +02:00

.yamllint.yml

update

2026-05-04 04:08:50 +02:00

CHANGELOG.md

update

2026-05-04 04:22:58 +02:00

LICENSE

Initial commit

2026-05-04 02:11:16 +02:00

pyproject.toml

update

2026-05-04 18:30:33 +02:00

README.md

update

2026-05-04 18:30:33 +02:00

requirements.txt

chore: remove logs from tracking, add requirements.txt, improve .gitignore

2026-05-04 06:21:40 +02:00

ROADMAP.md

docs(roadmap): add Phase 6 RAG & Knowledge Layer plan

2026-05-04 18:23:33 +02:00

README.md

tai — Linux AI Troubleshooting Agent

tai is an agentic AI-driven troubleshooting tool for Linux systems. It autonomously investigates issues on remote hosts via SSH, analyzes relevant logs and configuration files, and provides a clear diagnosis along with suggested remediation steps — all without making any changes to the target system.

Overview

Given a problem description and a target hostname, tai connects to the remote system over SSH, gathers relevant data (logs, configuration files, service status, etc.), and uses a locally-hosted AI model to reason about the root cause and recommend solutions.

The agent operates in read-only mode at all times. It will never modify the target system under any circumstances — all suggestions are presented to the human troubleshooter for review and action.
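One way to picture the read-only guarantee is an allowlist check applied to every command before it is sent over SSH. The sketch below is an illustration only, not tai's actual mechanism: the names `READ_ONLY_COMMANDS`, `READ_ONLY_SYSTEMCTL`, `FORBIDDEN_TOKENS`, and `is_read_only` are hypothetical, and the allowlist itself is an assumption.

```python
import shlex

# Hypothetical allowlist of programs that normally only read state.
# Illustrative, not audited: some of these still have mutating flags
# (for example journalctl --vacuum-size), so a real check would also
# need to vet arguments.
READ_ONLY_COMMANDS = {"cat", "ls", "journalctl", "grep", "tail", "head", "stat", "df"}

# systemctl is only treated as safe for subcommands that inspect state.
READ_ONLY_SYSTEMCTL = {"status", "show", "is-active", "is-enabled", "list-units", "cat"}

# Shell metacharacters that could chain commands or redirect output to a file.
FORBIDDEN_TOKENS = (";", "&", "|", ">", "<", "`", "$(")


def is_read_only(command: str) -> bool:
    """Return True only if the command passes the illustrative allowlist."""
    if any(token in command for token in FORBIDDEN_TOKENS):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    if not argv:
        return False
    if argv[0] == "systemctl":
        return len(argv) > 1 and argv[1] in READ_ONLY_SYSTEMCTL
    return argv[0] in READ_ONLY_COMMANDS
```

Rejecting anything not explicitly allowed keeps the failure mode conservative: an unknown command is refused rather than executed.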

Supported Distributions

Ubuntu
Debian
RHEL
Rocky Linux

Example Workflow

A troubleshooter receives a ticket reporting that the Apache service on a remote server has failed to start. They provide tai with:

The ticket description or error message
The hostname of the affected system
Any relevant directories to focus on

tai then connects to the host, reads through system logs, service configurations, and any other related files, and returns a structured analysis of the likely cause along with recommended next steps.

Suggested Tooling

| Component | Tool |
| --- | --- |
| AI inference backend | Ollama |
| Chat model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
| Embedding model | `nomic-embed-text` (via Ollama) |
| Vector store | ChromaDB (embedded, local) |
| Language | Python 3.11+ |

How-To: Setting Up the AI Backend (Arch Linux + RTX 3080)

tai uses Ollama as its local AI backend. Ollama exposes an OpenAI-compatible HTTP API that tai talks to, so no cloud services are involved and no data leaves your machine.

An RTX 3080 (10 GB VRAM) comfortably runs 7–8B parameter models at 4-bit quantisation.
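The sizing claim can be checked with back-of-the-envelope arithmetic: weight memory is roughly parameter count times bits per weight, divided by eight. The helper below is a hypothetical sketch (the name `weight_memory_gb` is not part of tai) and counts weights only; the KV cache, activations, and runtime overhead come on top, and real 4-bit model files tend to be somewhat larger than this idealised figure.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes).

    params_billion * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9,
    simplifies to params_billion * bits_per_weight / 8.
    """
    return params_billion * bits_per_weight / 8


# An idealised 8B model at 4 bits per weight needs about 4 GB for weights,
# which leaves headroom on a 10 GB card; the same model at 16 bits would not fit.
eight_b_at_4bit = weight_memory_gb(8, 4)
eight_b_at_16bit = weight_memory_gb(8, 16)
```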

1. Install CUDA and Ollama

# CUDA runtime (skip if already installed)
sudo pacman -S cuda

# Ollama with CUDA support from the AUR
yay -S ollama-cuda
# or: paru -S ollama-cuda

# Enable and start the service
sudo systemctl enable --now ollama

2. Pull a chat model

ollama pull gemma3:4b       # ~3 GB — fast, good for sysadmin tasks
ollama pull llama3.1:8b     # ~5 GB — stronger reasoning
ollama pull qwen2.5:7b      # ~4.5 GB — strong structured output

3. Pull the embedding model

tai uses nomic-embed-text to embed diagnostic data and runbooks for semantic retrieval (RAG). Pull it on the same host as Ollama:

ollama pull nomic-embed-text   # ~274 MB

Verify it loaded:

curl http://localhost:11434/api/embeddings \
  -d '{"model":"nomic-embed-text","prompt":"test"}'

A JSON response with an "embedding" array confirms it is ready.
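Once embeddings are available, semantic retrieval comes down to ranking stored chunks by their similarity to the question's embedding. The following is a minimal pure-Python sketch of cosine-similarity ranking, assuming each chunk carries its vector as a plain list of floats. The `Chunk` class and the `cosine` and `top_k` functions are hypothetical names for illustration; the repository's own retriever may be structured differently.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical container pairing a piece of text with its embedding."""
    text: str
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector: list[float], chunks: list[Chunk], k: int = 5) -> list[Chunk]:
    """Return the k chunks most similar to the query, best match first."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return ranked[:k]
```

A linear scan like this is adequate for the handful of chunks produced by a single diagnostic report; a vector store becomes worthwhile only when the corpus grows well beyond that.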

4. Verify the chat model works

ollama run gemma3:4b "what causes a systemd service to enter failed state?"

5. Verify the HTTP API is running

tai communicates with Ollama over HTTP. Ollama's native generate endpoint gives a quick end-to-end check that the server is reachable and the model loads:

curl http://localhost:11434/api/generate \
  -d '{"model":"gemma3:4b","prompt":"hello","stream":false}'

A JSON response with a "response" field confirms everything is working.
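The same check can be scripted from Python using only the standard library. This is a sketch under stated assumptions: it mirrors the curl call above, the function names `build_generate_request`, `extract_response`, and `check_ollama` are hypothetical, and it is not how tai's own client is necessarily written.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_response(raw: bytes) -> str:
    """Pull the generated text out of the JSON body, or fail with a clear error."""
    payload = json.loads(raw)
    if "response" not in payload:
        raise ValueError(f"unexpected reply from server: {payload!r}")
    return payload["response"]


def check_ollama(base_url: str = "http://localhost:11434",
                 model: str = "gemma3:4b",
                 timeout: float = 60.0) -> str:
    """Send a one-word prompt and return the model's reply (needs a running server)."""
    request = build_generate_request(base_url, model, "hello")
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_response(resp.read())
```

Calling `check_ollama()` with a running Ollama instance should return the model's reply as a string; a connection error or a `ValueError` indicates that the service or the model is not ready.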

6. Point tai at your Ollama instance

Once tai AI integration is complete, use these flags:

tai "nginx failing to start" --host web01 \
  --ai-host http://localhost:11434 \
  --model gemma3:4b

The default values for --ai-host and --model will be http://localhost:11434 and gemma3:4b respectively, so for local use you won't need to specify them explicitly.
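The flag behaviour described above could be expressed with `argparse` roughly as follows. This is a hypothetical sketch, not tai's actual CLI definition: `build_parser` and the two constants are illustrative names, and treating `--host` as required is an assumption.

```python
import argparse

# Defaults as stated in the text above.
DEFAULT_AI_HOST = "http://localhost:11434"
DEFAULT_MODEL = "gemma3:4b"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags shown in the example invocation."""
    parser = argparse.ArgumentParser(
        prog="tai",
        description="Read-only AI troubleshooting for Linux hosts.",
    )
    parser.add_argument("problem", help="ticket description or error message")
    # Assumption: the target host must always be given explicitly.
    parser.add_argument("--host", required=True, help="hostname of the affected system")
    parser.add_argument("--ai-host", default=DEFAULT_AI_HOST, help="base URL of the Ollama instance")
    parser.add_argument("--model", default=DEFAULT_MODEL, help="chat model to use")
    return parser


# Local use: only the problem and the target host are supplied.
args = build_parser().parse_args(["nginx failing to start", "--host", "web01"])
```

With the defaults in place, `args.ai_host` and `args.model` fall back to the local Ollama instance and `gemma3:4b` whenever the flags are omitted.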
