# tai: Linux AI Troubleshooting Agent

`tai` is an agentic, AI-driven troubleshooting tool for Linux systems. It autonomously investigates issues on remote hosts over SSH, analyzes relevant logs and configuration files, and returns a clear diagnosis along with suggested remediation steps, all without making any changes to the target system.

## Overview

Given a problem description and a target hostname, `tai` connects to the remote system over SSH, gathers relevant data (logs, configuration files, service status, and so on), and uses a locally hosted AI model to reason about the root cause and recommend solutions.

The agent operates in **read-only mode at all times**. It never modifies the target system; all suggestions are presented to the human troubleshooter for review and action.
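One way to enforce the read-only guarantee is to check every command against an allowlist before it is sent over SSH. The sketch below is only an illustration of that idea: the function name `is_read_only`, the allowlists, and the choice to reject all shell metacharacters are assumptions, not `tai`'s actual implementation.

```python
import shlex

# Hypothetical allowlist of commands that only read state.
READ_ONLY_COMMANDS = {"cat", "head", "tail", "grep", "ls", "stat", "df"}

# systemctl is allowed only with subcommands that inspect units.
READ_ONLY_SYSTEMCTL = {"status", "show", "cat", "is-active", "is-enabled", "is-failed"}

# Reject anything that could chain commands, redirect output, or substitute.
SHELL_METACHARACTERS = set(";|&><`$(){}\n")


def is_read_only(command: str) -> bool:
    """Return True only if the command is on the allowlist and has no shell tricks."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:  # e.g. an unterminated quote
        return False
    if not tokens:
        return False
    program = tokens[0]
    if program == "systemctl":
        return len(tokens) > 1 and tokens[1] in READ_ONLY_SYSTEMCTL
    return program in READ_ONLY_COMMANDS


for candidate in ("systemctl status apache2", "systemctl restart apache2"):
    print(candidate, "->", "allowed" if is_read_only(candidate) else "blocked")
```

Denying by default keeps the failure mode safe: a command the allowlist does not recognise is blocked rather than executed.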
## Supported Distributions

- Ubuntu
- Debian
- RHEL
- Rocky Linux

## Example Workflow

A troubleshooter receives a ticket reporting that the Apache service on a remote server has failed to start. They provide `tai` with:

1. The ticket description or error message
1. The hostname of the affected system
1. Any relevant directories to focus on

`tai` then connects to the host, reads through system logs, service configurations, and any other related files, and returns a structured analysis of the likely cause along with recommended next steps.
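The three inputs in the workflow above could be modelled as a small data structure that is turned into the opening prompt for the model. This is a hypothetical sketch: `TroubleshootingRequest`, its field names, and `build_prompt` are illustrative and do not come from the `tai` codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TroubleshootingRequest:
    """The three inputs a troubleshooter supplies (names are assumptions)."""
    description: str                   # ticket description or error message
    hostname: str                      # affected system
    focus_dirs: tuple[str, ...] = ()   # optional directories to focus on


def build_prompt(request: TroubleshootingRequest) -> str:
    """Render the request as plain text for the chat model."""
    lines = [
        f"Problem: {request.description}",
        f"Host: {request.hostname}",
    ]
    if request.focus_dirs:
        lines.append("Focus directories: " + ", ".join(request.focus_dirs))
    lines.append("Investigate read-only and report the likely cause and next steps.")
    return "\n".join(lines)


request = TroubleshootingRequest(
    description="Apache service failed to start",
    hostname="web01",
    focus_dirs=("/etc/apache2", "/var/log/apache2"),
)
print(build_prompt(request))
```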
## Suggested Tooling

| Component | Tool |
|-----------|------|
| AI inference backend | [Ollama](https://ollama.com) |
| Chat model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
| Embedding model | `nomic-embed-text` (via Ollama) |
| Vector store | [ChromaDB](https://www.trychroma.com) (embedded, local) |
| Language | Python 3.11+ |

______________________________________________________________________
## How-To: Setting Up the AI Backend (Arch Linux + RTX 3080)

`tai` uses [Ollama](https://ollama.com) as its local AI backend. Ollama exposes an OpenAI-compatible HTTP API that `tai` talks to, so no cloud services are involved and no data leaves your machine.

An RTX 3080 (10 GB VRAM) comfortably runs 7–8B parameter models at 4-bit quantisation.
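The VRAM claim can be sanity-checked with simple arithmetic: weight memory is roughly the parameter count times the bits per weight. The helper below is a back-of-the-envelope sketch only. It ignores the KV cache and runtime overhead, and real 4-bit formats store some extra metadata, so actual usage will be somewhat higher.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


for label, size in (("7B", 7.0), ("8B", 8.0)):
    print(f"{label} at 4-bit: ~{weight_memory_gib(size, 4):.1f} GiB of weights")
```

Both estimates land well under 10 GB, which leaves room for context and runtime overhead on the card.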
### 1. Install CUDA and Ollama

```bash
# CUDA runtime (skip if already installed)
sudo pacman -S cuda

# Ollama with CUDA support from the AUR
yay -S ollama-cuda
# or: paru -S ollama-cuda

# Enable and start the service
sudo systemctl enable --now ollama
```
### 2. Pull a chat model

```bash
# Pull whichever model you plan to use; one is enough
ollama pull gemma3:4b     # ~3 GB, fast, good for sysadmin tasks
ollama pull llama3.1:8b   # ~5 GB, stronger reasoning
ollama pull qwen2.5:7b    # ~4.5 GB, strong structured output
```
### 3. Pull the embedding model

`tai` uses `nomic-embed-text` to embed diagnostic data and runbooks for semantic retrieval (RAG). Pull it on the same host as Ollama:

```bash
ollama pull nomic-embed-text   # ~274 MB
```

Verify it loaded:

```bash
curl http://localhost:11434/api/embeddings \
  -d '{"model":"nomic-embed-text","prompt":"test"}'
```

A JSON response with an `"embedding"` array confirms it is ready.
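To illustrate what semantic retrieval does with those embedding arrays, the sketch below ranks a few documents by cosine similarity to a query vector. The three-dimensional vectors and runbook names are made up for the example; real `nomic-embed-text` vectors are much longer, and in the suggested stack ChromaDB would presumably handle storage and search rather than hand-written code like this.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: list[float], documents: dict[str, list[float]]) -> list[str]:
    """Return document names ordered from most to least similar to the query."""
    return sorted(
        documents,
        key=lambda name: cosine_similarity(query, documents[name]),
        reverse=True,
    )


# Toy vectors standing in for real embeddings (hypothetical runbooks).
runbooks = {
    "apache-wont-start": [0.9, 0.1, 0.0],
    "disk-full": [0.0, 0.2, 0.9],
    "dns-resolution": [0.1, 0.9, 0.1],
}
query = [0.8, 0.2, 0.1]
print(rank_by_similarity(query, runbooks))
# ['apache-wont-start', 'dns-resolution', 'disk-full']
```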
### 4. Verify the chat model works

```bash
ollama run gemma3:4b "what causes a systemd service to enter failed state?"
```
### 5. Verify the HTTP API is running

`tai` communicates with Ollama over HTTP. The check below calls Ollama's native `/api/generate` endpoint; the OpenAI-compatible API is served from the same port under `/v1`.

```bash
curl http://localhost:11434/api/generate \
  -d '{"model":"gemma3:4b","prompt":"hello","stream":false}'
```

A JSON response with a `response` field confirms everything is working.
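The same check can be made from Python using only the standard library. This is a minimal sketch rather than `tai`'s client code: the function names are hypothetical, and the request mirrors the `curl` call above against Ollama's `/api/generate` endpoint.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "gemma3:4b",
    host: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build (but do not send) a non-streaming POST to /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the request and return the model's text; needs a running Ollama."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as reply:
        return json.loads(reply.read())["response"]
```

With Ollama running locally, `generate("hello")` should return the text from the `response` field. Keeping request construction separate from sending makes the payload easy to inspect without a live server.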
### 6. Point tai at your Ollama instance

Once the AI integration in `tai` is complete, pass these flags:

```bash
tai "nginx failing to start" --host web01 \
  --ai-host http://localhost:11434 \
  --model gemma3:4b
```

`--ai-host` and `--model` will default to `http://localhost:11434` and `gemma3:4b` respectively, so for local use you won't need to specify them.
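The flags and defaults described above could be wired up with `argparse` roughly as follows. This is a sketch under assumptions: the flag names and default values come from this README, but the positional argument name `problem`, the help strings, and making `--host` required are guesses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface sketch for tai (details are assumptions)."""
    parser = argparse.ArgumentParser(
        prog="tai",
        description="Read-only AI troubleshooting for Linux hosts.",
    )
    parser.add_argument("problem", help="ticket description or error message")
    parser.add_argument("--host", required=True, help="hostname of the affected system")
    parser.add_argument(
        "--ai-host",
        default="http://localhost:11434",
        help="base URL of the Ollama instance",
    )
    parser.add_argument("--model", default="gemma3:4b", help="chat model to use")
    return parser


# Local use: only the problem and the target host are given.
args = build_parser().parse_args(["nginx failing to start", "--host", "web01"])
print(args.ai_host, args.model)
# http://localhost:11434 gemma3:4b
```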