diff --git a/README.md b/README.md
index 5ef7c20..d25e954 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,9 @@ A troubleshooter receives a ticket reporting that the Apache service on a remote
 | Component | Tool |
 |-----------|------|
 | AI inference backend | [Ollama](https://ollama.com) |
-| Model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
+| Chat model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
+| Embedding model | `nomic-embed-text` (via Ollama) |
+| Vector store | [ChromaDB](https://www.trychroma.com) (embedded, local) |
 | Language | Python 3.11+ |
 
 ______________________________________________________________________
@@ -55,7 +57,7 @@ yay -S ollama-cuda
 sudo systemctl enable --now ollama
 ```
 
-### 2. Pull a model
+### 2. Pull a chat model
 
 ```bash
 ollama pull gemma3:4b # ~3 GB — fast, good for sysadmin tasks
@@ -63,13 +65,30 @@ ollama pull llama3.1:8b # ~5 GB — stronger reasoning
 ollama pull qwen2.5:7b # ~4.5 GB — strong structured output
 ```
 
-### 3. Verify the model works
+### 3. Pull the embedding model
+
+`tai` uses `nomic-embed-text` to embed diagnostic data and runbooks for semantic retrieval (RAG). Pull it on the same host as Ollama:
+
+```bash
+ollama pull nomic-embed-text # ~274 MB
+```
+
+Verify it loaded:
+
+```bash
+curl http://localhost:11434/api/embeddings \
+  -d '{"model":"nomic-embed-text","prompt":"test"}'
+```
+
+A JSON response with an `"embedding"` array confirms it is ready.
+
+### 4. Verify the chat model works
 
 ```bash
 ollama run gemma3:4b "what causes a systemd service to enter failed state?"
 ```
 
-### 4. Verify the HTTP API is running
+### 5. Verify the HTTP API is running
 
 `tai` communicates with Ollama over its OpenAI-compatible REST API:
 
@@ -80,7 +99,7 @@ curl http://localhost:11434/api/generate \
 
 A JSON response with a `response` field confirms everything is working.
 
-### 5. Point tai at your Ollama instance
+### 6. Point tai at your Ollama instance
 
 Once `tai` AI integration is complete, use these flags:
 
diff --git a/pyproject.toml b/pyproject.toml
index 348fc5d..165d664 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -19,6 +19,9 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
+rag = [
+    "chromadb>=0.5,<1.0",
+]
 dev = [
     "pytest>=8.2,<9.0",
     "ruff>=0.5,<1.0",
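
The embedding check that the README change adds in step 3 (a POST to `/api/embeddings` with `model` and `prompt`, expecting an `"embedding"` array back) could also be scripted. The following is a minimal sketch using only the Python standard library, not code from `tai` itself: the function names `build_embedding_request`, `parse_embedding`, and `check_embedding_model` are hypothetical, and the base URL assumes Ollama is listening on the default local address used in the README.

```python
import json
import urllib.request

# Assumption: Ollama runs locally on the default port shown in the README.
OLLAMA_URL = "http://localhost:11434"


def build_embedding_request(text, model="nomic-embed-text", base_url=OLLAMA_URL):
    """Build the same POST request that the README's curl command sends."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding(body):
    """Extract the embedding vector from a JSON response body (str or bytes)."""
    data = json.loads(body)
    embedding = data.get("embedding")
    if not isinstance(embedding, list) or not embedding:
        raise ValueError("response has no non-empty 'embedding' array")
    return [float(x) for x in embedding]


def check_embedding_model(text="test", timeout=30):
    """Send one request to Ollama and return the embedding's dimensionality.

    Requires a running Ollama instance with the model already pulled.
    """
    request = build_embedding_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return len(parse_embedding(response.read()))
```

With Ollama running and `nomic-embed-text` pulled, calling `check_embedding_model()` returns the vector length; a `ValueError` or a connection error suggests the model or the service is not ready. Request construction and response parsing are kept separate from the network call so they can be exercised without a live server.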