Document embedding model and ChromaDB vector store setup; add optional `rag` dependency

2026-05-04 18:30:33 +02:00
parent e49670a664
commit 739e19f595
2 changed files with 27 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -30,7 +30,9 @@ A troubleshooter receives a ticket reporting that the Apache service on a remote
 | Component | Tool |
 |-----------|------|
 | AI inference backend | [Ollama](https://ollama.com) |
-| Model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
+| Chat model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
+| Embedding model | `nomic-embed-text` (via Ollama) |
+| Vector store | [ChromaDB](https://www.trychroma.com) (embedded, local) |
 | Language | Python 3.11+ |

 ______________________________________________________________________
@@ -55,7 +57,7 @@ yay -S ollama-cuda
 sudo systemctl enable --now ollama
 ```

-### 2. Pull a model
+### 2. Pull a chat model

 ```bash
 ollama pull gemma3:4b       # ~3 GB — fast, good for sysadmin tasks
@@ -63,13 +65,30 @@ ollama pull llama3.1:8b     # ~5 GB — stronger reasoning
 ollama pull qwen2.5:7b      # ~4.5 GB — strong structured output
 ```

-### 3. Verify the model works
+### 3. Pull the embedding model
+
+`tai` uses `nomic-embed-text` to embed diagnostic data and runbooks for semantic retrieval (retrieval-augmented generation, RAG). Pull it on the host that runs Ollama:
+
+```bash
+ollama pull nomic-embed-text   # ~274 MB
+```
+
+Verify that it returns embeddings:
+
+```bash
+curl http://localhost:11434/api/embeddings \
+  -d '{"model":"nomic-embed-text","prompt":"test"}'
+```
+
+A JSON response with an `"embedding"` array confirms it is ready.
+
+### 4. Verify the chat model works

 ```bash
 ollama run gemma3:4b "what causes a systemd service to enter failed state?"
 ```

-### 4. Verify the HTTP API is running
+### 5. Verify the HTTP API is running

 `tai` communicates with Ollama over its OpenAI-compatible REST API:

@@ -80,7 +99,7 @@ curl http://localhost:11434/api/generate \

 A JSON response with a `response` field confirms everything is working.

-### 5. Point tai at your Ollama instance
+### 6. Point tai at your Ollama instance

 Once `tai` AI integration is complete, use these flags:

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -19,6 +19,9 @@ dependencies = [
 ]

 [project.optional-dependencies]
+rag = [
+  "chromadb>=0.5,<1.0",
+]
 dev = [
  "pytest>=8.2,<9.0",
  "ruff>=0.5,<1.0",