29
README.md
@@ -30,7 +30,9 @@ A troubleshooter receives a ticket reporting that the Apache service on a remote
 | Component | Tool |
 |-----------|------|
 | AI inference backend | [Ollama](https://ollama.com) |
-| Model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
+| Chat model | `gemma3:4b`, `llama3.1:8b`, or `qwen2.5:7b` |
+| Embedding model | `nomic-embed-text` (via Ollama) |
+| Vector store | [ChromaDB](https://www.trychroma.com) (embedded, local) |
 | Language | Python 3.11+ |
 
 ______________________________________________________________________
@@ -55,7 +57,7 @@ yay -S ollama-cuda
 sudo systemctl enable --now ollama
 ```
 
-### 2. Pull a model
+### 2. Pull a chat model
 
 ```bash
 ollama pull gemma3:4b # ~3 GB — fast, good for sysadmin tasks
@@ -63,13 +65,30 @@ ollama pull llama3.1:8b # ~5 GB — stronger reasoning
 ollama pull qwen2.5:7b # ~4.5 GB — strong structured output
 ```
 
-### 3. Verify the model works
+### 3. Pull the embedding model
+
+`tai` uses `nomic-embed-text` to embed diagnostic data and runbooks for semantic retrieval (RAG). Pull it on the same host as Ollama:
+
+```bash
+ollama pull nomic-embed-text # ~274 MB
+```
+
+Verify it loaded:
+
+```bash
+curl http://localhost:11434/api/embeddings \
+  -d '{"model":"nomic-embed-text","prompt":"test"}'
+```
+
+A JSON response with an `"embedding"` array confirms it is ready.
+
+### 4. Verify the chat model works
 
 ```bash
 ollama run gemma3:4b "what causes a systemd service to enter failed state?"
 ```
 
-### 4. Verify the HTTP API is running
+### 5. Verify the HTTP API is running
 
 `tai` communicates with Ollama over its OpenAI-compatible REST API:
 
@@ -80,7 +99,7 @@ curl http://localhost:11434/api/generate \
 
 A JSON response with a `response` field confirms everything is working.
 
-### 5. Point tai at your Ollama instance
+### 6. Point tai at your Ollama instance
 
 Once `tai` AI integration is complete, use these flags:
 
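Review note: the new README section says `tai` embeds diagnostic data and runbooks for semantic retrieval. The sketch below shows one way that flow could look. It is not the project's implementation. The `/api/embeddings` request and the `"embedding"` response field come from the curl check in the README; the function names, the sample runbook texts, and the in-memory dict (standing in for ChromaDB) are assumptions made for illustration.

```python
import json
import math
import urllib.request

# Endpoint taken from the README's curl check; assumes Ollama runs locally.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Ask Ollama for the embedding vector of `text` (needs a running server)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["embedding"]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec: list[float], docs: dict[str, list[float]]) -> list[str]:
    """Return document names ordered from most to least similar to the query."""
    return sorted(docs, key=lambda name: cosine(query_vec, docs[name]), reverse=True)


def demo() -> None:
    """Hypothetical usage: embed two made-up runbook snippets and rank them."""
    runbooks = {
        "apache-restart": "Restart httpd and inspect the error log.",
        "disk-full": "Free space on a full root filesystem.",
    }
    vectors = {name: embed(text) for name, text in runbooks.items()}
    print(rank(embed("Apache service will not start"), vectors))
```

Calling `demo()` requires Ollama to be running with `nomic-embed-text` pulled. `cosine` and `rank` are pure functions and can be exercised without a server.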
@@ -19,6 +19,9 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
+rag = [
+    "chromadb>=0.5,<1.0",
+]
 dev = [
     "pytest>=8.2,<9.0",
     "ruff>=0.5,<1.0",
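Review note: because `chromadb` is added under an optional `rag` extra rather than the core dependencies, code that touches the vector store probably needs to cope with the package being absent. A minimal sketch of such a guard follows. The helper names and the error message are hypothetical, and nothing in this diff shows how `tai` actually handles the missing extra.

```python
import importlib.util


def module_available(module_name: str = "chromadb") -> bool:
    """Return True when `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


def require_rag() -> None:
    """Raise a readable error when the optional vector-store dependency is missing."""
    if not module_available("chromadb"):
        raise RuntimeError(
            "RAG features need the optional 'rag' extra (chromadb is not installed)"
        )
```

Assuming the project is installed from its source tree, the extra would be pulled in with pip's standard extras syntax, for example `pip install ".[rag]"`.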