updated README

2025-10-03 09:38:46 -04:00
parent e954bf82fb
commit 665fcbe191
9 changed files with 310 additions and 572 deletions
@@ -1,185 +1,220 @@
-# ML Repo — Architecture and External RAG Server Design (for Ollama/Open WebUI)
+# ML Stack — Local AI Orchestration Toolkit

-My openWebUI/searxng configs, plugins, RAG server, as well as a custom program that runs the AI's code in isolated Docker containers
+This repository packages a complete self-hosted assistant stack around Open WebUI plus several companion services: a scheduler that can trigger chats and workflows, a Docker-backed code runner, a Roku remote tool server, Nextcloud file access, SearxNG metasearch, and a headless browser UI for deep-research sessions. Everything is wired together through `docker-compose.yml` so the stack can be brought up on a single host.

-*Last updated: 2025-09-10*
+_Last updated: 2025-10-03_

 ---

-## Summary :3
+## A (Few) Notes

-This repository wires together a local AI stack built around **Open WebUI**, **Ollama**, **SearxNG**, and two custom utilities: a **code runner** (executes model-generated code inside sandboxed containers) and a **headless research browser UI**. The current compose setup already gives you working RAG (retrieval-augmented generation) **inside Open WebUI** without needing a separate RAG service.
+1. Ports are currently exposed on most services for development purposes (e.g. 12253 for the scheduler). Remove these in production or put a proxy in front of them
+
+2. **ALL DATA IS STORED IN VOLUMES!!!** This means that if you run `docker compose down -v`, your data **WILL** disappear. Consider mounting a persistent host directory to avoid this
+
+3. Before starting the cluster, check which components you actually need (e.g. Nextcloud Tool Server). They are set to restart on failure and will throw if env vars/credentials are missing, so an unconfigured service will restart in an endless loop
+
+4. If you do not use cloudflared for tunneling, adjust the CORS policies accordingly, and consider adding a reverse proxy either on your local machine or in the compose file
+
+5. The code runner and scheduler both mount the host Docker socket. Ensure the host user/group IDs match the compose configuration (`DOCKER_GID` build arg defaults to 977) so containers can operate without root. This will be replaced when I eventually migrate this to a Kubernetes cluster
+
+6. When adjusting `NEXTCLOUD_ACCESS_DIRS`, remember to restart `ollama-nextcloud` so the regex list is reloaded

 ---

-## Repo map and how each piece fits
+## Stack At A Glance
+
+| Compose service | Directory / build context | External ports | Primary role |
+|-----------------|---------------------------|----------------|--------------|
+| `open-webui` | (image: `ghcr.io/open-webui/open-webui:main`) | `4000 -> 8080` | Chat UI, agent orchestration, embedded knowledge base & RAG powered by Postgres |
+| `postgres` | – | – | Persistence for Open WebUI (users, KB, events) |
+| `searxng` | `searxng.yml` | `4001 -> 8080` (debug only) | Private SearxNG instance used for live web search tools |
+| `coderunner` | `coderunner/` | – (internal `8787`) | Bun service that executes pure source code inside sandboxed Docker containers |
+| `openwebui_tools` | `tools/` | – (internal `1331`) | Python Roku remote API exposed as an OpenAPI tool server |
+| `browser` | `browser/` | `7788 -> 7788` | Playwright Chromium UI for autonomous browsing / research |
+| `schedules-api` | `scheduler/` | `12253 -> 12253` | Cron-style job scheduler that can open chats, call templates, and upload files |
+| `ollama-nextcloud` | `nextcloud/` | `13284 -> 1111` | Nextcloud WebDAV proxy with caching and access controls |
+
+Volumes declared in compose: `open-webui`, `pgdata`, `searxng_data`, `webui_data`, `schedule_data`, and `nextcloud_data`
+
+
+> [!CAUTION]
+> PLEASE I BEG OF YOU REMEMBER TO BACK THESE UP/USE A LOCAL DIRECTORY.
+> IF YOU DO NOT AND REMOVE OR PRUNE THE VOLUMES YOU WILL LOSE *ALL* DATA
+
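One way to act on the warning above is to archive each named volume into a host directory before running anything destructive. The sketch below only builds the `docker run` commands; it does not execute them. The `alpine` helper image, the `./backups` directory, and the `ml` project prefix are assumptions for illustration. Compose normally prefixes volume names with the project name, so check `docker volume ls` for the real names on your host.

```python
import shlex
from pathlib import Path

# Volume names as declared in docker-compose.yml.
VOLUMES = ["open-webui", "pgdata", "searxng_data", "webui_data", "schedule_data", "nextcloud_data"]


def backup_command(volume: str, backup_dir: str, project: str = "") -> list[str]:
    """Build a `docker run` argv that tars one named volume into backup_dir."""
    # Assumption: Compose prefixed the volume with the project name (e.g. "ml_pgdata").
    full_name = f"{project}_{volume}" if project else volume
    target = Path(backup_dir).resolve()
    return [
        "docker", "run", "--rm",
        "-v", f"{full_name}:/volume:ro",   # mount the volume read-only
        "-v", f"{target}:/backup",         # host directory that receives the archive
        "alpine",
        "tar", "czf", f"/backup/{volume}.tar.gz", "-C", "/volume", ".",
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for name in VOLUMES:
        print(shlex.join(backup_command(name, "./backups", project="ml")))
```

Printing the commands instead of running them keeps the sketch side-effect free; pipe the output into a shell once the volume names have been verified.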
+---
+
+## Service Details
+
+### Open WebUI (`open-webui`)
+- Runs the latest `ghcr.io/open-webui/open-webui:main` image with Postgres backing for durable data (`open-webui` and `pgdata` volumes)
+
+- `.env` enables the login form, optional API keys (not currently used), and forwards identifying headers so downstream tools know which user initiated a request
+
+- Depends on the tool containers (`openwebui_tools`, `coderunner`, `schedules-api`, `ollama-nextcloud`) via internal networking; discover their OpenAPI docs from inside the UI to register tools
+
+### Postgres (`postgres`)
+> [!IMPORTANT]
+> If you plan on exposing ports on this service, please move the inline credentials to the `.env` file
+
+- Standard `postgres:latest` image. Credentials are set inline in compose for local development
+
+- Health-checked with `pg_isready`; the data volume `pgdata` stores Open WebUI metadata
+
+### SearxNG (`searxng`)
+- Private SearxNG deployment for agent web search tasks with HTML/JSON outputs enabled
+
+- Mounts `searxng.yml` and persists internal data to `searxng_data`. External port 4001 is exposed only for local debugging and should be removed in production
+
+### Code Runner (`coderunner`)
+- Bun-based HTTP server that accepts pure source code plus optional extra files, then runs the workload in a throwaway Docker container pinned to an allow-listed base image per language
+
+- Enforces strict limits (`--network=none`, read-only root FS, tmpfs workdir, CPU/memory caps, dropped capabilities). Supported languages:
+	- `python`
+	- `node`
+	- `bun`
+	- `bash`
+	- `ruby`
+	- `go`
+	- `rust`
+	- `java`
+	- `c`
+	- `cpp`
+- Exposes `GET /openapi.json` and `POST /execute` inside the internal network (`http://coderunner:8787`). Requires the host Docker socket to spawn child sandboxes; the compose file mounts it read-only with matching group ID.
+
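The limits listed above map onto standard `docker run` flags. The following is a minimal sketch of how such a sandbox command could be assembled, not the code runner's actual implementation: the image tags, the CPU/memory values, the tmpfs size, and the `/work` path are assumptions chosen for illustration.

```python
# Hypothetical allow-list (language -> base image); the real mapping lives in coderunner/.
ALLOWED_IMAGES = {
    "python": "python:3.12-slim",
    "node": "node:22-slim",
    "bash": "bash:5",
}


def sandbox_argv(language: str, command: list[str]) -> list[str]:
    """Build a locked-down `docker run` argv for one throwaway sandbox."""
    if language not in ALLOWED_IMAGES:
        raise ValueError(f"language not allow-listed: {language}")
    return [
        "docker", "run", "--rm",
        "--network=none",                       # no network access
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/work:rw,size=64m",         # in-memory scratch directory
        "--workdir", "/work",
        "--cpus", "1", "--memory", "256m",      # CPU / memory caps (assumed values)
        "--cap-drop=ALL",                       # drop every Linux capability
        "--security-opt", "no-new-privileges",  # block privilege escalation
        ALLOWED_IMAGES[language],
        *command,
    ]
```

Rejecting unknown languages before any command is built keeps the allow-list as the single gate for which images can ever be started.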
+### Roku Tool Server (`openwebui_tools`)
+- Lightweight Python HTTP server that proxies Roku remote commands
+
+- Reads `ROKU_IP` from `.env`; returns helpful errors when the IP is missing or the device is offline
+
+- Serves `GET /roku/openapi.json` for automatic tool registration and handles `GET /roku/{command}` requests. Supported command list matches the enum in `spec/roku.openapi.json` (navigation, inputs, power, volume, remote finder)
+
+### Browser Research UI (`browser`)
+- Builds the upstream `browser-use/web-ui` project, installs Chromium plus dependencies, and launches the UI on port 7788
+
+- Runs as an unprivileged user (uid 1000) with dedicated tmpfs directories and a `webui_data` volume for persisted history/state
+
+- Configure resolution, telemetry, and default LLM via `browser/.env` or container environment variables
+
+- The browser-use docs can be found at https://docs.browser-use.com/
+
+### Scheduler API (`schedules-api`)
+- Bun/Node cron worker that lets you schedule Open WebUI chats or template-driven jobs using authenticated user tokens
+
+- Persists schedule definitions to `schedule_data` (JSON payload) and can store uploaded supporting files under the same volume
+
+- Reads workflow templates from the bundled `scheduler/templates.json`. To inject custom templates, mount a host file or populate the root-level `templates.json/` directory and update the compose volume mapping
+
+- Key endpoints (documented in `scheduler/openapi.json`):
+  - `GET /openapi.json`: tool contract.
+  - `POST /api/schedules`: create or replace a schedule (cron or one-shot ISO timestamp). Validates feature flags, attachments, and template references
+  - `GET /api/schedules`: list schedules scoped to the calling user (identified via Open WebUI bearer token)
+  - `DELETE /api/schedules/{name}`: remove a schedule the user owns
+- Includes a static UI in `scheduler/public/` for manual interaction. Uses `node-cron` to avoid overlapping executions; failed jobs clean themselves up
+
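`POST /api/schedules` accepts either a cron expression or a one-shot ISO timestamp. Here is a rough sketch of how a client could tell the two apart before submitting a schedule. The function name and the five-field cron assumption are illustrative and are not taken from `scheduler/openapi.json`.

```python
from datetime import datetime


def classify_schedule(when: str) -> str:
    """Return "once" for an ISO timestamp, "cron" for a five-field cron expression."""
    text = when.strip()
    try:
        # Accepts values such as "2025-10-03T09:38:46-04:00".
        datetime.fromisoformat(text)
        return "once"
    except ValueError:
        pass
    # Assumption: classic five-field cron (minute hour day-of-month month day-of-week).
    if len(text.split()) == 5:
        return "cron"
    raise ValueError(f"neither a cron expression nor an ISO timestamp: {when!r}")
```

The check is intentionally shallow: it does not validate the individual cron fields, which the scheduler is expected to do on its side.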
+### Nextcloud Files Tool (`ollama-nextcloud`)
+- Express + WebDAV proxy that exposes a simple JSON API for browsing, downloading, and uploading files stored in Nextcloud
+
+- Environment variables (configured in `.env`):
+  - `NEXTCLOUD_APP_ID` / `NEXTCLOUD_APP_PASS` / `NEXTCLOUD_WEBDAV_ADDR`: service credentials
+  - `NEXTCLOUD_ACCESS_DIRS`: JSON array of regex strings that whitelist readable paths (e.g. `["^/Notes", "^/School"]`). When unset, the tool has full access
+
+- Cached downloads are stored under `/tmp` using an embedded SQLite index (`cache.ts`). The server keeps ETags in sync and reuses cached bytes when possible unless `bypasscache` is requested
+
+- Major endpoints (see `nextcloud/openapi.json`):
+  - `GET /openapi.json`: discovery document for tool registration.
+  - `POST /file`: fetch a file. Automatically caches and returns metadata + content-type.
+  - `POST /dir`: list directory contents (shallow or recursive).
+  - `PUT /file`: upload via multipart form-data (optional recursive dir creation, never overwrites existing files).
+
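To make the `NEXTCLOUD_ACCESS_DIRS` behaviour concrete, here is a small sketch of an allow-list check: parse the JSON array, compile each regex, and treat an unset value as unrestricted. It illustrates the idea in Python and is not the tool's own (Express) code. The function names are hypothetical.

```python
import json
import re


def load_access_patterns(raw):
    """Parse NEXTCLOUD_ACCESS_DIRS (JSON array of regex strings); None means unrestricted."""
    if not raw:
        return None
    patterns = json.loads(raw)
    if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
        raise ValueError("NEXTCLOUD_ACCESS_DIRS must be a JSON array of strings")
    return [re.compile(p) for p in patterns]


def is_allowed(path, patterns):
    """A path is readable when no allow-list is set or when any pattern matches it."""
    if patterns is None:
        return True
    return any(p.search(path) for p in patterns)
```

Note that a pattern such as `^/Notes` also matches `/NotesArchive`; a tighter pattern like `^/Notes(/|$)` restricts access to that directory and its children only.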
+### Cloudflared Tunnel Config
+- `cloudflared-tunnel-config.yml` maps friendly hostnames to the local services (Ollama, Open WebUI, tool servers). Use it as a blueprint when exposing the stack through Cloudflare Tunnels.
+
+---
+
+## Configuration (`.env`)
+
+```env
+ROKU_IP=
+
+WEBUI_URL=
+
+# use built-in login form (username/password)
+ENABLE_LOGIN_FORM="true"
+
+# forward identity on outbound model requests (if you're going to use OpenAI or another external LLM)
+ENABLE_FORWARD_USER_INFO_HEADERS="true"
+
+# allow user API keys so the scheduler can call OWUI's API
+ENABLE_API_KEY_AUTH="true"
+
+NEXTCLOUD_APP_ID=
+NEXTCLOUD_APP_PASS=
+NEXTCLOUD_WEBDAV_ADDR=
+NEXTCLOUD_ACCESS_DIRS=
+```
+
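Because services that lack credentials restart in a loop, it can help to check `.env` before bringing the stack up. The sketch below parses simple `KEY=value` lines and reports empty or missing keys. Which keys count as required is an assumption here; trim the list to the services you actually run.

```python
# Assumed "required" keys, taken from the sample .env above (Roku + Nextcloud services).
REQUIRED_KEYS = ["ROKU_IP", "NEXTCLOUD_APP_ID", "NEXTCLOUD_APP_PASS", "NEXTCLOUD_WEBDAV_ADDR"]


def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(open(".env").read())` returns an empty list once every listed key has a value.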
+---
+
+## Running the Stack
+
+1. Install Docker and Docker Compose
+
+2. Populate `.env` with the correct Roku and Nextcloud settings plus any Open WebUI options
+
+3. Build images (pull base layers and bake GID overrides where needed):
+   ```sh
+   docker compose build --pull
+   ```
+
+4. Launch everything:
+   ```sh
+   docker compose up -d
+   ```
+
+5. Open WebUI is available at http://localhost:4000 (use the credentials from the UI setup). The supporting services are reachable on the ports listed above or through the internal Docker network
+
+To inspect logs for a specific service:

 ```sh
-.
-├─ docker-compose.yml
-├─ searxng.yml                  # searxng settings; defaults, json+html enabled; not a public instance
-├─ cloudflared-tunnel-config.yml # cloudflare tunnel routing to ollama, openwebui, and tools
-├─ README.md
-├─ LICENSE                      # apache-2.0
-│
-├─ rag-server/
-│  ├─ Dockerfile                # Runs the file that does the RAG stuff
-│  └─ index.tsx                 # Does the RAG stuff
-│
-├─ browser/
-│  └─ Dockerfile                # builds browser-use/web-ui (playwright chromium) on :7788
-|
-└─ coderunner/
-   ├─ Dockerfile                # bun-based service that exposes an OpenAPI tool for sandboxed code exec
-   ├─ index.ts                  # the server; integrates with Open WebUI as a tool via /openapi.json
-   └─ package.json              # @types/node only (dev) to feed the OCD
+docker compose logs -f coderunner
 ```

-### Open WebUI (in `docker-compose.yml`)
+Bring the stack down (volumes persist):

-* purpose: chat UI + orchestration layer; **includes a built-in knowledge base + RAG** with chunking, embedding, search, and prompt templating.
-* notable: backed by Postgres in this compose. exposes `4000:8080`.
-* storage: a docker volume `open-webui:` holds app data; Postgres uses `pgdata:`.
-
-### Postgres (in `docker-compose.yml`)
-
-* purpose: persistence for Open WebUI features (users, knowledge, etc.). health-checked with `pg_isready`.
-
-### SearxNG (in `docker-compose.yml` + `searxng.yml`)
-
-* purpose: metasearch engine used by Open WebUI tools/agents for live web lookups.
-* config highlights: `use_default_settings: true`, `public_instance: false`, `limiter: false`; formats: `html` and `json`.
-
-### Coderunner service (`coderunner/`)
-
-* **what it is:** a small HTTP server (Bun runtime) that executes pure source code in short-lived, sandboxed containers.
-* **why it exists:** lets Open WebUI tools run code safely with tight resource limits (no network, read-only fs, cgroup limits, `--cap-drop=ALL`, `no-new-privileges`).
-* **integration contract:** exposes an **OpenAPI schema at `/openapi.json`** and a single POST `/execute` endpoint. Open WebUI can import this as a **tool server**.
-* **security posture:** pulls allow-listed base images (gcc, python, node, bun, etc.), mounts only a tmpfs workdir, times out jobs ≈25s, and runs with non-root uid/gid. The container has access to the host’s docker socket *only* to run the sandbox containers.
-
-### Browser-use web-ui (`browser/`)
-
-* purpose: “autonomous” research browser UI (chromium via playwright), reachable on `:7788`.
-* built from upstream `browser-use/web-ui` repo, with python deps and browsers installed in the image.
-
-### Cloudflared tunnel (`cloudflared-tunnel-config.yml`)
-
-* maps hostnames (like `mlep.domain.com` for Ollama, `owebui.domain.com` for Open WebUI, and a `tools` host) to the internal services. Useful for private, authenticated access without public inbound ports.
-
---
-
-## Why I currently **don’t** use an external RAG server
-
-Open WebUI ships with pretty good **knowledge / RAG** support: add files/URLs, it chunks + embeds, indexes, retrieves, and automatically **prefixes retrieved context** to the model prompt using a RAG template. For lightweight to mid-sized corpora and single-user/small-team usage, that’s often all you need.
-
-**Stay with built-in RAG if most of these are true:**
-
-* total corpus is ≤ \~100k chunks and grows slowly.
-* single user or small team (no multi-tenant isolation needed).
-* no special retrieval logic (hybrid lexical+semantic, rerankers, metadata filters) beyond what Open WebUI provides.
-* tolerance for “UI-managed” knowledge; you don’t need programmatic ingestion pipelines or job queues.
-
-## When an external RAG server makes sense
-
-Adopt a decoupled RAG service when you need one or more of:
-
-* **bigger data / throughput**: millions of chunks, higher QPS, horizontal scaling.
-* **advanced retrieval**: custom chunkers, hybrid search (bm25 + vector), **reranking**, time-decay, per-tenant filters, embeddings A/B, or multi-modal (image/audio) retrieval.
-* **programmatic ingestion**: CI-driven pipelines from git/docs/confluence/S3; delta updates; background jobs.
-* **governance / isolation**: strict multi-tenant separation, PII retention controls, audit trails.
-* **interoperability**: a clean HTTP API and OpenAPI so other apps (beyond Open WebUI) can reuse your index.
-
---
-
-## External RAG Server — Design and Reference Implementation
-
-This is a small, dependency-light service designed to run with **Bun** and integrate with both **Ollama** and **Open WebUI**.
-
-### Goals
-
-* minimal moving parts; runs fine on a single host.
-* uses Ollama for **embeddings** and **chat**.
-* supports **collections**, **upserts**, **queries**, and an opinionated `/chat` that does retrieve-then-generate.
-* ships an **OpenAPI** so Open WebUI can import it as a tool server.
-* default in-memory store (persisted to JSON) for simplicity; optional adapters for vector DBs later.
-
-### API surface
-
-* `GET /openapi.json` – schema for tool integration.
-* `POST /collections` – create a logical collection `{ name }`.
-* `GET /collections` – list collections.
-* `POST /upsert` – `{ collection, items:[{ id?, text, metadata? }] }`; chunks+embeds text and stores vectors.
-* `POST /query` – `{ collection, query, topK?=5, where? }` --> nearest chunks with scores.
-* `POST /chat` – `{ collection, query, topK?=5, model?, embedModel? }` --> runs RAG and calls Ollama chat, returns the answer + citations.
-
-### Storage Strategy
-
-* **default:** in-memory + JSON file on disk (`./data/rag.json`). good for dev/small usage.
-* **plug-in adapters:** swap in Qdrant, SQLite-Vec, pgvector, Weaviate, etc., without changing the HTTP API.
-
---
-
-### Add to `docker-compose.yml`
-
-```yaml
-  rag:
-    build:
-      context: ./rag-server
-      dockerfile: Dockerfile
-    environment:
-      OLLAMA_BASE: "http://mlep.domain.com:11434"
-      OLLAMA_CHAT_MODEL: "llama3.1"
-      OLLAMA_EMBED_MODEL: "nomic-embed-text"
-    volumes:
-      - rag_data:/app/data
-    networks:
-      - internal
-    restart: unless-stopped
-
-volumes:
-  rag_data:
+```sh
+docker compose down
 ```

-> if you already expose services via cloudflared, add another hostname mapping to the `rag` container (`- hostname: rag.domain.com -> service: http://rag:8788`).
+---
+
+## Registering Tool Servers in Open WebUI
+
+Inside Open WebUI (Settings --> Tools --> Add tool server), point to the internal URLs:
+- Code runner: `http://coderunner:8787/openapi.json`
+- Scheduler: `http://schedules-api:12253/openapi.json`
+- Nextcloud files: `http://ollama-nextcloud:1111/openapi.json`
+- Roku remote: `http://openwebui_tools:1331/roku/openapi.json`
+
+These URLs should stay fully internal to the Docker network. If you expose them, consider putting them behind a reverse proxy with authentication
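
Before registering the URLs above, it can be useful to confirm that each one actually serves an OpenAPI document. The following is a hedged sketch: the hostnames only resolve from a container attached to the same Docker network, the `fetch` parameter is a hypothetical hook for testing, and checking for a top-level `openapi` key is an assumption about the documents these servers return.

```python
import json
from urllib.request import urlopen

# Internal discovery URLs as listed in this README.
TOOL_SERVERS = {
    "Code runner": "http://coderunner:8787/openapi.json",
    "Scheduler": "http://schedules-api:12253/openapi.json",
    "Nextcloud files": "http://ollama-nextcloud:1111/openapi.json",
    "Roku remote": "http://openwebui_tools:1331/roku/openapi.json",
}


def fetch_json(url, timeout=5):
    """Download and decode a JSON document."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def check_tool_servers(servers=TOOL_SERVERS, fetch=fetch_json):
    """Return {name: bool}: True when the URL serves something that looks like OpenAPI."""
    results = {}
    for name, url in servers.items():
        try:
            document = fetch(url)
            results[name] = isinstance(document, dict) and "openapi" in document
        except (OSError, ValueError):
            # Connection failures are OSError subclasses; invalid JSON raises ValueError.
            results[name] = False
    return results
```

Any entry that comes back `False` is worth inspecting with `docker compose logs -f <service>` before adding it in the UI.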

 ---

-## Wiring the RAG server into Open WebUI and Ollama
+## Data, Volumes, and Shared Paths

-### 1. Pull models
+- `open-webui` volume: Open WebUI application state (uploads, knowledge base, configs)
+- `pgdata` volume: Postgres cluster data directory
+- `searxng_data` volume: SearxNG runtime files
+- `webui_data` volume: browser-use web UI session data
+- `schedule_data` volume: scheduler persisted schedules and stored file attachments
+- `nextcloud_data` volume: temp storage for cached Nextcloud content

-* `ollama pull nomic-embed-text` (embeddings)
-* `ollama pull llama3.1` (chat)
-
-### 2. Expose the OpenAPI to Open WebUI as a **tool server**
-
-* in Open WebUI --> **settings --> tools** --> **add tool server**
-* paste the url for the cloudflared hostname
-* you’ll now see tool functions like `listCollections`, `createCollection`, `upsert`, `query`, `chat` available to the assistant
-
-### 3. Usage pattern inside a chat
-
-* to build a knowledge base, call the `createCollection` and `upsert` tools with your documents
-* to answer, call `chat` which performs retrieve-then-generate against your chosen collection
-
---
-
-## FAQ — Built-in vs. External RAG
-
-**Q: will Open WebUI’s built-in RAG conflict with this server?**
-no — you can use either, or both. Open WebUI’s knowledge base is great for ad-hoc use. this service is for programmatic/control-plane needs or when you outgrow the UI’s storage/retrieval.
-
-**Q: how do enforce tenant isolation?**
-use one collection per tenant and never mix. for stronger guarantees, run separate RAG instances or choose Qdrant with per-collection access control.
-
-**Q: how can use my chunker/reranker?**
-yes. place them ahead of `/upsert` and `/query` respectively, or add endpoints like `/rerank` and `/embed` to experiment.
-
-**Q: can this call OpenAI-compatible endpoints instead of native Ollama?**
-Ollama exposes an experimental OpenAI-compatible API. you can add a thin client if you already point tools at `/v1/chat/completions`.
+> [!IMPORTANT]
+> Back up the volumes you care about before upgrading images

 ---

 ## License

-This write-up and reference code are provided under the same **Apache-2.0** terms as the repository.
+The repository and reference code are released under Apache-2.0 (see `LICENSE`).
+