# Air-Gapped Deployment Guide Turnstone can run entirely without internet access. This guide covers pre-downloading all model weights, configuring offline mode, and verifying that no outbound connections are made at runtime. ## What requires network access by default | Component | When | What it downloads | |-----------|------|------------------| | Stage 2 ML classifier | First diagnose run (if `TURNSTONE_CLASSIFIER_MODEL` is set) | HuggingFace model weights (~300 MB) | | Stage 4 sentence-transformers embedder | First diagnose run (if `TURNSTONE_EMBED_BACKEND=sentence_transformers`) | Embedding model (~130 MB) | | LLM inference | Every diagnose run | Nothing — calls your configured `GPU_SERVER_URL` only | | Log glean | Every glean run | Nothing — reads local files or SSH sources | If neither the classifier nor the sentence-transformers embedder is enabled, Turnstone makes no outbound network calls at runtime (only local SQLite reads/writes and your configured LLM endpoint). ## Step 1 — Pre-download models (on an internet-connected machine) Run these commands in the `cf` conda environment before moving to the air-gapped host: ```bash # Stage 2 ML classifier (only needed if TURNSTONE_CLASSIFIER_MODEL is set) conda run -n cf python -c " from transformers import pipeline pipeline('text-classification', model='byviz/bylastic_classification_logs') print('classifier cached') " # Stage 4 sentence-transformers embedder (only if TURNSTONE_EMBED_BACKEND=sentence_transformers) conda run -n cf python -c " from sentence_transformers import SentenceTransformer SentenceTransformer('BAAI/bge-small-en-v1.5') print('embedder cached') " ``` Models are cached to `~/.cache/huggingface/`. Copy that directory to the air-gapped host at the same path before deployment. ## Step 2 — Pre-ingest your documentation corpus On the internet-connected machine, or before cutting the network: ```bash # Write your manifest (see scripts/manifests/example.yaml) # Then bulk-upload to the context DB: conda run -n cf python scripts/harvest_docs.py --manifest scripts/manifests/your-site.yaml ``` The context DB (`turnstone-context.db`) is a plain SQLite file — copy it to the air-gapped host alongside `turnstone.db`. ## Step 3 — Set offline environment variables Add to your `.env` file (copy from `.env.example`): ```bash # Block all HuggingFace hub network access TURNSTONE_OFFLINE_MODE=1 # Point models at the pre-downloaded cache (usually the default) # HF_HOME=/home/youruser/.cache/huggingface ``` `TURNSTONE_OFFLINE_MODE=1` sets both `HF_HUB_OFFLINE=1` and `TRANSFORMERS_OFFLINE=1` before any model library loads. If the cache is missing or incomplete, the classifier falls back to the pattern-tag / regex path and embedding is skipped — diagnose still works, just without ML-assisted severity or suppression. ## Step 4 — Configure a local LLM endpoint Turnstone's LLM reasoning calls your `GPU_SERVER_URL`. On an air-gapped host this must be a local endpoint — either Ollama or a local cf-orch coordinator: ```bash # Local Ollama GPU_SERVER_URL=http://localhost:11434 # Local cf-orch coordinator GPU_SERVER_URL=http://localhost:7700 ``` Pull the Ollama model before cutting network access: ```bash ollama pull llama3.1:8b ``` ## Step 5 — Verify no outbound connections at runtime Start Turnstone and run a diagnose query, then check for unexpected outbound connections: ```bash # Watch for any connection to HuggingFace, PyPI, or other external hosts ss -tp | grep python # or lsof -i -n -P | grep python | grep ESTABLISHED ``` Expected: only connections to your `GPU_SERVER_URL` and any SSH log sources. No connections to `huggingface.co`, `cdn-lfs.huggingface.co`, or `pypi.org`. ## Deployment checklist - [ ] `~/.cache/huggingface/` copied to air-gapped host (if using ML classifier or embedder) - [ ] `TURNSTONE_OFFLINE_MODE=1` set in `.env` - [ ] `GPU_SERVER_URL` points to a local inference endpoint - [ ] Ollama model pulled locally (if using Ollama) - [ ] Context DB pre-populated with runbooks via `harvest_docs.py` - [ ] No internet access verified with `ss -tp` during a diagnose run - [ ] `TURNSTONE_API_KEY` set if the host is accessible over the network (see API auth docs) ## Troubleshooting **"OSError: We couldn't connect to huggingface.co…"** The model is not in the local cache. Either download it on a connected machine and copy `~/.cache/huggingface/`, or unset `TURNSTONE_CLASSIFIER_MODEL` to fall back to the pattern-based classifier. **Diagnose still works but no ML severity in pipeline stages** Expected when running offline without a pre-cached model. Stage 2 falls back to `pattern_tags` → regex severity detection automatically. **LLM reasoning missing from diagnose output** Check that `GPU_SERVER_URL` is reachable from the air-gapped host and that your local Ollama/vLLM has the configured model pulled.