# The Lieutenant

*Last reviewed May 2026*

The Lieutenant is the codename for the host underneath my local-first AI architecture. **Murray** is the name for the operation that runs on top of it — the persona that receives requests, routes them, and answers. The Lieutenant is the hardware. Murray is the system you actually talk to.

The name evolved. Originally Murray was just the receptionist: a small local model that received every prompt and decided whether it should run locally or go out to a frontier API. As the operation grew — orchestration, tool use, the surveillance and storage layers — the receptionist's name spread to cover the whole thing. The receptionist is still Murray; now everything else is too.

## The Premise

Most consumer and small-team AI deployments default to cloud APIs. That default has costs that compound: ongoing fees, data egress to providers whose retention policies change, latency on tasks that should be local, and a structural dependence on external availability for routine operations.

Local-first AI sovereignty is the discipline of running the operations that should be local — locally — and routing only what genuinely benefits from frontier models out to APIs. Murray is the working implementation. The Lieutenant is what it runs on.

## Architecture

The architecture is described here at the capability layer rather than the product layer, so the description survives the inevitable tool churn underneath.

**The hypervisor host.** A Type-1 hypervisor running on enterprise-grade workstation hardware with ECC memory. Hosts the isolated VMs that handle each operating role. Sized for sustained, unattended operation rather than peak benchmark performance.

**Bulk storage with redundancy.** A storage VM with passthrough access to enterprise drives configured for fault tolerance. Loss of a single drive does not interrupt the operation. The storage layer is the foundation everything else depends on, so it is built for reliability before performance.

**Object detection pipeline.** A surveillance VM running object detection against the camera array, operating unattended, treated as a critical-uptime service.

**Inference and orchestration.** A VM with a passed-through GPU, running local model serving, agentic orchestration, a conversational interface, and tool-use integrations. This is where Murray lives.

**Control plane.** A separate, low-power machine handles routine administration, scheduled jobs, and the operator-facing dashboards. It is the layer that makes the rest of the architecture pleasant to operate day to day.

**Networking and remote access.** A hardened network with secured remote access. The architecture is designed so that the operator can administer it from anywhere without exposing it to the public internet.

**Cold offsite.** Backups land on a cold-tier offsite provider. Recovery from total premises loss is rehearsed; it is not theoretical.

## Operating Philosophy

**Hierarchical inference.** Murray's receptionist function classifies every prompt. Simple or sensitive queries execute locally. Complex or creative queries route to frontier APIs. Most volume stays local; cost stays bounded; sensitive content never leaves the premises.
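
The routing discipline can be sketched in a few lines. This is a minimal, hypothetical illustration of the receptionist tier, not Murray's actual classifier; the marker sets, thresholds, and function names are all invented for the example.

```python
# Hypothetical sketch of a receptionist-tier router: classify each prompt,
# keep it local by default, and escalate only what benefits from a frontier
# model. All markers and thresholds here are illustrative, not Murray's.

SENSITIVE_MARKERS = {"password", "address", "medical"}   # illustrative only
COMPLEX_MARKERS = {"essay", "design", "refactor"}        # illustrative only

def classify(prompt: str) -> str:
    """Return 'local' or 'frontier' for a single prompt."""
    words = set(prompt.lower().split())
    if words & SENSITIVE_MARKERS:
        return "local"       # sensitive content never leaves the premises
    if words & COMPLEX_MARKERS or len(prompt) > 500:
        return "frontier"    # complex or creative work goes out to an API
    return "local"           # default: most volume stays local

def route(prompt: str) -> str:
    """Dispatch to the local model or a frontier API based on the tier."""
    if classify(prompt) == "local":
        return f"[local model] {prompt}"
    return f"[frontier API] {prompt}"
```

The key property is the order of the checks: sensitivity is tested before complexity, so a prompt that is both sensitive and complex still stays on premises.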

**Failure isolation.** Each VM is independently bootable, recoverable, and replaceable. Storage failure does not affect inference. Inference failure does not affect surveillance. The architecture treats blast radius as a first-class design concern.
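
The blast-radius principle can be illustrated with independent health probes: a failure in one check is recorded, never propagated to the others. This is a toy sketch with invented service names and stand-in probes, not the actual monitoring code.

```python
# Hypothetical sketch of contained blast radius: each service probe runs
# independently, so one VM failing never aborts the checks for the rest.
# Service names and probe bodies are illustrative stand-ins.

def check_storage() -> bool:
    return True   # stand-in for a real storage-pool health probe

def check_inference() -> bool:
    raise RuntimeError("GPU VM down")   # simulated failure for the example

def check_surveillance() -> bool:
    return True   # stand-in for a camera-pipeline probe

def run_checks(checks: dict) -> dict:
    """Run every probe; an exception in one is contained, not propagated."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = probe()
        except Exception:
            results[name] = False   # failure isolated to this service
    return results

status = run_checks({
    "storage": check_storage,
    "inference": check_inference,
    "surveillance": check_surveillance,
})
```

Here the simulated inference failure shows up as a single `False` in the results while storage and surveillance report healthy, which is the property the architecture is built around.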

**Lethal reliability.** The operating standard is industrial, not consumer. Unattended operation. Clean failure modes. Predictable maintenance windows. Where storage and surveillance are concerned, downtime is treated as the worst available outcome.

## Current Implementation

*Snapshot as of May 2026. The implementation evolves; the architecture above does not.*

- **Hypervisor host:** Threadripper PRO platform with ECC memory, running Proxmox VE
- **Storage:** TrueNAS Scale with ZFS RAIDZ2 on enterprise drives
- **Surveillance:** UniFi Protect, replacing the prior Frigate-on-Debian pipeline
- **Inference orchestration:** Hermes Agent for agentic workflows, with local model serving for the receptionist tier and frontier API routing for the complex tier
- **Control plane:** Apple Mac Mini M4 at the studio, with a Mac Studio M5 Ultra coming online for heavier orchestration
- **Network:** UniFi with Tailscale-secured remote access
- **Cold offsite:** Oracle Cloud Infrastructure

When the implementation changes — and it does — the change lands here. The architecture above does not move.

## Why It Exists

The Lieutenant is the practical answer to a strategic question: what should run locally, and what should leave the premises? Murray is the operating system that answers it day to day. Together they are the infrastructure on which the rest of my technical work depends.
