Deployment (Single VM)
The production Resolve deployment runs on a single Standard_D8s_v5 Azure VM (8 vCPUs, 32 GiB RAM). The backend, frontend, Incus container runtime, and all worker containers live on this one host. This document describes how the stack is wired together and how to operate it.
Stack overview
Service topology
The stack runs as systemd units, all managed via systemctl.
| Service | Command | Port | Notes |
|---|---|---|---|
| amplifier-resolve-backend.service | amplifier-resolve serve | :10120 (loopback) | FastAPI backend |
| amplifier-resolve-frontend.service | Node/Caddy static serve | :3000 (or via Caddy) | React SPA build output |
| caddy.service | Caddy reverse proxy | :443, :80 | TLS termination + routing |
| incus.service | Incus container runtime | n/a | Must be running for worker containers |
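The per-service status checks can be folded into a single pass. The sketch below is a minimal, hypothetical helper (report_inactive is not part of Resolve) that pairs unit names with the states printed by systemctl is-active, one per line, and prints only the units that are not active. Reading the states from stdin is a design choice that lets the function be exercised away from the VM.

```shell
# Hypothetical helper: print every unit whose reported state is not "active".
# Usage: systemctl is-active UNIT... | report_inactive UNIT...
# Assumes one state per line on stdin, in the same order as the arguments.
report_inactive() {
    for unit in "$@"; do
        # If stdin runs out of lines, treat the unit's state as "unknown".
        IFS= read -r state || state="unknown"
        if [ "$state" != "active" ]; then
            printf '%s: %s\n' "$unit" "$state"
        fi
    done
}
```

On the VM this could be fed with `systemctl is-active amplifier-resolve-backend caddy incus | report_inactive amplifier-resolve-backend caddy incus`; no output means every listed unit reported active.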
    # Check the status of all services
    systemctl status amplifier-resolve-backend
    systemctl status caddy
    systemctl status incus

    # Restart the backend
    systemctl restart amplifier-resolve-backend

    # Follow backend logs
    journalctl -u amplifier-resolve-backend -f

    # Follow Caddy logs
    journalctl -u caddy -f

Environment configuration
The backend reads its environment from /root/.amplifier/resolve/env. This file is
the canonical source for feature flags, API keys, and infrastructure settings.
    # View the current env
    cat /root/.amplifier/resolve/env

    # Edit, then restart the backend to apply
    nano /root/.amplifier/resolve/env
    systemctl restart amplifier-resolve-backend

Key environment variables
| Variable | Required | Description |
|---|---|---|
| AMPLIFIER_RESOLVE_ENABLE_REALITY_CHECK | For RC | true to enable the reality check (RC) capability |
| AMPLIFIER_RESOLVE_REALITY_CHECK_RUNNER_IMAGE | For RC | Incus image name for the RC runner |
| AMPLIFIER_RESOLVE_DATA_DIR | No | Override the data root (default: ~/.amplifier/resolve/) |
| ANTHROPIC_API_KEY | Yes (for LLM) | Primary LLM provider |
| OPENAI_API_KEY | Alt | Alternative LLM provider |
| GH_TOKEN | Yes | GitHub access for workspace repos + PR creation |
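Before restarting the backend after an edit, it can help to confirm that the variables marked as required are actually set. The following is a minimal sketch, assuming the env file holds one KEY=value pair per line (the source does not state the file format); missing_required_vars is a hypothetical helper, not part of Resolve.

```shell
# Hypothetical helper: list required variables that are absent or have an
# empty value in an env file.
# Assumes one KEY=value per line; "export KEY=value" lines are not handled,
# and a quoted empty value such as KEY="" is treated as set.
missing_required_vars() {
    env_file=$1
    shift
    for name in "$@"; do
        # Match "NAME=" followed by at least one character.
        if ! grep -Eq "^${name}=.+" "$env_file"; then
            printf '%s\n' "$name"
        fi
    done
}
```

For example, `missing_required_vars /root/.amplifier/resolve/env ANTHROPIC_API_KEY GH_TOKEN` prints nothing when both variables are present and non-empty.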
Auth token
A bearer token is auto-generated at first startup and stored at:

    ~/.amplifier/resolve/token

    # Get your token
    cat ~/.amplifier/resolve/token

    # Verify it works
    curl -H "Authorization: Bearer $(cat ~/.amplifier/resolve/token)" \
      http://localhost:10120/api/health

Disk management
Data locations
| Path | Content | Disk |
|---|---|---|
| ~/.amplifier/resolve/instances/ | Instance event logs, state, artifacts | OS disk |
| ~/.amplifier/resolve/token | Auth token | OS disk |
| /workspace/ | Azure Files NFS mount (application state) | Azure Files |
| /var/lib/incus/ | Incus container storage | OS disk |
| /var/log/ | System logs | OS disk |
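Because instance data sits on the OS disk, it is worth previewing what an age-based cleanup would remove before deleting anything. The sketch below is a hypothetical dry-run helper (list_stale_instances is not part of Resolve). It assumes directory modification time is an acceptable proxy for "no longer needed", which the source does not confirm, and it relies on the -mindepth and -maxdepth options available in GNU find.

```shell
# Hypothetical dry-run helper: print instance directories whose mtime is
# more than DAYS days old, without deleting anything.
# -mindepth 1 keeps the instances root itself out of the results.
list_stale_instances() {
    root=$1
    days=${2:-7}
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}
```

Running `list_stale_instances ~/.amplifier/resolve/instances 7` lists the candidates; the output can be reviewed by hand before any removal is attempted.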
Reclaim space
    # Check disk usage
    df -h /

    # Find large directories
    du -sh ~/.amplifier/resolve/instances/* | sort -rh | head -20

    # Remove instance data older than 7 days (matches on directory mtime,
    # not completion status; -mindepth 1 protects the instances/ root itself)
    find ~/.amplifier/resolve/instances/ -mindepth 1 -maxdepth 1 -type d -mtime +7 \
      -exec rm -rf {} +

    # Clean Incus images (list, then delete unused ones by fingerprint)
    incus image list
    incus image delete <fingerprint>

    # Clean stopped containers (for completed/stuck instances)
    incus list
    incus delete resolve-{id}

    # Check journal disk usage
    journalctl --disk-usage

    # Vacuum the journal to the last 7 days
    journalctl --vacuum-time=7d

Incus operations
    # List all containers (running + stopped)
    incus list

    # Check container resource usage
    incus info resolve-{id}

    # Connect to a running worker container
    incus exec resolve-{id} -- /bin/bash

    # Read instance events directly
    incus exec resolve-{id} -- tail -f /project/.resolve/events.jsonl

    # Stop and remove a stuck container
    incus stop resolve-{id} --force
    incus delete resolve-{id}

DNS and TLS
The resolve.amplifier.ms CNAME record resolves to the VM’s public IP. Caddy obtains and renews
TLS certificates automatically from Let’s Encrypt via ACME.
    # Validate the Caddy config, then confirm the site responds over HTTPS
    caddy validate --config /etc/caddy/Caddyfile
    curl -I https://resolve.amplifier.ms/api/health

Worker cache image
The cache image (amplifier-cache:python) reduces worker startup from ~5 minutes to
~15 seconds by pre-baking common Python dependencies into an Incus image.
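Whether workers take the fast or slow startup path depends only on the cache image being present. A minimal sketch of that decision follows; startup_path is a hypothetical helper that reads image-list text on stdin and matches the same amplifier-cache substring used by the grep check in this section.

```shell
# Hypothetical helper: report which startup path workers would take,
# given image-list text on stdin.
# "warm" means the cache image name appears in the listing.
startup_path() {
    if grep -q 'amplifier-cache'; then
        echo "warm (cache image present)"
    else
        echo "cold (cache image missing)"
    fi
}
```

On the VM it could be used as `incus image list | startup_path`.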
    # Build the cache image (run once, or after major dependency changes)
    cd ~/amplifier-resolve
    bash scripts/build-cache-image.sh

    # Verify the image exists
    incus image list | grep amplifier-cache

    # If the cache image is missing, workers use the cold path (~5 min startup)

Updating the platform
    # Pull latest backend
    cd ~/amplifier-resolve
    git pull

    # Restart backend (picks up new code)
    systemctl restart amplifier-resolve-backend

    # Pull and rebuild frontend
    cd ~/amplifier-app-resolve
    git pull
    npm install
    npm run build

    # Check status
    systemctl status amplifier-resolve-backend
    curl https://resolve.amplifier.ms/api/health

Operational health checks
    #!/usr/bin/env bash
    # Quick health check script

    BASE="https://resolve.amplifier.ms"
    TOKEN=$(cat /root/.amplifier/resolve/token)

    echo "=== Service Status ==="
    systemctl is-active amplifier-resolve-backend caddy incus

    echo ""
    echo "=== API Health ==="
    curl -s "$BASE/api/health" | jq .

    echo ""
    echo "=== Auth Status ==="
    curl -s "$BASE/api/auth/status" | jq .

    echo ""
    echo "=== Active Instances ==="
    curl -s -H "Authorization: Bearer $TOKEN" "$BASE/api/instances?status=running" | jq 'length'

    echo ""
    echo "=== Disk Usage ==="
    df -h / | tail -1

    echo ""
    echo "=== Container Count ==="
    incus list --format csv | wc -l

Logs and debugging
    # Backend logs (last 100 lines)
    journalctl -u amplifier-resolve-backend -n 100

    # Backend logs with timestamps, follow
    journalctl -u amplifier-resolve-backend -f --output=short-iso

    # Look for errors in the last hour
    journalctl -u amplifier-resolve-backend --since "1 hour ago" | grep -i error

    # Caddy access logs
    journalctl -u caddy -f

    # Instance-specific events (bypass API)
    tail -f ~/.amplifier/resolve/instances/{instance_id}/events/events.jsonl | jq .
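For a quick signal rather than a full log read, the error search can be reduced to a count. This is a minimal sketch with a hypothetical helper, count_errors, that reads log text on stdin; like the grep in this section, it matches the word "error" case-insensitively and so may also count benign lines that merely mention it.

```shell
# Hypothetical helper: count log lines on stdin that mention "error"
# (case-insensitive). grep -c prints 0 but exits non-zero when nothing
# matches, so "|| true" keeps the function from reporting failure.
count_errors() {
    grep -ci 'error' || true
}
```

For example, `journalctl -u amplifier-resolve-backend --since "1 hour ago" | count_errors` prints the number of matching lines from the last hour.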