toallab-automation/docs/summaries/handoff-2026-03-29-openclaw-vm-refactor.md
Patrick Toal df1dd39197 docs: update claude setup
refactor: Move some things to roles
refactor: fix some linting
2026-04-12 14:02:12 -04:00

# Session Handoff: OpenClaw Deployment + VM Role Refactor
**Date:** 2026-03-29
**Session Focus:** Extract SNO VM creation into its own role; build new OpenClaw playbook with Signal channel and security stack
**Context Usage at Handoff:** ~60%
## What Was Accomplished
1. **Refactored SNO VM deployment into `proxmox_vm` role**: now lives in `roles/proxmox_vm/`
2. **Removed `create_vm.yml` from `sno_deploy` role**: `roles/sno_deploy/tasks/create_vm.yml` deleted
3. **Updated `deploy_openshift.yml` Play 1** to use `role: proxmox_vm` directly
4. **Created `roles/openclaw/`** — full role for OpenClaw installation and Signal channel
5. **Created `playbooks/deploy_openclaw.yml`** — 3-play pipeline: VM creation → SSH wait → install
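
The 3-play pipeline in item 5 can be pictured roughly as follows. This is a minimal sketch, not the actual contents of `playbooks/deploy_openclaw.yml`: the `proxmox_host` inventory name and the `timeout` value are assumptions, and the Play 1 `vars:` block is left as a comment because its contents are not reproduced in this handoff.

```yaml
# Sketch only: structure inferred from the "VM creation -> SSH wait -> install" summary.
- name: "Play 1: create the VM on the Proxmox host"
  hosts: proxmox_host            # hypothetical inventory name
  gather_facts: false
  # vars: VM spec variables live here (see DR-6); omitted in this sketch
  roles:
    - role: proxmox_vm

- name: "Play 2: wait for SSH on the new VM"
  hosts: openclaw.toal.ca        # currently hardcoded; see Open Questions
  gather_facts: false
  tasks:
    - name: Wait until the VM accepts SSH connections
      ansible.builtin.wait_for_connection:
        timeout: 300             # assumed value

- name: "Play 3: install OpenClaw"
  hosts: openclaw.toal.ca
  become: true
  roles:
    - role: openclaw
```

Keeping the SSH wait in its own play with `gather_facts: false` avoids a fact-gathering failure against a VM that has not finished booting.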
## Files Created or Modified
| File Path | Action | Description |
|-----------|--------|-------------|
| `roles/proxmox_vm/tasks/main.yml` | Created | VM creation tasks moved from sno_deploy/tasks/create_vm.yml |
| `roles/proxmox_vm/defaults/main.yml` | Created | Proxmox connection + VM spec defaults |
| `roles/proxmox_vm/meta/main.yml` | Created | Role metadata |
| `roles/sno_deploy/tasks/create_vm.yml` | Deleted | Moved to proxmox_vm role |
| `roles/sno_deploy/defaults/main.yml` | Modified | Removed `sno_pvc_disk_gb` (VM-only, now in proxmox_vm) |
| `roles/sno_deploy/meta/argument_specs.yml` | Modified | Removed VM-creation-only entries |
| `playbooks/deploy_openshift.yml` | Modified | Play 1 now uses `role: proxmox_vm` |
| `roles/openclaw/defaults/main.yml` | Created | Role-scoped defaults only (no proxmox vars) |
| `roles/openclaw/meta/main.yml` | Created | Role metadata |
| `roles/openclaw/handlers/main.yml` | Created | Reload systemd + restart openclaw |
| `roles/openclaw/tasks/main.yml` | Created | Orchestrates security → install → signal |
| `roles/openclaw/tasks/security.yml` | Created | UFW + rootless Podman |
| `roles/openclaw/tasks/install.yml` | Created | User + Node.js + OpenClaw binary + systemd service |
| `roles/openclaw/tasks/signal.yml` | Created | signal-cli install + registration reminder |
| `roles/openclaw/templates/openclaw-config.yaml.j2` | Created | OpenClaw config (model provider + Signal channel) |
| `roles/openclaw/templates/openclaw.service.j2` | Created | Hardened systemd unit |
| `playbooks/deploy_openclaw.yml` | Created | Full deployment playbook |
## Decisions Made This Session
- **DR-1: `proxmox_vm` role keeps `sno_*` variable names** BECAUSE renaming would break existing host_vars and SNO playbook — STATUS: confirmed
- **DR-2: `proxmox_vm` defaults duplicated in `sno_deploy`** BECAUSE Play 4 (install.yml) runs in a separate play and cannot inherit defaults from Play 1's role — STATUS: confirmed
- **DR-3: No Tailscale** BECAUSE OPNsense firewall provides perimeter security; UFW on VM is defense-in-depth only — STATUS: confirmed
- **DR-4: Rootless Podman instead of Docker CE** for agent sandbox isolation — `podman-docker` shim provides docker CLI compatibility; `DOCKER_HOST` points to user Podman socket — STATUS: confirmed
- **DR-5: `openclaw` user is non-system (`system: false`)** BECAUSE rootless Podman requires `/etc/subuid`+`/etc/subgid` entries, which Ubuntu only creates for non-system users — STATUS: confirmed
- **DR-6: VM spec vars live in playbook Play 1 `vars:` block** (not in `openclaw` role defaults) BECAUSE they're only used in VM creation, not in the role itself — STATUS: confirmed
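
DR-4 and DR-5 together imply tasks along these lines. This is a hedged sketch rather than the content of `roles/openclaw/tasks/security.yml` or `install.yml`: the `shell` value and the verification step are assumptions added for illustration.

```yaml
# Sketch only: illustrates DR-4 (rootless Podman) and DR-5 (non-system user).
- name: Create the openclaw user as a regular (non-system) account
  ansible.builtin.user:
    name: openclaw
    system: false                # DR-5: needed so subuid/subgid ranges are allocated
    shell: /bin/bash             # assumed
    create_home: true

- name: Install Podman and the docker CLI shim
  ansible.builtin.apt:
    name:
      - podman
      - podman-docker            # DR-4: provides a docker-compatible CLI
    state: present
    update_cache: true

- name: Confirm subordinate ID ranges exist for the user
  ansible.builtin.command: grep -q '^openclaw:' {{ item }}
  loop:
    - /etc/subuid
    - /etc/subgid
  changed_when: false            # read-only check; fails the play if an entry is missing
```

The final task is a cheap guard: if the user were ever switched back to a system account, the play would fail here instead of later, when rootless Podman tries to start a container.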
## Key Numbers
- OpenClaw gateway port: **18789**
- signal-cli version: **0.13.15** (pinned in `openclaw_signal_cli_version` default — verify this is current)
- Node.js version: **24** (`openclaw_node_version`)
- OpenClaw VM defaults: **2 vCPU, 4096 MB RAM, 40 GB disk**
- UFW: allow **22/tcp** (SSH) + **18789/tcp** (gateway); deny all else inbound
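
The UFW policy above could be expressed with the `community.general.ufw` module roughly like this. It is a sketch under the assumption that the role uses this module; the actual tasks in `roles/openclaw/tasks/security.yml` may differ.

```yaml
# Sketch only: allow SSH and the gateway port, then default-deny inbound.
- name: Allow SSH and the OpenClaw gateway
  community.general.ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop:
    - "22"
    - "18789"

- name: Deny all other inbound traffic and enable UFW
  community.general.ufw:
    state: enabled
    policy: deny
    direction: incoming
```

The allow rules run before the firewall is enabled so that the SSH session Ansible is using is not cut off mid-play.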
## Conditional Logic Established
- IF `openclaw_signal_enabled: true` THEN signal.yml runs AND Signal block appears in config template
- IF `openclaw_vm_ip == 'dhcp'` THEN DHCP cloud-init task runs, ELSE static IP task runs (requires `openclaw_vm_gateway` and `openclaw_vm_nameserver`)
- IF disk already imported (scsi0 present in VM config) THEN `qm importdisk` and disk attach tasks are skipped (idempotency guard)
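
The three conditions above map onto `when:` clauses roughly as sketched below. The variables `vm_id`, `image_path`, and `storage` are hypothetical placeholders, and using `qm set --ipconfig0` for cloud-init networking is an assumption about how the role applies the setting. The static branch also assumes `openclaw_vm_ip` is given in CIDR form.

```yaml
# Sketch only: vm_id, image_path, and storage are hypothetical variable names.
- name: Read the current VM config (VM must already exist)
  ansible.builtin.command: qm config {{ vm_id }}
  register: vm_config
  changed_when: false

- name: Import the cloud image only if no scsi0 disk is attached yet
  ansible.builtin.command: qm importdisk {{ vm_id }} {{ image_path }} {{ storage }}
  when: "'scsi0:' not in vm_config.stdout"     # idempotency guard

- name: Configure DHCP via cloud-init
  ansible.builtin.command: qm set {{ vm_id }} --ipconfig0 ip=dhcp
  when: openclaw_vm_ip == 'dhcp'

- name: Configure a static IP via cloud-init
  ansible.builtin.command: >-
    qm set {{ vm_id }}
    --ipconfig0 ip={{ openclaw_vm_ip }},gw={{ openclaw_vm_gateway }}
    --nameserver {{ openclaw_vm_nameserver }}
  when: openclaw_vm_ip != 'dhcp'

- name: Run Signal setup only when enabled
  ansible.builtin.include_tasks: signal.yml
  when: openclaw_signal_enabled | bool
```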
## Exact State of Work in Progress
- `openclaw` role: complete and syntax-checked (no errors)
- `deploy_openclaw.yml`: syntax-checked — passes with expected warnings (inventory host not yet defined)
- Signal registration: **cannot be automated** — requires interactive QR scan or SMS captcha. Tasks print instructions; user must run manually post-deploy.
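
A task that prints the manual registration steps might look like the sketch below. The wording of the messages and the `openclaw` device name are assumptions, and the phone number and tokens are placeholders; the actual reminder in `roles/openclaw/tasks/signal.yml` may read differently. The `signal-cli` subcommands shown (`link`, `register`, `verify`) are the standard ones for these two flows.

```yaml
# Sketch only: message text and device name are illustrative.
- name: Print manual Signal registration steps
  ansible.builtin.debug:
    msg:
      - "Signal registration is manual. Run as the openclaw user after deploy."
      - "Option A (link as a secondary device):"
      - "  signal-cli link -n openclaw"
      - "  Render the printed URI as a QR code and scan it from the Signal app."
      - "Option B (register a dedicated number):"
      - "  signal-cli -a <phone-number> register --captcha <captcha-token>"
      - "  signal-cli -a <phone-number> verify <sms-code>"
  when: openclaw_signal_enabled | bool
```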
## Open Questions Requiring User Input
- [ ] What inventory hostname/group for the OpenClaw VM? Currently hardcoded to `openclaw.toal.ca` in playbook `hosts:` — confirm or change
- [ ] What `openclaw_vm_vnet` should be used? Defaulted to `lan` — confirm VNet name in Proxmox
- [ ] Static IP or DHCP for the OpenClaw VM? (`openclaw_vm_ip` default is `dhcp`)
- [ ] Which phone number to use for Signal? Dedicated bot number recommended (registration de-authenticates the main Signal app on that number)
- [ ] Confirm `signal-cli` version **0.13.15** is the desired version — check https://github.com/AsamK/signal-cli/releases
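
Once these questions are answered, the answers would land in a small set of variables. The sketch below collects the variable names mentioned in this handoff with their current defaults. Where the values belong (Play 1 `vars:` for the VM settings per DR-6, role defaults for the rest) follows the decisions above. The static-IP addresses are placeholders from the documentation range, not real network values.

```yaml
# Sketch only: current defaults as recorded in this handoff.
openclaw_vm_vnet: lan                  # confirm VNet name in Proxmox
openclaw_vm_ip: dhcp                   # or a static address, see below
# openclaw_vm_ip: 192.0.2.10/24        # placeholder; CIDR form is an assumption
# openclaw_vm_gateway: 192.0.2.1       # placeholder; required for static IP
# openclaw_vm_nameserver: 192.0.2.1    # placeholder; required for static IP
openclaw_signal_enabled: true
openclaw_signal_cli_version: "0.13.15" # verify against upstream releases
openclaw_node_version: 24
```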
## Assumptions That Need Validation
- ASSUMED: OpenClaw config file format is YAML at `$OPENCLAW_STATE_DIR/config.yaml` — validate against actual OpenClaw docs/source; the config template (`openclaw-config.yaml.j2`) may need field name corrections
- ASSUMED: `DOCKER_HOST=unix:/run/user/<uid>/podman/podman.sock` is sufficient for OpenClaw to use Podman for sandboxes — validate that OpenClaw respects `DOCKER_HOST`
- ASSUMED: `openclaw` npm package name is correct — verify at https://www.npmjs.com/package/openclaw
- ASSUMED: The Ubuntu 24.04 Noble cloud image is available at `https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img`. This URL is expected to be stable, but verify it before the first run
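
Two of these assumptions can be checked mechanically before the first deploy. The throwaway playbook below is a sketch that assumes `npm` is installed on the control node. It only confirms that the image URL resolves and that a package named `openclaw` exists on npm, not that it is the intended package. The config format and `DOCKER_HOST` assumptions still need to be checked against the OpenClaw docs or source.

```yaml
# Sketch only: a pre-flight check, not part of the repository.
- name: Validate handoff assumptions from the control node
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Check that the Noble cloud image URL resolves
      ansible.builtin.uri:
        url: https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
        method: HEAD
        status_code: 200

    - name: Check that an npm package named openclaw exists
      ansible.builtin.command: npm view openclaw version
      changed_when: false
```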
## What NOT to Re-Read
- `roles/sno_deploy/tasks/install.yml` — already reviewed this session; no changes made
- `roles/sno_deploy/tasks/create_vm.yml` — deleted; content now in `roles/proxmox_vm/tasks/main.yml`
## Files to Load Next Session
- `playbooks/deploy_openclaw.yml` — needed to review/run the playbook
- `roles/openclaw/tasks/install.yml` — needed if adjusting OpenClaw install steps
- `roles/openclaw/templates/openclaw-config.yaml.j2` — needed if config format needs correction
- `roles/openclaw/tasks/signal.yml` — needed if adjusting Signal setup