docs: update plan for migration of backend

2026-04-12 12:54:10 -04:00
parent c789454810
commit 5c830443f3
2 changed files with 217 additions and 1 deletions

@@ -0,0 +1,214 @@
# Plan: bab-backend-ansible Rewrite
**Date:** 2026-04-12
**Status:** DRAFT — awaiting confirmation before execution
**Target repo:** `/home/ptoal/Dev/Projects/bab-backend-ansible`
**Architecture reference:** `docs/context/sdlc-architecture.md`
---
## What Gets Retired
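These playbooks are Appwrite-specific and have no equivalent in the new architecture. Archive them under `playbooks/archive/appwrite/` first; delete them only after the new playbooks are tested (see Implementation Sequence).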
| File | Reason |
|------|--------|
| `playbooks/install_appwrite.yml` | Appwrite self-hosted, replaced by supabase.com |
| `playbooks/bootstrap_appwrite.yml` | Appwrite self-hosted |
| `playbooks/upgrade_appwrite.yml` | Appwrite self-hosted |
| `playbooks/backup_appwrite.yml` | Appwrite Docker volumes + MariaDB; replaced by pg_dump |
| `playbooks/provision_database.yml` | Appwrite collection/attribute schema; replaced by Supabase migrations |
| `playbooks/provision_users.yml` | Appwrite user provisioning; replaced by Supabase Auth admin API |
| `playbooks/load_data.yml` | Appwrite seed data |
| `playbooks/read_database.yml` | Appwrite diagnostic |
| `playbooks/tasks/patch_appwrite_compose.yml` | Appwrite-specific |
| `playbooks/tasks/upgrade_appwrite_step.yml` | Appwrite-specific |
| `playbooks/templates/appwrite*.j2` | Appwrite config templates |
| `appwrite.json` | Appwrite project definition |
---
## What Gets Kept / Adapted
| File | Action | Notes |
|------|--------|-------|
| `playbooks/install_nginx.yml` | **Keep** | Dev server nginx still needed |
| `playbooks/configure_act_runner.yml` | **Keep** | Gitea runner still on bab1 |
| `playbooks/install_node_exporter.yml` | **Keep** | Monitoring unchanged |
| `playbooks/clean_logs.yml` | **Keep** | Day2 ops unchanged |
| `update_certificates.yml` | **Keep** | TLS cert renewal unchanged |
| `playbooks/deploy_application.yml` | **Adapt** | See below |
| `rulebooks/gitea_webhook.yml` | **Adapt** | See below |
| `rulebooks/alertmanager_listener.yml` | **Keep** | Alerting unchanged |
| `requirements.yml` | **Update** | Remove Appwrite collections; add `community.postgresql` |
### `deploy_application.yml` adaptation
Split into two playbooks:
- `deploy_dev.yml` — nginx artifact swap on bab1 (keep existing logic, update paths/vars)
- `deploy_prod.yml` — S3 sync: fetch artifact tarball → extract → `aws s3 sync` to `{{ s3_bucket }}`
### `rulebooks/gitea_webhook.yml` adaptation
Add branch-based routing: `dev` branch → `oysqn-deploy-dev` job template; `main` branch → `oysqn-deploy-prod` job template (AAP manual approval gate in the workflow).
Updated payload contract: `{ artifact_url, version, branch }` (already matches bab-app pattern).
---
## New Playbooks
### 1. `playbooks/migrate_supabase.yml`
Runs `supabase db push` against the target environment. On failure, executes the matching rollback SQL from `supabase/rollback/` via `psql`, then aborts.
**Vars:** `supabase_project_ref`, `supabase_db_password` (from Vault)
**Steps:**
1. Pre-migration `pg_dump` snapshot → `bab1.mgmt.toal.ca:/var/backups/oysqn/` (pre-migration label)
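2. `supabase db push` against the project linked with `supabase link --project-ref {{ supabase_project_ref }}` (the `--project-ref` flag belongs to `link`, not `db push`; run via the `supabase` CLI on the control node or EE, and verify flags against the pinned CLI version)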
3. On failure: identify the failing migration filename and locate `supabase/rollback/<filename>.sql`
4. If the rollback file contains the `IRREVERSIBLE` marker: halt and alert; do not run the rollback SQL
5. Otherwise: run the rollback file via psql, then fail with a message
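The failure handling above could be sketched roughly as follows. This is a minimal sketch, not the final playbook: it assumes a direct connection string in `supabase_postgres_url` (so the push uses `--db-url`), an absolute `rollback_dir` on the node running the play, and that the CLI's output mentions the failing migration by its `<timestamp>_<name>` file stem. None of these are confirmed by this plan.

```yaml
# Sketch only. rollback_dir, supabase_postgres_url and the regex are assumptions.
- name: Apply Supabase migrations with guarded rollback
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Push pending migrations
      ansible.builtin.command:
        argv:
          - supabase
          - db
          - push
          - "--db-url"
          - "{{ supabase_postgres_url }}"
      register: push
      failed_when: false   # failure is handled explicitly below
      no_log: true         # the connection string contains the DB password

    - name: Handle a failed push
      when: push.rc != 0
      block:
        - name: Extract the failing migration name (output format is an assumption)
          ansible.builtin.set_fact:
            failed_migration: "{{ (push.stderr ~ push.stdout) | regex_search('[0-9]{14}_[A-Za-z0-9_]+') | default('', true) }}"

        - name: Stop if the failing migration could not be identified
          ansible.builtin.fail:
            msg: "supabase db push failed and no migration name was found in its output"
          when: failed_migration | length == 0

        - name: Resolve the matching rollback file
          ansible.builtin.set_fact:
            rollback_file: "{{ rollback_dir }}/{{ failed_migration }}.sql"

        - name: Halt without rollback when the migration is marked irreversible
          ansible.builtin.fail:
            msg: "{{ failed_migration }} is marked IRREVERSIBLE; manual intervention required"
          when: "'IRREVERSIBLE' in lookup('ansible.builtin.file', rollback_file)"

        - name: Run the rollback SQL
          ansible.builtin.command:
            argv:
              - psql
              - "-v"
              - "ON_ERROR_STOP=1"
              - "-f"
              - "{{ rollback_file }}"
              - "{{ supabase_postgres_url }}"
          no_log: true

        - name: Abort the run after a successful rollback
          ansible.builtin.fail:
            msg: "Migration {{ failed_migration }} failed and was rolled back"
```

Checking the marker before invoking `psql` keeps an irreversible migration from ever reaching the rollback step; the alerting hook is left out here because the plan does not specify it.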
### 2. `playbooks/backup_supabase_prod.yml`
Performs `pg_dump` of production Supabase DB → compressed → stored on bab1 via SSH. Enforces retention policy.
**Vars:** `supabase_postgres_url` (from Vault at `kv/oys/prod/supabase/postgres_url`)
**Steps:**
1. Determine backup type: monthly (1st of month → `oysqn-prod-YYYY-MM-monthly.sql.gz`) or regular (`oysqn-prod-YYYYMMDD-HHMMSS.sql.gz`)
2. `pg_dump "{{ supabase_postgres_url }}" | gzip` → SSH copy to `bab1.mgmt.toal.ca:/var/backups/oysqn/`
3. Rotate: keep at most the 30 newest regular backups and delete any older than 90 days; keep at most the 12 newest monthly backups and delete any older than 12 months
**Note:** `pg_dump` runs from the AAP EE or control node (not bab1) — postgres_url is the direct Supabase.com connection string.
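A rough sketch of the naming, dump, and regular-backup rotation steps follows. It assumes UTC timestamps, a writable `/tmp` on the EE or control node, and that `bab1.mgmt.toal.ca` is reachable as a delegate host; the monthly rotation would repeat the last two tasks with the `*-monthly.sql.gz` pattern and the 12-file / 12-month limits.

```yaml
# Sketch only. UTC naming and the /tmp staging path are assumptions.
- name: Back up production Supabase and rotate old dumps
  hosts: localhost
  gather_facts: false
  vars:
    backup_host: bab1.mgmt.toal.ca
    backup_dir: /var/backups/oysqn
  tasks:
    - name: Fix the backup file name once for this run
      ansible.builtin.set_fact:
        backup_name: >-
          {{ 'oysqn-prod-' ~ now(utc=true, fmt='%Y-%m') ~ '-monthly.sql.gz'
             if now(utc=true).day == 1
             else 'oysqn-prod-' ~ now(utc=true, fmt='%Y%m%d-%H%M%S') ~ '.sql.gz' }}

    - name: Dump and compress on the EE or control node
      ansible.builtin.shell: |
        set -o pipefail
        pg_dump "$PGURL" | gzip > "/tmp/{{ backup_name }}"
      args:
        executable: /bin/bash
      environment:
        PGURL: "{{ supabase_postgres_url }}"
      no_log: true

    - name: Copy the dump to bab1 over SSH
      ansible.builtin.copy:
        src: "/tmp/{{ backup_name }}"
        dest: "{{ backup_dir }}/{{ backup_name }}"
        mode: "0600"
      delegate_to: "{{ backup_host }}"

    - name: Remove the local staging copy
      ansible.builtin.file:
        path: "/tmp/{{ backup_name }}"
        state: absent

    - name: List regular backups on bab1
      ansible.builtin.find:
        paths: "{{ backup_dir }}"
        patterns: "oysqn-prod-*.sql.gz"
        excludes: "*-monthly.sql.gz"
      delegate_to: "{{ backup_host }}"
      register: regular

    - name: Delete regular backups beyond the newest 30 or older than 90 days
      ansible.builtin.file:
        path: "{{ item }}"
        state: absent
      delegate_to: "{{ backup_host }}"
      loop: >-
        {{ (by_age[:-30] + (by_age | selectattr('mtime', 'lt', cutoff | float) | list))
           | map(attribute='path') | unique }}
      vars:
        by_age: "{{ regular.files | sort(attribute='mtime') }}"   # oldest first
        cutoff: "{{ now().timestamp() - 90 * 86400 }}"
```

The name is pinned with `set_fact` because a plain play variable would be re-templated on every use and could produce a different timestamp for the dump and the copy.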
### 3. `playbooks/sync_gitea_secrets.yml`
Reads `url` + `anon_key` from Vault, constructs `.env` content, updates Gitea repo variables via API.
**Steps:**
1. Vault lookup: `kv/oys/dev/supabase/{url,anon_key}` and `kv/oys/prod/supabase/{url,anon_key}`
2. Construct `ENV_FILE_DEV` and `ENV_FILE_PROD` content (multiline env file format)
3. `PUT /api/v1/repos/{{ gitea_owner }}/{{ gitea_repo }}/actions/variables/ENV_FILE_DEV` (Gitea API, token from `kv/oys/shared/infra/gitea_token`)
4. Same for `ENV_FILE_PROD`
**Trigger:** AAP schedule (daily) + on-demand job template
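Steps 2 to 4 might look like the sketch below. It assumes the Vault values have already been loaded into the hypothetical variables `dev_supabase_url`, `dev_supabase_anon_key`, `prod_supabase_url`, `prod_supabase_anon_key`, and `gitea_token`; the `SUPABASE_URL` / `SUPABASE_ANON_KEY` key names inside the env file are placeholders, since this plan does not name them.

```yaml
# Sketch only. Variable names and env-file keys are assumptions.
- name: Publish Supabase client settings as Gitea Actions variables
  hosts: localhost
  gather_facts: false
  vars:
    gitea_url: https://gitea.toal.ca
    env_files:
      ENV_FILE_DEV: |
        SUPABASE_URL={{ dev_supabase_url }}
        SUPABASE_ANON_KEY={{ dev_supabase_anon_key }}
      ENV_FILE_PROD: |
        SUPABASE_URL={{ prod_supabase_url }}
        SUPABASE_ANON_KEY={{ prod_supabase_anon_key }}
  tasks:
    - name: Update each variable through the Gitea API
      ansible.builtin.uri:
        url: "{{ gitea_url }}/api/v1/repos/{{ gitea_owner }}/{{ gitea_repo }}/actions/variables/{{ item.key }}"
        method: PUT
        headers:
          Authorization: "token {{ gitea_token }}"
        body_format: json
        body:
          name: "{{ item.key }}"
          value: "{{ item.value }}"
        status_code: [200, 201, 204]
      loop: "{{ env_files | dict2items }}"
      no_log: true   # keeps the token and keys out of job output
```

`PUT` updates an existing variable; on a fresh repository the variables would likely need to be created once (for example with a `POST` to the same path) before this playbook can succeed.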
### 4. `playbooks/deploy_dev.yml`
Fetch artifact tarball → extract → nginx swap on bab1.
**Vars:** `artifact_url`, `version`
**Steps:**
1. Download artifact from Gitea Release URL (auth header token from `kv/oys/bab_gitea`, as resolved in Open Questions)
2. Extract to tempdir
3. Rsync/copy to nginx webroot (`/usr/share/nginx/html/` on bab1, as confirmed in Open Questions)
4. Cleanup tempdir
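The four steps above could be sketched as a single play. This assumes the artifact is a gzip-compressed tarball with the site files at its root, and that `gitea_token` has already been read from Vault; both are assumptions rather than facts stated in this plan.

```yaml
# Sketch only. Tarball layout and the gitea_token variable are assumptions.
- name: Deploy a build artifact to the dev nginx webroot
  hosts: bab1.mgmt.toal.ca
  become: true
  vars:
    webroot: /usr/share/nginx/html/
  tasks:
    - name: Create a scratch directory
      ansible.builtin.tempfile:
        state: directory
        suffix: oysqn
      register: scratch

    - name: Download, unpack, and publish
      block:
        - name: Download the artifact from the Gitea release
          ansible.builtin.get_url:
            url: "{{ artifact_url }}"
            dest: "{{ scratch.path }}/artifact.tar.gz"
            headers:
              Authorization: "token {{ gitea_token }}"
            mode: "0600"

        - name: Create the extraction directory
          ansible.builtin.file:
            path: "{{ scratch.path }}/site"
            state: directory
            mode: "0755"

        - name: Extract the tarball
          ansible.builtin.unarchive:
            src: "{{ scratch.path }}/artifact.tar.gz"
            dest: "{{ scratch.path }}/site"
            remote_src: true

        - name: Copy the build into the webroot
          ansible.builtin.copy:
            src: "{{ scratch.path }}/site/"
            dest: "{{ webroot }}"
            remote_src: true
      always:
        - name: Clean up the scratch directory
          ansible.builtin.file:
            path: "{{ scratch.path }}"
            state: absent
```

Putting cleanup in `always` removes the scratch directory even when the download or extraction fails. Note that `copy` does not delete files left over from a previous release; a synchronize-style step with deletion would be needed for that.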
### 5. `playbooks/deploy_prod.yml`
Fetch artifact tarball → extract → `aws s3 sync`.
**Vars:** `artifact_url`, `version`
**Steps:**
1. Download artifact (Gitea token auth)
2. Extract to tempdir
3. `aws s3 sync <tempdir>/ s3://{{ s3_bucket }}/ --delete` (AWS creds from Vault `kv/oys/prod/app/`)
4. Cleanup tempdir
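Download and extraction mirror `deploy_dev.yml`; the S3 step itself might be sketched as the task below. The variables `extract_dir`, `aws_access_key_id`, `aws_secret_access_key`, and `aws_region` are hypothetical names for values the plan says come from Vault, and the task assumes the `aws` CLI is available wherever it runs.

```yaml
# Sketch only. Variable names are assumptions; the aws CLI must be installed.
- name: Sync the extracted build to the production bucket
  ansible.builtin.command:
    argv:
      - aws
      - s3
      - sync
      - "{{ extract_dir }}/"
      - "s3://{{ s3_bucket }}/"
      - "--delete"
  environment:
    AWS_ACCESS_KEY_ID: "{{ aws_access_key_id }}"
    AWS_SECRET_ACCESS_KEY: "{{ aws_secret_access_key }}"
    AWS_DEFAULT_REGION: "{{ aws_region }}"
  register: s3_sync
  changed_when: s3_sync.stdout | length > 0   # the CLI prints one line per transferred or deleted object
```

Passing credentials through `environment` keeps them out of the command line, and `--delete` removes objects that no longer exist in the build, so the bucket should hold nothing but the site.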
---
## EDA Rulebook (updated)
`rulebooks/gitea_webhook.yml` — routes by branch:
```yaml
rules:
  - name: Deploy to dev
    condition:
      all:
        - event.payload.data.artifact_url is defined
        - event.payload.data.branch == "dev"
    action:
      run_job_template:
        name: oysqn-deploy-dev
        organization: OYS
        job_args:
          extra_vars:
            artifact_url: "{{ event.payload.data.artifact_url }}"
            version: "{{ event.payload.data.version }}"
  - name: Deploy to prod (approval gate in AAP workflow)
    condition:
      all:
        - event.payload.data.artifact_url is defined
        - event.payload.data.branch == "main"
    action:
      run_job_template:
        name: oysqn-deploy-prod
        organization: OYS
        job_args:
          extra_vars:
            artifact_url: "{{ event.payload.data.artifact_url }}"
            version: "{{ event.payload.data.version }}"
```
---
## AAP Workflow Templates
### `oysqn-deploy-dev`
```
pre-migration backup (backup_supabase_prod) [SKIP if no migrations]
→ migrate (migrate_supabase — dev project)
→ deploy (deploy_dev)
→ E2E smoke test (yarn test:e2e BASE_URL=https://bab.toal.ca)
→ on failure: rollback migration (handled in migrate_supabase), redeploy previous artifact, notify
→ on success: notify
```
### `oysqn-deploy-prod`
```
[manual approval gate]
→ pre-migration backup (backup_supabase_prod)
→ migrate (migrate_supabase — prod project)
→ deploy (deploy_prod)
→ E2E smoke test (yarn test:e2e BASE_URL=https://oysqn.app)
→ on failure: rollback migration, redeploy previous S3 artifact, notify
→ on success: notify
```
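The smoke-test node in both workflows could be a small play like the one below. It is only a sketch: `app_checkout` (a checkout of the application repo containing the `test:e2e` script) and `base_url` are hypothetical variables, and it assumes the test suite reads `BASE_URL` from the environment.

```yaml
# Sketch only. app_checkout and base_url are hypothetical variables.
- name: Run the E2E smoke test against the freshly deployed site
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Run yarn test:e2e with BASE_URL in the environment
      ansible.builtin.command:
        argv:
          - yarn
          - "test:e2e"
        chdir: "{{ app_checkout }}"
      environment:
        BASE_URL: "{{ base_url }}"
      changed_when: false   # a test run does not change the target
```

A non-zero exit from `yarn` fails the job, which is what lets the workflow branch to its "on failure" path.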
---
## Implementation Sequence
1. **Rename/archive existing Appwrite playbooks** — move to `playbooks/archive/appwrite/`; do not delete until new playbooks are tested
2. **Update `requirements.yml`** — add `community.postgresql`, remove Appwrite-specific collections
3. **Write `sync_gitea_secrets.yml`** — lowest risk, standalone, no deploy dependency; test in isolation
4. **Write `backup_supabase_prod.yml`** — test against dev Supabase project first (with a throwaway postgres URL)
5. **Write `migrate_supabase.yml`** — needs `supabase` CLI in EE or on control node; verify CLI availability first
6. **Adapt `deploy_dev.yml`** from existing `deploy_application.yml`
7. **Write `deploy_prod.yml`** (new — S3)
8. **Update EDA rulebook** — branch routing
9. **Configure AAP** — create job templates, workflow templates, approval gate, schedule for backup + secret sync
10. **Decommission Appwrite** — after prod cutover confirmed
---
## Open Questions
- [x] **Supabase CLI in EE**: Not present — added to `ee-demo` via `append_final` build step. `SUPABASE_VERSION` build arg required. Verify asset URL against GitHub releases before first build.
- [x] **pg_dump location**: Not present. Added `postgresql [platform:rpm]` to `ee-demo` system deps. Runs from the AAP EE (or control node) against the Supabase.com postgres_url.
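- [x] **EE image**: `amazon.aws` collection added to `ee-demo` `requirements.yml`; `boto3`/`botocore` added to `requirements.txt`. S3 sync via the `s3_sync` module; confirm which collection ships it before the first build, since it has historically lived in `community.aws` rather than `amazon.aws`.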
- [x] **Dev server URL**: `https://bab.toal.ca` — E2E `BASE_URL` for dev workflow.
- [x] **nginx webroot path for dev**: `/usr/share/nginx/html/` on `bab1.mgmt.toal.ca` — confirmed.
- [x] **Gitea artifact auth**: Token at `kv/oys/bab_gitea` (note: deviates from `kv/oys/(dev|prod|shared)/...` convention — existing secret, use as-is). Gitea base URL: `https://gitea.toal.ca/`. Pass as `Authorization: token <value>` header in `get_url`.
- [ ] **dev postgres_url**: Architecture doc has no `postgres_url` in Vault for dev (`kv/oys/dev/supabase/`). Migration playbook needs it to run rollback SQL via psql. Add `kv/oys/dev/supabase/postgres_url` to Vault before first migration run.
---
## Files Created/Modified Summary
| Action | Path |
|--------|------|
| Retire (move to archive) | `playbooks/install_appwrite.yml` and 11 others (see above) |
| Keep | `install_nginx.yml`, `configure_act_runner.yml`, `install_node_exporter.yml`, `clean_logs.yml`, `update_certificates.yml`, `rulebooks/alertmanager_listener.yml` |
| Adapt | `deploy_application.yml` → split into `deploy_dev.yml` + `deploy_prod.yml` |
| Adapt | `rulebooks/gitea_webhook.yml` |
| Update | `requirements.yml` |
| New | `migrate_supabase.yml` |
| New | `backup_supabase_prod.yml` |
| New | `sync_gitea_secrets.yml` |
| New | `deploy_dev.yml` |
| New | `deploy_prod.yml` |