Containerise: Postgres + Redis/RQ + API + Ghidra worker

Brings up the documented target architecture as a docker-compose stack — a
modular monolith with the Ghidra step split into its own async worker.

- worker/: RQ queue (lazy redis import) + run_acquisition task (Job status
  queued→started→finished/failed, drives ams.acquire with sink=db)
- Job model + JobOut schema; Snapshot.data is JSONB on Postgres
- POST/GET /jobs: stream an upload to a shared volume, enqueue, poll status
- docker/api.Dockerfile (slim) + docker/worker.Dockerfile (JDK21 + Ghidra
  fetched at build, overridable via GHIDRA_URL) + docker-compose.yml
- ghidra.py: AMS_GHIDRA_SCRIPTS override for in-container script path
- pyproject: [worker] extra (rq/redis/psycopg), python-multipart in [api]
- tests: 4 new (task success/failure + endpoint enqueue/503) -> 22/22

Verified: API image builds, container serves /health + /ui + /jobs; compose
config validates. Worker image (downloads ~1 GB Ghidra) not built here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Patryk Gensch
2026-05-31 12:24:47 +02:00
parent 6797ad5ddb
commit f4aa7caaa9
15 changed files with 511 additions and 3 deletions

View File

@@ -48,6 +48,33 @@ analyzeHeadless <projDir> <projName> -process PIKLIB8.dll \
-postScript extract_engine_surface.py "$(pwd)/snapshots/PIKLIB8.snapshot.json"
```
## Uruchomienie w kontenerach (docker compose)
Pełny stack — modularny monolit + **wydzielony worker Ghidry** — czterema usługami:
```
db (Postgres) ── api (FastAPI + UI) ──┐
├── redis (kolejka)
worker (Ghidra headless) ─────────────┘
```
```bash
docker compose up --build # api na http://localhost:8000
```
`api` i `worker` współdzielą wolumen `uploads`: API streamuje wgrane archiwum na dysk,
worker czyta je po ścieżce (przez Redisa leci tylko ścieżka, nie bajty). Obraz workera
**pobiera Ghidrę (~1 GB) przy pierwszym buildzie** — wersję nadpiszesz przez
`--build-arg GHIDRA_URL=…`. Postgres trzyma katalog trwale (wolumen `pgdata`).
Asynchroniczna akwizycja przez API (zlecenie → kolejka → worker → snapshot w bazie):
```bash
curl -F file=@game.iso -F game="Reksio i UFO" http://localhost:8000/jobs # → 202 {id, status:queued}
curl http://localhost:8000/jobs/1 # poll: queued→started→finished
```
Endpointy joba: `POST /jobs` (upload+enqueue), `GET /jobs`, `GET /jobs/{id}` (status, `snapshot_id`, `error`).
## Akwizycja — ISO/ZIP → katalog
Worker, który domyka łańcuch od *pliku gry* do wpisu w katalogu: rozpakowuje archiwum,