Containerise: Postgres + Redis/RQ + API + Ghidra worker

Brings up the documented target architecture as a docker-compose stack — a
modular monolith with the Ghidra step split into its own async worker.

- worker/: RQ queue (lazy redis import) + run_acquisition task (Job status
  queued→started→finished/failed, drives ams.acquire with sink=db)
- Job model + JobOut schema; Snapshot.data is JSONB on Postgres
- POST/GET /jobs: stream an upload to a shared volume, enqueue, poll status
- docker/api.Dockerfile (slim) + docker/worker.Dockerfile (JDK21 + Ghidra
  fetched at build, overridable via GHIDRA_URL) + docker-compose.yml
- ghidra.py: AMS_GHIDRA_SCRIPTS override for in-container script path
- pyproject: [worker] extra (rq/redis/psycopg), python-multipart in [api]
- tests: 4 new (task success/failure + endpoint enqueue/503) -> 22/22
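
The Job status lifecycle and the lazy redis import from the worker/ bullet can be sketched roughly as below. This is a minimal illustration, not the code in this commit: the Job fields, the get_queue helper and the run_acquisition signature with an injected acquire callable are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Job:
    """Stand-in for the Job model; the field names here are assumptions."""
    id: int
    upload_path: str
    status: str = "queued"
    error: Optional[str] = None


def get_queue(redis_url: str, name: str = "acquire"):
    """Build the RQ queue lazily so importing this module never requires redis."""
    from redis import Redis  # deferred import: only needed when enqueueing
    from rq import Queue

    return Queue(name, connection=Redis.from_url(redis_url))


def run_acquisition(job: Job, acquire: Callable[[str], None]) -> Job:
    """Move a job queued -> started -> finished, or -> failed on any exception."""
    job.status = "started"
    try:
        acquire(job.upload_path)  # stands in for the ams.acquire call with sink=db
    except Exception as exc:
        job.status = "failed"
        job.error = str(exc)
    else:
        job.status = "finished"
    return job
```

Passing the acquire step in as a callable is only a convenience for the sketch: it lets the success and failure paths be exercised without Redis, Postgres or Ghidra present.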

Verified: API image builds, container serves /health + /ui + /jobs; compose
config validates. Worker image (downloads ~1 GB Ghidra) not built here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Patryk Gensch
2026-05-31 12:24:47 +02:00
parent 6797ad5ddb
commit f4aa7caaa9
15 changed files with 511 additions and 3 deletions

docker/api.Dockerfile (new file)

@@ -0,0 +1,19 @@
# API + Command Center UI. Stays slim — the heavy Ghidra lifting lives in the worker image.
FROM python:3.12-slim
WORKDIR /app
# Copy metadata, then the source; the editable install below needs both in place.
COPY pyproject.toml README.md ./
COPY ams ./ams
COPY ghidra_scripts ./ghidra_scripts
COPY snapshots ./snapshots
# Editable install keeps ams + ghidra_scripts co-located (the worker resolves the script
# path relative to the package). The API needs the queue client too, to enqueue jobs.
RUN pip install --no-cache-dir -e ".[api]" rq redis "psycopg[binary]>=3.1"
ENV AMS_UPLOAD_DIR=/data/uploads
EXPOSE 8000
CMD ["uvicorn", "ams.api.app:create_app", "--factory", "--host", "0.0.0.0", "--port", "8000"]
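
The AMS_UPLOAD_DIR set above is where POST /jobs streams uploads so the worker can read them from the shared volume. A minimal sketch of that step follows, assuming a hypothetical save_upload helper; the actual endpoint code is not shown in this diff and may differ.

```python
import os
import shutil
import uuid
from pathlib import Path
from typing import BinaryIO, Optional


def save_upload(src: BinaryIO, filename: str, upload_dir: Optional[str] = None) -> Path:
    """Copy an uploaded stream to the shared volume in fixed-size chunks."""
    root = Path(upload_dir or os.environ.get("AMS_UPLOAD_DIR", "/data/uploads"))
    root.mkdir(parents=True, exist_ok=True)
    # Keep only the base name so a crafted filename cannot escape the upload dir.
    safe_name = Path(filename).name or "upload.bin"
    # A random prefix avoids collisions between uploads that share a name.
    dest = root / f"{uuid.uuid4().hex}_{safe_name}"
    with dest.open("wb") as out:
        shutil.copyfileobj(src, out, length=1024 * 1024)  # 1 MiB chunks
    return dest
```

Copying in chunks keeps memory use flat for large game archives; the returned path is what would be handed to the queued job.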

docker/worker.Dockerfile (new file)

@@ -0,0 +1,35 @@
# Ghidra-equipped acquisition worker. Self-contained: bundles JDK 21 + a pinned Ghidra
# release so `docker compose up` just works (at the cost of a heavy, slow-to-build image).
#
# Override the Ghidra build without editing this file:
# docker build --build-arg GHIDRA_URL=https://github.com/.../ghidra_X_PUBLIC_DATE.zip ...
FROM eclipse-temurin:21-jdk-jammy
ARG GHIDRA_URL=https://github.com/NationalSecurityAgency/ghidra/releases/download/Ghidra_11.3_build/ghidra_11.3_PUBLIC_20250205.zip
# Runtime deps: python (the package), unzip/wget (fetch Ghidra), libarchive-tools (bsdtar:
# unpacks ISO9660 + ZIP game archives).
RUN apt-get update && apt-get install -y --no-install-recommends \
python3 python3-pip unzip wget ca-certificates libarchive-tools \
&& rm -rf /var/lib/apt/lists/*
# Fetch + unpack Ghidra into /opt/ghidra (strip the versioned top-level dir).
RUN wget -q "$GHIDRA_URL" -O /tmp/ghidra.zip \
&& unzip -q /tmp/ghidra.zip -d /opt \
&& mv /opt/ghidra_* /opt/ghidra \
&& rm /tmp/ghidra.zip
ENV GHIDRA_HOME=/opt/ghidra
ENV AMS_GHIDRA_SCRIPTS=/app/ghidra_scripts
ENV AMS_UPLOAD_DIR=/data/uploads
WORKDIR /app
COPY pyproject.toml README.md ./
COPY ams ./ams
COPY ghidra_scripts ./ghidra_scripts
COPY snapshots ./snapshots
RUN pip3 install --no-cache-dir -e ".[api,acquire,worker]"
# Drain the 'acquire' queue. Shell form so $REDIS_URL expands at runtime; exec lets rq receive SIGTERM directly.
CMD exec rq worker --url "${REDIS_URL:-redis://redis:6379/0}" acquire
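
The two environment-driven settings in this image, AMS_GHIDRA_SCRIPTS and REDIS_URL, could be resolved on the Python side along these lines. This is a sketch under assumptions: the function names are hypothetical, and the package-relative fallback assumes ghidra_scripts/ sits next to the ams/ package, as the COPY lines above suggest.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_REDIS_URL = "redis://redis:6379/0"  # same fallback as the worker CMD


def ghidra_scripts_dir(package_file: str, env: Optional[Mapping[str, str]] = None) -> Path:
    """Prefer AMS_GHIDRA_SCRIPTS; otherwise look next to the package directory."""
    env = os.environ if env is None else env
    override = env.get("AMS_GHIDRA_SCRIPTS")
    if override:
        return Path(override)
    # Assumed layout: <root>/ams/ghidra.py and <root>/ghidra_scripts/
    return Path(package_file).resolve().parent.parent / "ghidra_scripts"


def redis_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Mirror the shell ${REDIS_URL:-...} default: unset or empty falls back."""
    env = os.environ if env is None else env
    return env.get("REDIS_URL") or DEFAULT_REDIS_URL
```

Inside ghidra.py the first argument would typically be __file__; taking env as a parameter keeps both helpers testable without touching the process environment.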