2 Commits

Author SHA1 Message Date
Patryk Gensch
ba9db82a4c worker: optional PyGhidra back-end for Ghidra 11.4+/12.x (no Jython)
The .py extractor runs fine under PyGhidra in the GUI, but `analyzeHeadless`
does not initialize PyGhidra. Add an env-gated CPython path so that modern Ghidra also works headless:

- ghidra.run_extractor_pyghidra(): runs the same GhidraScript via pyghidra.run_script
  (boots Ghidra in-process, imports and analyses the binary, and passes out_path as the
  only script argument, so getScriptArgs()=[out_path]); run_extractor dispatches to it
  when AMS_USE_PYGHIDRA is set. No script changes needed.
- worker image installs pyghidra + sets GHIDRA_INSTALL_DIR; compose exposes
  AMS_USE_PYGHIDRA (default off). Jython path stays the default and untouched.
- README documents both variants (Jython <=11.3.x vs PyGhidra 11.4+/12.x).
- test: AMS_USE_PYGHIDRA routes to the PyGhidra back-end (clear error if the package is missing).
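
A minimal sketch of the env-gated dispatch described in the list above. It is not the code from this commit: the signatures, the injected back-end callables, the `require_pyghidra` helper, and the set of values treated as "on" are all assumptions made for illustration. Only the `AMS_USE_PYGHIDRA` variable and the "Jython by default" behaviour come from the message.

```python
import os

# Assumption: these spellings count as "on"; the commit only says "when set".
_TRUTHY = {"1", "true", "yes", "on"}


def use_pyghidra(env=None):
    """Return True when AMS_USE_PYGHIDRA is switched on in the environment."""
    env = os.environ if env is None else env
    return env.get("AMS_USE_PYGHIDRA", "").strip().lower() in _TRUTHY


def run_extractor(binary, script, out_path, *,
                  jython_backend, pyghidra_backend, env=None):
    """Dispatch to one of two injected back-ends; Jython stays the default.

    Both back-ends are hypothetical callables taking (binary, script, out_path).
    """
    backend = pyghidra_backend if use_pyghidra(env) else jython_backend
    return backend(binary, script, out_path)


def require_pyghidra(importer=__import__):
    """Import pyghidra lazily and fail with a readable message if it is absent."""
    try:
        return importer("pyghidra")
    except ImportError as exc:
        raise RuntimeError(
            "AMS_USE_PYGHIDRA is set but the 'pyghidra' package is not installed"
        ) from exc
```

Injecting the back-ends keeps the routing testable without a Ghidra installation, which is roughly what the routing test in the list needs.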

35/35 tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 18:03:04 +02:00
Patryk Gensch
6797ad5ddb Add ISO/ZIP acquisition pipeline (ams.acquire worker)
Closes the chain from a game file to a catalog entry: unpack an ISO/ZIP,
content-identify the engine DLL (CMC_ObjectsContainer marker in RTTI, so a
renamed file is still found), hash it (sha256 + md5 + optional ssdeep via
ppdeep), run Ghidra headless with the extractor, enrich and import the snapshot.
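
The content-based identification and hashing steps could look roughly like the sketch below. The function names, the chunked scan, and the "first match wins" rule are assumptions, not the contents of identify.py. Only the `CMC_ObjectsContainer` marker and the sha256/md5 pair come from the message, and the optional ssdeep hash is left out.

```python
import hashlib
from pathlib import Path

MARKER = b"CMC_ObjectsContainer"  # RTTI marker named in the commit message


def contains_marker(path, marker=MARKER, chunk_size=1 << 20):
    """Scan a file in chunks, keeping an overlap so a split marker is still found."""
    tail = b""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if marker in window:
                return True
            # Keep the last len(marker) - 1 bytes for the next iteration.
            tail = window[-(len(marker) - 1):]


def hash_file(path, chunk_size=1 << 20):
    """Compute sha256 and md5 in a single pass over the file."""
    sha256, md5 = hashlib.sha256(), hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
    return {"sha256": sha256.hexdigest(), "md5": md5.hexdigest()}


def pick_engine_dll(root):
    """Return the first file under root that carries the marker, whatever its name."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and contains_marker(path):
            return path
    return None
```

Because the match is on file content rather than file name, a renamed DLL is still picked up, which is the property the message calls out.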

- unpack.py: bsdtar (ISO9660 + ZIP) with a pure-Python zipfile fallback
- identify.py: content-based engine-DLL picker + hashing
- ghidra.py: analyzeHeadless launcher discovery + post-script run
- pipeline.py: orchestration with injectable extract_fn; sink db|http|none
- cli.py: python -m ams.acquire (incl. --identify-only dry run)
- tests: 7 new (forged PE markers + stubbed extractor) -> 18/18
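
As an illustration of the unpack.py entry in the list above, a bsdtar-first extraction with a pure-Python ZIP fallback might be sketched as follows. The `unpack` signature, its return value, and the error handling are assumptions; only the bsdtar and zipfile combination comes from the message.

```python
import shutil
import subprocess
import zipfile
from pathlib import Path


def unpack(archive, dest, bsdtar=None):
    """Extract archive into dest; return the name of the method that worked.

    bsdtar: path to the tool, None to look it up on PATH, or "" to skip it.
    """
    archive, dest = Path(archive), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    tool = bsdtar if bsdtar is not None else shutil.which("bsdtar")
    if tool:
        # bsdtar (libarchive) reads both ISO9660 images and ZIP archives.
        result = subprocess.run(
            [tool, "-xf", str(archive), "-C", str(dest)],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return "bsdtar"

    # Fallback: the standard library can handle ZIP, but not ISO images.
    if zipfile.is_zipfile(archive):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        return "zipfile"

    raise RuntimeError(f"cannot unpack {archive}: bsdtar unavailable or failed, and not a ZIP")
```

Passing `bsdtar=""` forces the fallback path, which makes it easy to exercise on a machine without libarchive.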

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 12:11:56 +02:00