Closes the chain from a game file to a catalog entry: unpack an ISO/ZIP,
identify the engine DLL by content (CMC_ObjectsContainer marker in RTTI, so a
renamed file is still found), hash it (sha256 + md5, plus optional ssdeep via
ppdeep), run Ghidra headless with the extractor, then enrich and import the
snapshot.
- unpack.py: bsdtar (ISO9660 + ZIP) with a pure-Python zipfile fallback
- identify.py: content-based engine-DLL picker + hashing
- ghidra.py: analyzeHeadless launcher discovery + post-script run
- pipeline.py: orchestration with injectable extract_fn; sink db|http|none
- cli.py: python -m ams.acquire (incl. --identify-only dry run)
- tests: 7 new (forged PE markers + stubbed extractor); suite now 18/18 passing
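
A minimal sketch of the identify-and-hash step, assuming the marker is the
ASCII string CMC_ObjectsContainer found anywhere in an MZ-headed file. The
names find_engine_dll and hash_file are hypothetical and are not taken from
identify.py; ppdeep is treated as an optional dependency.

```python
import hashlib
from pathlib import Path

# Assumption: the MSVC RTTI type name embeds this ASCII string verbatim.
MARKER = b"CMC_ObjectsContainer"


def find_engine_dll(root):
    """Return the first MZ-headed file under root that contains MARKER, else None."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        # Match on content only, so the file name and extension are irrelevant.
        if data[:2] == b"MZ" and MARKER in data:
            return path
    return None


def hash_file(path):
    """Return sha256 and md5 hex digests, plus ssdeep when ppdeep is installed."""
    data = Path(path).read_bytes()
    digests = {
        "sha256": hashlib.sha256(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
    }
    try:
        import ppdeep  # optional fuzzy hash; skipped if not installed
        digests["ssdeep"] = ppdeep.hash(data)
    except ImportError:
        pass
    return digests
```

Reading whole files into memory keeps the sketch short; a real picker would
probably stream or mmap large files.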
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Static HTML/CSS/JS served by FastAPI (mounted at /ui; / redirects there),
talking to the existing JSON API. No node/npm, no bundler.
- games/versions sidebar with A/B version selectors
- visual 4-axis diff (types/methods/events/fields, optionally struct_layout)
  with added/removed/changed (+/-/~) rows, per-axis counts, class (owner)
  filter, moved-methods section
- single-snapshot browser (tabs + live filter)
- app.py mounts StaticFiles(html=True) last so API routes win; / -> /ui/
Smoke-tested live on uvicorn: /, /ui/ and assets serve 200, and the UI calls
the same /games and /diff endpoints that were already verified end-to-end.
app.js passes `node --check`.
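
A rough sketch of how the +/-/~ rows and per-axis counts could be derived from
two snapshots. The response shape of the actual /diff endpoint is not shown
here, so the name -> value mapping per axis and the function names diff_axis
and diff_snapshots are assumptions for illustration only.

```python
from collections import Counter

AXES = ("types", "methods", "events", "fields")


def diff_axis(a, b):
    """Compare two name -> value mappings; return (marker, name) rows sorted by name."""
    rows = []
    for name in sorted(set(a) | set(b)):
        if name not in a:
            rows.append(("+", name))   # only in B: added
        elif name not in b:
            rows.append(("-", name))   # only in A: removed
        elif a[name] != b[name]:
            rows.append(("~", name))   # in both, value differs: changed
    return rows


def diff_snapshots(snap_a, snap_b, axes=AXES):
    """Return {axis: {"rows": [...], "counts": Counter}} for each axis."""
    result = {}
    for axis in axes:
        rows = diff_axis(snap_a.get(axis, {}), snap_b.get(axis, {}))
        counts = Counter(marker for marker, _ in rows)
        result[axis] = {"rows": rows, "counts": counts}
    return result
```

Unchanged entries produce no row, so an empty rows list means the axis is
identical between versions A and B.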
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Ghidra headless post-script (pyghidra/Jython) that extracts the scripting
"surface" of Aidem Media engine DLLs into a versionable snapshot.json, for
diffing engine versions. All four axes validated on the golden pair
(PIKLIB8.dll / MSVC6 vs bloomoodll.dll / MSVC8):
- types   : CMC_ObjectsContainer::resolve factory ladder
            (script name -> C++ class, ctor, object size; + dispatch_addr,
            via_module_iface for the dual MULTIARRAY branch)
- methods : CMC_*_Runner::prepareMthHashSet (name -> id) + inheritance chain
- events  : CMC_*::getBehavioursList (ordered per-class list)
- fields  : CMC_* ctor -> CMElement::getProperty<T>Value (name + type)
            (+ bonus struct_layout: this+offset stores via decompiler P-code)
Extraction rests on semantic anchors (call targets, referenced string
literals, push/immediate operands), never decompiled-C text, so the same
script works across both compilers despite ILT stubs, undefined string
literals, unnamed FUN_ ctors and an MSVC6 inline-strcpy off-by-one.
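
To illustrate what "versionable" can mean for snapshot.json, here is a small
sketch of deterministic serialization. The key names and values below are
made-up placeholders, not the extractor's real schema, and dump_snapshot is a
hypothetical helper.

```python
import json


def dump_snapshot(snapshot):
    """Serialize with sorted keys and fixed indentation so diffs stay stable.

    sort_keys only reorders dict keys; lists (such as the ordered per-class
    events list) keep their original order.
    """
    return json.dumps(snapshot, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


# Placeholder data: names and numbers are invented for the example.
example = {
    "types": {"EXAMPLE": {"class": "CMC_Example", "ctor": "0x0", "size": 0}},
    "methods": {"CMC_Example": {"DOTHING": 1}},
    "events": {"CMC_Example": ["ONSECOND", "ONFIRST"]},
    "fields": {"CMC_Example": {"SOMEPROP": "string"}},
}

text = dump_snapshot(example)
```

Byte-identical output for identical content means a plain text diff of two
snapshot files shows only real changes.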
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>