Similar versions: surface-overlap metric + endpoint + UI panel

Ranks catalogued engine versions by how much of their CMC_* surface they share, which (unlike a binary fuzzy hash) stays meaningful across compilers: the golden pair PIKLIB8/MSVC6 vs bloomoodll/MSVC8 scores 85%.

- similarity.py: jaccard, surface_similarity (per-axis + pooled overall), fuzzy_similarity (ssdeep via ppdeep, secondary signal)
- service.similar_snapshots + GET /snapshots/{id}/similar?min=N (SimilarHit)
- UI: "Podobne wersje" ("Similar versions") panel in the snapshot browser (overlap bar + ⇄ diff)
- tests: 6 new (jaccard, identical/disjoint, golden pair 0<x<100, fuzzy, endpoint + min filter) -> 28/28

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
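
The surface-overlap metric named in the message (Jaccard per axis, plus an overall score pooled across axes) can be sketched roughly as follows. This is an illustration only, not the contents of similarity.py: the axis names ("classes", "methods"), the symbol names, the integer-percent rounding, and the convention that two empty sets count as identical are all assumptions made for the example.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Return |a & b| / |a | b|. Two empty sets are treated as identical (assumption)."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def surface_similarity(left: Dict[str, Set[str]], right: Dict[str, Set[str]]) -> dict:
    """Score two surfaces per axis and overall, as integer percentages.

    The overall score pools every (axis, name) pair into one set per side,
    so axes with more symbols weigh more than small ones.
    """
    axes = {}
    pooled_left, pooled_right = set(), set()
    for axis in sorted(set(left) | set(right)):
        l_names = left.get(axis, set())
        r_names = right.get(axis, set())
        axes[axis] = round(100 * jaccard(l_names, r_names))
        pooled_left |= {(axis, name) for name in l_names}
        pooled_right |= {(axis, name) for name in r_names}
    overall = round(100 * jaccard(pooled_left, pooled_right))
    return {"overall": overall, "axes": axes}


# Hypothetical surfaces for two builds; the names are made up for illustration.
LEFT = {
    "classes": {"CMC_A", "CMC_B", "CMC_C"},
    "methods": {"CMC_A::f", "CMC_A::g"},
}
RIGHT = {
    "classes": {"CMC_A", "CMC_B"},
    "methods": {"CMC_A::f", "CMC_A::h"},
}

RESULT = surface_similarity(LEFT, RIGHT)
# classes: 2 shared of 3 total -> 67; methods: 1 of 3 -> 33; pooled: 3 of 6 -> 50
```

Pooling instead of averaging the per-axis scores keeps a tiny axis from dominating the overall number; whether the real implementation makes the same choice is not visible in this diff.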
2026-05-31 12:33:50 +02:00
parent 30b2b1011e
commit 38be932abc
8 changed files with 275 additions and 1 deletion
--- a/ams/api/routes/snapshots.py
+++ b/ams/api/routes/snapshots.py
@@ -39,3 +39,20 @@ def get_snapshot(snapshot_id: int, db: Session = Depends(get_db)) -> models.Snap
    if snap is None:
        raise HTTPException(404, "snapshot not found")
    return snap
+
+
+@router.get("/{snapshot_id}/similar", response_model=list[schemas.SimilarHit])
+def similar_snapshots(
+    snapshot_id: int,
+    min: int = Query(0, ge=0, le=100, description="drop hits below this overall score"),
+    db: Session = Depends(get_db),
+) -> list[schemas.SimilarHit]:
+    hits = service.similar_snapshots(db, snapshot_id, minimum=min)
+    if hits is None:
+        raise HTTPException(404, "snapshot not found")
+    return [
+        schemas.SimilarHit(
+            snapshot=schemas.SnapshotOut.model_validate(snap),
+            overall=score["overall"], fuzzy=score["fuzzy"], axes=score["axes"])
+        for snap, score in hits
+    ]
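
The route above relies on two behaviours of service.similar_snapshots that this hunk does not show: returning None for an unknown snapshot (which the route turns into a 404) and dropping hits whose overall score is below the minimum. A minimal sketch of that contract, using plain integers in place of ORM rows, might look like this; filter_hits and its arguments are hypothetical names, and the descending sort is an assumption based on the word "ranks" in the commit message.

```python
from typing import Dict, List, Optional, Tuple


def filter_hits(
    scores: Optional[Dict[int, int]], minimum: int = 0
) -> Optional[List[Tuple[int, int]]]:
    """Hypothetical stand-in for the service-side filtering.

    scores maps a candidate snapshot id to its overall score (0..100),
    or is None when the reference snapshot does not exist.
    """
    if scores is None:
        return None  # the route maps this to HTTP 404
    # "drop hits below this overall score": keep scores equal to the minimum.
    kept = [(sid, score) for sid, score in scores.items() if score >= minimum]
    # Best match first (assumed ordering).
    return sorted(kept, key=lambda hit: hit[1], reverse=True)


# Made-up scores for three candidate snapshots.
HITS = filter_hits({2: 85, 3: 40, 4: 100}, minimum=50)
```

With these made-up scores, a request such as GET /snapshots/1/similar?min=50 would keep snapshots 4 and 2 and drop snapshot 3.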