Similar versions: surface-overlap metric + endpoint + UI panel
Ranks catalogued engine versions by how much of their CMC_* surface they share,
which (unlike a binary fuzzy hash) stays meaningful across compilers — the golden
pair PIKLIB8/MSVC6 vs bloomoodll/MSVC8 scores 85%.
- similarity.py: jaccard, surface_similarity (per-axis + pooled overall),
fuzzy_similarity (ssdeep via ppdeep, secondary signal)
- service.similar_snapshots + GET /snapshots/{id}/similar?min=N (SimilarHit)
- UI: "Podobne wersje" ("Similar versions") panel in the snapshot browser (overlap bar + ⇄ diff)
- tests: 6 new (jaccard, identical/disjoint, golden pair 0<x<100, fuzzy,
endpoint + min filter) -> 28/28
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
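
For orientation, the surface-overlap idea described above might be sketched as follows. This is an illustration under stated assumptions, not the code in similarity.py: the axis names ("classes", "methods"), the sample CMC_* identifiers, the rounding to whole percentages, and the reading of "pooled overall" as a Jaccard index over the union of all (axis, name) pairs are all hypothetical.

```python
from typing import AbstractSet, Dict, Hashable, Set

# Assumed shape: axis name -> set of CMC_* identifiers on that axis.
Surface = Dict[str, Set[str]]


def jaccard(a: AbstractSet[Hashable], b: AbstractSet[Hashable]) -> float:
    """Jaccard index |a & b| / |a | b|; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def surface_similarity(left: Surface, right: Surface) -> dict:
    """Per-axis and pooled overlap, both as whole percentages (assumed format)."""
    axes = sorted(set(left) | set(right))
    per_axis = {
        axis: round(100 * jaccard(left.get(axis, set()), right.get(axis, set())))
        for axis in axes
    }
    # Pool every (axis, name) pair so large axes weigh more than small ones.
    pooled_left = {(axis, name) for axis, names in left.items() for name in names}
    pooled_right = {(axis, name) for axis, names in right.items() for name in names}
    return {"overall": round(100 * jaccard(pooled_left, pooled_right)), "axes": per_axis}


# Hypothetical surfaces of two engine versions.
old = {"classes": {"CMC_A", "CMC_B", "CMC_C"}, "methods": {"CMC_A_RUN", "CMC_B_STOP"}}
new = {"classes": {"CMC_A", "CMC_B", "CMC_D"}, "methods": {"CMC_A_RUN", "CMC_B_STOP"}}
result = surface_similarity(old, new)
# result == {"overall": 67, "axes": {"classes": 50, "methods": 100}}
```

Because the comparison works on names rather than bytes, recompiling the same surface with a different compiler would leave these scores unchanged, which is the property the commit message contrasts with a binary fuzzy hash.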
@@ -39,3 +39,20 @@ def get_snapshot(snapshot_id: int, db: Session = Depends(get_db)) -> models.Snap
     if snap is None:
         raise HTTPException(404, "snapshot not found")
     return snap
+
+
+@router.get("/{snapshot_id}/similar", response_model=list[schemas.SimilarHit])
+def similar_snapshots(
+    snapshot_id: int,
+    min: int = Query(0, ge=0, le=100, description="drop hits below this overall score"),
+    db: Session = Depends(get_db),
+) -> list[schemas.SimilarHit]:
+    hits = service.similar_snapshots(db, snapshot_id, minimum=min)
+    if hits is None:
+        raise HTTPException(404, "snapshot not found")
+    return [
+        schemas.SimilarHit(
+            snapshot=schemas.SnapshotOut.model_validate(snap),
+            overall=score["overall"], fuzzy=score["fuzzy"], axes=score["axes"])
+        for snap, score in hits
+    ]
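
The handler relies on a contract with service.similar_snapshots: None when the snapshot is unknown (mapped to a 404), otherwise (snapshot, score) pairs whose score carries "overall", "fuzzy" and "axes". A minimal in-memory sketch of that contract could look like the following. The function name rank_similar, the dict-based catalogue, the toy scoring function, the inclusive minimum and the descending sort are assumptions for illustration; the real service works against the database session and is not shown in this hunk.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Score = Dict[str, Any]  # expected keys: "overall", "fuzzy", "axes"


def rank_similar(
    catalogue: Dict[int, str],
    snapshot_id: int,
    score_fn: Callable[[str, str], Score],
    minimum: int = 0,
) -> Optional[List[Tuple[int, Score]]]:
    """Return (id, score) pairs ranked by overall score, or None if the id is unknown."""
    if snapshot_id not in catalogue:
        return None  # the endpoint turns this into a 404
    target = catalogue[snapshot_id]
    hits = []
    for other_id, other in catalogue.items():
        if other_id == snapshot_id:
            continue  # a snapshot is not its own neighbour
        score = score_fn(target, other)
        if score["overall"] >= minimum:  # drop hits below the threshold
            hits.append((other_id, score))
    hits.sort(key=lambda hit: hit[1]["overall"], reverse=True)
    return hits


def toy_score(a: str, b: str) -> Score:
    """Stand-in scorer: character-set overlap as a whole percentage."""
    sa, sb = set(a), set(b)
    return {"overall": round(100 * len(sa & sb) / len(sa | sb)), "fuzzy": None, "axes": {}}


catalogue = {1: "abcd", 2: "abce", 3: "wxyz", 4: "abcd"}
ranked_ids = [i for i, _ in rank_similar(catalogue, 1, toy_score, minimum=50)]
# ranked_ids == [4, 2]; snapshot 3 scores 0 and is filtered out
```

Returning None rather than an empty list for an unknown id keeps "no such snapshot" distinguishable from "snapshot exists but nothing is similar enough", which is why the handler checks `hits is None` instead of truthiness.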