Big ANN '23 streaming track, dataset msturing-30M-clustered(final_runbook.yaml)
Subject outcomes
- scann: score 0.992
- hwtl_sdu_anns_stream: score 0.967
- puck: score 0.085
Results of the Big ANN: NeurIPS'23 competition, which ran four tracks of practical approximate nearest-neighbor search (Filtered, Out-of-Distribution, Sparse, Streaming), each with its own dataset and per-run leaderboard. Entries are scored on throughput (QPS) and accuracy (recall@10 or average precision) across operating points on their recall/QPS tradeoff curve.
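As an illustration of the two scoring quantities named above, here is a minimal sketch of recall@k and QPS. It is not the competition's evaluation code: the function names `recall_at_k` and `queries_per_second` and the toy neighbor ids are hypothetical, and the sketch assumes every query has at least k ground-truth neighbors and ignores distance ties.

```python
from typing import Sequence


def recall_at_k(
    retrieved: Sequence[Sequence[int]],
    ground_truth: Sequence[Sequence[int]],
    k: int = 10,
) -> float:
    """Mean fraction of the true top-k neighbor ids found in the returned top-k.

    Assumption: each ground-truth list holds at least k ids and ties in
    distance are ignored, which a real evaluator may treat differently.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(retrieved) != len(ground_truth):
        raise ValueError("one result list is required per query")
    if not retrieved:
        raise ValueError("at least one query is required")

    total = 0.0
    for got, truth in zip(retrieved, ground_truth):
        true_ids = set(truth[:k])
        # Duplicated ids in the result list are counted once.
        hits = len(true_ids.intersection(got[:k]))
        total += hits / k
    return total / len(retrieved)


def queries_per_second(num_queries: int, elapsed_seconds: float) -> float:
    """Throughput: queries answered divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_queries / elapsed_seconds


# Toy data (made up for illustration): two queries, k = 2.
toy_retrieved = [[1, 2], [4, 5]]
toy_truth = [[1, 9], [4, 5]]
toy_recall = recall_at_k(toy_retrieved, toy_truth, k=2)
print(toy_recall)  # 0.75
```

One (recall, QPS) pair like this corresponds to a single operating point; sweeping a search parameter would produce the tradeoff curve the leaderboard refers to.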
Response matrix
Each row is a subject (a competition entry) and each column an item (a track and dataset), ordered so the strongest subjects and easiest items gather toward one corner. 30 subjects × 5 items, 30% of cells evaluated.

Scale: per metric, because scales differ and each metric is shown on its own scale. Recall lies in [0, 1] (higher = more accurate retrieval); QPS is queries per second, unbounded (higher = faster search).
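The row and column ordering described for the response matrix can be sketched as follows. This is one plausible reading, not the page's actual method: it assumes subjects and items are each sorted by their mean over evaluated cells only, that unevaluated cells are simply absent, and that scores have already been put on a common [0, 1] scale. The names `order_matrix`, `model_a`, `item_1` and all numbers are hypothetical toy values.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# subject -> item -> score; a missing key means the cell was not evaluated.
Responses = Dict[str, Dict[str, float]]


def mean_or_none(values: Iterable[float]) -> Optional[float]:
    """Mean of the given values, or None when there are none."""
    vals = list(values)
    return sum(vals) / len(vals) if vals else None


def order_matrix(responses: Responses) -> Tuple[List[str], List[str]]:
    """Return (subjects, items), each sorted by descending mean score.

    Subjects or items with no evaluated cells sort last; ties break by name.
    """
    subject_means = {s: mean_or_none(cells.values()) for s, cells in responses.items()}

    item_names = sorted({i for cells in responses.values() for i in cells})
    item_means = {
        i: mean_or_none(cells[i] for cells in responses.values() if i in cells)
        for i in item_names
    }

    def sort_key(pair: Tuple[str, Optional[float]]) -> Tuple[bool, float, str]:
        name, mean = pair
        return (mean is None, -(mean if mean is not None else 0.0), name)

    subjects = [name for name, _ in sorted(subject_means.items(), key=sort_key)]
    items = [name for name, _ in sorted(item_means.items(), key=sort_key)]
    return subjects, items


# Toy, made-up matrix: 4 subjects x 3 items with 5 of 12 cells evaluated.
toy_responses: Responses = {
    "model_c": {"item_2": 0.2, "item_3": 0.1},
    "model_a": {"item_1": 0.9, "item_2": 0.5},
    "model_d": {},
    "model_b": {"item_1": 0.6},
}
toy_subjects, toy_items = order_matrix(toy_responses)
print(toy_subjects)  # ['model_a', 'model_b', 'model_c', 'model_d']
print(toy_items)     # ['item_1', 'item_2', 'item_3']
```

With this ordering the highest-scoring subjects and the items with the highest mean scores end up in the top-left corner. Averaging only over evaluated cells matters when the matrix is sparse, since a subject's mean then depends on which items it was run on.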
Sample items
A spread of items across the difficulty range. This benchmark does not publish per-answer traces, so each item shows which subjects succeeded.
- Big ANN '23 streaming track, dataset msturing-30M-clustered(final_runbook.yaml)
- Big ANN '23 sparse track, dataset sparse-full
- Big ANN '23 filter track, dataset yfcc-10M
- Big ANN '23 ood track, dataset text2image-10M
- Big ANN '23 filter track, dataset random-filter-s
Subjects
30 subjects, ranked by mean response across this benchmark's items.