Big ANN NeurIPS'23 Competition

Results of the Big ANN NeurIPS'23 competition: four tracks of practical approximate nearest-neighbor search (Filtered, Out-of-Distribution, Sparse, Streaming), each with its own dataset and per-run leaderboard. Entries are scored on throughput (QPS) and accuracy (recall@10 or average precision) across operating points on their recall/QPS tradeoff curve.

Items: 5
Subjects: 30
Observed: 30%
Subject type: Model
License: CC-BY-4.0
Domain: general
Modality: text

Response matrix

Every model, scored item by item.

Each row is an AI model and each column an item, ordered so the strongest models and easiest items gather toward one corner. 30 subjects × 5 items, 30% of cells evaluated.
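The coverage figure and the ordering described above can be sketched in a few lines. This is a minimal illustration, not the page's actual method: the toy matrix, the names `coverage` and `order_matrix`, and the choice of sorting by the mean of observed cells are all assumptions made for the example.

```python
from statistics import mean

# 30 subjects x 5 items gives 150 cells; 30% observed is about 45 cells.
total_cells = 30 * 5
approx_observed = round(0.30 * total_cells)

# Hypothetical toy matrix: subject -> {item: score}.
# A missing key stands for an unobserved cell.
responses = {
    "subject_b": {"item_x": 0.5},
    "subject_a": {"item_x": 0.9, "item_y": 0.8},
    "subject_c": {"item_y": 0.2, "item_z": 0.1},
}
items = ["item_z", "item_x", "item_y"]


def coverage(responses, items):
    """Fraction of subject/item cells that have a score."""
    observed = sum(len(row) for row in responses.values())
    return observed / (len(responses) * len(items))


def order_matrix(responses, items):
    """Sort rows and columns by descending mean over observed cells only."""
    def row_mean(subject):
        scores = list(responses[subject].values())
        return mean(scores) if scores else float("-inf")

    def col_mean(item):
        scores = [row[item] for row in responses.values() if item in row]
        return mean(scores) if scores else float("-inf")

    rows = sorted(responses, key=row_mean, reverse=True)
    cols = sorted(items, key=col_mean, reverse=True)
    return rows, cols


rows, cols = order_matrix(responses, items)
```

With this ordering, high-scoring subjects and high-scoring items end up in the top-left corner of the matrix; unobserved cells never enter a mean.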

Figure: Big ANN NeurIPS'23 Competition response matrix, AI models (rows) against items (columns). Legend: low, high, unobserved.

Scale: Per metric (scales differ, so each is shown on its own scale): recall in [0, 1] (higher = more accurate retrieval); QPS = queries per second, unbounded (higher = faster search).
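One way to put an unbounded metric such as QPS on a display scale of its own is min-max scaling over the observed values. The sketch below assumes that approach; the page does not say how its color scale is computed, and the function name `minmax_scale` and the sample values are hypothetical. Recall already lies in [0, 1] and needs no rescaling.

```python
def minmax_scale(values):
    """Map observed values to [0, 1]; None marks an unobserved cell and stays None."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    lo, hi = min(observed), max(observed)
    if hi == lo:
        # All observed values are equal, so there is no spread to show.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - lo) / (hi - lo) for v in values]


# Hypothetical QPS column with one unobserved cell.
qps = [1000.0, None, 3000.0, 5000.0]
scaled_qps = minmax_scale(qps)
```

Scaling each metric separately keeps a QPS in the tens of thousands from washing out recall differences of a few hundredths.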

Sample items

What the questions look like — and how subjects answer.

A spread of items across the difficulty range. This benchmark does not publish per-answer traces, so each item shows which subjects succeeded.

Item 1 64% solve rate

Big ANN '23 streaming track, dataset msturing-30M-clustered(final_runbook.yaml)

Subject outcomes

  • scann: score 0.992
  • hwtl_sdu_anns_stream: score 0.967
  • puck: score 0.085
Item 2 186816% solve rate

Big ANN '23 sparse track, dataset sparse-full

Subject outcomes

  • shnsw: score 12388.977
  • linscan: score 0.403
  • cufe: score 0.368
Item 3 728670% solve rate

Big ANN '23 filter track, dataset yfcc-10M

Subject outcomes

  • pinecone: score 90662.739
  • faiss: score 0.482
Item 4 875621% solve rate

Big ANN '23 ood track, dataset text2image-10M

Subject outcomes

  • hanns: score 53000.496
  • ngt: score 0.769
Item 5 1287024% solve rate

Big ANN '23 filter track, dataset random-filter-s

Subject outcomes

  • faiss: score 33664.583

Subjects

The models, agents, and reward models evaluated.

30 subjects, ranked by mean response across this benchmark's items.

  1. pinecone: 34834.629
  2. hanns: 23326.196
  3. scann: 20112.965
  4. zilliz: 19603.321
  5. pinecone-ood: 19304.088
  6. parlayivf: 15106.762
  7. mysteryann-dif: 10063.096
  8. mysteryann: 9861.974
  9. puck: 7276.153
  10. wm_filter: 7028.291
  11. sustech-ood: 6788.513
  12. dhq: 6751.962
  13. hwtl_sdu_anns_filter: 6719.79
  14. pyanns: 6521.695
  15. pinecone_smips: 5019.545
  16. puck-fizz: 3973.926
  17. ngt: 3619.115
  18. epsearch: 3223.306
  19. shnsw: 3174.475
  20. vamana: 3162.459
  21. fdufilterdiskann: 2895.75
  22. faiss: 2488.076
  23. diskann: 2369.887
  24. faissplus: 1852.146
  25. rubignn: 1718.152
  26. nle: 1156.53
  27. cufe: 1155.088
  28. sustech-whu: 334.76
  29. linscan: 52.07
  30. hwtl_sdu_anns_stream: 0.868
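
A ranking by mean response can be sketched as follows. The scores are the ones listed under "Sample items", which cover only part of the matrix, so the resulting means are not expected to match the published list; the function name `rank_by_mean` is hypothetical. The example mainly shows that, when recall-scale and QPS-scale scores are averaged together, the QPS values dominate the mean.

```python
from statistics import mean

# Scores copied from the "Sample items" section (a partial view of the matrix).
sample_outcomes = {
    "scann": [0.992],
    "hwtl_sdu_anns_stream": [0.967],
    "puck": [0.085],
    "shnsw": [12388.977],
    "linscan": [0.403],
    "cufe": [0.368],
    "pinecone": [90662.739],
    "faiss": [0.482, 33664.583],
    "hanns": [53000.496],
    "ngt": [0.769],
}


def rank_by_mean(outcomes):
    """Return (subject, mean score) pairs, highest mean first."""
    means = {name: mean(scores) for name, scores in outcomes.items()}
    return sorted(means.items(), key=lambda pair: pair[1], reverse=True)


ranking = rank_by_mean(sample_outcomes)
for position, (name, score) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {score:.3f}")
```

In this partial sample, faiss lands third because its one QPS-scale score outweighs its 0.482 on yfcc-10M, while every subject with only recall-scale scores sits below all subjects with a QPS-scale score.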