ML Engineering & Research

ML4CO-Bench-101

ML4CO-Bench-101 is a modular benchmark of neural combinatorial-optimization solvers, organized under a paradigm-model-learning taxonomy and covering 7 graph CO problems (TSP, ATSP, CVRP, MIS, MCl, MVC, MCut). This build ingests the released per-instance neural-solver outputs (ML4CO/ML4CO-Bench-101-SL, ml4co_result/) and computes the achieved objective value for each instance; the reference (optimal or near-optimal) objective is stored as correct_answer.
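The three kinds of objective mentioned here (tour cost, set size, cut value) can be illustrated with a minimal Python sketch. This is not the build script: the function names are hypothetical, the tour example assumes Euclidean coordinates (ATSP would use a distance matrix, and CVRP would add depot returns), the cut is unweighted, and feasibility of a solution is not checked.

```python
import math

def tour_cost(coords, tour):
    """Length of the closed tour that visits `coords` in the order given by `tour`."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))

def set_size(selected_nodes):
    """Number of distinct nodes in a selected set (MIS, MVC, MCl)."""
    return len(set(selected_nodes))

def cut_value(edges, side):
    """Number of edges whose endpoints fall on different sides of the partition."""
    return sum(1 for u, v in edges if side[u] != side[v])

# Toy inputs (illustrative only, not benchmark instances).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_cost = tour_cost(square, [0, 1, 2, 3])

triangle = [(0, 1), (1, 2), (0, 2)]
triangle_cut = cut_value(triangle, [0, 1, 1])
```

For tour problems a lower value is better, while for MIS, MCl, and MCut a higher value is better, so the direction of improvement depends on the problem.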

35,578 items

1 subject

100% observed

Subject type: Model

License: CC-BY-4.0

Domain: general

Modality: text


Response matrix

Every model, scored item by item.

Each row is an AI model and each column an item, ordered so the strongest models and easiest items gather toward one corner. 1 subject × 35,578 items, 100% of cells evaluated.

ML4CO-Bench-101 response matrix: AI models (rows) against items (columns). Legend: low, high, unobserved.
Scale: achieved objective value per instance, i.e. tour cost (TSP/ATSP/CVRP), set size (MIS/MVC/MCl), or cut value (MCut); each combinatorial problem is shown on its own scale.

Sample items

What the questions look like — and how subjects answer.

A spread of items across the difficulty range. This benchmark does not publish per-answer traces, so each item shows which subjects succeeded.

Item 1: 99% solve rate, answer: 0.986935

ML4CO-Bench-101 ATSP instance, scale 50, #1782

Subject outcomes

ML4CO-SL-solver: score 0.988

Item 2: 772% solve rate, answer: 7.723952

ML4CO-Bench-101 TSP instance, scale 100, #662

Subject outcomes

ML4CO-SL-solver: score 7.724

Item 3: 1084% solve rate, answer: 10.685417

ML4CO-Bench-101 CVRP instance, scale 50, #9278

Subject outcomes

ML4CO-SL-solver: score 10.836

Item 4: 1477% solve rate, answer: 14.476074

ML4CO-Bench-101 CVRP instance, scale 100, #6429

Subject outcomes

ML4CO-SL-solver: score 14.767

Item 5: 1922% solve rate, answer: 18.993048

ML4CO-Bench-101 CVRP instance, scale 100, #6151

Subject outcomes

ML4CO-SL-solver: score 19.216

Item 6: 893000% solve rate, answer: 8734.000000

ML4CO-Bench-101 MCut instance, scale ba-giant, #5

Subject outcomes

ML4CO-SL-solver: score 8930
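Because each item pairs an achieved score with a reference answer, one common way to compare them is a signed relative gap. The sketch below is an assumption about how such a gap could be computed, not a metric this page reports; `relative_gap` and its `minimize` flag are hypothetical names, and the two inputs are taken from the sample items above.

```python
def relative_gap(achieved, reference, minimize=True):
    """Signed relative gap; positive means the solver is worse than the reference."""
    if reference == 0:
        raise ValueError("reference objective must be non-zero")
    # For minimization a larger achieved value is worse; for maximization a smaller one is.
    diff = achieved - reference if minimize else reference - achieved
    return diff / abs(reference)

# CVRP instance, scale 50, #9278 (tour cost, lower is better).
cvrp_gap = relative_gap(10.836, 10.685417)

# MCut instance, scale ba-giant, #5 (cut value, higher is better).
mcut_gap = relative_gap(8930, 8734.0, minimize=False)
```

Under this convention the CVRP sample sits roughly 1.4% above its reference, while the MCut sample has a negative gap of about 2.2%, meaning the solver's cut is larger than the reference value, which is possible when the reference is near-optimal rather than optimal.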

Subjects

The models, agents, and reward models evaluated.

1 subject, ranked by mean response across this benchmark's items.

1. ML4CO-SL-solver: 102.815
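The mean response pools raw objective values from problems with very different scales, so a per-problem breakdown may be easier to interpret. The following is a minimal sketch under that assumption; `mean_by_problem` is a hypothetical helper, and the rows are only the six sample items shown on this page, not the full 35,578-item set.

```python
from collections import defaultdict
from statistics import mean

# (problem, achieved score) pairs copied from the sample items on this page.
samples = [
    ("ATSP", 0.988), ("TSP", 7.724), ("CVRP", 10.836),
    ("CVRP", 14.767), ("CVRP", 19.216), ("MCut", 8930.0),
]

def mean_by_problem(rows):
    """Group scores by problem name and average each group separately."""
    groups = defaultdict(list)
    for problem, score in rows:
        groups[problem].append(score)
    return {problem: mean(scores) for problem, scores in groups.items()}

per_problem = mean_by_problem(samples)
pooled = mean(score for _, score in samples)
```

On these six rows the pooled mean is dominated by the single MCut value, whereas the per-problem means stay within each problem's own range.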
