ClinBench

The released ClinBench ablation study: per-item lung-cancer staging from TCGA pathology reports. It contains binary (gold == prediction) responses for 3 models across four extracted fields (pT, pN, tumor_stage, histologic_diagnosis) over 774 TCGA lung cases. The main 11-model benchmark reports aggregate F1 only and its notes are access-gated; this is the one released per-(model, item) artifact.
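As a rough illustration of what a binary (gold == prediction) response looks like for the four fields, here is a minimal sketch. It assumes exact string equality with no normalization, which the page does not confirm; the function name `score_case` and the example prediction are hypothetical.

```python
# The four extracted fields named in the dataset description.
FIELDS = ("pT", "pN", "tumor_stage", "histologic_diagnosis")

def score_case(gold: dict, pred: dict) -> dict:
    """Return a 0/1 response per field: 1 when the prediction equals the gold value.

    Assumption: plain string equality; the benchmark's real matching rule may differ.
    """
    return {field: int(pred.get(field) == gold[field]) for field in FIELDS}

# Gold values copied from one of the sample items; the prediction is made up.
gold = {
    "pT": "T2a",
    "pN": "N0",
    "tumor_stage": "Stage IB",
    "histologic_diagnosis": "Lung Adenocarcinoma",
}
pred = {
    "pT": "T2a",
    "pN": "NX",  # hypothetical model error
    "tumor_stage": "Stage IB",
    "histologic_diagnosis": "Lung Adenocarcinoma",
}

responses = score_case(gold, pred)
print(responses)
```

A missing field in the prediction is scored 0 here because `pred.get` returns `None`, which never equals a gold string.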

774 items
3 subjects
100% observed
Apache-2.0 license
medicine domain
text modality

Response matrix

Every model, scored item by item.

Each row is an AI model and each column an item, ordered so the strongest models and easiest items gather toward one corner. 3 subjects × 774 items, 100% of cells evaluated.
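One simple way to get this kind of layout is to sort rows by mean accuracy and columns by solve rate. The page does not say which ordering method it uses, so the sketch below is an assumption; `order_matrix` and the toy data are hypothetical.

```python
def order_matrix(matrix, subjects, items):
    """Sort subjects by mean response and items by solve rate, both descending.

    matrix maps each subject to a list of 0/1 responses aligned with `items`.
    Ties keep their original order because Python's sort is stable.
    """
    row_mean = {s: sum(matrix[s]) / len(items) for s in subjects}
    col_mean = [sum(matrix[s][j] for s in subjects) / len(subjects)
                for j in range(len(items))]

    row_order = sorted(subjects, key=lambda s: row_mean[s], reverse=True)
    col_order = sorted(range(len(items)), key=lambda j: col_mean[j], reverse=True)

    sorted_items = [items[j] for j in col_order]
    sorted_rows = [[matrix[s][j] for j in col_order] for s in row_order]
    return row_order, sorted_items, sorted_rows

# Toy 3 x 4 response matrix (not real ClinBench data).
subjects = ["a", "b", "c"]
items = ["i1", "i2", "i3", "i4"]
matrix = {
    "a": [0, 1, 0, 0],
    "b": [1, 1, 0, 1],
    "c": [0, 1, 1, 1],
}

row_order, sorted_items, sorted_rows = order_matrix(matrix, subjects, items)
for name, row in zip(row_order, sorted_rows):
    print(name, row)
```

With this ordering, correct cells cluster toward the top-left corner and incorrect cells toward the bottom-right.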

Figure: ClinBench response matrix, AI models (rows) against items (columns). Legend: Correct (1) · Incorrect (0) · Unobserved.

Sample items

What the questions look like — and how subjects answer.

A spread of items across the difficulty range, each shown with a few subjects' actual answers.

Item 1 · 25% solve rate · answer: pT=TX; pN=NX; tumor_stage=Stage IV; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-93-A4JP (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: TX; pN: NX; tumor_stage: Stage IV; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 incorrect

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 incorrect

    Lung Adenocarcinoma

  • gpt-4o-mini incorrect

    Lung Adenocarcinoma

Item 2 · 58% solve rate · answer: pT=T2a; pN=N1; tumor_stage=Stage IIA; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-55-1596 (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T2a; pN: N1; tumor_stage: Stage IIA; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 incorrect

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 incorrect

    Lung Adenocarcinoma

  • gpt-4o-mini incorrect

    Lung Adenocarcinoma

Item 3 · 67% solve rate · answer: pT=T2a; pN=N0; tumor_stage=Stage I; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-49-AARQ (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T2a; pN: N0; tumor_stage: Stage I; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 correct

    Lung Adenocarcinoma

  • gpt-4o-mini incorrect

    Lung Adenocarcinoma

Item 4 · 75% solve rate · answer: pT=T1b; pN=NX; tumor_stage=Stage IA; histologic_diagnosis=Lung Squamous Cell Carcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-94-7943 (subtype: lusc). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T1b; pN: NX; tumor_stage: Stage IA; histologic_diagnosis: Lung Squamous Cell Carcinoma

How subjects answered

  • gpt-4o-2024-05-13 correct

    Lung Squamous Cell Carcinoma

  • gpt-4o-mini correct

    Lung Squamous Cell Carcinoma

  • gpt-3.5-turbo-1106 incorrect

    Lung Squamous Cell Carcinoma

Item 5 · 75% solve rate · answer: pT=T2a; pN=N0; tumor_stage=Stage IB; histologic_diagnosis=Lung Squamous Cell Carcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-39-5019 (subtype: lusc). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T2a; pN: N0; tumor_stage: Stage IB; histologic_diagnosis: Lung Squamous Cell Carcinoma

How subjects answered

  • gpt-4o-2024-05-13 correct

    Lung Squamous Cell Carcinoma

  • gpt-3.5-turbo-1106 incorrect

    Lung Squamous Cell Carcinoma

  • gpt-4o-mini incorrect

    Lung Squamous Cell Carcinoma

Item 6 · 83% solve rate · answer: pT=T2b; pN=N0; tumor_stage=Stage IIA; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-MP-A4TE (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T2b; pN: N0; tumor_stage: Stage IIA; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 correct

    Lung Adenocarcinoma

  • gpt-4o-mini correct

    Lung Adenocarcinoma

Item 7 · 83% solve rate · answer: pT=T1b; pN=N0; tumor_stage=Stage IA; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-73-7498 (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T1b; pN: N0; tumor_stage: Stage IA; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 correct

    Lung Adenocarcinoma

  • gpt-4o-mini correct

    Lung Adenocarcinoma

Item 8 · 92% solve rate · answer: pT=T2; pN=N2; tumor_stage=Stage IIIA; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-64-5779 (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T2; pN: N2; tumor_stage: Stage IIIA; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-mini correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 incorrect

    Lung Adenocarcinoma

Item 9 · 92% solve rate · answer: pT=T3; pN=N0; tumor_stage=Stage IIB; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-95-7039 (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T3; pN: N0; tumor_stage: Stage IIB; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 correct

    Lung Adenocarcinoma

  • gpt-4o-mini correct

    Lung Adenocarcinoma

Item 10 · 92% solve rate · answer: pT=T3; pN=N2; tumor_stage=Stage IIIA; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-86-8359 (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T3; pN: N2; tumor_stage: Stage IIIA; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 correct

    Lung Adenocarcinoma

  • gpt-4o-mini correct

    Lung Adenocarcinoma

Item 11 · 100% solve rate · answer: pT=T2a; pN=N0; tumor_stage=Stage IB; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-67-3773 (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T2a; pN: N0; tumor_stage: Stage IB; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 correct

    Lung Adenocarcinoma

  • gpt-4o-mini correct

    Lung Adenocarcinoma

Item 12 · 100% solve rate · answer: pT=T2a; pN=N0; tumor_stage=Stage IB; histologic_diagnosis=Lung Adenocarcinoma

TCGA lung-cancer staging case (information extraction). Patient id: TCGA-55-7815 (subtype: luad). Raw pathology-report text is TCGA access-gated and not redistributed; the model read the full note and extracted the four staging fields below. Gold extracted fields: pT: T2a; pN: N0; tumor_stage: Stage IB; histologic_diagnosis: Lung Adenocarcinoma

How subjects answered

  • gpt-3.5-turbo-1106 correct

    Lung Adenocarcinoma

  • gpt-4o-2024-05-13 correct

    Lung Adenocarcinoma

  • gpt-4o-mini correct

    Lung Adenocarcinoma

Subjects

The models, agents, and reward models evaluated.

3 subjects, ranked by mean response (accuracy) across this benchmark's items.

  1. gpt-4o-2024-05-13: 0.8751
  2. gpt-4o-mini: 0.8092
  3. gpt-3.5-turbo-1106: 0.7794
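A ranking by mean response can be computed as in the following sketch. It assumes each subject's responses are available as a flat list of 0/1 values; `rank_subjects` and the toy subject names are hypothetical, and the toy numbers are not the benchmark's scores.

```python
def rank_subjects(responses):
    """Rank subjects by mean 0/1 response (accuracy), highest first.

    responses maps a subject name to its list of 0/1 responses.
    Returns (rank, subject, mean) tuples with the mean rounded to 4 places.
    """
    means = {subject: sum(r) / len(r) for subject, r in responses.items()}
    ordered = sorted(means.items(), key=lambda kv: kv[1], reverse=True)
    return [(i + 1, subject, round(mean, 4)) for i, (subject, mean) in enumerate(ordered)]

# Toy responses for three made-up subjects.
toy = {
    "model_a": [1, 1, 0, 1],
    "model_b": [1, 0, 0, 1],
    "model_c": [1, 1, 1, 1],
}

ranking = rank_subjects(toy)
for rank, subject, mean in ranking:
    print(f"{rank}. {subject}: {mean}")
```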